geek talk

The semivariance is based on the idea that two points close together are likely to be more similar to each other than two points farther apart. The degree of similarity caused by spatial proximity is known as autocorrelation. The formula for calculating the semivariance is:

Semivariance (distance h) = 0.5 * average[(value at location i - value at location j)^2]

In other words, for all pairs of points that are separated by a given distance, you are calculating half the average squared difference between the two values in each pair (hence 'semi' variance). Doing this for all distances means that the variance will be calculated between all possible combinations of points. Then you can plot the semivariance on the Y axis and the distance (h) on the X axis. The resulting plot is called a semivariogram. There are some examples of semivariograms on page 15 of this document:
http://www.esri.com/software/arcgis/arcgisxtensions/geostatistical/pdf/airqualityjgra.pdf
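
For anyone who would rather see the formula as code, here is a rough Python sketch of the calculation for a single distance. The function name, the tolerance argument and the made-up transect readings are all my own inventions for illustration; real data would rarely have pairs at exactly the same spacing.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def semivariance(points, values, h, tol=1e-9):
    """Half the mean squared difference over all pairs about h apart.

    `tol` is an assumed tolerance for matching the distance; returns
    None when no pair of points is separated by h.
    """
    squared_diffs = [
        (values[i] - values[j]) ** 2
        for i, j in combinations(range(len(points)), 2)
        if abs(dist(points[i], points[j]) - h) <= tol
    ]
    if not squared_diffs:
        return None
    return 0.5 * sum(squared_diffs) / len(squared_diffs)


# Hypothetical transect: one reading every 10 cm along a line.
points = [(0, 0), (10, 0), (20, 0), (30, 0)]
values = [1.0, 2.0, 4.0, 5.0]

for h in (10, 20, 30):
    print(h, semivariance(points, values, h))
```

With these invented numbers the semivariance goes up as the separation grows, which is the pattern you would expect when nearby points are more alike than distant ones.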

If there is autocorrelation in the data (indicating some spatial structure), then the points towards the left of the X axis will be clustered more tightly, because these represent the variances of the pairs of points that are closest together. The points towards the right end of the X axis will be scattered more widely, because points that are farther apart from each other are more variable. If you fit a line to the scatter of points in the semivariogram, it typically crosses the Y axis somewhere above zero, then rises for some distance until leveling out. You can get some good information about the spatial structure of the data from the fitted line.

Where the line crosses the Y axis is called the 'nugget'. In theory, two points separated by zero distance should be identical and have a variance of zero. However, sampling error and variation that occurs below the sample interval or resolution impart variance to points infinitesimally close together, and the nugget gives you an estimate of that variance.

The distance (h) where the fitted line flattens out is called the 'range'. Points beyond the range are basically not autocorrelated, which is why the line flattens out and the points become a random scatter. The range tells you at what distance points are no longer similar based on proximity. In other words, it is an estimate of the patch size of the phenomenon being measured. The value of the semivariance at the range is called the 'sill', and the sill minus the nugget is the 'partial sill'. I'm not really sure what those tell you other than the maximum variance of the autocorrelated values, but somebody probably has figured out some handy use for them.

I guess I should also mention that you don't actually plot all point pairs. Instead you group them into 'lag bins' (e.g. all pairs of points between 20 and 30 cm apart) and plot one semivariance value per bin. By playing with the lag distance and seeing how it changes the semivariogram, you can get an idea of the spatial distances where the phenomenon is most responsive.
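
Here is a rough Python sketch of the lag-bin idea, plus one possible shape for the fitted line. The function names, the bin width and the sample readings are made up for illustration. The 'spherical' curve is just one commonly used model that happens to have a nugget, a range and a sill as its parameters; I am assuming it here, and nothing above says it is the right model for any particular data set.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def binned_semivariogram(points, values, lag_width, n_lags):
    """Return (bin centre, semivariance, pair count) for each non-empty lag bin."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i, j in combinations(range(len(points)), 2):
        k = int(dist(points[i], points[j]) // lag_width)  # index of the lag bin
        if k < n_lags:  # pairs farther apart than the last bin are ignored
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    return [
        ((k + 0.5) * lag_width, 0.5 * sums[k] / counts[k], counts[k])
        for k in range(n_lags)
        if counts[k]
    ]


def spherical_model(h, nugget, sill, range_):
    """Assumed model curve: starts at the nugget, levels out at the sill."""
    if h >= range_:
        return sill  # beyond the range the curve is flat
    r = h / range_
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


# Hypothetical transect: one reading every 10 cm along a line.
points = [(0, 0), (10, 0), (20, 0), (30, 0)]
values = [1.0, 2.0, 4.0, 5.0]

for centre, gamma, n_pairs in binned_semivariogram(points, values, 15, 2):
    print(centre, gamma, n_pairs)

# Made-up parameters: nugget 1, sill 5, range 20 cm.
for h in (0, 10, 20, 40):
    print(h, spherical_model(h, nugget=1.0, sill=5.0, range_=20.0))
```

Changing `lag_width` and rerunning is the code equivalent of playing with the lag distance. In practice you would fit the nugget, sill and range to the binned values rather than picking them by hand as I did here.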

Semivariograms are often used to fit a model to the data that allows you to interpolate the data while taking into account the spatial structure ('kriging' is a common technique for this). This lets you plot a continuous surface of a phenomenon that hopefully represents the real world. Those fancy weather maps that show temperature or barometric data as a smooth continuous surface are most likely kriged from point data sources.

The upshot is that if you took a whole bunch of point measurements of something like light in vivaria of different sizes, you might be able to use semivariograms to glean information about how the size of the vivarium (or any manipulation really) changes the spatial structure of the measured phenomenon. If you haven't figured it out by now, by 'spatial structure' I mean how the measured phenomenon is affected by location.

Okay, now I'm REALLY sorry if anyone read through all of that hoping to find anything that is useful to frogery. Personally, I now feel like I have a brain bot.

Brent

_______from the notes and contributions of Frognet Patrons_______