Topic:
The Search for Spatial Associations
OVERVIEW and OBJECTIVES
Conventional statistics assume IID (independent and identically
distributed) data; spatial autocorrelation violates the independence
assumption. This type of correlation may be viewed as the presence of
redundant information in the data. Impacts of positive spatial
autocorrelation include:
- a distortion of tests for normality,
- inflation of the estimated variance,
- inflation of the estimated covariance, and
- the need to estimate an autoregressive model.
Properly accounting for spatial autocorrelation involves estimating the
inflation factors and replacing the sample size, N, with the effective
sample size, N*, so that spatially autocorrelated data can be related to
equivalent hypothetical IID data.
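
The adjustment from N to N* can be illustrated with a short Python sketch.
It is not the lecture's own computation. It assumes a row-standardized
rook-adjacency matrix W on a small regular lattice, an SAR covariance
structure proportional to S = [(I - rho W)'(I - rho W)]^-1, and one common
definition of the effective sample size for a mean, N* = N tr(S) / (1'S1).
The function names and the 5-by-5 lattice are hypothetical.

```python
def rook_weights(rows, cols):
    """Row-standardized rook-adjacency matrix W for a rows x cols lattice."""
    n = rows * cols
    W = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            nbrs = [(r + dr, c + dc)
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= r + dr < rows and 0 <= c + dc < cols]
            for rr, cc in nbrs:
                W[i][rr * cols + cc] = 1.0 / len(nbrs)
    return W


def invert(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    # Augment A with the identity matrix: [A | I].
    M = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def effective_sample_size(W, rho):
    """N* = N * tr(S) / (1'S1), with S = [(I - rho W)'(I - rho W)]^-1 (assumed form)."""
    n = len(W)
    A = [[float(i == j) - rho * W[i][j] for j in range(n)] for i in range(n)]
    B = invert(A)  # B = (I - rho W)^-1, so S = B B'
    # tr(B B') equals the sum of squared entries of B.
    trace = sum(v * v for row in B for v in row)
    # 1' B B' 1 = ||B' 1||^2, and B' 1 is the vector of column sums of B.
    col_sums = [sum(B[i][j] for i in range(n)) for j in range(n)]
    total = sum(s * s for s in col_sums)
    return n * trace / total


W = rook_weights(5, 5)                  # N = 25 areal units
print(effective_sample_size(W, 0.0))    # no autocorrelation
print(effective_sample_size(W, 0.5))    # positive autocorrelation
```

Under these assumptions, rho = 0 returns N itself, while a positive rho
returns a smaller value, which is the sense in which autocorrelated data
carry redundant information.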
The procedural steps involved in doing this are:
- evaluate normality (quantile plots, Shapiro-Wilk statistic), and if
necessary apply a power transformation to each variable;
- estimate the autoregressive parameter for each georeferenced variable
(this lecture employs the Simultaneous AutoRegressive [SAR] model);
- estimate the means, inflation factors, and correlation coefficients;
- estimate the effective sample size N*, and its associated degrees of
freedom;
- calculate the t-statistics; and
- construct confidence intervals for means and perform significance tests
on correlation coefficients.
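
The last three steps can be sketched in Python as follows. The outline does
not state the formulas, so this sketch assumes the textbook t-statistics
with N replaced by an already estimated N*: N* - 1 degrees of freedom for a
mean and N* - 2 for a correlation coefficient. The function names, the
caller-supplied critical value t_crit, and the numbers in the example are
illustrative assumptions, not values from the lecture's data sets.

```python
import math


def mean_t_statistic(mean, mu0, sd, n_eff):
    """t-statistic and degrees of freedom for H0: mu = mu0, using N* in place of N."""
    se = sd / math.sqrt(n_eff)
    return (mean - mu0) / se, n_eff - 1


def mean_confidence_interval(mean, sd, n_eff, t_crit):
    """Interval mean +/- t_crit * sd / sqrt(N*); t_crit is looked up for N* - 1 df."""
    half_width = t_crit * sd / math.sqrt(n_eff)
    return mean - half_width, mean + half_width


def correlation_t_statistic(r, n_eff):
    """t-statistic and degrees of freedom for H0: no correlation, using N* in place of N."""
    df = n_eff - 2
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r), df


# Illustrative numbers only: the same r = 0.5 judged against N and against N*.
t_naive, df_naive = correlation_t_statistic(0.5, 100)  # treats all 100 areas as independent
t_adj, df_adj = correlation_t_statistic(0.5, 20)       # assumes N* = 20
print(t_naive, df_naive)
print(t_adj, df_adj)
```

Because N* is smaller than N under positive spatial autocorrelation, the
adjusted t-statistic is smaller and has fewer degrees of freedom, so a
correlation that looks significant under the IID assumption may not remain
so after the adjustment.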
A variety of georeferenced health data sets are discussed.
OBJECTIVES:
Those who successfully complete the module should be able to
- compute the correct t-statistics for mean georeferenced disease rates,
calculated with geographically aggregated data;
- compute the correct t-statistic for a correlation coefficient
calculated with two georeferenced variables, using geographically
aggregated data; and
- apply these concepts to study and analyze data collected by public
health agencies.
SCENARIOS FOR DISCUSSION