Introduction to Spatial Analysis
Day 4 - Measurement
Jonathan Phillips
January, 2019
Spatial Measurement
- What patterns in our data do we want to measure?
- Where events take place
- Whether events are ‘clustered’ in space
- Whether characteristics are ‘clustered’ together (given fixed locations)
- Whether groups are ‘segregated’
Measures of Central Tendency
- We can also measure the ‘distribution’ of spatial events in two dimensions
- A ‘spatial ellipse’
- Location, dispersion and direction
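The three summaries named above (location, dispersion, direction) can be sketched from the mean centre and the 2x2 covariance matrix of the coordinates. This is a simplified illustration, not the exact ellipse produced by any particular GIS package, which may apply additional scaling corrections. The function names and the sample coordinates are hypothetical.

```python
import math

def mean_centre(points):
    """Return the mean (x, y) of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Root-mean-square distance of the points from their mean centre."""
    cx, cy = mean_centre(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

def ellipse_summary(points):
    """Location, dispersion and direction from the 2x2 covariance matrix."""
    cx, cy = mean_centre(points)
    n = len(points)
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] are the variances along each axis.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major_sd = math.sqrt(half_trace + root)
    minor_sd = math.sqrt(max(half_trace - root, 0.0))
    # Direction of the major axis, in degrees anticlockwise from the x-axis.
    angle_deg = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
    return {"centre": (cx, cy), "major_sd": major_sd,
            "minor_sd": minor_sd, "angle_deg": angle_deg}

# Hypothetical events lying along a diagonal line.
summary = ellipse_summary([(0, 0), (1, 1), (2, 2), (3, 3)])
```

For points on a straight diagonal, the minor axis collapses to zero and the direction is 45 degrees, which is a quick sanity check on the calculation.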
Spatial Point Patterns
- Our null hypothesis is of ‘complete spatial randomness’
- Each location has an equal probability of an event occurring (a Poisson model)
- How far away is our distribution of points from this random distribution?
- Simple statistical test
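The slides do not say which test is meant, so the sketch below uses one common option as an assumption: a quadrat-count statistic compared against simulated random patterns. It assumes the points lie in the unit square, conditions on the observed number of points, and is one-sided (it only detects clustering, not dispersion). All function names are hypothetical.

```python
import random

def quadrat_counts(points, k):
    """Count points in each cell of a k x k grid over the unit square."""
    counts = [0] * (k * k)
    for x, y in points:
        col = min(int(x * k), k - 1)
        row = min(int(y * k), k - 1)
        counts[row * k + col] += 1
    return counts

def dispersion_statistic(counts):
    """Chi-square-style statistic: sum of (count - mean)^2 / mean."""
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / mean

def csr_test(points, k=4, n_sim=99, seed=0):
    """Monte Carlo test: how often does a random pattern look as clustered?"""
    rng = random.Random(seed)
    observed = dispersion_statistic(quadrat_counts(points, k))
    n = len(points)
    at_least_as_large = 0
    for _ in range(n_sim):
        # Simulate complete spatial randomness: uniform points in the square.
        sim = [(rng.random(), rng.random()) for _ in range(n)]
        if dispersion_statistic(quadrat_counts(sim, k)) >= observed:
            at_least_as_large += 1
    p_value = (1 + at_least_as_large) / (n_sim + 1)
    return observed, p_value

# Hypothetical extreme case: every event falls at the same location.
observed, p_value = csr_test([(0.1, 0.1)] * 100)
```

A small p-value means few simulated random patterns were as unevenly spread across quadrats as the observed one. The result depends on the grid size `k`, which is a choice the analyst has to make.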
Clustering
In many cases, our spatial units are fixed - states, homes, lakes - but we want to know if the characteristics of these objects follow any spatial pattern
First, we need to understand the ‘space’ we are working in
Neighbours
- Contiguity-based Neighbours
- For polygons, contiguity means ‘touching’
Neighbours
- Contiguity-based Neighbours
- W = Spatial Weights Matrix (NxN)
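As a minimal sketch of what W looks like, the code below builds a binary contiguity matrix for a regular grid of cells, where two cells are neighbours if they share an edge. Real polygons would need a geometry library to detect touching borders; the grid and the function names here are illustrative assumptions.

```python
def rook_weights(n_rows, n_cols):
    """Binary N x N weights matrix for a regular grid (N = n_rows * n_cols)."""
    n = n_rows * n_cols
    w = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # Cells sharing an edge (up, down, left, right) are neighbours.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i][rr * n_cols + cc] = 1
    return w

def row_standardise(w):
    """Divide each row by its sum so every unit's weights add to 1."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

w = rook_weights(2, 2)       # four cells, each touching two others
w_std = row_standardise(w)   # each neighbour gets weight 0.5
```

Row-standardising is optional; it makes each unit's neighbours count equally in total, regardless of how many neighbours the unit has.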
Neighbours
- Distance-based Neighbours
- Usually used to identify ‘nearest-neighbour’ or \(K\) nearest neighbours
- Might not be ‘close’, but is ‘closer’
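A minimal sketch of distance-based neighbours, assuming units are represented by point coordinates (for example, polygon centroids). The function name and coordinates are hypothetical.

```python
import math

def k_nearest_neighbours(points, k):
    """For each point, return the indices of its k nearest other points."""
    neighbours = []
    for i, (xi, yi) in enumerate(points):
        dists = [(math.hypot(xi - xj, yi - yj), j)
                 for j, (xj, yj) in enumerate(points) if j != i]
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours

# The third point is far from everything, but still gets a nearest neighbour.
nearest = k_nearest_neighbours([(0, 0), (1, 0), (10, 0)], k=1)
```

In this example the isolated third point is assigned the second point as its neighbour even though it is 9 units away, and the relationship is not symmetric: the second point's own nearest neighbour is the first point.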
Clustering
- We want to measure how similar neighbouring units are
- The degree of Spatial Autocorrelation
- Again, our benchmark is a spatially random distribution of characteristics across our units
Clustering
- Random is not one end of the scale, with clustering at the other end
- Random data has clusters! Randomly!
- Just not too many
- Random is more like the middle of the scale
- Positive autocorrelation (clustering) on one end
- Negative autocorrelation (dispersion) on the other end
Clustering
- Does this data look clustered?
- How do we prove it?
- How much is it clustered?
Clustering
- We need a measure of spatial autocorrelation (clustering)
- When are two neighbours more likely to have similar characteristics than would be expected at random?
- Moran’s I measure of Spatial Autocorrelation
- -1: Perfect negative autocorrelation
- 0: No autocorrelation at all (in large samples)
- 1: Perfect positive autocorrelation
Clustering
\[ I = \frac{N}{W} \frac{\sum_i \sum_j w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2} \]
where:
\(i\) and \(j\) index units
\(x_i\) and \(x_j\) are the characteristic of interest for units \(i\) and \(j\)
\(w_{ij}\) is the spatial weight between units \(i\) and \(j\)
\(N\) is the total number of units
\(W\) is the sum of the spatial weights
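A minimal sketch of the calculation, assuming W is supplied as a list of lists and using the standard definition of Moran's I, in which the denominator is the sum of squared deviations from the mean. The function name and the toy data are hypothetical.

```python
def morans_i(x, w):
    """Moran's I for values x and an N x N spatial weights matrix w."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    total_w = sum(sum(row) for row in w)
    # Cross-products of deviations, counted only for neighbouring pairs.
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / total_w) * num / den

# Four units in a row: each unit neighbours the units next to it.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]

clustered = morans_i([1, 1, 0, 0], chain)    # similar values sit together
alternating = morans_i([1, 0, 1, 0], chain)  # every neighbour differs
```

With these toy values, the clustered arrangement gives a positive I and the alternating arrangement gives -1, the perfect negative autocorrelation end of the scale.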
Clustering
- Moran’s I:
- We can statistically test whether our value of Moran’s I is higher or lower than we would expect if the characteristic was randomly distributed in space
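One way to run such a test, shown here as an assumption rather than the method used in the course, is a permutation test: shuffle the values across the fixed locations many times and see how often the shuffled Moran's I is at least as large as the observed one. The sketch is one-sided (it tests for positive autocorrelation) and the function names are hypothetical.

```python
import random

def morans_i(x, w):
    """Moran's I for values x and an N x N spatial weights matrix w."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    total_w = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / total_w) * num / den

def moran_permutation_test(x, w, n_perm=999, seed=0):
    """Compare observed I with I from random reassignments of the values."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    shuffled = list(x)
    at_least_as_large = 0
    for _ in range(n_perm):
        # Locations (and w) stay fixed; only the values move.
        rng.shuffle(shuffled)
        if morans_i(shuffled, w) >= observed:
            at_least_as_large += 1
    p_value = (1 + at_least_as_large) / (n_perm + 1)
    return observed, p_value

# Hypothetical data: 20 units in a row, low values on one side, high on the other.
n = 20
chain = [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]
observed, p_value = moran_permutation_test([0] * 10 + [1] * 10, chain)
```

The shuffling keeps the set of values and the neighbour structure unchanged, so the only thing being tested is whether the values are arranged in space more similarly than chance would produce.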
Clustering
- The Moran’s I of Brazilian Presidential voting in 2014:
- I = 0.85
- Expected I = -0.00017
- P-value = 0.00000001
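The expected value reported above is slightly negative rather than exactly zero. Under the null hypothesis of no spatial autocorrelation, the expected value of Moran's I depends only on the number of units \(N\):

\[ E[I] = \frac{-1}{N - 1} \]

With several thousand units this is a very small negative number, and it approaches 0 as \(N\) grows, which is why 0 is described as "no autocorrelation" only in large samples.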
Clustering
- Spatial Autocorrelation is complicated and occurs at different distances
- Not just among neighbours
- We can look at patterns of spatial autocorrelation at multiple scales by using a Variogram
- A variogram shows the average squared difference in characteristics (e.g. vote share) between pairs of units that are distance \(d\) apart, plotted across many distances
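A minimal sketch of an empirical variogram as described above: pairs of units are grouped into distance bins and the squared differences in their values are averaged within each bin. Note that many packages report the semivariogram, which is half of this quantity. The function name, bin settings and data are illustrative assumptions.

```python
import math

def empirical_variogram(points, values, bin_width, n_bins):
    """Average squared difference in values for pairs in each distance bin.

    Bin b covers distances from b * bin_width up to (b + 1) * bin_width.
    Bins with no pairs are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            b = int(d // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Hypothetical units along a line whose values rise steadily with location.
gamma = empirical_variogram([(0, 0), (1, 0), (2, 0), (3, 0)],
                            [0, 1, 2, 3], bin_width=1, n_bins=4)
```

In this example the average squared difference grows with distance, the pattern expected when nearby units are more alike than distant ones.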
Local Clustering
Spatial Segregation
- Sometimes we want to study not a single characteristic but the distribution of multiple (two or more) groups in space
- Racial groups in cities
- Skilled workers
- Vote shares for multiple parties
- We want to know how ‘segregated’ these groups are into separate spatial areas
Spatial Segregation
- One approach is the Spatial Dissimilarity Index
- A measure of evenness vs. clustering
- On average, how different is the composition of each unit’s local neighbourhood to the composition of the entire region as a whole?
- 0: Evenness
- 1: Clustering (segregation)
Spatial Segregation
\[D = \sum_n \sum_i \frac{N_n}{2NI} |t_{ni} - t_i|\]
where:
\(i\) indexes groups
\(n\) indexes neighbourhoods
\(N\) is the total population
\(N_n\) is the population of neighbourhood \(n\)
\(t_i\) is the proportion of group \(i\) in the total population
\(t_{ni}=\frac{L_{ni}}{L_n}\) is the standardized intensity of group \(i\) in neighbourhood \(n\)
\(I = \sum_i (t_i)(1-t_i)\)
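A minimal sketch of the formula, under the simplifying assumption that each unit's neighbourhood is just the unit itself, so that \(t_{ni}\) is the group's share of the unit's own population. A fully spatial version would instead compute \(t_{ni}\) from a local environment that also draws on nearby units. The function name and the population counts are hypothetical.

```python
def dissimilarity(counts):
    """Multigroup dissimilarity index.

    counts[n][i] is the population of group i in unit n.
    Assumes each unit is its own neighbourhood.
    """
    n_groups = len(counts[0])
    unit_totals = [sum(row) for row in counts]           # N_n
    total = sum(unit_totals)                             # N
    overall = [sum(row[i] for row in counts) / total     # t_i
               for i in range(n_groups)]
    interaction = sum(t * (1 - t) for t in overall)      # I
    d = 0.0
    for row, unit_total in zip(counts, unit_totals):
        if unit_total == 0:
            continue  # empty units contribute nothing
        for i in range(n_groups):
            local_share = row[i] / unit_total            # t_ni
            d += (unit_total / (2 * total * interaction)
                  * abs(local_share - overall[i]))
    return d

segregated = dissimilarity([[100, 0], [0, 100]])  # groups never share a unit
even = dissimilarity([[50, 50], [50, 50]])        # every unit mirrors the whole
```

The two toy cases reproduce the ends of the scale: complete separation gives 1 and a perfectly even mix gives 0.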
Spatial Segregation
- Segregation (Black/White) in US Cities
- Detroit: 0.867
- New York: 0.843
- Chicago: 0.836
- San Francisco: 0.656
- Jacksonville: 0.371
Spatial Segregation
- Dissimilarity is a global measure
- But we can also measure local dissimilarity
Spatial Segregation
- Spatial Exposure: The average proportion of group \(j\) in the neighbourhood of group \(i\)
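A minimal sketch of an exposure index, again assuming each unit is its own neighbourhood: for every member of group \(i\), take the proportion of group \(j\) in their unit, then average over all members of group \(i\). The function name and the counts are hypothetical.

```python
def exposure(counts, i, j):
    """Average proportion of group j in the units where group i lives.

    counts[n][g] is the population of group g in unit n.
    Assumes each unit is its own neighbourhood.
    """
    total_i = sum(row[i] for row in counts)
    result = 0.0
    for row in counts:
        unit_total = sum(row)
        if unit_total == 0:
            continue  # empty units contribute nothing
        # Share of group i living here, times the local share of group j.
        result += (row[i] / total_i) * (row[j] / unit_total)
    return result

counts = [[80, 20], [0, 100]]
exposure_0_to_1 = exposure(counts, 0, 1)
exposure_1_to_0 = exposure(counts, 1, 0)
```

Unlike dissimilarity, exposure is not symmetric: in this example group 0's exposure to group 1 differs from group 1's exposure to group 0, because most of group 1 lives in a unit with no members of group 0.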
Spatial Segregation
- Segregation is complicated
- It varies a lot depending on the scale at which you assess it
- And how we define each unit’s neighbourhood