Introduction to Spatial Analysis
Day 1 - Concepts and Datasets
Jonathan Phillips
January, 2019
Geography
- What is your favourite sport?
- Do you speak Spanish?
- Do you know who Fofão is?
- How many kisses on the cheek do you greet someone with?
- If you are on your own in a taxi do you sit in the front or back?
- Do you think government policy should allow free migration?
- Where do you live?
Geography
Knowledge and communication depend on where we live
Social norms and customs depend on where we live
Political preferences depend on where we live
Geography
Tobler’s First Law of Geography:
“Everything is related to everything else, but near things are more related than distant things”
Geography
What does ‘near’ mean?
- Concepts of distance:
- Euclidean
- Great Circle
- Manhattan
- Levenshtein
- Mahalanobis
- Driving
- Network
- Minimum-cost
- Genetics
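A few of these distance concepts can be sketched in plain Python (used here purely for illustration). The function names and the mean earth radius of 6371 km are assumptions, and the great-circle version treats the earth as a perfect sphere.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two (x, y) points on a flat plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def manhattan(p, q):
    """Distance when travel is restricted to a rectangular street grid."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def great_circle_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Haversine distance on a sphere; radius_km is an assumed mean earth radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def levenshtein(s, t):
    """Minimum number of single-character edits that turn string s into string t."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

print(euclidean((0, 0), (3, 4)))         # 5.0
print(manhattan((0, 0), (3, 4)))         # 7
print(levenshtein("kitten", "sitting"))  # 3
```

The same pair of points is 5 units apart in a straight line but 7 units apart on a street grid, so which units count as 'near' depends on the distance concept chosen.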
Geography
- What does ‘related’ mean?
- Correlated
- More similar
- More different (e.g. neighbouring areas are given dissimilar dialling codes to reduce misdialling)
- ‘Related’ does not mean one person ‘causes’ a similar effect on another
- It may just be a common response to a similar environment
- But interactions and spillovers are common
Geography
- Locations of ‘Events’ could be ‘near’ to each other
Geography
- Or characteristics of locations could be ‘near’ to each other
Geography
- Multiple characteristics could also be ‘near’ to each other
Geography
- But isn’t the world getting smaller?
- ‘The death of distance’
- Everything is ‘near’ on the internet
- Relevant distances may be changing
- Cost of flights instead of kilometres or hours
- Language and social network instead of proximity to radio tower
- Spatial relationships take place at multiple scales
- I am Welsh, British, European etc.
- The similarities between rural China and rural Russia are greater than the differences
Geography
- Lots of interesting questions are really non-spatial
- We can draw maps of them
- But the conclusion does not depend on the locations of the units
Non-spatial question | Spatial question
How many countries have had cases of Ebola? (11) | Which part of Africa was affected by Ebola? (West and Central)
What is the population of the USA? (~325m) | How many people live West of the Mississippi? (~136m)
Which state in Brazil is richest? (DF) | Where in Brazil are states richest? (Southeast)
Spatial Political Science Questions
- Politics is about “who gets what, when and how” (Lasswell 1936)
- The ‘outcomes’ of politics are spatial
- Northern Brazil is poorer than the South
- A child born just inside the border of North Korea has fewer rights than one born just outside it
- The ‘causes’ of politics are also spatial
- 78% of China’s GDP is located in its cities
- Social networks are spatially concentrated
- Political decision-making is arranged by space
Spatial Political Science Questions
- Types of spatial political analysis
- Identifying clustering
Spatial Political Science Questions
- Types of spatial political analysis
- Correlating spatial relationships
- E.g. Acharya, Blackwell and Sen (2016)
Spatial Political Science Questions
- Types of spatial political analysis
- Measuring how one country affects its neighbours
- E.g. Oberdabernig et al. (2017)
Spatial Political Science Questions
- Types of spatial political analysis
- How natural geography affects politics
Spatial Political Science Questions
- Types of spatial political analysis
- How political borders affect modern politics
Merits of Spatial Analysis
Opportunities:
- Deeper explanations for common outcomes
- Where helps us understand why
- Avoid confounding relationships
- Enabling new inferential methodologies
Limitations:
- Data are not ‘independent’ for statistical analysis
- Data are often aggregated, and the level of aggregation affects our conclusions (Modifiable Areal Unit Problem, Ecological Fallacy)
- The measured length of complex shapes (e.g. coastlines) is not ‘fixed’: it depends on the scale of measurement (fractals)
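The Modifiable Areal Unit Problem can be illustrated with a toy example. The 3x3 grid below is entirely hypothetical: each cell is a neighbourhood, coded 1 if it supports party A. Aggregating the same nine neighbourhoods into three zones in two different ways changes how many zones A appears to win.

```python
# Hypothetical 3x3 grid of neighbourhoods: 1 = supports party A, 0 = does not.
grid = [
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 0],
]

def zones_won(zones):
    """Count zones in which more than half of the units support party A."""
    return sum(1 for zone in zones if sum(zone) > len(zone) / 2)

row_zones = [list(row) for row in grid]        # three west-east strips
col_zones = [list(col) for col in zip(*grid)]  # three north-south strips

units_for_a = sum(sum(row) for row in grid)
print(units_for_a)           # 5 of the 9 neighbourhoods support A
print(zones_won(row_zones))  # 1 zone won when aggregating by row
print(zones_won(col_zones))  # 2 zones won when aggregating by column
```

Nothing about the underlying units changes between the two results; only the zone boundaries do.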
Map Literacy
- Maps are clear and convincing
- Patterns may only be visible when arranged spatially
- If you have spatial data, why put it in a table or a chart?
Map Literacy
- E.g. The population of Acre state by municipality:
No. | GEOCOD | NOME_AC | POP_2010
1 | 1200401 | Rio Branco | 336,038
2 | 1200203 | Cruzeiro do Sul | 78,507
3 | 1200500 | Sena Madureira | 38,029
4 | 1200609 | Tarauacá | 35,590
5 | 1200302 | Feijó | 32,412
6 | 1200104 | Brasiléia | 21,398
7 | 1200450 | Senador Guiomard | 20,179
8 | 1200385 | Plácido de Castro | 17,209
9 | 1200708 | Xapuri | 16,091
10 | 1200336 | Mâncio Lima | 15,206
11 | 1200252 | Epitaciolândia | 15,100
12 | 1200807 | Porto Acre | 14,880
13 | 1200427 | Rodrigues Alves | 14,389
14 | 1200351 | Marechal Thaumaturgo | 14,227
15 | 1200013 | Acrelândia | 12,538
16 | 1200393 | Porto Walter | 9,176
17 | 1200179 | Capixaba | 8,798
18 | 1200138 | Bujari | 8,471
19 | 1200344 | Manoel Urbano | 7,981
20 | 1200328 | Jordão | 6,577
21 | 1200054 | Assis Brasil | 6,072
22 | 1200435 | Santa Rosa do Purus | 4,691
Map Literacy
- But maps still require careful interpretation
- Scale
- Direction
- Indicator
- Mapping values to colours
Map Literacy
- Scale
- Can I walk from The Art Institute of Chicago to Union Station in 10 minutes?
Map Literacy
- Compass
- What’s the best place to view the sunset in the Wirral (UK)?
Map Literacy
- Choosing the Indicator
- The most important!
- What precisely do we want to measure?
Map Literacy
- Mapping values to colours
- Can be manipulated to convey relevant (or misleading!) conclusions
Map Literacy
- Mapping values to colours
- What type of colour scale matches your data?
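For a single ordered quantity (such as population), one common choice is a sequential scale that fades from a light to a dark colour. The sketch below shows the idea with simple linear interpolation; the function name and the two end colours are arbitrary assumptions, and mapping libraries offer ready-made palettes that would normally be used instead.

```python
def sequential_colour(value, vmin, vmax, low=(255, 255, 255), high=(128, 0, 0)):
    """Map a value in [vmin, vmax] to a hex colour between two RGB end colours."""
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp values that fall outside the range
    rgb = [round(a + t * (b - a)) for a, b in zip(low, high)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)

print(sequential_colour(0, 0, 100))    # lightest colour: #ffffff
print(sequential_colour(100, 0, 100))  # darkest colour: #800000
```

Data with a meaningful midpoint (e.g. vote swing) or with unordered categories (e.g. winning party) would call for a diverging or a qualitative scale instead.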
Map Literacy
- Mapping values to colours
- Hard: Choosing break points between categories
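To see why break points matter, the sketch below classifies the Acre municipal populations from the table earlier in these slides into four classes in two ways. The function names are illustrative, and the quantile rule used here (the largest value in each equal-sized group) is only one of several possible definitions.

```python
# POP_2010 for the 22 municipalities of Acre, from the table in these slides.
pops = [336038, 78507, 38029, 35590, 32412, 21398, 20179, 17209, 16091, 15206, 15100,
        14880, 14389, 14227, 12538, 9176, 8798, 8471, 7981, 6577, 6072, 4691]

def equal_interval_breaks(values, k):
    """Break points that split the range min..max into k equally wide classes."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / k
    return [lo + step * i for i in range(1, k)]

def quantile_breaks(values, k):
    """Break points chosen so each class holds roughly the same number of units."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // k - 1] for i in range(1, k)]

def class_counts(values, breaks):
    """Number of units per class, where class i holds values above i of the breaks."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[sum(v > b for b in breaks)] += 1
    return counts

print(class_counts(pops, equal_interval_breaks(pops, 4)))  # [21, 0, 0, 1]
print(class_counts(pops, quantile_breaks(pops, 4)))        # [5, 6, 5, 6]
```

With equal intervals, Rio Branco sits alone in the top class and every other municipality shares one colour; with quantiles, the map shows variation among the smaller municipalities but hides how extreme Rio Branco is.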
Vector vs. Raster Data
- Vector
- Start with a blank page
- Add specific objects (points, lines, polygons) defined by coordinates (x,y)
- The computer stores just the coordinates of the objects
- Non-spatial ‘Attributes’ of each object allow complex analyses
- Raster
- Start with a grid
- Each grid square (pixel) has a value
- The computer stores one value for every grid square (fixed memory size)
- Mostly for ‘continuous’ remote sensing (satellite) images
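The contrast between the two data models can be sketched with plain Python structures. All names, coordinates and values below are made up for illustration; real spatial libraries use richer classes.

```python
# Vector: each object stores its own coordinates plus non-spatial attributes.
vector_layer = [
    {"geometry": ("Point", [(-46.73, -23.56)]),
     "attributes": {"name": "School A", "pupils": 420}},
    {"geometry": ("LineString", [(-46.74, -23.55), (-46.72, -23.57)]),
     "attributes": {"name": "Road 1"}},
    {"geometry": ("Polygon", [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]),
     "attributes": {"name": "District X"}},
]

# Raster: a fixed grid in which every cell holds one value,
# whether or not anything of interest is located there.
raster = [
    [0, 0, 13],
    [2, 0, 0],
]

n_vector_coords = sum(len(obj["geometry"][1]) for obj in vector_layer)
n_raster_cells = sum(len(row) for row in raster)
print(n_vector_coords)  # 8 coordinate pairs across the three objects
print(n_raster_cells)   # 6 cells, fixed by the grid dimensions
```

Storage for the vector layer grows with the number and complexity of objects, while storage for the raster depends only on the size of the grid.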
Types of Vector Data
- The choice may depend on scale
- What type of vector data is a river?
Types of Vector Data
- The attributes we assign to vector objects also vary
Locations in Space
- Geographic Coordinate Systems
- ‘Perfect’ representations of earth in the computer
- Longitude and Latitude define any point on earth
- Distance is ‘Great Circle’ Distance
Locations in Space
- Longitude
- Lines of longitude are perpendicular to the equator (they run North-South)
- They measure the angle East or West of the prime meridian at Greenwich, London
- Latitude
- Lines of latitude are parallel to the equator (they run East-West)
- They measure the angle North or South of the equator
Locations in Space
- Longitude & Latitude can be written in different formats
- DMS (degrees, minutes, seconds): 49°30′00″N, 123°30′00″W
- DM (degrees, decimal minutes): 49°30.0′, -123°30.0′
- Decimal Degrees: 49.5000°, -123.5000°
- But all of these use the same Geographic Coordinate System
- And we ‘always’ use the same one
- WGS-84
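Converting between these formats is simple arithmetic: a minute is 1/60 of a degree and a second is 1/3600. A minimal sketch (the function name is illustrative), using the convention that South and West are negative:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N/S/E/W into signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

lat = dms_to_decimal(49, 30, 0, "N")
lon = dms_to_decimal(123, 30, 0, "W")
print(lat, lon)  # 49.5 -123.5
```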
Locations in Space
- The earth is not a perfect sphere but an oblate spheroid, which is approximated by a ‘datum’ so we get the location correct
- No need to worry about this: WGS-84 includes its own datum
Locations in Space
- But we view maps on flat surfaces: paper or screens
- To produce flat maps we need a Projected Coordinate Reference System
- Translating 3-D locations to 2-D locations
- There are many different ways to do this, just as there are many ways to peel an orange
Locations in Space
- Projections can preserve shape, area or distance, but not all three!
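As one concrete case, the Mercator projection preserves shape but inflates area away from the equator. The sketch below uses the spherical Mercator formulas with a unit-radius earth; the function names are illustrative.

```python
import math

def mercator(lon_deg, lat_deg, radius=1.0):
    """Project longitude/latitude (degrees) to flat x, y using spherical Mercator."""
    x = radius * math.radians(lon_deg)
    y = radius * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

def mercator_area_inflation(lat_deg):
    """Factor by which a small area at this latitude is enlarged relative to the equator."""
    return 1 / math.cos(math.radians(lat_deg)) ** 2

for lat in (0, 30, 60):
    print(lat, round(mercator_area_inflation(lat), 2))
```

At 60 degrees of latitude a small region is drawn about four times larger than an equal-sized region on the equator, which is why high-latitude countries look oversized on Mercator maps.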
Locations in Space
- Coordinate Reference Systems have useful shortcut EPSG codes
- In R, this is all you need
Coordinate Reference System | Type | EPSG code
WGS-84 | Geographic | 4326
Corrego Alegre / UTM zone 23S (Coastal Brazil) | Projected | 22523
Chua / UTM zone 23S (Distrito Federal) | Projected | 4071
Spatial Datasets
- Vector Spatial Datasets
- Coordinates for every object
- Multiple coordinates for lines, polygons
001 | Minas Gerais | -48.77246, -17.773988
002 | Rio de Janeiro | -49.24686, -16.819800
Spatial Datasets
- Vector Spatial Datasets
- Coordinates for every object
- Multiple coordinates for lines, polygons
001 | Minas Gerais | MULTIPOLYGON ((( -48.77246 -17.773988, -48.77252 -17.773970, -48.77266 -17.773990)))
002 | Rio de Janeiro | MULTIPOLYGON ((( -49.24686 -16.819800, -49.24701 -16.819812, -49.24707 -16.819838)))
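The geometry column above is plain text in 'well-known text' (WKT) style, so the coordinates can be pulled out with a regular expression. This is only a sketch: it ignores the nesting of polygons and rings, and the function name is illustrative. A spatial library would normally do the parsing.

```python
import re

wkt = "MULTIPOLYGON ((( -48.77246 -17.773988, -48.77252 -17.773970, -48.77266 -17.773990)))"

def wkt_coordinates(text):
    """Return every 'x y' pair in a WKT string as (lon, lat) floats, ignoring ring structure."""
    pairs = re.findall(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)", text)
    return [(float(x), float(y)) for x, y in pairs]

coords = wkt_coordinates(wkt)
print(len(coords))  # 3
print(coords[0])    # (-48.77246, -17.773988)
```

The slide shows only the first three vertices of each boundary; a complete polygon ring would have many more and would end on the vertex where it started.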
Spatial Datasets
- One single ‘Multipolygon’ can be complicated
- Composed of many distinct polygons
- Polygons can have ‘holes’ in them
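One way to see how the pieces fit together is to compute an area: add up every polygon and subtract its holes. The sketch below uses the shoelace formula on flat (projected) coordinates; the shapes and function names are invented for illustration.

```python
def ring_area(ring):
    """Unsigned area of a closed ring [(x, y), ...] using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def polygon_area(exterior, holes=()):
    """Area of one polygon: the outer ring minus any holes."""
    return ring_area(exterior) - sum(ring_area(h) for h in holes)

def multipolygon_area(polygons):
    """Total area of a list of (exterior, holes) polygons."""
    return sum(polygon_area(ext, holes) for ext, holes in polygons)

mainland = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]     # 4 x 4 square
lake = [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]         # 1 x 1 hole in the mainland
island = [(10, 0), (12, 0), (12, 1), (10, 1), (10, 0)]  # separate 2 x 1 polygon

print(multipolygon_area([(mainland, [lake]), (island, [])]))  # 17.0
```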
Spatial Datasets
- Raster Spatial Datasets
- Coordinates for every data point
Longitude | Latitude | Value
-106.05 | 35.96 | 0
-106.06 | 35.96 | 13
-105.07 | 35.96 | 2
-105.08 | 35.96 | 0
… | … | …
Spatial Datasets
- Historically, vector data has been stored as shapefiles
- Shapefiles split the attribute table, the geometry and the projection into separate files
Data.shp | Geometry details
Data.dbf | Non-spatial attribute data (a table)
Data.shx | Indexing of the geometry to match the table
Data.prj | Details of the projection
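Because a shapefile is really a set of files sharing one base name, a frequent source of errors is copying only the .shp file. A small sketch of a check for the four companion files listed above (the function name is illustrative, and `Data.shp` is the example name from the table):

```python
from pathlib import Path

EXPECTED_EXTENSIONS = (".shp", ".shx", ".dbf", ".prj")

def missing_shapefile_parts(shp_path, extensions=EXPECTED_EXTENSIONS):
    """List the companion-file extensions that are absent next to the given .shp path."""
    base = Path(shp_path)
    return [ext for ext in extensions if not base.with_suffix(ext).exists()]

# Reports which parts of 'Data' are absent from the current folder.
print(missing_shapefile_parts("Data.shp"))
```

An empty list means all four files are present. This check assumes the extensions are lower case, as in the table.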
Spatial Datasets
- Raster data is typically stored as .tiff files
- The same as you get from a camera or scanner
- But with location and projection data so that we know where on earth the image is located
- ‘GeoTiff’ files
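The location data amounts to a rule for turning pixel positions into coordinates. A minimal sketch for a north-up image, assuming we know the coordinates of the top-left corner and the size of one pixel (all parameter names and values here are illustrative, not GeoTIFF field names):

```python
def pixel_to_coords(row, col, origin_x, origin_y, pixel_width, pixel_height):
    """Centre coordinate of a pixel; the origin is the top-left corner, rows count downwards."""
    x = origin_x + (col + 0.5) * pixel_width
    y = origin_y - (row + 0.5) * pixel_height
    return x, y

def coords_to_pixel(x, y, origin_x, origin_y, pixel_width, pixel_height):
    """Row and column of the pixel that contains the coordinate (x, y)."""
    col = int((x - origin_x) // pixel_width)
    row = int((origin_y - y) // pixel_height)
    return row, col

# Hypothetical image: top-left corner at (-106.0, 36.0), pixels of 0.5 x 0.5 degrees.
grid = dict(origin_x=-106.0, origin_y=36.0, pixel_width=0.5, pixel_height=0.5)
print(pixel_to_coords(1, 2, **grid))            # (-104.75, 35.25)
print(coords_to_pixel(-104.75, 35.25, **grid))  # (1, 2)
```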
Georeferencing
- Computers understand locations such as -23.562778, -46.725261
- But what if we have a street address?
Georeferencing
- We can also take an image and georeference it to a map
- We need to ‘pin’ the map to at least two points
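Two pinned points are enough to fix a shift, a rotation and a single uniform scale. The sketch below solves for that transform by treating each point as a complex number. It assumes image coordinates are measured with y increasing upwards and that the image is not mirrored or stretched; the function name and control points are hypothetical.

```python
def fit_two_point_transform(image_pts, map_pts):
    """Shift + rotation + uniform scale from two control points.

    Treats each (x, y) as the complex number x + yi and solves map = a * image + b.
    """
    p1, p2 = (complex(*p) for p in image_pts)
    m1, m2 = (complex(*m) for m in map_pts)
    a = (m2 - m1) / (p2 - p1)  # encodes rotation and scale
    b = m1 - a * p1            # encodes the shift

    def transform(x, y):
        z = a * complex(x, y) + b
        return z.real, z.imag

    return transform

# Two hypothetical pins: image (0, 0) -> map (100, 200), image (10, 0) -> map (100, 220).
to_map = fit_two_point_transform([(0, 0), (10, 0)], [(100, 200), (100, 220)])
print(to_map(0, 10))  # (80.0, 200.0): the image is rotated 90 degrees and doubled in size
```

Additional control points would allow stretching and skewing of the image to be corrected as well.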
Non-Spatial Joins
- Most of our data is non-spatial, but could be made spatial
- Election results
- Death rates
- Welfare payments
- Conflict
- We can make this data spatial if we link it to existing spatial (location) data
- Using common identifiers in both datasets
- Non-spatial joins
Non-Spatial Joins
- Governments publish school performance data
- We know which schools are ‘best’
- But what is the spatial pattern of school performance?
- Better in the city centre or in the suburbs?
- We need a source for the location of the schools
- Perhaps from a separate geographical survey
- Or by georeferencing their addresses
- How do we combine the school performance and location datasets?
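One answer is a non-spatial join on a shared identifier. The sketch below uses made-up school records and a hypothetical `school_id` column; it keeps every school in the performance data and flags those with no known location.

```python
# Hypothetical performance data (non-spatial) and location data (spatial).
performance = [
    {"school_id": "S1", "score": 7.1},
    {"school_id": "S2", "score": 5.4},
    {"school_id": "S4", "score": 6.3},
]
locations = [
    {"school_id": "S1", "lon": -46.70, "lat": -23.55},
    {"school_id": "S2", "lon": -46.65, "lat": -23.60},
    {"school_id": "S3", "lon": -46.60, "lat": -23.50},
]

def left_join(left, right, key):
    """Keep every row of left; add the matching columns from right where the key matches."""
    lookup = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[key])
        merged = dict(row)
        if match is not None:
            merged.update(match)
        merged["matched"] = match is not None
        joined.append(merged)
    return joined

schools = left_join(performance, locations, "school_id")
for school in schools:
    print(school)
```

School S4 has a score but no coordinates, and S3 has coordinates but no score. Checking for unmatched identifiers like these is worth doing before mapping anything.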
Non-Spatial Joins