Latest GIS synthesis test: West Midlands

Further to my previous work on attempting to bring together our multiple datasets into a synthesis (1)(2) and on visualising the results (3), I have now performed a more extensive test of the methodology using all of the data that we have gathered to date for English Heritage’s West Midlands region.

To briefly summarise previous posts, my synthesis methodology consists of creating a tessellation of 1km by 1km grid squares across England, with the SW origin point of each cell being 50m west and 50m south of the 1000m divisions in the OS grid.  We simplify the terminology used in each of our input datasets, run a series of identity queries in ArcGIS to define which cell(s) each object is located in, then collate the data for each cell for all datasets to create a new single synthesis layer.  We can then test these results against various measures such as mean elevation, mean terrain ruggedness index, percentage ground obscuration etc.
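As a rough illustration of the cell assignment and collation steps, here is a minimal Python sketch.  It is not the ArcGIS workflow itself: it assumes each record can be reduced to a single point (whereas an identity query can place one object in several cells), and the function names, dataset names and example coordinates are all hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE = 1000  # metres: 1km by 1km cells
OFFSET = 50       # metres: cell origins sit 50m west and south of the 1000m grid lines


def cell_origin(easting, northing, cell_size=CELL_SIZE, offset=OFFSET):
    """Return the SW corner (in metres) of the cell containing a point."""
    col = math.floor((easting + offset) / cell_size)
    row = math.floor((northing + offset) / cell_size)
    return (col * cell_size - offset, row * cell_size - offset)


def collate(datasets):
    """Build one table: cell origin -> dataset name -> set of simplified terms."""
    synthesis = defaultdict(lambda: defaultdict(set))
    for name, records in datasets.items():
        for easting, northing, term in records:
            synthesis[cell_origin(easting, northing)][name].add(term)
    return synthesis


# Hypothetical example records: (easting, northing, simplified term).
datasets = {
    "HER": [(400120.0, 250480.0, "hillfort"), (400960.0, 250010.0, "villa")],
    "NRHE": [(400300.0, 250700.0, "hillfort")],
}
synthesis = collate(datasets)
```

In this example the two hillfort records, although drawn from different datasets, fall in the same cell, which is the kind of overlap the synthesis layer is intended to bring out.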

The datasets used for this latest synthesis test were:

  • Bronze Age sites as collated by David Yates (2007. Land, power and prestige: Bronze Age field systems in southern England. Oxford: Oxbow).
  • Janice Kinory’s database of Iron Age salt processing sites.
  • National Trust HER records.
  • A layer showing field systems recorded in the AIP database.
  • A layer showing settlements recorded in the AIP database.
  • English Heritage National Record of the Historic Environment data (NRHE).
  • English Heritage MORPH data (for the Marches region).
  • HER data from Birmingham, the Black Country, Coventry, Herefordshire, Sandwell, Shropshire, Solihull, Staffordshire, Warwickshire, and Worcestershire (we hold the data for Worcester, Stoke-on-Trent, and Dudley, but they are not yet in our database, so it was not possible to include them in the test).

The latest version of our terms thesaurus includes a category called ‘Domestic & Civil’, which covers the following types of sites (some of these types appear in other categories as well, e.g. hillforts in ‘Defensive’):

  • Town / small town.
  • Burh.
  • Civitas capital / colonia.
  • Hamlet / village.
  • Vicus.
  • Canabae Legionis.
  • Oppidum.
  • Hillfort.
  • Unenclosed settlement.
  • Enclosed settlement.
  • Linear settlement.
  • Palisaded settlement.
  • Riverside settlement.
  • Dispersed settlement.
  • Nucleated settlement.
  • Road-side settlement.
  • Midden.
  • Timber platform.
  • Manor.
  • Unspecified settlement.
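
To illustrate how a term can sit in more than one category, here is a small Python sketch of a thesaurus lookup.  Only a handful of the terms above are included, and the dictionary layout, function name and example cells are assumptions made for illustration rather than the structure of our actual database.

```python
# Excerpt only: each simplified term maps to the set of categories it belongs to.
THESAURUS = {
    "hillfort": {"Domestic & Civil", "Defensive"},
    "vicus": {"Domestic & Civil"},
    "midden": {"Domestic & Civil"},
    "manor": {"Domestic & Civil"},
}


def cells_in_category(cell_terms, category, thesaurus=THESAURUS):
    """Return the cells holding at least one term that belongs to the category."""
    return {
        cell
        for cell, terms in cell_terms.items()
        if any(category in thesaurus.get(term, set()) for term in terms)
    }


# Hypothetical cells, keyed by SW corner, with the terms recorded in each.
cell_terms = {
    (399950, 249950): {"hillfort"},
    (400950, 249950): {"manor", "midden"},
    (401950, 249950): {"field system"},  # not in this excerpt, so never matched
}

domestic = cells_in_category(cell_terms, "Domestic & Civil")
defensive = cells_in_category(cell_terms, "Defensive")
```

Here the hillfort cell is returned for both categories, while the cell containing only a field system is returned for neither.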

Obviously, this is quite a broad category, approximating to something like “settlement” generally.  We can then map the distributions of grid cells containing sites that fall within this category for each period:

1 domestic and civil (PR)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; unspecified prehistoric.
2 domestic and civil (BA)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Bronze Age.
3 domestic and civil (IA)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Iron Age.
4 domestic and civil (RO)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Roman.
5 domestic and civil (EM)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; early medieval.

These results show several patterns.  Settlement can be seen to follow the major river valleys in many cases, especially along the Warwickshire Avon.  In the Iron Age, there is also a distinct move towards upland settlement, especially in the Marches.  Roman settlement is most dominant in the south east of the region (Warwickshire and Worcestershire).  The cluster of settlement seen in the south of the region in all periods but the early medieval is in the vicinity of Bredon Hill, which was clearly an important area through the later prehistoric and into the Roman period.  The early medieval distribution is particularly interesting due to the large number of sites seen in western Staffordshire: these are largely records of manor sites.  It seems likely that this is an artifact of data collection, perhaps resulting from a researcher or HER officer having a particular research interest in Staffordshire manors (possibly these records originated from Domesday?).

If we want to analyse these trends further, we can collate data according to a series of bands across England from a point off the south-eastern coast moving out north and west.  For the West Midlands, these bands are located like so:

6 bands
South east to north west banding for analytical purposes.

We can then export the results and create graphs in external software:

7 band_elev
Elevation of grid cells showing evidence of sites within the ‘Domestic & Civil’ category, grouped by band and coloured by period.

This graph shows that settlement generally occurred in areas of lower elevation, with most upland occupation being Iron Age in date.  The dominance of Roman data in the (left-hand) south eastern bands is obvious, as is the dominance of early medieval data in bands 40-42 in particular.

8 band_density
Percentage of cells in each band showing evidence of sites within the ‘Domestic & Civil’ category, coloured by period.

This second graph works better, showing the high levels of Roman occupation in bands 33-37, and relatively high levels of early medieval occupation through bands 39-45.  Band 35 contains most of the Warwickshire Avon, providing supporting evidence for my earlier suggestion of its importance based upon visual examination of the map.
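
For clarity, the percentage plotted for each band can be sketched as below.  This is a minimal Python illustration only: it assumes the denominator is every grid cell in the band within the study region, and the field names, period codes and example cells are hypothetical.

```python
from collections import Counter


def band_percentages(cells, period):
    """Percentage of cells in each band that show evidence for the given period."""
    totals = Counter()
    hits = Counter()
    for cell in cells:
        totals[cell["band"]] += 1
        if period in cell["periods"]:
            hits[cell["band"]] += 1
    return {band: 100.0 * hits[band] / totals[band] for band in sorted(totals)}


# Hypothetical cells: the band each falls in and the periods with
# 'Domestic & Civil' evidence (an empty set means no evidence at all).
cells = [
    {"band": 35, "periods": {"RO", "IA"}},
    {"band": 35, "periods": {"RO"}},
    {"band": 35, "periods": set()},
    {"band": 35, "periods": set()},
    {"band": 40, "periods": {"EM"}},
    {"band": 40, "periods": set()},
]

roman = band_percentages(cells, "RO")
early_medieval = band_percentages(cells, "EM")
```

Because each band is normalised by its own cell count, bands of different sizes can be compared directly, which a graph of raw counts would not allow.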

As a final point, the distributions mapped above appear quite coarse when studied at the scale of only the West Midlands, but if looked at on the England-wide scale of the project as a whole (for which this is just a test), the resolution acquires a much finer-grained appearance:

9 villas scale of England
1km x 1km grid cells showing evidence of Roman villas (West Midlands only but scaled to show all of England).

This is just the latest test of my synthesis methodology, but I do think it is proving to be a productive and informative way in which to bring together such a large variety of different datasets into a single analytical environment.  The distributions and graphs shown above are just examples of data that we could map.  We could also map any of our other thesaurus categories or specific thesaurus terms, and we could easily perform analyses against other variables beyond elevation (e.g. terrain ruggedness, ground obscuration etc.) and using other bandings or groupings of cells.

Once we have all data collated for England, I will test the system again with all of that data and see how it works.  Only time will tell.

Chris Green

EDIT 09/10/12:  Updated second graph to correct mathematical error.

Geo-spatial visualization

I recently attended a Workshop on Challenges in Geo-spatial Visualization run by the OeRC at Pembroke College, here in Oxford.  The workshop was organised by Prof Min Chen and his colleagues in order to consider challenging problems in the visual analytics of spatial data and to discuss potential solutions.

Jason Dykes and Jo Wood of the giCentre at City University London presented particularly interesting ideas and visualizations, based around cartograms and visualising spatial ‘flows’.  They also emphasised the critical element of visual salience: this is the concept that (spatially) large objects tend to dominate on a map, whereas (interpretatively) important objects ought to be what our attention is drawn to.

I was also particularly taken with the ideas of Simon Walton (of the OeRC) in regard to the importance of spatial frequency to visual perception (e.g. if we look at a Google Earth image of a city from space, we arguably tend to think that we distinguish between city and countryside based upon colour [i.e. greens vs greys], but we are in fact more influenced in this regard by the complexity of what we are seeing, with countryside being quite plain and cities complex).

Overall, the workshop was very engaging and challenged my thinking on how I might approach the spatial analysis of EngLaId’s datasets.  In particular, I think I am rather too wedded to the conventional map and, as such, have been experimenting with some alternative visualizations since the workshop.

One idea raised in discussion by Jo Wood was that of making graphs where one axis represents space (in some way) and the other an attribute associated with data located within that space.  It occurred to me that one common concept seen in much archaeological interpretation on the scale of England / Britain is that of a difference between the lowland zone of southeastern England and the highlands of the west and north.  Conceptually, we can thus think of this as a trend from south east to north west.

In order to organise our data in such a way as to make it possible to graph data along this axis, I first defined an (arbitrary) point off the south east of England and then created a Euclidean distance raster radiating out from this point:

euclid
Euclidean distance raster from point marked by X.

I then generalised this into 10km width bands and joined the results to the vector grid tessellation that I am using to analyse data on the scale of England:

bands
10km distance bands from point marked by X.
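
The banding step can be sketched in a few lines of Python.  This is an illustration rather than the raster workflow used in ArcGIS: the origin coordinates are invented (the actual point is the X on the map above), and I assume here that each grid cell is represented by a single point such as its centre and that bands are numbered from 0.

```python
import math

BAND_WIDTH = 10_000  # metres: 10km wide bands

# Hypothetical origin off the south east of England (easting, northing in metres).
ORIGIN = (650_000.0, 100_000.0)


def distance_band(easting, northing, origin=ORIGIN, band_width=BAND_WIDTH):
    """Return the index of the distance band containing a point (0 = nearest)."""
    distance = math.hypot(easting - origin[0], northing - origin[1])
    return int(distance // band_width)


# A point 30km west and 40km north of the origin lies 50km away.
example_band = distance_band(620_000.0, 140_000.0)
```

Each band index can then be stored as an attribute of the grid cell, so that any other attribute of the cell can be grouped or plotted by band.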

It is then possible to use this banding to plot other attributes recorded in the grid square layer as a graph, such as mean elevation or terrain ruggedness (TRI).  As our datasets are not yet quite complete, I do not currently have the ability to query these down to subsets based upon archaeological site type / period.  Therefore, I experimented with creating some graphs based upon the entire dataset, thus showing patterns along this SE-NW axis for England as a whole.

crazy graphs 1 - elev
Graph of mean elevation of grid squares: x-axis = distance band; y-axis = mean elevation. Points are individual data; heat map shows clustering. Deformed England below x-axis to show approximate spatial element.
crazy graphs 2 - TRI
Graph of mean TRI of grid squares: x-axis = distance band; y-axis = mean TRI. Points are individual data; heat map shows clustering. Deformed England below x-axis to show approximate spatial element.

These two graphs (created in Veusz from a .csv table exported from ArcGIS) are constructed so that the distance bands run from left (SE) to right (NW), with the mean elevation / TRI being shown on the y-axis (with the TRI, the higher the number, the more rugged the terrain).  The dots show individual records and the ‘heat map’ behind shows the frequency / clustering of those dots.  The deformed England map below each graph is intended to show an approximation of where these bands fall spatially, although obviously this is an imperfect relationship.  These graphs both show how the English landscape becomes more elevated / rugged at its extremes as you head north or west from the south east, albeit with its main clustering remaining at fairly low elevations and at fairly low degrees of ruggedness.
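
The ‘heat map’ layer can be thought of as a count of records per (band, value) bin.  Veusz handles this itself, so the following Python sketch is only meant to show the idea; the bin size and the example records are made up.

```python
from collections import Counter


def bin_counts(records, y_bin_size):
    """Count records per (distance band, y-axis bin) pair."""
    counts = Counter()
    for band, value in records:
        counts[(band, int(value // y_bin_size))] += 1
    return counts


# Hypothetical records: (distance band, mean elevation in metres).
records = [(35, 42.0), (35, 48.5), (35, 120.0), (40, 310.0)]

# With 50m elevation bins, the first two records share the bin (35, 0).
counts = bin_counts(records, y_bin_size=50)
```

Shading each bin by its count gives the clustering effect seen behind the points, with the densest bins marking where most grid squares in a band fall.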

crazy graphs 3 - obsc
Graph of percentage obscuration of ground surface of grid squares: x-axis = distance band; y-axis = percentage. The red colouring is for “human” factors; the grey colouring superimposed includes soil / geological obscuration in addition.

This final graph shows the frequency / clustering of the percentage of grid cells in each band that are obscured from the air for the purposes of aerial photography.  The red shading shows ‘human’ factors only (see previous post), with the greyscale shading also including geological / soil type factors (see this post).  This graph is a little harder to read, so probably requires more thought.

If we compare these three graphs, we can see that the areas of the country most obscured by human activity (which in this instance includes woodland and lakes) cluster in the same bands as the areas of England which are predominantly of low elevation / ruggedness.  This suggests that there is a relationship between landscape morphology and human activity (as we would expect), with humans tending to prefer to settle in areas which are arguably easier to live in (i.e. lower, flatter terrain).

This is all very experimental at the moment and the conclusions reached are not yet particularly relevant to archaeological study, but it does suggest that there is potential for a methodology such as this to elucidate patterns in our data.  Once we are able to query down this grid square dataset to only include cells with particular types of archaeological feature in them, we will be able to create many different graphs such as these and, as such, attempt to quantify the difference / similarity in the distributions of different archaeological features, based upon several attributes (i.e. elevation, TRI, ground obscuration).

Clearly, the banding chosen for this experiment reflects a particular concept of how distributions might vary across England, albeit one that is very common in archaeological interpretation (e.g. the three zones seen by Roberts & Wrathmell [2000 / 2002] and by Jeremy Taylor [2007] in their respective works): it is thus desirable to test different axes across the country to see whether different patterns might emerge.  It would also be possible to do something similar for bands created around all instances of a particular type of site, although this might be argued to be a little too processualist perhaps…

In conclusion, I do think the methodology outlined has potential for studying patterns in our data, but it will require a lot more thought and experimentation to be certain.

Chris Green

References:

Roberts, B. and S. Wrathmell. 2000. An Atlas of Rural Settlement in England. London: English Heritage.

Roberts, B. and S. Wrathmell. 2002. Region and Place. A Study of English Rural Settlement. London: English Heritage.

Taylor, J. 2007. An Atlas of Roman Rural Settlement in England. York: Council for British Archaeology.