Regionality & complexity

This post follows on in part from a post I wrote a couple of years ago on regionality. It will also begin with an apology: the maps presented here will be very difficult for colour blind readers to understand, for which I am sorry. Unfortunately, the technique involved is somewhat limited in terms of control of colour (as it requires three colour channels), so it is not possible (or at least very difficult) to improve the maps to make them more legible for colour blind readers. As such, I would not propose publishing these particular visualisations in any formal setting, but hopefully I can get away with it in a blog post!

Before we get to the maps themselves, I shall describe briefly the mapping technique involved, which is partly inspired by the work of a former colleague of mine at the University of Leicester, Martin Sterry (departmental webpage). Essentially, this method can be used to describe the relationship between three different spatial variables that can be mapped as density surfaces. First, we create density surfaces (KDE here) for each variable and then we combine them into an RGB image using the Composite Bands tool in ArcGIS, with the first layer forming the red channel, the second layer forming the green channel, and the third layer forming the blue channel. However, RGB images (so-called “additive colours”, which work from black by adding light in the red, green, and blue channels) can be rather dark / muddy, so I then converted the images (using “Invert” in Photoshop) to CMY images instead (so-called “subtractive colours”, where one works from white by subtracting light in the cyan, magenta, and yellow channels: this is how colour printers work). To do so cleanly, one must set up one’s map document so that anything one wishes to be white in the final image is black in the map document and vice versa. The same applies to greys, which must be set to their inverse (e.g. a 30R 30G 30B grey as seen below for Wales / Scotland / Man should be set to 225R 225G 225B, being 255-30 in each case). This may sound somewhat complicated but the end result is as follows:

  • Cyan (turquoise) tones represent high values in Channel 1, e.g. “complex farmsteads” in the first example below.
  • Magenta tones represent high values in Channel 2, e.g. “enclosed farmsteads” in the first example below.
  • Yellow tones represent high values in Channel 3, e.g. “unenclosed farmsteads” in the first example below.
  • Blue tones represent high values in Channels 1 and 2.
  • Red tones represent high values in Channels 2 and 3.
  • Green tones represent high values in Channels 1 and 3.
  • Dark grey / black tones represent high values in all three Channels.
  • White or pale tones represent low values in all three Channels.
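For readers who prefer code to menus, the compositing and inversion arithmetic can be sketched in a few lines of Python with NumPy. This is only an illustration of the colour logic listed above, not the ArcGIS / Photoshop workflow used for the maps; the function names and the simple min-max stretch of each surface to 0-255 are assumptions made for the example.

```python
import numpy as np

def scale_to_byte(surface):
    """Linearly stretch a density surface to the 0-255 range (assumed stretch)."""
    surface = np.asarray(surface, dtype=float)
    low, high = surface.min(), surface.max()
    if high == low:
        return np.zeros(surface.shape, dtype=np.uint8)
    return np.round(255 * (surface - low) / (high - low)).astype(np.uint8)

def composite_cmy(channel1, channel2, channel3):
    """Stack three surfaces as R, G, B, then invert each channel (255 - value)."""
    rgb = np.dstack([scale_to_byte(c) for c in (channel1, channel2, channel3)])
    return 255 - rgb

# Toy 1 x 4 "density surfaces": each column is one pixel.
channel1 = np.array([[1.0, 0.0, 1.0, 1.0]])  # e.g. complex farmsteads
channel2 = np.array([[0.0, 0.0, 1.0, 1.0]])  # e.g. enclosed farmsteads
channel3 = np.array([[0.0, 0.0, 1.0, 0.0]])  # e.g. unenclosed farmsteads

image = composite_cmy(channel1, channel2, channel3)
print(image[0, 0])  # high in Channel 1 only  -> cyan
print(image[0, 1])  # low in all three        -> white
print(image[0, 2])  # high in all three       -> black
print(image[0, 3])  # high in Channels 1 and 2 -> blue
```

The same subtraction explains the grey rule: a grey of 225 in the map document becomes 255 - 225 = 30 in the inverted image.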

Here is a close-up of the colour category zones for the first two examples below:


I began by examining the three main categories of Roman farmstead defined by the Roman Rural Settlement Project (RRSP) at Reading, using their excellent data that is available online (Allen et al. 2015). As they defined only three specific categories, this is an ideal dataset to map in this way. For a first attempt, I made three KDE layers using a 10km kernel (or search window) to structure the size of the clusters in the resulting output, then combined them as described above. When plotted against the regions defined based upon variation in their data by the RRSP team (Smith et al. 2016: Chapter 1), we can see that there is a degree of agreement between the regions and the clustering of particular colours:

1 RRSP_psychedelia_v3_inc_regions
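To give a sense of what the kernel size controls, here is a minimal kernel density sketch in Python. It uses a quartic kernel, which is the kernel ArcGIS documents for its Kernel Density tool, but treating the 10km "kernel" as the search radius, and the function name and toy coordinates, are assumptions made for the example.

```python
import numpy as np

def quartic_kde(points, cell_centres_x, cell_centres_y, radius):
    """Density (points per unit area) at each grid cell centre, quartic kernel."""
    gx, gy = np.meshgrid(cell_centres_x, cell_centres_y)
    density = np.zeros(gx.shape)
    peak = 3 / (np.pi * radius ** 2)  # kernel height directly over a point
    for px, py in points:
        # Squared distance from this point, as a fraction of the search radius.
        d2 = ((gx - px) ** 2 + (gy - py) ** 2) / radius ** 2
        inside = d2 < 1
        density[inside] += peak * (1 - d2[inside]) ** 2
    return density

# One site at the origin, evaluated at 0, 5, 10 and 20 km east of it.
xs = np.array([0.0, 5_000.0, 10_000.0, 20_000.0])
ys = np.array([0.0])
density = quartic_kde([(0.0, 0.0)], xs, ys, radius=10_000)
```

A single site contributes nothing beyond the search radius, so a 10km kernel picks out fairly tight clusters, whereas a much wider kernel spreads each site's influence over a region and produces smoother, broader patterns.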

However, there is also clearly considerably more complexity to the data than a simple regional classification might suggest (as the RRSP team would certainly acknowledge, so this is not intended as a criticism in any way). If we construct a new model using a wider kernel (in this case 50km), we can get a really nice sense of regional variation in the data without the need to draw lines on a map:

3 RRSP_psychedelia_v2

There is some interesting structure in this model. For example, one can see a focus on enclosed farmsteads in the north and west, so-called complex farmsteads in parts of the southern and eastern midlands (largely alongside enclosed farmsteads), with quite a different focus on enclosed and unenclosed farmsteads in the south east. The strong peak in enclosed farmsteads in south Yorkshire / the north midlands is also quite striking. Although it relies too much on good colour vision in a reader, I think this model and technique works quite well here, so I decided to apply it to another dataset: our own.

Before we get to the next stage, here is a close-up of the colour category zones for the next two maps (with RO = Roman; PR = Prehistoric; EM = early medieval):

5 CMYK_Englaid

Based on another technique which we published recently (Green et al. 2017), the following two maps are created from a measure of the “complexity” of our datasets: specifically the number of different types of site / monument (based upon our thesaurus of types; see Portal to the Past) per 1x1km square. This measure was calculated for each square for each time period in our database and then density surfaces created for each time period (using a 5km kernel in this instance). A shortcoming of the mapping technique comes into play here: it can only map three categories at once. As such, we had to combine the Bronze Age and Iron Age models into a composite model for later prehistory. The three time period based complexity models were then combined into a single image as previously:

4 complexity_psychedelia_global
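The "complexity" measure itself is simple to express in code. The sketch below counts the number of distinct site types per 1x1km square for each period; the record layout, the example type names, and the use of plain 1000m grid divisions are assumptions made for illustration and do not reproduce our database structure.

```python
from collections import defaultdict

def complexity_per_cell(records, cell_size=1000):
    """Count distinct site types per (period, grid cell)."""
    types_seen = defaultdict(set)
    for easting, northing, period, site_type in records:
        cell = (int(easting // cell_size), int(northing // cell_size))
        types_seen[(period, cell)].add(site_type)
    return {key: len(types) for key, types in types_seen.items()}

# Hypothetical records: (easting, northing, period, simplified type)
records = [
    (450120, 210340, "RO", "villa"),
    (450800, 210900, "RO", "villa"),         # same type, same cell: counted once
    (450500, 210100, "RO", "field system"),
    (450500, 210100, "PR", "hillfort"),
    (451200, 210100, "RO", "road"),
]
complexity = complexity_per_cell(records)
print(complexity[("RO", (450, 210))])  # 2 distinct Roman types in this square
```

The per-cell counts for each period would then be fed into the density surface step, one surface per period.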

There are various nice patterns in this dataset, including the clear strength of prehistory and the early medieval in the south western peninsula, the intense focus on major river valleys (partly due to the large gravel quarry excavations in those areas), and the appearance of Roman roads highlighted in magenta. The Roman period also looks quite dominant generally, with lots of pinks, blues, and reds visible on the map. There is also a very clear difference in intensity between eastern / southern England and northern / western England.

It is possible to lessen the effects of regional and period based variation by constructing a series of larger kernel density surfaces and using these to “correct” for regional variation in the period based models. This produces a new model which reflects complexity on a more local scale. Essentially, the first model can be thought of as a model of “globally” scaled (by which I mean the whole of the dataset, not the whole of the planet) complexity and the new model can be thought of as a model of locally scaled complexity:

6 complexity_psychedelia_local
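The post does not spell out the arithmetic of the "correction", so the following is only one plausible reading: dividing each small-kernel surface by a larger-kernel surface of the same variable, so that values express how a place compares with its wider region. The ratio, the function name, and the toy values are all assumptions made for the example.

```python
import numpy as np

def locally_scaled(local_surface, regional_surface, eps=1e-12):
    """Ratio of a small-kernel surface to a large-kernel surface (assumed method)."""
    local = np.asarray(local_surface, dtype=float)
    regional = np.asarray(regional_surface, dtype=float)
    # Leave cells with no regional signal at zero rather than dividing by zero.
    return np.divide(local, regional, out=np.zeros_like(local), where=regional > eps)

local = np.array([[2.0, 4.0], [0.0, 1.0]])     # e.g. 5km kernel surface
regional = np.array([[4.0, 4.0], [0.0, 2.0]])  # e.g. much larger kernel surface
scaled = locally_scaled(local, regional)
```

Under this reading, a cell scores highly if it is complex relative to its surroundings, even in a region where the record as a whole is sparse.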

This model also shows some interesting patterns. It is much less dominated by single periods in particular regions, with Roman dominance mostly along the Roman roads and Hadrian’s Wall. There are also some nice dark areas, which show high levels of local complexity across all three time periods. These cluster mostly along rivers again or around the large Roman towns, along with a similar cluster in southern Yorkshire / the north Midlands to that seen in the RRSP data.

As with all models of English archaeology, the images presented here represent a very complex data history, being influenced by both where more (and more visible archaeologically) activity took place in the past and where more modern archaeological activity takes place in the present (largely driven by development). They also, as previously noted, come with considerable caveats with regard to legibility, due to the relatively large minority of people with restricted colour vision (c.8-10% of men, and maybe 1% of women). The technique is also restricted by its inability to map more than three variables, but more than three variables would probably overcomplicate matters even if it were possible. However, I hope that this post gives a sense of the variation and complexity in the English archaeological record, locally, regionally, and nationally.

EngLaId is now winding down, having officially ended at Christmas, so this will probably be the last substantive post on technique or data for a while. We will however announce here when any new publications come out, including our main books.

Chris Green


Allen, M., T. Brindle, A. Smith, J.D. Richards, T. Evans, N. Holbrook, M. Fulford, N. Blick. 2015. The Rural Settlement of Roman Britain: an online resource. York: Archaeology Data Service.

Green, C., C. Gosden, A. Cooper, T. Franconi, L. Ten Harkel, Z. Kamash & A. Lowerre. 2017. Understanding the spatial patterning of English archaeology: modelling mass data from England, 1500 BC to AD 1086. Archaeological Journal 174(1): p.244–280.

Smith, A., M. Allen, T. Brindle & M. Fulford. 2016. New Visions of the Countryside of Roman Britain. Volume 1: the Rural Settlement of Britain. Britannia Monograph Series No. 29. London: Society for the Promotion of Roman Studies.

EDIT: Since writing this blog post, Martin Sterry has published a paper on his visualisation techniques, which can be found here:

Effective communication and cartography

I have been thinking a lot recently about using maps as effective tools for visual communication of data. Chen et al. (2014) wrote that visualization of data should be about getting your message across in a time-efficient manner, which Kent (2005) stated depends upon producing aesthetically pleasing results. All maps (being one form of data visualization) are imperfect models of the world (as all models are imperfect) and we must take care to make sure that our maps communicate the messages we wish to express effectively.

Without wishing to get unduly political, I want to work through these ideas using the example of this summer’s “Brexit” vote. Data on the referendum results can be found here and data on UK boundary lines here. There are many (infinite?) different potential ways of visualising this data spatially, but I am going to explain the messages I see in a few examples here.

First up, we have a simple rendering of the results using the district divisions by which the data was originally counted and parcelled up, in which the saturation of the yellows (remain) and blues (leave) show the percentage lead each vote had in districts which each side “won”:


Yellow and blue have been used as that seems to be the convention settled on by most of our media. This map shows which areas felt particularly strongly one way or the other about the question asked and works well in that regard. However, it also gives a somewhat misleading message, as some of the high value districts are of relatively low population density. As an alternative then, we can keep the same division into “leaver” and “remainer” districts, but instead use the shading to show population density:


This map loses the nuance of showing how strong the vote was in either direction, but gains something by showing which districts have more people living in them. Most notable is the stark difference between the two maps for the districts in eastern England around The Wash, which are of low population density (for the UK!), but which felt very strongly that the UK should leave the EU.

We can also look at the result in much more stark terms. The recent High Court decision has increased the likelihood of there being a Parliamentary vote on invoking Article 50, so I wanted to see which way the various constituencies fell in terms of “leave” or “remain”. This is not simple, however, as the results were reported using districts, which often do not match constituencies. As such, I reapportioned the vote from districts between constituencies on the basis of spatial area (e.g. if a constituency covered half a district, it would receive half the votes). This is imperfect, as population density is not uniform across any district, but was the best I could do with the data to hand. The results show that, if Parliament does get to vote on Article 50 and MPs vote as their constituents voted, then “Leave” will comfortably win (Northern Ireland has not been included, but does not have enough MPs to make a difference either way):


All of these maps work reasonably well at expressing one element of the data, but I wanted to come up with a visualization that produced a more complex picture of the results yet without abandoning geographic space (i.e. I did not want to use a cartogram):


This final map reworks the results into hexagonal spatial bins, using the same method as when I reworked the results into constituencies (i.e. assignment by spatial area overlap). Here, the blue / yellow shading has returned to showing the strength of the result, but we can now also see data on population at the same time through the thickness / blackness of the lines around the hexagons. I feel that this map does a pretty good job of showing the distribution of the vote (spatially, strength-wise, and population-wise) whilst still allowing people to locate themselves reasonably well geographically (which would not be the case with a cartogram). Hexagons have been preferred over squares largely due to their visual appeal and because humans have a tendency to see false straight lines in data binned into square-based grids.
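The assignment by spatial area overlap, used for both the constituencies and the hexagons, can be sketched as follows. The overlap areas would normally come from a GIS intersection of the two polygon layers; here they are typed in by hand, and the function name, zone names, and vote counts are hypothetical. The sketch also assumes that the target zones between them cover each district completely.

```python
def reapportion(votes_by_district, overlaps):
    """Split each district's votes between target zones in proportion to area.

    votes_by_district: {district: (leave_votes, remain_votes)}
    overlaps: list of (district, target_zone, overlap_area)
    """
    # Total overlapped area per district (assumed equal to the district's area).
    district_area = {}
    for district, _zone, area in overlaps:
        district_area[district] = district_area.get(district, 0.0) + area

    result = {}
    for district, zone, area in overlaps:
        share = area / district_area[district]
        leave, remain = votes_by_district[district]
        totals = result.setdefault(zone, [0.0, 0.0])
        totals[0] += leave * share
        totals[1] += remain * share
    return result

votes = {"A": (6000, 4000), "B": (1000, 3000)}
overlaps = [("A", "X", 5.0), ("A", "Y", 5.0), ("B", "Y", 8.0)]
zones = reapportion(votes, overlaps)
print(zones["X"])  # half of district A
print(zones["Y"])  # the other half of A plus all of B
```

As noted above, this treats voters as evenly spread within each district, which is the main weakness of the approach.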

Whatever you think of the referendum result, I hope that my worked example has helped to explain how making a map is not always a simple task. Careful thought about audience, message, and data structure needs to go into any visualisation if effective communication is to be achieved. I hope that my final map succeeds in that task!

Chris Green


Chen, M., L. Floridi, and R. Borgo. 2014. What is visualization really for? The Philosophy of Information Quality. Springer Synthese Library Volume 358, 75-93

Kent, A.J. 2005. Aesthetics: a lost cause in cartographic theory? Cartographic Journal 42(2), 182-188

All maps contain Ordnance Survey data (C) Crown copyright and database right 2016

CAA 2014, Paris

I just returned from this year’s Computer Applications and Quantitative Methods in Archaeology (CAA) conference, which was held in Paris last week.  Overall, the conference was a great success, despite a number of teething troubles (particularly with IT support [ironically?]).


I spoke on the Friday morning about using Heisenberg’s Uncertainty Principle as a metaphor for good cartographic practice.  I’ll try to write more about that at a later time.


One particularly impressive visualization of data that I saw was Lost Change, which maps PAS coins and their mint locations.  Another very interesting paper I heard was about MicroPasts (another British Museum backed venture), which is designed to allow archaeologists to access crowdsourced labour and crowdfunded funding.  I also enjoyed Philip Verhagen’s paper, as his project is encountering many of the same data rationalization issues as our own (and he only has to work with a single source database, rather than the 70+ that we are trying to combine).


There is a storify of the conference tweets here:

Next year’s CAA will be in Siena, Italy.  They know how to pick places with good food and good cheer!

Chris Green

Extracting trends (VIII)

This is yet another short post about trend surfaces, following on from previous (I)(II)(III)(IV)(V)(VI)(VII), but with a new dataset.  After this, I think I have probably exhausted the possibilities for getting information out of our data using trend surface modelling, which is best thought of as an initial exploratory technique in any event.

This time, I have been looking at spatial trends present in English Heritage’s Excavation Index, which has been kindly supplied to us by Tim Evans at the ADS, who recently wrote an excellent journal article on the potential of the Index as a research tool.  This is a record of excavations and investigations that have taken place in England since around the mid nineteenth century.  I do not think that it pretends in any way to be comprehensive, but it is another way of filling in gaps in our data, especially for archaeological work that took place before 1990.

In any event, here are the trend surfaces that I have created based upon the Excavation Index (to different scales [the values being records per], but the broad picture is the important thing):

1 eh_excind_trend all
12th power linear trend surface for all data in the Excavation Index.
2 eh_excind_trend englaid
12th power linear trend surface for EngLaID period data in the Excavation Index.
3 eh_excind_trend PR
12th power linear trend surface for unspecified prehistoric data in the Excavation Index.
4 eh_excind_trend BA
12th power linear trend surface for Bronze Age data in the Excavation Index.
5 eh_excind_trend IA
12th power linear trend surface for Iron Age data in the Excavation Index.
6 eh_excind_trend RO
12th power linear trend surface for Roman data in the Excavation Index.
7 eh_excind_trend EM
12th power linear trend surface for early medieval data in the Excavation Index.
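For anyone unfamiliar with the term, a polynomial trend surface of the kind named in the captions above can be sketched as an ordinary least-squares fit of a polynomial in the two map coordinates. This assumes that "linear" refers to the regression type and "12th power" to the polynomial order; the function names and toy data are made up for the example, which uses order 2 to keep it small.

```python
import numpy as np

def design_matrix(x, y, order):
    """All polynomial terms x^i * y^j with i + j <= order."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    columns = [x ** i * y ** j
               for i in range(order + 1)
               for j in range(order + 1 - i)]
    return np.column_stack(columns)

def fit_trend(x, y, z, order):
    """Least-squares coefficients of the trend surface."""
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, y, order), z, rcond=None)
    return coeffs

def predict_trend(coeffs, x, y, order):
    """Evaluate the fitted surface at the given coordinates."""
    return design_matrix(x, y, order) @ coeffs

# Toy data: values on a 5 x 5 grid generated from a known smooth surface.
gx, gy = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 5))
x, y = gx.ravel(), gy.ravel()
z = 2 + 3 * x - y + 0.5 * x * y
coeffs = fit_trend(x, y, z, order=2)
fitted = predict_trend(coeffs, x, y, order=2)
```

An order 12 surface built this way has 91 terms, so in practice the coordinates would need rescaling (e.g. to the 0-1 range used here) to keep the fit numerically stable.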

So, what can we see from looking at these maps?  Overall, the Index shows greatest density of work in the south, particularly around Bristol, London and Kent.  For the EngLaID period as a whole, the pattern is similar, but with the area around Dorset becoming more important.  The unspecified prehistoric is biased towards London and Kent, but there are too few of these records to say that this is particularly meaningful.  The Bronze Age stands out as very distinct from all other periods, with clear peaks in Wessex, eastern Yorkshire and the Peak District: my assumption is that this represents particular research projects undertaken by EH.  The Iron Age shows peaks north of London and stretching down to Kent and towards Wessex.  The Roman trend is similar to the overall pattern for all periods, which is not surprising due to the high numbers of Roman records in the database.  The early medieval peaks around Hampshire, Kent and London, with greater emphasis also on East Anglia than the other periods.

Overall, most of these trends are fairly similar to those seen with previous datasets, at least when considered on a broad brush basis.  The major exception is for the Bronze Age, where the high trend surface peaks previously seen in south west England are no longer as dramatic.  London is also standing out more strongly in the Index than it had in most previous datasets, I think (although this is less pertinent when comparing with the NRHE, as we did not receive NRHE data for London).

Chris Green

Extracting trends (VII)

Further to my previous post, I have now had another go at constructing trend surfaces for the four broad main periods covered by this project.  This time, however, I have filtered out records that are explicitly related only to artefact findspots (for each period).  This was in an attempt to downplay the influence in the previous trends from differential inclusion of PAS material between HERs.  The remaining records should, hopefully, thus primarily relate to sites with other archaeological evidence beyond just one or more artefacts.

Here are the results (to the same attribute scale as previous):

Trend surface for Bronze Age HER data, exc. findspots
Trend surface for Iron Age HER data, exc. findspots
Trend surface for Roman HER data, exc. findspots
Trend surface for early medieval HER data, exc. findspots

Comparing to the previous surfaces, we can see a general reduction in trend peaks, especially over Norfolk and Yorkshire.  The Bronze Age remains similar to previous; the Iron Age also, albeit with much lower peaks; the Roman period shows an increasing strength across Gloucestershire; the early medieval shows the most distinct reductions in eastern regions.

Chris Green

Extracting trends (VI) and national synthesis update

This post follows on from my previous posts on trend surface modelling (I)(II)(III)(IV)(V) and my posts on synthesis of multiple datasets using grid squares (I)(II)(III)(IV).

As our HER dataset is now nearly complete (only Merseyside is expected from now on; North Somerset and Bath & North East Somerset are unable to provide data), we are finally able to begin attempting to study the data which we have gathered on a nationwide scale.  Broad period classifications (Prehistoric; Bronze Age; Iron Age; Roman; early medieval; uncertain; “bad date” [i.e. outside our period]) were calculated for the HER data using a script (based upon the multitude of period designations applied by individual HERs or upon start / end dates) and the data was converted to shapefile format and merged into a single point layer.  This shapefile layer can then be very coarsely queried to produce distributions of records of different periods.
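The date-based part of that classification can be illustrated with a short sketch. It is not the script we used: the period boundaries below are round-number assumptions chosen for the example (negative years are BC), and the handling of the "Prehistoric" category and of HER period labels is left out entirely.

```python
# Assumed, illustrative boundaries: (name, start year, end year); BC is negative.
PERIODS = [
    ("Bronze Age", -1500, -800),
    ("Iron Age", -800, 43),
    ("Roman", 43, 410),
    ("early medieval", 410, 1086),
]

def classify(start, end):
    """Return the broad period(s) that a record's date range overlaps."""
    if start is None or end is None:
        return ["uncertain"]
    # Entirely before the first period or entirely after the last one.
    if end <= PERIODS[0][1] or start >= PERIODS[-1][2]:
        return ["bad date"]
    return [name for name, p_start, p_end in PERIODS
            if start < p_end and end > p_start]

print(classify(-100, 200))   # spans the Iron Age / Roman boundary
print(classify(1200, 1500))  # outside the project's period
print(classify(None, 50))    # missing start date
```

A record whose date range straddles a boundary is returned under both periods, which is one simple way of handling multi-period sites.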

As an initial method for investigating this mass of data (around 400,000 records), I experimented with the production of a few trend surfaces.  First, one for all of the data received:

Trend surface for all EngLaId HER data

I think that there are two major factors at play in this trend.  The first is the general bias in English archaeology towards greater density of (probably) settlement and (certainly) fieldwork in the south and east of the country.  The second (possibly more dominant?) is the variation in recording methods used across the country.  Even where the same software is used, different HERs catalogue their data somewhat differently: some like to split everything up into individual periods and types, others like to collate into multi-period sites; some cast their nets wide to include as much data as possible (e.g. PAS data, MORPH data), others like to only include sites of certain and clear provenance.  This means that the density of data across the country is as much about modern practice as it is about activity in the ancient past.

We can then produce similar surfaces for our broad periods (all to the same numerical scale):

Trend surface for Bronze Age HER data
Trend surface for Iron Age HER data
Trend surface for Roman HER data
Trend surface for early medieval HER data

These four surfaces still reflect to some extent the differences seen in modern practice, but they are closer to the genuine distribution of past activity.  The Bronze Age surface seems to be biased towards uplands and towards Wessex.  The Iron Age surface has a clear bias towards the south east.  The Roman surface is biased towards lowland Britain but also towards the pockets of military activity in the north of England.  The early medieval surface is biased towards the eastern parts of England.

However, the distributions behind all of these trends are still heavily influenced by modern archaeological and CRM practices.  This is only going to get worse when we begin to produce duplication in our dataset by building in English Heritage NRHE data and other datasets.  As discussed in previous posts, one way in which to minimise these modern effects and reduce the influence of duplication is to collate data by 1 by 1 km grid cells.  This requires the application of a thesaurus containing simplified monument terms and the step already undertaken of assigning standardised period terms.  The result is a tessellation of 1 x 1 km grid squares across England recording the presence of different types of archaeological site for each of our broad periods, which we can then query and use to produce maps.

As an example, I constructed a few more trend surfaces, based upon the presence / absence of evidence for sites within our broad “domestic and civil” category.  This category includes: town, burh, civitas capital, colonia, hamlet, village, vicus, canabae legionis, oppidum, hillfort, anything defined using the word “settlement”, midden, timber platform (several of these sub-types belong to more than one broad category).  We can then look at how the underlying trends behind this category changed over time (these trend surfaces are logistic rather than linear, reflecting the probability of a binary presence / absence outcome rather than density):

Trend surface (logistic) for synthesised data: Bronze Age “domestic and civil” category.
Trend surface (logistic) for synthesised data: Iron Age “domestic and civil” category.
Trend surface (logistic) for synthesised data: Roman “domestic and civil” category.
Trend surface (logistic) for synthesised data: early medieval “domestic and civil” category.
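A logistic trend surface differs from a linear one in that it models the probability of presence (1) versus absence (0) in each cell, again as a polynomial in the map coordinates. The sketch below fits one by plain gradient ascent on the log-likelihood; the fitting method, function names, and simulated data are assumptions made for illustration, not the software routine used for the maps.

```python
import numpy as np

def poly_terms(x, y, order):
    """All polynomial terms x^i * y^j with i + j <= order."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack([x ** i * y ** j
                            for i in range(order + 1)
                            for j in range(order + 1 - i)])

def fit_logistic_trend(x, y, present, order=1, steps=5000, rate=0.5):
    """Fit coefficients by gradient ascent on the logistic log-likelihood."""
    terms = poly_terms(x, y, order)
    target = np.asarray(present, dtype=float)
    beta = np.zeros(terms.shape[1])
    for _ in range(steps):
        prob = 1 / (1 + np.exp(-(terms @ beta)))
        beta += rate * terms.T @ (target - prob) / len(target)
    return beta

def predict_probability(beta, x, y, order=1):
    """Probability of presence at the given coordinates."""
    return 1 / (1 + np.exp(-(poly_terms(x, y, order) @ beta)))

# Simulated cells (coordinates rescaled to 0-1): presence more likely eastwards.
rng = np.random.default_rng(0)
x = rng.random(200)
y = rng.random(200)
present = rng.random(200) < x
beta = fit_logistic_trend(x, y, present)
p_west, p_east = predict_probability(beta, [0.1, 0.9], [0.5, 0.5])
```

The output is always between 0 and 1, which is why these surfaces are read as likelihood of settlement evidence rather than as record density.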

There is still some bias in these trend surfaces from the amount of data recorded by different modern archaeological entities (e.g. Northamptonshire is a very “completist” HER, which partially accounts for it showing up so strongly in many of the trend surfaces seen in this blog post), but the patterns are still quite interesting.  The Bronze Age is heavily influenced by the very high number of records present on Dartmoor and Bodmin Moor.  The Iron Age is probably mostly interesting for the low probability area across the “waist” of England from Cheshire to Lincolnshire.  The Roman is pretty much how I would expect it: high likelihood in the lowland zone and around Hadrian’s Wall (this includes “native” sites [whatever that means!] of Roman period date).  The early medieval is fairly flat, showing settlement across the country with greatest probability in central and eastern England (the peaks in Devon possibly need further investigation).

All of this is just a very preliminary, very coarse analysis of what is a very large and detailed set of data.  Some interesting patterns are beginning to emerge, but these may diminish as we continue to work on our material.

Chris Green

Extracting trends (III)

Following again on from my previous two posts (1)(2), I have been experimenting further with constructing trend surfaces, this time for specific sub-sets of my downloaded AIP data for evaluations and post-determination / research results from 1990 to 2010.

First, I removed all of the data for investigations that had no results in terms of dated features, which results in a very similar trend surface to that for all of the data including investigations with no substantive positive evidence:

1 AIP_trend_noNegEvid
12th power trend surface for AIP data (excluding investigations with no positive results)

Then I constructed trend surfaces for the same data but filtered down to investigations producing results for each of EngLaId’s four main broad time periods:

2 AIP_trend_BA
12th power trend surface for AIP data (Bronze Age)
3 AIP_trend_IA
12th power trend surface for AIP data (Iron Age)
4 AIP_trend_RO
12th power trend surface for AIP data (Roman)
5 AIP_trend_EM
12th power trend surface for AIP data (early medieval)

These results all look quite interesting to me, especially as they all vary quite significantly from the overall trend for all periods (albeit this is less the case for the Roman data).  The Bronze Age data shows a very clear bias towards an arc across south-eastern England from Dorset through to Kent and up into parts of East Anglia (the dry bits essentially), with the exception of the South Downs and the Weald.  The Iron Age is very strongly biased towards the counties north of London up to Cambridgeshire, across to north-east Kent and along the south coast.  There is also more of a northern trend than in the Bronze Age, with quite a significant peak in East Yorkshire.  The Roman data is distinctly biased towards London, Kent, the south coast, East Yorkshire and the Severn estuary region.  There is a surprising lack of any significant peak in the Tyneside area, considering the significant peak there in the data for all periods and the presence of Hadrian’s Wall.  For the early medieval, there is a very clear bias towards eastern England around the Fens and towards Kent.

I particularly like these results as they largely differ so significantly from the overall trend for all periods, which suggests that these patterns are more likely to be due to genuine distributions of underlying archaeological data, not just due to patterns of modern fieldwork (albeit this will still remain a very significant factor).  I am not sure any of the results are particularly surprising, interpretively, but they do confirm for me that we can extract spatial patterning from AIP data that is not just wholly biased towards areas of significant modern development.

Chris Green

Latest GIS synthesis test: West Midlands

Further to my previous work on attempting to bring together our multiple datasets into a synthesis (1)(2) and on visualising the results (3), I have now performed a more extensive test of the methodology using all of the data that we have gathered to date for English Heritage’s West Midlands region.

To briefly summarise previous posts, my synthesis methodology consists of creating a tessellation of 1km by 1km grid squares across England, with the SW origin point of each cell being 50m west and 50m south of the 1000m divisions in the OS grid.  We simplify the terminology used in each of our input datasets, run a series of identity queries in ArcGIS to define which cell(s) each object is located in, then collate the data for each cell for all datasets to create a new single synthesis layer.  We can then test these results against various measures such as mean elevation, mean terrain ruggedness index, percentage ground obscuration etc.
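The grid logic can be illustrated outside ArcGIS with a short sketch. It handles point records only (polygons spanning several cells would need a proper spatial overlay), and the function names, dataset contents, and category labels are hypothetical; only the 1km cell size and the 50m offset come from the description above.

```python
import math

CELL = 1000   # cell size in metres
OFFSET = 50   # cell origins sit 50m west and south of the 1000m OS divisions

def cell_origin(easting, northing):
    """South-west corner of the offset 1km cell containing a point."""
    col = math.floor((easting + OFFSET) / CELL)
    row = math.floor((northing + OFFSET) / CELL)
    return (col * CELL - OFFSET, row * CELL - OFFSET)

def synthesise(datasets):
    """Collate (period, category) presence per cell across all datasets."""
    cells = {}
    for records in datasets:
        for easting, northing, period, category in records:
            key = cell_origin(easting, northing)
            cells.setdefault(key, set()).add((period, category))
    return cells

# Two hypothetical sources recording the same Roman settlement slightly apart.
her = [(400960, 250020, "RO", "Domestic & Civil")]
nrhe = [(401000, 250100, "RO", "Domestic & Civil"),
        (400940, 250020, "IA", "Defensive")]
synthesis = synthesise([her, nrhe])
print(synthesis[(400950, 249950)])  # one entry, despite two source records
```

Because each cell only records presence, a site that appears in several source datasets collapses to a single entry, which is how the grid reduces the effect of duplication.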

The datasets used for this latest synthesis test were:

  • Bronze Age sites as collated by David Yates (2007. Land, power and prestige: Bronze Age field systems in southern England. Oxford: Oxbow).
  • Janice Kinory’s database of Iron Age salt processing sites.
  • National Trust HER records.
  • A layer showing field systems recorded in the AIP database.
  • A layer showing settlements recorded in the AIP database.
  • English Heritage National Record of the Historic Environment data (NRHE).
  • English Heritage MORPH data (for the Marches region).
  • HER data from Birmingham, the Black Country, Coventry, Dudley, Herefordshire, Sandwell, Shropshire, Solihull, Staffordshire, Warwickshire, and Worcestershire (Worcester, Stoke-on-Trent, and Dudley are not yet in our database, although we have the data for each, so it was not possible to include them in the test).

The latest version of our terms thesaurus includes a category called ‘Domestic & Civil’, which covers the following types of sites (some of these types appear in other categories as well, e.g. hillforts in ‘Defensive’):

  • Town / small town.
  • Burh.
  • Civitas capital / colonia.
  • Hamlet / village.
  • Vicus.
  • Canabae Legionis.
  • Oppidum.
  • Hillfort.
  • Unenclosed settlement.
  • Enclosed settlement.
  • Linear settlement.
  • Palisaded settlement.
  • Riverside settlement.
  • Dispersed settlement.
  • Nucleated settlement.
  • Road-side settlement.
  • Midden.
  • Timber platform.
  • Manor.
  • Unspecified settlement.

Obviously, this is quite a broad category, approximating to something like “settlement” generally.  We can then map the distributions of grid cells containing sites that fall within this category for each period:

1 domestic and civil (PR)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; unspecified prehistoric.
2 domestic and civil (BA)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Bronze Age.
3 domestic and civil (IA)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Iron Age.
4 domestic and civil (RO)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Roman.
5 domestic and civil (EM)
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; early medieval.

These results show several patterns.  Settlement can be seen to follow the major river valleys in many cases, especially along the Warwickshire Avon.  In the Iron Age, there is in addition a distinct move towards upland settlement, especially in the Marches.  Roman settlement is most dominant in the south east of the region (Warwickshire and Worcestershire).  The cluster of settlement seen in all periods but the early medieval in the south of the region is in the vicinity of Bredon Hill, which was clearly an important area through later prehistory and into the Roman period.  The early medieval distribution is particularly interesting due to the large number of sites seen in western Staffordshire: these are largely records of manor sites.  It seems likely that this is an artifact of data collection, perhaps resulting from a researcher or HER officer having a particular interest in Staffordshire manors (possibly with the records originating from Domesday?).
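The maps above were produced in ArcGIS, but the underlying step (flagging each 1km x 1km cell that contains at least one site of a given category and period) can be sketched in a few lines of Python. This is only a minimal illustration: the record layout, the coordinates, and the function names are hypothetical, and each record carries a single category here, whereas in our thesaurus a type can sit in several categories.

```python
from collections import defaultdict

CELL_SIZE = 1000  # metres; 1km x 1km cells as in the maps above

# Hypothetical records: (easting, northing, thesaurus category, period code).
# The coordinates are invented for illustration only.
records = [
    (395120.0, 240480.0, "Domestic & Civil", "IA"),
    (395870.0, 240010.0, "Domestic & Civil", "IA"),
    (395400.0, 240900.0, "Domestic & Civil", "RO"),
    (401250.0, 262730.0, "Defensive", "IA"),
    (388010.0, 321999.0, "Domestic & Civil", "EM"),
]


def cell_of(easting, northing, size=CELL_SIZE):
    """Return the (column, row) index of the grid cell containing a point."""
    return (int(easting // size), int(northing // size))


def cells_by_period(records, category):
    """Map each period to the set of cells holding at least one matching site."""
    cells = defaultdict(set)
    for easting, northing, cat, period in records:
        if cat == category:
            cells[period].add(cell_of(easting, northing))
    return dict(cells)


presence = cells_by_period(records, "Domestic & Civil")
for period in sorted(presence):
    print(period, sorted(presence[period]))
```

Because the result is a set of cells rather than a count of records, two sites falling in the same cell (as with the two Iron Age records here) register only once, which is what a presence / absence map of grid cells shows.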

If we want to analyse these trends further, we can collate data according to a series of bands across England from a point off the south-eastern coast moving out north and west.  For the West Midlands, these bands are located like so:

6 bands
South east to north west banding for analytical purposes.

We can then export the results and create graphs in external software:

7 band_elev
Elevation of grid cells showing evidence of sites within the ‘Domestic & Civil’ category, grouped by band and coloured by period.

This graph shows that settlement generally occurred in areas of lower elevation, with most upland occupation being Iron Age in date.  The dominance of Roman data in the (left-hand) south eastern bands is obvious, as is the dominance of early medieval data in bands 40-42 in particular.

8 band_density
Percentage of cells in each band showing evidence of sites within the ‘Domestic & Civil’ category, coloured by period.

This second graph works better, showing the high levels of Roman occupation in bands 33-37, and relatively high levels of early medieval occupation through bands 39-45.  Band 35 contains most of the Warwickshire Avon, providing supporting evidence for my earlier suggestion of its importance based upon visual examination of the map.
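For clarity, the percentage plotted in this second graph is the number of cells in a band showing evidence divided by the total number of cells in that band. A minimal sketch of that calculation follows, assuming a hypothetical table of cell identifiers and band numbers; the values are invented and the names are not those used in our database.

```python
from collections import Counter

# Hypothetical inputs: every grid cell has a band number, and a subset of
# cells is flagged as containing evidence for a given period.
cell_band = {"a": 33, "b": 33, "c": 33, "d": 33, "e": 34, "f": 34}
cells_with_evidence = {"RO": {"a", "b", "e"}, "EM": {"f"}}


def percent_by_band(cell_band, evidence_cells):
    """Percentage of all cells in each band that show evidence."""
    totals = Counter(cell_band.values())                     # all cells per band
    hits = Counter(cell_band[c] for c in evidence_cells)     # flagged cells per band
    return {band: 100.0 * hits[band] / totals[band] for band in sorted(totals)}


for period, cells in sorted(cells_with_evidence.items()):
    print(period, percent_by_band(cell_band, cells))
```

Dividing by the total number of cells in each band matters because the bands differ in area, so raw counts alone would favour the larger bands.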

As a final point, the distributions mapped above appear quite coarse when studied at the scale of only the West Midlands, but if looked at on the England-wide scale of the project as a whole (for which this is just a test), the resolution acquires a much finer-grained appearance:

9 villas scale of England
1km x 1km grid cells showing evidence of Roman villas (West Midlands only but scaled to show all of England).

This is just the latest test of my synthesis methodology, but I do think it is proving to be a productive and informative way in which to bring together such a large variety of different datasets into a single analytical environment.  The distributions and graphs shown above are just examples of data that we could map.  We could also map any of our other thesaurus categories or specific thesaurus terms, and we could easily perform analyses against other variables beyond elevation (e.g. terrain ruggedness, ground obscuration etc.) and using other bandings or groupings of cells.

Once we have all data collated for England, I will test the system again with all of that data and see how it works.  Only time will tell.

Chris Green

EDIT 09/10/12:  Updated second graph to correct mathematical error.

Geo-spatial visualization

I recently attended a Workshop on Challenges in Geo-spatial Visualization run by the OeRC at Pembroke College, here in Oxford.  The workshop was organised by Prof Min Chen and his colleagues in order to consider challenging problems in the visual analytics of spatial data and to discuss potential solutions.

Jason Dykes and Jo Wood of the giCentre at City University London presented particularly interesting ideas and visualizations, based around cartograms and visualising spatial ‘flows’.  They also emphasised the critical element of visual salience: this is the concept that (spatially) large objects tend to dominate on a map, whereas (interpretatively) important objects ought to be what our attention is drawn to.

I was also particularly taken with the ideas of Simon Walton (of the OeRC) in regard to the importance of spatial frequency to visual perception (e.g. if we look at a Google Earth image of a city from space, we arguably tend to think that we distinguish between city and countryside based upon colour [i.e. greens vs greys], but we are in fact more influenced in this regard by the complexity of what we are seeing, with countryside being quite plain and cities complex).

Overall, the workshop was very engaging and challenged my thinking on how I might approach the spatial analysis of EngLaId’s datasets.  In particular, I think I am rather too wedded to the conventional map and, as such, have been experimenting with some alternative visualizations since the workshop.

One idea raised in discussion by Jo Wood was that of making graphs where one axis represents space (in some way) and the other an attribute associated with data located within that space.  It occurred to me that one common concept seen in much archaeological interpretation on the scale of England / Britain is that of a difference between the lowland zone of southeastern England and the highlands of the west and north.  Conceptually, we can thus think of this as a trend from south east to north west.

In order to organise our data in such a way as to make it possible to graph data along this axis, I first defined an (arbitrary) point off the south east of England and then created a Euclidean distance raster radiating out from this point:

Euclidean distance raster from point marked by X.

I then generalised this into 10km width bands and joined the results to the vector grid tessellation that I am using to analyse data on the scale of England:

10km distance bands from point marked by X.
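The banding itself was done with a distance raster in ArcGIS, but the same idea can be sketched directly on grid cell centroids: measure the straight-line distance from the origin point and divide by the band width. The origin coordinates below are purely illustrative (the actual point is not given here), and numbering the bands from zero is an assumption of this sketch.

```python
import math

BAND_WIDTH = 10_000  # metres, matching the 10km bands described above

# Hypothetical origin off the south-east coast; illustrative coordinates only.
ORIGIN = (650_000.0, 100_000.0)


def distance_band(easting, northing, origin=ORIGIN, width=BAND_WIDTH):
    """Return the index of the distance band containing a cell centroid.

    Band 0 covers distances from 0 up to (but not including) one band width,
    band 1 the next band width, and so on.
    """
    distance = math.hypot(easting - origin[0], northing - origin[1])
    return int(distance // width)


# A centroid 30km west and 40km north of the origin is 50km away.
print(distance_band(620_000.0, 140_000.0))
```

Working from centroids means that a cell straddling a band boundary is assigned wholly to one band, which is comparable in effect to joining a generalised raster to a vector grid, though not identical to it.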

It is then possible to use this banding to plot other attributes recorded in the grid square layer as a graph, such as mean elevation or terrain ruggedness (TRI).  As our datasets are not yet quite complete, I do not currently have the ability to query these down to subsets based upon archaeological site type / period.  Therefore, I experimented with creating some graphs based upon the entire dataset, thus showing patterns along this SE-NW axis for England as a whole.

crazy graphs 1 - elev
Graph of mean elevation of grid squares: x-axis = distance band; y-axis = mean elevation. Points are individual data; heat map shows clustering. Deformed England below x-axis to show approximate spatial element.
crazy graphs 2 - TRI
Graph of mean TRI of grid squares: x-axis = distance band; y-axis = mean TRI. Points are individual data; heat map shows clustering. Deformed England below x-axis to show approximate spatial element.

These two graphs (created in Veusz from a .csv table exported from ArcGIS) are constructed so that the distance bands run from left (SE) to right (NW), with the mean elevation / TRI being shown on the y-axis (with the TRI, the higher the number, the more rugged the terrain).  The dots show individual records and the ‘heat map’ behind shows the frequency / clustering of those dots.  The deformed England map below each graph is intended to show an approximation of where these bands fall spatially, although obviously this is an imperfect relationship.  These graphs both show how the English landscape becomes more elevated / rugged at its extremes as you head north or west from the south east, albeit with its main clustering remaining at fairly low elevations and at fairly low degrees of ruggedness.
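The ‘heat map’ element is essentially a two-dimensional histogram: the number of grid squares falling within each combination of distance band and elevation (or TRI) interval. The sketch below shows that counting step only, on invented rows and with an arbitrary 50m bin height; it is not a description of how Veusz computes its own shading.

```python
from collections import Counter

# Hypothetical rows as they might appear in an exported table:
# (distance band, mean elevation of the grid square in metres).
rows = [(3, 12.0), (3, 48.0), (3, 51.0), (4, 95.0), (4, 130.0), (4, 140.0)]

ELEV_BIN = 50.0  # metres per elevation bin (an arbitrary choice)


def heat_counts(rows, elev_bin=ELEV_BIN):
    """Count grid squares falling in each (band, elevation-bin) pair."""
    return Counter((band, int(elev // elev_bin)) for band, elev in rows)


counts = heat_counts(rows)
for (band, ebin), n in sorted(counts.items()):
    low = ebin * ELEV_BIN
    print(f"band {band}, {low:.0f}-{low + ELEV_BIN:.0f}m: {n} squares")
```

The choice of bin height changes how smooth or blocky the shading appears, so it is worth trying more than one value before reading too much into any apparent cluster.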

crazy graphs 3 - obsc
Graph of percentage obscuration of ground surface of grid squares: x-axis = distance band; y-axis = percentage. The red colouring is for “human” factors; the grey colouring superimposed includes soil / geological obscuration in addition.

This final graph shows the frequency / clustering of the percentage of grid cells in each band that are obscured from the air for the purposes of aerial photography.  The red shading shows ‘human’ factors only (see previous post), with the greyscale shading also including geological / soil type factors (see this post).  This graph is a little harder to read, so probably requires more thought.

If we compare these three graphs, we can see that the areas of the country most obscured by human activity (which in this instance includes woodland and lakes) cluster in the same bands as the areas of England which are predominantly of low elevation / ruggedness.  This suggests that there is a relationship between landscape morphology and human activity (as we would expect), with humans tending to prefer to settle in areas which are arguably easier to live in (i.e. lower, flatter terrain).

This is all very experimental at the moment and the conclusions reached are not yet particularly relevant to archaeological study, but it does suggest that a methodology such as this has the potential to elucidate patterns in our data.  Once we are able to query down this grid square dataset to only include cells with particular types of archaeological feature in them, we will be able to create many different graphs such as these and thereby attempt to quantify the difference / similarity in the distributions of different archaeological features, based upon several attributes (i.e. elevation, TRI, ground obscuration).

Clearly, the banding chosen for this experiment reflects a particular concept of how distributions might vary across England, albeit one that is very common in archaeological interpretation (e.g. the three zones seen by Roberts & Wrathmell [2000 / 2002] and by Jeremy Taylor [2007] in their respective works): it is thus desirable to test different axes across the country to see whether different patterns might emerge.  It would also be possible to do something similar for bands created around all instances of a particular type of site, although this might be argued to be a little too processualist perhaps…
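As a rough indication of what banding around all instances of a site type might involve, each cell could be assigned to a band according to its distance from the nearest such site, rather than from a single origin. This is a hypothetical sketch only, with invented coordinates and a brute-force search that would need replacing with a spatial index for England-wide data.

```python
import math


def nearest_site_band(cell_xy, sites, width=10_000):
    """Band index of a cell based on distance to its nearest site.

    `sites` is a non-empty list of (easting, northing) pairs; `width` is the
    band width in metres. Bands are numbered from zero (an assumption).
    """
    nearest = min(
        math.hypot(cell_xy[0] - sx, cell_xy[1] - sy) for sx, sy in sites
    )
    return int(nearest // width)


# Two invented sites 100km apart on an arbitrary local grid.
sites = [(0.0, 0.0), (100_000.0, 0.0)]
print(nearest_site_band((30_000.0, 40_000.0), sites))
print(nearest_site_band((95_000.0, 0.0), sites))
```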

In conclusion, I do think the methodology outlined has potential for studying patterns in our data, but it will require a lot more thought and experimentation to be certain.

Chris Green


Roberts, B. and S. Wrathmell. 2000. An Atlas of Rural Settlement in England. London: English Heritage.

Roberts, B. and S. Wrathmell. 2002. Region and Place. A Study of English Rural Settlement. London: English Heritage.

Taylor, J. 2007. An Atlas of Roman Rural Settlement in England. York: Council for British Archaeology.