Regionality & complexity

This post follows on in part from a post I wrote a couple of years ago on regionality. It will also begin with an apology: the maps presented here will be very difficult for colour blind readers to understand, for which I am sorry. Unfortunately, the technique involved is somewhat limited in terms of control of colour (as it requires three colour channels), so it is not possible (or at least very difficult) to improve the maps to make them more legible for colour blind readers. As such, I would not propose publishing these particular visualisations in any formal setting, but hopefully I can get away with it in a blog post!

Before we get to the maps themselves, I shall briefly describe the mapping technique involved, which is partly inspired by the work of a former colleague of mine at the University of Leicester, Martin Sterry (departmental webpage). Essentially, this method can be used to describe the relationship between three different spatial variables that can be mapped as density surfaces. First, we create density surfaces (KDE here) for each variable and then we combine them into an RGB image using the Composite Bands tool in ArcGIS, with the first layer forming the red channel, the second layer forming the green channel, and the third layer forming the blue channel. However, RGB images (so-called “additive colours”, which work from black by adding light in the red, green, and blue channels) can be rather dark / muddy, so I then converted the images (using “Invert” in Photoshop) to CMY images instead (so-called “subtractive colours”, where one works from white by subtracting light in the cyan, magenta, and yellow channels: this is how colour printers work). To do so cleanly, one must set up one’s map document so that anything one wishes to be white in the final image is black in the map document and vice versa. The same applies to greys, which must be set to their inverse (e.g. a 30R 30G 30B grey as seen below for Wales / Scotland / Man should be set to 225R 225G 225B, being 255-30 in each case). This may sound somewhat complicated but the end result is as follows:

  • Cyan (turquoise) tones represent high values in Channel 1, e.g. “complex farmsteads” in the first example below.
  • Magenta tones represent high values in Channel 2, e.g. “enclosed farmsteads” in the first example below.
  • Yellow tones represent high values in Channel 3, e.g. “unenclosed farmsteads” in the first example below.
  • Blue tones represent high values in Channels 1 and 2.
  • Red tones represent high values in Channels 2 and 3.
  • Green tones represent high values in Channels 1 and 3.
  • Dark grey / black tones represent high values in all three Channels.
  • White or pale tones represent low values in all three Channels.
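
For readers who would like to try the same idea outside ArcGIS and Photoshop, the colour logic in the list above can be sketched in a few lines of Python. This is only an illustration, not the workflow I actually used: the function name composite_cmy is hypothetical, and I am assuming each density surface is a 2D array that is simply rescaled by its own maximum before being stacked.

```python
import numpy as np

def composite_cmy(ch1, ch2, ch3):
    """Stack three density surfaces as R, G, B bands, then invert the image."""
    bands = []
    for surface in (ch1, ch2, ch3):
        surface = np.asarray(surface, dtype=float)
        peak = surface.max()
        # Rescale each surface to 0..255 (assumption: scaled by its own maximum);
        # an all-zero surface simply stays at zero.
        scaled = surface / peak * 255.0 if peak > 0 else np.zeros_like(surface)
        bands.append(scaled)
    rgb = np.stack(bands, axis=-1).round().astype(np.uint8)
    # Inverting (255 - value) turns high red into cyan, high green into magenta
    # and high blue into yellow; empty cells become white.
    return 255 - rgb
```

A cell that is high only in Channel 1 comes out as cyan (0, 255, 255), a cell that is high in Channels 1 and 2 comes out as blue (0, 0, 255), and a cell that is low in all three comes out as white.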

Here is a close up of the colour category zones for the first two examples below:


I began by examining the three main categories of Roman farmstead defined by the Roman Rural Settlement Project (RRSP) at Reading, using their excellent data that is available online (Allen et al. 2015). As they defined only three specific categories, this is an ideal dataset to map in this way. For a first attempt, I made three KDE layers using a 10km kernel (or search window) to structure the size of the clusters in the resulting output, then combined them as described above. When plotted against the regions defined based upon variation in their data by the RRSP team (Smith et al. 2016: Chapter 1), we can see that there is a degree of agreement between the regions and the clustering of particular colours:

1 RRSP_psychedelia_v3_inc_regions
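
As a rough illustration of what a kernel density surface with a fixed search radius involves, here is a minimal sketch in Python. It assumes a quartic kernel and projected coordinates in metres; the function name quartic_kde and the brute-force loop are mine, and this is not a reproduction of the ArcGIS tool used for the maps in this post.

```python
import numpy as np

def quartic_kde(points, grid_x, grid_y, radius):
    """Kernel density on a regular grid using a quartic kernel (an assumption).

    points: sequence of (x, y) coordinates in metres.
    grid_x, grid_y: 1D arrays of cell-centre coordinates.
    radius: kernel radius / search window in metres (e.g. 10000 for 10km).
    """
    points = np.asarray(points, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)
    density = np.zeros(gx.shape, dtype=float)
    for px, py in points:
        # Squared distance from this point, expressed in units of the radius.
        d2 = ((gx - px) ** 2 + (gy - py) ** 2) / radius**2
        inside = d2 < 1.0
        density[inside] += (3.0 / np.pi) * (1.0 - d2[inside]) ** 2
    # Dividing by radius squared gives a density per unit area.
    return density / radius**2
```

Widening the radius (say from 10km to 50km) smooths the surface and merges neighbouring clusters, which is the effect exploited in the second model below.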

However, there is also clearly considerably more complexity to the data than a simple regional classification might suggest (as the RRSP team would certainly acknowledge, so this is not intended as a criticism in any way). If we construct a new model using a wider kernel (in this case 50km), we can get a really nice sense of regional variation in the data without the need to draw lines on a map:

3 RRSP_psychedelia_v2

There is some interesting structure in this model. For example, one can see a focus on enclosed farmsteads in the north and west, so-called complex farmsteads in parts of the southern and eastern midlands (largely alongside enclosed farmsteads), with quite a different focus on enclosed and unenclosed farmsteads in the south east. The strong peak in enclosed farmsteads in south Yorkshire / the north midlands is also quite striking. Although it relies too much on good colour vision in a reader, I think this model and technique works quite well here, so I decided to apply it to another dataset: our own.

Before we get to the next stage, here is a close-up of the colour category zones for the next two maps (with RO = Roman; PR = Prehistoric; EM = early medieval):

5 CMYK_Englaid

Based on another technique which we published recently (Green et al. 2017), the following two maps are created from a measure of the “complexity” of our datasets: specifically the number of different types of site / monument (based upon our thesaurus of types; see Portal to the Past) per 1x1km square. This measure was calculated for each square for each time period in our database and then density surfaces created for each time period (using a 5km kernel in this instance). A shortcoming of the mapping technique comes into play here: it can only map three categories at once. As such, we had to combine the Bronze Age and Iron Age models into a composite model for later prehistory. The three time period based complexity models were then combined into a single image as previously:
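
The complexity measure itself is simple to express in code. The sketch below counts the number of distinct types falling in each 1x1km square; the record structure (easting, northing, type label) and the function name complexity_per_square are assumptions for the sake of the example and do not reflect the layout of our database.

```python
from collections import defaultdict

def complexity_per_square(records, cell_size=1000):
    """Count distinct site / monument types per grid square.

    records: iterable of (easting, northing, type_label), coordinates in metres.
    Returns a dict mapping (column, row) of each square to its number of types.
    """
    types_in_cell = defaultdict(set)
    for easting, northing, type_label in records:
        cell = (int(easting // cell_size), int(northing // cell_size))
        # A set ignores repeats, so ten enclosures still count as one type.
        types_in_cell[cell].add(type_label)
    return {cell: len(types) for cell, types in types_in_cell.items()}
```

Running this once per time period gives the per-period counts from which the density surfaces are then built.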

4 complexity_psychedelia_global

There are various nice patterns in this dataset, including the clear strength of prehistory and the early medieval in the south western peninsula, the intense focus on major river valleys (partly due to the large gravel quarry excavations in those areas), and the appearance of Roman roads highlighted in magenta. The Roman period also looks quite dominant generally, with lots of pinks, blues, and reds visible on the map. There is also a very clear difference in intensity between eastern / southern England and northern / western England.

It is possible to lessen the effects of regional and period based variation, by constructing a series of larger kernel density surfaces and using these to “correct” for regional variation in the period based models. This produces a new model which reflects complexity on a more local scale. Essentially, the first model can be thought of as a model of “globally” scaled (by which I mean the whole of the dataset, not the whole of the planet) complexity and the new model can be thought of as a model of locally scaled complexity:

6 complexity_psychedelia_local
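
One possible reading of this “correction” is a simple ratio between a small-kernel surface and a large-kernel surface for the same period. The sketch below shows that reading only; the division, the eps threshold, and the function name locally_scaled are all assumptions, and the model shown above may have been built differently.

```python
import numpy as np

def locally_scaled(fine, broad, eps=1e-9):
    """Express a fine-scale surface relative to its broad regional background."""
    fine = np.asarray(fine, dtype=float)
    broad = np.asarray(broad, dtype=float)
    has_background = broad > eps
    # Avoid dividing by zero where the broad surface is effectively empty.
    safe = np.where(has_background, broad, 1.0)
    return np.where(has_background, fine / safe, 0.0)
```

Under this reading, a cell scores highly when it is complex relative to its wider region, even if its absolute complexity is modest on a national scale.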

This model also shows some interesting patterns. It is much less dominated by single periods in particular regions, with Roman dominance mostly along the Roman roads and Hadrian’s Wall. There are also some nice dark areas, which show high levels of local complexity across all three time periods. These cluster mostly along rivers again or around the large Roman towns, along with a similar cluster in southern Yorkshire / the north Midlands to that seen in the RRSP data.

As with all models of English archaeology, the images presented here represent a very complex data history, being influenced by both where more (and more archaeologically visible) activity took place in the past and where more modern archaeological activity takes place in the present (largely driven by development). They also, as previously noted, come with considerable caveats with regard to legibility, due to the relatively large minority of people with restricted colour vision (c.8-10% of men, and maybe 1% of women). The technique is also restricted by its inability to map more than three variables, but more than three variables would probably overcomplicate matters even if it were possible. However, I hope that this post gives a sense of the variation and complexity in the English archaeological record, locally, regionally, and nationally.

EngLaId is now winding down, having officially ended at Christmas, so this will probably be the last substantive post on technique or data for a while. We will however announce here when any new publications come out, including our main books.

Chris Green


Allen, M., T. Brindle, A. Smith, J.D. Richards, T. Evans, N. Holbrook, M. Fulford, N. Blick. 2015. The Rural Settlement of Roman Britain: an online resource. York: Archaeology Data Service.

Green, C., C. Gosden, A. Cooper, T. Franconi, L. Ten Harkel, Z. Kamash & A. Lowerre. 2017. Understanding the spatial patterning of English archaeology: modelling mass data from England, 1500 BC to AD 1086. Archaeological Journal 174(1): p.244–280.

Smith, A., M. Allen, T. Brindle & M. Fulford. 2016. New Visions of the Countryside of Roman Britain. Volume 1: the Rural Settlement of Britain. Britannia Monograph Series No. 29. London: Society for the Promotion of Roman Studies.

EDIT: Since writing this blog post, Martin Sterry has published a paper on his visualisation techniques, which can be found here:

Pondering regionality

I have recently been pondering the definition of regions, in the sense of carving England (or any country) up into contiguous zones of particular archaeological character. I would suppose that as a method of archaeological enquiry, this probably goes back at least as far as Fox’s division of Britain into “lowland” and “upland” zones along a dividing line running approximately from Dorset to Yorkshire. As a modern practice, I would suggest that recent interest in defining regions probably arises, at least in part, from the influential work of Roberts and Wrathmell (2000).

The reason why I have especially been thinking about this subject of late is due to the way in which two projects contemporary to our own have gone about structuring their reporting of their results. In their final report, currently in press (Rippon et al. 2015), the Fields of Britannia (FoB) project divided the country (in this case being England and Wales) up into a series of regions (made up of groups of bio-geographical “pays”). Similarly, the Roman Rural Settlement Project (RRSP) has also divided the country up into their own set of regions based upon the archaeological character of the excavated evidence found within each. Both of these projects based their regions around conglomerations of Natural England’s “Natural Areas”.

1 regions
Regions defined by other projects

If we compare these various regions on a map against the “Settlement Provinces” defined by Roberts and Wrathmell (R&W), we can see that there are broad similarities but also substantial local differences between the various regions (and provinces) defined. Herein lies the major problem with projects defining their own regions for analysis and reporting: it makes cross-comparison between different projects’ results difficult. For example, the Chilterns and the Berkshire Downs both fall within the south east regions of R&W and FoB, but within RRSP’s central zone: as such, can their respective “central” zones truly be compared? The simplest solution to this would be defining regions based upon modern political boundaries or, say, 100x100km grid squares. However, such an approach would result in regions that are archaeologically and bio-geographically irrelevant, which is very far from ideal (and so not recommended here!).

More fundamentally perhaps, I am also not convinced that archaeological remains (and thus, by implication, past human culture) truly lack variety across such continuous areas of space and change according to such sharp boundaries. I am sure that all of the researchers involved would agree with me on that and there is no doubt that defining regions helps in formulating ideas / arguments and in reporting results. However, I just wonder if there is a better way to structure our space? Some degree of structure is necessary, or all would be chaos and incomprehensible, but could alternative structures be preferable?

2 HiLo model
Another experimental model (HiLo)

As an experiment, I constructed a regional model for England, but one that did not result in continuous regions, but rather fractured zones spread across the whole country. This model was based upon a mixed classification of elevation and terrain ruggedness and resulted in three new zones: a coastal zone (which largely seems to accord with former wetland areas), a lowland zone, and a highland zone (which seems to capture every important range of hills in England). These zones can exist in pockets within one another: they are not contiguous. Although not (by design at least) archaeologically relevant, these zones certainly have a degree of bio-geographic meaning. Furthermore, they would be reproducible by other scholars, assuming I publicised their construction method. As a Warwickshire man, I am particularly taken with the result that my county almost looks like a “natural” division of the country!

3 regions vs HiLo
Other regions against HiLo

If we compare these three “HiLo” zones (named for Oxford’s infamous Jamaican inn) against the regions of the other projects we can again see some similarities between the borders of my zones and those of the other projects, but again with substantial local differences. Obviously, if we were to use my HiLo regions for reporting on our project, we would just end up compounding the problem of difficulty of comparison, but the experiment remains of interest.

Elevation: Roberts and Wrathmell
Elevation: Fields of Britannia
Elevation: Roman Rural Settlement Project
Elevation: HiLo

I then tested each set of regions against a series of other datasets: elevation, terrain ruggedness, broad soil types, soil wetness, etc. The graphs above show just the elevation results, but the broad conclusions were similar for all comparisons. Essentially, the FoB and RRSP regions look far more distinct than the R&W provinces. This is hardly surprising as they are of smaller spatial extent: the smaller a sample area, the more distinct from the general “population”/pattern a variable ought to tend to be. This is clearly the case here. However, the HiLo model sits somewhere in between. It only has three zones, but they appear far more clearly differentiated than the R&W provinces. As such, we can conclude that they have greater geographic differentiation, due to their non-contiguous nature, despite being of similarly large extent.

8 regions_thes_all
Regions: archaeological variation (area normalised)
9 regions_thes_pc_all
Regions: archaeological variation (percentage)

As a final test, I then compared each set of regions against our archaeological data, using our coarsest level of thesaurus categories. I did this for each broad time period, but the results shown above are for all EngLaID time periods combined (unspecified prehistoric, Bronze Age, Iron Age, Roman, early medieval). The conclusions, interestingly but perhaps not surprisingly, are very similar to those seen when comparing against the “natural” factors described just above. FoB and RRSP regions look fairly distinct, R&W rather homogeneous (albeit with less dense data in the north west), and HiLo regions are more distinct than R&W but less so than the others. Again, the size of regions remains key (due to the modifiable areal unit problem, or MAUP).

Since undertaking these comparative experiments, I have been reading a recent report by Historic England’s Andrew Lowerre (2015). In the second half of the report, Lowerre uses a mixture of environmental variables alongside Roberts and Wrathmell’s data to define regions using automated clustering techniques. The regions that he produced (across a series of different models), much like my HiLo model, are non-contiguous and possess fuzzy borders. As such, to me at least, they seem much more representative of the data than regions defined manually. I wonder if this type of automated region creation is the way forward if we wish to define regions for our analysis and reporting?

Regions are undoubtedly a useful and intuitive way to divide up space that makes analysis and reporting of results within the context of a project relatively simple and straightforward, both in terms of how a team thinks about their data and in terms of how an audience may digest the same. However, the cross-comparison issue is distinctly problematic when one begins to think beyond the bounds of the results of a single project. We could potentially define a set of regions based on the natural environment that all projects should attempt to use, but we as archaeologists often seem to be naturally inclined to always do our own thing, so I am not sure that would be fruitful. Plus the set of regions defined might not be relevant across multiple time periods.

As such, I do wonder if we ought to avoid the idea of archaeological character regions altogether and just talk about variation in data across space. So long as that data is quantifiable and mappable as continuous fields, then cross-comparison becomes simple: map overlay is after all the most obvious application and strength of GIS, with whole suites of tools and methods dedicated to it.

This post is not intended as a criticism of the methods of other projects, which have undoubtedly proved fruitful and interesting in each case. I just wanted to express why I feel we (as EngLaID) ought to avoid regions in our reporting, especially as a project looking across traditional period boundaries. Others might disagree, but I do feel the cross-comparison issue of bespoke regions is a thorny problem, particularly for those interested in broad syntheses across time and space.

Chris Green


Lowerre, A. 2015. Rural Settlement in England: Analysing Environmental Factors and Regional Variation in Historic Rural Settlement Organisation Using Regression and Clustering Technique. Portsmouth: English Heritage.

Rippon, S., C. Smart and B. Pears. 2015 (in press). The Fields of Britannia: Continuity and Change in the Late Roman and Early Medieval Landscape. Oxford: Oxford University Press.

Roberts, B. and S. Wrathmell. 2000. An Atlas of Rural Settlement in England. London: English Heritage.

More on mapping pottery

Further to my previous post on mapping broad-brush pottery distributions, I was reasonably content with the maps for prehistory and the Roman period (albeit that they had significant shortcomings in terms of temporal currency), but I was not really satisfied with the amount of data I could find for the early medieval period. One particular shortcoming was the lack of data for the earlier half of the period, for which I had been able to discover very little.

After my blogpost, Helena Hamerow found a publication by Catherine Hills which included a map of earlier Anglo-Saxon pottery. A little more investigation showed that this was adapted from Myres 1969 (Map 1). As such, it is clearly a very old source, with no evidence included from the massive post-1990 explosion in developer-funded archaeology, but it seems to remain the most complete national map for the period. I therefore digitised the dots in this map and the Blinkhorn (2012) map referenced previously and turned them into density surfaces. With this simple task complete, it felt like my picture was becoming more useful:

Blinkhorn, Myres (labelled Hills) and Vince overlaid.

However, the fact that the Myres data is quite so out-of-date suggested we ought to find some more modern proxy for the ceramic evidence. Letty suggested we ask Toby Martin, a British Academy postdoc here at the Institute, if we could use his corpus (2011) of Anglo-Saxon furnished graves (C5th to C6th) as just such a proxy, insofar as she felt that the people buried in such a fashion should also be people who use pottery. Toby was happy to oblige and so I created a model using a density plot of his data in addition to the previously mentioned three sources. Because Toby’s material is not actually ceramics but just being used as a proxy, I gave his data a lower weighting in the model. I did the same for Vince (1993) as his zones are rather too vague in extent for my purposes here. So, essentially, all of the four sources were normalised by their maximum value (so that they varied between 0 and 1), and then combined as follows (in two steps, so I could separate out earlier and later):

(Myres + 0.5 Martin) + (Blinkhorn + 0.5 Vince)

Combined model inc. Martin in addition.

However, there is one very clear problem with this model and that is that all of the sources used are explicitly “Anglo-Saxon”. In other words, where are the “Britons”? Toby and I did a bit of investigating and found a very interesting PhD thesis by Imogen Wood (2011) which included three maps of Cornish pottery of the early medieval period. This was exactly what I needed to help colour at least one of the non-“Anglo-Saxon” parts of the map. Rather nicely, Wood’s first map was largely temporally coincident with Myres and Martin and her second map largely temporally coincident with Blinkhorn and Vince. So, I simply expanded the model as follows (again split into two stages):

(Myres + 0.5 Martin + Wood_early) + (Blinkhorn + 0.5 Vince + Wood_late)

Final model also including Wood.
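
For anyone wishing to build something similar, the weighted combination described above can be sketched as follows. This is an illustration only: the function names are hypothetical, each source is assumed to be a density surface on the same grid, and I am assuming here that the two Wood surfaces are normalised in the same way as the other four.

```python
import numpy as np

def normalise(surface):
    """Divide a surface by its maximum so that it varies between 0 and 1."""
    surface = np.asarray(surface, dtype=float)
    peak = surface.max()
    return surface / peak if peak > 0 else np.zeros_like(surface)

def combine_pottery_model(myres, martin, wood_early, blinkhorn, vince, wood_late):
    """Return the earlier, later and combined surfaces."""
    # Martin (a proxy, not ceramics) and Vince (spatially vague) get half weight.
    early = normalise(myres) + 0.5 * normalise(martin) + normalise(wood_early)
    late = normalise(blinkhorn) + 0.5 * normalise(vince) + normalise(wood_late)
    return early, late, early + late
```

Keeping the earlier and later sums separate is what makes it possible to map the two halves of the period independently as well as together.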

The final model is shown above. I also have a couple of maps which split this out into the earlier (C5th-6/7th) and later (C7/8th-9th) parts of the early medieval period, but I feel that the combined model is probably the most robust. Although some of the input data is not perfect due to its age (Myres) or its spatial vagueness (Vince), I feel that this is probably the best model we can currently construct for broad brush early medieval pottery presence / absence, at least without putting in substantially more work.

Any thoughts or disagreements are more than welcome, however!

Chris Green


Blinkhorn, P. 2012. The Ipswich Ware Project: Ceramics, Trade and Society in Middle Saxon England. Medieval Pottery Research Group Occasional papers.

Martin, Toby F. 2011. Identity and the cruciform brooch in early Anglo-Saxon England: an investigation of style, mortuary context, and use. Unpublished PhD thesis, University of Sheffield.

Myres, J.N.L. 1969. Anglo-Saxon pottery and the settlement of England. Oxford: Clarendon Press.

Vince, A. 1993. “Forms, Functions and Manufacturing Techniques of Late Ninth- and Tenth- Century Wheelthrown Pottery in England and their Origins.” In D. Piton (ed.), Travaux du Groupe de Recherches et D’Etudes sur la Céramique dans le Nord – Pas-de-Calais; Actes du Colloque D’Outreau (10 -12 Avril 1992). Numéro hors-série de Nord-Ouest Archéologie, pp.151-64.

Wood, Imogen. 2011. Changing the fabric of life in post-Roman and early medieval Cornwall: an investigation into social change through petrographic analysis. Unpublished PhD thesis: University of Exeter.

Affordances, sites and monuments

In a previous post, I looked at how we might model some of the modern factors that affect the distribution of PAS finds.  I termed these factors “affordances”, as they can be thought of as helping to govern the opportunity for metal detectorists (and others) to be able to discover archaeological material.  I then began thinking about the affordances that affect the opportunity / likelihood of sites and monuments being discovered (if they were there to begin with in the past, of course).

I would argue that the six primary routes by which sites and monuments are discovered in the present day are:

  1. As clusters of spot finds.
  2. As documentary records (or place names, etc.).
  3. Via excavation / other intrusive field evaluation (e.g. watching briefs, test pits, etc.).
  4. Via geophysical survey.
  5. Via aerial photography as crop marks.
  6. Via aerial photography as earthworks.

To these could be added other routeways, such as LiDAR prospection, but these account for quite low percentages of sites and monuments at present.  Of the affordances relating to these six possible discovery pathways, number 1 has been outlined in the post referred to above, number 2 is, I think, impossible to define / quantify, and number 4 is hard to define / quantify.  Therefore, I have focused on numbers 3, 5, and 6.

Excavation / other intrusive field evaluation:

The main affordance in this category is where the opportunity to undertake archaeological excavation (etc.) exists or has existed.  Ideally, this variable would be based on data not explicitly linked to archaeological investigation: i.e. planning decisions on major projects and small projects in sensitive areas (albeit sensitive areas brings in an archaeological element), but this data is very hard to discover and collate.  In particular, planning statistics are inconsistently archived on the UK Government websites and changes in planning authority boundaries over time make collation particularly problematic.  Although I feel that putting the effort into compiling this information (since 1990 ideally, i.e. post PPG16) would be worthwhile, it would be too great an amount of work to undertake for our current project and purposes.

Therefore, I decided to base this affordance variable on data for where excavations (etc.) have taken place.  The most complete source of national data on this is the Excavation Index maintained by English Heritage and hosted by the ADS.  I extracted relevant types (e.g. excavation, watching brief, etc.) for post-1990 events and constructed a KDE plot of the resulting distribution.  I included events resulting in discoveries of all periods in order to minimise any effects of period bias.  The results were very “peaky” in the sense that the density of events in certain cities (in particular London) dwarfed that of the rest of the country.  In order to create my affordance surface, I therefore capped off the variation at +3 standard deviations (c.1.45 events per sq. km.), and divided by this maximum value to produce an affordance surface that varied between 0 (low chance) and 1 (high chance):

EI affordance
Affordance for excavation etc.: white is low, yellow medium, and red is high.
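
The capping and rescaling step can be sketched in Python as below. I am assuming here that the cap is measured as the mean plus three standard deviations of the density surface; the function name affordance_from_density is hypothetical.

```python
import numpy as np

def affordance_from_density(density, n_sd=3.0):
    """Cap a density surface at mean + n_sd standard deviations, then scale to 0..1."""
    density = np.asarray(density, dtype=float)
    # Assumption: "+3 standard deviations" is measured from the mean.
    cap = density.mean() + n_sd * density.std()
    capped = np.minimum(density, cap)
    return capped / cap if cap > 0 else np.zeros_like(density)
```

Any cell at or above the cap (central London, for example) ends up with an affordance of 1, so a few extreme peaks no longer flatten the variation across the rest of the country.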

The results look pretty convincing, with areas of well-known high amounts of fieldwork showing up in the reds (e.g. London, the Upper Thames, Peterborough, Cambridgeshire).  As such, I am pretty happy with this result.

Aerial photography:

For (both types of) aerial photographic prospection, I based the affordance on two (overlapping) factors: modern land use and obscuration of the ground surface.  As with the PAS affordance, the land use was taken from LCM 2007 data and the obscuration based upon my earlier EngLaID work using OS and BGS data (1)(2)(3).  To keep it (relatively) simple, “cropmark” affordance was built from arable land with areas obscured by human (buildings, etc.), environmental (water, woodland, etc.) and soil factors (see previous posts) removed, and “earthwork” affordance was built from grazing land with areas obscured by human and environmental factors removed (but not the soil factor).  The results can be plotted onto the same map, as the areas of affordance are mutually exclusive (green = earthwork affordance present, orange = cropmark affordance present):

AP affordance
Affordance for aerial photography: green is good for earthworks, orange for crop marks.

Although there are some issues with this model, in particular the fact that cropmarks can appear on grazing land in dry years, the results look fairly robust.

In combination:

As a final step in the process, I then combined these three affordance patterns based upon the relative percentages of records in our database recording each discovery method as an evidence type.  This results in a composite affordance map for sites and monuments, albeit only based on the types of evidence that it is possible for me to map affordances for:

combined affordance
Combined archaeological affordance for sites and monuments
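
The combination step amounts to a weighted sum, which might be sketched as follows. The function name and the dictionary structure are hypothetical, and the record counts are whatever the database reports for each evidence type.

```python
import numpy as np

def combine_affordances(surfaces, record_counts):
    """Weight each affordance surface by its share of database records and sum.

    surfaces: dict of name -> 2D array scaled 0..1, all on the same grid.
    record_counts: dict of name -> number of records citing that evidence type.
    """
    total = float(sum(record_counts[name] for name in surfaces))
    combined = None
    for name, surface in surfaces.items():
        weight = record_counts[name] / total
        term = weight * np.asarray(surface, dtype=float)
        combined = term if combined is None else combined + term
    return combined
```

Because the weights sum to 1, the combined surface stays within the same 0 to 1 range as its inputs.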

Although the model is clearly imperfect, the results do look intuitively correct, with the more archaeologically dense areas of the country generally showing up in yellow and with clusters of high affordance around historic towns / cities.  The lower affordance levels in the north west, around the Wash and across the Weald are genuinely (relatively) low areas of site / monument density in our database.  Some areas of lower affordance can have higher densities of sites / monuments, which is particularly true of Bronze Age activity in upland areas (especially Dartmoor and the Peak District), but I do not think that this represents a problem.  There will inevitably exist areas with low affordance but high densities of sites (and vice versa) and, in some ways, these are perhaps the areas of most interest (as they are the ones where we can say that there are definitely past high activity levels coming through rather than simply being due to intensity of investigation)?

Chris Green

Extracting trends (VIII)

This is yet another short post about trend surfaces, following on from previous (I)(II)(III)(IV)(V)(VI)(VII), but with a new dataset.  After this, I think I have probably exhausted the possibilities for getting information out of our data using trend surface modelling, which is best thought of as an initial exploratory technique in any event.

This time, I have been looking at spatial trends present in English Heritage’s Excavation Index, which has been kindly supplied to us by Tim Evans at the ADS, who recently wrote an excellent journal article on the potential of the Index as a research tool.  This is a record of excavations and investigations that have taken place in England since around the mid nineteenth century.  I do not think that it pretends in any way to be comprehensive, but it is another way of filling in gaps in our data, especially for archaeological work that took place before 1990.

In any event, here are the trend surfaces that I have created based upon the Excavation Index (to different scales [the values being records per], but the broad picture is the important thing):

1 eh_excind_trend all
12th power linear trend surface for all data in the Excavation Index.
2 eh_excind_trend englaid
12th power linear trend surface for EngLaID period data in the Excavation Index.
3 eh_excind_trend PR
12th power linear trend surface for unspecified prehistoric data in the Excavation Index.
4 eh_excind_trend BA
12th power linear trend surface for Bronze Age data in the Excavation Index.
5 eh_excind_trend IA
12th power linear trend surface for Iron Age data in the Excavation Index.
6 eh_excind_trend RO
12th power linear trend surface for Roman data in the Excavation Index.
7 eh_excind_trend EM
12th power linear trend surface for early medieval data in the Excavation Index.
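
For anyone unfamiliar with trend surfaces, the underlying idea is a polynomial in the x and y coordinates fitted by least squares. The sketch below shows that general technique in Python; it is not the GIS tool used to produce the surfaces above, the function names are hypothetical, and a low order is used as the default because very high orders (such as 12) need rescaled coordinates to stay numerically stable.

```python
import numpy as np

def fit_trend_surface(x, y, z, order=2):
    """Least-squares fit of a polynomial surface z = f(x, y) of the given order."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    # One term x**i * y**j for every combination with i + j <= order.
    powers = [(i, j) for i in range(order + 1) for j in range(order + 1 - i)]
    design = np.column_stack([x**i * y**j for i, j in powers])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    return powers, coeffs

def predict_trend_surface(powers, coeffs, x, y):
    """Evaluate a fitted surface at new coordinates."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return sum(c * x**i * y**j for (i, j), c in zip(powers, coeffs))
```

Here z would be the density of records at each sample location, and the fitted surface is then evaluated across a grid to draw the map.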

So, what can we see from looking at these maps?  Overall, the Index shows greatest density of work in the south, particularly around Bristol, London and Kent.  For the EngLaID period as a whole, the pattern is similar, but with the area around Dorset becoming more important.  The unspecified prehistoric is biased towards London and Kent, but there are too few of these records to say that this is particularly meaningful.  The Bronze Age stands out as very distinct from all other periods, with clear peaks in Wessex, eastern Yorkshire and the Peak District: my assumption is that this represents particular research projects undertaken by EH.  The Iron Age shows peaks north of London and stretching down to Kent and towards Wessex.  The Roman trend is similar to the overall pattern for all periods, which is not surprising due to the high numbers of Roman records in the database.  The early medieval peaks around Hampshire, Kent and London, with greater emphasis also on East Anglia than the other periods.

Overall, most of these trends are fairly similar to those seen with previous datasets, at least when considered on a broad brush basis.  The major exception is for the Bronze Age, where the high trend surface peaks previously seen in south west England are no longer as dramatic.  London is also standing out more strongly in the Index than it had in most previous datasets, I think (although this is less pertinent when comparing with the NRHE, as we did not receive NRHE data for London).

Chris Green

Extracting trends (VII)

Further to my previous post, I have now had another go at constructing trend surfaces for the four broad main periods covered by this project.  This time, however, I have filtered out records that relate explicitly only to artefact findspots (for each period).  This was an attempt to reduce the influence on the previous trends of the differential inclusion of PAS material between HERs.  The remaining records should thus, hopefully, relate primarily to sites with archaeological evidence beyond just one or more artefacts.

Here are the results (to the same attribute scale as previous):

Trend surface for Bronze Age HER data, exc. findspots
Trend surface for Iron Age HER data, exc. findspots
Trend surface for Roman HER data, exc. findspots
Trend surface for early medieval HER data, exc. findspots

Compared with the previous surfaces, we can see a general reduction in trend peaks, especially over Norfolk and Yorkshire.  The Bronze Age surface remains similar to the previous one; the Iron Age does too, albeit with much lower peaks; the Roman period shows increased strength across Gloucestershire; the early medieval shows the most distinct reductions in eastern regions.

Chris Green

Extracting trends (VI) and national synthesis update

This post follows on from my previous posts on trend surface modelling (I)(II)(III)(IV)(V) and my posts on synthesis of multiple datasets using grid squares (I)(II)(III)(IV).

As our HER dataset is now nearly complete (only Merseyside is expected from now on; North Somerset and Bath & North East Somerset are unable to provide data), we are finally able to begin attempting to study the data which we have gathered on a nationwide scale.  Broad period classifications (Prehistoric; Bronze Age; Iron Age; Roman; early medieval; uncertain; “bad date” [i.e. outside our period]) were calculated for the HER data using a script (based upon the multitude of period designations applied by individual HERs or upon start / end dates) and the data was converted to shapefile format and merged into a single point layer.  This shapefile layer can then be very coarsely queried to produce distributions of records of different periods.
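As a rough illustration of the date-based half of that classification, here is a minimal Python sketch.  The function name, the period boundaries and the overlap rule are all assumptions made for the example, not the project's actual script, which also had to reconcile the many period labels used by individual HERs and which includes an unspecified "Prehistoric" class that is omitted here.

```python
# Illustrative period boundaries in calendar years (negative = BC).
# These cut-offs are assumptions for the sketch, not the project's definitions.
PERIODS = [
    ("Bronze Age", -2500, -800),
    ("Iron Age", -800, 43),
    ("Roman", 43, 410),
    ("early medieval", 410, 1066),
]


def classify_period(start, end):
    """Return the broad period(s) overlapped by a record's date range.

    Each period is treated as the half-open interval [lo, hi); the record's
    own range [start, end] is treated as closed.
    """
    if start is None or end is None:
        return ["uncertain"]
    if start > end:
        start, end = end, start
    matches = [name for name, lo, hi in PERIODS if start < hi and end >= lo]
    # Nothing overlapped: the record falls outside the periods of interest.
    return matches or ["bad date"]
```

A record dated 700 BC to AD 100 would come back as both Iron Age and Roman, which mirrors the way a single HER record can legitimately appear in more than one period distribution.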

As an initial method for investigating this mass of data (around 400,000 records), I experimented with the production of a few trend surfaces.  First, one for all of the data received:

Trend surface for all EngLaId HER data
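For readers unfamiliar with the technique, a linear trend surface is simply a polynomial in x and y fitted to the data by least squares.  The sketch below shows the idea in plain Python for a low polynomial order; the function names are my own for the example, and z stands for whatever quantity is being modelled.  It is not how ArcGIS implements its tool, and a high-order surface such as a 12th power one would need rescaled coordinates and a proper numerical library to remain stable.

```python
def poly_terms(x, y, order):
    """All monomials x**i * y**j with i + j <= order."""
    return [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]


def solve(a, b):
    """Solve the linear system a.c = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (m[r][n] - s) / m[r][r]
    return coeffs


def fit_trend_surface(points, order=2):
    """Fit a polynomial surface to (x, y, z) samples via the normal equations."""
    rows = [poly_terms(x, y, order) for x, y, _ in points]
    zs = [z for _, _, z in points]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(n)]
    coeffs = solve(ata, atz)
    # Return a function that evaluates the fitted surface at any (x, y).
    return lambda x, y: sum(c * t for c, t in zip(coeffs, poly_terms(x, y, order)))
```

Raising the order lets the surface bend more often, which is why a high-order surface can pick out several regional peaks while a first-order surface can only show a single tilt across the country.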

I think that there are two major factors at play in this trend.  The first is the general bias in English archaeology towards greater density of (probably) settlement and (certainly) fieldwork in the south and east of the country.  The second (possibly more dominant?) is the variation in recording methods used across the country.  Even where the same software is used, different HERs catalogue their data somewhat differently: some like to split everything up into individual periods and types, others like to collate into multi-period sites; some cast their nets wide to include as much data as possible (e.g. PAS data, MORPH data), others like to only include sites of certain and clear provenance.  This means that the density of data across the country is as much about modern practice as it is about activity in the ancient past.

We can then produce similar surfaces for our broad periods (all to the same numerical scale):

Trend surface for Bronze Age HER data
Trend surface for Iron Age HER data
Trend surface for Roman HER data
Trend surface for early medieval HER data

These four surfaces still reflect to some extent the differences seen in modern practice, but they are closer to the genuine distribution of past activity.  The Bronze Age surface seems to be biased towards uplands and towards Wessex.  The Iron Age surface has a clear bias towards the south east.  The Roman surface is biased towards lowland Britain but also towards the pockets of military activity in the north of England.  The early medieval surface is biased towards the eastern parts of England.

However, the distributions behind all of these trends are still heavily influenced by modern archaeological and CRM practices.  This is only going to get worse when we begin to produce duplication in our dataset by building in English Heritage NRHE data and other datasets.  As discussed in previous posts, one way in which to minimise these modern effects and reduce the influence of duplication is to collate data by 1 by 1 km grid cells.  This requires the application of a thesaurus containing simplified monument terms and the step already undertaken of assigning standardised period terms.  The result is a tessellation of 1 x 1 km grid squares across England recording the presence of different types of archaeological site for each of our broad periods, which we can then query and use to produce maps.
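The reason this collation suppresses duplication can be shown with a small Python sketch.  The thesaurus entries, the cell identifiers and the function name are all hypothetical and invented for the example; the real thesaurus is far larger.

```python
# Hypothetical thesaurus mapping raw monument terms to simplified terms.
THESAURUS = {
    "roman villa": "villa",
    "villa": "villa",
    "hill fort": "hillfort",
    "hillfort": "hillfort",
}


def collate_presence(records):
    """Collapse records into a set of (cell, period, term) presences.

    Each record is a tuple (cell_id, period, raw_term).  Because the result
    is a set, a site reported by several datasets within the same cell is
    only counted once.
    """
    presence = set()
    for cell_id, period, raw_term in records:
        term = THESAURUS.get(raw_term.strip().lower())
        if term is not None:  # terms missing from the thesaurus are skipped
            presence.add((cell_id, period, term))
    return presence
```

Two records of the same villa, one from an HER and one from the NRHE, would thus produce a single "villa present" entry for that cell and period.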

As an example, I constructed a few more trend surfaces, based upon the presence / absence of evidence for sites within our broad “domestic and civil” category.  This category includes: town, burh, civitas capital, colonia, hamlet, village, vicus, canabae legionis, oppidum, hillfort, anything defined using the word “settlement”, midden, and timber platform (several of these sub-types belong to more than one broad category).  We can then look at how the underlying trends behind this category changed over time (these trend surfaces are logistic rather than linear, as they model the probability of presence in a binary presence / absence relationship rather than density):

Trend surface (logistic) for synthesised data: Bronze Age “domestic and civil” category.
Trend surface (logistic) for synthesised data: Iron Age “domestic and civil” category.
Trend surface (logistic) for synthesised data: Roman “domestic and civil” category.
Trend surface (logistic) for synthesised data: early medieval “domestic and civil” category.

There is still some bias in these trend surfaces from the amount of data recorded by different modern archaeological entities (e.g. Northamptonshire is a very “completist” HER, which partially accounts for it showing up so strongly in many of the trend surfaces seen in this blog post), but the patterns are still quite interesting.  The Bronze Age is heavily influenced by the very high number of records present on Dartmoor and Bodmin Moor.  The Iron Age is probably mostly interesting for the low probability area across the “waist” of England from Cheshire to Lincolnshire.  The Roman is pretty much how I would expect it: high likelihood in the lowland zone and around Hadrian’s Wall (this includes “native” sites [whatever that means!] of Roman period date).  The early medieval is fairly flat, showing settlement across the country with greatest probability in central and eastern England (the peaks in Devon possibly need further investigation).

All of this is just a very preliminary, very coarse analysis of what is a very large and detailed set of data.  Some interesting patterns are beginning to emerge, but these may diminish as we continue to work on our material.

Chris Green

Latest GIS synthesis test: West Midlands

Further to my previous work on attempting to bring together our multiple datasets into a synthesis (1)(2) and on visualising the results (3), I have now performed a more extensive test of the methodology using all of the data that we have gathered to date for English Heritage’s West Midlands region.

To briefly summarise previous posts, my synthesis methodology consists of creating a tessellation of 1km by 1km grid squares across England, with the SW origin point of each cell being 50m west and 50m south of the 1000m divisions in the OS grid.  We simplify the terminology used in each of our input datasets, run a series of identity queries in ArcGIS to define which cell(s) each object is located in, then collate the data for each cell for all datasets to create a new single synthesis layer.  We can then test these results against various measures such as mean elevation, mean terrain ruggedness index, percentage ground obscuration etc.
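The cell that a given point falls in can be worked out arithmetically from its OS grid coordinates.  The following Python sketch shows one way to do this under the offsets described above; the function name is my own, and in practice the assignment was done with identity queries in ArcGIS rather than with code like this.

```python
import math

CELL_SIZE = 1000  # metres
OFFSET = 50       # cells start 50 m west and south of each 1000 m grid line


def cell_origin(easting, northing):
    """Return the SW corner (easting, northing) of the cell containing a point."""
    def snap(value):
        return math.floor((value + OFFSET) / CELL_SIZE) * CELL_SIZE - OFFSET
    return snap(easting), snap(northing)
```

One consequence of the offset is that a point lying exactly on a kilometre grid line (as many coarsely located records do) falls 50m inside a cell rather than on the boundary between two cells.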

The datasets used for this latest synthesis test were:

  • Bronze Age sites as collated by David Yates (2007. Land, power and prestige: Bronze Age field systems in southern England. Oxford: Oxbow).
  • Janice Kinory’s database of Iron Age salt processing sites.
  • National Trust HER records.
  • A layer showing field systems recorded in the AIP database.
  • A layer showing settlements recorded in the AIP database.
  • English Heritage National Record of the Historic Environment data (NRHE).
  • English Heritage MORPH data (for the Marches region).
  • HER data from Birmingham, the Black Country, Coventry, Dudley, Herefordshire, Sandwell, Shropshire, Solihull, Staffordshire, Warwickshire, and Worcestershire (Worcester, Stoke-on-Trent, and Dudley are not yet in our database, although we have the data for each, so it was not possible to include them in the test).

The latest version of our terms thesaurus includes a category called ‘Domestic & Civil’, which covers the following types of sites (some of these types appear in other categories as well, e.g. hillforts in ‘Defensive’):

  • Town / small town.
  • Burh.
  • Civitas capital / colonia.
  • Hamlet / village.
  • Vicus.
  • Canabae Legionis.
  • Oppidum.
  • Hillfort.
  • Unenclosed settlement.
  • Enclosed settlement.
  • Linear settlement.
  • Palisaded settlement.
  • Riverside settlement.
  • Dispersed settlement.
  • Nucleated settlement.
  • Road-side settlement.
  • Midden.
  • Timber platform.
  • Manor.
  • Unspecified settlement.

Obviously, this is quite a broad category, approximating to something like “settlement” generally.  We can then map the distributions of grid cells containing sites that fall within this category for each period:

1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; unspecified prehistoric.
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Bronze Age.
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Iron Age.
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; Roman.
1km x 1km grid cells showing evidence of sites within the ‘Domestic & Civil’ category; early medieval.

These results show several patterns.  Settlement can be seen to follow the major river valleys in many cases, especially along the Warwickshire Avon.  In the Iron Age, there is in addition a distinct move towards upland settlement, especially in the Marches.  Roman settlement is most dominant in the south east of the region (Warwickshire and Worcestershire).  The cluster of settlement seen in all but the early medieval in the south of the region is in the vicinity of Bredon Hill, which was clearly an important area through the later prehistoric and into the Roman period.  The early medieval distribution is particularly interesting due to the large number of sites seen in western Staffordshire: these are largely records of manor sites.  It seems likely that this is an artefact of data collection, perhaps resulting from a researcher or HER officer having a particular interest in Staffordshire manors (possibly these records originated from Domesday?).

If we want to analyse these trends further, we can collate data according to a series of bands across England from a point off the south-eastern coast moving out north and west.  For the West Midlands, these bands are located like so:

South east to north west banding for analytical purposes.

We can then export the results and create graphs in external software:

Elevation of grid cells showing evidence of sites within the ‘Domestic & Civil’ category, grouped by band and coloured by period.

This graph shows that settlement generally occurred in areas of lower elevation, with most upland occupation being Iron Age in date.  The dominance of Roman data in the (left-hand) south eastern bands is obvious, as is the dominance of early medieval data in bands 40-42 in particular.

Percentage of cells in each band showing evidence of sites within the ‘Domestic & Civil’ category, coloured by period.

This second graph works better, showing the high levels of Roman occupation in bands 33-37, and relatively high levels of early medieval occupation through bands 39-45.  Band 35 contains most of the Warwickshire Avon, providing supporting evidence for my earlier suggestion of its importance based upon visual examination of the map.
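The calculation behind a graph like this is straightforward, and can be sketched in a few lines of Python.  The function name and input layout are assumptions for the example; the graph itself was produced from exported tables rather than from this code.

```python
from collections import Counter


def percent_with_evidence(cells):
    """Percentage of cells per band that show evidence.

    cells: iterable of (band, has_evidence) pairs, one pair per grid cell.
    """
    totals = Counter()
    hits = Counter()
    for band, has_evidence in cells:
        totals[band] += 1
        if has_evidence:
            hits[band] += 1
    return {band: 100.0 * hits[band] / totals[band] for band in sorted(totals)}
```

Dividing by the total number of cells in each band matters because the bands vary considerably in area within the region, so raw counts would favour the larger bands.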

As a final point, the distributions mapped above appear quite coarse when studied at the scale of only the West Midlands, but if looked at on the England-wide scale of the project as a whole (for which this is just a test), the resolution acquires a much finer-grained appearance:

1km x 1km grid cells showing evidence of Roman villas (West Midlands only but scaled to show all of England).

This is just the latest test of my synthesis methodology, but I do think it is proving to be a productive and informative way in which to bring together such a large variety of different datasets into a single analytical environment.  The distributions and graphs shown above are just examples of data that we could map.  We could also map any of our other thesaurus categories or specific thesaurus terms, and we could easily perform analyses against other variables beyond elevation (e.g. terrain ruggedness, ground obscuration etc.) and using other bandings or groupings of cells.

Once we have all data collated for England, I will test the system again with all of that data and see how it works.  Only time will tell.

Chris Green

EDIT 09/10/12:  Updated second graph to correct mathematical error.

Geo-spatial visualization

I recently attended a Workshop on Challenges in Geo-spatial Visualization run by the OeRC at Pembroke College, here in Oxford.  The workshop was organised by Prof Min Chen and his colleagues in order to consider challenging problems in the visual analytics of spatial data and to discuss potential solutions.

Jason Dykes and Jo Wood of the giCentre at City University London presented particularly interesting ideas and visualizations, based around cartograms and visualising spatial ‘flows’.  They also emphasised the critical element of visual salience: this is the concept that (spatially) large objects tend to dominate on a map, whereas (interpretatively) important objects ought to be what our attention is drawn to.

I was also particularly taken with the ideas of Simon Walton (of the OeRC) in regard to the importance of spatial frequency to visual perception (e.g. if we look at a Google Earth image of a city from space, we arguably tend to think that we distinguish between city and countryside based upon colour [i.e. greens vs greys], but we are in fact more influenced in this regard by the complexity of what we are seeing, with countryside being quite plain and cities complex).

Overall, the workshop was very engaging and challenged my thinking on how I might approach the spatial analysis of EngLaId’s datasets.  In particular, I think I am rather too wedded to the conventional map and, as such, have been experimenting with some alternative visualizations since the workshop.

One idea raised in discussion by Jo Wood was that of making graphs where one axis represents space (in some way) and the other an attribute associated with data located within that space.  It occurred to me that one common concept seen in much archaeological interpretation on the scale of England / Britain was that of difference between the lowland zone of southeastern England and the highlands of the west and north.  Conceptually, we can thus think of this as a trend from south east to north west.

In order to organise our data in such a way as to make it possible to graph data along this axis, I first defined an (arbitrary) point off the south east of England and then created a Euclidean distance raster radiating out from this point:

Euclidean distance raster from point marked by X.

I then generalised this into 10km width bands and joined the results to the vector grid tessellation that I am using to analyse data on the scale of England:

10km distance bands from point marked by X.
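The same banding can be expressed arithmetically for any individual grid cell.  In this Python sketch the function name and the convention of numbering bands from zero are my own assumptions for the example, and the origin is passed in as a parameter because the point marked X was chosen arbitrarily.

```python
import math

BAND_WIDTH = 10_000  # metres, i.e. 10km bands


def distance_band(easting, northing, origin):
    """Return the index of the 10km band containing a point.

    origin is the (easting, northing) of the reference point; band 0 covers
    distances from 0 up to (but not including) 10km.
    """
    dist = math.hypot(easting - origin[0], northing - origin[1])
    return int(dist // BAND_WIDTH)
```

Using the centre of each grid cell as the input point would give one band per cell, which is essentially what joining the generalised raster to the vector grid achieves.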

It is then possible to use this banding to plot other attributes recorded in the grid square layer as a graph, such as mean elevation or terrain ruggedness (TRI).  As our datasets are not yet quite complete, I do not currently have the ability to query these down to subsets based upon archaeological site type / period.  Therefore, I experimented with creating some graphs based upon the entire dataset, thus showing patterns along this SE-NW axis for England as a whole.

Graph of mean elevation of grid squares: x-axis = distance band; y-axis = mean elevation. Points are individual data; heat map shows clustering. Deformed England below x-axis to show approximate spatial element.
Graph of mean TRI of grid squares: x-axis = distance band; y-axis = mean TRI. Points are individual data; heat map shows clustering. Deformed England below x-axis to show approximate spatial element.

These two graphs (created in Veusz from a .csv table exported from ArcGIS) are constructed so that the distance bands run from left (SE) to right (NW), with the mean elevation / TRI being shown on the y-axis (with the TRI, the higher the number, the more rugged the terrain).  The dots show individual records and the ‘heat map’ behind shows the frequency / clustering of those dots.  The deformed England map below each graph is intended to show an approximation of where these bands fall spatially, although obviously this is an imperfect relationship.  Both graphs show how the English landscape becomes more elevated / rugged at its extremes as you head north or west from the south east, albeit with its main clustering remaining at fairly low elevations and at fairly low degrees of ruggedness.
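The ‘heat map’ element is, in essence, a count of how many grid squares fall into each combination of distance band and elevation class.  A minimal Python sketch of that binning follows; the function name and the 50m class width are assumptions for the example, and this is not a description of how Veusz builds its image.

```python
from collections import Counter


def bin_counts(cells, elev_step=50):
    """Count grid squares per (distance band, elevation class).

    cells: iterable of (band, mean_elevation) pairs, one per grid square.
    Each elevation is rounded down to the nearest multiple of elev_step.
    """
    counts = Counter()
    for band, mean_elev in cells:
        elev_bin = int(mean_elev // elev_step) * elev_step
        counts[(band, elev_bin)] += 1
    return counts
```

The higher the count for a given combination, the "hotter" that part of the graph; the same approach would work for TRI by swapping in a suitable class width.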

Graph of percentage obscuration of ground surface of grid squares: x-axis = distance band; y-axis = percentage. The red colouring is for “human” factors; the grey colouring superimposed includes soil / geological obscuration in addition.

This final graph shows the frequency / clustering of the percentage of grid cells in each band that are obscured from the air for the purposes of aerial photography.  The red shading shows ‘human’ factors only (see previous post), with the greyscale shading also including geological / soil type factors (see this post).  This graph is a little harder to read, so probably requires more thought.

If we compare these three graphs, we can see that the areas of the country most obscured by human activity (which in this instance includes woodland and lakes) cluster in the same bands as the areas of England which are predominantly of low elevation / ruggedness.  This suggests that there is a relationship between landscape morphology and human activity (as we would expect), with humans tending to prefer to settle in areas which are arguably easier to live in (i.e. lower, flatter terrain).

This is all very experimental at the moment and the conclusions reached are not yet particularly relevant to archaeological study, but it does prove that there is potential for a methodology such as this to elucidate patterns in our data.  Once we are able to query down this grid square dataset to only include cells with particular types of archaeological feature in them, we will be able to create many different graphs such as these and, as such, attempt to quantify the difference / similarity in the distributions of different archaeological features, based upon several attributes (i.e. elevation, TRI, ground obscuration).

Clearly, the banding chosen for this experiment reflects a particular concept of how distributions might vary across England, albeit one that is very common in archaeological interpretation (e.g. the three zones seen by Roberts & Wrathmell [2000 / 2002] and by Jeremy Taylor [2007] in their respective works): it is thus desirable to test different axes across the country to see whether different patterns might emerge.  It would also be possible to do something similar for bands created around all instances of a particular type of site, although this might be argued to be a little too processualist perhaps…

In conclusion, I do think the methodology outlined has potential for studying patterns in our data, but it will require a lot more thought and experimentation to be certain.

Chris Green


Roberts, B. and S. Wrathmell. 2000. An Atlas of Rural Settlement in England. London: English Heritage.

Roberts, B. and S. Wrathmell. 2002. Region and Place. A Study of English Rural Settlement. London: English Heritage.

Taylor, J. 2007. An Atlas of Roman Rural Settlement in England. York: Council for British Archaeology.

Nationwide survey – what would you be interested in seeing?

As our databases are nearing completion, we will shortly be in a position to begin looking at patterns in our data on the scale of all of England.  Our basic methodology for beginning to do this has been described previously (1) (2).  Obviously, this will just be one element of what we will be undertaking as part of this nationwide study, and we will also be looking in more detail at the form of features in the rural landscape (using NMP data primarily).

During our discussions with archaeological researchers across England over the last 8 months or so, it has become clear to us that many people have quite particular ideas about what they would like to see out of our England-wide survey, for instance the mapping together of particular sets of evidence.  Obviously, because of the scale of our dataset, we are also aware that we won’t be able to pursue all potential avenues of research.  Consequently, and since, as a project, we wish to be as open to outside ideas as we can, we thought this might be a good time to give blog-readers the opportunity to mention any groups of evidence that they would particularly like to see mapped and analysed on a nationwide scale.  All thoughts are welcome!

If you are interested in making any suggestions to us, please feel free to do so, either by commenting on this post or via email.