PPA methods fall into three primary classes: quadrat methods, density estimation and distance-based methods. With quadrat methods the study area is divided into regular-sized units, usually four-sided (hence quadrats, although other shapes are possible), and a summary statistic such as the number of sites or the average site size is recorded for each quadrat. It should be noted, though, that the starting data set for this analysis is always a set of events, each of which has its own Cartesian coordinates. The events within each quadrat are combined and summarized for that quadrat. Quadrat methods suffer from a number of shortcomings. First and foremost, they represent a summary of the data. For instance, if the data represent the number of archaeological sites in a specific Borden number, any patterning within that unit is lost. Thus, if all of the Late Archaic sites in one region occur in one river valley and we summarize them by one-kilometre squares identified from a topographic map, we might see three adjacent squares with site counts of 10, 18 and 2. In doing so, we would completely miss the underlying pattern, which might have been detected had we had access to the latitude and longitude of each site and plotted them on a map with regional topography.
Another problem with quadrat analysis is the Modifiable Areal Unit Problem (MAUP) (Goodchild 1996). The choice of the size and position of the unit is entirely arbitrary, and different unit sizes and/or different origins for the grid can give different results. First, the unit size must be chosen: in a site excavation, for example, we could summarize the data by 1, 2, 5 or 10 m square units. Second, if we select one of these, say a 5 m square unit, it is also necessary to determine the origin of the grid. Normally one chooses a point such as (0,0) for the origin of a grid of 5 m squares, but one could as easily choose an origin such as (2.5,2.5). In the first case, the square to the northeast of the first grid square would start at (5,5) and in the second, it would start at (7.5,7.5). The point is that the patterning of the data within the grid might well appear different, depending on the choice of origin and the size of the grid units. In fact, the assignment of the original (0,0) for the site excavation is, in most cases, entirely arbitrary. The MAUP can arise from both of these choices. Thus, for quadrat analysis, careful consideration of the unit size and the position of the origin is critical. As noted above, the most significant problem with quadrat analysis remains that the creation of quadrats summarizes the data, so we lose the fine detail that might have been seen in a plot of all of the (x,y) coordinates. An excellent example of this can be found in the case study in Chapter 4 (e.g. compare Figure 4-17 with Figure 4-21).
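The effect of unit size and grid origin on quadrat counts can be illustrated with a minimal sketch. It is written in Python for illustration only and is not one of the R programs used in this study; the function name quadrat_counts and the event coordinates are hypothetical.

```python
import math
from collections import Counter

def quadrat_counts(points, cell_size, origin=(0.0, 0.0)):
    """Count events per square quadrat; keys are (column, row) grid indices."""
    ox, oy = origin
    counts = Counter()
    for x, y in points:
        # Index of the quadrat containing (x, y), relative to the grid origin.
        col = math.floor((x - ox) / cell_size)
        row = math.floor((y - oy) / cell_size)
        counts[(col, row)] += 1
    return dict(counts)

# Hypothetical event coordinates in metres (illustrative only, not real data).
events = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (6.0, 6.0), (8.0, 8.0)]

# Same events, same 5 m unit size, two different grid origins.
grid_a = quadrat_counts(events, cell_size=5.0, origin=(0.0, 0.0))
grid_b = quadrat_counts(events, cell_size=5.0, origin=(2.5, 2.5))
# Same events and origin, but a 10 m unit size.
grid_c = quadrat_counts(events, cell_size=10.0, origin=(0.0, 0.0))

print(grid_a)  # two occupied quadrats with counts 4 and 2
print(grid_b)  # three occupied quadrats with counts 2, 3 and 1
print(grid_c)  # one occupied quadrat with count 6
```

The same six events yield three different summaries, and none of the summaries retains the fact that the events lie along a single diagonal line.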
On the positive side, much of our data in archaeology consists of summaries by grid unit, so quadrat methods have a great deal of utility. The standard CRM excavation report, which shows the number of artifacts in each one-metre excavation unit, is an elementary example. Given its limitations, quadrat analysis should generally be avoided when specific Cartesian coordinates are available for study, or at least used only to supplement other statistical approaches, such as those used in this study. However, if we can be reasonably sure we are avoiding or minimizing the aforementioned problems, summary counts by unit change nominal data to ratio data. In one sense this summarization by quadrats effectively converts point data into area data, which has the strength of enabling a number of statistical techniques not applicable to point patterns. An example of this procedure will be presented in Chapter 4.
The second category of PPA methods is called density estimation. This category is likely familiar to most archaeologists, as we have created density maps of archaeological deposits for many years. Basically, these are all fairly simple models. The density value at each location on the output map is calculated from the events that fall within a specified radius of that location. It should be noted that there are options for calculating the density. The main option is the radius, but the density value itself can also be calculated in various ways, from the simple naïve density (count/area, where area = πr²) to more complex weighting methods known as kernel density, where events closer to the location being calculated are weighted more heavily than events close to the edge of the radius. An example of this procedure can be found in the Davidson case study below.
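The difference between the naïve density and a kernel density can be sketched as follows. This is an illustrative Python sketch, not the procedure used in the case study; the function names are hypothetical, and the quartic (biweight) kernel is an assumption, since the text does not specify which weighting function is used.

```python
import math

def naive_density(location, events, radius):
    """Naive density: events within `radius` of `location`, divided by pi * r^2."""
    lx, ly = location
    count = sum(1 for x, y in events if math.hypot(x - lx, y - ly) <= radius)
    return count / (math.pi * radius ** 2)

def kernel_density(location, events, radius):
    """Kernel density with a quartic kernel (an assumed choice of weighting).

    Each event within `radius` contributes (3 / pi) * (1 - u^2)^2, where u is
    its distance divided by the radius, so nearer events receive more weight.
    """
    lx, ly = location
    total = 0.0
    for x, y in events:
        d = math.hypot(x - lx, y - ly)
        if d < radius:
            u = d / radius
            total += (3.0 / math.pi) * (1.0 - u * u) ** 2
    return total / radius ** 2

# Hypothetical single events at two different distances from the location (0, 0).
near_event = [(0.5, 0.0)]
far_event = [(1.5, 0.0)]

# The naive density treats both events identically ...
print(naive_density((0.0, 0.0), near_event, 2.0), naive_density((0.0, 0.0), far_event, 2.0))
# ... while the kernel density gives the nearer event the larger value.
print(kernel_density((0.0, 0.0), near_event, 2.0), kernel_density((0.0, 0.0), far_event, 2.0))
```

In both functions the radius controls how much smoothing is applied: a larger radius draws on more events but blurs local detail.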
The third category of PPA methods is called distance methods, which operate on data consisting of a series of points with coordinates in two-dimensional Cartesian space. Three-dimensional analytics are certainly possible and are being considered (see Baddeley 2010b), but they are not well developed at this time and not widely applied. Nevertheless, three-dimensional analyses would have a great deal of utility for archaeologically excavated material. All forms of analysis in this study are two-dimensional. There are a number of distance-based methods, both in point pattern analysis as explained in the geographic texts and in previous archaeological work. In fact, many of the archaeology-specific methods developed 30 years ago are essentially distance-based methods. What all distance-based methods have in common is that they calculate the distance between two events using the Pythagorean Theorem. Indeed, in developing the set of R programs used here, the very first ‘function’ developed was one to calculate the distance between two points.
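The distance calculation that underlies all of these methods can be sketched in a few lines. The sketch is in Python for illustration and is not the R function referred to above; the names distance and distance_matrix are hypothetical.

```python
import math

def distance(p, q):
    """Euclidean distance between two events p = (x1, y1) and q = (x2, y2)."""
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    # Pythagorean Theorem: the hypotenuse of the right triangle with legs dx and dy.
    return math.sqrt(dx * dx + dy * dy)

def distance_matrix(points):
    """All pairwise distances; entry [i][j] is the distance from point i to point j."""
    return [[distance(p, q) for q in points] for p in points]

print(distance((0.0, 0.0), (3.0, 4.0)))  # 5.0, the classic 3-4-5 triangle
```

A distance-based statistic is then a summary of some subset of these pairwise distances, for example the smallest non-zero entry in each row.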
One of the more common problems with distance methods is known as edge effects. Edge effects occur with some distance-based measures in cases where the distribution extends beyond the edge of the study area. The Nearest Neighbour (NN) statistic is a good example of a statistic that is susceptible to this problem, as hinted above. Where edge effects occur, the NN statistic can be distorted because distances from points close to the boundary must be computed to other points within the study area, when there could be closer points just outside it, in areas for which no data are available. One technique for dealing with this problem is to define a buffer area around the edge of the study area, effectively reducing it in size but leading to the calculation of a more accurate statistic. Another would be to evaluate statistical significance with a Monte Carlo technique. A good example where edge effects could exist is the Kellis 2 cemetery discussed in Chapter 5, where unexcavated graves are found to the west, north and east of the excavated portion.
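The buffer technique can be sketched as follows, under stated assumptions: the study area is a rectangle, nearest-neighbour distances are measured only from events inside the reduced area, and events in the buffer may still serve as neighbours. This is an illustrative Python sketch with hypothetical names and coordinates, not the implementation used in this study.

```python
import math

def nearest_neighbour_distances(points, bounds, buffer_width=0.0):
    """Nearest-neighbour distance for each event inside the buffered study area.

    `bounds` is (xmin, ymin, xmax, ymax). Events lying within `buffer_width`
    of the boundary are not measured from, but they remain available as
    neighbours of the events in the interior.
    """
    xmin, ymin, xmax, ymax = bounds
    distances = []
    for i, (x, y) in enumerate(points):
        in_core = (xmin + buffer_width <= x <= xmax - buffer_width
                   and ymin + buffer_width <= y <= ymax - buffer_width)
        if not in_core:
            continue
        # Smallest distance from event i to any other event in the data set.
        distances.append(min(math.hypot(x - qx, y - qy)
                             for j, (qx, qy) in enumerate(points) if j != i))
    return distances

# Hypothetical events in a 10 m by 10 m study area (illustrative only).
events = [(0.5, 5.0), (5.0, 5.0), (6.0, 5.0), (9.5, 5.0)]
area = (0.0, 0.0, 10.0, 10.0)

no_buffer = nearest_neighbour_distances(events, area)
with_buffer = nearest_neighbour_distances(events, area, buffer_width=1.0)

print(sum(no_buffer) / len(no_buffer))      # 2.5
print(sum(with_buffer) / len(with_buffer))  # 1.0
```

In this example the two events near the boundary have large nearest-neighbour distances that may simply reflect missing data beyond the edge; excluding them as starting points changes the mean NN distance from 2.5 m to 1.0 m, at the cost of a smaller sample.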