5.14.1
Output areas
By comparing the simulated and actual proportion of people with a limiting long–term illness or disability, it is clear how close the simulated values are to the known values. The mean simulated proportion is 0.26 with standard deviation of 0.06, compared to the actual median proportion of 0.26 with standard deviation of 0.09.
The biggest discrepancy between the simulated data and the actual values is that in a small number of cases the simulation does not assign enough people with a limiting long–term illness or disability: it undersimulates. The maximum proportion assigned by the simulation is 0.53 compared to 0.72 for the actual data. However, only eight output areas, fewer than 1%, have an actual proportion higher than 0.53, suggesting the simulation has estimated the majority of cases of limiting long–term illness or disability well.
This is further demonstrated by the similarity of the thematic maps produced. Figure 5.14 shows the simulated proportion of individuals in each output area with a limiting long–term illness or disability. Figure 5.15 shows the actual proportion of individuals in each output area with a limiting long–term illness or disability, based on the 2011 census. The simulation picks up high proportions of limiting long–term illness or disability in areas across the borough, including Conisbrough and Mexborough to the west, Carcroft to the north, Thorne and Armthorpe to the east, and Rossington to the south.
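The summary comparison described above (mean, standard deviation, maximum, and the number of output areas whose actual proportion exceeds the highest simulated value) can be sketched as follows. This is a minimal illustration in Python rather than the R code used for the thesis; the function name compare_proportions and the five example proportions are hypothetical and are not the Doncaster data.

```python
from statistics import mean, stdev


def compare_proportions(simulated, actual):
    """Summarise simulated and actual proportions, one value per output area."""
    sim_max = max(simulated)
    # Count output areas whose actual proportion is above the highest
    # simulated proportion, i.e. areas the simulation undersimulates most.
    above = sum(1 for a in actual if a > sim_max)
    return {
        "sim_mean": mean(simulated),
        "sim_sd": stdev(simulated),
        "act_mean": mean(actual),
        "act_sd": stdev(actual),
        "sim_max": sim_max,
        "act_max": max(actual),
        "n_above_sim_max": above,
        "share_above_sim_max": above / len(actual),
    }


# Hypothetical proportions for five output areas (assumed values, not thesis data)
simulated = [0.20, 0.25, 0.30, 0.22, 0.33]
actual = [0.18, 0.27, 0.45, 0.21, 0.19]

summary = compare_proportions(simulated, actual)
```

With real data, the two lists would hold one proportion per output area in the same order, so that the comparison is made zone by zone.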
Figure 5.14: Limiting long-term illness or disability spatial microsimulation results for Doncaster output areas. [Map legend: simulated proportion of residents with limiting long-term illness or disability, in bands of 0.1 from 0.0 to 0.8.]
[Map legend: census proportion of residents with limiting long-term illness or disability, in bands of 0.1 from 0.0 to 0.8.]
limiting long–term illness or disability. This is the output area to the south of Rossington and east of the A1(M) at the council boundary. I believe this is because a high number of residents who have never worked or are long–term unemployed (NS–SEC 8) live in this area: 61, compared to a mean of just 16.05.
If the survey data set has a disproportionate number of people who have never worked or are long–term unemployed, but not on medical grounds, this may affect the simulation. For example, a bias in the survey sample design may have identified people who have never worked because their partner works instead, which may not fit the demographic of this area.
5.15
Conclusion
This chapter is a proof–of–concept spatial microsimulation using Understanding Society respondents and 2011 census tables obtained from Nomisweb. It simulates whether each resident in Doncaster aged 16 and above has, or has had, a health condition. It uses a self–written package for R, rakeR, to perform the spatial microsimulation using the iterative proportional fitting method.
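To illustrate the iterative proportional fitting idea named above, the following is a minimal sketch in Python. It is not the rakeR implementation and does not use its interface; the function ipf_weights, the four survey rows, the variable names (sex, age, llti), and the constraint counts are all hypothetical, chosen only to show how survey respondents are reweighted until their weighted totals match the census counts for one zone.

```python
def ipf_weights(individuals, constraints, iterations=20):
    """Reweight survey individuals so weighted totals match zone constraints.

    individuals: list of dicts, one per survey respondent.
    constraints: {variable: {category: census count}} for a single zone.
    Every respondent's category must appear in the constraint for that variable.
    """
    weights = [1.0] * len(individuals)
    for _ in range(iterations):
        for var, targets in constraints.items():
            # Current weighted total in each category of this constraint
            totals = {cat: 0.0 for cat in targets}
            for person, w in zip(individuals, weights):
                totals[person[var]] += w
            # Rescale each weight so the category totals match the census counts
            for i, person in enumerate(individuals):
                cat = person[var]
                if totals[cat] > 0:
                    weights[i] *= targets[cat] / totals[cat]
    return weights


# Hypothetical survey respondents; llti is the target variable (1 = has condition)
survey = [
    {"sex": "f", "age": "16-49", "llti": 1},
    {"sex": "f", "age": "50+", "llti": 0},
    {"sex": "m", "age": "16-49", "llti": 0},
    {"sex": "m", "age": "50+", "llti": 1},
]

# Hypothetical census counts for one output area (each constraint sums to 100)
zone_constraints = {
    "sex": {"f": 60, "m": 40},
    "age": {"16-49": 70, "50+": 30},
}

weights = ipf_weights(survey, zone_constraints)

# Simulated proportion with the target characteristic in this zone
simulated_llti = sum(w * p["llti"] for w, p in zip(weights, survey)) / sum(weights)
```

Repeating the same procedure for every zone, and storing one simulated value per zone, gives a table with one row per geographical zone. A production implementation would also need to handle empty categories and check convergence rather than run a fixed number of iterations.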
The final output of the spatial microsimulation model is a data table with one row per geographical zone and one column per variable, including the simulated variable. The results of the internal validation are encouraging and suggest this model forms a good basis to expand to include resilience and indicators of poverty, which I simulate in the next chapter.
Chapter 6
Health resilience spatial microsimulation
6.1
Introduction
After successfully simulating a pilot spatial micro dataset in Chapter 5, I moved on to simulate health resilience, which includes clinical depression and measures of deprivation, together with indicators of poverty, which I use to examine the likely effects of a number of local and national policy proposals in Chapter 7. This was again a simulation of Doncaster, my case study area, at output area level. To perform the simulation I used the same data sources as the pilot simulation, namely Understanding Society and the 2011 census tables. Where this simulation differed was in the increased number of target variables that I simulated to help identify health resilience, and in the increased number of constraint variables I used to improve the accuracy of the simulation.
For the target variables I compared two approaches to identify resilience. One approach was to simulate mental health outcomes, specifically prevalence of clinical depression, at the area level. I then combined these results with area–level deprivation measures to identify which area or areas could be considered resilient, if any. This is similar to the approach taken by much contemporary social science research into health resilience, such as that by Bartley (2006), Mitchell et al. (2009), or Cairns (2013). The other approach was to simulate variables that identify concepts thought to promote resilience, as outlined in Chapter 3. With this approach I was able to specify which areas might be resilient under certain assumptions. These two approaches are documented in Section 6.7. Finally, I simulated various indicators of economic and social status, which I use to examine the possible effects of proposed national and local policy changes in Chapter 7.
For the constraints I wanted to test additional variables because more constraints can lead to a more accurate simulation, although some authors suggest the number of possible categories for each constraint is at least as important as the number of constraints themselves:
. . . a model constrained by two variables, each containing 10 categories (20 constraint categories in total), will be better constrained than a model constrained by 5 binary variables such as male/female, young/old etc. (Lovelace and Dumont, 2016: 52).
Regardless of the efficacy of using multiple variables or multiple levels, by testing additional constraints I was able to satisfy both requirements, as many of the constraints have several response categories. Of course, the constraints are only as good as their ability to predict the target variable, so I empirically tested this relationship in Section 6.4.