The AHDSS is a multi-round, prospective community study that involves continuous demographic monitoring of the entire geographically defined Agincourt sub-district (Kahn, 2006). The baseline household census was conducted in 1992; since then, all births, deaths and migration events in the population have been systematically recorded, along with household creation and dissolution (Kahn et al., 1999; Collinson et al., 2002). Between 1993 and 1999, the census was updated at intervals of approximately 15 to 18 months; since 1999, it has been conducted annually (Collinson et al., 2002; Kahn, 2006).
Fieldworkers initially used hand-drawn maps during the census. These maps included roads, dwellings and other reference landmarks such as schools, churches, rivers, water reservoirs and soccer fields (Collinson et al., 2002). In order to link individual and household data to a village structure, each dwelling was allocated a unique identifying number (Kahn, 2006). Since 2004, a full geographic information system with geo-referencing of households has been in place, and fieldworkers now rely on digital maps (Kahn, 2006). These maps are updated annually as new households are formed and old ones are dissolved.
During the annual census, a fieldworker interviews the most competent respondent available, and individual information is checked for every household member. All events that have occurred since the previous census are also recorded and updated (Wittenberg and Collinson, 2006). Where possible, questions are directed to particular household members; for instance, a woman is asked directly about her maternity history and pregnancy outcomes (Collinson, 2002). Key variables related to each event are recorded. In the case of a death, for example, these include the place of death, the name of the hospital, if applicable, and whether or not the death was registered (Kahn, 2006). In order to obtain information on education, labour force participation, socio-economic status and migration, occasional "modules" are added to the annual census and repeated at intervals. In the initial phase of the AHDSS, there was a single data-collection team of ten fieldworkers and a field supervisor; some of these fieldworkers were trained to conduct verbal autopsies (VA) (Kahn, 2006). To increase the speed of data collection, the field team was expanded to twenty fieldworkers, four supervisors, one VA supervisor and four VA fieldworkers (Collinson et al., 2002).
4.3.1 Verbal autopsy data
The AHDSS also includes verbal autopsy data, which aim to identify the cause of each death in the study site. The verbal autopsy (VA) was introduced in 1993 (Collinson et al., 2002). It is usually conducted by specially trained fieldworkers in a household where a death has been recorded. The caregiver or the person most closely associated with the deceased is usually selected as the respondent (Tollman et al., 1999; Kahn, 2000). Verbal autopsies are usually completed between one month and one year after the death; in keeping with traditional mourning practices, no VA is conducted until at least a month after bereavement (Kahn, 2006). The VA interview schedule is an adaptation of the one previously used in Niakhar, Senegal (another INDEPTH site). The interview guide from Niakhar was translated into Shangaan and modified to include culturally appropriate terminology (Kahn et al., 1999). The interview schedule contains an open section, or narrative, which elicits the symptoms and signs preceding death in the respondent's own words. Fieldworkers also probe for completeness of information, the sequence of signs and symptoms, and the response to treatment (Kahn, 2006). Several filtering questions follow, such as "Did the deceased cough?"; the answer either leads to a detailed module on that symptom or the interview proceeds to the next filtering question (Kahn, 1999).
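The skip pattern of the filtering questions can be illustrated with a minimal Python sketch. Only the cough question is taken from the description above; the second question, the module names and the function name are hypothetical and are included purely for illustration, not as a representation of the actual Agincourt instrument.

```python
# Illustrative skip pattern for VA filtering questions.
# Only "Did the deceased cough?" comes from the text; the second
# question and all module names are hypothetical.
FILTER_QUESTIONS = [
    ("Did the deceased cough?", "cough_module"),
    ("Did the deceased have a fever?", "fever_module"),  # hypothetical
]


def modules_to_administer(answers):
    """Return the detailed modules triggered by 'yes' answers, in order.

    `answers` maps each filtering question to True (yes) or False (no).
    A 'no' or missing answer means the interview moves on to the next
    filtering question without opening the detailed module.
    """
    triggered = []
    for question, module in FILTER_QUESTIONS:
        if answers.get(question, False):
            triggered.append(module)
    return triggered
```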
The verbal autopsies collected from respondents are later assessed by three medical officers to determine the cause of death (Kahn et al., 2000; Tollman et al., 1999). Two doctors independently assign a probable cause of death, and the diagnosis is accepted when their reviews correspond. When they differ, the doctors discuss the case in an effort to reach consensus (Kahn, 2006). If no consensus is reached, a third practitioner is called in to make an independent and blind assessment. The case is reviewed if two out of three diagnoses correspond. The diagnosis is accepted as the "probable cause of death" when consensus is achieved; if not, the cause of death is recorded as "undetermined" (Kahn, 2000).
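The assessment steps described above can be summarised in a short Python sketch. It is an illustration under simplifying assumptions rather than the procedure as implemented at the site: diagnoses are treated as plain strings, "correspond" is reduced to exact string equality, and the function name, its parameters and the `review_agreed` flag (standing in for the outcome of the final case review) are all hypothetical.

```python
from collections import Counter
from typing import Optional

UNDETERMINED = "undetermined"


def assign_cause(first: str,
                 second: str,
                 discussed: Optional[str] = None,
                 third: Optional[str] = None,
                 review_agreed: bool = True) -> str:
    """Return the probable cause of death, or 'undetermined'.

    first, second : independent diagnoses by the two doctors
    discussed     : diagnosis agreed in discussion, if consensus was reached
    third         : blind diagnosis by the third practitioner, if called in
    review_agreed : assumed outcome of the review when two of three match
    """
    # Step 1: the two independent reviews correspond.
    if first == second:
        return first
    # Step 2: the doctors reach consensus through discussion.
    if discussed is not None:
        return discussed
    # Step 3: a third, blind assessment is required.
    if third is None:
        return UNDETERMINED
    cause, votes = Counter([first, second, third]).most_common(1)[0]
    if votes >= 2 and review_agreed:
        return cause
    return UNDETERMINED
```

For example, `assign_cause("tuberculosis", "pneumonia", third="tuberculosis")` returns `"tuberculosis"`, whereas three mutually different diagnoses yield `"undetermined"`.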
The Agincourt VA tool and assessment process were validated in the mid-1990s, when final VA diagnoses were compared with the corresponding hospital diagnoses. Diagnoses were categorised as "the same" where the VA diagnosis and the hospital record were in agreement, while those that were not the same were categorised as either "different" or "undetermined" (Kahn, 2000; Kahn, 2006). A validation of diagnoses made between 2001 and 2005 against hospital records has also shown that HIV/AIDS diagnoses are reliable (Kahn, 2006).
4.3.2 Quality control and data entry
Quality control of the AHDSS data takes place at five levels, from the field to the data room. At the field level, data collection forms, including the VA questionnaires, are checked firstly by the fieldworkers themselves on a daily basis, secondly by fellow team members through weekly cross-checking, and thirdly by supervisors making random checks (Kahn, 2006). Errors are corrected either in the field office or after the household has been revisited; the supervisor also conducts random duplicate visits on 2% of the population (Collinson et al., 2006). Fourthly, at the main Agincourt field office, a specialised "quality checker" carries out a final review of the interviews, after which detected errors are recorded and the forms are returned to the field for the fieldworkers to correct (Collinson et al., 2006). Lastly, at the data entry level, there are programmed computer checks for invalid codes, missing values, inconsistent records and duplicated entries (Kahn, 2006). The computer produces an error message if any data item fails the pre-determined validation rules; the form is then re-checked manually and, if necessary, sent back to the field (Kahn, 2006). Together, these steps help to ensure the quality of the AHDSS census and verbal autopsy data.
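The four kinds of programmed check at the data entry level can be illustrated with a minimal Python sketch applied to death records. The field names, the set of valid place-of-death codes and the function name are hypothetical; the actual AHDSS validation rules are not reproduced here.

```python
from collections import Counter
from datetime import date

# Hypothetical code list and required fields, for illustration only.
VALID_PLACE_CODES = {"home", "hospital", "clinic", "other"}
REQUIRED_FIELDS = ("individual_id", "date_of_birth",
                   "date_of_death", "place_of_death")


def validate_death_records(records):
    """Return a list of (record_index, message) pairs for failed checks."""
    errors = []
    id_counts = Counter(r.get("individual_id") for r in records)
    for i, r in enumerate(records):
        # Missing values
        for field in REQUIRED_FIELDS:
            if r.get(field) in (None, ""):
                errors.append((i, f"missing value: {field}"))
        # Invalid codes
        place = r.get("place_of_death")
        if place not in (None, "") and place not in VALID_PLACE_CODES:
            errors.append((i, f"invalid code: place_of_death={place}"))
        # Inconsistent records
        dob, dod = r.get("date_of_birth"), r.get("date_of_death")
        if dob and dod and dod < dob:
            errors.append((i, "inconsistent record: death precedes birth"))
        # Duplicated entries (a person can have only one death record)
        pid = r.get("individual_id")
        if pid not in (None, "") and id_counts[pid] > 1:
            errors.append((i, "duplicate entry: individual_id"))
    return errors
```

In this sketch, any record that appears in the returned list would correspond to a form that is re-checked manually and, if necessary, returned to the field.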
After a data collection form has left the field and passed all quality checks, it proceeds to data capture (Collinson et al., 2002). In order to enhance data quality and to facilitate working relationships between the data and field teams, data are entered within the field site, in spite of infrastructural limitations (Collinson et al., 2002). The Agincourt database management system was first held in FoxPro and later re-written in Microsoft Access; in 2001, it was converted to SQL Server, which provides a higher standard of data technology and data protection as well as improved means of querying the database (Kahn, 2006). The database is made up of related tables that store different aspects of the data. For instance, the "Individual" table stores key information on all individuals; the "Residence" table records individual residence episodes, that is, the entry and exit of a person at a particular location in the field site; and the "Memberships" table records how and when an individual entered and exited a particular household (Collinson et al., 2002). There is also a separate table for each vital event, as well as special module tables, such as the asset survey, which are updated with varying frequency (Kahn, 2006).
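The relationship between the three tables named above can be sketched as follows. This is an illustrative schema only: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, and every column name, identifier, event label and the `location_on` helper are hypothetical rather than taken from the Agincourt database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Individual (
    individual_id TEXT PRIMARY KEY,
    sex           TEXT,
    date_of_birth TEXT
);
-- One row per residence episode: a person's entry to and exit from a location.
CREATE TABLE Residence (
    individual_id TEXT REFERENCES Individual(individual_id),
    location_id   TEXT,
    start_date    TEXT,
    end_date      TEXT   -- NULL while the episode is still open
);
-- One row per household membership episode, with how it began and ended.
CREATE TABLE Memberships (
    individual_id TEXT REFERENCES Individual(individual_id),
    household_id  TEXT,
    start_date    TEXT,
    start_event   TEXT,
    end_date      TEXT,
    end_event     TEXT
);
""")

# Hypothetical example data: one person with two residence episodes.
conn.execute("INSERT INTO Individual VALUES ('P1', 'F', '1970-03-14')")
conn.executemany(
    "INSERT INTO Residence VALUES (?, ?, ?, ?)",
    [("P1", "LOC-A", "1992-01-01", "1998-06-30"),
     ("P1", "LOC-B", "2001-02-01", None)],
)
conn.execute(
    "INSERT INTO Memberships VALUES "
    "('P1', 'HH-1', '1992-01-01', 'enumeration', '1998-06-30', 'out-migration')"
)


def location_on(conn, individual_id, day):
    """Return the location where a person was resident on `day`, or None.

    Dates are ISO-8601 strings, so string comparison orders them correctly.
    """
    row = conn.execute(
        "SELECT location_id FROM Residence "
        "WHERE individual_id = ? AND start_date <= ? "
        "AND (end_date IS NULL OR end_date >= ?)",
        (individual_id, day, day),
    ).fetchone()
    return row[0] if row else None
```

Storing residence as episodes, rather than as a single current address, is what allows a query such as `location_on(conn, "P1", "1995-06-01")` to recover where a person lived at any point in the surveillance period, including gaps during which the person was not resident in the field site.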