The data from Section 3.1 is reproduced in Figure 22 using the visualization scheme introduced in Section 3.3.1 to facilitate ROC curve generation.
Figure 22: Compilation of Exclusion & Inclusion Data Using Simulated Mixtures
As in Section 3.3.1, the bars represent cumulative, normalized fractions: the red and green bars for a given bin sum to 1 and, separately, the blue and yellow bars for a given bin sum to 1. Green represents Pr(Correct Exclusion); red, Pr(Incorrect Inclusion); yellow, Pr(Correct Inclusion); blue, Pr(Incorrect Exclusion).
Compiling the ROC results for all levels of drop-out analyzed with the simulated mixture data produces Figure 23.
Figure 23: Error Analysis Results: ROC Rollup Using Simulated Mixtures
Decision threshold labels have been omitted for legibility.
Figure 23 plots the correct inclusion rate (calculated from the inclusion data) against the incorrect inclusion rate (calculated from the exclusion data) at various levels of drop-out for the full range of possible discrepancy decision criteria. As the tolerated number of allelic discrepancies (i.e., the decision threshold) increases, both the correct inclusion and incorrect inclusion fractions increase, albeit at different rates, while the incorrect exclusion and correct exclusion fractions decrease in complement. Further, as the level of drop-out increases for a given decision threshold, both the correct inclusion and incorrect inclusion fractions decrease.
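The construction of one ROC curve can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce Figure 23. It assumes that a reference is included whenever its number of allelic discrepancies does not exceed the decision threshold. The function name `roc_points` and the tallies are hypothetical and are not taken from the simulated mixture data.

```python
from itertools import accumulate


def roc_points(contributor_tally, noncontributor_tally):
    """Return one (incorrect_inclusion, correct_inclusion) pair per threshold.

    tally[k] is the number of simulated comparisons that showed exactly k
    allelic discrepancies. Index k of the result corresponds to tolerating
    up to k discrepancies (assumed rule: include when discrepancies <= k).
    """
    n_contrib = sum(contributor_tally)
    n_noncontrib = sum(noncontributor_tally)
    # Running totals give the number of references included at each threshold.
    included_contrib = accumulate(contributor_tally)
    included_noncontrib = accumulate(noncontributor_tally)
    return [
        (fi / n_noncontrib, ci / n_contrib)
        for ci, fi in zip(included_contrib, included_noncontrib)
    ]


# Hypothetical tallies for a single drop-out level (illustration only).
contributors = [60, 25, 10, 5, 0]      # from the inclusion data
noncontributors = [0, 1, 4, 15, 80]    # from the exclusion data

points = roc_points(contributors, noncontributors)
for k, (fi, ci) in enumerate(points):
    print(f"tolerate <= {k}: incorrect inclusion {fi:.2f}, correct inclusion {ci:.2f}")
```

Because the counts are accumulated, both rates can only rise as the threshold is relaxed, which matches the behavior described for Figure 23.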
ROC analysis may therefore be used to determine a complexity threshold. That is, a laboratory may choose to specify particular levels of error that are to be tolerated, thus bounding the false positive and false negative error rates and establishing a “complexity threshold.” This amounts to “zooming in” on a region in the ROC space that accords with the tolerable error levels. An example is shown in Figure 24.
Figure 24: Notional Complexity Threshold: Simulated Mixture Results
This plot “zooms in” on the upper left of Figure 23 in a manner consistent with error bounds specified by an imaginary laboratory’s Standard Operating Procedure. (The same color scheme is employed as that specified in Figure 23’s legend, where Pr(D) increases from blue to red or from top-left to bottom-right in increments of 0.1. Also, the first few decision thresholds (i.e., τ1 – τ4) for the darkest blue line, corresponding to no drop-out, are so close that they cannot be distinguished from one another.) In this case, the error bounds provide for no more than a 10% chance of incorrect inclusion of a reference sample while insisting on at least 30% correct inclusion determinations. Any points (determined by error rates and drop-out levels) that lie outside of this space fail the complexity threshold specified by the SOP and should not be interpreted.
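The bounded region can be expressed as a simple test on each ROC point. The sketch below assumes each curve is stored as a list of (incorrect inclusion rate, correct inclusion rate) pairs keyed by Pr(D). The function names and all rates are hypothetical placeholders, not values read from Figure 24. Only the bounds (at most 10% incorrect inclusion, at least 30% correct inclusion) come from the example above.

```python
def within_bounds(point, max_incorrect_inclusion, min_correct_inclusion):
    """True when an ROC point lies inside the laboratory's error bounds."""
    incorrect_inclusion, correct_inclusion = point
    return (incorrect_inclusion <= max_incorrect_inclusion
            and correct_inclusion >= min_correct_inclusion)


def interpretable_dropout_levels(roc_by_dropout, max_fi, min_ci):
    """Return the Pr(D) levels with at least one ROC point inside the bounds."""
    return [
        pr_d
        for pr_d, curve in sorted(roc_by_dropout.items())
        if any(within_bounds(p, max_fi, min_ci) for p in curve)
    ]


# Hypothetical ROC points per drop-out level (illustration only).
roc_by_dropout = {
    0.0: [(0.00, 0.90), (0.02, 1.00)],
    0.3: [(0.00, 0.20), (0.05, 0.50), (0.20, 0.80)],
    0.6: [(0.00, 0.05), (0.12, 0.25), (0.40, 0.60)],
}

# Bounds from the notional SOP: <= 10% incorrect inclusion, >= 30% correct inclusion.
interpretable_levels = interpretable_dropout_levels(roc_by_dropout, 0.10, 0.30)
print(interpretable_levels)
```

In this made-up data set the Pr(D) = 0.6 curve never enters the bounded region, so samples at that drop-out level would fail the complexity threshold.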
Complexity thresholds, which are defined in ROC space by pre-specified, laboratory error bounds, may be utilized in two ways: 1) to determine when the Pr(D) is too high to allow for reliable evidence profile interpretation; 2) to determine an a priori ceiling on the number of allelic discrepancies tolerated (while still allowing for a determination of reference exclusion as a contributor to the evidence sample) before false inclusion rates become intolerable.
Even if a determination of drop-out level can be made, certain error limitations may sufficiently bound the problem of establishing a complexity threshold. For instance, if a laboratory demanded that false inclusion occur less than 1% of the time and that true inclusions occur at least 85% of the time (Figure 25), then samples where the probability of drop-out is greater than 0.2 should not be analyzed since none of the ROC curve is represented in the bounded space; that is, the complexity threshold, which is based on pre-determined, tolerable error rates, is not met.
Figure 25: More Realistic Complexity Threshold: Simulated Mixture Results
This plot “zooms in” on the upper left of Figure 24 in a manner consistent with error bounds specified by an imaginary laboratory. (The same color scheme is employed as that specified in Figure 23’s legend, which has been left off here for space considerations. Also, the first couple of decision thresholds (i.e., τ1 – τ2) for the darkest blue line, corresponding to no drop-out, are so close that they cannot be distinguished from one another.) A laboratory’s Standard Operating Procedure specifies tolerable error bounds that will define the complexity threshold. In this case, false inclusions are limited to occurring less than 1% of the time while false negatives are allowed 15% of the time. The highest number of allelic discrepancies (representing possible decision thresholds) for any level of drop-out that falls within this space is 8 (for Pr(D) = 0.20).
If these are the bounds selected by the laboratory, only those samples with a corresponding Pr(D) = 0.00 – 0.20 should be evaluated. It should be noted that the present study simulated two-person mixtures in a contributor ratio of 1:1. Supplementary studies that simulate mixtures with varying ratios and numbers of contributors are
expected due to low-template, the samples should not be analyzed. Each such sample would be classified as a Type C [7] or uninterpretable mixture [11].
Previous work suggests that Pr(D) increases with decreasing DNA input levels and can be characterized via peak heights [9]. Gill et al. [61] showed that Pr(D) ≈ 0.20 when allele peak heights are between approximately 50 – 100 RFU for the AmpFℓSTR® SGM Plus® PCR Amplification Kit. Therefore, for the aforementioned bounds, if it is suspected that less than ~0.1 ng was amplified for a given contributor (as evidenced by peak heights less than ~100 RFU at an analytical threshold of 50 RFU), then the laboratory would deem the sample indeterminable and would not use it for comparison purposes. The decision would be made before comparison to a reference sample.
Next, the ROC plot can be used to determine the number of “allowed” discrepancies based on a given Pr(D). For example, considering Figure 25, if a laboratory has settled upon acceptable rates of incorrect inclusion and correct inclusion at ≤1% and ≥85%, respectively, with Pr(D) ≈ 0.2, then no more than eight allelic discrepancies should be allowed by the laboratory’s analysts. That is, if an analyst were to include reference samples as potential contributors to evidence stains despite observing 9 or more allelic discrepancies, error rates greater than the laboratory’s specified tolerances would be encountered.
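Reading the ceiling off a single ROC curve can be sketched as follows. The code assumes that entry k of a curve holds the (incorrect inclusion rate, correct inclusion rate) pair obtained when up to k allelic discrepancies are tolerated. The function name and the curve values are hypothetical and do not reproduce Figure 25. Only the bounds (≤1% and ≥85%) are taken from the example above.

```python
def max_allowed_discrepancies(curve, max_incorrect_inclusion, min_correct_inclusion):
    """Return the largest tolerated discrepancy count inside the error bounds.

    curve[k] = (incorrect_inclusion_rate, correct_inclusion_rate) when up to
    k discrepancies are tolerated. Returns None when no threshold on the
    curve satisfies both bounds, i.e. the complexity threshold is not met.
    """
    allowed = [
        k
        for k, (fi, ci) in enumerate(curve)
        if fi <= max_incorrect_inclusion and ci >= min_correct_inclusion
    ]
    return max(allowed) if allowed else None


# Hypothetical curves for two drop-out levels (illustration only).
low_dropout_curve = [(0.000, 0.40), (0.000, 0.70), (0.002, 0.86),
                     (0.008, 0.93), (0.030, 0.97)]
high_dropout_curve = [(0.000, 0.10), (0.004, 0.35), (0.020, 0.60),
                      (0.060, 0.80), (0.150, 0.90)]

ceiling_low = max_allowed_discrepancies(low_dropout_curve, 0.01, 0.85)
ceiling_high = max_allowed_discrepancies(high_dropout_curve, 0.01, 0.85)
print(ceiling_low, ceiling_high)
```

A return value of `None` corresponds to the case in which no part of the curve lies in the bounded space, so the sample would not be interpreted at all.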
Alternatively, given Blackstone’s Ratio [31], a laboratory might solely prioritize the minimization of spurious inculpations at the expense of identifying every true positive, in which case only false inclusions are considered in the constitution of a laboratory’s acceptable error bounds. An example is shown in Figure 26.
Figure 26: Complexity Threshold Based on Blackstone’s Ratio: Simulated Mixture Results
This complexity threshold focuses solely on diminishing the incidence of false positives without specifying a bound on false negatives. Only selected thresholds are labeled.
In this example, any rate of incorrect exclusion is allowed and only the rate of incorrect inclusion is bounded. Here, all samples may be interpreted regardless of drop-out rate, but for a particular Pr(D), a number of allowable allelic discrepancies may be specified, and that number increases with increasing Pr(D). This increase is again the result of obtaining less allelic information with increasing drop-out, which leads to an increased chance of correctly excluding a reference who ought to have been excluded. (Refer to Figure 15 and Table 7 for detailed discussions of this point.)
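A bound on false inclusions alone can be sketched by dropping the correct inclusion criterion and keeping one ceiling per drop-out level. The function name, the 1% bound, and every rate below are hypothetical. The numbers were chosen only to mirror the trend described above and are not results from Figure 26.

```python
def ceilings_from_false_inclusion_bound(fi_rates_by_dropout, max_incorrect_inclusion):
    """Map each Pr(D) to the largest tolerated discrepancy count.

    fi_rates_by_dropout[pr_d][k] is the incorrect inclusion rate when up to
    k discrepancies are tolerated. Only false inclusions are bounded, so no
    correct inclusion rate is needed. None means no threshold qualifies.
    """
    ceilings = {}
    for pr_d, fi_rates in fi_rates_by_dropout.items():
        allowed = [k for k, fi in enumerate(fi_rates) if fi <= max_incorrect_inclusion]
        ceilings[pr_d] = max(allowed) if allowed else None
    return ceilings


# Hypothetical incorrect inclusion rates per drop-out level (illustration only).
fi_rates_by_dropout = {
    0.0: [0.000, 0.004, 0.030, 0.100],
    0.2: [0.000, 0.001, 0.008, 0.040],
    0.4: [0.000, 0.000, 0.002, 0.009],
}

ceilings = ceilings_from_false_inclusion_bound(fi_rates_by_dropout, 0.01)
print(ceilings)
```

With these placeholder rates the ceiling rises from one to three discrepancies as Pr(D) increases, which is the pattern the text attributes to the loss of allelic information under drop-out.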