
Label-free identification of rod precursor cells in heterogeneous retina samples is one of the main motivations of this thesis. This section presents a practical sorting experiment in which rod precursor cells were successfully enriched.

Dataset assembly

Dissociated retina cells were resuspended in RT-DC measurement buffer at a concentration of 20 million cells/ml (prepared by Karen Teßmer). The sample was loaded into a sorting chip (the chip design is shown in 2.1.2), and RT-FDC together with ShapeIn was used to capture training data. RT-FDC experiments normally use 40x magnification (see Figure 2.1). For sorting experiments, however, 20x magnification is currently essential because its wider field of view allows the sorting region to be supervised. Therefore, the training data was captured at 20x magnification. The resulting data shows the typical spread of GFP expression that was also seen in previous experiments performed at 40x magnification (see Figure 3.9). ShapeOut 83 was employed to gate for events of the small GFP+ and small GFP- fractions (similar to the gating shown in Figure 3.10). A region with medium GFP expression was omitted because it is difficult to assign those events to either the GFP+ or the GFP- class, creating a slight imbalance in the dataset. To obtain a balanced validation set, 2000 cells of each class were randomly selected. The number of cells of each class in the training and validation sets is shown in Table 9.

| Class      | Training | Validation | Balanced validation |
|------------|----------|------------|---------------------|
| Small GFP+ | 2471     | 2048       | 2000                |
| Small GFP- | 4512     | 4434       | 2000                |

Table 9 Training and validation data of retina cells
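The random selection of 2000 cells per class could be sketched as follows. This is only a minimal illustration and not the procedure actually used for the thesis: the function name `balanced_subsample`, the fixed seed, and the encoding of classes as 1 (small GFP+) and 0 (small GFP-) are assumptions made for this example. Only the event counts are taken from Table 9.

```python
import numpy as np

def balanced_subsample(labels, n_per_class, seed=0):
    """Return sorted indices holding exactly n_per_class events of every class."""
    rng = np.random.default_rng(seed)  # fixed seed is an assumption, for reproducibility
    labels = np.asarray(labels)
    picked = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        if idx.size < n_per_class:
            raise ValueError(f"class {cls!r} has only {idx.size} events")
        # Draw without replacement so that no event is selected twice.
        picked.append(rng.choice(idx, size=n_per_class, replace=False))
    return np.sort(np.concatenate(picked))

# Validation counts from Table 9: 2048 small GFP+ (1) and 4434 small GFP- (0).
val_labels = np.array([1] * 2048 + [0] * 4434)
balanced_idx = balanced_subsample(val_labels, n_per_class=2000)
balanced_labels = val_labels[balanced_idx]
```

Drawing without replacement keeps every selected event unique, so the balanced set contains 4000 distinct events.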

Training of MLP for rod identification using AID

The balanced validation set was loaded into AID (see section 3.4.1) and an MLP with 24, 16 and 24 nodes in the first, second and third hidden layer was trained (MLP1). The performance of the resulting model, when using a threshold $P(\mathrm{GFP}^{+})_{\mathrm{thresh}}$ of 0.6, is shown in the confusion matrix in Figure 3.32. In the balanced validation set, the initial concentration of GFP+ cells is 50%, and sorting for GFP+ cells would theoretically result in a target concentration of $c_{\mathrm{GFP}+} = \frac{1238}{1238+490} \cdot 100 = 71.6\%$.
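The arithmetic behind the theoretical target concentration can be sketched as below. The sketch reads 1238 as the number of GFP+ cells at or above the threshold and 490 as the number of GFP- cells at or above the threshold, which is the interpretation the formula suggests. All function names are hypothetical, and the code does not reproduce how AID evaluates a model.

```python
import numpy as np

def classify_gfp_positive(p_gfp_pos, threshold=0.6):
    """Boolean mask of events whose predicted P(GFP+) reaches the threshold."""
    return np.asarray(p_gfp_pos) >= threshold

def sorted_counts(p_gfp_pos, is_gfp_pos, threshold=0.6):
    """Count true and false positives among the events that would be sorted."""
    selected = classify_gfp_positive(p_gfp_pos, threshold)
    is_gfp_pos = np.asarray(is_gfp_pos, dtype=bool)
    true_pos = int(np.sum(selected & is_gfp_pos))
    false_pos = int(np.sum(selected & ~is_gfp_pos))
    return true_pos, false_pos

def target_concentration(true_pos, false_pos):
    """Percentage of GFP+ cells among all events that would be sorted."""
    return 100.0 * true_pos / (true_pos + false_pos)

# Counts as they appear in the formula in the text.
c_target = target_concentration(true_pos=1238, false_pos=490)
```

Rounded to one decimal place, `c_target` matches the 71.6% given in the text. Raising the threshold would typically trade a lower number of sorted cells for a higher purity of the sorted fraction.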

Figure 3.32 Performance of MLP1 on validation set

The confusion matrix shows the performance of the final model (MLP1) when applied to the validation dataset, which contains 2000 images each of small GFP+ and small GFP- cells. A cell is classified as GFP+ only when the corresponding probability is at least the threshold $P(\mathrm{GFP}^{+})_{\mathrm{thresh}} = 0.6$ (red rectangle). Those cells would theoretically be sorted if this model were used in an actual sorting experiment.

Application of final MLP to enrich rod precursor cells

The model was converted to .nnet using AID and loaded into the Sorting Software (version 1.556_rev1727, see section 3.4.2). Bounding-box gating for cells with a length between 4 and 12 µm was applied to exclude events that are too small (debris) or too large. The SAW function generator was connected to the IDTs of the sorting chip, and frequency as well as phase were adjusted such that pulses of 2 ms pushed single cells into the target outlet. AI-based sorting was carried out for 1 hour, effectively collecting 25,000 cells, which corresponds to an average sorting speed of approximately 7 cells/second.
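As a rough illustration, the size gate and the average sorting speed can be written as follows. The function names are hypothetical, and treating both gate limits as inclusive is an assumption, since the text does not say how the Sorting Software handles the boundaries.

```python
def passes_size_gate(length_um, min_um=4.0, max_um=12.0):
    """True if the bounding-box length lies within the gate (limits assumed inclusive)."""
    return min_um <= length_um <= max_um

def average_sort_rate(n_cells, duration_s):
    """Average number of collected cells per second."""
    return n_cells / duration_s

# 25,000 cells collected during 1 hour (3600 s), as stated in the text.
rate = average_sort_rate(25_000, 3600)
```

The resulting rate is slightly below 7 cells/second, consistent with the approximate value quoted above.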

For analysis of the target and initial samples, a normal glass-PDMS chip (with a 20 µm channel) and 40x magnification (the standard setting for RT-FDC) were used to obtain optimal fluorescence signals. The scatterplots in Figure 3.33 show the cell size and fluorescence expression for the initial and target sample. The color code of the scatter dots illustrates the event density, suggesting a maximum density at a fluorescence intensity of approximately 300 for the initial sample and 4000 for the target sample. Cells in the target sample tend to have higher fluorescence expression, which is also confirmed by the medians of the fluorescence intensity ($M_{\mathrm{Init}} = 728$ and $M_{\mathrm{Targ}} = 1684$, see Figure 3.33). The solid green rectangle in Figure 3.33 indicates a manually chosen gating strategy for GFP+ cells. The percentage of events within that gate is $c_{\mathrm{GFP}+}^{\mathrm{Init}} = \frac{3957}{7428} \cdot 100 = 53.2\%$ for the initial sample and $c_{\mathrm{GFP}+}^{\mathrm{Targ}} = \frac{1516}{2180} \cdot 100 = 69.5\%$ for the target sample. When doublets are omitted from the count of GFP+ cells (region indicated by the dashed green rectangle in Figure 3.33), the concentration of GFP+ cells is $c_{\mathrm{GFP}+}^{\mathrm{Init}} = \frac{2949}{7428} \cdot 100 = 39.7\%$ in the initial sample and $c_{\mathrm{GFP}+}^{\mathrm{Targ}} = \frac{1187}{2180} \cdot 100 = 54.4\%$ in the target sample.


Figure 3.33 RT-FDC analysis after sorting rod precursor cells

The scatterplots show RT-FDC experiments of the initial and target sample. The axes display cell size and fluorescence expression, and the color code represents the density of scatter dots. $M_{\mathrm{Init}}$ (= 728) and $M_{\mathrm{Targ}}$ (= 1684) show the locations of the medians of the fluorescence intensity. The solid green box indicates a gating strategy to select GFP+ events, resulting in 53.2% and 69.5% GFP+ cells in the initial and target sample, respectively. An alternative gating strategy, indicated by dashed lines, results in 39.7% and 54.4% GFP+ cells in the initial and target sample, respectively.

The sorting process shifted the distribution of fluorescence expression towards higher values, which indicates an enrichment of GFP+ rod precursor cells. Both gating strategies (solid and dashed green rectangles in Figure 3.33) indicate an increase in the fraction of GFP+ cells of approximately 15 percentage points.
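The size of the enrichment can be checked from the percentages reported for Figure 3.33. The sketch below takes those reported values as inputs and computes the absolute gain and a relative enrichment factor. The function name `enrichment` and the use of a fold-change as a second measure are illustrative choices for this example, not part of the original analysis.

```python
def enrichment(initial_pct, target_pct):
    """Return (absolute gain, fold-change) between two GFP+ percentages."""
    gain = target_pct - initial_pct   # difference of the two percentages
    fold = target_pct / initial_pct   # relative enrichment factor
    return gain, fold

# Percentages as reported for the two gating strategies in Figure 3.33.
solid_gain, solid_fold = enrichment(53.2, 69.5)    # solid green rectangle
dashed_gain, dashed_fold = enrichment(39.7, 54.4)  # dashed green rectangle
```

Both gating strategies give an absolute gain close to 15, while the fold-change is somewhat larger for the dashed gate because its initial fraction is lower.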

While the presented sorting experiment shows enrichment of rod precursor cells, it remains an open question whether the suggested MLP is capable of generalizing to new biological samples and new sorting chips, given enough training data. Furthermore, the suggested MLP architecture has so far only been applied to a binary classification task on retina cells (rod vs. non-rod), and it is not clear whether the architecture would also work well for other specimens and for more classes. Therefore, in the next section, a large number of existing datasets of human blood is leveraged to answer these questions.
