The aim of this thesis was to test the feasibility of assembling a set of experimental and software tools that could be used in large scale oligonucleotide fingerprinting projects. Much of the foundation for such an approach has been laid in the course of this work. The work has involved the development of both experimental procedures and analytical software tools, designed to automate very large scale data analysis.
On the experimental side there has been the development of a large scale PCR system to allow the amplification of entire cDNA libraries. In the course of this work over 100,000 cDNA clones have been amplified with this system, although hybridisation data have so far only been generated on a subset of these. Also, oligonucleotide hybridisations have been adapted to a method that enables a single person to perform hybridisations on 30 high density clone arrays per day. That can be either one probe on 1,000,000 clones or 30 probes on 36,000 clones per day. A significant amount of automation had to be developed to allow the reliable handling of tens of thousands of clones, from picking of the colonies, through amplification of the cDNA inserts, to the arraying of PCR products on high density arrays. Much of this work was carried out with a team of both scientists and engineers and was based on the further development of existing technology. Finally, a large amount of work was invested in generating the software tools required for the analysis of millions of probe/target interactions. Indeed, at least 50% of the time spent during the course of this thesis was devoted to the informatics aspects of a large scale molecular biology project.
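The two throughput figures quoted above are consistent with simple arithmetic, sketched here in Python. The function and constant names are illustrative and are not part of the software developed in this work; the density of 36,000 clones per array is an assumption inferred from the figures in the text.

```python
ARRAYS_PER_DAY = 30        # hybridisations one person can perform per day (from the text)
CLONES_PER_ARRAY = 36_000  # assumed array density, inferred from the figures in the text


def daily_throughput(probes: int) -> dict:
    """Return the clones covered and probe/clone data points for one day.

    `probes` is the number of distinct probes used that day. The arrays are
    shared equally among the probes, so it must divide ARRAYS_PER_DAY.
    """
    if probes < 1 or ARRAYS_PER_DAY % probes != 0:
        raise ValueError("probes must be a positive divisor of ARRAYS_PER_DAY")
    arrays_per_probe = ARRAYS_PER_DAY // probes
    clones = arrays_per_probe * CLONES_PER_ARRAY
    return {"clones": clones, "data_points": clones * probes}


one_probe = daily_throughput(1)      # one probe spread over all 30 arrays
thirty_probes = daily_throughput(30) # 30 probes, each on a copy of one array
```

Under these assumptions both modes yield the same number of probe/clone data points per day (about one million); only the balance between clones covered and probes applied changes.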
Incomplete attempts have been made in this work to analyse the hybridisation data generated, using the set of tools that were developed.
Only at the very end of the study were significant strides made toward the meaningful analysis of the noisy data. An invaluable contribution to the analysis was provided by the control clone data, which gave an indication of the quality of the data and therefore helped in the choice of suitable strategies. Even noisy data can contribute useful information provided its significance can be assessed correctly.
From the data generated to date and their preliminary analysis it is clear that, while useful information can be obtained, a considerable increase in both the quality and the quantity of data will be required for the approach of oligonucleotide fingerprinting to fulfil its considerable potential. Clustering of homologous sequences appears to be possible with the present dataset, even if some noise still remains in these clusters. Matching the fingerprints to known sequences has not yet been possible and will require the generation of much more data, or data of much higher quality, preferably both.
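As an illustration of what clustering by fingerprint can mean in practice, the following Python sketch groups clones whose hybridisation profiles are strongly correlated. It is not the analysis software developed in this work: the representation of a fingerprint as a list of signal intensities, the use of Pearson correlation, the single-linkage merging rule and the threshold of 0.9 are all assumptions chosen for simplicity.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no information
    return cov / sqrt(var_a * var_b)


def cluster_fingerprints(fingerprints, threshold=0.9):
    """Single-linkage clustering: clones are merged into one cluster when
    their fingerprints correlate at or above `threshold` (hypothetical value)."""
    names = list(fingerprints)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(fingerprints[a], fingerprints[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Invented example data: signal of four probes on three clones.
example = {
    "c1": [1.0, 0.1, 0.9, 0.0],
    "c2": [0.9, 0.2, 1.0, 0.1],
    "c3": [0.0, 1.0, 0.1, 0.9],
}
clusters = cluster_fingerprints(example)
```

The all-against-all comparison is quadratic in the number of clones, so a sketch of this kind would need a more efficient strategy before being applied to libraries of 100,000 clones.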
An important lesson from this work appears to be the need to derive a set of oligonucleotide sequences that function well in a fingerprinting approach. Due to the unpredictability of the hybridisation behaviour of short oligonucleotides, it appears that an empirically determined and tested set of sequences must be derived from controlled test hybridisations that can accurately evaluate the hybridisation characteristics of oligonucleotide probes under the exact experimental conditions used.
One of the less demanding applications for which an oligonucleotide fingerprinting approach can be used is as a preselection step in large scale sequencing projects. Fingerprints of the level generated in this thesis can be used to identify clones with high sequence homologies. This information can then be used to sequence initially only those clones that are different, thus considerably reducing the redundancy in sequence generation. Of course, this approach only becomes efficient when a very large number of clones are to be sequenced, as in large scale 'tag'-sequencing and genomic sequencing projects.
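The preselection idea can be sketched in a few lines of Python. The function below assumes that clones have already been grouped into clusters of similar fingerprints (by whatever method) and simply picks one clone per cluster for initial sequencing; the function name and the rule for choosing the representative are hypothetical.

```python
def select_for_sequencing(clusters):
    """Pick one representative clone per cluster and report the fraction of
    sequencing runs saved compared with sequencing every clone.

    `clusters` is a list of lists of clone names. The representative is chosen
    arbitrarily here (first name in sorted order); a real project might prefer
    the clone with the best quality fingerprint.
    """
    non_empty = [c for c in clusters if c]
    representatives = [sorted(c)[0] for c in non_empty]
    total = sum(len(c) for c in non_empty)
    saved = 1 - len(representatives) / total if total else 0.0
    return representatives, saved


# Invented example: six clones falling into three clusters.
reps, saved = select_for_sequencing([["c1", "c2", "c5"], ["c3"], ["c4", "c6"]])
```

In this invented example three of six clones are sequenced initially, halving the number of sequencing runs. The saving grows with the redundancy of the library, which is why the approach pays off mainly in very large projects.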
For future oligonucleotide fingerprinting experiments, there are many pointers in this thesis to aspects that can and should still be improved upon. On the experimental side, PCR amplification and oligonucleotide hybridisations ought to be singled out as areas in which improvements to the quality of the hybridisation data can be most readily made. The reliability of PCR amplification could be improved by switching to a different kind of microtitre plate. The large thermal mass of the Q-plates will always be a hindrance to reliable amplification. The overall success rate and yield were in fact greater in the initial waterbath PCR trials using thin walled 96-well plates. A thin walled 384-well plate is currently under design. Previously the problem with a thin walled 384-well plate has been that only small wells could be moulded by the process of thermoforming and that these were unsuitable for clone storage purposes. For PCR purposes, however, a small volume of around 15-20 µl per well would be sufficient.
The hybridisation data generated to date indicate that there is a very large variation in the hybridisation characteristics of the oligonucleotides used. Conditions used for hybridisations in this pilot study were standardised for the sake of high throughput. A great deal of improvement in the quality of the hybridisation data is likely to be achievable by modifications to the hybridisation conditions. Hybridisation protocols could be adjusted for individual oligonucleotides according to their previous hybridisation characteristics. Probes with poor signal strength could be hybridised at a higher concentration and/or specific activity and could be washed less extensively. Conversely, oligonucleotides with good signal strengths but poor sequence specificity could be washed for longer periods or at slightly higher temperatures. A modification that should be tested for all hybridisations is the use of tetra-alkyl ammonium salts to reduce the difference in contribution to the duplex stability between A-T and G-C base pairs and to increase the duplex yield of many oligonucleotides. A significant advance will also be achieved when non-radioactive detection systems can be used routinely. Recent results in the lab suggest that amplified fluorescence could provide an immediate alternative to radiolabelling (Maier et al., 1994b).
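The probe-specific adjustments suggested above amount to a simple decision rule, sketched here in Python. The scoring of signal strength and sequence specificity on a scale from 0 to 1, the threshold values and the function name are all hypothetical; only the recommended actions are taken from the text.

```python
def suggest_adjustments(signal, specificity, min_signal=0.5, min_specificity=0.5):
    """Suggest protocol changes for one probe from its earlier performance.

    `signal` and `specificity` are assumed to be scores between 0 and 1
    derived from previous hybridisations; the thresholds are placeholders.
    """
    if signal < min_signal:
        # Weak probes: raise the amount of label bound and wash more gently.
        return [
            "hybridise at higher concentration and/or specific activity",
            "wash less extensively",
        ]
    if specificity < min_specificity:
        # Strong but unspecific probes: increase washing stringency.
        return ["wash for longer or at a slightly higher temperature"]
    return []  # probe performs adequately under the standard conditions


weak_probe = suggest_adjustments(signal=0.2, specificity=0.8)
unspecific_probe = suggest_adjustments(signal=0.9, specificity=0.3)
good_probe = suggest_adjustments(signal=0.9, specificity=0.9)
```

A rule of this kind trades some of the throughput gained from standardised conditions for better data quality, since probes would have to be grouped by protocol before hybridisation.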
Further advances in the quantity of hybridisation data that can be generated will rely on continued development of the automated processes, especially the hybridisations, and on the development of miniaturised clone arrays. The concept of the 'clone chip' has been variously published (Southern et al., 1992; Fodor et al., 1993; Mirzabekov et al., 1994) and points the way to the future of large scale hybridisation approaches. Certainly, technological developments will continue to be the driving force in this area of molecular biology for some years to come.
Experience from this work has shown that a key area for further development lies in the informatics tools that are available for handling very large data sets. Not only are powerful computers and efficient programs required for the analysis of experimental data, but interfaces need to be generated through which biologists can access and interpret the output of complex statistical systems.