Histidine-tagged expression systems are versatile with respect to purification of the fusion protein on metal-chelating resin. This versatility arises because, if the protein is insoluble upon expression, denaturing conditions can be used to solubilise it without any adverse effect on the binding of the fusion protein to the affinity matrix. Non-ionic detergents (Triton X-100 or Tween-20) and high salt concentrations (0.1-1 M) likewise have no effect on the binding of the fusion protein. This facilitates the removal of non-specific contaminants as well as any host organism proteins that may have an affinity for the cation-charged resin (Hochuli et al., 1988; Stuber et al., 1990). It also makes the resin useful for purifying nucleotide-binding proteins and removing DNA or RNA from them.
The relative stability of the interaction between the fusion protein and the affinity matrix under varying concentrations of chaotropic salts makes this system ideal for protein folding/refolding studies. If the product is produced in an unfolded state, the resin can serve as a stable anchor for the protein during controlled refolding (Jaenicke & Rudolph, 1990).
Due to the relatively small size of the fusion partner, it exhibits very low antigenicity, which enables the generation of antibodies to the inserted recombinant sequence. The disadvantage is that this low antigenicity makes it very difficult to raise antibodies to the polyhistidine tag (making detection of the expressed product difficult), and many companies sell antibodies to the polylinker regions expressed by their respective histidine-tagged expression systems. An alternative means of detecting the fusion product is epitope-tagging of the -COOH terminus of the expressed product (Olah et al., 1994). By inserting the sequence encoding the last 12 amino acids of protein kinase C epsilon into the MCS of the pTrcHisA vector, expressed products can be easily identified using a readily available antibody specific for the epsilon-peptide. Recent work by Zentgraf and co-workers (1995) has yielded a highly specific monoclonal antibody to the polyhistidine tag by using a mixture of his-tagged proteins for immunisation.
As the fusion partner is small compared with most recombinant proteins, the tag will not usually interfere with the folding or function of the protein in question.
The pTrcHis system has been used to express a variety of proteins for monoclonal and polyclonal antibody production (Johnson et al., 1994; Yoo & Wolin, 1994), binding studies (Russnak et al., 1995; Kim et al., 1995) and epitope screening (Jenkins et al., 1993). IMAC has been used to purify histidine-tagged proteins for structural determination (Nikolov et al., 1992; Zhang et al., 1993; Kubalek et al., 1994) and for the purification of antibody Fab fragments (Skerra, 1994), rat protein disulphide isomerase (DeSutter et al., 1994), and kringle domains (Marti et al., 1994).
3.4. CONCLUSIONS
Both the pGEX and pTrcHis expression systems have been utilised in the expression of the G3 domain of aggrecan (Chapter 6). These systems were chosen because the preparation and expression of gene constructs is relatively straightforward, and the growth and selection of clones is easy. The use of expression systems that couple the domains to a fusion partner capable of affinity binding enables potential purification of expressed products to very high levels in just one step. Coupled with the fact that many proteins expressed by these systems are functional (or easily refolded to attain function), this makes them ideal candidates for expressing domains of aggrecan and link protein for structural analysis.
CHAPTER FOUR
STRUCTURAL PREDICTION OF THE
4.1. INTRODUCTION
Proteoglycans, which stabilise the extracellular matrix, consist of many long anionic polysaccharide chains (glycosaminoglycans) covalently attached to an extended central protein core (Hardingham & Fosang, 1992; Hardingham et al., 1992). Aggrecan is the archetypal member of this group, and contains a globular N-terminal region G1 that is constructed from an immunoglobulin fold domain and two proteoglycan tandem repeat (PTR) domains (Hardingham & Fosang, 1992; Hardingham et al., 1992). A second region G2 in aggrecan next to G1 contains two PTRs. Link protein also contains one immunoglobulin fold and two PTR domains (Neame & Barry, 1993). G1 and link protein form a very stable ternary complex with hyaluronate. X-ray and neutron solution scattering and electron microscopy show that G1, link protein and the ternary complex possess compact structures (Morgelin et al., 1988; Perkins et al., 1991; Perkins et al., 1992). Aggrecan also contains a globular C-terminal region (G3) with variable numbers of epidermal growth factor domains, followed by a carbohydrate recognition domain (CRD) belonging to Group I of the C-type lectin superfamily (Drickamer, 1993b), and a short consensus/complement repeat domain.
Both the PTR and CRD superfamilies are associated with carbohydrate binding, where the PTR binds hyaluronate and the CRD binds a variety of oligosaccharide ligands. The PTR superfamily includes the CD44 group of cell surface receptors. Previous consensus secondary structure analyses of 15-20 PTR sequences indicated the occurrence of α-helices (A) and β-strands (B) in the sequence BABABBB (Perkins et al., 1989). This is very similar to the recent consensus secondary structure prediction for 129 CRD sequences, which gave BABABBBB (Brissett & Perkins, 1996). Crystal structures of rat mannose binding protein and human E-selectin in the CRD superfamily are available for comparison with these predictions (Weis et al., 1992; Graves et al., 1994).
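The similarity between the two consensus patterns can be checked mechanically. The following is a minimal Python sketch for illustration only and is not part of the original analysis. Only the two pattern strings are taken from the text; the function name shared_prefix_length is hypothetical.

```python
# Illustrative comparison of the two consensus element orders quoted above.
# A = alpha-helix, B = beta-strand. The helper name is hypothetical.

PTR_PATTERN = "BABABBB"   # consensus for the PTR sequences
CRD_PATTERN = "BABABBBB"  # consensus for the CRD sequences


def shared_prefix_length(first: str, second: str) -> int:
    """Return the number of leading elements that two patterns share."""
    count = 0
    for x, y in zip(first, second):
        if x != y:
            break
        count += 1
    return count


n_shared = shared_prefix_length(PTR_PATTERN, CRD_PATTERN)
extra_in_crd = CRD_PATTERN[n_shared:]

print(f"shared leading elements: {n_shared}")
print(f"additional CRD elements: {extra_in_crd!r}")
```

Under this comparison the whole PTR pattern matches the start of the CRD pattern, and the CRD pattern carries one additional C-terminal β-strand. A match in element order alone does not establish a common fold, which is why the fold recognition and modelling analyses are needed.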
In this chapter, this similarity between the PTR and CRD superfamilies is examined further, using a proven approach to predict a protein fold prior to its crystal structure determination (Edwards & Perkins, 1995; Lee et al., 1995; Edwards & Perkins, 1996).
Consensus structure predictions were performed for 59 PTR sequences for comparison with the previous analysis of 129 CRD sequences. Protein fold recognition analyses were performed in which the 59 PTR sequences were scored against 254 known folds. These analyses showed a relationship between the PTR and CRD folds. Molecular graphics modelling of the PTR based on CRD crystal structures (Weis et al., 1992; Graves et al., 1994) showed that the PTR and CRD structures are characterised by a conserved α-helix/β-sheet core and distinct β-sheet regions. The model was examined for information relating to a hyaluronate binding site.
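The general idea of a consensus secondary structure prediction can be sketched as a per-column majority vote over aligned per-sequence predictions, followed by collapsing the result into an element order. This is a simplified Python illustration under stated assumptions, not the prediction procedure used in this chapter. The state letters (H = helix, E = strand, C = coil, "-" = alignment gap), the minimum run length of three residues, the toy input strings and both function names are assumptions introduced here.

```python
from collections import Counter
from itertools import groupby


def consensus_states(predictions, gap="-"):
    """Majority vote per alignment column over equal-length state strings.

    Assumed states: H = helix, E = strand, C = coil. Gaps do not vote.
    Ties are resolved in favour of the state seen first in the column.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("predictions must be aligned to equal length")
    consensus = []
    for column in zip(*predictions):
        votes = Counter(state for state in column if state != gap)
        consensus.append(votes.most_common(1)[0][0] if votes else gap)
    return "".join(consensus)


def element_order(states, min_len=3):
    """Collapse per-residue states into an element order (A = helix, B = strand).

    Runs shorter than min_len residues (an assumed cut-off) are ignored.
    """
    label = {"H": "A", "E": "B"}
    order = []
    for state, run in groupby(states):
        if state in label and len(list(run)) >= min_len:
            order.append(label[state])
    return "".join(order)


# Toy aligned predictions, invented for illustration only.
toy_predictions = [
    "CEEEECCHHHHHCC",
    "CCEEECCHHHHHHC",
    "CEEEECC-HHHHCC",
]

toy_consensus = consensus_states(toy_predictions)
print(toy_consensus)
print(element_order(toy_consensus))
```

Applied to a full alignment, an element order obtained in this way is the kind of string (such as BABABBB) that can then be compared between superfamilies.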