In our effort to collect, organize and assemble data from lymphocyte cDNA libraries, we assign DNA restriction sites collectively to the spots on two-dimensional (2D) gel patterns. To test the efficiency and reliability of this approach, we have modeled the restriction analysis of cDNA libraries with a panel of restriction endonucleases. The work has two parts. In the first, we chose 255 proteins from the EMBL database and determined whether or not their coding sequences contain restriction sites for the enzymes of our choice. To obtain sufficient discriminatory power, we decided to use a relatively large number of enzymes with both low and high cutting frequencies. In total, 13 restriction enzymes were chosen, which can distinguish 2^13, or 8192, different restriction site combinations. We have compiled a table in which the absence or presence of each restriction site yields a pattern of 'zeros' and 'ones'. Such a restriction pattern can be read as a binary number. Binary numbers of up to 13 digits would assign each of the 255 proteins uniquely if the nucleotide sequences were truly random. As the restriction sites are not randomly distributed, the 'typing' does not yield a unique assignment. The choice of sequences was not random either. In fact, some human nucleotide sequences share the same cut number (the decimal equivalent of the binary number representing the restriction pattern). In spite of this redundancy, 141 coding sequences could be uniquely distinguished by the above treatment. In the second part of the project, we used the same coding sequences to prepare two-dimensional maps (plots of charge vs size) of the kind obtained from experimental 2D gels, and submitted such a map, together with 13 maps of restriction enzyme treated populations, to computer image analysis.
Ideally, one would expect results (cut numbers) congruent with those obtained in the first part of the work. In the modeled system, however, we were confronted with 2D maps that closely resembled the experimental situation (e.g. some spots were close together or overlapping) and with instances of incorrect spot detection yielding 'false cut numbers'. Of the 255 proteins, we were able to assign 161 unequivocally. To implement the model in an actual experiment, we will perform the digestion with the restriction enzymes in duplicate, and only spots assigned the same cut number in the two independent treatments will be considered as carrying a valid restriction tag.
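The cut-number encoding and the duplicate-digest rule described above can be illustrated with a minimal sketch. It is not the authors' software: the abstract does not name the 13 enzymes or the bit order, so the three recognition sites (EcoRI, BamHI, HindIII), the convention that the first enzyme is the most significant bit, and the function names `cut_number`, `uniquely_typed` and `validated_tags` are all assumptions made for illustration.

```python
from collections import Counter

# Assumed example panel; the abstract does not list the 13 enzymes used.
# EcoRI, BamHI, HindIII. All three sites are palindromic, so scanning
# one strand is enough to detect them.
SITES = ["GAATTC", "GGATCC", "AAGCTT"]


def cut_number(seq, sites=SITES):
    """Read the presence/absence pattern of the sites as a binary number.

    Assumed convention: the first enzyme in `sites` is the most
    significant bit; a site that occurs at least once contributes a 1.
    """
    seq = seq.upper()
    n = 0
    for site in sites:
        n = (n << 1) | (site in seq)
    return n


def uniquely_typed(cut_numbers):
    """Return the names whose cut number is shared with no other entry."""
    counts = Counter(cut_numbers.values())
    return {name for name, n in cut_numbers.items() if counts[n] == 1}


def validated_tags(run_a, run_b):
    """Keep only spots given the same cut number in both digests."""
    return {spot: n for spot, n in run_a.items() if run_b.get(spot) == n}
```

With 13 sites in the panel, `cut_number` returns a value between 0 and 8191, matching the 2^13 combinations mentioned in the text. Sequences that share a cut number are excluded by `uniquely_typed`, and `validated_tags` drops any spot whose two independent digests disagree.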
|Original language||English (US)|
|Number of pages||14|
|Journal||Applied and theoretical electrophoresis : the official journal of the International Electrophoresis Society|
|State||Published - 1993|