An Evolutionary Model of a Complementary Circular Code
✍ Scribed by Didier G. Arquès; Jean-Paul Fallot; Christian J. Michel
- Publisher: Elsevier Science
- Year: 1997
- Language: English
- File size: 444 KB
- Volume: 185
- Category: Article
- ISSN: 0022-5193
✦ Synopsis
The subset X0 = [sequence: see text] of 20 trinucleotides occurs preferentially in frame 0 (the reading frame established by the ATG start trinucleotide) of protein (coding) genes of both prokaryotes and eukaryotes. This subset X0 has the rare property (6 × 10⁻⁸) of being a complementary maximal circular code with two permutated maximal circular codes X1 and X2 in frames 1 and 2 respectively (frame 0 shifted by one and two nucleotides respectively in the 5'-3' direction). X0 is called a C3 code. A quantitative study of the three subsets X0, X1 and X2 in the three frames 0, 1 and 2 of eukaryotic protein genes shows that their occurrence frequencies are constant functions of the trinucleotide positions in the sequences. The frequencies of X0, X1 and X2 in frame 0 of the eukaryotic protein genes are 48.5%, 29% and 22.5% respectively. These properties are not observed in the 5' and 3' regions of eukaryotes, where X0, X1 and X2 occur with variable frequencies around the random value (1/3). Several unexpectedly observed frequency asymmetries, e.g. the frequency difference between X1 and X2 in frame 0, are related to a new property of the C3 code X0 involving substitutions. An evolutionary model with three parameters (p, q, k), based on an independent mixing of the 20 codons (trinucleotides in frame 0) of X0 with equiprobability (1/20) followed by k ≈ 5 substitutions per codon in the three codon sites in proportions p ≈ 0.1, q ≈ 0.1 and r = 1 - p - q ≈ 0.8 respectively, reproduces the frequencies of X0, X1 and X2 observed in the three frames of protein genes and explains these asymmetries.
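The mixing-then-substitution process described in the synopsis can be illustrated with a small Monte Carlo sketch in Python. This is not the authors' analytical model. The function names are hypothetical, and the sketch rests on three assumptions: X1 and X2 are obtained from X0 by circularly shifting each trinucleotide by one and two letters, each of the k substitutions picks a codon site with probabilities (p, q, r) and replaces the nucleotide there with a different one chosen uniformly, and frequencies are taken over all trinucleotides read in a frame. Because the 20 trinucleotides of X0 are not listed here, the code set is passed in as a parameter and the usage lines rely on a two-word toy set.

```python
import random
from collections import Counter

NUCLEOTIDES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def permute(code, shift):
    """Circularly shift each trinucleotide left by `shift` letters
    (assumed construction: shift=1 gives X1, shift=2 gives X2)."""
    return {w[shift:] + w[:shift] for w in code}


def is_self_complementary(code):
    """True if the reverse complement of every word is also in the code."""
    return all(
        "".join(COMPLEMENT[n] for n in reversed(w)) in code for w in code
    )


def simulate_gene(code, n_codons, k, p, q, rng):
    """Draw codons uniformly from `code`, then apply k substitutions per codon.
    Sites 1, 2, 3 are hit with probabilities p, q and r = 1 - p - q."""
    words = sorted(code)
    weights = [p, q, 1.0 - p - q]
    codons = []
    for _ in range(n_codons):
        codon = list(rng.choice(words))
        for _ in range(k):
            site = rng.choices([0, 1, 2], weights=weights)[0]
            # Assumption: a substitution always changes the nucleotide.
            others = [n for n in NUCLEOTIDES if n != codon[site]]
            codon[site] = rng.choice(others)
        codons.append("".join(codon))
    return "".join(codons)


def frame_frequencies(seq, frame, x0):
    """Fraction of trinucleotides read in `frame` that belong to X0, X1, X2.
    The denominator is every trinucleotide read, classified or not."""
    classes = (("X0", x0), ("X1", permute(x0, 1)), ("X2", permute(x0, 2)))
    counts = Counter()
    total = 0
    for i in range(frame, len(seq) - 2, 3):
        word = seq[i:i + 3]
        total += 1
        for name, members in classes:
            if word in members:
                counts[name] += 1
                break
    return {name: counts[name] / total for name, _ in classes}


if __name__ == "__main__":
    toy_code = {"AAC", "GTT"}  # illustrative stand-in, NOT the real X0
    rng = random.Random(0)
    gene = simulate_gene(toy_code, n_codons=10_000, k=5, p=0.1, q=0.1, rng=rng)
    for frame in (0, 1, 2):
        print(frame, frame_frequencies(gene, frame, toy_code))
```

With the actual X0 supplied in place of the toy set, the frame 0 output could be compared against the 48.5%, 29% and 22.5% quoted above. Setting k = 0 is a quick sanity check, since every frame 0 trinucleotide then belongs to X0.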
📜 SIMILAR VOLUMES
Recently, shifted periodicities 1 modulo 3 and 2 modulo 3 have been identified in protein (coding) genes of both prokaryotes and eukaryotes with autocorrelation functions analysing eight of 64 trinucleotides (Arquès et al., 1995). This observation suggests that the trinucleotides are associated with
A circular code has been identified in the protein (coding) genes of both eukaryotes and prokaryotes by using a statistical method called trinucleotide frequency (TF) method [Arquès & Michel (1996). J. theor. Biol. 182, 45–58]. Recently, a probabilistic model based on the nucleotide frequencies with
A new maximal circular code X0(MIT) with two permutated maximal circular codes X1(MIT) and X2(MIT) is identified in the protein coding genes of mitochondria. The three subsets of 20 trinucleotides X0(MIT)={ACA, ACC, ATA, ATC, CTA, CTC, GAA, GAC, GAT, GCA, GCC, GCT, GGA, GGC, GGT, GTA, GTC, GTT, TTA,