International Journal of Scientific & Engineering Research Volume 2, Issue 5, May-2011 1
ISSN 2229-5518
Molecular Biocoding of Insulin – Amino Acid Gly
Lutvo Kuric
Abstract - Modern science mainly treats the biochemical basis of sequencing in bio-macromolecules and processes in medicine and biochemistry. One can ask whether the language of biochemistry is an adequate scientific language to explain the phenomena in that science. Is there perhaps some other language, outside biochemistry, that determines how biochemical processes will function and what the structure and organization of living systems will be? The research results provide some answers to these questions. They reveal that the process of sequencing in bio-macromolecules is conditioned and determined not only through biochemical, but also through cybernetic and information principles. Many studies have indicated that analysis of protein sequence codes and various sequence-based prediction approaches, such as predicting drug-target interaction networks (He et al., 2010), predicting functions of proteins (Hu et al., 2011; Kannan et al., 2008), analysis and prediction of the metabolic stability of proteins (Huang et al., 2010), predicting the network of substrate-enzyme-product triads (Chen et al., 2010), membrane protein type prediction (Cai and Chou, 2006; Cai et al., 2003; Cai et al., 2004), protein structural class prediction (Cai et al., 2006; Ding et al., 2007), protein secondary structure prediction (Chen et al., 2009; Ding et al., 2009b), enzyme family class prediction (Cai et al., 2005; Ding et al., 2009a; Wang et al., 2010), identifying cyclin proteins (Mohabatkar, 2010), protein subcellular location prediction (Chou and Shen, 2010a; Chou and Shen, 2010b; Kandaswamy et al., 2010; Liu et al., 2010), among many others as summarized in a recent review (Chou, 2011), can provide timely and very useful information and insights for both basic research and drug design, and hence are widely welcomed by the scientific community. The present study attempts to develop a novel sequence-based method for studying insulin, in the hope that it may become a useful tool in the relevant areas.
Index Terms-Amino Acid Gly, Human Insulin, Insulin Model, Insulin Code.
—————————— • ——————————
The biologic role of any given protein in essential life processes, e.g., insulin, depends on the positioning of its component amino acids, and can be understood as the „positioning of letters forming words“. Each of these words has its biochemical base. If this base is expressed by corresponding discrete numbers, it can be seen that any given base has its own program, along with its own unique cybernetic and information characteristics.
Indeed, the sequencing of the molecule is determined not only by distinct biochemical features, but also by cybernetic and information principles. For this reason, research in this field deals more with the quantitative than with the qualitative characteristics of genetic information and its biochemical basis. For the purposes of this paper, specific physical and chemical factors have been selected in order to express the genetic information for insulin. Numerical values are then assigned to these factors, enabling them to be measured. In this way it is possible to determine whether a connection really exists between the quantitative ratios in the process of transfer of genetic information and the qualitative appearance of the insulin molecule. To select these factors, preference is given to classical physical and chemical parameters, including the number of atoms in the relevant amino acids, their analog values, the position of these amino acids in the peptide chain, and their frequencies. There is a large number of these parameters, and each of them gives important genetic information. Going through this process, it becomes clear that there is a mathematical relationship between quantitative ratios and the qualitative appearance of the biochemical „genetic processes“ and that there is a measurement method that can be used to describe the biochemistry of insulin.
Insulin can be represented in two different forms, i.e., a discrete form and a sequential form. In the discrete form, a molecule of insulin is represented by a set of discrete codes or a multi-dimensional vector. In the sequential form, an insulin molecule is represented by a series of amino acids according to the order of their positions in the chains of 1AI0.
Therefore, the sequential form can naturally reflect all the information about the sequence order and length of an insulin molecule. The key issue is whether we can develop a different discrete method of representing an insulin
molecule that will allow accommodation of partial,
IJSER © 2011 http://www.ijser.org
if not all, sequence order information? Because a protein sequence is usually represented by a series of amino acid codes, what numerical values should be assigned to these codes in order to optimally convert the sequence order information into a series of numbers for the discrete form representation?
The matrix mechanism of Insulin, the evolution of biomacromolecules and, especially, the biochemical evolution of Insulin language, have been analyzed by the application of cybernetic methods, information theory and system theory, respectively. The primary structure of a molecule of Insulin is the exact specification of its atomic composition and the chemical bonds connecting those atoms.
The structure 1AI0 has in total 12 chains: A,B,C,D,E,F,G,H,I,J,K,L.
1AI0:A
G | I | V | E | Q | C | C | T | S | I | C | S | L | Y | Q | L | E | N | Y | C | N |
10 | 22 | 19 | 19 | 20 | 14 | 14 | 17 | 14 | 22 | 14 | 14 | 22 | 24 | 20 | 22 | 19 | 17 | 24 | 14 | 17 |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
1AI0:B
F V N Q H I C G S H L V E A L
23 19 17 20 20 22 14 10 14 20 22 19 19 13 22
22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Y L V C G E R G F I Y T P K T
24 22 19 14 10 19 26 10 23 22 24 17 17 24 17
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
etc.
Fig. 1. Group of chains A,B,C,D,E,F,G,H,I,J,K,L.
Notes: The aforementioned amino acids are positioned from number 1 to 306. Numbers 1, 2, 3, ..., n represent the position of a given amino acid. This positioning is of key importance for understanding the programmatic, cybernetic and information principles in this protein. The scientific key for interpreting biochemical processes is the same for insulin as for other proteins and other sequences in biochemistry.
The first amino acid in this example has 10 atoms, the second one 22, the third one 19, etc. They have exactly these numbers of atoms because there are many codes in the insulin molecule, analog codes, and other coded features. In fact, there is a cybernetic algorithm in which it is „recorded“ that the first amino acid has to have 10 atoms, the second one 22, the third one 19, etc. The first amino acid has its own biochemistry, as do the second and the third, etc. The obvious conclusion is that there is a concrete relationship between quantitative ratios in the process of transfer of genetic information and qualitative appearance, i.e., the characteristics of the organism.
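The mapping from positions to atom counts described above can be sketched in a few lines of Python. This is only an illustration: the names `ATOMS`, `CHAIN_A`, `CHAIN_B`, `SEQUENCE` and `atom_counts` are ours, the atom counts are copied from Fig. 1, and we assume that chains C to L repeat chains A and B alternately (Fig. 1 lists only A and B, followed by "etc.").

```python
# Atom counts per amino acid, copied from Fig. 1 (only residues present in insulin).
ATOMS = {"G": 10, "A": 13, "S": 14, "C": 14, "T": 17, "P": 17, "N": 17,
         "V": 19, "E": 19, "Q": 20, "H": 20, "I": 22, "L": 22,
         "F": 23, "Y": 24, "K": 24, "R": 26}

CHAIN_A = "GIVEQCCTSICSLYQLENYCN"            # 1AI0:A, 21 residues
CHAIN_B = "FVNQHICGSHLVEALYLVCGERGFIYTPKT"   # 1AI0:B, 30 residues

# Assumption: the 12 chains A..L are six repetitions of the A/B pair.
SEQUENCE = (CHAIN_A + CHAIN_B) * 6

# atom_counts[i] is the number of atoms of the amino acid at position i + 1.
atom_counts = [ATOMS[aa] for aa in SEQUENCE]

print(len(atom_counts))    # 306 positions
print(atom_counts[:3])     # [10, 22, 19]
print(sum(atom_counts))    # 5640 atoms in total
```

Under this assumption the total of 5640 atoms agrees with the value of APa306 used in Section 3.2.1.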
We shall now give some mathematical evidence that will prove that in the biochemistry of insulin there really is a programmatic and cybernetic algorithm in which it is „recorded“, in the language of mathematics, how the molecule will be built and what the quantitative characteristics of the given genetic information will be.
3.2.1 Atomic progression
Step 1 (Amino acids from 1 to 306)
AC1 = 10 atoms; AC2 = 22 atoms; AC3 = 19 atoms; ...; AC306 = 17 atoms;
APa1 = AC1 = 10;
APa2 = (AC1 + AC2) = (10 + 22) = 32;
APa3 = (AC1 + AC2 + AC3) = (10 + 22 + 19) = 51;
...
APa306 = (AC1 + AC2 + AC3 + ... + AC306) = 5640 atoms;
APa1,2,3,...,n = Atomic progression of amino acids 1, 2, 3, ..., n
S1 = [AC1 + (AC1 + AC2) + (AC1 + AC2 + AC3) + ... + (AC1 + AC2 + AC3 + ... + AC306)]
S1 = (APa1 + APa2 + APa3 + ... + APa306) = (10 + 32 + 51 + ... + 5640) = 863 208;
Example:
Atomic progression 1 (APa)
(0+10) = 10; (10+22) = 32; (10+22+19) = 51; etc.
Fig. 2. Atomic progression 1 (APa) of amino acids from 1 to 306.
Notes: By using chemical-information procedures, we calculated the arithmetic progression for the information content of aforementioned aminoacids.
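Step 1 is a running (cumulative) sum of the atom counts, followed by a sum of all the partial sums. A minimal Python sketch follows; the variable names are illustrative, the atom counts are taken from Fig. 1, and the repetition of chains A and B six times is our assumption about the chains that Fig. 1 abbreviates with "etc.".

```python
from itertools import accumulate

# Atom counts per amino acid, copied from Fig. 1.
ATOMS = {"G": 10, "A": 13, "S": 14, "C": 14, "T": 17, "P": 17, "N": 17,
         "V": 19, "E": 19, "Q": 20, "H": 20, "I": 22, "L": 22,
         "F": 23, "Y": 24, "K": 24, "R": 26}

# Assumption: chains C..L repeat chains A and B alternately.
SEQUENCE = ("GIVEQCCTSICSLYQLENYCN" + "FVNQHICGSHLVEALYLVCGERGFIYTPKT") * 6
atom_counts = [ATOMS[aa] for aa in SEQUENCE]

# apa[i] is APa for position i + 1: the atoms of positions 1..i+1 added together.
apa = list(accumulate(atom_counts))

# S1 is the sum of all 306 partial sums.
s1 = sum(apa)

print(apa[:3], apa[-1])   # [10, 32, 51] 5640
print(s1)                 # 863208
```

With these inputs the sketch reproduces APa306 = 5640 and S1 = 863 208.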
Step 2 (Amino acids from 306 to 1)
AC306 = 17 atoms; AC305 = 24 atoms; AC304 = 17 atoms; ...; AC1 = 10 atoms;
APb306 = AC306 = 17;
APb305 = (AC306 + AC305) = (17 + 24) = 41;
APb304 = (AC306 + AC305 + AC304) = (17 + 24 + 17) = 58;
...
APb1 = (AC306 + AC305 + AC304 + ... + AC1) = 5640 atoms;
APb306,305,304,...,1 = Atomic progression of amino acids 306, 305, 304, ..., 1
S2 = [AC306 + (AC306 + AC305) + (AC306 + AC305 + AC304) + ... + (AC306 + AC305 + AC304 + ... + AC1)]
S2 = (APb306 + APb305 + APb304 + ... + APb1) = (17 + 41 + 58 + ... + 5640) = 868 272;
Example:
Atomic progression 2 (APb)
(0+17) = 17; (17+24) = 41; (17+24+17) = 58; etc.
Fig. 3. Schematic representation of the atomic progression 2 from 306 to 1.
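Step 2 is the same running sum taken from position 306 back to position 1. The sketch below is illustrative only; the names are ours, the atom counts come from Fig. 1, and the six-fold repetition of chains A and B is an assumption.

```python
from itertools import accumulate

# Atom counts per amino acid, copied from Fig. 1.
ATOMS = {"G": 10, "A": 13, "S": 14, "C": 14, "T": 17, "P": 17, "N": 17,
         "V": 19, "E": 19, "Q": 20, "H": 20, "I": 22, "L": 22,
         "F": 23, "Y": 24, "K": 24, "R": 26}

# Assumption: chains C..L repeat chains A and B alternately.
SEQUENCE = ("GIVEQCCTSICSLYQLENYCN" + "FVNQHICGSHLVEALYLVCGERGFIYTPKT") * 6
atom_counts = [ATOMS[aa] for aa in SEQUENCE]

# Accumulate from the last position backwards, then flip the result so that
# apb[i] is APb for position i + 1 (atoms of positions i+1..306 added together).
apb = list(accumulate(reversed(atom_counts)))[::-1]

s2 = sum(apb)

print(apb[-1], apb[-2], apb[-3])   # 17 41 58
print(apb[0])                      # 5640
print(s2)                          # 868272
```

Flipping the accumulated list keeps APb indexed by position, which makes it directly comparable with APa in the tables that follow.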
Within the digital pictures in biochemistry, the physical and chemical parameters are in strict compliance with programmatic, cybernetic and information principles. Each bar in the protein chain attracts only the corresponding amino acid, and only the relevant amino acid can be positioned at a certain place in the chain. Each peptide chain can have the exact number of amino acids necessary to meet the strictly determined mathematical conditioning. It can have as many atoms as necessary to meet the mathematical balance of the biochemical phenomenon at a certain mathematical level, etc. The digital language of biochemistry has a countless number of codes and analogue codes, as well as other information content. These pictures enable us to realize the very essence of the functioning of biochemical processes. Here are some examples:
Table 1. Atomic progression APa and APb (Amino acid Gly – position from 1 to 306 AA)
The structure 1AI0 – Amino acid Gly
Amino acid | G | G | G | G | G | G | G | G |
Number of atoms | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Rank | 1 | 29 | 41 | 44 | 52 | 80 | 92 | 95 |

Amino acid | G | G | G | G | G | G | G | G |
Number of atoms | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Rank | 103 | 131 | 143 | 146 | 154 | 182 | 194 | 197 |

Amino acid | G | G | G | G | G | G | G | G |
Number of atoms | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Rank | 205 | 233 | 245 | 248 | 256 | 284 | 296 | 299 |
Table 1. Schematic representation of the atomic progression APa and APb (Amino acid Gly – position from 1 to 306 AA).
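The ranks listed in Table 1 are simply the positions at which Gly occurs in the 306-residue sequence. A short Python sketch, assuming (as an illustration, not as something stated in Fig. 1) that chains C to L repeat chains A and B alternately:

```python
CHAIN_A = "GIVEQCCTSICSLYQLENYCN"            # 1AI0:A
CHAIN_B = "FVNQHICGSHLVEALYLVCGERGFIYTPKT"   # 1AI0:B

# Assumption: six repetitions of the A/B pair give the 306 positions.
SEQUENCE = (CHAIN_A + CHAIN_B) * 6

# Ranks are 1-based positions of glycine ("G") in the sequence.
gly_ranks = [rank for rank, aa in enumerate(SEQUENCE, start=1) if aa == "G"]

print(len(gly_ranks))   # 24
print(gly_ranks[:8])    # [1, 29, 41, 44, 52, 80, 92, 95]
```

Each A/B pair contributes four Gly residues (ranks 1, 29, 41 and 44 in the first pair), and every later pair shifts these ranks by 51.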
Notes: Having mathematically analyzed the atomic progression model of insulin (Table 1), we have found that the protein code is based on a periodic law. This being the only way to „read“ the picture, the solution of the main problem (concerning an arrangement where each amino acid takes only one, precisely determined position in the code) is quite manifest:
The atomic progression model of insulin should, in fact, be „remodelled“ into a periodic system. Examples:
Atomic progression APa and APb
G G G G G G G G
Rank 299 1 29 296 41 284 44 256 > 1250
5496 10 523 5441 741 5223 796 4710 > 22940
> 680
5640 154 209 5127 427 4909 940 4854 > 22260
G G G G G G G G
Rank 1 299 296 29 284 41 256 44 > 1250
R > (5496-5640) = (-)144; (10-154) = (-)144; (523-209) = 314; etc.
G G G G G G G G
Rank 52 248 80 245 92 233 95 205 > 1250
950 4556 1463 4501 1681 4283 1736 3770 > 22940
> 680
1094 4700 1149 4187 1367 3969 1880 3914 > 22260
G G G G G G G G
Rank 248 52 245 80 233 92 205 95 > 1250
G G G G G G G G
Rank 103 197 131 194 143 182 146 154 > 1250
1890 3616 2403 3561 2621 3343 2676 2830 > 22940
> 680
2034 3760 2089 3247 2307 3029 2820 2974 > 22260
G G G G G G G G
Rank 197 103 194 131 182 143 154 146 > 1250
G G G G G G G G G G G G
10 10 10 10
Rank 299 1 44 256 52 248 95 205 103 197 146 154
5496 10 796 4710 950 4556 1736 3770 1890 3616 2676 2830
-144 -144 -144 -144 -144 -144 -144 -144 -144 -144 -144 -144
5640 154 940 4854 1094 4700 1880 3914 2034 3760 2820 2974
G G G G G G G G G G G G
Rank 10 10 10 10 10 10 10 10 10 10 10 10
1 299 256 44 248 52 205 95 197 103 154 146
G G G G G G G G G G G G
10 10 10 10
Rank 29 296 41 284 80 245 92 233 131 194 143 182
523 5441 741 5223 1463 4501 1681 4283 2403 3561 2621 3343
314 314 314 314 314 314 314 314 314 314 314 314
209 5127 427 4909 1149 4187 1367 3969 2089 3247 2307 3029
G G G G G G G G G G G G
10 10 10 10 10 10 10 10 10 10 10 10
Rank 296 29 284 41 245 80 233 92 194 131 182 143
Fig. 4. Atomic progression APa and APb (Amino acid Gly – position from 1 to 306 AA).
In this example, the atomic progressions APa and APb of the amino acid Gly yield the codes 144 and 314.
As we see, the insulin code is itself a unique structure of programmatic, cybernetic and informational systems and laws.
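The differences R shown in Fig. 4 can be recomputed from the sequence. The sketch below is a hedged illustration: the names are ours, the six-fold repetition of chains A and B is an assumption, and the pairing rule (the k-th smallest rank of a group is paired with the k-th largest rank of the same group) is our reading of the rank rows in Fig. 4, where the groups are the Gly ranks with even APa and with odd APa.

```python
from itertools import accumulate

# Atom counts per amino acid, copied from Fig. 1.
ATOMS = {"G": 10, "A": 13, "S": 14, "C": 14, "T": 17, "P": 17, "N": 17,
         "V": 19, "E": 19, "Q": 20, "H": 20, "I": 22, "L": 22,
         "F": 23, "Y": 24, "K": 24, "R": 26}

# Assumption: chains C..L repeat chains A and B alternately.
SEQUENCE = ("GIVEQCCTSICSLYQLENYCN" + "FVNQHICGSHLVEALYLVCGERGFIYTPKT") * 6
counts = [ATOMS[aa] for aa in SEQUENCE]
apa = list(accumulate(counts))                    # progression 1 -> 306
apb = list(accumulate(reversed(counts)))[::-1]    # progression 306 -> 1

gly = [r for r, aa in enumerate(SEQUENCE, start=1) if aa == "G"]
even_group = [r for r in gly if apa[r - 1] % 2 == 0]   # 1, 44, 52, ..., 299
odd_group = [r for r in gly if apa[r - 1] % 2 == 1]    # 29, 41, 80, ..., 296


def r_codes(group):
    """Set of APa(rank) - APb(partner) over the mirrored pairs of a group."""
    return {apa[r - 1] - apb[p - 1] for r, p in zip(group, reversed(group))}


print(r_codes(even_group))   # {-144}
print(r_codes(odd_group))    # {314}
```

For example, rank 299 is paired with rank 1, giving 5496 - 5640 = -144, and rank 29 is paired with rank 296, giving 523 - 209 = 314, as in Fig. 4.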
The research we carried out has shown that atomic progressions are one of the quantitative characteristics in biochemistry. Atomic progression is, actually, a discrete code that protects and guards genetic information coded in biochemical processes. This is a recently discovered code, and more detailed knowledge of it is yet to be gained.
In a similar way we shall calculate bio codes of other unions of amino acids. Once we do this, we will find out that all these unions of amino acids are connected by various bio codes, analogue codes as well as other quantitative features. Examples:
Atomic progression APa and APb
G G G G G G G G
@ | @ | @ | @ | @ | @ | @ | @ |
APa | 5496 10 523 5441 | 741 5223 796 4710 | |||||
APa | 950 4556 1463 4501 | 1681 4283 1736 3770 | |||||
APa | 1890 3616 2403 3561 | 2621 3343 2676 2830 | |||||
APb | 5640 154 209 5127 | 427 4909 940 4854 | |||||
APb | 1094 4700 1149 4187 | 1367 3969 1880 3914 | |||||
APb | 2034 3760 2089 3247 | 2307 3029 2820 2974 |
@ @ @ @
67800 67800
G G G G Sum
> 11470
> 11470
> 11470
G G G G Sum
> 11470
> 11470
> 11470
G G G G Sum
> 11130
> 11130
> 11130
G G G G Sum
> 11130
> 11130
> 11130
Fig. 5. Atomic progression APa and APb (Amino acid Gly – position from 1 to 306 AA).
The atomic progressions presented in figure 2 are calculated using the relationship between corresponding groups of those progressions. These are groups with different progressions. There are different ways and methods of selecting these groups of progressions. We hope that science will determine which method is most efficient for this selection.
3.2.2 Rank of atomic progression
G G G G G G G G Sum
@ @ @ @ @ @ @ @
Rank 299 1 29 296 41 284 44 256 1250
Rank 1 299 296 29 284 41 256 44 1250
Rank 52 248 80 245 92 233 95 205 1250
Rank 248 52 245 80 233 92 205 95 1250
Rank 103 197 131 194 143 182 146 154 1250
Rank 197 103 194 131 182 143 154 146 1250
Sum 900 900 975 975 975 975 900 900
G G G G
@ @ @ @
Rank 299 1 29 296 > 625
Rank 1 299 296 29 > 625
Rank 52 248 80 245 > 625
Rank 248 52 245 80 > 625
Rank 103 197 131 194 > 625
Rank 197 103 194 131 > 625
G G G G Sum
@ @ @ @
Rank 41 284 44 256 > 625
Rank 284 41 256 44 > 625
Rank 92 233 95 205 > 625
Rank 233 92 205 95 > 625
Rank 143 182 146 154 > 625
Rank 182 143 154 146 > 625
G G G G Sum
@ @ @ @
Rank 299 1 44 256 > 600
Rank 1 299 256 44 > 600
Rank 52 248 95 205 > 600
Rank 248 52 205 95 > 600
Rank 103 197 146 154 > 600
Rank 197 103 154 146 > 600
Sum 900 900 900 900
G G G G Sum
@ @ @ @
Rank 29 296 41 284 > 650
Rank 296 29 284 41 > 650
Rank 80 245 92 233 > 650
Rank 245 80 233 92 > 650
Rank 131 194 143 182 > 650
Rank 194 131 182 143 > 650
Sum 975 975 975 975
Fig. 6. Rank atomic progression APa and APb (Amino acid Gly – position from 1 to 306 AA).
In these examples, there is a mathematical balance in the rows and columns of this figure.
In fact, we have found that a mathematical balance is achieved in the distribution of sequences in figure 6.
3.2.3 Correlation of atomic progression
The correlation of the atomic progressions of this amino acid gives a variety of codes. Here are some examples:
G | G | G | G | G | G | |||||||||
10 | 10 | 10 | 10 | 10 | 10 | |||||||||
41 | 284 | > | 325 | 80 | 245 | > | 325 | 143 | 182 | > | 325 | |||
APa | 741 | 5223 | > | 5964 | 1463 | 4501 | > | 5964 | 2621 | 3343 | > | 5964 | ||
APb | 4909 | 427 | > | 5336 | 4187 | 1149 | > | 5336 | 3029 | 2307 | > | 5336 | ||
@ | @ | @ | @ | @ | @ | @ | @ | @ | ||||||
R | 314 | 314 | 11300 | 314 | 314 | 11300 | 314 | 314 | 11300 |
G | G | G | G | G | G | |||||||||
10 | 10 | 10 | 10 | 10 | 10 | |||||||||
44 | 256 | > | 300 | 52 | 248 | > | 300 | 146 | 154 | > | 300 | |||
APa | 796 | 4710 | > | 5506 | 950 | 4556 | > | 5506 | 2676 | 2830 | > | 5506 | ||
APb | 4854 | 940 | > | 5794 | 4700 | 1094 | > | 5794 | 2974 | 2820 | > | 5794 | ||
@ | @ | @ | @ | @ | @ | @ | @ | @ | ||||||
R | 314 | 314 | 11300 | 314 | 314 | 11300 | 314 | 314 | 11300 |
Fig. 7. Correlation of atomic progression and rank of APa and APb (Amino acid Gly – position from 1 to 306 AA).
In those examples we have the correlation of atomic progression and rank of APa and APb.
3.2.4 Odd and even progression
The progressions APa and APb consist, in fact, of odd and even numbers. These numbers are one of the keys to coding and decoding the insulin molecule. The decoding can be carried out as follows:
Even progression
APa | 10 796 950 1736 1890 2676 2830 3616 3770 4556 4710 5496 5640 4854 4700 3914 3760 2974 2820 2034 1880 1094 940 154 |
APb | 10 796 950 1736 1890 2676 2830 3616 3770 4556 4710 5496 5640 4854 4700 3914 3760 2974 2820 2034 1880 1094 940 154 |
5650 5650 5650 5650 5650 5650 5650 5650 5650 5650 5650 5650 |
etc.
The even progressions are:
APa = (10, 5640); (1736, 3914); (1890, 3760); etc. The progressions occur at odd and even ranks.
Odd rank
G | G | G | G | G | G | ||
10 | 10 | 10 | 10 | 10 | 10 | ||
Rank 1 | 95 | 103 | 197 | 205 | 299 | > | 900 |
16518
> 17382
Even rank
G G G G G G
10 10 10 10 10 10
Rank 44 52 146 154 248 256 > 900
16518
> 17382
Odd rank = Even rank = 900; Odd APa = Even APa = 16518; Odd APb = Even APb = 17382;
Fig. 8. Odd and even progression of aminoacid Gly.
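The balance stated in Fig. 8 (odd rank = even rank = 900, with equal APa and APb sums) can be checked numerically. This is a sketch under two stated assumptions: chains C to L repeat chains A and B alternately, and the ranks shown in Fig. 8 are the Gly ranks whose APa value is even. All names are illustrative.

```python
from itertools import accumulate

# Atom counts per amino acid, copied from Fig. 1.
ATOMS = {"G": 10, "A": 13, "S": 14, "C": 14, "T": 17, "P": 17, "N": 17,
         "V": 19, "E": 19, "Q": 20, "H": 20, "I": 22, "L": 22,
         "F": 23, "Y": 24, "K": 24, "R": 26}

# Assumption: chains C..L repeat chains A and B alternately.
SEQUENCE = ("GIVEQCCTSICSLYQLENYCN" + "FVNQHICGSHLVEALYLVCGERGFIYTPKT") * 6
counts = [ATOMS[aa] for aa in SEQUENCE]
apa = list(accumulate(counts))
apb = list(accumulate(reversed(counts)))[::-1]

# Gly ranks whose APa value is an even number (1, 44, 52, 95, ...).
group = [r for r, aa in enumerate(SEQUENCE, start=1)
         if aa == "G" and apa[r - 1] % 2 == 0]


def totals(ranks):
    """Return (sum of ranks, sum of APa, sum of APb) for the given ranks."""
    return (sum(ranks),
            sum(apa[r - 1] for r in ranks),
            sum(apb[r - 1] for r in ranks))


odd_totals = totals([r for r in group if r % 2 == 1])
even_totals = totals([r for r in group if r % 2 == 0])

print(odd_totals)    # (900, 16518, 17382)
print(even_totals)   # (900, 16518, 17382)
```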
Notes: Within the digital pictures in biochemistry, the physical and chemical parameters are in strict compliance with programmatic, cybernetic and information principles. Each bar in the protein chain attracts only the corresponding amino acid, and only the relevant amino acid can be positioned at a certain place in the chain. Each peptide chain can have the exact number of amino acids necessary to meet the strictly determined mathematical conditioning. It can have as many atoms as necessary to meet the mathematical balance of the biochemical phenomenon at a certain mathematical level, etc. The digital language of biochemistry has a countless number of codes and analogue codes, as well as other information content. These pictures enable us to realize the very essence of the functioning of biochemical processes.
Odd rank progression
G | G | G | G | G | G | ||
@ | @ | @ | @ | @ | @ | ||
Rank 1 | 299 | 95 | 205 | 103 | 197 | > | 900 |
10 5496 1736 3770 1890 3616 > 16518
154 | 5640 | 1880 | 3914 | 2034 | 3760 | > | 17382 |
@ | @ | @ | @ | @ | @ | ||
G | G | G | G | G | G | ||
Rank 299 | 1 | 205 | 95 | 197 | 103 | > | 900 |
34836
(1+10+(-)144+154+299…+ 103) = 34836;
Even rank progression
796 4710 950 4556 2676 2830 > 16518
940 | 4854 | 1094 | 4700 | 2820 | 2974 | > | 17382 |
@ | @ | @ | @ | @ | @ | ||
G | G | G | G | G | G | ||
Rank 256 | 44 | 248 | 52 | 154 | 146 | > | 900 |
(44+796+ (-)144+940+256…+ 146) = 34836;
34836
Fig. 9. Odd and even rank progression of aminoacid Gly.
Odd AP-a progression
APa | 523 741 1463 1681 2403 2621 3343 3561 4283 4501 5223 5441 5127 4909 4187 3969 3247 3029 2307 2089 1367 1149 427 209 |
APb | 523 741 1463 1681 2403 2621 3343 3561 4283 4501 5223 5441 5127 4909 4187 3969 3247 3029 2307 2089 1367 1149 427 209 |
5650 5650 5650 5650 5650 5650 5650 5650 5650 5650 5650 5650 |
Odd APa progression – Even rank
G G G G G G
@ @ @ @ @ @
Rank 80 296 92 284 182 194
1463 5441 1681 5223 3343 3561
209 | 4187 | 427 | 3969 | 2089 | 2307 | |
@ | @ | @ | @ | @ | @ | |
G | G | G | G | G | G | |
Rank | 296 | 80 | 284 | 92 | 194 | 182 |
Odd APa progression – Odd rank
G G G G G G
@ @ @ @ @ @
Rank 29 245 41 233 131 143
@ @ @ @ @ @
APa 523 4501 741 4283 2403 2621
1149 | 5127 | 1367 | 4909 | 3029 | 3247 | |
@ | @ | @ | @ | @ | @ | |
G | G | G | G | G | G | |
Rank 245 | 29 | 233 | 41 | 143 | 131 | |
Even rank and odd rank
G G G G G G
@ @ @ @ @ @
Even rank 80 296 92 284 182 194
Odd rank 29 245 41 233 131 143
Difference 51 51 51 51 51 51
Even rank 296 80 284 92 194 182
Odd rank 245 29 233 41 143 131
Difference 51 51 51 51 51 51
(80-29) = 51; (296-245) = 51; (92-41) = 51;
Even APa and odd rank
APa rank
G G G G G G
@ @ @ @ @ @
even 1463 5441 1681 5223 3343 3561
APa rank
odd 523 4501 741 4283 2403 2621
940 940 940 940 940 940
APb | ||||||
odd | 1149 | 5127 | 1367 | 4909 | 3029 | 3247 |
APb | ||||||
even | 209 | 4187 | 427 | 3969 | 2089 | 2307 |
940 | 940 | 940 | 940 | 940 | 940 |
(1463-523) = 940; (5441-4501) = 940; (1681-741) = 940;
Figure 10. Odd AP-a progression, odd APa progression–even rank, odd APa progression –odd rank, even rank and odd rank and even APa and odd rank
3.3 Bio frequency
Insulin is composed of amino acids with various numerical values. These numerical values are in an irregular order. For example, the first amino acid has 10 atoms and the second one 22; the frequency between them is X. The second amino acid has 22 atoms and the third one 19; the frequency between them is Y; etc. Frequency is a measure of the interval between the numerical values of consecutive amino acids in a protein. This value can be positive, negative or zero. These frequencies show us a completely new dimension of protein sequencing. Through these frequencies we can establish which amino acids are of primary, and which are of secondary significance in the biochemical processes of insulin. Here is a concrete example:
G | I | V | E | Q | C | C | T | |
10 | 22 | 19 | 19 | 20 | 14 | 14 | 17 | |
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
10 | 12 | -3 | 0 | 1 | -6 | 0 | 3 | -3 |
From 0 to 10 = 10; From 10 to 22 = 12; From 22 to 19 = (-) 3; From 19 to 19 = 0; etc
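The frequency row of this example is the sequence of differences between consecutive atom counts, starting from a baseline of 0. A minimal Python sketch (the names `atom_counts` and `frequencies` are illustrative; the nine atom counts are those of the residues G, I, V, E, Q, C, C, T, S at positions 1 to 9 in Fig. 1):

```python
# Atom counts of the first nine amino acids of chain A (G I V E Q C C T S).
atom_counts = [10, 22, 19, 19, 20, 14, 14, 17, 14]

# Frequency = current value minus previous value, with 0 before the first one.
frequencies = [current - previous
               for previous, current in zip([0] + atom_counts, atom_counts)]

print(frequencies)   # [10, 12, -3, 0, 1, -6, 0, 3, -3]
```

The last value, -3, is the step from T (17 atoms) at position 8 to S (14 atoms) at position 9, which is why the frequency row in the example has one more entry than the eight amino acids shown.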
A schematic representation of the amino acid Gly and its frequencies is shown in Fig. 11.
G | G | G | G | G | G | G | G | G | G | G | G | ||
10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | ||
1 | 29 | 41 | 44 | 52 | 80 | 92 | 95 | 103 | 131 | 143 | 146 | ||
12 | 4 | 9 | 13 | 12 | 4 | 9 | 13 | 12 | 4 | 9 | 13 | > | 114; |
G G G G G G G G G G G G
10 10 10 10 10 10 10 10 10 10 10 10
154 182 194 197 205 233 245 248 256 284 296 299
12 4 9 13 12 4 9 13 12 4 9 13 > 114;
@@
G G G G G G G G
@ @ @ @ @ @ @ @
Rank 1 299 29 296 41 284 44 256 > 1250
12 13 4 9 9 4 13 12 > 76
> 0
13 12 9 4 4 9 12 13 > 76
@ @ @ @ @ @ @ @
G G G G G G G G
Rank 299 1 296 29 284 41 256 44 > 1250
G | G | G | G | G | G | G | G | ||
@ | @ | @ | @ | @ | @ | @ | @ | ||
Rank 52 | 248 | 80 | 245 | 92 | 233 | 95 | 205 | > | 1250 |
12 | 13 | 4 | 9 | 9 | 4 | 13 | 12 | > | 76 |
> | 0 | ||||||||
13 | 12 | 9 | 4 | 4 | 9 | 12 | 13 | > | 76 |
@ | @ | @ | @ | @ | @ | @ | @ | ||
G | G | G | G | G | G | G | G | ||
Rank 248 | 52 | 245 | 80 | 233 | 92 | 205 | 95 | > | 1250 |
G | G | G | G | G | G | G | G | ||
@ | @ | @ | @ | @ | @ | @ | @ | ||
Rank 103 | 197 | 131 | 194 | 143 | 182 | 146 | 154 | > | 1250 |
@ | @ | @ | @ | @ | @ | @ | @ |
-1 1 | -5 5 5 -5 | 1 -1 | |||||
@ | @ | @ | @ | @ | @ | @ | @ |
APb |
Figure 11. Schematic representation of the amino acid Gly and their frequency
Odd rank and frequency
G | G | G | G | G | G | G | G | G | G | G | G | ||
10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | ||
1 | 29 | 41 | 95 | 103 | 131 | 143 | 197 | 205 | 233 | 245 | 299 | ||
12 | 4 | 9 | 13 | 12 | 4 | 9 | 13 | 12 | 4 | 9 | 13 | > | 114; |
Even rank and frequency
G | G | G | G | G | G | G | G | G | G | G | G | ||
10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | ||
44 | 52 | 80 | 92 | 146 | 154 | 182 | 194 | 248 | 256 | 284 | 296 | ||
13 | 12 | 4 | 9 | 13 | 12 | 4 | 9 | 13 | 12 | 4 | 9 | > | 114; |
Figure 12. Schematic representation of the amino acid Gly and their odd-even rank and
frequency
Therefore, there is a mathematical balance between the group of amino acids with positive frequency and the group with negative frequency. Amino acids with a positive frequency have a primary role in the mathematical picture of that protein, and those with a negative frequency have a secondary role in it. We assume that amino acids with a positive frequency also have a primary role in the biochemical picture of that protein, and those with a negative frequency a secondary role. If this really is the case and research on an experimental level proves it, a radically new way of learning about biochemical processes will be opened.
3.4 Analog bio code
Each numerical value has its analogue expression. For example, the analogue expression for the number 19 is 91.
91 || 19
In a similar way we can calculate the analogue expression for any numerical value. Our research has shown that analog codes are quantitative characteristics in biochemistry. The analogue biocode is a discrete code that protects and guards genetic information coded in biochemical processes.
This is a recently discovered code, and more detailed knowledge about it is necessary.
Odd rank and analog frequency
G | G | G | G | G | G | G | G | G | G | G | G | ||
10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | ||
1 | 29 | 41 | 95 | 103 | 131 | 143 | 197 | 205 | 233 | 245 | 299 | ||
Frequency > 12 | 04 | 09 | 13 | 12 | 04 | 09 | 13 | 12 | 04 | 09 | 13 | > | 114; |
@ | @ | @ | @ | @ | @ | @ | @ | @ | @ | @ | @ |
Analog
frequency > 21 40 90 31 21 40 90 31 21 40 90 31 > 546;
Even rank and analog frequency
G 10 | G 10 | G 10 | G 10 | G 10 | G 10 | G 10 | G 10 | G 10 | G 10 | G 10 | G 10 | ||
44 | 52 | 80 | 92 | 146 | 154 | 182 | 194 | 248 | 256 | 284 | 296 | ||
Frequency > 13 | 12 | 04 | 09 | 13 | 12 | 04 | 09 | 13 | 12 | 04 | 09 | > | 114; |
@ Analog | @ | @ | @ | @ | @ | @ | @ | @ | @ | @ | @ | ||
Frequency > 31 | 21 | 40 | 90 | 31 | 21 | 40 | 90 | 31 | 21 | 40 | 90 | > | 546; |
Figure 13. Schematic representation of the amino acid Gly and their odd-even rank and
analog frequency
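The analogue expression used in Fig. 13 amounts to reversing the decimal digits of a two-digit (zero-padded) number, e.g. 19 becomes 91 and 04 becomes 40. The sketch below is one possible reading of that rule; the function name `analog` and the `width` parameter are our own, and negative values are not handled because the paper gives no example for them.

```python
def analog(value: int, width: int = 2) -> int:
    """Reverse the decimal digits of a non-negative integer, zero-padded to `width`."""
    if value < 0:
        raise ValueError("only non-negative values are handled in this sketch")
    return int(str(value).zfill(width)[::-1])


# Gly frequencies for the odd ranks, as listed in Fig. 13.
gly_frequencies = [12, 4, 9, 13] * 3
analog_frequencies = [analog(f) for f in gly_frequencies]

print(analog(19))                 # 91
print(analog_frequencies[:4])     # [21, 40, 90, 31]
print(sum(gly_frequencies))       # 114
print(sum(analog_frequencies))    # 546
```

Zero-padding matters here: without it, 4 would reverse to 4 rather than 40, and the total of 546 in Fig. 13 would not be reproduced.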
The analogue code is, actually, a discrete code that protects and guards genetic information coded in biochemical processes.
In the previous examples we translated the physical and chemical parameters from the language of biochemistry into the digital language of programmatic, cybernetic and information principles. This we did by using the adequate mathematical algorithms. By using chemical-information procedures, we calculated the numerical value for the information content of molecules. What we got this way is the digital picture of the phenomenon of biochemistry. These digital pictures reveal to us a whole new dimension of this science. They reveal to us that the biochemical process is strictly conditioned and determined by programmatic, cybernetic and information principles.
From the previous examples we can see that this protein really has its quantitative characteristics. It can be concluded that there is a connection between quantitative characteristics in the process of transfer of genetic information and the qualitative appearance of given genetic processes.
4 DISCUSSION
The results of our research show that the processes of sequencing the molecules are conditioned and arranged not only by chemical and biochemical laws, but also by programmatic, cybernetic and informational laws. At the first stage of our research we replaced nucleotides from the Amino Acid Code Matrix with the numbers of atoms and atomic numbers in those nucleotides. Translation of the biochemical language of these amino acids into a digital language may be very useful for developing new methods of predicting protein subcellular localization, membrane protein type, protein secondary structure, or any other protein attributes.
The success of the Human Genome Project has generated a deluge of sequence information. The explosion of biological data has challenged scientists to accelerate the speed of their analysis. Nowadays, protein sequences are generally stored in computer database systems in the form of long character strings. Reading these sequences with the naked eye would proceed at a snail's pace (Xiao and Chou, 2007). Also, it is very hard to extract any key features by directly reading these long character strings. However, if they can be converted into some kind of signal, many important features can be automatically manifested and easily studied by means of the existing tools of information theory (Xiao and Chou, 2007). The novel approach presented here may help improve this situation.
The process of sequencing in bio-macromolecules is conditioned and determined not only through biochemical, but also through cybernetic and information principles. The digital pictures of biochemistry provide us with cybernetic and information interpretation of the scientific facts. Now we have the exact scientific proofs that there is a genetic language that can be described by the theory of systems and cybernetics, and which functions in accordance with certain principles.
[1] Cai, Y.D., and Chou, K.C., 2006. Predicting membrane protein type by functional domain composition and pseudo amino acid composition. J Theor Biol 238, 395-400.
[2] Cai, Y.D., Zhou, G.P., and Chou, K.C., 2003. Support vector machines for predicting membrane protein types by using functional domain composition. Biophys J 84, 3257-3263.
[3] Cai, Y.D., Zhou, G.P., and Chou, K.C., 2005. Predicting enzyme family classes by hybridizing gene product composition and pseudo-amino acid composition. J Theor Biol 234, 145-149.
[4] Cai, Y.D., Feng, K.Y., Lu, W.C., and Chou, K.C., 2006. Using LogitBoost classifier to predict protein structural classes. J Theor Biol 238, 172-176.
[5] Cai, Y.D., Pong-Wong, R., Feng, K., Jen, J.C.H., and Chou, K.C., 2004. Application of SVM to predict membrane protein types. J Theor Biol 226, 373-376.
[6] Chen, C., Chen, L., Zou, X., and Cai, P., 2009. Prediction of protein secondary structure content by using the concept of Chou's pseudo amino acid composition and support vector machine. Protein & Peptide Letters 16, 27-31.
[7] Chen, L., Feng, K.Y., Cai, Y.D., Chou, K.C., and Li, H.P., 2010. Predicting the network of substrate-enzyme-product triads by combining compound similarity and functional domain composition. BMC Bioinformatics 11, 293.
[8] Chou, K.C., 2011. Some remarks on protein attribute prediction and pseudo amino acid composition (50th Anniversary Year Review). J Theor Biol 273, 236-247.
[9] Chou, K.C., and Shen, H.B., 2010a. Cell-PLoc 2.0: An improved package of web-servers for predicting subcellular localization of proteins in various organisms. Natural Science 2, 1090-1103 (openly accessible at http://www.scirp.org/journal/NS/).
[10] Chou, K.C., and Shen, H.B., 2010b. Plant-mPLoc: A Top-Down Strategy to Augment the Power for Predicting Plant Protein Subcellular Localization. PLoS ONE 5, e11335.
[11] Ding, H., Luo, L., and Lin, H., 2009a. Prediction of cell wall lytic enzymes using Chou's amphiphilic pseudo amino acid composition. Protein & Peptide Letters 16, 351-355.
[12] Ding, Y.S., Zhang, T.L., and Chou, K.C., 2007. Prediction of protein structure classes with pseudo amino acid composition and fuzzy support vector machine network. Protein & Peptide Letters 14, 811-815.
[13] Ding, Y.S., Zhang, T.L., Gu, Q., Zhao, P.Y., and Chou, K.C., 2009b. Using maximum entropy model to predict protein secondary structure with single sequence. Protein & Peptide Letters 16, 552-560.
[14] He, Z.S., Zhang, J., Shi, X.H., Hu, L.L., Kong, X.G., Cai, Y.D., and Chou, K.C., 2010. Predicting drug-target interaction networks based on functional groups and biological features. PLoS ONE 5, e9603.
[15] Hu, L., Huang, T., Shi, X., Lu, W.C., Cai, Y.D., and Chou, K.C., 2011. Predicting functions of proteins in mouse based on weighted protein-protein interaction network and protein hybrid properties. PLoS ONE 6, e14556.
[16] Huang, T., Shi, X.H., Wang, P., He, Z., Feng, K.Y., Hu, L., Kong, X., Li, Y.X., Cai, Y.D., and Chou, K.C., 2010. Analysis and prediction of the metabolic stability of proteins based on their sequential features, subcellular locations and interaction networks. PLoS ONE 5, e10972.
[17] Kandaswamy, K.K., Pugalenthi, G., Moller, S., Hartmann, E., Kalies, K.U., Suganthan, P.N., and Martinetz, T., 2010. Prediction of Apoptosis Protein Locations with Genetic Algorithms and Support Vector Machines Through a New Mode of Pseudo Amino Acid Composition. Protein & Peptide Letters 17, 1473-1479.
[18] Kannan, S., Hauth, A.M., and Burger, G., 2008. Function prediction of hypothetical proteins without sequence similarity to proteins of known function. Protein & Peptide Letters 15, 1107-1116.
[19] Liu, T., Zheng, X., Wang, C., and Wang, J., 2010. Prediction of Subcellular Location of Apoptosis Proteins using Pseudo Amino Acid Composition: An Approach from Auto Covariance Transformation. Protein & Peptide Letters 17, 1263-9.
[20] Mohabatkar, H., 2010. Prediction of cyclin proteins using Chou's pseudo amino acid composition. Protein & Peptide Letters 17, 1207-1214.
[21] Wang, Y.C., Wang, X.B., Yang, Z.X., and Deng, N.Y., 2010. Prediction of enzyme subfamily class via pseudo amino acid composition by incorporating the conjoint triad feature. Protein & Peptide Letters 17, 1441-1449.
[22] Xiao, X., and Chou, K.C., 2007. Digital coding of amino acids based on hydrophobic index. Protein & Peptide Letters 14, 871-875.