


A large number of studies have been carried out to obtain amino acid propensities for α-helices and β-sheets. The propensities obtained for α-helices are consistent with each other, and the pair-wise correlation coefficient is frequently high. In contrast, the β-sheet propensities obtained by several studies differ significantly, indicating that context strongly affects β-sheet propensity.

We calculated amino acid propensities for α-helices and β-sheets for 39 and 24 protein folds, respectively, and addressed whether they correlate with the fold. The propensities were also calculated separately for exposed and buried sites. The results showed that α-helix propensities do not differ significantly by fold, but β-sheet propensities are diverse and depend on the fold. The propensities calculated for exposed and buried sites are similar for α-helices, but this is not the case for β-sheets. We also found some fold dependence of amino acid frequency in β-strands. Folds with a high Ser, Thr and Asn content at exposed sites in β-strands tend to have a low Leu, Ile, Glu, Lys and Arg content (correlation coefficient = −0.90) and to have flat β-sheets. At buried sites in β-strands, the content of Tyr, Trp, Gln and Ser correlates negatively with the content of Val, Ile and Leu (correlation coefficient = −0.93). "All-β" proteins tend to have a higher content of Tyr, Trp, Gln and Ser, whereas "α/β" proteins tend to have a higher content of Val, Ile and Leu.

The α-helix propensities are similar for all folds and for exposed and buried residues. However, β-sheet propensities calculated for exposed residues differ from those for buried residues, indicating that the exposed-residue fraction is one of the major factors governing amino acid composition in β-strands. Furthermore, the correlations we detected suggest that amino acid composition is related to folding properties such as the twist of a β-strand or the association between two β-sheets.

In 1974, Chou and Fasman published the calculated frequency of occurrence and conformational propensity of each amino acid in the secondary structures of 15 proteins, consisting of 2473 amino acid residues. Since then, a vast number of protein structures have been determined and classified to reflect both structural and evolutionary relatedness. SCOP (Structural Classification of Proteins) is one of the major databases, providing a detailed and comprehensive description of the relationships among all known protein structures. The classification is hierarchical: the first two levels, family and superfamily, describe near and far evolutionary relationships; the third, fold, describes geometrical relationships. Most of the folds (899/1086) are assigned to one of the four structural classes "all-α", "all-β", "α/β" (for proteins with α-helices and β-strands that are largely interspersed) and "α + β" (for those in which α-helices and β-strands are largely segregated). The remaining folds are assigned to the "Multi-domain", "Membrane and cell surface" or "Small" protein classes. In 2009, we developed a quaternary structural database for proteins, OLIGAMI, in which oligomer information was added to the SCOP classification, to allow an exhaustive survey of the tertiary or quaternary structures of proteins.

A large number of studies have been carried out to obtain amino acid propensities for α-helices and β-sheets. The propensities have been estimated from statistical analysis of three-dimensional structures, experimental determination of α-helix or β-sheet content in peptides, and experimental determination of the thermodynamic stability of mutant proteins. The propensities obtained for α-helices are consistent between studies, with the pair-wise correlation coefficient (R) frequently being >0.8, although Richardson et al. showed that the propensities of individual amino acids differ at specific locations within an α-helix.
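To make the two quantities discussed here concrete, the following is a minimal sketch, assuming the classic Chou-Fasman-style definition of propensity (the fraction of an amino acid found in a given secondary structure, divided by the fraction of all residues found in that structure) and the Pearson coefficient as the pair-wise correlation. The function names, the one-letter state codes `H`, `E` and `C`, and the toy sequence are illustrative assumptions, not the procedure or data used in this study.

```python
from collections import Counter
from math import sqrt


def propensities(sequence, structure, state):
    """Chou-Fasman-style propensity of each amino acid for one state.

    P(aa) = (count of aa in state / count of aa)
            / (residues in state / all residues)
    Values above 1 mean the residue is over-represented in that state.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    total = Counter(sequence)
    in_state = Counter(aa for aa, ss in zip(sequence, structure) if ss == state)
    n_state = sum(in_state.values())
    if n_state == 0:
        raise ValueError(f"no residues in state {state!r}")
    background = n_state / len(sequence)
    return {aa: (in_state[aa] / total[aa]) / background for aa in total}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: H = helix, E = strand, C = other.
toy_sequence = "AAAVVVGGAV"
toy_structure = "HHHHEEECCE"

helix = propensities(toy_sequence, toy_structure, "H")
strand = propensities(toy_sequence, toy_structure, "E")

# Compare the two scales over the same ordered set of residues.
residues = sorted(helix)
r = pearson([helix[aa] for aa in residues], [strand[aa] for aa in residues])
print(helix, strand, r)
```

In practice the same `propensities` function could be applied to subsets of residues (for example, only exposed or only buried sites, or only proteins of one fold), and `pearson` could then compare the resulting scales; how those subsets are defined is outside the scope of this sketch.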
