In this work, we systematically analyzed the amino acid composition of secondary structure elements in acidic and alkaline enzymes and compared it with that of neutral enzymes. We found that the propensity of individual residues to participate in secondary structures, together with a consistently higher proportion of neutral and tiny residues, might be general stability mechanisms by which these enzymes adapt to pH extremes. Based on these findings, we present a secondary structure amino acid composition method for extracting useful features from sequences. The overall prediction accuracy, evaluated by 10-fold cross-validation, reached 80.3%. Compared with other feature extraction methods, our method improved the overall prediction accuracy by 9.4% to 18.7%. The random forests algorithm also outperformed other machine learning techniques, with improvements ranging from 2.7% to 21.8%.
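To make the feature idea concrete, the following is a minimal sketch of one way a secondary structure amino acid composition vector could be computed. The abstract does not give the exact definition, so several details here are assumptions: the three-state alphabet (`H` helix, `E` strand, `C` coil), normalization by sequence length, and the function name `ss_composition` are illustrative only and are not taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
STATES = "HEC"  # assumed three-state alphabet: helix, strand, coil


def ss_composition(sequence, structure):
    """Return a 60-element vector: the fraction of the sequence made up of
    each amino acid found in each secondary structure state.

    `structure` is a per-residue state string of the same length as
    `sequence` (assumed to come from a predictor or an annotation).
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    if not sequence:
        raise ValueError("sequence must not be empty")

    # Count every (state, residue) pair in one pass over the sequence.
    counts = Counter(zip(structure, sequence))
    n = len(sequence)

    # Fixed ordering: all helix features, then strand, then coil.
    # Non-standard residues or states are simply not represented.
    return [counts[(s, a)] / n for s in STATES for a in AMINO_ACIDS]


features = ss_composition("MKVLA", "CHHHE")
```

A vector of this kind could then be passed to any standard classifier, for example a random forest evaluated with 10-fold cross-validation, which is the setup the abstract reports.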
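Zhang Guangya, Gao Jiaqiang, Fang Baishan. Discriminating acidic, neutral, and alkaline enzymes based on secondary structure amino acid composition [J]. Chinese Journal of Biotechnology, 2009, 25(10): 1508-1515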