Data source: CROSBI

Modeling global properties of proteins (CROSBI ID 532859)

Conference contribution in proceedings | abstract of a conference presentation | international peer review

Lučić, Bono Modeling global properties of proteins // Book of Abstracts, Regional Biophysics Conference 2007 / Zimanyi, Laszlo ; Kota, Zoltan ; Szalontai, Balazs (eds.). Balatonfuered: -, 2007. p. 63-63

Responsibility details

Lučić, Bono

English

Modeling global properties of proteins

Modeling global structural properties of proteins (such as folding type, secondary structure content, folding and unfolding rate constants, and location of a protein in the cell) from the structure is one of the most important challenges of computational structural biology and biophysics, and it is the first step in analyzing the properties of a new protein sequence. There have been many attempts to predict global protein features, but many of the published models include a large number of non-significant parameters. Such models are not highly accurate, and certainly not as accurate as the original publications presented them to be. I will illustrate overfitting problems in modeling global features of proteins using publications on protein folding rate constants [1, 2] and protein secondary structure content [3, 4]. Folding and unfolding rate constants were modeled using averages of physical and chemical properties of the amino acid residues of a protein [1, 2]. The authors selected many parameters relative to the total number of proteins in the data sets. For this reason, the correlations in the developed models are largely due to chance: although the statistical parameters of the fit and of the leave-one-out procedure appear excellent, they are a consequence of random correlation. To illustrate this, we recalculated the model parameters for 10 proteins of mixed class (eq. 4 in ref. 1) using four parameters (polarity, refractive index, solvent-accessible surface area upon unfolding, and unfolding entropy change of hydration), each given to three decimal places for each protein, and obtained exactly the same model parameters (correlation coefficient r = 0.994). However, when we used the same parameters with each value rounded to two decimal places, the model parameters changed drastically, as did the statistical parameters (r = 0.885).
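The chance-correlation argument can be made concrete with a small simulation. The following is only a minimal sketch, not the calculation reported in the abstract: it fits ordinary least-squares models to pure Gaussian noise and reports the average R² for a given number of proteins and parameters. The function names (`solve_linear_system`, `r_squared`, `mean_chance_r2`) and the use of random Gaussian descriptors are assumptions introduced for illustration; the data of refs. 1 and 2 are not used.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(xs, y):
    """R^2 of an ordinary least-squares fit with an intercept term."""
    rows = [[1.0] + list(row) for row in xs]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def mean_chance_r2(n_proteins, n_params, trials=500, seed=0):
    """Average R^2 when both descriptors and response are pure noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [[rng.gauss(0, 1) for _ in range(n_params)]
              for _ in range(n_proteins)]
        y = [rng.gauss(0, 1) for _ in range(n_proteins)]
        total += r_squared(xs, y)
    return total / trials


if __name__ == "__main__":
    # Sample sizes and parameter counts quoted in the abstract.
    for n, p in [(10, 4), (29, 16)]:
        # For Gaussian noise the theoretical mean of R^2 is p / (n - 1).
        print(n, p, round(mean_chance_r2(n, p, trials=300), 3), p / (n - 1))
```

With as few proteins per parameter as in the cases quoted above, a substantial part of the fitted R² is expected from noise alone, and selecting the best descriptors from a larger pool (not simulated here) would inflate it further.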
In the final model for all classes (29 proteins), 16 parameters were selected, which is an unambiguous indication that the model is overfitted. The same holds for all other models developed in refs. 1 and 2. Improvements of the models for folding rate constants, obtained by including novel parameters based on properties of amino acid residues and their distribution along the sequence, will be presented. The second example concerns modeling protein secondary structure content on four data sets containing 166, 262, 398 and 475 soluble proteins [3]. The model developed in ref. 3 involved 57 independent parameters (optimized constants) for all three secondary structure types (α, β and coil) in the linear models and 247 optimized parameters in the nonlinear models. By selecting a small number (only five) of the most important parameters from among the 20 amino acid residue frequencies and the 210 products of frequencies (of which the Ala × Leu product was the most important), I obtained much simpler and better models. The mean absolute error for the data set of 262 proteins over the three secondary structure contents is 9% with the model having only five parameters for each of the three secondary structure types, compared with the corresponding error of 11% obtained in ref. 3. These models can be improved by including autocorrelation functions computed from relevant properties of amino acid residues for each protein sequence, as will be illustrated.
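The descriptor pool described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions, not the author's code: the 210 products are assumed to be all unordered pairs of the 20 residue frequencies including squares (20 × 21 / 2 = 210), and the autocorrelation is assumed to be the plain average of products of a residue property at a fixed sequence separation. The names `residue_frequencies`, `descriptor_pool` and `autocorrelation`, and the toy property scale in the usage lines, are hypothetical.

```python
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def residue_frequencies(sequence):
    """Fraction of each standard residue in a one-letter-code sequence."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for ch in sequence.upper():
        if ch in counts:  # non-standard symbols are ignored
            counts[ch] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def descriptor_pool(sequence):
    """20 frequencies plus 210 pairwise products (assumed to include squares)."""
    freq = residue_frequencies(sequence)
    pool = dict(freq)
    for a, b in combinations_with_replacement(AMINO_ACIDS, 2):
        pool[a + "x" + b] = freq[a] * freq[b]
    return pool


def autocorrelation(sequence, scale, lag):
    """Average product of a residue property at separation `lag`.

    `scale` maps one-letter codes to property values and must be supplied
    by the caller; no particular property scale is implied by the abstract.
    """
    values = [scale[ch] for ch in sequence.upper()]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < sequence length")
    return sum(values[i] * values[i + lag] for i in range(n - lag)) / (n - lag)


if __name__ == "__main__":
    pool = descriptor_pool("AALL")
    print(len(pool), pool["AxL"])  # pool size and the Ala x Leu product
    toy_scale = {"A": 1.0, "L": -1.0}  # hypothetical values for illustration
    print(autocorrelation("ALAL", toy_scale, 1))
```

A small subset of such descriptors (five per secondary structure type in the abstract) would then be chosen by a variable-selection procedure, which is not shown here.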

modeling global properties of proteins; folding/unfolding rates; secondary structure content; protein structural class; calculated protein structure attributes

Contribution details

63-63.

2007.

published

Host publication details

Zimanyi, Laszlo ; Kota, Zoltan ; Szalontai, Balazs

Balatonfuered: -

Conference details

Regional Biophysics Conference 2007

plenary lecture

21.08.2007-25.08.2007

Balatonfüred, Hungary

Related fields

Chemistry