Structure and dynamics of the ‘protein folding code’ infe(2)

Published: 2021-06-08

R. Wallace / BioSystems 103 (2011) 18–26

Fig. 1. From Hartl and Hayer-Hartl (2009). Energy landscape spectrum of protein folding and aggregation, parsed according to the degree of intra- vs. inter-molecular contact. Each energy valley defines an equivalence class, and the set of such classes defines the ‘protein folding groupoid’, in the sense of Weinstein (1996). Four basic classifications can be seen: native state, amorphous aggregates, semi-structured oligomers, and quasi-crystalline amyloid fibrils. Within the native state and the amyloid fibrils, systematic subclasses can be identified, leading to a fine structure for protein coding.

strategies against amyloid formation, which include both quality control mechanisms through molecular chaperones as well as sequence-based [evolutionary] prevention of amyloid aggregation...

...[E]ach protein may exist, not only in an unfolded or folded state, but, by containing at least one amino acid segment that is capable of participating in a sequence-specific, ordered, cross-β-sheet aggregated state, may also exist in an amyloid-like aggregate. The process of protein aggregation can thus be viewed as a primitive folding mechanism, resulting in a defined, aggregated conformation with each aggregated protein having its own distinctive properties.

Krebs et al. (2009), however, in a paper tellingly titled ‘Protein aggregation: more than just fibrils’, find that the amyloid fibril is not the only structure that aggregating proteins of widely different types may adopt. For example, the occurrence of spherulites, which have been found in vivo as well as in vitro, appears to be generic, although the factors that determine the equilibrium between free fibril and spherulite are not as yet clear. That is, we have not fully explained the spectrum implied in Fig. 1. Nevertheless, here we will use Tlusty’s (2007a,b, 2008a,b,c, 2010a,b) arguments on the evolution of the genetic code to explore something of that spectrum.

The first papers in this series (Wallace, 2010a,b) applied Tlusty’s rate distortion analysis of the genetic code to protein folding dynamics, and made a pilot application to the simplest ‘protein folding code’. Here we greatly expand the topological methods from that work, focusing on normal three-dimensional globular proteins and the eightfold symmetry of the steric zipper associated with amyloid fibrils (Sawaya et al., 2007), but extending that work significantly, to empirical studies of protein folding rates and to intrinsically disordered proteins.

As Kamtekar et al. (1993) point out, experimental studies of natural proteins show how their structures are remarkably tolerant to amino acid substitution, but that tolerance is limited by a need to maintain the hydrophobicity of interior side chains. Thus, while the information needed to encode a particular protein fold is highly degenerate, this degeneracy is constrained by a requirement to control the locations of polar and nonpolar residues. This is the precise protein folding analog to Tlusty’s error network analysis of the genetic code, and his graph coloring arguments should thus apply, in some measure, to protein folding as well, allowing inference on the underlying structure of the ‘protein folding codes’ to be associated with the horizontal axis of Fig. 1.

Tycko (2006), likewise, argues that the amyloid fibril is a generically stable structural state of a polypeptide chain, competing thermodynamically and kinetically with globular monomeric states and unfolded monomeric states. Peptides and proteins that are known to form amyloid fibrils have widely diverse amino acid sequences and molecular weights. He particularly finds that

The near sequence independence of amyloid formation represents a challenge to our understanding of the physical chemistry of peptides and proteins.

Such sequence independence is, again, very precisely the degeneracy associated with Tlusty’s error network approach.

Intermediate forms in Fig. 1 remain to be studied from this perspective.

Some of these matters have, of course, already been the subject of considerable attention. A series of elegant experiments by the Hecht group (e.g., Hecht et al., 2004), extending the Kamtekar et al. (1993) work, has focused on a basic understanding of protein folding through substitution of different polar and nonpolar amino acids in the construction of normal and fibril proteins. α-Helices are found to be natural outcomes of amino acid sequences having a 3.6 residue/turn pattern, i.e., a digital signal of the form 101100100110, where 1 indicates a polar, and 0 a nonpolar, amino acid. The resulting three-dimensional structures are formed by the propensity of the different residues to interact with an aqueous environment.

β-sheets, on the other hand, emerge from a simpler period 2 code, e.g., 1010101, matching the structural repeat of the sheets. More recent work (Kim and Hecht, 2006) finds that generic hydrophobic residues of this form are sufficient to promote aggregation of the Alzheimer’s Aβ42 peptide. However, while the positioning of hydrophobic residues is more important than the exact identities of the hydrophobic side chains for determining overall geometry, the reaction kinetics, that is, the rate of fibril formation, were profoundly affected by those identities. This suggests that the ‘protein folding code’ may be, in no small part, contextual, that is, determined as much by in vivo cellular regulatory machinery as by in vitro hydrophobic/hydrophilic physical interactions. This, we will suggest below, likely involves the operation of something like the catalytic mechanisms that Wallace and Wallace (2009) and Wallace (2010a) describe.
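The periodicity of the two binary patterns quoted above can be checked numerically. The following is a minimal sketch, not a method taken from the cited papers: it computes the normalized Fourier amplitude of a polar/nonpolar string at a chosen number of residues per turn, in the spirit of a hydrophobic-moment calculation. The function name `periodicity_amplitude` and the ±1 encoding of the digits are illustrative assumptions.

```python
import cmath
import math


def periodicity_amplitude(pattern, residues_per_turn):
    """Normalized Fourier amplitude of a binary polar/nonpolar pattern.

    ASSUMPTION (illustrative encoding): '1' (polar) -> +1, '0' (nonpolar) -> -1.
    The angular step per residue is 2*pi / residues_per_turn, i.e. 100 degrees
    for 3.6 residues/turn and 180 degrees for a period-2 repeat.
    """
    signs = [1.0 if ch == "1" else -1.0 for ch in pattern]
    step = 2.0 * math.pi / residues_per_turn
    total = sum(s * cmath.exp(1j * n * step) for n, s in enumerate(signs))
    return abs(total) / len(signs)


# The two example patterns quoted in the text.
HELIX_PATTERN = "101100100110"
SHEET_PATTERN = "1010101"

for name, pattern in (("helix", HELIX_PATTERN), ("sheet", SHEET_PATTERN)):
    a_helix = periodicity_amplitude(pattern, 3.6)  # alpha-helical repeat
    a_sheet = periodicity_amplitude(pattern, 2.0)  # period-2 (beta) repeat
    print(f"{name}: amplitude at 3.6 = {a_helix:.3f}, at 2.0 = {a_sheet:.3f}")
```

Under this encoding the 12-residue pattern has a larger amplitude at 3.6 residues/turn than at period 2, while the alternating pattern shows the reverse and reaches the maximum value of 1 (up to rounding) at period 2.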

Before beginning the formal explorations, some comment is necessary regarding a ‘biological’ explanation of the relation between the number of network holes and tertiary protein symmetries, according to Tlusty’s treatment. Following the argument of Tlusty (2010b), the genetic code is a mapping of one codon to one amino acid. By contrast, the ‘protein folding code’ is a mapping of genes to folded amino acid chains, and the complexity gap between the two codes is very great indeed (e.g., Mirny and Shakhnovich, 2001). The strategy that allows adaptation of Tlusty’s methods to protein folding is a coarse-graining of protein structure into a matrix of larger building blocks, e.g., α-helices and β-sheets. At this lower resolution a ‘code’ is a mapping between short DNA stretches, analogous to codons, and the convoluted motifs of proteins, playing the role of amino acids. As a consequence of the great tolerance to amino acid substitutions described above, as long as charge and polarity are conserved, it is possible to cluster all the sequences that encode the same structural motif. This greatly reduces the size of the resulting DNA sequence graph and thus limits the number of possible building blocks.
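The clustering step described above can be illustrated with a toy example. This is a sketch under stated assumptions, not the construction used by Tlusty or in this paper: short DNA stretches are translated with the standard genetic code, each residue is reduced to polar (1) or nonpolar (0), and sequences sharing the same binary pattern are grouped. The nonpolar residue set `NONPOLAR`, the function names, and the three example sequences are all hypothetical; charge is ignored for simplicity.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: a coarse, illustrative nonpolar class; other schemes exist.
NONPOLAR = set("AVLIMFWPG")


def translate(dna):
    """Translate a DNA string codon by codon (any trailing partial codon is ignored)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def polarity_pattern(peptide):
    """Reduce a peptide to the binary code used in the text: 1 polar, 0 nonpolar.

    Stop codons ('*') are not handled specially in this sketch.
    """
    return "".join("0" if aa in NONPOLAR else "1" for aa in peptide)


def cluster_by_pattern(dna_seqs):
    """Group DNA sequences whose translations share a polar/nonpolar pattern."""
    clusters = defaultdict(list)
    for seq in dna_seqs:
        clusters[polarity_pattern(translate(seq))].append(seq)
    return dict(clusters)


# Hypothetical example sequences: ELK, DIR and LEL after translation.
EXAMPLES = ["GAACTGAAA", "GATATCCGT", "CTGGAACTG"]
CLUSTERS = cluster_by_pattern(EXAMPLES)
print(CLUSTERS)
```

Here three distinct DNA sequences collapse to two pattern classes, a small-scale picture of how coarse-graining shrinks the sequence graph.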
