4 Protein Structure and Function

Session Level Objectives (SLOs): after completing the session, students will be able to:

SLO 1. Know the elements of protein secondary, tertiary, and quaternary structure.

SLO 2. Explain the roles of hydrophilic vs. hydrophobic aminoacyl residues in protein folding.

SLO 3. Explain the importance of correct protein folding, chaperone proteins, and how misfolding can lead to pathology.

SLO 4. Understand common post-translational modifications of proteins (phosphorylation; disulfide bond formation; glycosylation) and know why specific modifications occur predominantly on proteins within the cytoplasm or in extracytoplasmic environments.

SLO 5. Know that different proteins are targeted to specific locations inside and outside of cells.

SLO 1. Know the elements of protein secondary, tertiary, and quaternary structure.

“If you want to understand function, study structure.” (F. Crick)

As a nascent polypeptide emerges from the ribosome, it must fold into a specific, functional, three-dimensional structure. The functional “native” fold of a protein is determined by the linear sequence of amino acids in the polypeptide.

We think about folding as a hierarchical process:

  • A polypeptide’s primary structure is its linear sequence of amino acids. This sequence, as you have just seen, is specified by the sequence of codons in an mRNA template.
  • Secondary structure elements are “folding motifs” that form through local interactions between residues within the polypeptide chain (H-bonding, salt bridges, van der Waals interactions, etc.). The most common and important secondary structure motifs are the α-helix (Fig. 1) and the β-sheet (Fig. 2).
Fig. 1. α-helix motif. The polypeptide backbone forms a right-handed helix. The helix is stabilized by hydrogen bonds formed between backbone amino and carbonyl groups on successive turns of the helix. The amino acid side chains (blue) project outward from the helix.
Fig. 2. β-sheet motif. Hydrogen bonds between backbone amine and carbonyl groups connect adjacent segments of the polypeptide. The side chains project above and below the sheet. The strands can run parallel (so that the N- and C-termini of neighboring strands are on the same side) or anti-parallel. Source: Pauling & Corey, 1951.
  • Tertiary structure describes the overall arrangement of a polypeptide’s secondary structure elements. This is the overall 3-dimensional fold of the polypeptide. Many proteins contain mainly α-helix or β-sheet folds. Others use both kinds of folds; an example is shown in Fig. 3.
  • Many, many proteins operate as larger complexes. The assembly of more than one polypeptide into a protein complex is the quaternary structure. This can mean as few as two small polypeptides, or an assemblage as big as the nuclear pore or — even bigger — silk or human hair.
Fig. 3. Influenza virus HA protein. This protein is a homotrimer, meaning that the quaternary structure is a complex containing three identical copies of a single type of polypeptide. In this rendering, each of the three chains has both α-helix (red corkscrews) and β-sheet (blue arrows) secondary structure folds, connected by “loop” segments (purple). The polypeptides are post-translationally glycosylated with carbohydrate molecules (green).
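The first level of the hierarchy, primary structure, can be illustrated with a minimal sketch of how a run of mRNA codons specifies a linear sequence of amino acids. The codon table below is deliberately partial (only the codons used in the example, taken from the standard genetic code), and the mRNA string and the function name `translate` are hypothetical, chosen for illustration only.

```python
# Partial codon table from the standard genetic code.
# Only the codons used in the example are listed; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "GCU": "A",  # alanine
    "CUG": "L",  # leucine
    "GAA": "E",  # glutamate
    "AAA": "K",  # lysine
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop
}


def translate(mrna):
    """Read an mRNA string three bases at a time and return the
    primary structure as one-letter amino acid codes."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":  # stop codon: the polypeptide ends here
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical mRNA: AUG GCU CUG GAA AAA UGG UAA
print(translate("AUGGCUCUGGAAAAAUGGUAA"))  # prints MALEKW
```

Everything beyond this string of letters (secondary, tertiary, and quaternary structure) is a consequence of how that sequence folds and assembles; the sketch does not attempt to model folding.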

SLO 2. Explain the roles of hydrophilic vs. hydrophobic aminoacyl residues in protein folding.

The principles that control protein folding are exactly the same ones that we have already seen with RNA and DNA: the hydrophobic effect, charge attraction and repulsion, van der Waals contacts, etc.

  1. Water molecules form many hydrogen bonds with one another, and they have high entropy (they can diffuse freely, translate, and rotate).
  2. Hydrophobic (greasy) amino acid side chains are surfaces where water cannot hydrogen bond (this is an enthalpic penalty). Near these surfaces the water has reduced entropy, as well. Because water “hates” hydrophobic side chains, these chains “want” to be shielded from the aqueous solvent. Thus, hydrophobic amino acid residues tend to be buried within folded portions of the protein.
  3. Hydrophilic (polar or charged) amino acid side chains can form energetically favorable hydrogen bonds with water. They are often exposed to the aqueous solvent. If they cannot interact with the solvent (if they are buried), they generally interact with other portions of the polypeptide through hydrogen bonds or salt bridges.
  4. Additional inter-chain interactions that contribute to protein stability include van der Waals contacts, aromatic stacking interactions (analogous to the base stacking that we saw with DNA and RNA), and electrostatic repulsion between similarly charged (-/- or +/+) groups on the polypeptide.
Fig. 4. Summary of the hierarchy of protein folding. Notice that, in contrast to the flu virus protein in Fig. 3, hemoglobin is an exclusively α-helical protein.
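One common way to put rough numbers on the hydrophobic vs. hydrophilic distinction is a hydropathy scale. The sketch below uses the Kyte–Doolittle scale, which is not part of this chapter and is brought in here as an assumption; the example sequence and the function name `hydropathy_profile` are hypothetical. Positive window averages suggest a segment that is more likely to be buried (or membrane-embedded), and negative averages suggest a segment more likely to be exposed to water. This is a crude heuristic, not a folding prediction.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Positive = hydrophobic ("greasy"), negative = hydrophilic.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(sequence, window=5):
    """Return the mean hydropathy of every window of `window`
    consecutive residues along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical sequence: charged ends flanking a greasy middle segment.
example = "MKKDEELLVVIIAALLGGSEEKRK"
for start, value in enumerate(hydropathy_profile(example), start=1):
    print(start, round(value, 2))
```

In the printed profile, the windows covering the Leu/Val/Ile/Ala stretch score well above zero, while the charged ends score well below it, which mirrors the tendency described in points 2 and 3 above.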

SLO 3. Explain the importance of correct protein folding, chaperone proteins, and how misfolding can lead to pathology.

Protein folding, chaperone proteins, and how misfolding can lead to pathology

Mutations that cause protein sequence changes, or errors in transcription or translation, can change the balance of forces that we have just described, causing misfolding of a protein and loss of its function. Other mutations may still allow a protein to fold more or less correctly, but change the protein’s activity. For example, some mutations result in ion channels that open more easily than they otherwise would, leading to neurological disorders.

Both within our cells, and in extracellular spaces (cartilage, blood, cerebrospinal fluid, etc.), proteins are present at extremely high overall concentrations. This dense proximity means that proteins will touch other proteins both during and after folding, with enormous potential for inappropriate interactions that can lead to non-specific aggregation.

You’re probably familiar with one protein aggregation process: making Jell-O™. We start with a clear aqueous solution of soluble proteins at high concentration. We then heat the solution so that the proteins unfold. That is, they are denatured. As the unfolded proteins cool, they aggregate into a single disordered gel.

In cells, protein aggregates are major sources of cytotoxicity, and — as we will see — they contribute to pathologies ranging from Alzheimer’s disease to type II diabetes.

Mutations can cause proteins to misfold at elevated rates, but even non-mutant proteins sometimes misfold, especially in the presence of stresses such as heat or oxidation. To mitigate inappropriate contact between unfolded or partially folded proteins, cells use special proteins called chaperones. There are many different chaperones. Some passively shield proteins from inappropriate contacts. Others use energy from ATP hydrolysis to mechanically pull apart proteins that have formed inappropriate contacts, giving the proteins a “second chance” to fold correctly.

When a protein cannot fold correctly even with the assistance of chaperones, the cell may recognize the misfolded polypeptide as hopeless, and mark it for destruction. This cellular surveillance process operates in almost every cell and is called protein quality control. The quality control system has at least two branches: the ubiquitin–proteasome system, and the autophagy–lysosome system. We’ll discuss these systems later in the block.

SLO 4. Understand common post-translational modifications of proteins (phosphorylation; disulfide bond formation; glycosylation) and know why specific modifications occur predominantly on proteins within the cytoplasm or in extracytoplasmic environments.

Once synthesized, most polypeptides undergo covalent post-translational modifications. These fall into several different categories. The following list is not comprehensive! It’s illustrative, showing some important examples.

  1. Proteolysis. Many proteins are precisely clipped before they are fully functional. For example, many digestive enzymes are made as inactive proenzymes, a form safe for transport through sensitive cellular compartments. Upon secretion into the digestive tract, an inhibitory portion of the polypeptide is clipped off, and the enzyme is activated.

    Fig. 5. An antibody (IgG) molecule. Each IgG is a heterotetramer containing four polypeptides: two identical light chains and two identical heavy chains. IgG is an entirely β-sheet protein. The quaternary structure of the complex is stabilized by non-covalent interactions between the chains, and also by disulfide bonds that covalently cross-link the two heavy chains together.

  2. Disulfide bonding. The terminal sulfhydryl group on the amino acid cysteine can be oxidized to form a cysteine–cysteine disulfide bond.
    • Disulfide bonds are most often used to form mechanically stabilizing cross-links within a polypeptide chain or to cross-link two chains together in a protein complex.
    • In general, the cytoplasm and nucleus of a cell have a chemically reducing potential, while the extracellular environment has a relatively oxidizing potential. What this means: we very seldom see proteins with disulfide bonds in the cytoplasm or nucleus, but lots of secreted proteins such as antibodies (Fig. 5), and many cell-surface proteins, have disulfide bonds.
      Fig. 6. The glycan “shield” of a Coronavirus spike protein. Like the Influenza HA protein shown in Fig. 3, the Spike protein is used by Coronaviruses to gain entry into host cells during infection. The spike consists of a protein homotrimer, anchored to the viral envelope (membrane) at the base. Attached to each monomer are over twenty complex carbohydrate molecules (blue). Each sphere represents a hexose. The structure of the carbohydrates, ascertained using mass spectrometry, is shown in schematic form on the right. The carbohydrates both stabilize the spike protein and shield it from proteases, antibodies, and other host defenses. Source: D. Veesler, UW Biochemistry. Nat Struct Mol Biol. (2016) 23(10):899-905.
  3. Glycosylation. Sugars are attached to most, but not all, secreted and cell surface proteins. Usually these are short-chain, branched carbohydrates. These sugars are used in cell–cell recognition and in cell signaling processes, and they can stabilize and protect proteins that are exposed to harsh extracellular environments. On the downside, viruses such as HIV and SARS use glycosylation to shield their surface proteins from recognition and attack by our immune systems (Fig. 6). Very few cytoplasmic or nuclear proteins are glycosylated (though the ones that are may be of great importance).
  4. Phosphorylation. The covalent transfer of phosphate groups from ATP to polypeptides (protein phosphorylation) is a critical regulatory mechanism that controls almost every aspect of cell physiology.
    • The phosphotransfer reaction is mediated by protein kinase enzymes. The recipient is always an amino acid residue with a terminal hydroxyl group on its side chain: serine, threonine, or tyrosine. The product is a phosphoester.
    • Phosphorylation is reversible. Hydrolysis of the phosphoester removes the phosphoryl group from the protein. Dephosphorylation is catalyzed by protein phosphatase enzymes.
    • Several hundred kinases and phosphatases are encoded in the human genome.
    • Protein kinases and phosphatases were discovered here in the UW School of Medicine, by Professors Ed Krebs and Eddie Fischer. For their discoveries, these lifelong friends and collaborators shared the Nobel Prize.
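The rule that phosphate is transferred to a side-chain hydroxyl group (serine, threonine, or tyrosine) can be sketched as a simple scan over a sequence. The example sequence and the function name `candidate_phosphosites` are hypothetical. The output lists only chemically possible acceptor residues; whether any of them is actually phosphorylated in a cell depends on which kinases recognize the surrounding sequence and structure, which this sketch does not model.

```python
# Residues whose side chains end in a hydroxyl group and can therefore
# accept a phosphate from ATP to form a phosphoester.
PHOSPHO_ACCEPTORS = {"S": "serine", "T": "threonine", "Y": "tyrosine"}


def candidate_phosphosites(sequence):
    """Return (position, residue) pairs for every Ser, Thr, or Tyr.
    Positions are 1-based, counted from the N-terminus."""
    return [
        (position, residue)
        for position, residue in enumerate(sequence.upper(), start=1)
        if residue in PHOSPHO_ACCEPTORS
    ]


# Hypothetical short peptide, one-letter amino acid codes.
for position, residue in candidate_phosphosites("MSTAYKLGE"):
    print(position, PHOSPHO_ACCEPTORS[residue])
```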

SLO 5. Know that proteins are targeted to specific locations inside and outside of cells.

Different proteins have different functions — and the proteins must carry out their functions in different locations. Inside a cell, for example, some proteins operate in the cytoplasm, some in the nucleus, some within organelles such as mitochondria, and some within the plane of the plasma membrane. Many proteins are secreted from cells. Examples include the collagen that holds our tissues together (Fig. 7), antibodies and other proteins in blood and serum, digestive enzymes in the gut, and polypeptide hormones such as insulin.

Each protein must have a way to get to its site of action. Protein targeting typically involves a specific amino acid sequence within the polypeptide that serves as a “molecular zip code” used to direct the protein to its destination.

For example, the RNA polymerase II complex consists of several polypeptides. It is synthesized, folded, and assembled into a complex in the cytoplasm, but it has a “nuclear localization sequence” that directs the folded complex through the nuclear pore and into the nucleus, where it will transcribe mRNA molecules. As we’ll see, some mutations cause proteins to go to incorrect locations, resulting in disease.
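The “molecular zip code” idea can be sketched as a pattern search over a sequence. The example below assumes, as an illustration not taken from this chapter, that a classical nuclear localization sequence looks like a short cluster of basic residues (lysine, K, and arginine, R), as in the well-known SV40 large T antigen signal PKKKRKV. The window size, the threshold, the test sequence, and the function name `find_basic_clusters` are all hypothetical choices; real targeting signals are more varied, and real prediction tools are considerably more sophisticated.

```python
BASIC_RESIDUES = {"K", "R"}  # lysine and arginine


def find_basic_clusters(sequence, window=5, min_basic=4):
    """Return (start, segment) for every window of `window` residues
    containing at least `min_basic` Lys/Arg. Start positions are 1-based."""
    sequence = sequence.upper()
    hits = []
    for i in range(len(sequence) - window + 1):
        segment = sequence[i:i + window]
        basic_count = sum(1 for aa in segment if aa in BASIC_RESIDUES)
        if basic_count >= min_basic:
            hits.append((i + 1, segment))
    return hits


# Hypothetical protein fragment with the SV40-type signal embedded in it.
for start, segment in find_basic_clusters("MAPKKKRKVED"):
    print(start, segment)
```

A sequence with no such cluster returns an empty list, which under this simplified assumption would correspond to a protein lacking this particular kind of zip code.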

Fig. 7. Extracellular matrix. Many big and small proteins including collagen (long white and violet fibers) assemble into complex structural webs that link cells together within tissues. Each of these proteins carries a signal sequence that directs its secretion after it is synthesized inside the cell. Many secreted proteins are heavily post-translationally modified. Source: David Goodsell

License

Molecular Biology Copyright © by Alexey Merz; Timothy Cherry; and kullberm. All Rights Reserved.
