Researchers Sarah Slavoff and Zhenkun Na of Yale’s Department of Chemistry are standing up for the little guy.
More than a hundred years in the making, the Human Genome Project was born from our need to understand the little parts of ourselves. With our entire genome sequenced at the turn of the century, researchers began picking proteins to study as if from a lineup during gym class. Insulin, flaunting its pharmaceutical applications, was chosen first. A blood clotting factor went second. Around twenty thousand picks later, lil’ old NoBody (NBDY) microprotein is ready for its time in the limelight. Microproteins have been, and will continue to be, master regulators in cells, even if they’re not winning popularity contests.
Sarah Slavoff entered the field of proteomics, the study of the proteins that make up life, asking all the right questions but none of the popular ones. She began working to fill the gaps in our knowledge of what she calls “the dark matter of the genome” during her postdoctoral fellowship at Harvard. Like many others taking a protein-based approach to gene discovery, Slavoff sought to separate the junk from the jewels. And while it would be nice if regions of our DNA could scream to us, “Hey, I’m important!”, natural selection hasn’t quite worked that kink out yet. Instead, researchers relied on a strict set of rules when identifying new protein-coding gene sequences:
- They must begin with a specific three-letter sequence (AUG) known as a start codon.
- A single mammalian transcript encodes one protein.
- That protein must be longer than a hundred amino acids.
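For readers who think in code, the three rules above can be read as a simple filter. The following is a minimal sketch, not the Slavoff lab's pipeline or any real annotation tool: the function names, the stop-codon scan, and the "keep only the longest reading frame" interpretation of the one-protein rule are all assumptions made for illustration, and the test sequence is synthetic rather than NoBody's actual sequence.

```python
# Illustrative sketch only: names and logic are hypothetical, not a real annotation pipeline.
STOP_CODONS = {"UAA", "UAG", "UGA"}  # stop codons of the standard genetic code
MIN_LENGTH = 100  # amino acids; the classical cutoff from the rules above


def find_orfs(transcript):
    """Return (start, amino_acid_count) for every AUG-initiated open reading frame."""
    orfs = []
    for start in range(len(transcript) - 2):
        if transcript[start:start + 3] != "AUG":
            continue  # rule 1: must begin with the AUG start codon
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start, len(transcript) - 2, 3):
            if transcript[pos:pos + 3] in STOP_CODONS:
                orfs.append((start, (pos - start) // 3))
                break
    return orfs


def classical_annotation(transcript):
    """Apply the classical rules; return the single accepted ORF or None."""
    orfs = find_orfs(transcript)
    if not orfs:
        return None
    # Rule 2 (assumed reading): one protein per transcript, so keep only the longest ORF.
    longest = max(orfs, key=lambda orf: orf[1])
    # Rule 3: the protein must be longer than a hundred amino acids.
    return longest if longest[1] > MIN_LENGTH else None


# Synthetic 68-codon reading frame (the same length as NoBody, but not its sequence).
micro = "AUG" + "GCU" * 67 + "UAA"
print(find_orfs(micro))             # the reading frame is found...
print(classical_annotation(micro))  # ...but the length rule discards it
```

Under these assumptions, a sixty-eight-amino-acid reading frame is detected by the scan and then thrown away by the length cutoff, which is the kind of result the classical rules would have discounted.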
Slavoff’s proteomic experiments, however, began producing tens of thousands of potential results that were discounted because they broke one or more of these rules. “Biology is just as messy and beautiful as you would expect it to be,” Slavoff noted a decade later. Through ribosome profiling and bioinformatics approaches, her lab has discovered exceptions to each of the rules for identifying protein-coding genes. “All of these rules are actually broken. And they’re not just broken in rare exceptions, they’re broken very widely,” Slavoff said. Once the lab adjusted its experimental parameters to account for these exceptions, the floodgates opened: in the world of proteomics, the mavericks and outcasts now shared a table at lunch with the jocks and socialites.
In an effort led by postdoctoral fellow Zhenkun Na, the lab went on to show that these “new” proteins aren’t just sitting around. In fact, they might be serving some of the most important functions in cells. Enter: NoBody, a microprotein that is a mere sixty-eight amino acids long.
One unique property of NoBody is its ability to behave like a fluid, forming liquid droplets in cells. While NoBody is in this droplet state, certain modifications to it, such as the addition of chemical groups known as phosphates, cause the dissociation of membrane-less organelles known as processing bodies, or P-bodies. P-bodies are small, like the proteins that regulate them, but the complexity of their anatomy is not to be underestimated. They serve as storage sites for enzymes that function in the processing and breakdown of RNA. Thus, NoBody’s mood at any given moment (in other words, its phosphorylation state) can determine whether certain RNA sequences, and the proteins they encode, are produced by the cell.
What’s even more astounding is that NoBody can regulate P-body dynamics seemingly without formal or consistent structure. It is not made of folded sequences such as alpha helices or beta sheets, which are some of the defining features of secondary structure in typical proteins. NoBody is just one of many “intrinsically disordered” microproteins with the power of order over our cells. The very existence of microproteins challenges everything we know about what proteins look like, what they do, and where to look for them.
One proteomics database, OpenProt.org, predicts the existence of over forty thousand microproteins and other proteins missing from our modern understanding of the human proteome. As of today, characterized proteins in the human body make up only half that number. With each one of these unfilled links potentially representing a new function, location, or structure in the cell, we should take a long, hard think before choosing the next protein from our lineup. “It took us over a hundred years to build up and annotate the human genome right. We don’t have another hundred years to figure out what these things are doing,” Slavoff said.