The Origin of the Nucleus: The Discovery of a Cellular Fossil

Photo courtesy of Ellie Gabriel

The fossil record of evolution is dotted with a few key events, such as life’s transition from water to land and the emergence of functional wings on birds. However, one of the most fundamental evolutionary changes, the emergence of the nucleus, remained a mystery until recently. The study of more recent evolutionary events, such as the emergence of wings on birds, benefits from rich sources of information, like fossils. In contrast, as well as we understand the nucleus’s central role in eukaryotic cells, its development is still difficult to study because the fossil record holds so little data about it. Thanks to a new study led by Dieter Söll, a Sterling Professor of Molecular Biochemistry and Biophysics at Yale, molecular clues to nuclear development, known as nuclear localization signals (NLSs), have brought us a step closer to understanding how, when, and why the nucleus evolved.

What are NLSs?

Life on Earth is divided into three domains: Bacteria (referred to here as Prokaryotes), Eukaryotes, and Archaea. Among these domains, only eukaryotes have a nucleus, a specialized, membrane-bound compartment within the cell that stores genetic information. The nuclear membrane acts as a barrier that compartmentalizes the genetic information and lets the cell control which proteins, and how many of them, can enter and interact with the genetic material. In this way, the cell can regulate how much each gene is expressed. Large and bulky protein molecules are unable to pass through the nuclear membrane on their own. Instead, transport proteins called karyopherins carry eligible proteins in and out of the nucleus through nuclear pores, channels that span the nuclear membrane.

For proper cellular function, only certain proteins can enter or leave the nucleus. To allow karyopherins to detect which proteins can enter the nucleus, eukaryotic proteins contain NLSs, short amino acid sequences that karyopherins can recognize. These NLSs are always contained in nucleic acid-binding regions, the parts of a protein that can interact with nucleic acids. Proteins with NLSs that can be recognized by karyopherins can enter the nucleus, while those without cannot. “In eukaryotic cells, [NLSs] serve as a ticket to get into the nucleus, recognized by karyopherins that [bind] the signals. The very same sequences can bind to ribosomal RNA and facilitate three-dimensional folding of RNA,” said Sergey Melnikov, a postdoctoral fellow in Söll’s lab and the first author of the recent study.

What did they do?

The project began with a peculiar set of results. To study how the nucleus and nuclear transport pathways may have evolved in eukaryotes, the researchers focused on ribosomes, the cellular machinery responsible for synthesizing proteins, and compared both the sequences and the three-dimensional structures of ribosomal proteins in Eukaryotes, Prokaryotes, and Archaea. Knowing that NLSs were the unique features that allowed eukaryotic proteins to enter the nucleus, the group first looked for proteins in Prokaryotes and Archaea with sequences resembling NLSs. To their surprise, given that Archaea have no nucleus, they found four ribosomal proteins with NLS-like sequences in Archaea. Ribosomes are built from RNA as well as proteins, and these NLS-type motifs were found in nucleic acid-binding regions, where they glue the ribosomal proteins and RNA together. To investigate the function of these NLS-type motifs in Archaea, the team then compared the secondary and tertiary (three-dimensional) structures of the motifs in archaeal proteins and eukaryotic proteins. They found that in addition to having similar sequences, the NLS-type motifs in Archaea and NLSs in Eukaryotes have similar three-dimensional structures.

This discovery was even more surprising. “At that moment, I knew that I could learn something from this,” Melnikov said. The existence of motifs thought to be involved in nuclear transport in cells without nuclei seemed wasteful, even paradoxical. In search of a conventional biological explanation for these NLS-type motifs in Archaea, Melnikov set out to find chaperones (proteins that assist other proteins in performing a specific function) in Archaea that could have used NLS-type motifs for recognition or transport. His search ultimately turned up nothing. This failure, however, hinted that the very existence of these NLS-type motifs with no apparent function in the cell was a window to even bigger questions. “Sometimes even the most trivial fact can become a great discovery when you look at it from an interesting angle,” Melnikov said.

Having identified these motifs, the researchers then studied whether the motifs were universally present throughout the Archaea domain or only in certain subsets. They expected that among the four main lineages of Archaea, only the most recently diverged, and therefore the most closely related to eukaryotes, would feature these NLS-type motifs. Instead, they found that the motifs were conserved in all lineages, even the most ancient and least eukaryote-like.

Why would cells without a nucleus have proteins with NLS-like sequences and structures? One possible explanation was that these sequences could have co-evolved with changes in ribosomal RNA. The researchers theorized that changes in the nucleic acid elements of the ribosome over time may have selected for changes in the protein elements resembling NLSs. To test this theory, the team compared the structures of the ribosomal RNA segments known to be binding partners of NLS-type motifs in Prokaryotes and Archaea. To their surprise, they found no differences between the two ribosomal RNA structures, suggesting a different factor was at play.

Finally, the research team tested whether archaeal NLS-type motifs could function in living cells. Using eGFP-fusion proteins, proteins whose fluorescence allows researchers to track their locations within the cell, the team replaced NLSs within eukaryotic proteins with their archaeal NLS-type motif counterparts. They found that these modified proteins could bind to karyopherins and enter the nucleus equally well. That is to say, the archaeal NLS-type motifs were not only similar in sequence and structure to eukaryotic NLSs, they could also fulfill the same biological function.

Lessons from nuclear evolution

All things considered, these NLS-type motifs in Archaea represent yet another example of a common phenomenon in evolution: a feature that persists because it is useful for one purpose and is later put to a different one. The presence of NLS-type motifs in cells that do not contain a nucleus provides valuable information about the sequence of possible events that led to the emergence of the nucleus. Archaea without nuclei but with NLS-type motifs represent intermediate species between “start” and “end” points in nuclear evolution, analogous to the fossil of an intermediate between a bird with wings and its ancestor without wings. “It took millions of years to evolve, but for most of this time wings were not strong enough to support flight. For a species to have kept evolving them, these premature half-wings had to have been useful for other reasons,” Melnikov said. “[As to] how this major organ of flight emerged, we believe that these premature wings helped animals either glide or climb in an accelerated motion.” In other words, before the wing could support flight, it had to be doing something useful to “justify” its evolutionary persistence beyond a few generations.

In the same way, the presence of NLS-type motifs in cells that have no use for nuclear transport capabilities suggests that these motifs may have initially played other roles in cells. Currently, the researchers ascribe three biological functions to NLSs: recognizing nucleic acids, providing a recognition site for karyopherins to mediate transport across the nuclear membrane, and increasing the specificity of the protein’s interactions with nucleic acids through karyopherin binding. The results of this study clue scientists in to the order in which these functions may have evolved. The researchers suggest that NLSs may have originally evolved as a way for proteins to bind more specifically to nucleic acid sequences. They further hypothesize that chaperone proteins evolved to recognize specific NLSs, increasing specificity in much the same way that chaperone proteins recognize hydrophobic sequences on proteins. Finally, as the nucleus developed, these chaperones became karyopherins, transporting proteins with the specific NLSs into the nucleus.

What’s next?

To the authors, a key question remains unanswered: why ribosomal protein regions in Prokaryotes and Archaea that bind to the same ribosomal RNA targets have different structures. In the paper, the researchers offer three possible explanations: differing evolutionary histories of prokaryotic and archaeal ribosomal proteins, differences in the mechanisms Prokaryotes and Archaea use to create ribosomes, and differentiation as a way of increasing specificity in an increasingly complex cellular landscape. Further research is needed to fully understand these differences.

While the results concerning NLSs and ribosomal components described in this study are exciting, much remains to be explored. “The riddles of life are written in ribosomes,” Melnikov said. Of the roughly sixty proteins universally encoded in all life, ribosomes and their associated machinery account for around half. Because of their ubiquity, ribosomes offer researchers not only clear footholds upon which to build phylogenetic trees and compare the relatedness of different life forms, but also a rare glimpse into the types of environmental conditions a cell evolved to adapt to. In the future, Melnikov hopes to research methods by which different conditions of an environment, such as temperature, salinity, or pH, can be deduced purely from a cell’s ribosomal structure and sequence.

While understanding evolution at the cellular and microscopic level may be a daunting task, advances in computational power and biochemical methods are furthering our understanding more than ever. For now, from initially puzzling data concerning ribosomal components, researchers have managed to shed new light on the origin of cellular machinery, the nucleus, and life.


1. Sergey Melnikov, Hui-Si Kwok, Kasidet Manakongtreecheep, Antonia van den Elzen, Carson C Thoreen, Dieter Söll, Archaeal Ribosomal Proteins Possess Nuclear Localization Signal-Type Motifs: Implications for the Origin of the Cell Nucleus, Molecular Biology and Evolution, msz207

2. Interview with Dr. Sergey Melnikov, Yale Department of Molecular Biochemistry and Biophysics, Interview on 11/5/2019