Why Boys are Not Girls

Boys are not girls; their external differences are obvious. Yet what fills the trillions of cells beneath the skin of the human body is not sexes but molecules. Only through their immensely complex interactions do we emerge. So, while we all appear different on the outside, how different are we biologically on the inside? Specifically, at what level do we begin to sexually differentiate?

The evolutionary perspective satisfactorily answers the why behind sex differentiation by framing it as an adaptation that increases fitness by greatly increasing genetic diversity. However, evolution does not adequately answer how the sexes differ physically, and so a different explanation is needed. Many studies have analyzed the macroscopic anatomical distinctions between males and females, focusing on what is found in men but not in women and vice versa, but little light has been shed on the root cause of our different structural and physiological characteristics: the differential expression of genes and, in turn, of the proteins they encode.

With the help of Yale research scientist Dr. Joel Rozowsky and his laboratory for computational analysis, former Yale Professor Dr. Michael Snyder and his then graduate student John Rinn conducted a methodical survey of gene expression in the tissues of male and female mice to investigate sexual difference at the fundamental level using DNA microarrays. Through their pioneering work, the character of the molecular differences between males and females can begin to be appreciated.

Discovering Our Differences

To investigate the molecular differences between males and females, Rinn began by extracting and analyzing the mRNA present in four types of tissue from adult mice, chosen to cover a wide breadth of functional categories: liver and kidney for physiology, hypothalamus for behavior, and gonads for reproduction. Once the tissues were excised, their mRNAs were isolated, pooled together, and run on microarrays, yielding a data set of over 1,500,000 gene expression data points.

Despite the daunting mass of data collected, the technique behind all the numbers and points is fairly straightforward. Anchored firmly in the central dogma of molecular biology, the microarray technique infers protein abundance by measuring the presence or absence of the proteins' precursor molecules, mRNA. Fluorescently labeled sample mRNA is added to a microarray containing hundreds to thousands of spots of cDNA molecules. Through complementary base pairing and hybridization, the extracted mRNA molecules bind to complementary sequences on the array, causing the corresponding spot to fluoresce. Because the intensity of the fluorescence is proportional to the abundance of the bound strand, researchers can deduce not only which mRNA strands are present in the sample but also how much of each.
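The hybridization step can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: the gene names and sequences are hypothetical, a strand binds only to a perfectly complementary probe, and the signal is taken to be exactly the number of bound copies.

```python
# Base pairing between an RNA sample strand and a DNA probe on the array.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def probe_for(mrna):
    """Return the DNA probe (written 5'->3') that pairs with an mRNA fragment."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


def hybridize(sample, probes):
    """Count the labeled copies that bind at each array position.

    sample: dict mapping mRNA sequence -> number of copies in the sample
    probes: dict mapping (hypothetical) gene name -> probe sequence
    """
    signal = {gene: 0 for gene in probes}
    for mrna, copies in sample.items():
        target = probe_for(mrna)
        for gene, probe in probes.items():
            if probe == target:  # assumption: only perfect matches bind
                signal[gene] += copies
    return signal


probes = {"geneA": "GGCCAT", "geneB": "TTTCAT", "geneC": "CCCCCC"}
sample = {"AUGGCC": 120, "AUGAAA": 15}
print(hybridize(sample, probes))
```

In this toy sample, geneA lights up brightly, geneB faintly, and geneC not at all, which mirrors how the array reports both which strands are present and how abundant they are.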

After performing a total of 72 microarrays, Professor Snyder and Rinn compiled and normalized the data to observe the differences in fluorescence and expression between the two sexes. While this preliminary examination gave a broad overview of the results, bioinformatics specialist Joel Rozowsky’s analysis would reveal more specific findings.
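To give a feel for what normalizing and comparing arrays involves, here is a minimal Python sketch. The article does not say which normalization the researchers used; scaling each array to a common median is an assumption made for illustration, and the gene names and intensities are invented.

```python
from math import log2
from statistics import median


def median_normalize(array, target=1000.0):
    """Scale one array's intensities so that its median equals `target`.

    Assumption: median scaling stands in for whatever normalization
    was actually applied to the real data.
    """
    scale = target / median(array.values())
    return {gene: value * scale for gene, value in array.items()}


def log2_ratio(male, female):
    """Return log2(male / female) per gene; positive means higher in males."""
    return {gene: log2(male[gene] / female[gene]) for gene in male}


# Invented raw intensities: the female array is simply twice as bright overall.
male_raw = {"g1": 100, "g2": 200, "g3": 800}
female_raw = {"g1": 200, "g2": 400, "g3": 400}

ratios = log2_ratio(median_normalize(male_raw), median_normalize(female_raw))
print(ratios)
```

After normalization, g1 and g2 show no difference between the sexes, while g3 comes out four-fold higher in the male sample (a log2 ratio of 2). Without normalization, the overall brightness difference between the two arrays would have masked this.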

Samples of the microarray data collected by John Rinn. Courtesy of Dr. Rozowsky.

Region-Specific Genes

After Rozowsky performed an analysis of variance (ANOVA), a total of 39 somatic genes remained that showed statistically significant (p < 0.001) differences in expression of at least three-fold. Of these, 20 were expressed in both males and females but at a significantly higher level in one sex, and 19 were present in only one sex. Five exhibited sex bias in all tissues, 27 were differentially expressed in the kidney, six in the liver, and only one in the hypothalamus.
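The two filters described here, a significance test and a minimum fold difference, can be sketched in a few lines of Python. This is an illustrative simplification, not the analysis pipeline itself: it compares just two groups of replicates, the expression values are invented, and the significance cutoff is supplied as a critical F value (about 74.14 for p = 0.001 with 1 and 4 degrees of freedom, from standard F tables) rather than computed from the F distribution.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    values = [v for group in groups for v in group]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return float("inf")  # no scatter within groups at all
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def fold_difference(male, female):
    """Ratio of the higher mean expression to the lower one."""
    high, low = max(mean(male), mean(female)), min(mean(male), mean(female))
    return float("inf") if low == 0 else high / low


def is_sex_biased(male, female, f_critical, min_fold=3.0):
    """Keep a gene only if it passes both the fold and significance filters."""
    return (fold_difference(male, female) >= min_fold
            and one_way_anova_f([male, female]) >= f_critical)


F_CRITICAL = 74.14  # approximate tabulated value: p = 0.001, df = (1, 4)
male = [10.0, 12.0, 11.0]      # invented replicate measurements
female = [30.0, 33.0, 36.0]
print(one_way_anova_f([male, female]), fold_difference(male, female))
print(is_sex_biased(male, female, F_CRITICAL))
```

Requiring both conditions guards against two kinds of false lead: large but noisy differences that could be due to chance, and tiny differences that are statistically solid but biologically unremarkable.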

In the kidney, most of the 27 genes were involved in drug and steroid metabolism, with a strong plurality encoding cytochrome P450 proteins. Of the others, six had functions relating to osmotic regulation, one was a corticosteroid-binding globulin, and most of the rest had largely undetermined functions. In the liver, a majority of the genes discovered were again related to drug and steroid metabolism, with a few others having undetermined functions. Taken together, these results indicate that most of the molecular differences between the sexes in physiological tissues relate to either metabolism or osmotic regulation.

In contrast to the physiological tissues, only a few differences were observed in the behavioral tissues of the hypothalamus. In fact, only one unique gene, TSIX, was found to be differentially expressed. Beyond simply indicating the molecular similarity between behavioral tissues, this result is especially interesting because male and female hypothalami have several well-known anatomical differences. The paucity of molecular differences suggests that the underlying processes in the behavioral tissues are quite similar.

A diagram of the genes found to be expressed significantly differently in male and female kidneys. Courtesy of Dr. Rozowsky.

As for the gonad tissues, far more than 39 genes were found to be differentially expressed: over 4,000, in fact. Still, this was entirely expected, as the tissues themselves are fundamentally different. Besides having entirely different structures, these organs produce different molecules and perform different chemistry. Even after increasing the stringency of the ANOVA to p < 0.000001, 882 differentially expressed genes were found and identified (534 of which had increased expression in the testes, 358 in the ovaries).

Three interesting observations can be made. First, RNA polymerase II (pol II) genes were highly enriched in the testes. Second, immune response genes showed little differentiation between the sexes. Third, a significant number (42) of signal transduction genes showed enriched expression in the testes. Taken together with the somatic tissue results, Rozowsky’s analysis clearly demonstrates that multiple genes are differentially expressed between the sexes in mammalian tissues.

A diagram of differential gene expression in male and female livers. Courtesy of Dr. Rozowsky.

How Different Are We, Really?

While the microarray survey revealed significant differences in gene expression between male and female mice, differences that can plausibly be generalized to other organisms, including humans, these differences are relatively few in number. Beyond the sex organ tissues, whose molecular compositions are expected to be drastically different because of their different functions, the vast majority of molecules were expressed at similar levels in both sexes.

Perhaps puzzling at first, this discovery is actually quite harmonious with scientific intuition. Because the organ systems of both sexes have similar functions, it makes sense that they should each have similar structures and compositions. Just as the male pancreas needs to produce insulin, so too does the female pancreas. Here again, scientific analysis has revealed that we are much more similar than expected. On the molecular level at least, males and females seem to differ more in details than in composition.

Males and females are obviously not biologically equivalent, as evidenced by our divergent anatomy. However, this analysis lends support to the general conclusion that the differences between sexes result primarily not from a fundamentally different molecular composition of organs themselves but from differences in gene expression in developmental pathways, ultimately precipitating large-scale anatomical differences rather than microscopic variations.

So, returning to the questions posed: How different are we? Are we simply variations on a theme or fundamentally different? While neither question has an absolute answer, the molecular data collected thus far strongly support the notion that we are much more alike than is commonly presumed. While we undoubtedly differ, these differences largely emerge at a higher level of organization than the molecular one.

On the Cutting Edge

Even just a few years removed from the groundbreaking work itself, the techniques used have begun to give way to newer ones. While microarrays continue to be a central (and in many ways, irreplaceable) technique, a new field of sequence-based approaches has come into vogue. Rather than identifying the mRNA through complementary base pairing on a grid, this new field sequences it directly.

Before the Human Genome Project (HGP) revolutionized the field of DNA sequencing, sequencing was conducted using the Sanger method, also known as the chain-termination method, which relies on the termination of replication upon incorporation of a labeled dideoxynucleotide, followed by separation of the resulting fragments on a gel. While this method could reliably sequence continuous segments of ~800 bp, it was slow. With modern sequencing methods, what used to take ten years can now be done in a few days with a small desktop machine. This unprecedented speed allows researchers interested in gene expression to quickly and cheaply sequence the mRNA collected. Besides allowing identification of expected strands, as microarray analysis does, this direct sequencing approach allows for the identification of previously unknown or unexpected mRNA strands, as well as those with significant mutations. With this new technology, perhaps even more expansive studies in more tissues can be conducted to uncover more about how boys are not girls, or how boys and girls are both humans.
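The logic of chain termination can be mimicked with a short Python sketch. It is a deliberately idealized model: the template sequence is invented, the primer is ignored, and every possible termination length is assumed to be produced and cleanly resolved on the gel.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def chain_termination_fragments(template):
    """Return, for each of the four terminator bases, the fragment lengths made.

    The new strand is built complementary to the template (given 5'->3').
    A terminator of base X ends a fragment wherever X is incorporated,
    so each reaction yields fragments ending at every position of that base.
    """
    new_strand = [PAIR[base] for base in reversed(template)]
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read fragments from shortest to longest to recover the new strand."""
    by_length = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in by_length)


lanes = chain_termination_fragments("ATGCCA")  # invented template
print(lanes)
print(read_gel(lanes))
```

Because shorter fragments travel farther through the gel, reading the bands in order of fragment length spells out the newly synthesized strand one base at a time, which is why the approach is accurate but slow for long stretches of DNA.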

Sanger sequencing reveals sequence information by running the prematurely terminated and fluorescently labeled nucleic acid fragments on a gel. Courtesy of Abizar Lakdawalla.

About the Author
John Urwin is a sophomore Molecular Biophysics and Biochemistry major in Jonathan Edwards College. He is a contributing writer for the Yale Scientific Magazine and has worked in Professor Colón-Ramos’ lab studying nervous system development in C. elegans.

The author would like to especially thank Joel Rozowsky for his time and expertise.

Further Reading
Tae Oh and Jong Yun, “DNA microarray analysis reveals differential gene expression in the soleus muscle between male and female rats exposed to a high fat diet,” Molecular Biology Reports (2012): 1-12.
PD Dimusto et al, “Increased PAI-1 in females compared to males is protective for abdominal aortic aneurysm formation in a rodent model,” American Journal of Physiology (2012).