Debunking the Google Algorithm

William Zhang | william.zhang@yale.edu | October 2, 2010

Ask almost anyone what Google is, and he or she will not only know it but also use it, keep it as the search bar in his or her Internet browser, and probably even use the word as a synonym for the verb “search.” Google, which started out as an idea for ordering search results, has become a ubiquitous information powerhouse and search engine juggernaut. In fact, Google controls approximately two-thirds of the market share of search queries, and millions of people around the world have set the search engine as their homepage. When the company was established in 1998, co-founders Larry Page and Sergey Brin wanted “to organize the world’s information and make it universally accessible and useful,” and they have done just that. Using its highly guarded, intricate algorithm, Google has emerged at the forefront of search engines, understanding what we mean and producing what we want, all in a matter of milliseconds.

But is Google’s dominance based on a superior algorithm or just an exceptional business-marketing model? Is it really the paragon of search engines or do its advertising and hype, clean layout, clever holiday logo themes, crisp Gmail interface, and, as of this year, touching Super Bowl commercials, drive its success instead?

Google has undeniably revolutionized the search engine with its first and most fundamental innovation, PageRank. When you type a query such as “science” into Google and press the Enter key, Google retrieves 655,000,000 results in short order, all ranked by importance and relevance. While earlier search engines ranked pages by the number of word hits, Google uses over 200 different signals, including PageRank, to evaluate the website as a whole. The PageRank algorithm inspects the link structure among websites, treating a link to a page as an endorsement of it, and uses the resulting hierarchy to gauge the reliability, relevance and importance of a search result.
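The link-analysis idea described above can be illustrated with a minimal sketch of the textbook version of PageRank. This is not Google's production code, which is secret and combines many other signals. The toy link graph, the function name pagerank, the damping value of 0.85 (a commonly cited choice), and the fixed iteration count are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of scores.

    links maps each page name to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: nothing links to "blog".
toy_web = {
    "home": ["news", "science"],
    "news": ["science"],
    "blog": ["science", "home"],
    "science": ["home"],
}
ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, “blog” ends up with the lowest score because no other page links to it, no matter how often a search word might appear in its text. That is the sense in which link structure, rather than word counts, drives the ranking.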

“When PageRank first came out, it provided us with much better search results,” Yale Professor of Applied Mathematics and Computer Science Dan Spielman said. “They were just so much better.” Spielman recalls the time when Google first surfaced on the search engine scene and jumped miles ahead of its competitors.

In addition to PageRank, Google employs a unique hypertext-matching analysis to provide users with quality search results. By sifting through both page content and context, rather than only scanning the easily manipulated page text, Google can ensure that results are pertinent. This technology examines the webpage text within the context of fonts, subdivisions, and locations. Google even analyzes related websites.

Google’s edge is also attributable to its constant push for innovation. Over the years, Google has continually upgraded its search engine, which can now handle misspellings, reordered words, word associations, and even middle initials. This year, Google will reportedly introduce approximately 550 new developments.

Another way Google continues to improve its algorithm is through the massive number of people who use it. Every time we use Google, we become involuntary Google lab rats. By analyzing which results we click on, how we fix misspellings, which words we replace in searches, and other search data, Google can continually improve its results and adjust its algorithm to match trends. And when Google engineers want to test a change to the algorithm, they run an experiment on the users: a small percentage of random searchers receive the new search results, while the others act as the control.
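The experiment described above can be sketched in a few lines. This is only an illustration of the general idea of a controlled experiment, not Google's actual system. The function name assign_group, the experiment label, the 5 percent treatment share, and the use of a hash to stand in for random assignment are all assumptions.

```python
import hashlib


def assign_group(user_id, experiment, treatment_fraction=0.05):
    """Place a searcher in the 'treatment' or 'control' group.

    Hashing the (hypothetical) experiment name together with the user ID
    gives an assignment that looks random across users but stays the same
    for any one user, so a searcher does not flip between old and new results.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_fraction else "control"


# Simulate 20,000 hypothetical searchers and count each group.
groups = [assign_group(user_id, "new-ranking-tweak") for user_id in range(20000)]
treated = groups.count("treatment")
print(f"treatment: {treated}, control: {len(groups) - treated}")
```

Engineers could then compare a measure such as how often each group clicks the first result. If the small treatment group does better than the control, the change is a candidate for everyone.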

It’s a perfect example of a virtuous cycle. Because of its established dominance, Google can continue to collect data, improve, and stay on top of the game. Google controls so much of the search engine market that businesses have become obsessed with their Google ranking, some even going so far as to sue Google over their fluctuating page ranks. In today’s world, where the average person will only scroll through one (and maybe two) pages of search results, a high Google ranking is absolutely critical for attracting website traffic, and a low ranking is essentially equivalent to online invisibility.

Ultimately, it is the combination of the superb algorithm and Google’s distinctive advertising presence that accounts for Google’s sustained success. More recently, Google has developed many additional features like Gmail, Google Maps, Google Docs, Google Earth, Google Street View, and Google Labs. The company often offers these features as “Beta” versions to make users feel like they are getting a special preview.

Google has come to satisfy everything we are looking for in a search engine and more. It initially presented us with a superior method of searching and is now embedded in our culture. In just a few short years, Google has established its dominance and become a worldwide phenomenon.