Shazam for Seismologists? How a new data mining technique is shaking up earthquake science

Have you ever experienced an earthquake? Because many earthquakes are small and can go unnoticed, the answer for most people is yes. A common misconception about earthquakes is that they are always associated with ruptured pavement, collapsed buildings, and injured people trapped under rubble. These images are based on history’s largest, most devastating earthquakes. As significant as these events are, seismologists spend an equal amount of time studying the little earthquakes we may never even feel.

Small earthquakes occur constantly around the globe, with about twelve thousand to fourteen thousand recorded each year. To sort through all of this data, researchers have developed a variety of new techniques, and collaborating researchers at California universities have each been working on technology of their own. One of these techniques, called Fingerprint and Similarity Thresholding (FAST), adapts technology used in audio-clip identification to search for previously undiscovered earthquakes. The FAST algorithm is now being used to detect small earthquakes in under-researched areas across the globe.

Earthquakes are recorded through a seismic network, which measures the movement of the ground at particular sites. Each individual seismograph collects data that reflects the intensity of the vibrations beneath the instrument. Traditionally, an event must be recorded at a minimum of four different seismic stations to be included in an earthquake catalog, which contains information on the location, magnitude, and origin of all known earthquakes.

While a wealth of seismic data is available from these networks, the data is often contaminated by background noise, from a helicopter preparing for takeoff to a construction worker using a jackhammer on concrete. The FAST project aimed to detect earthquakes of magnitude 2.0 or below, which are often overlooked in earthquake catalogs. Researchers hope that expanding the study of small earthquakes will improve seismic monitoring technology, reveal previously unknown faults, and perhaps provide the tools for predicting earthquakes in the future, or at least for better understanding what they may look like.

How FAST works

FAST builds on the currently used “template-matching” technique of earthquake detection, which searches seismic data for signals that match those of known earthquakes. This method works well when researchers know what signals to look for. Thus, it is most effective in commonly studied areas, such as Southern California, where a template has been previously established. FAST goes beyond this method by creating “fingerprints” of seismic data, then using a computer algorithm to search through massive datasets for fingerprints that resemble one another. The fingerprints are rarely identical but are sometimes similar enough to be matched. Similar algorithms are already being used for a variety of purposes, from detecting plagiarism in documents to identifying a piece of music from a short audio clip.
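The fingerprint-and-match idea can be sketched in a few lines of Python. This is not the researchers' implementation: the fingerprint used here (the set of highest-energy time bins in a window), the planted event shape, and every name and constant are assumptions made up for illustration, and the published method builds much richer fingerprints and relies on hashing rather than comparing every pair of windows. The sketch only shows the core idea of flagging two stretches of a recording as a likely repeating event because their fingerprints overlap above a similarity threshold, with no template required.

```python
import random

WINDOW = 128   # samples per window (assumed value)
BIN = 4        # samples per energy bin, so 32 bins per window
STEP = 32      # hop between consecutive window starts
TOP_K = 8      # number of bins kept in each fingerprint
BURST_BINS = (2, 3, 4, 10, 11, 12, 13, 14)  # made-up event shape, not real data


def make_trace(length, event_starts, seed=0):
    """Synthetic record: weak Gaussian noise plus one made-up event shape
    planted at each start index."""
    rng = random.Random(seed)
    trace = [rng.gauss(0.0, 0.1) for _ in range(length)]
    for start in event_starts:
        for b in BURST_BINS:
            for t in range(b * BIN, (b + 1) * BIN):
                trace[start + t] += 1.0 if t % 2 == 0 else -1.0
    return trace


def fingerprint(window):
    """Toy fingerprint: indices of the TOP_K highest-energy bins."""
    n_bins = len(window) // BIN
    energy = [sum(x * x for x in window[i * BIN:(i + 1) * BIN])
              for i in range(n_bins)]
    ranked = sorted(range(n_bins), key=lambda i: energy[i], reverse=True)
    return frozenset(ranked[:TOP_K])


def jaccard(a, b):
    """Similarity of two fingerprints: shared bins divided by all bins used."""
    return len(a & b) / len(a | b)


def find_similar_windows(trace, threshold=0.8):
    """Return (start, start) pairs of non-overlapping windows whose
    fingerprints are at least `threshold` similar."""
    starts = range(0, len(trace) - WINDOW + 1, STEP)
    prints = [(s, fingerprint(trace[s:s + WINDOW])) for s in starts]
    pairs = []
    for i, (s1, f1) in enumerate(prints):
        for s2, f2 in prints[i + 1:]:
            if s2 - s1 < WINDOW:
                continue  # skip windows that overlap in time
            if jaccard(f1, f2) >= threshold:
                pairs.append((s1, s2))
    return pairs


# Two copies of the same hypothetical event, 800 samples apart.
trace = make_trace(1600, event_starts=(320, 1120))
pairs = find_similar_windows(trace)
print(pairs)
```

On this synthetic trace, the windows that match are the ones covering the two planted events, while windows containing only noise produce unrelated fingerprints. The all-pairs loop is the part that would not scale to years of continuous data, which is why a practical system needs a faster way to look up similar fingerprints.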

The FAST process is extremely efficient and surprisingly accurate. In a controlled test, the algorithm detected twenty-one of twenty-four cataloged earthquakes within two hours, a pace 140 times faster than other methods, and identified sixty-eight previously uncataloged events. For best results, the process should be used in conjunction with other detection techniques and human supervision. FAST is not a tool for forecasting earthquakes, but it can help us understand the processes behind them.

Clara Yoon and Karianne Bergen both worked on FAST as PhD students at Stanford University. Yoon, now a geophysicist for the US Geological Survey, remembered that her advisor originally played around with the music discovery app Shazam to lay the groundwork for FAST. “We took a prototype code for audio recognition and made some changes so that it would take seismic data. It was an experiment that taught me geophysicists should look outside their own field when attempting to solve important problems,” she said. “When our paper first came out, the community of geophysicists interested in these new methods was still fairly small. Over the past couple of years, I think that [seismologists] are becoming more excited to use techniques from data science and machine learning,” added Bergen, now a postdoctoral fellow at Harvard.

Applications of FAST

FAST has already made waves within the field of earthquake detection. In 2017, Yoon published a paper in which she used FAST to monitor seismic activity in Guy-Greenbrier, Arkansas. Beginning in July 2010, the area had experienced a number of small earthquakes known as the “Guy-Greenbrier earthquake swarm.” Researchers suggested that these events had been triggered by hydraulic fracturing, but they acknowledged that the area was prone to natural seismic events. Yoon and her team concluded that hydraulic fracturing did, in fact, cause most of these earthquakes, with wastewater disposal accounting for a smaller fraction.

They applied FAST in conjunction with template-matching to seismic data collected in the first three months of the 2010 swarm and determined that there had been over fourteen thousand micro-seismic events, a drastic increase from the seventy-five previously cataloged. Most of these small earthquakes could be correlated with the location of one or more of the area’s several production wells. “The possibility of larger earthquakes depends on many things, like types of faults, and levels of stress in an area. We should monitor earthquake activity before and during fracking just in case,” Yoon said. The earthquakes in Guy-Greenbrier may have been too small to cause much damage, but their connection to oil and gas extraction presents a challenge to geologists in charge of determining earthquake risk. Yoon and her team proposed taking action before more devastating events occur.

FAST represents just one of the many new data mining techniques under development to detect earthquakes. The use of artificial intelligence to generate new information from existing datasets could help geologists fully understand the complex processes behind earthquakes and thus better predict them. Hopefully, FAST and other methods will encourage researchers to think across disciplines when searching for solutions to problems in their own fields.