One of the advantages of having data online is that it lowers the barrier to serendipity: you can stumble across something in these 500 billion words and be the first person ever to make that discovery.
–The Official Google Blog, 12/16/2010 11:08:00 AM
Only days after Google’s Books Ngram Viewer went live, I was hooked. For this I can blame LPR’s co-publisher (the one not named “Mike”). Or fate, since I soon learned “determinism” is on the upswing while “free will” is still charting its own indeterminate course.
I could arrive at such fascinating findings on my own because back in 1996 Google co-founders Sergey Brin and Larry Page, grad students conducting research supported by the Stanford Digital Library Technologies Project, got the idea that what was then called a “Web crawler” could be used to index the content of digitized books and analyze relationships among them.
Fast forward to 16 December 2010. Google announces that it has scanned more than 15 million books and brought online a viewer that lets anyone access subsets of them and graph and compare up to five words or phrases at a time, drawn from more than 500 billion words published between 1500 and 2008. Suddenly we have a serious analytical tool or an intriguing toy, whichever we want it to be.
This being the holiday season, I veered toward the latter, as did other early adopters. The Wall Street Journal’s Digits blogger, for example, looked at changes in the percentages of “Merry Christmas” versus “Happy Holidays” over time.
But being of a literary bent, I eventually settled on topics like dialogue tags. Comparing “he said” and “she said,” I saw that–as in life–men do more of the talking in books, although authors in recent years seem to be doing less tagging of any sort.
I made numerous other discoveries while extracting information from that addictive Viewer, and my last–for the moment–allows me to end on a positive note. I am pleased to report that, after a steep decline, “Once upon a time” has recovered nicely and “happily ever after” is showing steady improvement.
Try the Books Ngram Viewer yourself when you run out of fun things to do. Just click on the About link first and appreciate the finer points before jumping to conclusions. The prognosis for “once upon a time” was pretty Grimm until I realized that the Viewer is case-sensitive and the phrase is typically found at the start of a sentence.