In today’s Wall Street Journal, Carl Bialik writes about “statistical time travel” performed by number-crunching researchers.

“In recent years,” writes Bialik, “statisticians have created time machines to answer a wide range of historical hypotheticals, from how today’s Supreme Court would have voted on Roe v. Wade to what sort of scientific papers Einstein might write today.”

One of the researchers highlighted in the article is Princeton computer scientist David Blei, who has done a computational analysis of more than a hundred years’ worth of Science magazine.

This is how Bialik describes Blei’s research:

“His system identifies topics from scratch and assigns topic scores — say, 80% neuroscience and 20% philosophy, or 40% biology and 60% chemistry. Any papers that have the same topic scores could then be grouped together, even if they are decades apart and keywords or concepts didn’t yet exist. (Think of quarks or H1N1.)

“Here the critical bridge — the necessary overlap to relate past decades to the present — were keywords that were associated with others before they faded… Such techniques connected an 1880 paper on orangutan brains with a 1976 paper on monkey brains.

“That technique helps dig up research that was ahead of its time. For instance, these very time machines, including Dr. Blei’s, make use of so-called Bayesian statistics, which were developed decades before there was sufficient computing power to use them fully.”
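
To make the idea of grouping papers by topic scores a little more concrete, here is a minimal sketch. It is not Blei’s actual system: the paper labels and topic proportions below are invented for illustration, and the Hellinger distance is just one common way to compare two topic mixtures, chosen here as an assumption.

```python
import math

# Hypothetical topic proportions for three papers (invented numbers,
# not taken from Blei's analysis of Science).
papers = {
    "paper_1880": {"neuroscience": 0.80, "philosophy": 0.20},
    "paper_1976": {"neuroscience": 0.75, "philosophy": 0.25},
    "paper_other": {"biology": 0.40, "chemistry": 0.60},
}


def hellinger(p, q):
    """Hellinger distance between two topic mixtures (0 = identical, 1 = disjoint)."""
    topics = set(p) | set(q)
    total = sum(
        (math.sqrt(p.get(t, 0.0)) - math.sqrt(q.get(t, 0.0))) ** 2 for t in topics
    )
    return math.sqrt(total / 2.0)


def closest(title, collection):
    """Return the title of the paper whose topic mixture is nearest to `title`."""
    others = [t for t in collection if t != title]
    return min(others, key=lambda t: hellinger(collection[title], collection[t]))


if __name__ == "__main__":
    print(closest("paper_1880", papers))
```

With these made-up scores, the two papers dominated by the same topic end up nearest to each other even though they share no publication date, which is the kind of match across decades that the article describes.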

You can hear Blei talk about his work in this 2007 Google Tech Talk. Blei’s recent research includes papers on “finding latent sources in recorded music,” “a computational approach to style in American poetry,” and “augmenting social networks with text.” The last of these was coauthored with former student Jonathan Chang, now at Facebook, who in a recent blog post describes various visualizations he created of theonion.com’s Twitter traffic.