Erin McKean, the founder of the self-organizing online dictionary Wordnik, told me at the end of our first interview back in 2011 that she would like her company to become a verb, the way Google has. When you’re telling someone to look up a word, you’d say “Wordnik that.”
Well, McKean has gotten her wish, but not exactly in the way she meant. In January the startup relaunched itself under a new name, Reverb, which has “verb” in it and is itself a verb.
It’s a typical bit of word fun for the company, which is built around McKean’s voracious interest in words, definitions, and concepts—she’s the former editor-in-chief of the New Oxford American Dictionary—and is now expanding beyond its automated dictionary to offer new kinds of services to publishers.
The first new product, called Reverb for Publishers, is a free plug-in for bloggers and Web publishers that shows site visitors more articles closely related to whatever they’re currently reading. When you get to the bottom of a post on McKean’s own blog A Dress A Day, for example, you’ll see three recommended posts from her blog, and three recommended articles from other sites that use Reverb.
The system eschews keywords, headlines, popularity, or other standard cues and instead uses the same machine-learning technology behind Wordnik to construct a more sophisticated understanding of what a post means. That helps the system find other posts that are truly similar, which, in theory, readers will find more interesting.
“You can imagine two articles about chocolate cake,” McKean explains. “This one might be about a decadent cake for a party. Another one might be about a super-low-calorie cake for dieters. Those are not meaningful to the same people, even though they are both about chocolate cake.”
In a way, Reverb’s goal is to come up with recommendations that reflect the taste and the calorie count of an article, rather than just the ingredients it’s made from. And that will lead more people to click on the recommended articles, driving more traffic—which is the real point.
There are plenty of “recommended for you” widgets available to publishers. (Xconomy uses one from Disqus and another from SimpleReach.) But Tony Tam, Reverb’s co-founder and CEO, says the click-through rate for articles recommended by Reverb is two to four times higher than the industry average. Sites that join the Reverb network generally see their traffic increase by about 25 percent, he says. That’s an enormous-sounding number, but even if it were lower, it would be pretty good for a tool that, so far, is completely free.
And Wordnik hasn’t gone away. In fact, the dictionary keeps growing by thousands of words a day, as it automatically scours the Web for new words and word usages and integrates them into its “word graph” (a true mathematical graph with tens of millions of words as nodes and many tens of millions of word relationships as edges). But McKean and Tam say that as they got ready to roll out the related-content service in late 2012, the old name just wasn’t working.
McKean says the company got some good advice from investor Mike Maples, who joined the board after his firm Floodgate participated in an $8 million Series C round in 2011. “He said, ‘You guys just did too darn good a job with Wordnik. It’s so connected in people’s minds with dictionaries that you have to spend the first 10 minutes of every meeting explaining why Reverb for Publishers is different.’”
With the relaunch under the name Reverb, the company wanted to turn its identity “inside out,” McKean says. “We said, ‘What if we were a tech company that, as one of its products, had a giant, awesome dictionary?’ The way we’ve been describing it is, okay, what would it be like if you were a caterpillar that turned into a butterfly but you got to keep the caterpillar as a pet?”
The core asset at Reverb, built mostly by Tam between 2008 and 2012, is a system that combines natural language processing, machine learning, and lexicography to figure out how strongly one word or group of words is related to another. Wordnik was the system’s first manifestation, and it focused on what McKean calls “monogamous” relationships between words—how one word was related to another word.
But the same system is capable of looking at relationships between word clusters, even clusters of up to several thousand words. Tam and McKean realized they could put this power to work by analyzing the content in publishers’ archives.
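To make the idea of cluster-to-cluster relatedness concrete, here is a toy sketch in Python. It is not Reverb's actual method, which the company has not described at this level of detail. It assumes a small, hand-made table of word-to-word scores (the `WORD_RELATEDNESS` values are invented for illustration) and rates two word clusters by averaging the scores of every cross-cluster word pair. The function names `word_score` and `cluster_score` are likewise hypothetical.

```python
from itertools import product

# Hypothetical, hand-made relatedness weights between word pairs (0 to 1).
# These numbers are invented for illustration; they are not Wordnik data.
WORD_RELATEDNESS = {
    frozenset({"cake", "frosting"}): 0.8,
    frozenset({"cake", "party"}): 0.5,
    frozenset({"frosting", "party"}): 0.4,
    frozenset({"party", "celebration"}): 0.9,
    frozenset({"cake", "celebration"}): 0.5,
    frozenset({"frosting", "celebration"}): 0.4,
    frozenset({"calorie", "diet"}): 0.9,
    frozenset({"cake", "calorie"}): 0.3,
    frozenset({"cake", "diet"}): 0.2,
    frozenset({"frosting", "calorie"}): 0.3,
}


def word_score(a, b):
    """Relatedness of two single words: 1.0 if identical, 0.0 if unknown."""
    if a == b:
        return 1.0
    return WORD_RELATEDNESS.get(frozenset({a, b}), 0.0)


def cluster_score(cluster_a, cluster_b):
    """Average word-to-word relatedness over every cross-cluster pair."""
    if not cluster_a or not cluster_b:
        return 0.0
    pairs = list(product(cluster_a, cluster_b))
    return sum(word_score(a, b) for a, b in pairs) / len(pairs)


# Three imaginary posts, each reduced to a tiny cluster of words.
party_cake = {"cake", "frosting", "party"}
birthday_cake = {"cake", "frosting", "celebration"}
diet_cake = {"cake", "calorie", "diet"}

print(cluster_score(party_cake, birthday_cake))
print(cluster_score(party_cake, diet_cake))
```

In this toy setup the two celebration-oriented posts score higher against each other than the party post does against the diet post, even though all three mention cake. A production system would presumably learn its weights from large amounts of text rather than from a hand-written table.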
“Evergreen, long-tail content does not have a very good home” on the Web, Tam says. Often, Tam says, the lists of related posts that show up below or beside a page’s main text are based on