Diffbot’s A.I. Engine Draws Global Map of Machine Learning Expertise

A year ago, the leading Chinese Internet company Tencent Holdings pegged the global number of artificial intelligence researchers and professionals at 300,000 or fewer, just as the unmet demand for such experts was pushing salary offers to as much as $1 million. In February, the Canadian firm Element AI estimated that talent pool at no more than about 90,000, Bloomberg reported.

Now, Silicon Valley company Diffbot has used its A.I.-powered fact-mining engine to comb the global Web and make its own census of skilled people in the field.

“We found that there’s much more,” Diffbot co-founder and CEO Mike Tung says. Rather than looking for all A.I. talent, Mountain View, CA-based Diffbot searched only for people with expertise in machine learning, a much-valued specialization within A.I. In a report released this week, the company says it found 720,325 professionals with machine learning expertise, and 221,592 of them are in the United States alone. Diffbot says it’s the single largest survey of machine learning skills ever compiled.

The report demonstrates the growing potential of A.I. software to quickly amass data and analyze it, compared with traditional methods such as time-consuming surveys of limited population samples, and the extrapolation of those results to estimate the size of entire groups. Diffbot’s study may also contribute more granular evidence showing which countries are leading the way in A.I. technology development, and whether the field’s much-lamented talent shortage is as deep as hiring companies fear.

The fastest-growing job category in a 2017 study by LinkedIn was machine learning engineer. These experts design advanced software systems that can change their behavior based on “insights” from the results of their earlier actions.

Tung says Diffbot’s search method, which employs software that uses A.I. technologies such as machine learning, computer vision, and natural language processing, allowed the company to turn the Web into a sensor that used A.I. to detect professionals with advanced A.I. skills.

“Like Cerebro in ‘X-Men,’ you can find all the mutants in the world,” he says.

By sharing summary data about the thousands of experts it found, Diffbot is revealing what Tung calls a competitive advantage that his company has long held in its own hiring of A.I. experts. Diffbot has deployed its automated Web-scouring engine to find the types of job candidates who are key to its progress in the development of that core product itself.

“We’ve been using it for this reason for many years,” Tung (pictured in center above) says.

Now, by releasing an analysis of the database of machine learning talent it compiled, Diffbot is hoping to demonstrate the kind of information that clients can derive by the same means—about A.I. workers and many other topics.

Diffbot’s customers can query its “Knowledge Graph” of more than a trillion facts gleaned about 10 billion “entities,” which include people and products.

Diffbot has extended the reach of Web data capture by scanning types of items not usually tracked by search engines, such as advertisements, images, and the reader comments posted below articles. The system finds connections among the facts scooped up from these public sources—like linking a product’s description to all the prices for it found in current ad displays. Diffbot structures the facts within its searchable Knowledge Graph, which is continually updated.

Diffbot, founded in 2008, attracted customers including Cisco (NASDAQ: CSCO), Salesforce (NYSE: CRM), and Crunchbase while operating in beta mode until August, when it opened up its “knowledge-as-a-service” tool to the general public.

The August announcement sparked interest among new customers, Tung says, and Diffbot’s staff of about 30 are working to help these potential clients integrate the company’s services into their existing systems.

The company’s machine learning expertise report is Diffbot’s first major release of a study based on data in its Knowledge Graph, Tung says. Diffbot may produce more such reports if people find them interesting and useful, he says.

The global machine learning expertise map

Tung says Diffbot found a significantly greater number of A.I. experts than Element AI or Tencent (which is one of Diffbot’s investors) because its Web-crawling engine casts a much wider net, and works in multiple languages. Montreal-based Element AI had relied on LinkedIn profiles to estimate the number of A.I. professionals, according to Bloomberg.

To find people who identified themselves as skilled in machine learning techniques, Diffbot’s algorithms scanned an array of document types, including publicly posted résumés, curricula vitae, personal Web pages, company staff biographies, university faculty directories, news articles, scholarly publications, papers found through searches of Google Scholar, and professional sites such as GitHub.

Tung says top academics and machine learning experts at companies are more likely to be found through these sources than through LinkedIn, where new graduates and jobseekers commonly create profiles.

Diffbot’s count covers a wide variety of professionals, including those who do not hold PhDs. It encompasses both top engineers who can build entire machine learning systems and practitioners who can write code. Diffbot’s Web-crawling engine picks up on terms in addition to “machine learning” that indicate when a person is involved in the field, including “neural networks” and “TensorFlow.”

On the other hand, it would not award a place on the experts’ list to the drummer for a band dubbed “Machine Learning,” Tung says. Thanks to the search engine’s natural language processing capabilities, it can identify the sense in which a term is being used, he says.

Companies using the database as a recruiting resource would, of course, have to verify the

Author: Bernadette Tansey

Bernadette Tansey is a former editor of Xconomy San Francisco. She has covered information technology, biotechnology, business, law, environment, and government as a Bay Area journalist. She has written about edtech, mobile apps, social media startups, and life sciences companies for Xconomy, and tracked the adoption of Web tools by small businesses for CNBC. She was a biotechnology reporter for the business section of the San Francisco Chronicle, where she also wrote about software developers and early commercial companies in nanotechnology and synthetic biology.