Early investors in AI support services company Scale AI will be cheering this week, as the startup’s valuation broke through the billion-dollar mark following a USD 100m Series C round. The news was widely reported by various media outlets, including Bloomberg and TechCrunch. At its essence, Scale AI is a tech-driven AI support services company that relies on a large crowd of human workers to label data. Sound familiar?
This is the business Lionbridge CEO John Fennelly mentioned to Slator in early 2019, saying he could envision it becoming larger than the LSP’s traditional localization business.
Fennelly made the statement after announcing the acquisition of Tokyo-based Gengo, which started out as a crowdsourced translation services provider but branched into AI support services a couple of years ago. The move paid off for Gengo as it factored heavily into Lionbridge’s decision to buy the company, according to CEO Fennelly.
Founded as recently as 2016, Scale AI’s numbers are impressive. Bloomberg reports that the company has 100 employees in San Francisco and has already built up a network of 30,000 contractors. Its 22-year-old CEO and Co-founder, Alexandr Wang, has guided Scale AI through four funding rounds, this latest one securing the company’s status as a bona fide unicorn.
With the help of its network of contractors, Scale AI provides an array of AI-destined services, broadly divided into the categories of computer vision and natural language processing (NLP).
Scale AI’s computer vision services include video content tagging and image categorization. This labeled video and image data is then used by Scale AI’s customers to improve their AI systems; for instance, to train self-driving cars and drones to “see” better by being able to understand and respond to different landscapes.
Under the umbrella of NLP, the company provides text classification, speech and voice transcription, and OCR (optical character recognition) transcription services. These, in turn, can be used in search relevance and e-commerce listing matching, for example.
The Scale AI website says its tasks are performed by humans with “additional layers of both human, data and machine learning driven quality control checks” — meaning they have found a way to semi-automate the data-labeling process.
On the Lionbridge Radar
While Lionbridge has only recently begun to aggressively compete in the space, Australia-listed Appen’s early bet has paid off handsomely. Since its IPO in 2015, Appen’s share price has skyrocketed by more than 4,000%, pushing its market cap to nearly USD 2bn. Appen CEO Mark Brayan is scheduled to speak at SlatorCon San Francisco on September 12, 2019.
Similar to Scale AI, Appen has built an army of crowd workers who manually sort and annotate data. Taking a shortcut on what Appen said was five years’ worth of tech development, the company, in March 2019, acquired data annotation platform Figure Eight for up to USD 300m.
Scale AI has gone straight to an Appen-plus-Figure-Eight model, combining Appen’s crowd power with Figure Eight-type automation capabilities.
AI support services share a number of similarities with the language industry in terms of organizational model (i.e., crowdsourcing) and processes (human-in-the-loop). But there is a fundamental conceptual difference between translation and data labeling. Translation has intrinsic value and typically serves its own specific purpose, be it marketing, compliance, or something else. By contrast, labeling data is a means to an end. It is used to train AI and is not a standalone product; hence the term “AI support service.”
Put simply, a translation is requested because a translation is required, while data labeling is requested because a better AI model is required. Granted, translation output can be, and is, used to train neural machine translation models, but this is not its primary function in the marketplace.