In 2019, LinkedIn’s engineering team announced DataHub, a metadata tool it had built to help it organize, search and discover insights from its vast data trove. In 2020, LinkedIn open sourced it. Now, a startup co-founded by one of the creators of DataHub and by a former senior engineer from Airbnb who helped build that company’s Dataportal is coming out of stealth, backed by LinkedIn among others, to usher the DataHub platform into its latest chapter: commercialization.
Acryl Data, as the company is called, is launching today with $9 million in funding led by 8VC, with LinkedIn and Insight Partners also participating, to help other companies use the tools for their own big data needs.
The impetus for Acryl Data comes from the fact that big data, and specifically being able to organize, understand and make the most of fragmented big data troves (with information coming from and living in multiple places, be it Snowflake, Databricks, Looker or something else altogether), is a challenge that affects any organization with a large digital component in its operations. Traditionally, big tech companies have been some of the more innovative in addressing that issue, with a number of them also open sourcing their technology to make it usable by others.
The breakthrough for the founders of Acryl — discovered before they started the company, when they were still working at their respective big tech companies — was the realization that metadata held the key to organizing that big data information.
“The interesting part about metadata is actually that it has become a big data problem,” said Shirshanka Das, the LinkedIn alum who is CEO and co-founded the company with Swaroop Jagadish (the Airbnb alum who is CTO). “And so all of the data infrastructure DNA that we have, in terms of building large-scale data collections, streaming, indexing, searching — those all need metadata management solutions that can actually scale to the demands of the modern enterprise. That, I think, is really our secret sauce, that we’ve been able to build a metadata platform that takes in all the best practices of doing data infrastructure right, and applied it to doing metadata infrastructure.”
As an open source project, DataHub has picked up some significant traction. In addition to LinkedIn itself, Expedia, Saxo Bank, Klarna and many others are using the framework (essentially a generalized metadata search and discovery tool) to build their own metadata graphs that connect their various data entities together. Altogether the project has racked up over 3,200 GitHub stars and has more than 100 contributors.
Acryl Data, like other open source commercialization efforts, is setting out to build a toolset that will make that framework easier to scale and apply in more use cases, particularly at those companies that might lack the resources to build these implementations on their own. The first of these, it says, will be a data catalogue that is based on design learnings from Airbnb’s Dataportal. LinkedIn will be collaborating with Acryl Data, along with the wider open source community, on future products.
“LinkedIn’s unique view of the global economy provides us with the opportunity to improve economic outcomes for hundreds of millions of people around the world through data-driven insights and AI-powered products. To discover the right data, navigate the tens of thousands of derived datasets that our researchers and engineers use every day and manage them well, we rely on DataHub,” said Igor Perisic, Chief Data Officer at LinkedIn, in a statement. “We are excited to partner with Acryl Data to continue to advance DataHub with them.”
The opportunity is a big one. Collibra, a competitor in the same space, last year raised a round at a $2.3 billion valuation. Another, Alation, was valued at $1.2 billion earlier this month. But with a lot of space for innovation left, it’s interesting to see the people who built some of the most foundational tools in the space getting stuck in as entrepreneurs themselves to meet the challenge.
“The modern data stack needs a fundamental rethink in how metadata is managed,” said Insight Partners MD George Mathew, in a statement. “We believe a next-generation, real-time metadata platform is needed, and Acryl Data is the best team to lead this transformation based on their groundbreaking work with DataHub.”