Fingerprinting and AI automated tagging patent emerges
Richard Harris in Big Data, Thursday, January 24, 2019
Waterline Data's Fingerprinting combines big data analysis, machine learning and human curation to automatically catalog data and data lineage at scale, making it easier to discover data stored in data warehouses, cloud services and databases across the enterprise.
Waterline Data announced it has been granted a patent for key aspects of its unique Fingerprinting and automated tagging technology, which provides for faster and easier discovery of the vast amounts of data stored in data warehouses, cloud services and databases across the enterprise. Waterline Data Fingerprinting combines big data analysis, machine learning and human curation to automatically catalog data and data lineage at scale, reducing the manual tagging of data by over 80%, increasing the overall data inventory and lineage accuracy 10x, and reducing the people cost for manual tagging and inventorying of data by 90%.
Among the chief challenges in implementing a data catalog is populating it with useful information. Many organizations have a business glossary of defined terms but find that connecting this business metadata with technical metadata (which contains statistical demographics and the actual location where the data lives) is a costly, labor-intensive and highly error-prone process. While some catalogs attempt to accomplish this through crowdsourcing, that approach alone doesn’t scale to the high volume of data rapidly pouring into today’s petabyte-scale organizations.
Waterline Data addresses this challenge through its now-patented Fingerprinting technique. The driving force behind Waterline Data’s flagship AI-driven Data Catalog, Fingerprinting works on the concept that a column of data has a distinctive signature, or fingerprint, that incorporates its technical metadata, content, format, and context. By examining this signature, an AI system such as Waterline Data’s AI-driven Data Catalog can identify what the data is, determine which other columns share similar fingerprints, and connect the data to a business term or label for easy discovery and analysis. Built on Fingerprinting, Waterline Data’s product is, according to the company, the industry’s only data catalog to combine AI and machine learning with best-in-class crowdsourcing and big data scalability to meet today’s enterprise needs.
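To make the idea concrete, here is a minimal Python sketch of column fingerprinting. It is an illustration only and not Waterline Data's patented method: it reduces a fingerprint to the mix of value formats in a column, ignoring the technical metadata, content and context the company describes. The function names (`shape`, `fingerprint`, `similarity`, `suggest_tag`), the cosine-similarity comparison and the 0.8 threshold are all assumptions made for the example.

```python
import re
from collections import Counter
from math import sqrt


def shape(value: str) -> str:
    """Reduce a value to its format pattern: digits become 9, letters become A."""
    pattern = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", pattern)


def fingerprint(values):
    """Toy fingerprint: relative frequency of each format pattern in a column."""
    counts = Counter(shape(str(v)) for v in values)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


def similarity(fp_a, fp_b):
    """Cosine similarity between two fingerprints (1.0 means same format mix)."""
    dot = sum(w * fp_b.get(p, 0.0) for p, w in fp_a.items())
    norm_a = sqrt(sum(w * w for w in fp_a.values()))
    norm_b = sqrt(sum(w * w for w in fp_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_tag(values, tagged, threshold=0.8):
    """Return the label of the most similar curated column, or None."""
    fp = fingerprint(values)
    best_label, best_score = None, 0.0
    for label, ref_fp in tagged.items():
        score = similarity(fp, ref_fp)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


# Columns a human curator has already tagged with business terms (made-up data).
tagged = {
    "US phone number": fingerprint(["415-555-0100", "212-555-0199"]),
    "ISO date": fingerprint(["2019-01-24", "2018-12-31"]),
}

# An untagged column, including one dirty value.
unlabeled = ["650-555-0123", "408-555-0147", "n/a"]
print(suggest_tag(unlabeled, tagged))
```

In this sketch the human-curated labels seed the matching, and a new column inherits a label only when its fingerprint is close enough to a curated one; anything below the threshold is left untagged for a person to review.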
Details of the Fingerprinting and AI automated tagging patent
The patent publication, US 2015-0356123, is titled “Systems and Methods for Management of Data Platforms” and names Waterline Data founder and CTO Alex Gorelik as the inventor. Over his career, Gorelik has been granted more than 20 patents in data management.
“It has always been Waterline Data’s mission to deliver the fastest and most accurate big data discovery engine with the highest scalability in the industry,” said Waterline Data founder and CTO Alex Gorelik. “With Waterline Data’s patented Fingerprinting technology powering our data catalog, petabyte enterprises gain a competitive edge through the fast and accurate self-service discovery of complex data using modern analytic approaches. Allowing business users to easily access and analyze data means being able to put your data to work so that it’s delivering real value to the organization--now.”