Optimizing Developers’ Time With Adaptive Databases
Richard Harris in Analytics, Wednesday, March 23, 2016
We recently took a deep dive into adaptive databases with Chad Jones, the Chief Strategy Officer at Deep Information Sciences, whose deepSQL platform offers an adaptive, application-aware database that unifies operational transactions and real-time analytics while using machine learning to automatically adapt to application demands at cloud scale.
ADM: How do database limitations affect developers and their applications?
Jones: Developers have been using the LAMP stack as the foundation for building first-generation production applications for quite some time. It’s easy, familiar and enables you to quickly get to the core of application creation.
However, the “M” portion of the stack – MySQL Relational Database – often causes problems when applications begin to scale. Because MySQL, like many other databases, was built on tree-structure science that was developed in the 1970s for that era’s limited computing and data requirements, it is inherently unable to perform with the speed and flexibility required for today’s increasingly large-scale realities.
As a result, just when your application is gaining traction and getting rapid adoption, it begins to slow down, getting sluggish, hiccupping or even crashing. Taking the application down to retool or rearchitect it with the aim of better handling the latest demands isn’t really feasible, or necessarily smart, in our 24x7 economy, but often it seems to be the only choice.
ADM: How have database vendors and developers tried to circumvent these limitations?
Jones: A number of vendors have created alternative solutions, such as NoSQL and Hadoop, to get around MySQL’s performance-at-scale problem. However, these solutions require trade-offs in terms of ease-of-development and functionality. They give up the relational functionality that many applications require in order to get incremental performance/scale. Because of this, the application itself has to handle the relating of data, which means developers have to build relational capabilities into their applications.
Whereas relational databases are purpose-built to maintain relationships between complex data, now developers have to rewrite all of that functionality and make sure it’s durable. Not only does this require much more time and skill to develop first-gen applications, but as you add functionality to the software over time, you have to write code that can reliably handle increasingly complex data entanglements.
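To make that trade-off concrete, here is a minimal Python sketch that computes the same per-user order totals twice: once with a SQL JOIN (using the standard-library sqlite3 module as a stand-in relational engine) and once by relating the records by hand in application code. The tables, sample rows and function names are invented for illustration and do not come from the interview.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
users = [(1, "ada"), (2, "grace")]
orders = [(10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0)]


def totals_with_sql(users, orders):
    """Let a relational engine relate the two tables with a JOIN."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "user_id INTEGER REFERENCES users(id), "
        "amount REAL)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    rows = conn.execute(
        "SELECT u.name, SUM(o.amount) "
        "FROM users u JOIN orders o ON o.user_id = u.id "
        "GROUP BY u.name ORDER BY u.name"
    ).fetchall()
    conn.close()
    return dict(rows)


def totals_in_application(users, orders):
    """Relate the same data by hand, as an application must when its
    data store offers no joins."""
    names = {user_id: name for user_id, name in users}
    totals = {}
    for _order_id, user_id, amount in orders:
        # The application now owns this lookup, including the decision
        # about what to do when an order points at a missing user.
        name = names[user_id]
        totals[name] = totals.get(name, 0.0) + amount
    return totals
```

Both functions return the same totals for this sample data. The difference is who owns the relationship logic: in the second version every new relationship, integrity rule or aggregate becomes application code that has to be written, tested and maintained.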
ADM: What is an adaptive relational database and how does it differ from traditional databases?
Jones: Traditional databases, whether built on B-trees, B+ trees, LSM trees, etc., have severe limitations related to scale, flexibility and performance. While some more modern databases have tried to enhance these capabilities by either rejecting key features from relational databases (NoSQL) in favor of incremental speed or moving databases into a faster hardware medium (in-memory databases), none have reimagined the underlying science.
They all still use the same basic math. As a result, they either write better than they read or vice versa; none can do both well concurrently. All of them also require the database, and therefore the application, to go offline for continual manual tuning in an attempt to improve performance.
In addition, because traditional databases are optimized for either analytics or transactions, developers often have to split workloads among multiple databases to be able to deliver the full functionality. This results in data ETL (extract, transform, load) lag times that can dramatically slow down both database and application performance.
Adaptive databases are built on a fundamentally new science that uses machine learning to autonomically and continually optimize performance at any scale, without ever going offline and while retaining relational functionality. Purpose-built for processing data in real-time, they leverage a new real-time kernel that ensures the highest parallel processing capacity, enabling CPU and memory to be added on-the-fly, and delivering scalability without service interruption.
Adaptive databases also feature a new way to organize data using Dynamic Virtual Data Representation and Summarization, which allows the machine learning engine to predictively reorganize data in memory based on the types of workloads. They read and write at extreme speed, and perform transactions and analytics on the same data set with no ETLs.
The result is automated, DBA-less tuning, extreme scalability to hundreds of billions of rows, and blazing speed at all times, without application changes. With adaptive databases, applications can handle exponential data growth and data surges without slowing down or interrupting business.
ADM: What role does machine learning play in an adaptive database?
Jones: Consider this: There are 1016 possible configuration choices within MySQL. Typically, a database is tuned for one workload type, one scenario. Without machine learning, you have to take databases offline for trial-and-error configuring (a.k.a. ‘guesstimate-tuning’) or to rebuild them to accommodate application changes. And you’re guaranteed to never be optimally tuned because your changes, by necessity, lag behind real-time conditions. Plus, in a 24x7 always-on world, the last thing you want to do is take anything offline.
Machine learning is the secret sauce that gives you a continuous, always-optimized environment, without any human effort. Machine learning understands workloads, machines, type of information, and how to best optimize the database on-the-fly without taking it offline and without a time-consuming trial-and-error process.
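As a rough illustration of the difference between offline trial-and-error and continuous tuning, the Python sketch below keeps measuring candidate settings while the workload runs and gravitates toward the one with the lowest average latency. It uses a simple epsilon-greedy strategy chosen purely for illustration; it is not a description of how deepSQL or any other product actually tunes itself, and the function name, the candidate buffer sizes and the latency function are all hypothetical.

```python
import random


def tune_online(configs, measure_latency, rounds=200, explore=0.1, seed=0):
    """Pick the configuration with the lowest observed average latency.

    configs: list of hashable candidate settings (hypothetical values).
    measure_latency: callable returning the latency observed for a setting.
    """
    if rounds < len(configs):
        raise ValueError("need at least one round per configuration")
    rng = random.Random(seed)
    totals = {c: 0.0 for c in configs}
    counts = {c: 0 for c in configs}

    def average(c):
        return totals[c] / counts[c]

    for step in range(rounds):
        if step < len(configs):
            choice = configs[step]            # try every setting once
        elif rng.random() < explore:
            choice = rng.choice(configs)      # occasionally re-explore
        else:
            choice = min(configs, key=average)  # otherwise use the best so far
        totals[choice] += measure_latency(choice)
        counts[choice] += 1

    return min(configs, key=average)


def simulated_latency(buffer_mb):
    """Stand-in for a real measurement: latency is lowest at 512 MB."""
    return abs(buffer_mb - 512) + 5.0


best = tune_online([128, 256, 512, 1024], simulated_latency)
```

With the simulated latency function above, `best` ends up as 512. In a real system the measurement would be noisy and the workload would shift over time, which is why the loop keeps exploring instead of stopping after a single pass.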
It analyzes workflows during ingest to understand how information is changing and predict how it may change in the future. Relevant information is kept hot in memory and non-relevant information cold on disk, where cold data is continually defragmented and/or compressed. If it looks like future writes/reads will not achieve optimal performance, the database begins orchestrating machine resources (e.g., processors, memory and storage) and reorganizing. There is no need for manual intervention, ever.
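The hot/cold split described above can be sketched in a few lines of Python. This toy version uses a fixed least-recently-used rule to decide what stays hot, which is a much simpler heuristic than the predictive machine learning Jones describes; the class name and the use of a plain dict as a stand-in for disk are assumptions made for illustration.

```python
from collections import OrderedDict


class HotColdStore:
    """Toy two-tier store: a bounded 'hot' tier in memory and a 'cold'
    dict standing in for disk. Eviction is least-recently-used."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()
        self.cold = {}

    def put(self, key, value):
        self.cold.pop(key, None)      # a fresh write supersedes any cold copy
        self.hot[key] = value
        self.hot.move_to_end(key)     # mark as most recently used
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        value = self.cold.pop(key)    # raises KeyError for unknown keys
        self.hot[key] = value         # promote on access
        self._evict()
        return value

    def _evict(self):
        # Demote the least recently used entries until the hot tier fits.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value


store = HotColdStore(hot_capacity=2)
store.put("a", 1)
store.put("b", 2)
store.put("c", 3)           # "a" is demoted to the cold tier
value = store.get("a")      # "a" is promoted again, "b" is demoted
```

A predictive system would replace the eviction rule with a model of which data is likely to be needed next, but the basic movement of data between tiers is the same.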
ADM: How does an adaptive database help developers?
Jones: Developers benefit in several important ways. First, you don’t need special database skills or DBA (database administrator) training in order to get optimally performing applications since there’s no need to recalibrate database configurations to meet changing conditions – the tuning is all done autonomically.
In addition, it’s much easier to write applications and add new features since all the complexity around relating data is already built into the adaptive database – there’s no need to code it into the application. Also, because a single adaptive database can handle both transactions and analytics in real-time with no ETLs, your applications don’t have to interface with multiple databases and you don’t have to deal with sluggish application performance. All of this also makes it easy to scale applications to the highest levels.
ADM: What kinds of business challenges do adaptive databases solve?
Jones: Today’s on-demand economy requires lightning speed, adaptability and 24/7 availability. But traditional databases, which are the heart of applications, get in the way, slowing everything down.
With the amount of data doubling or, by some accounts, tripling year-over-year, and with data and traffic surges hitting applications with much greater frequency, it’s increasingly difficult to deliver high performance. Often, businesses don’t even realize they have performance issues until it’s too late – when surges hit and problems are exposed and exacerbated. When applications become unavailable or sluggish, then user experience and brand image suffer, business slows and revenue is often lost.
With adaptive databases, performance is never a problem. Adaptive databases consistently speed complex queries by 2-7x, while boosting data ingest up to 50x and transactions by 64x, over traditional databases. They perform at these high levels while scaling into hundreds of billions of rows, and have even been proven at 1.2 trillion rows.
All of which means your applications deliver optimal performance, and continuous business, at any scale, even when the inevitable data and traffic tsunamis hit.
ADM: How does an adaptive database facilitate key trends like the IoT?
Jones: The Internet of Things is so exciting because it can enable applications, and businesses, to manipulate and learn from connected devices in ways that improve those devices, enhance the user experience and generate new revenue streams.
In order for applications to react in real-time to physical-world stimuli, you need to be able to rapidly ingest and instantly perform analytics on live data as it’s collected from hundreds of thousands to millions of devices. However, since most databases can’t read and write well at the same time, you have to ETL between databases, which slows analytics and reaction time considerably. Their limited scalability also hurts the equation.
Adaptive databases, on the other hand, handle both reads and writes at high performance. They are purpose-built to meet the demands of massive data ingestion while instantly performing analytics on the same operational dataset. This eliminates time-consuming ETLs, enables real-time responses to connected device events, and accelerates time-to-insight.
ADM: Which industries benefit the most from an adaptive database?
Jones: Adaptive databases benefit businesses in a variety of industries, from travel and ecommerce to research and financial services.
For instance, performing genome studies requires a huge amount of data, massive ETLs and days of analytic processing time to get results. With adaptive databases, a seven-day research process can be reduced to just 15 hours, enabling researchers to get answers faster than ever before. In the case of the Ruder Boskovic Institute, their database was able to ingest, clean, query and pre-process genes 600% faster using 40% fewer computing resources, with no DBA experts on hand.
Similarly, GEMServers, a managed WordPress hoster, was able to deliver 200% faster page loads and 39x faster transactions, both of which are critical to its ecommerce customers. GEMServers also cut bandwidth requirements and database size significantly, making scaling easy and eliminating the need to charge customers an additional 12-20% for Google Cloud Platform-related costs.
In the travel industry, response times for complex inquiries often slow to a crawl, especially during busy holidays, causing frustrated site visitors, shopping cart abandonment and lost business. Adaptive databases dynamically meet seasonal travel surges and reduce attrition thanks to page load times that are often 7x faster.
ADM: How has Deep contributed to the innovation of adaptive databases, and how will it continue to do so in the future?
Jones: Deep invented adaptive database technology by reimagining the fundamental science underlying databases and applying machine learning to a relational world. Today, we’re focused on leveraging this for individual databases and extending deepSQL’s scale-up ability with enhanced scale-out capabilities.
Our vision takes this to the next step, creating an adaptive data fabric that extends autonomic scalability and elastic balance across geographically dispersed data sets, to handle the changing demands of applications as they occur, in real-time – no database expertise or special tuning required.
Read more: https://deepis.com