7/25/2024 7:31:45 AM
Data analytics trends shared by Oxylabs
App Developer Magazine

Thursday, July 25, 2024

Richard Harris

We recently caught up with Rytis Ulys, Analytics Team Lead at Oxylabs, to discuss data analytics trends. He highlighted transformations including digitization, big data, and generative AI. He also emphasized precision, AI's role in data democratization, and evolving data privacy and security, along with strategies for harnessing big data and fostering a data-driven culture.

Rytis Ulys discusses future trends in data analytics and business intelligence, including digitization, the rise of big data, and the impact of generative AI models. These advancements are leading to data democratization, allowing non-specialists to engage with data analysis through tools like “text to SQL” products. Ulys also highlights the need for precision and understanding the business landscape, cautioning against over-reliance on AI-powered tools. He notes that generative AI and traditional AI can complement each other to enhance data-driven decision-making.

ADM: What key trends do you see shaping the future of data analytics in business intelligence?

Ulys: In a little more than a decade, data analytics went through several big transformations. First, it became digitized. Second, we witnessed the emergence of ‘big data’ analytics, driven partly by digitization and partly by massively improving storage and processing capabilities. Finally, in the last couple of years, analytics has been transformed once again by emerging generative AI models that can analyze data at a previously unseen scale and speed. Gen AI is becoming a data analyst’s personal assistant, taking over less exciting tasks from basic code generation to data visualization. 

I believe the key effect of generative AI - and the main future trend for data analytics - is data democratization. Recently, there’s been a lot of activity around “text to SQL” products that run queries written in natural language, meaning that people without a specialization in data science can dive deeper into data analysis.

Data analytics trends: Transformations, AI, and data democratization

However, we shouldn’t get carried away with the hype too quickly. Those AI-powered tools are not fully accurate, and their errors are harder for less experienced users to notice. The holy grail of analytics is precision combined with a nuanced understanding of the business landscape - skills that are impossible to automate unless we reach some sort of “general” AI.
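As a rough illustration of why generated queries need a checkpoint, here is a minimal Python sketch that accepts a query from a "text to SQL" tool only if it is a single read-only SELECT. The function names, the keyword list, and the demo `orders` table are assumptions made for this example and do not come from any specific product. A check like this catches only unsafe statements; it cannot tell whether the query answers the business question correctly, which still takes an experienced reviewer.

```python
import sqlite3

# Hypothetical deny-list of keywords that change data or schema.
FORBIDDEN = ("insert", "update", "delete", "drop", "alter", "create", "attach", "pragma")


def is_single_select(sql: str) -> bool:
    """Return True only for one statement that starts with SELECT and has no write keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # conservative: any inner semicolon is treated as a second statement
        return False
    lowered = statement.lower()
    if not lowered.startswith("select"):
        return False
    words = set(lowered.replace("(", " ").replace(")", " ").split())
    return not any(keyword in words for keyword in FORBIDDEN)


def run_generated_query(conn: sqlite3.Connection, sql: str) -> list:
    """Run model-generated SQL only after it passes the read-only check."""
    if not is_single_select(sql):
        raise ValueError("generated SQL rejected: only a single SELECT is allowed")
    return conn.execute(sql).fetchall()


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory table to run the example against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("EU", 120.0), ("EU", 80.0), ("US", 50.0)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    # Stand-in for the output of a text-to-SQL tool.
    generated = "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    print(run_generated_query(conn, generated))
```

The check is deliberately strict: it would also reject a harmless query whose string literal contains a semicolon, which seems an acceptable trade-off for users who cannot review SQL themselves.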

The second trend that is critical for business data professionals is the move towards an umbrella-like AI system capable of integrating sales, employee, finance, and product analytics into a single solution. It could bring immense business value through cost savings (ditching separate software) and also help with data democratization efforts.

The role of machine learning and AI in next generation data analytics

ADM: Can you elaborate on the role of machine learning and AI in next-generation data analytics for businesses?

Ulys: Generative AI somehow drew an artificial, arbitrary line between next-gen analytics (powered by Gen AI) and “legacy” AI systems (anything that came before Gen AI). In the public discourse around AI, people often miss the fact that “traditional” AI isn't an outdated legacy; Gen AI is intelligent only on the surface, and both fields are actually complementary.

In my previous answer, I highlighted the main challenges of using generative AI models for business data analytics. Gen AI isn’t, strictly speaking, intelligence - it is a stochastic technology functioning on statistical probability, which is its ultimate limitation. 

Increased data availability and innovative data scraping solutions were the main drivers behind the Gen AI “revolution”; however, further progress can’t be achieved by simply pouring in more data and computational power. To move towards a “general” artificial intelligence, developers will have to reconsider what “intelligence” and “reasoning” mean. Until this happens, there’s little possibility that generative models will bring anything more substantial to data analytics than they already have.

That said, I don’t mean there are no methods to improve generative AI accuracy and make it better at domain-specific tasks. A number of applications already do it. For example, guardrails sit between an LLM and users, ensuring the model provides outputs that follow the organization’s rules, while retrieval-augmented generation (RAG) is increasingly employed as an alternative to LLM fine-tuning. RAG is based on a set of technologies, such as vector databases (think Pinecone, Weaviate, Qdrant, Chroma, etc.), frameworks (LlamaIndex, LangChain), and semantic analysis and similarity search tools.
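To make the two ideas concrete, here is a minimal Python sketch of the retrieval step of RAG followed by a simple output guardrail. It is an illustration under loud assumptions: the word-count "embedding" stands in for a learned embedding model, the in-memory list stands in for a vector database, and `BLOCKED_TERMS` is a hypothetical organization rule. None of this reflects the API of the tools named above.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (assumption; real systems use a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: list, k: int = 1) -> list:
    """Similarity search: return the k documents closest to the question."""
    query_vector = embed(question)
    ranked = sorted(documents, key=lambda doc: cosine(query_vector, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: list, k: int = 1) -> str:
    """Augment the question with retrieved context before it is sent to a model."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


BLOCKED_TERMS = ("salary", "password")  # hypothetical organization rule


def guardrail(model_output: str) -> str:
    """Sit between the model and the user; withhold outputs that break the rule."""
    if any(term in model_output.lower() for term in BLOCKED_TERMS):
        return "This answer was withheld by policy."
    return model_output


DOCS = [
    "Refund requests are processed within 14 days of purchase.",
    "The analytics dashboard refreshes every night at 02:00 UTC.",
    "Support tickets are answered within one business day.",
]

if __name__ == "__main__":
    print(build_prompt("When does the analytics dashboard refresh?", DOCS))
    print(guardrail("The admin password is hunter2"))
```

The model call itself is left out on purpose: the sketch only shows what happens before the model sees the question (retrieval) and after it answers (the guardrail).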

How businesses effectively harness big data to gain actionable insights

ADM: How can businesses effectively harness big data to gain actionable insights and drive strategic decisions? 

Ulys: In today’s globalized digital economy, businesses cannot avoid data-driven decisions unless they operate in a very confined local market and are of limited size. To stay competitive, an increasing number of businesses are collecting not only consumer data from their owned channels but also publicly available information from the web for price intelligence, market research, competitor analysis, cybersecurity, and other purposes.

Up to a point, businesses might try to get away without data-backed decisions; however, when the pace of growth increases, companies that rely on gut feeling alone inevitably start lagging behind. Unfortunately, there are no universal approaches to harnessing data effectively that would suit all companies. Any business has to start from the basics: first, define the business problem; second, answer, very specifically, what kind of data might help to solve it. Over 75% of the data that businesses collect ends up as “dark data.” Thus, deciding what data you don’t need is no less important than deciding what data you need.

ADM: In what ways do you envision data visualization evolving in the context of business intelligence and analytics?

Ulys: Most data visualization solutions today have AI-powered functionalities that provide users with a more dynamic view and enhanced accuracy. Further, AI-driven automation also allows businesses to analyze patterns and generate insights from larger and more complex datasets while freeing analysts from mundane visualization tasks. 

I believe data visualization solutions will have to evolve towards more democratic and noob-friendly alternatives, bringing data insights beyond data teams and into sales, marketing, product, and client support departments. It is hard to tell, unfortunately, when we could expect such tools to arrive. Up until now, the focus of the industry hasn’t been on finding the single best visualization solution. There are many different tools available on the market, and they all have their advantages and disadvantages. 

The importance of data privacy and security in the era of advanced analytics

ADM: Could you discuss the importance of data privacy and security in the era of advanced analytics, and how businesses can ensure compliance while leveraging data effectively?

Ulys: Data privacy and security were no less important before the era of advanced analytics. However, the increased scale and complexity of data collection and processing activities also increased the risks related to data mismanagement and sensitive data leaks. Today, the importance of proper data governance cannot be overstated: mistakes can lead to financial penalties, legal liability, reputational damage, and consumer distrust.

In some cases, companies deliberately “cut corners” in order to cut costs or gain other business benefits, resulting in data mismanagement. In many cases, however, improper data conduct is unintentional. 

Take the example of Gen AI developers who need massive amounts of multifaceted data to train and test ML models. When collecting data at such a scale, it is easy for a company to miss that parts of these datasets contain personal data or copyrighted material that the company wasn’t authorized to collect and process. Even worse, getting consent from thousands of internet users who might be technically regarded as “copyright” owners is virtually impossible.

So, how can businesses ensure compliance? Again, it depends on the context, such as the company’s country of origin. US, UK, and EU data regimes are quite different, with the EU having the most stringent one. The newly released EU AI Act will definitely have an additional effect on data governance as it tackles both developers and deployers of AI systems within the EU. Although generative models fall in the low-risk zone, in certain cases, they might still be subject to transparency requirements, obliging developers to reveal the sources of data the AI systems have been trained on as well as data management procedures.

However, there are basic principles that apply to any company. First, companies must thoroughly evaluate the nature of the data they are planning to fetch. Second, more data doesn't equal better data - deciding which data brings added value for the business and omitting data that is excessive or unnecessary is the first step towards better compliance and fewer data management risks.
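The two principles above can be sketched as a small filtering step applied to each collected record. This is a minimal Python illustration, not a compliance tool: the allow-list `ALLOWED_FIELDS`, the field names, and the email pattern are assumptions chosen for the example, and a real pipeline would need a far more careful review of what counts as personal data.

```python
import re

# Hypothetical allow-list: the only fields the business problem actually needs.
ALLOWED_FIELDS = {"product", "price", "currency"}

# Rough pattern for email-like strings (an assumption; it will not catch every form).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def minimize(record: dict) -> dict:
    """Keep only allow-listed fields and mask email-like strings in text values."""
    kept = {}
    for key, value in record.items():
        if key not in ALLOWED_FIELDS:
            continue  # excessive or unnecessary data is dropped, not stored
        if isinstance(value, str):
            value = EMAIL_PATTERN.sub("[redacted]", value)
        kept[key] = value
    return kept


if __name__ == "__main__":
    scraped = {
        "product": "Widget, contact jane@example.com",
        "price": 9.99,
        "currency": "EUR",
        "reviewer_name": "Jane",
        "ip": "203.0.113.7",
    }
    print(minimize(scraped))
```

Using an allow-list rather than a deny-list means that a new, unexpected field is excluded by default until someone decides it adds value.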

ADM: How can businesses foster a culture of data-driven decision-making throughout their organizations?

Ulys: The first step is, of course, laying down the data foundation - building a Customer Data Platform (CDP) that integrates structured and cleaned data from the various sources the company uses. To be successful, such a platform must include no-code access to data for non-technical stakeholders, and this isn’t an easy task to achieve.

No-code access means that the chosen platform (or “solution”) must offer both a SQL interface for experienced data users and some sort of “drag and drop” function for beginners. At Oxylabs, we chose Apache Superset to advance our self-service analytics. However, no solution fits every company or has only pros and no cons. Moreover, these solutions require well-documented data modeling.
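One way to picture how a "drag and drop" layer can sit on top of the same data as a SQL interface is a small query builder driven by a documented data model. The Python sketch below is purely illustrative and is not how Apache Superset works internally; `DATA_MODEL`, the `sales` table, and the function names are assumptions invented for the example.

```python
import sqlite3

# Hypothetical documented data model: the choices a beginner may "drag and drop".
DATA_MODEL = {
    "table": "sales",
    "dimensions": {"region", "channel"},
    "metrics": {"revenue": "SUM(revenue)", "orders": "COUNT(*)"},
}


def build_query(dimension: str, metric: str) -> str:
    """Translate a dimension/metric selection into SQL, allowing only documented names."""
    if dimension not in DATA_MODEL["dimensions"]:
        raise ValueError(f"unknown dimension: {dimension}")
    if metric not in DATA_MODEL["metrics"]:
        raise ValueError(f"unknown metric: {metric}")
    expression = DATA_MODEL["metrics"][metric]
    return (
        f"SELECT {dimension}, {expression} AS {metric} "
        f"FROM {DATA_MODEL['table']} GROUP BY {dimension} ORDER BY {dimension}"
    )


def demo_connection() -> sqlite3.Connection:
    """Create a small in-memory table to run the example against."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, channel TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("EU", "web", 100.0), ("EU", "store", 50.0), ("US", "web", 70.0)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    # A beginner picks "region" and "revenue"; an experienced user could write the SQL directly.
    sql = build_query("region", "revenue")
    print(sql)
    print(conn.execute(sql).fetchall())
```

The builder can only be as good as the data model behind it, which is one reason the documentation requirement matters: every dimension and metric offered to beginners has to be defined somewhere first.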

When you have the necessary applications in place, the second big challenge is building data literacy and confidence among non-technical users. It requires proper training to ensure that employees handle data, interpret it, and draw insights correctly. Why is this a challenge? Because it is a slow process, and it will take time away from the data teams.

Fostering a data-driven culture isn’t a one-off project - to turn data into action, you will need a culture shift inside the organization, as well as constant monitoring and refinement efforts to ensure that non-technical employees feel confident about deploying data in everyday decisions. Management support and well-established cooperation between teams are key to making self-service analytics (or data democratization, as it is often called) work for your company.

About Rytis Ulys

Rytis Ulys has over eight years of experience in various analytical and consulting roles in both startup businesses and enterprise-grade organizations. Currently, he leads a team of seven data professionals at Oxylabs, a market-leading web intelligence acquisition platform. Rytis built one of the company’s core teams from scratch in just two years. As a thought leader, he covers topics ranging from data architecture and data engineering to advanced data modeling.
