IBM Releases Apache Spark Powered Data Science Collaborative Platform

Posted on Thursday, June 9, 2016 by RICHARD HARRIS, Executive Editor

IBM has announced a new cloud-based development environment for near-real-time, high-performance analytics. The new IBM Data Science Experience is an interactive, collaborative, cloud-based environment where data scientists can use multiple tools to derive insights from data. It is in limited preview, and IBM has created a waiting list for individuals interested in accessing the platform.

Available on the IBM Cloud Bluemix platform, the Data Science Experience provides 250 curated data sets, open source tools and a collaborative workspace to help data scientists uncover and share insights with developers, making it easier to rapidly develop applications that are infused with intelligence.

The new offering is part of IBM’s investment in developing Apache Spark as a type of “analytics operating system.” IBM created the Data Science Experience to extend the speed and agility of the analytical process through new contributions to SparkR, SparkSQL and Apache SparkML. As a result, data scientists who work in R will have faster access to more data. 

The Data Science Experience’s environment allows data scientists to accelerate and simplify data ingestion, curation and analysis by bringing together content, data, models, and open source resources from IBM and others, including H2O, RStudio, and Jupyter Notebooks on Apache Spark, in a single security-rich managed environment.

IBM is collaborating with data science organizations including Galvanize, H2O.ai, Lightbend and RStudio to promote an integrated and unified data science ecosystem. IBM is also joining the R Consortium to help accelerate data science’s readiness for the enterprise.

In addition to developing open source capabilities, IBM is adding new features and APIs, including:


- Sparkling.Data: Data scientists typically spend most of their time cleaning and preparing data for analysis. IBM created a library that discovers the different file types in a data source and, by default, returns a data frame loaded with data from the file type that occurs most often. It can be used to infer the schema, discover data types, profile data sets, view range and distribution, reveal and fix bad data, and more.

- Prescriptive Analytics: The Decision Optimization CPLEX Modeling library (DOcplex) includes modeling packages for Mathematical Programming and Constraint Programming.

- Shiny: Data scientists typically create visualizations to share their analysis with others. IBM includes Shiny in the IBM Data Science Experience to provide the ability to create interactive analytic web applications without coding any HTML, CSS, or JavaScript - only R.

- Data Connections: From the Notebook interface, users can set up data connections to Bluemix data services like Cloudant or dashDB or to on-premises or external services.

- Schedule Jobs: The Notebook interface provides the ability to schedule jobs to run periodically.
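
The file-type discovery described for Sparkling.Data above can be illustrated with a minimal sketch in plain Python. This is not the Sparkling.Data API, which the announcement does not detail. The function names `most_common_file_type` and `select_dominant_files` are hypothetical, and the sketch assumes a file's type can be read from its extension. It only shows the idea of picking the file type that occurs most often and keeping the files of that type, which a real library would then load into a data frame.

```python
from collections import Counter
from pathlib import Path


def file_extension(path):
    """Return the lowercased extension of a path without the dot ('' if none)."""
    return Path(path).suffix.lower().lstrip(".")


def most_common_file_type(paths):
    """Return the extension that occurs most often among paths, or None.

    Hypothetical helper for illustration; files without an extension are ignored.
    """
    counts = Counter(ext for ext in map(file_extension, paths) if ext)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def select_dominant_files(paths):
    """Keep only the files whose type is the most common one (hypothetical helper)."""
    dominant = most_common_file_type(paths)
    if dominant is None:
        return []
    return [p for p in paths if file_extension(p) == dominant]


if __name__ == "__main__":
    # Made-up file names, used only to exercise the sketch.
    sample = ["sales/jan.csv", "sales/feb.CSV", "sales/notes.json", "sales/README"]
    print(most_common_file_type(sample))
    print(select_dominant_files(sample))
```

In this sketch the dominant type is chosen purely by count. A production tool would likely also inspect file contents rather than trust extensions.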

IBM has contributed to related projects including Apache Toree, EclairJS, Apache Quarks, Apache Mesos, and Apache Tachyon (now called Alluxio), and has made major contributions to the Apache Spark sub-projects SparkSQL, SparkR, MLlib, and PySpark, with over 3,000 total contributions in the last year.

In addition, IBM has built Spark into the core of its platforms, including Watson, Commerce, Analytics, Systems, and Cloud, as well as into more than 30 offerings including IBM BigInsights for Apache Hadoop, IBM Analytics on Apache Spark, Spark with Power Systems, Watson Analytics, SPSS Modeler and IBM Stream Computing. In 2015, IBM also open-sourced its SystemML machine learning technology to advance Spark’s machine learning capabilities.

