Apache Spark 2.0.0 Release Doubles Down On Big Data

Posted 7/29/2016 3:03:13 PM by RICHARD HARRIS, Executive Editor

Apache Spark 2.0.0 is the first release on the 2.x line and the first major release of open source Spark since Spark 1.6, which shipped in early 2016. The major updates in the release include API usability improvements, SQL:2003 support, performance improvements, structured streaming, R UDF support, and operational improvements. In addition, this release includes over 2500 patches from over 300 contributors.

Databricks, a company founded by the team that created Apache Spark, quickly followed the announcement of the general availability of Apache Spark 2.0 with its own announcement of its platform’s compatibility with 2.0.

As noted by the Databricks team, some of the leading features of the Apache Spark 2.0 release include:

- Speed: Running 5 to 10 times faster than Spark 1.6 for some Spark operators, thanks to Tungsten's Phase 2 whole-stage code generation and Catalyst's code optimization.

- Simplicity: Unifying developer APIs, such as DataFrames and Datasets, across Spark's libraries.

- Structured Streaming: Laying the foundation for continuous applications by providing high-level declarative streaming APIs based on DataFrames and Datasets, built atop the Spark SQL engine and working on real-time data.

- Machine Learning Model Persistence: Saving and loading pipelines and models across all programming languages supported by Spark.

- DataFrame-based Machine Learning APIs: Emerging as the primary MLlib package with its "pipeline" APIs and focusing future development on the DataFrame-based API.

- Standard SQL Support: Expanding Spark's SQL capabilities with SQL:2003 features, introducing a new ANSI SQL parser, and supporting scalar and predicate subqueries.
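
For readers unfamiliar with the subquery terms in the last item, here is a minimal sketch of what "scalar" and "predicate" subqueries look like in standard SQL. It is not Spark code: it uses Python's built-in sqlite3 module as a stand-in SQL engine so it runs without a cluster, and the table and column names are hypothetical. Similar query shapes are what the release notes describe for Spark SQL, though exact Spark behavior is not verified here.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in SQL engine (assumption:
# this is NOT Spark). Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('ann', 'eng', 120), ('bob', 'eng', 100),
        ('cat', 'ops', 90),  ('dan', 'ops', 70);
    CREATE TABLE on_call (dept TEXT);
    INSERT INTO on_call VALUES ('ops');
""")

# Scalar subquery: the inner query yields a single value (the average salary)
# that the outer query compares against.
scalar_rows = conn.execute("""
    SELECT name FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
""").fetchall()

# Predicate subquery with IN: the inner query yields a set of values to test
# membership against.
in_rows = conn.execute("""
    SELECT name FROM employees
    WHERE dept IN (SELECT dept FROM on_call)
    ORDER BY name
""").fetchall()

# Predicate subquery with EXISTS: the inner query is correlated with the
# outer row and only checked for the presence of at least one match.
exists_rows = conn.execute("""
    SELECT e.name FROM employees e
    WHERE EXISTS (SELECT 1 FROM on_call o WHERE o.dept = e.dept)
    ORDER BY e.name
""").fetchall()

print(scalar_rows)
print(in_rows)
print(exists_rows)
```

In this sketch the scalar subquery selects employees paid above the overall average, while the IN and EXISTS forms both select employees whose department appears in the on_call table.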

Read More https://databricks.com/blog/2016/07/26/introducing...


About the author: RICHARD HARRIS, Executive Editor

As the Publisher and Editor for App Developer Magazine, Richard has several industry recognitions and endorsements from tech companies such as Microsoft, Apple and Google for accomplishments in the mobile market. He was part of the early Google AFMA program, and was also involved in the foundation of Google TV. He has been developing for mobile since 2003 and serves as CEO of Moonbeam Development, a mobile app company with 200 published titles in various markets throughout the world. Richard is also the founder of LunarAds, a mobile cross-promotion and self-serve mediation network for developers. He has been a featured presenter at trade shows and conferences, and stays active with new projects relating to mobile development.
