Apache Spark with Scala: Hands On with Big Data

Dive right in with 20+ hands-on examples of analyzing large data sets with Apache Spark, on your desktop or on Hadoop!

What you’ll learn

  • Frame big data analysis problems as Apache Spark scripts
  • Develop distributed code using the Scala programming language
  • Optimize Spark jobs through partitioning, caching, and other techniques
  • Build, deploy, and run Spark scripts on Hadoop clusters
  • Process continual streams of data with Spark Streaming
  • Transform structured data using Spark SQL, Datasets, and DataFrames
  • Traverse and analyze graph structures using GraphX
  • Analyze massive data sets with machine learning on Spark

Requirements

  • Some prior programming or scripting experience is required. A crash course in Scala is included, but you need to know the fundamentals of programming in order to pick it up.
  • You will need a desktop PC and an Internet connection. The course was created with Windows in mind, but users comfortable with macOS or Linux can use the same tools.
  • The software needed for this course is freely available, and I’ll walk you through downloading and installing it.

Who this course is for:

  • Software engineers who want to expand their skills into the world of big data processing on a cluster
  • Not for complete beginners: if you have no previous programming or scripting experience, take an introductory programming course first.
