Spark and Python for Big Data with PySpark

Learn how to use Spark with Python, including Spark Streaming, Machine Learning, Spark 2.0 DataFrames and more!

What you’ll learn

  • Use Python and Spark together to analyze Big Data
  • Learn how to use the new Spark 2.0 DataFrame Syntax
  • Work on Consulting Projects that mimic real world situations!
  • Classify Customer Churn with Logistic Regression
  • Use Spark with Random Forests for Classification
  • Learn how to use Spark’s Gradient Boosted Trees
  • Use Spark’s MLlib to create Powerful Machine Learning Models
  • Learn about the Databricks Platform!
  • Get set up on Amazon Web Services EC2 for Big Data Analysis
  • Learn how to use AWS Elastic MapReduce Service!
  • Learn how to leverage the power of Linux with a Spark Environment!
  • Create a Spam filter using Spark and Natural Language Processing!
  • Use Spark Streaming to Analyze Tweets in Real Time!

Requirements

  • General Programming Skills in any Language (Preferably Python)
  • 20 GB of free space on your local computer (or, alternatively, a reliable internet connection for AWS)

Who this course is for:

  • Someone who knows Python and would like to learn how to use it for Big Data
  • Someone who is very familiar with another programming language and needs to learn Spark
