In this course, you’ll learn how to use Spark to work with big data and build machine learning models at scale, including how to wrangle and model massive datasets with PySpark, the Python library for interacting with Spark. In the first lesson, you will learn about big data and how Spark fits into the big data ecosystem. In the second lesson, you will practice processing and cleaning datasets to get comfortable with Spark’s SQL and DataFrame APIs. In the third lesson, you will debug and optimize your Spark code when running on a cluster. In the fourth lesson, you will use Spark’s Machine Learning Library to train machine learning models at scale.
This resource is offered by an affiliate partner. If you pay for training, we may earn a commission to support this site.
The techniques and tools covered in this course are most similar to the requirements found in Data Engineer job advertisements.
This course is part of three structured learning paths.