Spark and Python for Big Data with PySpark

BY
Udemy

Learn the skills for using Python, Spark Streaming, Machine Learning, and Spark 2.0 DataFrames with Spark and Python for Big Data with PySpark.

Mode

Online

Fees

₹ 599  ₹4,099

Quick Facts

Particulars: Details
Medium of instruction: English
Mode of learning: Self study
Mode of delivery: Video and text based

Course overview

The Spark and Python for Big Data with PySpark online certification is designed to develop one of the most in-demand and valuable skills in technology: big data analytics with Apache Spark, which is widely used at technology firms including Google, Meta, Netflix, Amazon, and Airbnb. Spark is up to 100 times faster and more efficient than Hadoop MapReduce, which has led to strong growth in demand for skilled professionals. Because the Spark 2.0 DataFrame framework is relatively new, individuals skilled in it are considered among the most valuable professionals in the industry.

The Spark and Python for Big Data with PySpark online course is a short-term programme developed by Jose Portilla, Head of Data Science at Pierian Data Inc., and offered by Udemy Inc., a US-based online learning platform that provides courses for both amateurs and professionals.

The Spark and Python for Big Data with PySpark syllabus includes topics such as the fundamentals of Spark DataFrames and optimization of Spark 2.0 syntax. The course also covers the MLlib machine learning library with DataFrame syntax, Spark SQL, Spark Streaming, and Gradient Boosted Trees, which learners study through 10+ hours of video content, articles, and downloadable materials.

The highlights

  • Certificate of completion
  • Self-paced course
  • English videos with multi-language subtitles
  • 10.5 hours of pre-recorded video content
  • Online course
  • 30-day money-back guarantee 
  • Unlimited access
  • Accessible on mobile devices and TV

Program offerings

  • Certificate of completion
  • Self-paced course
  • English videos with multi-language subtitles
  • 10.5 hours of pre-recorded video content
  • 4 articles
  • 4 downloadable resources
  • 30-day money-back guarantee
  • Unlimited access
  • Accessible on mobile devices and TV

Course and certificate fees

Fees information: ₹ 599  ₹4,099

Certificate availability: Yes

Certificate providing authority: Udemy

What you will learn

Machine learning, Knowledge of Python

After completing the Spark and Python for Big Data with PySpark certification course, learners will know how to use Python and Spark together to analyse big data, work with Spark 2.0 DataFrame syntax, perform classification in Spark with Random Forests, and use Logistic Regression to predict customer churn. They will also learn to develop machine learning models with Spark's MLlib, use the AWS Elastic MapReduce service, set up Amazon Web Services to analyse big data, and build spam filters using Natural Language Processing and Spark.

The syllabus

Introduction to Course

  • Introduction
  • Course Overview
  • Frequently Asked Questions
  • What is Spark? Why Python?

Setting up Python with Spark

  • Set-up Overview
  • Note on Installation Sections

Databricks Setup

  • Recommended Setup
  • Databricks Setup

Local Installation VirtualBox

  • Local Installation VirtualBox Part 1
  • Local Installation VirtualBox Part 2
  • Setting up PySpark

AWS EC2 PySpark Set-up

  • AWS EC2 Set-up Guide
  • Creating the EC2 Instance
  • SSH with Mac or Linux
  • Installations on EC2

AWS EMR Cluster Setup

  • AWS EMR Setup

Python Crash Course

  • Introduction to Python Crash Course
  • Jupyter Notebook Overview
  • Python Crash Course Part One
  • Python Crash Course Part Two
  • Python Crash Course Part Three
  • Python Crash Course Exercises
  • Python Crash Course Exercise Solutions

Spark DataFrame Basics

  • Introduction to Spark DataFrames
  • Spark DataFrame Basics
  • Spark DataFrame Basics Part Two
  • Spark DataFrame Basic Operations
  • Groupby and Aggregate Operations
  • Missing Data
  • Dates and Timestamps

Spark DataFrame Project Exercise

  • DataFrame Project Exercise
  • DataFrame Project Exercise Solutions

Introduction to Machine Learning with MLlib

  • Introduction to Machine Learning and ISLR
  • Machine Learning with Spark and Python with MLlib

Linear Regression

  • Linear Regression Theory and Reading
  • Linear Regression Documentation Example
  • Regression Evaluation
  • Linear Regression Example Code Along
  • Linear Regression Consulting Project
  • Linear Regression Consulting Project Solutions

Logistic Regression

  • Logistic Regression Theory and Reading
  • Logistic Regression Example Code Along
  • Logistic Regression Code Along
  • Logistic Regression Consulting Project
  • Logistic Regression Consulting Project Solutions

Decision Trees and Random Forests

  • Tree Methods Theory and Reading
  • Tree Methods Documentation Examples
  • Decision Trees and Random Forest Code Along Examples
  • Random Forest - Classification Consulting Project
  • Random Forest Classification Consulting Project Solutions

K-means Clustering

  • K-means Clustering Theory and Reading
  • KMeans Clustering Documentation Example
  • Clustering Example Code Along
  • Clustering Consulting Project
  • Clustering Consulting Project Solutions

Collaborative Filtering for Recommender Systems

  • Introduction to Recommender Systems
  • Recommender System - Code Along Project

Natural Language Processing

  • Introduction to Natural Language Processing
  • NLP Tools Part One
  • NLP Tools Part Two
  • Natural Language Processing Code Along Project

Spark Streaming with Python

  • Introduction to Streaming with Spark!
  • Spark Streaming Documentation Example
  • Spark Streaming Twitter Project - Part One
  • Spark Streaming Twitter Project - Part Two
  • Spark Streaming Twitter Project - Part Three

Bonus

  • Bonus Lecture

Instructors

Mr Jose Portilla
Head of Data Science
Udemy

Other Bachelors, M.S
