Apache Spark 3 for Data Engineering & Analytics with Python

BY
Udemy

Gain a thorough understanding of Python and Spark 3's functionalities for data engineering and data analytics.

Mode

Online

Fees

₹ 499  ₹ 2,799

Quick Facts

Medium of instruction: English
Mode of learning: Self-study
Mode of delivery: Video and text based

Course overview

The Apache Spark 3.0.0 release begins the 3.x series. Apache Spark 3.0 builds on many of the technological advances made in Spark 2.x, introducing new concepts while continuing long-term projects already in development. The Apache Spark 3 for Data Engineering & Analytics with Python certification course was designed by David Charles Academy, Senior Big Data Engineer & Consultant at ABN AMRO, and is available on Udemy for individuals interested in learning how to use Apache Spark for data engineering and data analytics with Python.

The Apache Spark 3 for Data Engineering & Analytics with Python online course includes more than 8 hours of prerecorded lectures, supported by 12 downloadable resources and 4 articles, aimed at giving individuals a deeper understanding of managing data across a cluster using Spark. The course covers data analytics, Spark transformations, Spark execution, data engineering, and data visualization, as well as analytical processing strategies deployed across large data clusters.

The highlights

  • Certificate of completion
  • Self-paced course
  • 8.5 hours of pre-recorded video content
  • 4 articles 
  • 12 downloadable resources

Program offerings

  • Online course
  • Learning resources
  • 30-day money-back guarantee
  • Unlimited access
  • Accessible on mobile devices and TV

Course and certificate fees

Fees information
₹ 499  ₹ 2,799

Certificate availability

Yes

Certificate providing authority

Udemy

What you will learn

  • Knowledge of Python
  • Knowledge of data visualization
  • Knowledge of Apache Spark

After completing the Apache Spark 3 for Data Engineering & Analytics with Python online certification, individuals will understand the principles of Apache Spark and know how to use Spark 3 and Python for data engineering and data analytics operations. The course covers the fundamentals of Spark SQL, Spark transformations, Spark actions, Spark execution, the Spark DataFrame API, and the Spark Web UI. Individuals will learn about resilient distributed datasets (RDDs) and their APIs, and will acquire the skills to interpret the Spark Web UI and the directed acyclic graphs (DAGs) that describe Spark execution. They will also learn strategies for visualizing data on Databricks, including dashboards and graphs.

The syllabus

Introduction to Spark and Installation

  • Introduction
  • The Spark Architecture
  • The Spark Unified Stack
  • Java Installation
  • Hadoop Installation
  • Python Installation
  • PySpark Installation
  • Install Microsoft Build Tools
  • Mac OS - Java Installation
  • Mac OS - Python Installation
  • Mac OS - PySpark Installation
  • Mac OS - Testing the Spark Installation
  • Install Jupyter Notebooks
  • The Spark Web UI
  • Section Summary

Spark Execution Concepts

  • Section Introduction
  • Spark Application and Session
  • Spark Transformations and Actions Part 1
  • Spark Transformations and Actions Part 2
  • DAG Visualisation

RDD Crash Course

  • Introduction to RDDs
  • Data Preparation
  • Distinct and Filter Transformations
  • Map and Flat Map Transformations
  • SortByKey Transformations
  • RDD Actions
  • Challenge - Convert Fahrenheit to Centigrade
  • Challenge - XYZ Research
  • XYZ Research
  • Challenge - XYZ Research Part 1
  • Challenge - XYZ Research Part 2
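
To give a sense of what the "Convert Fahrenheit to Centigrade" challenge in this section might involve, here is a minimal sketch of a map-style conversion. The sample readings and the function name are hypothetical and are not taken from the course. Plain Python stands in for Spark so the snippet runs without a cluster, and the comment shows how the same function could be passed to an RDD `map` transformation.

```python
def fahrenheit_to_centigrade(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Centigrade (Celsius)."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Hypothetical sample readings; the course's actual data set is not shown here.
readings_f = [32.0, 212.0, 98.6]

# Plain-Python stand-in for an RDD map transformation followed by an action.
# Assuming a SparkContext named `sc`, the PySpark equivalent would be:
#   sc.parallelize(readings_f).map(fahrenheit_to_centigrade).collect()
readings_c = [round(fahrenheit_to_centigrade(t), 1) for t in readings_f]

print(readings_c)  # prints [0.0, 100.0, 37.0]
```

In Spark, the `map` call only records the transformation. Nothing is computed until an action such as `collect()` runs.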

Structured API - Spark DataFrame

  • Structured APIs Introduction
  • Preparing the Project Folder
  • PySpark DataFrame, Schema and DataTypes
  • DataFrame Reader and Writer
  • Challenge Part 1 - Brief
  • Challenge Part 1
  • Challenge Part 1 - Data Preparation
  • Working with Structured Operations
  • Managing Performance Errors
  • Reading a JSON File
  • Columns and Expressions
  • Filter and Where Conditions
  • Distinct Drop Duplicates Order By
  • Rows and Union
  • Adding, Renaming and Dropping Columns
  • Working with Missing or Bad Data
  • Working with User Defined Functions
  • Challenge Part 2 - Brief
  • Challenge Part 2
  • Challenge Part 2 - Remove Null Row and Bad Records
  • Challenge Part 2 - Get the City and State
  • Challenge Part 2 - Rearrange the Schema
  • Challenge Part 2 - Write Partitioned DataFrame to Parquet
  • Aggregations
  • Aggregations - Setting up Flight Summary Data
  • Aggregations - Count and Count Distinct
  • Aggregations - Min Max Sum SumDistinct AVG
  • Aggregations with Grouping
  • Challenge Part 3 - Brief
  • Challenge Part 3
  • Challenge Part 3 - Prepare 2019 Data
  • Challenge Part 3 - Q1 Get the Best Sales Month
  • Challenge Part 3 - Q2 Get the City that sold the most products
  • Challenge Part 3 - Q3 When to advertise
  • Challenge Part 3 - Q4 Products Bought Together

Introduction to Spark SQL and Databricks

  • Introduction to Databricks
  • Spark SQL Introduction
  • Register Account on Databricks
  • Create a Databricks Cluster
  • Creating our First 2 Databricks Notebooks
  • Reading CSV Files into DataFrame
  • Creating a Database and Table
  • Inserting Records into a Table
  • Exposing Bad Records
  • Figuring out how to remove bad records
  • Extract the City and State
  • Inserting Records to Final Sales Table
  • What was the best month in sales?
  • Get the City that sold the most products
  • Get the right time to advertise
  • Get the most products sold together
  • Create a Dashboard
  • Summary
