Hands-On PySpark for Big Data Analysis

By Udemy

Mode: Online

Fees: ₹449 (list price ₹3,499)

Quick Facts

Medium of instruction: English
Mode of learning: Self-study
Mode of delivery: Video and text based

Course and certificate fees

Fees information: ₹449 (list price ₹3,499)
Certificate availability: Yes
Certificate-providing authority: Udemy

What you will learn

Knowledge of big data

The syllabus

Install PySpark and Set Up Your Development Environment

  • The Course Overview
  • Core Concepts in Spark and PySpark
  • Setting Up Spark on Windows and PySpark
  • SparkContext, SparkConf and Spark Shell

Getting Your Big Data into the Spark Environment Using RDDs

  • Loading Data onto Spark RDDs
  • Parallelization with Spark RDDs
  • RDD Operation Basics

Big Data Cleaning and Wrangling with Spark Notebooks

  • Using Spark Notebooks for Quick Iteration of Ideas
  • Sampling/Filtering RDDs to Pick Out Relevant Data Points
  • Splitting Datasets and Creating New Combinations with Set Operations

Aggregating and Summarizing Data into Useful Reports

  • Calculating Averages with Map and Reduce
  • Faster Average Computation with Aggregate
  • Pivot Tabling with Key-Value Paired Data Points

Powerful Exploratory Data Analysis with MLlib

  • Computing Summary Statistics with MLlib
  • Using Pearson and Spearman to Discover Correlations
  • Testing Your Hypotheses on Large Datasets

Putting Structure on Your Big Data with SparkSQL

  • Manipulating DataFrames with SparkSQL Schemas
  • Using the Spark DSL to Build Queries for Structured Data Operations
