Building Big Data Pipelines with PySpark + MongoDB + Bokeh

BY
Udemy

Acquire a thorough understanding of the strategies involved in building big data pipelines with PySpark, MongoDB, and Bokeh.

Mode

Online

Fees

₹499 (list price ₹2,299)

Quick Facts

Medium of instruction: English
Mode of learning: Self study
Mode of delivery: Video and text based

Course overview

Big data pipelines are data pipelines built to handle one or more of the three defining characteristics of big data: volume, velocity, and variety. The velocity of big data makes streaming pipelines attractive, because data can be collected and processed in real time and acted on immediately. The Building Big Data Pipelines with PySpark + MongoDB + Bokeh certification course was created by EBISYS R&D - Big Data Engineering and Consulting and is available on Udemy.

The Building Big Data Pipelines with PySpark + MongoDB + Bokeh online course is a self-paced program for students who want to master the skills and strategies for building data pipelines with the core functionality of PySpark, Bokeh, and MongoDB. The classes cover data preprocessing, data loading, data extraction, data manipulation, data transformation, and data visualization, and explain how to create machine learning scripts, PySpark ETL scripts, and a dashboard server.

The highlights

  • Certificate of completion
  • Self-paced course
  • 5 hours of pre-recorded video content
  • 1 article
  • 1 downloadable resource

Program offerings

  • Online course
  • Learning resources
  • 30-day money-back guarantee
  • Unlimited access
  • Accessible on mobile devices and TV

Course and certificate fees

Fees: ₹499 (list price ₹2,299)
Certificate availability: Yes
Certificate providing authority: Udemy

What you will learn

  • Knowledge of big data
  • Machine learning
  • Knowledge of data visualization
  • Knowledge of MongoDB

After completing the Building Big Data Pipelines with PySpark + MongoDB + Bokeh online certification, students will understand big data and machine learning well enough to build big data pipelines using PySpark, MongoDB, Bokeh, and MLlib. They will explore methods for data processing, data analysis, data loading, data transformation, data extraction, data visualization, and data manipulation. Students will also learn the strategies and concepts behind geospatial machine learning and geo-mapping.

The syllabus

Introduction

  • Introduction

Setup and Installations

  • Python Installation
  • Installing Third Party Libraries
  • Installing Apache Spark
  • Installing Java (Optional)
  • Testing Apache Spark Installation
  • Installing MongoDB
  • Installing NoSQL Booster for MongoDB

Data Processing with PySpark and MongoDB

  • Integrating PySpark with Jupyter Notebook
  • Data Extraction
  • Data Transformation
  • Loading Data into MongoDB
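
For orientation, the extract, transform, and load steps listed above might look roughly like the sketch below. It is not the course's code. The `magnitude` column, the function names, and every argument value are hypothetical, and the write step assumes the MongoDB Spark Connector 10.x is available to Spark (older connector versions use a different format name and option keys).

```python
def mongo_write_options(uri, database, collection):
    """Options for the MongoDB Spark Connector writer (10.x short-form keys assumed)."""
    return {"connection.uri": uri, "database": database, "collection": collection}


def run_etl(csv_path, uri, database, collection):
    """Extract a CSV file, apply a small transformation, and load it into MongoDB."""
    # Imported here so the helper above stays usable without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()
    try:
        # Extract: read the raw file and let Spark infer the column types.
        df = spark.read.csv(csv_path, header=True, inferSchema=True)
        # Transform: drop rows without a magnitude (hypothetical column)
        # and add a copy of it rounded to one decimal place.
        df = df.dropna(subset=["magnitude"]).withColumn(
            "magnitude_rounded", F.round(F.col("magnitude"), 1)
        )
        # Load: append the rows to the target collection.
        (
            df.write.format("mongodb")
            .mode("append")
            .options(**mongo_write_options(uri, database, collection))
            .save()
        )
    finally:
        spark.stop()
```

A call such as `run_etl("data.csv", "mongodb://localhost:27017", "demo_db", "demo_collection")` would run the three stages in order; all four values are placeholders.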

Machine Learning with PySpark and MLlib

  • Data Pre-processing
  • Building the Predictive Model
  • Creating the Prediction Dataset
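
As a rough illustration of the three steps above, the following sketch assembles feature columns, fits an MLlib pipeline, and produces a prediction dataset. The listing does not say which model the course uses, so linear regression is only a stand-in. The function names are hypothetical, and the sketch assumes every column of the input DataFrame is numeric.

```python
def feature_columns(all_columns, label_col):
    """Treat every column except the label as an input feature."""
    return [c for c in all_columns if c != label_col]


def train_and_predict(df, label_col):
    """Fit a regression pipeline on 80% of `df` and predict on the other 20%."""
    # Imported here so feature_columns can be used without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.regression import LinearRegression

    # Pre-processing: hold out a test split and pack the features into one vector.
    train, test = df.randomSplit([0.8, 0.2], seed=42)
    assembler = VectorAssembler(
        inputCols=feature_columns(df.columns, label_col), outputCol="features"
    )
    # Model: linear regression is an assumption made for this sketch.
    regressor = LinearRegression(featuresCol="features", labelCol=label_col)
    model = Pipeline(stages=[assembler, regressor]).fit(train)
    # Prediction dataset: transform() appends a "prediction" column.
    return model.transform(test)
```

The returned DataFrame could then be written back to MongoDB so that a dashboard can read it.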

Data Visualization

  • Loading the Data Sources from MongoDB
  • Creating a Map Plot
  • Creating a Bar Chart
  • Creating a Magnitude Plot
  • Creating a Grid Plot
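
To make the first and third items above concrete, here is a minimal sketch that reads documents from MongoDB with pymongo and draws a bar chart with Bokeh. The function names, the field being counted, and the connection details are all hypothetical, and the course's own plots may be built differently.

```python
from collections import Counter


def counts_by_field(documents, field):
    """Turn MongoDB-style documents into the column data a bar chart needs."""
    counts = Counter(str(doc[field]) for doc in documents if field in doc)
    labels = sorted(counts)
    return {"label": labels, "count": [counts[label] for label in labels]}


def load_documents(uri, database, collection):
    """Fetch every document from a collection, leaving out the _id field."""
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        return list(client[database][collection].find({}, {"_id": 0}))
    finally:
        client.close()


def make_bar_chart(data, title):
    """Build a Bokeh vertical bar chart from the output of counts_by_field."""
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure

    source = ColumnDataSource(data)
    # A list of strings as x_range gives the plot a categorical axis.
    plot = figure(x_range=data["label"], title=title)
    plot.vbar(x="label", top="count", width=0.8, source=source)
    return plot
```

For a dashboard, a script run with `bokeh serve` could attach the returned plot to the page with `curdoc().add_root(plot)`, where `curdoc` is imported from `bokeh.io`.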

Creating the Data Pipeline Scripts

  • Installing Visual Studio Code
  • Creating the PySpark ETL Script
  • Creating the Machine Learning Script
  • Creating the Dashboard Server

Source Code and Notebook

  • Source Code and Notebook
