ETL and Data Pipelines with Shell, Airflow and Kafka

BY
IBM via Coursera

Gain a thorough understanding of the capabilities of Airflow, Kafka, and shell scripting for data pipeline and ETL processes.

Level

Intermediate

Mode

Online

Duration

5 Weeks

Quick Facts

Medium of instruction: English
Mode of learning: Self study
Mode of delivery: Video and text based

Course overview

ETL (extract, transform, load) is a set of procedures for extracting data from one system, transforming it, and loading it into a target system. A data pipeline is the more general term for any group of operations that moves data from one system to another, possibly transforming it along the way. The ETL and Data Pipelines with Shell, Airflow, and Kafka certification course is developed by IBM Skills Network and offered through Coursera. It is taught by Yan Luo (Ph.D., Data Scientist & Developer), Jeff Grossman (Certified Instructor), Sabrina Spillner (Senior Instructional Designer & Content Developer), and Ramesh Sannareddy (Data Engineering Expert).
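The three ETL stages described above can be sketched in a few lines of Python. This is only an illustrative sketch, not material from the course: the sample CSV data, the `readings` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from a source system.
RAW_CSV = """name,temp_f
Austin,98.6
Boston,50.0
"""


def extract(text):
    """Extract: parse the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert Fahrenheit to Celsius, rounded to one decimal."""
    return [
        (row["name"], round((float(row["temp_f"]) - 32) * 5 / 9, 1))
        for row in rows
    ]


def load(records, conn):
    """Load: write the transformed records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (name TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order: extract, then transform, then load."""
    load(transform(extract(text)), conn)


# An in-memory database plays the role of the target system (an assumption).
conn = sqlite3.connect(":memory:")
run_etl(RAW_CSV, conn)
rows = conn.execute("SELECT name, temp_c FROM readings ORDER BY name").fetchall()
print(rows)
```

In an ELT variant, the raw rows would be loaded first and the conversion would run inside the target system instead.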

The ETL and Data Pipelines with Shell, Airflow, and Kafka online training helps students understand the two approaches to converting raw data into analytics-ready data: the ETL process and the ELT process. The course explains the differences between ETL and ELT processing, identifies use cases for each, and covers the methods and tools for extracting data, merging it logically or physically, and loading it into data repositories.

The highlights

  • Shareable certificate of completion
  • Self-paced course
  • 17 hours of effort
  • 100% online content
  • Flexible deadlines

Program offerings

  • English video lectures with subtitles
  • 100% online content
  • Learning resources
  • Graded quizzes
  • Graded assignments
  • Accessible on mobile devices

Course and certificate fees

Participants can choose from the ETL and Data Pipelines with Shell, Airflow, and Kafka fee options listed below, depending on how many hours per week they plan to study. Every option includes a certificate on course completion.

ETL and Data Pipelines with Shell, Airflow, and Kafka Fee Structure

Description                     Amount in INR
1 Month, 20+ hours per week     Rs. 4,117
3 Months, 18 hours per week     Rs. 8,234 (Rs. 2,745/month)
6 Months, 9 hours per week      Rs. 12,352 (Rs. 2,059/month)

Certificate availability: Yes
Certificate providing authority: Coursera

What you will learn

Knowledge of Kafka

After completing the ETL and Data Pipelines with Shell, Airflow, and Kafka online certification, students will better understand how Apache Kafka, Apache Airflow, and shell scripting are used for ETL and data pipeline operations. Students will explore advanced techniques for data extraction, transformation, loading, and streaming, and will learn about the batch and concurrent modes of execution. They will also learn how to create DAGs (directed acyclic graphs) using Airflow.
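To make the DAG idea concrete, here is a minimal sketch of how task dependencies determine execution order. It deliberately uses Python's standard-library `graphlib` rather than Airflow's own API, and the task names are hypothetical, so it illustrates the concept only and not how an Airflow DAG is actually declared.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_order(deps):
    """Return one valid execution order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

Because the graph has no cycles, a valid order always exists. Adding a cycle (for example, making `extract` depend on `load`) would make `static_order()` raise `graphlib.CycleError`.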

The syllabus

Module 1: Data Processing Techniques

Videos
  • Course Intro video
  • ETL Fundamentals
  • ELT Basics
  • Comparing ETL to ELT
  • Data Extraction Techniques
  • Introduction to Data Transformation Techniques
  • Data Loading Techniques
Readings
  • Course Introduction
  • Summary & Highlights
Practice Exercises
  • Practice Quiz: ETL and ELT Processes
  • Graded Quiz: ETL and ELT Processes

Module 2: ETL & Data Pipelines: Tools and Techniques

Videos
  • ETL using Shell Scripting
  • Introduction to Data Pipelines
  • Key Data Pipeline Processes
  • Batch Versus Streaming Data Pipeline Use Cases
  • Data Pipeline Tools and Technologies
Readings
  • Linux Commands and Shell Scripting
  • ETL Techniques
  • Summary & Highlights
  • Summary & Highlights
Practice Exercises
  • Practice Quiz: ETL using Shell Scripts
  • Graded Quiz: ETL using Shell Scripts
  • Practice Quiz: An Introduction to Data Pipelines
  • Graded Quiz: An Introduction to Data Pipelines

Module 3: Building Data Pipelines using Airflow

Videos
  • Apache Airflow Overview
  • Advantages of Using Data Pipelines as DAGs in Apache Airflow
  • Apache Airflow UI
  • Build DAG Using Airflow
  • Airflow Monitoring and Logging
Reading
  • Summary & Highlights
Practice Exercises
  • Practice Quiz: Using Apache Airflow to build Data Pipelines
  • Graded Quiz: Using Apache Airflow to build Data Pipelines

Module 4: Building Streaming Pipelines using Kafka

Videos
  • Distributed Event Streaming Platform Components
  • Apache Kafka Overview
  • Building Event Streaming Pipelines using Kafka
  • Kafka Streaming Process
Reading
  • Summary & Highlights
Practice Exercises
  • Practice Quiz: Using Apache Kafka to build Pipelines for Streaming Data
  • Graded Quiz: Using Apache Kafka to build Pipelines for Streaming Data

Module 5: Final Assignment

Readings
  • Project Overview
  • Congrats & Next Steps
  • Team & Acknowledgements
Practice Exercise
  • Final Quiz

Instructors

Mr Yan Luo
Data Scientist
IBM

Ph.D

Ms Sabrina Spillner
Senior Instructional Designer
IBM

Mr Jeff Grossman
Instructor
IBM

Mr Ramesh Sannareddy
Instructor
IBM

Other Bachelors

