Big Data, Hadoop, and Spark Basics

BY
IBM via edX

Learn the concepts and practices of big data, Hadoop, and Spark, and get to know big data processing tools, through this online big data course offered on edX.

Level

Beginner

Mode

Online

Duration

6 Weeks

Fees

Free

Quick Facts

Particulars: Details
Medium of instruction: English
Mode of learning: Self study
Mode of delivery: Video and text based
Learning effort: 2-3 hours per week

Course overview

Businesses now need skilled big data professionals to understand customer behaviour from complex, unstructured data such as pictures, posts, tweets, audio files, videos, and satellite images. This foundational course on edX helps learners get started with big data and build a career in the industry. Big Data, Hadoop, and Spark Basics is an introductory computer science course that gives a basic understanding of big data and its tools.

Big Data, Hadoop, and Spark Basics Certification is a beginner-level programme offered by IBM on edX. The online programme explores big data processing tools such as Apache Spark, Hive, and Hadoop. It is a 6-week programme that requires learners to devote 2-3 hours per week. The self-paced programme is free to audit and can be upgraded to the verified track by paying a fee.

The highlights

  • Available on edX
  • Offered by IBM
  • 6-week course
  • Fully online course
  • Audit and Verified Tracks Available 
  • Shareable Certificate Upon Completion
  • Self-paced Programme

Program offerings

  • Graded assignments and exams
  • edX support
  • English medium
  • Beginner-level course
  • Video transcripts in English

Course and certificate fees

Type of course

Free

edX provides two enrollment options for the Big Data, Hadoop, and Spark Basics online course: the audit track and the verified track. In the audit track, candidates can attend the course without paying any fee but are given limited access. In the verified track, learners pay the Big Data, Hadoop, and Spark Basics certification fee and receive certification and graded assignments.

Big Data, Hadoop, and Spark Basics Certification Fees 

Course Title: Big Data Analytics (Verified Tracks)
Total Fee in INR: INR 8,273

Certificate availability

Yes

Certificate providing authority

IBM

Certificate fees

₹8,273

Who it is for

Big Data, Hadoop, and Spark Basics Classes is an ideal programme for the professionals such as 

Eligibility criteria

Academic Qualifications

edX specifies some prerequisites for learners to be eligible to join the Big Data, Hadoop, and Spark Basics Certification Course. Learners should be computer and IT literate and interested in learning the process of data management.

What you will learn

Knowledge of big data, Knowledge of Apache Spark

The Big Data, Hadoop, and Spark Basics Certification syllabus will help learners:

  • Understand the basics of big data, the tools of big data processing, and its use cases.
  • Learn how to apply runtime environment options and Apache Spark development.
  • Become familiar with the fundamental concepts of Spark programming, SparkSQL, DataFrames, and Datasets.
  • Get to know the Hadoop ecosystem, its practices, architecture, and applications, including MapReduce, HBase, Spark, and the Hadoop Distributed File System (HDFS).

The syllabus

Module 1 – What is Big Data?

  • Introduction to Big Data
  • What is Big Data?
  • Impact of Big Data
  • Parallel Processing, Scaling, and Data Parallelism
  • Tools of Big Data
  • Beyond the Hype
  • Big Data Use Cases
  • Viewpoints about Big Data

Module 2 – Introduction to the Hadoop Ecosystem

  • What is Hadoop
  • An introduction to MapReduce
  • The Hadoop Ecosystem/Common components: Introducing HDFS, Hive, HBase, and Spark, other modules
  • Working with HDFS
  • Working with HBase
  • Lab: MapReduce
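
As a rough illustration of the MapReduce idea introduced in this module, here is a minimal single-machine sketch in plain Python. It is not Hadoop code and is not taken from the course labs; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical, chosen only to show the map, shuffle, and reduce steps of a word count.

```python
from collections import defaultdict


def map_phase(lines):
    """Map step: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines):
    """Run the three steps in sequence on an iterable of text lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ["big data tools", "big data use cases"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce steps would run in parallel across many machines, with the framework handling the shuffle; this sketch only mirrors the data flow on one machine.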

Module 3 – Introduction to Apache Spark

  • Why use Apache Spark?
  • Functional Programming Basics
  • Parallel Programming using Resilient Distributed Datasets
  • Scale-out / Data Parallelism in Apache Spark
  • DataFrames and SparkSQL
  • Lab: Practical examples with PySpark

Module 4 – DataFrames and SparkSQL

  • Introduction to DataFrames & SparkSQL
  • RDDs in Parallel Programming and Spark
  • DataFrames and Datasets
  • Catalyst and Tungsten
  • ETL with DataFrames
  • Lab: ETL with DataFrames
  • Real-world usage of SparkSQL
  • Lab: SparkSQL

Module 5 – Development and Runtime Environment options

  • Apache Spark architecture
  • Overview of Apache Spark Cluster Modes
  • How to Run an Apache Spark Application
  • Using Apache Spark on IBM Cloud
  • Lab: Scale-out on IBM Spark Environment in Watson Studio
  • Setting Apache Spark Configuration
  • Running Spark on Kubernetes
  • Lab: Spark on Kube

Module 6 – Monitoring & Tuning

  • The Apache Spark User Interface
  • Monitoring Jobs
  • Debugging of parallel jobs
  • Understanding Memory resources
  • Understanding Processor resources
  • Lab: Monitoring and Performance tuning

Module 7 – Final Quiz

Admission details

Join the Big Data, Hadoop, and Spark Basics Training through the following steps:

Step 1 - Visit the official course page at https://www.edx.org/course/big-data-hadoop-and-spark-basics

Step 2 - Get started with the programme by choosing the ‘Enroll’ option.

How it helps

By enrolling in the programme, learners gain the benefits of the Big Data, Hadoop, and Spark Basics certification, including a thorough understanding of big data and its management tools. In addition, edX awards a certificate of completion to paying students enrolled in the verified track.

Instructors

Mr Karthik Muthuraman
Software Engineer
IBM

Other Bachelors, Other Masters

Ms Aije Egwaikhide
Senior Data Scientist
IBM

Other Bachelors, Other Masters

FAQs

Who develops and delivers the Big Data, Hadoop, and Spark Basics online course?

The online certification programme is offered by IBM and delivered through edX.

What is the duration of Big Data, Hadoop, and Spark Basics online certification? How much time does the learner need to devote to the course per week?

The duration of the online programme is 6 weeks, and candidates are required to spend 2-3 hours a week.

Which audience level does the online programme by IBM target?

The online course on big data is intended for introductory level learners.

What are the prior requirements of the online programme?

The prerequisites for the online certification programme are computer and IT literacy and an understanding of data management.

Who are the instructors in charge of the online programme?

The online certification programme is taught by Karthik Muthuraman, a software engineer (Machine Learning), and Aije Egwaikhide, a senior data scientist at IBM.
