Big Data Essentials: HDFS, MapReduce and Spark RDD

BY
Yandex via Coursera

Take a step toward learning the new technologies of big data with this Big Data Essentials: HDFS, MapReduce and Spark RDD certification course by Coursera.

Level

Intermediate

Mode

Online

Duration

6 Weeks

Fees

Free

Quick Facts

Medium of instruction: English
Mode of learning: Self-study
Mode of delivery: Video and text-based

Course overview

For candidates looking to build a career in the big data industry, this certificate course on Big Data Essentials: HDFS, MapReduce and Spark RDD is a good place to start. It provides tools and reading materials that point you in the right direction. MapReduce, HDFS and Spark are all covered in this certificate by Coursera. The course runs for 6 weeks and helps candidates learn the basic technologies of the big data field. It guides you through the system internals as well as their real-world applications.

Candidates can expect to learn about distributed file systems, why they exist and what they do. The course helps candidates build a stronger understanding of big data technologies and concepts so that they can use them better in their work. The programme focuses on developing the skills to apply these tools in building solutions for social networks, finance, telecommunications and other fields.

The Big Data Essentials: HDFS, MapReduce and Spark RDD programme offers relevant assignments that impart the practical knowledge and experience needed to solve pressing problems in the field of big data. Candidates' assignments are evaluated on a real cluster.

The highlights

  • Online course with completely flexible deadlines
  • Self-paced programme to suit candidates' different schedules
  • 43 hours of learning content available
  • Earn a shareable certificate on completing the course, which can be added to LinkedIn, a CV or a portfolio
  • English and Korean subtitles are provided for class videos
  • Offered by Yandex

Program offerings

  • Online classes
  • Reading material
  • Practice exercises
  • Practice assignments
  • Graded assignments

Course and certificate fees

Type of course

Free

  • The Big Data Essentials: HDFS, MapReduce and Spark RDD programme on Coursera is offered for a fee of Rs. 2,127.
  • The course fee includes certification charges. The certificate is awarded at the end of the course to all candidates who complete the entire course and the graded assignments.
  • Candidates can also choose to attend the course for free with the ‘audit for free’ option. Candidates auditing the course have access to all the video content but not to the graded assignments. This mode does not award a certificate on completion.

Course

Fee

Big Data Essentials: HDFS, MapReduce and Spark RDD

Rs. 2,159

Certificate availability

Yes

Certificate providing authority

Coursera

Certificate fees

₹2,152

Who it is for

The Big Data Essentials: HDFS, MapReduce and Spark RDD Course will help:

  • Candidates seeking to advance their career in the big data industry.
  • Professionals from the big data domain who want to learn the industry's data engineering tools for better job prospects and a larger job market.
  • Developers and IT professionals who wish to move into the field of big data.
  • Professors and professionals who are looking to update their skills and learn the newer trends of the industry.

Eligibility criteria

Certification Qualifying Details

The certificate programme on Big Data Essentials: HDFS, MapReduce and Spark RDD awards a certificate to candidates who complete the course and assignments satisfactorily.

What you will learn

  • Knowledge of Python
  • Knowledge of Apache Spark

The certificate course in Big Data Essentials: HDFS, MapReduce and Spark RDD helps candidates develop an understanding of the following:

  • The basic technologies of the modern big data industry, particularly MapReduce, HDFS and Spark.
  • An in-depth understanding of the distributed file system, its functionality and usage, along with other system internals and relevant applications.
  • The workings and framework of MapReduce, which is considered a workhorse of big data applications.
  • An understanding of Spark, its basics and its applications as a computational framework.
  • Hands-on experience with different industry tools, and knowing when to use each for the problems and challenges that arise in the real world.

The syllabus

Welcome

  • BigData Applications
  • Why BigData?
  • Course Structure
  • What is BigData Essentials?
  • Issues BigData can solve

What are BigData and distributed file systems (e.g. HDFS)?

  • File content exploration
  • File system exploration
  • File system managing
  • Scaling Distributed File System
  • Processes
  • Block and Replica States, Recovery Process
  • HDFS Client
  • Web UI, REST API
  • Namenode Architecture
  • Introduction
  • Binary formats
  • Text formats
  • How to Install Docker on Windows 7, 8, 10
  • How to submit your first assignment
  • Compression

Solving Problems with MapReduce

  • Streaming in Python
  • Streaming
  • Unreliable Components 
  • MapReduce
  • Distributed Shell
  • Fault Tolerance
  • Fault Tolerance. Live Demo
  • Distributed Cache
  • WordCount in Python
  • Testing
  • Environment, Counters
  • Partitioner
  • Combiner
  • Compression
  • Comparator
  • Speculative Execution / Backup Tasks
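
To give a flavour of the "Streaming in Python" and "WordCount in Python" topics listed above, here is a minimal sketch of a word count written in the map, sort and reduce style that streaming jobs typically follow. It is an illustration only, not the course's own code: the function names mapper, reducer and word_count are hypothetical, and the built-in sorted() stands in for the shuffle-and-sort step that a real cluster would perform between the two phases.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated "word<TAB>1" record per word (hypothetical helper)."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts per word; the input must already be sorted by key."""
    parsed = (record.split("\t") for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield word, sum(int(count) for _, count in group)


def word_count(lines):
    # sorted() is a local stand-in for the shuffle-and-sort phase (an assumption
    # made so the sketch runs on one machine without a cluster).
    return dict(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

Because the reducer only relies on its input being sorted by key, the same two functions could in principle be split into separate scripts that read from standard input, which is the usual shape of a streaming job.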

Solving Problems with MapReduce (practice week)

  • How to submit your first Hadoop assignment

Introduction to Apache Spark

  • RDDs
  • Welcome
  • Resiliency
  • Transformations 
  • Actions
  • Caching & Persistence
  • Execution & Scheduling
  • Getting started with Spark & Python
  • Broadcast variables
  • Broadcast & Accumulator variables
  • Cluster mode
  • Working with text files
  • Accumulator variables
  • Joins
  • Spark UI

Introduction to Apache Spark (practice week)

  • Building an intuition behind the PMI definition
  • Spark assignments Intro

Real-World Applications

  • Medians
  • Sampling
  • Means
  • Estimating proportions
  • Tabular Data, KeyFieldSelection
  • Map and Reduce Side Joins
  • Twitter graph case study
  • Shortest path
  • Data Skew, Salting

Admission details

To enrol in the Big Data Essentials: HDFS, MapReduce and Spark RDD programme, candidates can follow these registration steps:

Step 1: Visit the course page and click ‘Enroll’.

Step 2: New candidates will have to create a Coursera account using an email address or a Google or Facebook account. Existing account holders can simply log in.

Step 3: Next, you can choose to pay the course fee and buy the course by clicking ‘Enroll’. Candidates can also take the course for free via the ‘audit for free’ option. Note, however, that the free audit mode does not include certification.

Step 4: Candidates can also opt for a free 7-day trial of Coursera Plus. This is a good way to experience the premium benefits before you buy.

Step 5: If you choose to buy the course, you will be directed to pay the course fee. Choose an available payment mode and make the payment.

Step 6: After the fee is paid, candidates will have access to the video lectures and reading material in the course.

Step 7: Candidates have the option to download the video classes and watch them at their own pace.

Scholarship Details

The Big Data Essentials: HDFS, MapReduce and Spark RDD course offers financial aid to candidates who cannot afford the course fee.

  • Visit the course page.
  • Click on ‘Financial aid’ under the enrollment options.
  • Fill out the application and submit the form.
  • It takes a minimum of 15 days to review the application.
  • Candidates who are selected for financial aid will be notified.

How it helps

The Big Data Essentials: HDFS, MapReduce and Spark RDD programme is suitable for candidates of all proficiency levels. The course content is curated to take you from a basic to an intermediate level in steps that are easy to follow. It is industry-relevant and aims to bridge the gap between textbook theory and the real-life problems faced in the industry. The hands-on experience offered in the practice assignments and quizzes throughout the course helps candidates build the confidence to take on these challenges using the best practices learned in the course. Candidates who want to take their career to the next level can benefit from this certification and its practical approach to industry tools such as HDFS, Spark RDD and MapReduce.

The course curriculum is designed to help candidates develop subject expertise. It equips them to identify and tackle challenges in the data analytics and big data field. Candidates from multiple disciplines can benefit from the course. Software developers and programmers who are looking for a career in big data can learn best practices through it. It is also well suited to professionals who want to update their skills.

FAQs

Why should I take this course through Coursera?

Coursera provides an industry-relevant curriculum and expert-taught courses with flexible schedules. It offers certification to candidates who complete the course and support at every step.

What work experience is required for admission to this course?

The course can be taken by all candidates and does not necessarily require any work experience.

What tools are used for training this course?

The course uses big data tools like HDFS, Spark RDD and MapReduce. Candidates can read about them in advance for a better understanding.

What is the primary mode of conducting the course?

The course is conducted primarily in an online mode. The learners are given pre-recorded online video classes and assignments throughout the course.

What is the total time duration of the course?

The course takes a total of about 43 hours to complete. However, each student can pace it according to their own speed. They can also reset deadlines during the course.

From when will I get access to the course material?

Candidates get access to the course material on registration and payment of fees. Candidates using the free audit option get access to the videos on enrolling, but the graded assignments are not available to them.

Do I have to follow a particular order for course videos?

Yes. It is recommended that learners follow the set order of videos. The initial classes provide orientation and lay the groundwork for the later ones, so watching in a random order can disrupt the flow of learning.
