Big Data Hadoop Certification Training

By Simpliv Learning

Mode: Online

Duration: 35 Hours

Quick Facts

Medium of instruction: English
Mode of learning: Self study, Virtual Classroom
Mode of delivery: Video and Text Based

Course and certificate fees

Certificate availability: Yes

Certificate providing authority: Simpliv Learning

The syllabus

Understanding Big Data and Hadoop

  • Introduction to Big Data & Big Data Challenges  
  • Limitations & Solutions of Big Data Architecture  
  • Hadoop & its Features  
  • Hadoop Ecosystem  
  • Hadoop 2.x Core Components  
  • Hadoop Storage: HDFS (Hadoop Distributed File System)  
  • Hadoop Processing: MapReduce Framework  
  • Different Hadoop Distributions  

Hadoop Architecture and HDFS

  • Hadoop 2.x Cluster Architecture  
  • Federation and High Availability Architecture  
  • Typical Production Hadoop Cluster  
  • Hadoop Cluster Modes  
  • Common Hadoop Shell Commands  
  • Hadoop 2.x Configuration Files  
  • Single-Node Cluster & Multi-Node Cluster Setup  
  • Basic Hadoop Administration  

Hadoop MapReduce Framework

  • Traditional Way vs MapReduce Way  
  • Why MapReduce  
  • YARN Components  
  • YARN Architecture  
  • YARN MapReduce Application Execution Flow  
  • YARN Workflow  
  • Anatomy of MapReduce Program  
  • Input Splits, Relation between Input Splits and HDFS Blocks  
  • MapReduce: Combiner & Partitioner  

Advanced Hadoop MapReduce

  • Counters  
  • Distributed Cache  
  • MRUnit  
  • Reduce Join  
  • Custom Input Format  
  • Sequence Input Format  
  • XML File Parsing using MapReduce

Apache Pig

  • Introduction to Apache Pig  
  • MapReduce vs Pig  
  • Pig Components & Pig Execution  
  • Pig Data Types & Data Models in Pig  
  • Pig Latin Programs  
  • Shell and Utility Commands  
  • Pig UDF & Pig Streaming  
  • Testing Pig Scripts with PigUnit  
  • Aviation Use Case in Pig  

Apache Hive

  • Introduction to Apache Hive  
  • Hive vs Pig  
  • Hive Architecture and Components  
  • Hive Metastore  
  • Comparison with Traditional Database  
  • Hive Data Types and Data Models  
  • Hive Partition  
  • Hive Bucketing  
  • Hive Tables (Managed Tables and External Tables)  
  • Importing Data  
  • Querying Data & Managing Outputs  
  • Hive Script & Hive UDF  
  • Retail Use Case in Hive  

Advanced Apache Hive and HBase

  • HiveQL: Joining Tables, Dynamic Partitioning  
  • Custom MapReduce Scripts  
  • Hive Indexes and Views  
  • Hive Query Optimizers  
  • Hive Thrift Server  
  • Hive UDF  
  • Apache HBase: Introduction to NoSQL Databases and HBase  
  • HBase vs RDBMS  
  • HBase Components  
  • HBase Architecture  
  • HBase Run Modes  
  • HBase Configuration  
  • HBase Cluster Deployment  

Advanced Apache HBase

  • HBase Data Model  
  • HBase Shell  
  • HBase Client API  
  • Hive Data Loading Techniques  
  • Apache ZooKeeper Introduction  
  • ZooKeeper Data Model  
  • ZooKeeper Service  
  • HBase Bulk Loading  
  • Getting and Inserting Data  
  • HBase Filters  

Processing Distributed Data with Apache Spark

  • What is Spark  
  • Spark Ecosystem  
  • Spark Components  
  • What is Scala  
  • Why Scala  
  • SparkContext  
  • Spark RDD  

Oozie

  • Oozie  
  • Oozie Components  
  • Oozie Workflow  
  • Scheduling Jobs with Oozie Scheduler  
  • Demo of Oozie Workflow  
  • Oozie Coordinator  
  • Oozie Commands  
  • Oozie Web Console  
  • Oozie for MapReduce  
  • Combining Flow of MapReduce Jobs  
  • Hive in Oozie  
  • Hadoop Talend Integration  
