BIG DATA

Big data is the field concerned with analyzing, systematically extracting information from, or otherwise handling data sets that are too large or complex for traditional data-processing software.

BIG DATA Syllabus

Big Data and Data Analysis with Hadoop

  1. Introduction to Big Data

    1. What is Big Data

    2. Characteristics of Big Data

    3. Big Data Exploration

    4. Data Extraction

  2. Introduction to Hadoop

    1. What is Hadoop

    2. Hadoop Architecture

    3. Setting Up Hadoop Environment

    4. Setting up Single Node Cluster

    5. Setting up Multiple Node Cluster

  3. Understanding HDFS

    1. Introduction to HDFS

    2. Installing HDFS

    3. HDFS Command Line

  4. Hadoop Administration – Part I

    1. Adding and Removing Nodes

    2. Starting and Stopping Services

  5. Hadoop Administration – Part II

    1. Configuring Hadoop and Rack Topology

  6. MapReduce

    1. Overview of MapReduce

    2. Inputs and Outputs

    3. MapReduce – Task Execution & Environment

    4. MapReduce – Job Input

    5. MapReduce – Job Output

  7. YARN

    1. Overview of YARN

    2. YARN Architecture

    3. YARN Resource Manager

    4. YARN Timeline Server

    5. Writing YARN Applications

    6. YARN Commands

    7. YARN Node Manager

  8. Pig

    1. Overview of Pig

    2. Pig Installation

    3. Pig Scripts in Local Mode

  9. Hive

    1. Overview of Hive

    2. Hive Installation

    3. Data Types in Hive

    4. Creating a Database in Hive

    5. Dropping a Database in Hive

    6. Creating, Modifying & Dropping Tables

    7. Partitioning in Hive

    8. Built-in Operators in Hive

    9. Built-in Functions in Hive

  10. Data Ingestion Tools

    1. Flume

    2. Sqoop

  11. Control Hadoop Job Execution

    1. Oozie Workflows

    2. Oozie Coordinator

  12. Introduction to HBase

    1. Overview of HBase

    2. HBase Installation

    3. HBase Shell

    4. HBase Java Client API

    5. HBase Java Admin API

  13. Configuration Service – ZooKeeper

    1. Overview of ZooKeeper

    2. ZooKeeper Data Model

    3. Programming in ZooKeeper

  14. Refresher to Statistics

    1. Descriptive Analysis

    2. Related Mathematics/Statistics Concepts

    3. Basic Statistics Tests

  15. R Programming

    1. Introduction to R

    2. Objects in R and Data Classes

    3. Importing Data

    4. Functions in R

    5. Visualization in R

  16. Integrating R with Hadoop

    1. Understanding Required Packages

    2. Installation of Required Packages

    3. Integrating RHive and RHBase

    4. Visualization in RHive

  17. Hadoop Security

    1. Introduction to Hadoop Security

    2. Introduction to Kerberos

  18. Final Project

    1. Final Project - Part 1

    2. Final Project - Part 2
