Big Data Hadoop

by NICE (National Institute of Computer Education Pvt. Ltd)

The Big Data Hadoop course is offered by NICE (National Institute of Computer Education Pvt. Ltd). NICE is one of the country's pioneering computer training institutes, setting quality standards for IT education.

Duration: 2 Months

Course Details

NICE has become synonymous with quality education and training. This is reflected in the wide acceptance of NICE certificates and in the use of its services by MNCs, public and private sector organizations, government departments, universities, colleges, and others.

Course Structure:

  • 1. Introduction to Hadoop
  • High Availability
  • Scaling
  • Advantages and Challenges 
  • 2. Introduction to Big Data
  • What is Big data
  • Big Data opportunities
  • Big Data Challenges
  • Characteristics of Big data 
  • 3. Introduction to Hadoop
  • Hadoop Distributed File System
  • Comparing Hadoop & SQL.
  • Industries using Hadoop.
  • Data Locality.
  • Hadoop Architecture.
  • Map Reduce & HDFS.
  • Using the Hadoop single node image (Clone). 
  • 4. The Hadoop Distributed File System (HDFS)
  • HDFS Design & Concepts
  • Blocks, NameNodes and DataNodes
  • HDFS High-Availability and HDFS Federation.
  • Hadoop DFS: The Command-Line Interface
  • Basic File System Operations
  • Anatomy of File Read
  • Anatomy of File Write
  • Block Placement Policy and Modes
  • Configuration files in detail.
  • Metadata, FsImage, Edit Log, Secondary NameNode and Safe Mode.
  • How to add a new DataNode dynamically.
  • How to decommission a DataNode dynamically (without stopping the cluster).
  • fsck Utility (Block Report).
  • How to override the default configuration at system level and programming level.
  • HDFS Federation.
  • ZooKeeper Leader Election Algorithm.
  • Exercise and small use case on HDFS. 
  • 5. Map Reduce
  • Functional Programming Basics.
  • Map and Reduce Basics
  • How Map Reduce Works
  • Anatomy of a Map Reduce Job Run
  • Legacy Architecture: Job Submission, Job Initialization, Task Assignment, Task Execution, Progress and Status Updates
  • Job Completion, Failures
  • Shuffling and Sorting
  • Splits, Record reader, Partition, Types of partitions & Combiner
  • Optimization Techniques: Speculative Execution, JVM Reuse and Number of Slots.
  • Types of Schedulers and Counters.
  • Comparisons between Old and New API at code and Architecture Level.
  • Getting the data from RDBMS into HDFS using Custom data types.
  • Distributed Cache and Hadoop Streaming (Python, Ruby and R).
  • YARN.
  • Sequence Files and Map Files.
  • Enabling Compression Codecs.
  • Map-side Join with Distributed Cache.
  • Types of I/O Formats: MultipleOutputs, NLineInputFormat.
  • Handling small files using CombineFileInputFormat.
  • 6. Map/Reduce Programming – Java Programming
  • Hands-on “Word Count” in Map/Reduce in standalone and pseudo-distributed mode.
  • Sorting files using Hadoop Configuration API discussion
  • Emulating “grep” for searching inside a file in Hadoop
  • DBInputFormat
  • Job Dependency API discussion
  • Input Format API discussion
  • Input Split API discussion
  • Custom Data type creation in Hadoop.
  • 7. NoSQL
  • ACID in RDBMS and BASE in NoSQL.
  • CAP Theorem and Types of Consistency.
  • Types of NoSQL Databases in detail.
  • Columnar Databases in Detail (HBase and Cassandra).
  • TTL, Bloom Filters and Compaction.
  • 8. HBase
  • HBase Installation
  • HBase concepts
  • HBase Data Model and Comparison between RDBMS and NoSQL.
  • Master & Region Servers.
  • HBase Operations (DDL and DML) through Shell and Programming and HBase Architecture.
  • Catalog Tables.
  • Block Cache and Sharding.
  • Splits.
  • Data Modeling (Sequential, Salted, Promoted and Random Keys).
  • Java APIs and REST Interface.
  • Client-side Buffering and processing 1 million records using Client-side Buffering.
  • HBase Counters.
  • Enabling Replication and HBase Raw Scans.
  • HBase Filters.
  • Bulk Loading and Coprocessors (Endpoints and Observers with programs).
  • Real-world use case consisting of HDFS, MR and HBase.
  • 9. Hive
  • Installation
  • Introduction and Architecture.
  • Hive Services, Hive Shell, Hive Server and Hive Web Interface (HWI)
  • Metastore
  • HiveQL
  • OLTP vs. OLAP
  • Working with Tables.
  • Primitive data types and complex data types.
  • Working with Partitions.
  • User Defined Functions
  • Hive Bucketed Tables and Sampling.
  • External partitioned tables, Map the data to the partition in the table, Writing the output of one query to another table, Multiple inserts
  • Dynamic Partition
  • Differences between ORDER BY, DISTRIBUTE BY and SORT BY.
  • Bucketing and Sorted Bucketing with Dynamic partition.
  • RC File.
  • Indexes and Views.
  • Map-side Joins.
  • Compression on Hive tables and Migrating Hive tables.
  • Dynamic substitution in Hive and different ways of running Hive
  • How to enable Update in Hive.
  • Log Analysis on Hive.
  • Access HBase tables using Hive.
  • Hands-on Exercises
  • 11. Pig
  • Installation
  • Execution Types
  • Grunt Shell
  • Pig Latin
  • Data Processing
  • Schema on read
  • Primitive data types and complex data types.
  • Tuple schema, BAG Schema and MAP Schema.
  • Loading and Storing
  • Filtering
  • Grouping & Joining
  • Debugging commands (Illustrate and Explain).
  • Validations in PIG.
  • Type casting in PIG.
  • Working with Functions
  • User Defined Functions
  • Types of JOINS in pig and Replicated Join in detail.
  • SPLITS and Multiquery execution.
  • Error Handling, FLATTEN and ORDER BY.
  • Parameter Substitution.
  • Nested For Each.
  • User Defined Functions, Dynamic Invokers and Macros.
  • How to access HBASE using PIG.
  • How to Load and Write JSON DATA using PIG.
  • Piggy Bank.
  • Hands on Exercises
  • 12. IC-WEB CLIENT (Customer Interaction Center)
  • Overview of IC-WEB Client
  • Account Identification
  • Customizing IC-WEB Client Profiles
  • IC Manager
  • IC Agent
  • Agent Inbox
  • E-Mail Response Management System (ERMS) and Order Routing with Rule Modeler
  • IC-Functions
  • Interactive Scripting
  • Broadcast Messaging
  • Call List Management
  • 13. SQOOP
  • Installation
  • Import Data (Full table, Only Subset, Target Directory, Protecting Password, File format other than CSV, Compressing, Control Parallelism, All tables Import)
  • Incremental Import (Import only New data, Last Imported data, Storing Password in Metastore, Sharing Metastore between Sqoop Clients)
  • Free Form Query Import
  • Export data to RDBMS, Hive and HBase
  • Hands on Exercises.
  • 14. HCATALOG
  • Installation.
  • Introduction to HCATALOG.
  • About HCatalog with Pig, Hive and MR.
  • Hands on Exercises.
  • 16. Flume
  • Installation
  • Introduction to Flume
  • Flume Agents: Sources, Channels and Sinks
  • Log user information into HDFS from a Java program using Log4j and Avro Source
  • Log user information into HDFS from a Java program using Tail Source
  • Log user information into HBase from a Java program using Log4j and Avro Source
  • Log user information into HBase from a Java program using Tail Source
  • Flume Commands
  • Use case of Flume: Flume the data from Twitter into HDFS and HBase. Do some
  • 17. More Ecosystems
  • Hue (Hortonworks and Cloudera).
  • 18. Oozie
  • Workflow (Action, Start, Action, End, Kill, Join and Fork), Schedulers, Coordinators and Bundles.
  • Workflow to show how to schedule Sqoop Job, Hive, MR and PIG.
  • Real-world use case that finds the top websites used by users of certain ages, scheduled to run every hour.
  • ZooKeeper
  • HBase Integration with Hive and Pig.
  • Phoenix
  • Proof of concept (POC).
  • 19. Spark
  • Overview
  • Linking with Spark
  • Initializing Spark
  • Using the Shell
  • Resilient Distributed Datasets (RDDs)
  • Parallelized Collections
  • External Datasets
  • RDD Operations
  • Basics, Passing Functions to Spark
  • Working with Key-Value Pairs
  • Transformations
  • Actions
  • RDD Persistence
  • Which Storage Level to Choose?
  • Removing Data
  • Shared Variables
  • Broadcast Variables
  • Accumulators
  • Deploying to a Cluster
  • Unit Testing
  • Migrating from pre-1.0 Versions of Spark
  • Where to Go from Here

Bhubaneshwar Branch

    Plot # 560 (1st Floor), Saheed Nagar, Bhubaneshwar

© 2024 coursetakers.com All Rights Reserved. Terms and Conditions of use | Privacy Policy