Hadoop

Hadoop is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications both reliability and data motion. Hadoop implements a computational paradigm named Map/Reduce, where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system (HDFS) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both Map/Reduce and the distributed file system are designed so that node failures are automatically handled by the framework.
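To make the Map/Reduce idea above concrete, here is a minimal sketch of a word count in plain Python. It is not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are illustrative assumptions, and everything runs in a single process. A real Hadoop job would be written against Hadoop's Mapper and Reducer classes and spread across the nodes of a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (hypothetical driver)."""
    mapped = []
    # Each line is an independent fragment of work; in a cluster, any node
    # could process (or re-process) it without affecting the others.
    for line in lines:
        mapped.extend(map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

Because each map call depends only on its own line, a failed fragment can simply be run again, which is the property the framework relies on when it handles node failures.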

Data Analytics

Data analytics (DA) is the science of examining raw data with the purpose of drawing conclusions about that information. Data analytics is used in many industries to allow companies and organizations to make better business decisions, and in the sciences to verify or disprove existing models or theories.

Workshop Goals
  • Understand the basics of Big Data
  • Learn Google Map/Reduce and Map/Reduce programming
  • Learn Hadoop programming
  • Learn to manage Big Data using Hadoop
  • Learn about advancements in Hadoop that improve the performance of operations
  • Learn to implement the Hadoop Distributed File System (HDFS)
  • Study case studies of Hadoop technology
  • Learn to use Spark & Zeppelin for data analytics
  • Grasp all the knobs & levers for running Hadoop
  • Use Hadoop for a variety of data analysis tasks
  • Understand the challenges of Hadoop & its future

Benefits for Participants
  • See how Hadoop is useful in academics and projects
  • Well suited to those who want to become a Data Scientist
  • Applicable in most companies
  • Early basic knowledge of the field has a positive impact on a participant's career
  • Every participant receives a certificate from the organization

Workshop Prerequisites
  • Students should have basic knowledge of Java
  • Auditorium or lab with a projector & microphone system
  • Lab machines running Ubuntu 14.04 (32-bit) with at least 4 GB RAM
  • Students should bring their own laptops with the above configuration
  • Coordinator team (2 technical assistants)
  • One board with a marker
  • Students should be seated in a single lab so that we can coordinate well

Workshop Agenda

The agenda lists each time slot followed by its subjects and content.

Day 1:

  • 09:30 Hrs to 13:30 Hrs
      • Big Data & Hadoop
      • Introduction to Data Analytics
      • Introduction to Big Data
      • What is Big Data
      • Problems with Big Data
      • Idea behind Hadoop
      • Structured vs. Unstructured Data
      • Big Data & Hadoop Strategy
      • Introduction to Hadoop
      • Need for Hadoop
      • History of Hadoop
      • Name Evolution
      • Hadoop Usage
      • Hadoop Limitations
      • Future of Hadoop
      • Map/Reduce
      • What is Data Analytics
  • 14:00 Hrs to 17:00 Hrs
      • Google Map/Reduce
      • WordCount Program
      • What is Google Map/Reduce
      • Google Map/Reduce Structure
      • Relation between Hadoop & Map/Reduce
      • Hadoop Map/Reduce Programming
      • Hadoop Setup
      • Installation of Hadoop
      • WordCount Program Hands-on
  • 17:00 Hrs to 17:30 Hrs
      • QA Session

Day 2:

  • 09:30 Hrs to 13:30 Hrs
      • Visualization Techniques for Analytics
      • Introduction to Real-Time Analytics
      • Real-Time Projects in Analytics
      • Installation of Zeppelin
      • Python Notebook
      • Zeppelin Tool Demo
  • 14:00 Hrs to 17:00 Hrs
      • Hands-on with Zeppelin
      • QA Session
      • One Mini Project with the Zeppelin Tool
      • Recent Trends in Big Data & Career Opportunities
      • Research Trends in IT
      • QA Session & Career Opportunities with Hadoop & Big Data

Day 3:

  • 09:30 Hrs to 10:30 Hrs
      • Introduction to YARN
      • How it is different
      • MapReduce-2
  • 10:30 Hrs to 12:30 Hrs
      • Introduction to NoSQL
      • Apache Hive
      • Introduction to Apache Hive
      • Installation of Hive
      • Working with Hive
  • 12:30 Hrs to 13:30 Hrs
      • Introduction to Sqoop
      • Installation of Sqoop
      • Working with Sqoop
  • 14:00 Hrs to 17:00 Hrs
      • Introduction to MongoDB
      • Installation of MongoDB
      • Working with MongoDB
      • MongoDB Assignments
      • Hands-on Assignments
  • 17:00 Hrs to 17:30 Hrs
      • QA Session