(SEM VIII) THEORY EXAMINATION 2021-22 BIG DATA


SECTION A

(Attempt all – 2 marks each)

 

(a) Apache Hadoop

Apache Hadoop is an open-source framework used to store and process large volumes of data across distributed computer systems using simple programming models.

 

(b) Big Data

Big Data refers to extremely large and complex datasets that cannot be processed efficiently using traditional data processing tools.

 

(c) Need of Hadoop

Hadoop is needed to store and process huge amounts of data in a cost-effective, fault-tolerant, and scalable manner using distributed computing.

 

(d) Digital Data

Digital data is information stored or transmitted in binary form (0s and 1s), such as text files, images, videos, and audio.

 

(e) Data replication in HDFS

Data replication in HDFS is the process of storing multiple copies of each data block on different DataNodes to ensure fault tolerance and data availability. The default replication factor is 3.
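The effect of replication on storage can be worked out with simple arithmetic. The sketch below is illustrative only: the function name hdfs_storage is hypothetical, and the 128 MB block size and replication factor of 3 are the usual defaults, assumed here rather than read from a real cluster.

```python
import math

def hdfs_storage(file_size_mb, block_size_mb=128, replication=3):
    """Return (blocks, block replicas, raw storage in MB) for one file.

    Assumed defaults: 128 MB blocks and 3 replicas per block.
    """
    blocks = math.ceil(file_size_mb / block_size_mb)
    replicas = blocks * replication
    # The last block only occupies its actual size, so raw storage
    # is the file size multiplied by the replication factor.
    raw_storage_mb = file_size_mb * replication
    return blocks, replicas, raw_storage_mb

print(hdfs_storage(1000))  # (8, 24, 3000)
```

A 1000 MB file is therefore split into 8 blocks, stored as 24 block replicas, and consumes about 3000 MB of raw disk space across the cluster.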

 

(f) Serialization in HDFS

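Serialization is the process of converting data objects into a byte stream so that they can be written to storage or transmitted over the network between nodes; deserialization converts the byte stream back into objects. Hadoop's own serialization format is based on the Writable interface.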
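As a rough illustration of the idea, the sketch below packs a record into bytes and reads it back using Python's struct module. This is an analogy, not Hadoop's Writable API (which is Java); the record layout and the function names serialize and deserialize are assumptions made for the example.

```python
import struct

HEADER = ">ifH"  # big-endian: 4-byte int, 4-byte float, 2-byte name length
HEADER_SIZE = struct.calcsize(HEADER)  # 10 bytes

def serialize(student_id, marks, name):
    """Turn a (hypothetical) student record into a byte stream."""
    name_bytes = name.encode("utf-8")
    return struct.pack(HEADER, student_id, marks, len(name_bytes)) + name_bytes

def deserialize(data):
    """Rebuild the record from the byte stream."""
    student_id, marks, name_len = struct.unpack(HEADER, data[:HEADER_SIZE])
    name = data[HEADER_SIZE:HEADER_SIZE + name_len].decode("utf-8")
    return student_id, marks, name

payload = serialize(7, 82.5, "Asha")
print(len(payload), deserialize(payload))  # 14 (7, 82.5, 'Asha')
```

A compact binary layout like this is the reason serialized records are cheap to store and to move between nodes.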

 

(g) Schedulers

Schedulers manage resource allocation and job execution in Hadoop. Common schedulers include FIFO, Fair Scheduler, and Capacity Scheduler.

 

(h) NameNode

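NameNode is the master node in HDFS. It manages the file system metadata, such as the directory tree, file names, block locations, and access permissions, while the data blocks themselves are stored on DataNodes.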

 

(i) ZooKeeper

ZooKeeper is a centralized coordination service used for configuration management, synchronization, and cluster monitoring in Hadoop.

 

(j) Execution modes of Pig

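Pig supports two main execution modes: Local mode (pig -x local), which runs on a single machine against the local file system, and MapReduce mode (pig -x mapreduce, the default), which runs on a Hadoop cluster against HDFS.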

 

SECTION B

(Attempt any three – 10 marks each)

 

2(a) Views in HIVE and Difference Between Internal and External Tables

Views in HIVE are virtual tables created using SELECT queries. They do not store data physically and simplify complex queries.
Internal (managed) tables: HIVE manages both the data and the metadata. The data is stored in the HIVE warehouse directory, and both data and metadata are deleted when the table is dropped.
External tables: HIVE manages only the metadata. The data stays at its external storage location, and dropping the table removes only the metadata.

 

2(b) MapReduce Framework and Its Working

MapReduce is a programming model for processing large datasets in parallel. It consists of two main functions: Map and Reduce.
The Map function processes input data and produces key-value pairs. The Reduce function aggregates and processes these pairs to generate final output.
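The classic word-count example shows the two functions at work. This is a single-process Python sketch of the model, not Hadoop code; map_fn, reduce_fn, and run_job are hypothetical names, and the grouping done inside run_job stands in for the framework's shuffle step.

```python
from collections import defaultdict

def map_fn(line):
    """Map: emit a (word, 1) pair for every word in the input line."""
    for word in line.lower().split():
        yield word, 1

def reduce_fn(key, values):
    """Reduce: aggregate all values that share the same key."""
    return key, sum(values)

def run_job(lines):
    """Tiny stand-in for the framework: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            grouped[key].append(value)
    return dict(reduce_fn(key, values) for key, values in grouped.items())

counts = run_job(["big data needs hadoop", "hadoop stores big data"])
print(counts)
# {'big': 2, 'data': 2, 'needs': 1, 'hadoop': 2, 'stores': 1}
```

On a real cluster the map calls run in parallel on different input splits, and the reduce calls run in parallel on different key groups.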

 

2(c) Structured, Semi-Structured, and Unstructured Data

Structured data is organized in a fixed format with a predefined schema, such as tables in a relational database.
Semi-structured data has a flexible, self-describing structure, such as XML and JSON.
Unstructured data has no predefined structure; examples include videos, images, and social media posts.
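The difference shows up as soon as the data is read by a program. The snippet below is a small sketch with made-up sample values: CSV stands in for structured data, JSON for semi-structured data, and a free-text review for unstructured data.

```python
import csv
import io
import json

# Structured: every row has the same fixed columns.
structured = "id,name,marks\n1,Asha,82\n2,Ravi,74\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: records are tagged, but fields may vary per record.
semi_structured = (
    '[{"id": 1, "name": "Asha", "skills": ["pig", "hive"]},'
    ' {"id": 2, "name": "Ravi"}]'
)
records = json.loads(semi_structured)

# Unstructured: no schema at all; structure must be extracted by processing.
unstructured = "Loved the new phone, but the battery drains fast!"
words = unstructured.lower().split()

print(rows[0]["marks"])          # 82 (read as a string by the csv module)
print("skills" in records[1])    # False: the second record lacks this field
print(len(words))                # 9
```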

 

2(d) Shuffle & Sort Phase and Reducer Phase

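The Shuffle & Sort phase transfers the intermediate key-value pairs from the Mappers to the Reducers and sorts them by key, so that all values for the same key arrive together at one Reducer.
The Reducer phase processes this sorted, grouped data and aggregates the values for each key to generate the final output.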
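The two phases can be imitated in a few lines of Python. This is a single-machine sketch with invented sample data (yearly temperature readings from two mappers); shuffle_and_sort and reducer are hypothetical names, and no network transfer or partitioning across reducers is modelled.

```python
from itertools import groupby
from operator import itemgetter

# Intermediate (year, temperature) pairs produced by two mappers.
mapper_outputs = [
    [("2021", 31), ("2020", 28)],
    [("2020", 35), ("2021", 29), ("2019", 30)],
]

def shuffle_and_sort(outputs):
    """Merge all mapper outputs, sort by key, and group values per key."""
    merged = [pair for output in outputs for pair in output]
    merged.sort(key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(merged, key=itemgetter(0))
    ]

def reducer(key, values):
    """Aggregate the grouped values; here, the maximum temperature."""
    return key, max(values)

grouped = shuffle_and_sort(mapper_outputs)
result = [reducer(key, values) for key, values in grouped]
print(grouped)  # [('2019', [30]), ('2020', [28, 35]), ('2021', [31, 29])]
print(result)   # [('2019', 30), ('2020', 35), ('2021', 31)]
```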

 

2(e) Benefits and 5V’s of Big Data

Big Data helps in better decision-making, cost reduction, improved customer experience, and innovation.
The 5V’s are Volume, Velocity, Variety, Veracity, and Value.

 

SECTION C

 

3(a) Hadoop Ecosystem Frameworks and Joins & Subqueries

The Hadoop ecosystem includes tools such as HDFS, MapReduce, HIVE, PIG, HBase, Sqoop, Flume, and ZooKeeper.
Joins combine data from multiple tables, while subqueries are queries within queries used for filtering or computation.
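A join and a subquery can be tried out without a cluster. The sketch below uses Python's built-in sqlite3 module as a stand-in for HIVE, so the exact HiveQL syntax and behaviour may differ; the student and dept tables and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER, name TEXT, dept_id INTEGER);
CREATE TABLE dept (id INTEGER, dept_name TEXT);
INSERT INTO student VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meena', 10);
INSERT INTO dept VALUES (10, 'CSE'), (20, 'ECE');
""")

# Join: combine rows from both tables on the matching department id.
joined = conn.execute("""
    SELECT s.name, d.dept_name
    FROM student s JOIN dept d ON s.dept_id = d.id
    ORDER BY s.id
""").fetchall()

# Subquery: the inner query finds the department id used by the outer filter.
in_cse = conn.execute("""
    SELECT name FROM student
    WHERE dept_id IN (SELECT id FROM dept WHERE dept_name = 'CSE')
    ORDER BY id
""").fetchall()

print(joined)  # [('Asha', 'CSE'), ('Ravi', 'ECE'), ('Meena', 'CSE')]
print(in_cse)  # [('Asha',), ('Meena',)]
```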

 

3(b) Statement for Developing a MapReduce Application

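The steps are: write the Mapper class, write the Reducer class, write the Driver class that configures the job (input/output paths, Mapper and Reducer classes, and output key/value types), package the program, and submit it to run on the cluster.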

 

4(a) Analytic Processes and Tools in Big Data

Analytic processes include data acquisition, storage, processing, analysis, and visualization.
Tools include Hadoop, Spark, HIVE, Pig, HBase, and other NoSQL databases.

 

4(b) Cluster Specification and Hadoop Cluster Setup

Cluster specification defines hardware, software, nodes, memory, and storage requirements.
Hadoop cluster setup involves configuring NameNode, DataNode, ResourceManager, and NodeManager.

 

5(a) Master-Slave and Peer-Peer Replication

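In master-slave replication, one master node accepts all data updates and propagates them to the slave nodes, which hold copies of the data and can serve read requests.
In peer-to-peer replication, there is no master: every node can accept reads and writes and shares equal responsibility for replicating data, so conflicting updates must be resolved.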
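The difference between the two models can be sketched with two toy classes. Both class names are hypothetical, updates are propagated immediately, and failures and conflicting writes are ignored, so this only illustrates who is allowed to accept a write.

```python
class MasterSlaveCluster:
    """Writes go to the single master, which copies them to every slave."""

    def __init__(self, n_slaves):
        self.master = {}
        self.slaves = [{} for _ in range(n_slaves)]

    def write(self, key, value):
        self.master[key] = value
        for slave in self.slaves:  # master pushes the update out
            slave[key] = value

    def read(self, key, slave_index=0):
        return self.slaves[slave_index].get(key)


class PeerToPeerCluster:
    """Any peer may accept a write and then shares it with the others."""

    def __init__(self, n_peers):
        self.peers = [{} for _ in range(n_peers)]

    def write(self, key, value, peer_index):
        self.peers[peer_index][key] = value
        for i, peer in enumerate(self.peers):
            if i != peer_index:  # accepting peer propagates the update
                peer[key] = value

    def read(self, key, peer_index):
        return self.peers[peer_index].get(key)


ms = MasterSlaveCluster(n_slaves=2)
ms.write("x", 1)
p2p = PeerToPeerCluster(n_peers=3)
p2p.write("y", 2, peer_index=1)
print(ms.read("x", slave_index=1), p2p.read("y", peer_index=0))  # 1 2
```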

 

5(b) HBase Concepts and ZooKeeper Role

HBase is a column-oriented NoSQL database built on HDFS.
ZooKeeper helps in coordination, leader election, and monitoring HBase clusters.

 

6(a) Anatomy of MapReduce Job Run

A MapReduce job involves job submission, input splitting, mapping, shuffling, reducing, and output generation.

 

6(b) Analysis vs Reporting

Analysis focuses on discovering insights and patterns, while reporting presents historical data in structured formats.

 

7(a) Compression and Serialization in Hadoop I/O

Compression reduces data size for faster processing and storage efficiency.
Serialization converts objects into byte streams for data transfer.
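The two steps are often combined: objects are first serialized and the resulting bytes are then compressed. The sketch below uses Python's json and zlib modules as stand-ins; Hadoop itself uses its own serialization classes and codecs such as gzip, bzip2, and Snappy, and the sample records are invented.

```python
import json
import zlib

# 1000 similar records: repetitive data compresses well.
records = [{"id": i, "dept": "CSE", "status": "PASS"} for i in range(1000)]

serialized = json.dumps(records).encode("utf-8")  # objects -> byte stream
compressed = zlib.compress(serialized)            # byte stream -> fewer bytes

# Reverse the steps to get the original objects back.
restored = json.loads(zlib.decompress(compressed).decode("utf-8"))

print(len(serialized), len(compressed))
print(restored == records)  # True
```

The compressed form is smaller than the serialized form, which means less disk usage and less data moved across the network during the shuffle.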

 

7(b) HBase Storage Mechanism and Table Creation Query

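HBase stores data in tables made up of rows, each identified by a row key. Columns are grouped into column families, and each cell (row key, column family:qualifier) holds timestamped versions of a value.
Query to create a table named 'student' with one column family 'info' (HBase shell):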
create 'student','info'
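The logical layout of such a table can be modelled as nested dictionaries. The TinyTable class below is a hypothetical in-memory sketch, not the HBase client API; it only shows how a value is addressed by row key, column family, qualifier, and timestamp.

```python
class TinyTable:
    """Toy model of an HBase table: row key -> column -> versions."""

    def __init__(self, name, families):
        self.name = name
        self.families = set(families)  # column families fixed at creation
        self.rows = {}  # row_key -> {(family, qualifier): [(ts, value), ...]}

    def put(self, row_key, column, value, ts):
        family, qualifier = column.split(":", 1)
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        row = self.rows.setdefault(row_key, {})
        versions = row.setdefault((family, qualifier), [])
        versions.append((ts, value))
        versions.sort(reverse=True)  # newest timestamp first

    def get(self, row_key, column):
        family, qualifier = column.split(":", 1)
        versions = self.rows.get(row_key, {}).get((family, qualifier))
        return versions[0][1] if versions else None


student = TinyTable("student", ["info"])
student.put("r1", "info:name", "Asha", ts=1)
student.put("r1", "info:name", "Asha K", ts=2)
print(student.get("r1", "info:name"))  # Asha K
```

New qualifiers such as info:city can be added at any time, but writing to a column family that was not declared at table creation is rejected, which mirrors how 'info' is fixed by the create statement.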
