(SEM VIII) THEORY EXAMINATION 2023-24 BIG DATA


SECTION A

(Attempt all | 2 × 10 = 20 Marks)

 

a. List any five Big Data platforms
Apache Hadoop, Apache Spark, Apache Flink, Apache Storm, Google BigQuery.

 

b. Importance of Hadoop technology in Big Data Analytics
Hadoop enables distributed storage and parallel processing of large datasets at low cost, providing scalability, fault tolerance, and high availability.

 

c. Three benefits of MapReduce
MapReduce offers scalability, fault tolerance, and parallel processing of large data across clusters.

 

d. Define heartbeat in HDFS
A heartbeat is a periodic signal (every 3 seconds by default) sent by each DataNode to the NameNode to indicate that it is active and functioning properly.

 

e. List any five Big Data platforms
Apache Hadoop, Apache Spark, Cassandra, MongoDB, Amazon EMR.

 

f. Define data replication in HDFS
Data replication is the process of storing multiple copies of each data block (three by default) on different DataNodes to ensure fault tolerance and data availability.

 

g. Name any two data ingestion tools in Hadoop
Apache Flume and Apache Sqoop.

 

h. Compare NoSQL and Relational Databases
Relational databases use structured schema and SQL, while NoSQL databases support flexible schema and handle large-scale unstructured data.

 

i. Advantages of Scala over Java
Scala supports functional programming, concise syntax, and immutability by default. Because Apache Spark is itself written in Scala, Scala code uses Spark's native API directly.

 

j. Differentiate between Pig and Hive
Pig uses procedural language (Pig Latin) for data flow, while Hive uses declarative SQL-like language (HiveQL) for querying data.

 

SECTION B

(Attempt any THREE | 3 × 10 = 30 Marks)

 

2(a) Structured, Semi-Structured & Unstructured Data

Structured data is organized in rows and columns such as databases and spreadsheets.
Semi-structured data has tags or markers like XML and JSON files.
Unstructured data has no predefined format, such as videos, images, emails, and social media posts.
Big Data technologies handle all three types efficiently.

 

2(b) Anatomy of a MapReduce Job Run

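A MapReduce job begins by dividing the input into input splits, each handled by one map task. The Map phase processes records into key-value pairs. The Shuffle and Sort phase groups the values by key and delivers them to the reducers. The Reduce phase aggregates the results and writes the output to HDFS. In Hadoop 1, the JobTracker coordinates tasks while TaskTrackers execute them; in Hadoop 2 and later (YARN), the ResourceManager, a per-job ApplicationMaster, and the NodeManagers take over these roles.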
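
The phases of a job run can be sketched in plain Python. This is only a single-process illustration of map, shuffle and sort, and reduce using word count; the function names (map_phase, shuffle_and_sort, reduce_phase, run_job) and the sample input are assumptions made for this example, not Hadoop APIs.

```python
from collections import defaultdict

def map_phase(record):
    """Map: turn one input line into (key, value) pairs."""
    for word in record.lower().split():
        yield word, 1

def shuffle_and_sort(pairs):
    """Shuffle and sort: group all values by key, ordered by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_phase(key, values):
    """Reduce: aggregate the values collected for one key."""
    return key, sum(values)

def run_job(splits):
    # On a cluster, each split would be processed by a separate map task.
    mapped = [pair for split in splits for line in split for pair in map_phase(line)]
    grouped = shuffle_and_sort(mapped)
    return [reduce_phase(key, values) for key, values in grouped]

splits = [["big data big"], ["data tools"]]  # two hypothetical input splits
result = run_job(splits)
print(result)  # [('big', 2), ('data', 2), ('tools', 1)]
```

On a real cluster the map and reduce tasks run on different nodes and the shuffle moves data over the network; here all three phases run in one process.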

 

2(c) Design and Concept of HDFS

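HDFS follows a master-slave architecture in which the NameNode manages metadata and the DataNodes store data blocks. Files are split into large blocks (128 MB by default in Hadoop 2 and later), and each block is replicated across DataNodes (three copies by default). HDFS provides high fault tolerance and scalability, and it is optimized for batch processing with write-once, read-many access.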
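
The effect of block size and replication on storage can be worked out with simple arithmetic. The sketch below assumes the common defaults of a 128 MB block size and a replication factor of 3; the helper name block_layout is hypothetical.

```python
import math

BLOCK_SIZE_MB = 128      # assumed default block size (Hadoop 2 and later)
REPLICATION_FACTOR = 3   # assumed default replication factor

def block_layout(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION_FACTOR):
    """Return (number of blocks, size of last block in MB, raw storage in MB)."""
    if file_size_mb <= 0:
        return 0, 0, 0
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    # The last block only occupies as much space as the data it holds.
    last_block_mb = file_size_mb - (num_blocks - 1) * block_size_mb
    raw_storage_mb = file_size_mb * replication
    return num_blocks, last_block_mb, raw_storage_mb

print(block_layout(500))  # (4, 116, 1500)
```

A 500 MB file therefore occupies four blocks (three full blocks and one of 116 MB) and uses 1500 MB of raw cluster storage once all three replicas are written.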

 

2(d) CRUD operations in MongoDB

CRUD stands for Create, Read, Update, and Delete.
Create inserts documents using insertOne() or insertMany().
Read retrieves documents using find().
Update modifies documents using updateOne() or updateMany().
Delete removes documents using deleteOne() or deleteMany().
MongoDB stores data in flexible JSON-like (BSON) documents.
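
The four CRUD operations can be illustrated without a running database. The class below is a hypothetical in-memory stand-in, not MongoDB or a MongoDB driver: its method names loosely mirror the insert_one, find, update_one, and delete_one calls of the PyMongo driver, but it supports only equality filters and the $set operator, and it returns plain values rather than the driver's result objects.

```python
class InMemoryCollection:
    """Hypothetical stand-in for a document collection (equality filters only)."""

    def __init__(self):
        self._docs = []
        self._next_id = 1

    def _matches(self, doc, query):
        return all(doc.get(field) == value for field, value in query.items())

    def insert_one(self, doc):
        stored = dict(doc)
        stored.setdefault("_id", self._next_id)  # assign an id if none is given
        self._next_id += 1
        self._docs.append(stored)
        return stored["_id"]

    def find(self, query=None):
        query = query or {}
        return [dict(doc) for doc in self._docs if self._matches(doc, query)]

    def update_one(self, query, update):
        for doc in self._docs:
            if self._matches(doc, query):
                doc.update(update.get("$set", {}))  # only $set is supported here
                return 1
        return 0

    def delete_one(self, query):
        for index, doc in enumerate(self._docs):
            if self._matches(doc, query):
                del self._docs[index]
                return 1
        return 0

students = InMemoryCollection()
students.insert_one({"name": "Asha", "branch": "CSE"})               # Create
students.insert_one({"name": "Ravi", "branch": "IT"})
print(students.find({"branch": "CSE"}))                               # Read
students.update_one({"name": "Ravi"}, {"$set": {"branch": "CSE"}})   # Update
students.delete_one({"name": "Asha"})                                 # Delete
print(students.find())
```

Documents in the same collection need not share the same fields, which is what the flexible schema refers to.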

 

2(e) Role of ZooKeeper in HBase

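ZooKeeper provides coordination, synchronization, and configuration services for HBase components. It elects the active HMaster, tracks which RegionServers are alive, and tells clients where to find the hbase:meta catalog table. This ensures reliability, fault tolerance, and consistency in a distributed environment.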

 

SECTION C

 

3(a) 5 Vs of Big Data and their implications

The 5 Vs are:

Volume: Huge amount of data
Velocity: Speed of data generation
Variety: Different data formats
Veracity: Data quality and accuracy
Value: Useful insights from data

These characteristics require specialized tools for storage, processing, and analytics.

 

3(b) Components of Big Data Architecture

Components include data sources, data ingestion layer, storage layer (HDFS/NoSQL), processing layer (MapReduce/Spark), analytics layer, and visualization tools. Together they enable end-to-end Big Data processing.

 

4(a) HDFS Architecture & Fault Tolerance

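HDFS uses a NameNode, DataNodes, and a Secondary NameNode, which periodically checkpoints the NameNode's metadata and is not a hot standby. Fault tolerance is achieved through data replication, heartbeat monitoring, and automatic re-replication of blocks that become under-replicated when a DataNode fails.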
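
Heartbeat monitoring and re-replication can be sketched as follows. This is a simplified model, not NameNode code: the timeout of 630 seconds reflects the usual default of about 10.5 minutes before a DataNode is declared dead, and the node names, block IDs, and function names are made up for illustration.

```python
HEARTBEAT_TIMEOUT_S = 630  # assumed default: about 10.5 minutes without a heartbeat

def live_datanodes(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT_S):
    """Return the DataNodes whose last heartbeat is recent enough."""
    return {node for node, seen in last_heartbeat.items() if now - seen <= timeout}

def under_replicated(block_locations, live, target=3):
    """Map each under-replicated block to the number of replicas it still needs."""
    missing = {}
    for block, nodes in block_locations.items():
        alive = len(set(nodes) & live)
        if alive < target:
            missing[block] = target - alive
    return missing

# Hypothetical cluster state: timestamps are seconds, dn3 has been silent too long.
last_heartbeat = {"dn1": 1000, "dn2": 995, "dn3": 200, "dn4": 998}
block_locations = {
    "blk_1": ["dn1", "dn2", "dn3"],
    "blk_2": ["dn1", "dn2", "dn4"],
}

live = live_datanodes(last_heartbeat, now=1000)
print(sorted(live))                             # ['dn1', 'dn2', 'dn4']
print(under_replicated(block_locations, live))  # {'blk_1': 1}
```

In this model blk_1 has lost one replica along with dn3, so one new copy would be scheduled on a live DataNode.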

 

4(b) Hadoop Streaming and Pipes

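Hadoop Streaming allows MapReduce programs to be written in any language that can read standard input and write standard output, such as Python or Perl. Hadoop Pipes is the C++ interface to MapReduce; it communicates with the framework over sockets rather than standard input/output.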
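
A Streaming mapper and reducer are ordinary programs that exchange tab-separated key-value lines. The sketch below finds the maximum temperature per year; the input format ("year,temperature") and the sample values are assumptions for this example. In a real job the two functions would live in separate scripts reading sys.stdin and writing sys.stdout; here in-memory streams stand in for the pipeline so the example runs locally.

```python
import io
from itertools import groupby

def mapper(stdin, stdout):
    """Read 'year,temperature' lines and emit 'year<TAB>temperature'."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        year, temp = line.split(",")
        stdout.write(f"{year}\t{temp}\n")

def reducer(stdin, stdout):
    """Read key-sorted 'year<TAB>temperature' lines and emit the maximum per year."""
    rows = (line.rstrip("\n").split("\t") for line in stdin if line.strip())
    for year, group in groupby(rows, key=lambda row: row[0]):
        stdout.write(f"{year}\t{max(int(temp) for _, temp in group)}\n")

# Local stand-in for the pipeline: input -> mapper -> sort -> reducer
raw = io.StringIO("1950,22\n1949,111\n1950,-11\n1949,78\n")
mapped = io.StringIO()
mapper(raw, mapped)
sorted_lines = io.StringIO("".join(sorted(mapped.getvalue().splitlines(keepends=True))))
reduced = io.StringIO()
reducer(sorted_lines, reduced)
print(reduced.getvalue())
```

The reducer relies on its input being sorted by key, which the framework guarantees between the map and reduce phases; the explicit sort above plays that role.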

 

5(a) Client Read and Write Operations in HDFS

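For a write operation, the client asks the NameNode to create the file and allocate blocks, then streams each block to the first DataNode in a pipeline, which forwards it to the remaining replicas.
For a read operation, the client fetches the block locations from the NameNode and reads each block directly from the nearest DataNode.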

 

5(b) Cluster Specification & Hadoop Cluster Setup

A cluster specification covers the number of nodes and the CPU, RAM, storage, and network bandwidth of each node.
Setting up a Hadoop cluster involves installing Hadoop, configuring core-site.xml, hdfs-site.xml, and yarn-site.xml, formatting the NameNode (hdfs namenode -format), and starting the Hadoop services (start-dfs.sh and start-yarn.sh).
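
Each of the configuration files uses the same property layout, which the sketch below generates from a dictionary. The property names fs.defaultFS, dfs.replication, and yarn.resourcemanager.hostname are standard Hadoop settings, but the host name, port, and values shown are placeholders to be replaced with the cluster's own, and the helper hadoop_config_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

def hadoop_config_xml(properties):
    """Render a dict of properties in Hadoop's <configuration> XML layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")

# Placeholder host and port; substitute the real NameNode address.
core_site = hadoop_config_xml({"fs.defaultFS": "hdfs://namenode.example.com:9000"})
hdfs_site = hadoop_config_xml({"dfs.replication": 3})
yarn_site = hadoop_config_xml({"yarn.resourcemanager.hostname": "namenode.example.com"})

print(core_site)
print(hdfs_site)
print(yarn_site)
```

The same files must be distributed to every node so that DataNodes and NodeManagers can locate the NameNode and ResourceManager.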

 

6(a) Features of Apache Spark & Integration with Hadoop

Spark provides fast in-memory processing and fault tolerance, and it supports batch, streaming, machine learning, and graph processing.
Spark integrates with Hadoop by reading and writing data in HDFS and by running on YARN alongside MapReduce jobs in the same cluster.

 

6(b) NameNode High Availability & HDFS Federation

High Availability removes the NameNode as a single point of failure by running an Active NameNode and a Standby NameNode that takes over on failure.
HDFS Federation allows multiple independent NameNodes to manage separate namespaces, which improves scalability.
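
With Federation, a client needs some way to decide which NameNode owns a given path. The sketch below shows the idea with a simple client-side mount table, similar in spirit to a ViewFs mount table but much simplified; the mount points, host names, and the function resolve_namenode are all hypothetical.

```python
# Hypothetical mount table: each top-level directory belongs to one namespace.
MOUNT_TABLE = {
    "/user": "hdfs://nn1.example.com",
    "/data": "hdfs://nn2.example.com",
    "/tmp": "hdfs://nn3.example.com",
}

def resolve_namenode(path, mount_table=MOUNT_TABLE):
    """Return the NameNode whose mount point is the longest prefix of the path."""
    best = None
    for mount in mount_table:
        if path == mount or path.startswith(mount + "/"):
            if best is None or len(mount) > len(best):
                best = mount
    if best is None:
        raise KeyError(f"no namespace mounted for {path}")
    return mount_table[best]

print(resolve_namenode("/data/logs/2024/app.log"))  # hdfs://nn2.example.com
print(resolve_namenode("/user/asha/notes.txt"))     # hdfs://nn1.example.com
```

Because each NameNode holds only its own namespace in memory, adding namespaces spreads the metadata load across more machines.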

 

7(a) Need of Pig & Execution Modes

Pig simplifies complex data processing using Pig Latin.
Execution modes are:
Local mode
MapReduce mode
Tez mode

 

7(b) Apache Hive Architecture

Hive architecture includes the UI, Driver, Compiler, Metastore, Execution Engine, and HDFS. Hive compiles HiveQL queries into MapReduce, Tez, or Spark jobs for execution.

Uploader
SuGanta International