(SEM VIII) THEORY EXAMINATION 2024-25 BIG DATA

B.Tech General

SECTION A – Short Answers (2 Marks Each) – Paragraph Style

 

a) What is Big Data?

Big Data refers to extremely large and complex datasets that cannot be processed efficiently using traditional data processing tools. These datasets are generated continuously from sources such as social media, sensors, mobile devices, transactions, and online platforms. Big Data requires advanced storage, processing, and analytical techniques to extract useful information and insights.

 

b) Explain the characteristics of Big Data.

The characteristics of Big Data describe the nature of the data and the challenges involved in handling it. Big Data is massive in size, generated at high speed, comes in different formats, and varies in quality and usefulness. These characteristics make traditional data management systems inadequate for Big Data processing.

 

c) Explain the four V’s of Big Data.

The four V’s of Big Data are Volume, Velocity, Variety, and Veracity. Volume refers to the huge amount of data generated daily. Velocity represents the speed at which data is generated and processed. Variety indicates different forms of data such as text, images, and videos. Veracity deals with the accuracy and reliability of the data.

 

d) Discuss applications of Big Data.

Big Data is widely used in healthcare for disease prediction, in finance for fraud detection, in e-commerce for recommendation systems, and in social media for sentiment analysis. It also plays an important role in smart cities, weather forecasting, and business decision-making.

 

e) What is Big Data Analytics?

Big Data Analytics is the process of examining large and complex datasets to uncover hidden patterns, trends, and relationships. It uses advanced analytical techniques such as machine learning, data mining, and statistical analysis to support better decision-making.

 

f) Discuss challenges of Big Data.

Big Data faces challenges such as data storage, data security, data integration, scalability, and processing speed. Managing data quality and ensuring privacy are also major concerns when dealing with large-scale data systems.

 

g) Differentiate between structured and unstructured data.

Structured data is organized in a predefined format such as tables and databases, making it easy to store and analyze. Unstructured data does not follow a fixed structure and includes text, images, audio, and videos, which require advanced tools for processing.
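
The difference can be illustrated with a minimal Python sketch. The sample data, the column names (order_id, amount), and the "number followed by rupees" pattern are all invented for illustration. The structured input can be queried directly by column name, while the unstructured input needs a hand-written extraction rule first.

```python
import csv
import io
import re

# Hypothetical sample data, made up for this illustration.
STRUCTURED = "order_id,amount\n101,250\n102,400\n"
UNSTRUCTURED = (
    "Order 101 arrived late. Paid 250 rupees. "
    "Order 102 was fine, cost 400 rupees."
)


def total_from_structured(text):
    """Structured data: the header row names each column, so we read by name."""
    reader = csv.DictReader(io.StringIO(text))
    return sum(int(row["amount"]) for row in reader)


def total_from_unstructured(text):
    """Unstructured data: no schema, so we rely on a fragile text pattern."""
    return sum(int(value) for value in re.findall(r"(\d+)\s+rupees", text))


if __name__ == "__main__":
    print(total_from_structured(STRUCTURED))
    print(total_from_unstructured(UNSTRUCTURED))
```

Both functions arrive at the same total here, but the second one breaks as soon as the wording of the text changes, which is why unstructured data usually needs more advanced processing tools.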

 

h) Differentiate between HDFS and HBase.

HDFS is a distributed file system designed for storing large files with high fault tolerance, while HBase is a NoSQL database built on top of HDFS for real-time read and write access. HDFS is optimized for batch processing, whereas HBase supports random access.

 

i) What is ZooKeeper? List its benefits.

ZooKeeper is a centralized coordination service used in distributed systems. It provides services such as configuration management, synchronization, and leader election. ZooKeeper improves reliability, simplifies coordination, and ensures consistent system operation.

 

j) Differentiate between Apache Pig and MapReduce.

Apache Pig is a high-level data processing platform whose Pig Latin scripts are compiled into MapReduce jobs, making programs easier to write. MapReduce is a low-level programming model in which the developer writes the map and reduce functions directly, typically in Java, which requires more complex coding. Pig simplifies development, while MapReduce offers greater control over processing.

 

SECTION B – Descriptive Answers (10 Marks Each) – Paragraph Style

 

a) Explain how Big Data processing is different from distributed processing.

Big Data processing focuses on handling extremely large and diverse datasets using scalable frameworks such as Hadoop and Spark. While distributed processing divides tasks across multiple systems, Big Data processing additionally manages data variety, velocity, and fault tolerance. It emphasizes data locality, parallel computation, and scalability, which go beyond traditional distributed systems.

 

b) Discuss Hadoop YARN in detail with failures in classic MapReduce.

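Hadoop YARN (Yet Another Resource Negotiator) is the resource management layer introduced in Hadoop 2; it separates resource management and job scheduling from data processing. In classic MapReduce (Hadoop 1), a single JobTracker handled both resource management and job scheduling and monitoring, while TaskTrackers ran tasks in fixed map and reduce slots. This tight coupling led to scalability issues, inefficient resource utilization, and a single point of failure in the JobTracker. YARN overcomes these limitations by splitting the work between a cluster-wide ResourceManager, per-node NodeManagers, and a per-application ApplicationMaster, which lets multiple processing engines share the cluster and improves cluster efficiency.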

 

c) Explain the MapReduce framework in detail.

MapReduce is a programming model used for processing large datasets in parallel. It consists of a Map phase that processes input data and generates key-value pairs, and a Reduce phase that aggregates results. The framework ensures fault tolerance, scalability, and efficient distributed processing.

 

d) What are NameNode and DataNode in Hadoop architecture?

The NameNode is the master node responsible for managing metadata and file system structure in HDFS. DataNodes are worker nodes that store actual data blocks. Together, they ensure reliable data storage and retrieval in a distributed environment.
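
The division of work between the two node types can be sketched with a toy in-memory model in Python. This is not the Hadoop API: the class and function names, the 4-byte block size, the replication factor of 2, and the round-robin placement are all assumptions chosen to keep the example small (HDFS itself commonly uses 128 MB blocks, a default replication factor of 3, and rack-aware placement). The point it shows is that the NameNode holds only metadata, while the DataNodes hold the bytes.

```python
BLOCK_SIZE = 4    # bytes per block; tiny on purpose for the demo
REPLICATION = 2   # copies kept of each block in this toy model


class DataNode:
    """Worker node: stores the actual block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes


class NameNode:
    """Master node: stores only metadata (which blocks live on which nodes)."""

    def __init__(self, datanode_names):
        self.datanode_names = list(datanode_names)
        self.metadata = {}  # filename -> [(block_id, [datanode names])]

    def allocate(self, filename, num_blocks):
        # Assumed placement policy: simple round-robin over the DataNodes.
        count = len(self.datanode_names)
        entries = []
        for index in range(num_blocks):
            targets = [
                self.datanode_names[(index + r) % count]
                for r in range(REPLICATION)
            ]
            entries.append((f"{filename}#{index}", targets))
        self.metadata[filename] = entries
        return entries

    def locate(self, filename):
        return self.metadata[filename]


def build_cluster(names):
    datanodes = {name: DataNode(name) for name in names}
    return NameNode(names), datanodes


def write_file(namenode, datanodes, filename, data):
    """Client asks the NameNode where to put blocks, then writes to DataNodes."""
    num_blocks = (len(data) + BLOCK_SIZE - 1) // BLOCK_SIZE
    plan = namenode.allocate(filename, num_blocks)
    for index, (block_id, targets) in enumerate(plan):
        chunk = data[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
        for name in targets:
            datanodes[name].blocks[block_id] = chunk


def read_file(namenode, datanodes, filename, failed=()):
    """Client looks up block locations, then reads each block from a live replica."""
    parts = []
    for block_id, locations in namenode.locate(filename):
        live = [name for name in locations if name not in failed]
        if not live:
            raise IOError(f"no live replica for {block_id}")
        parts.append(datanodes[live[0]].blocks[block_id])
    return b"".join(parts)


if __name__ == "__main__":
    namenode, datanodes = build_cluster(["dn1", "dn2", "dn3"])
    write_file(namenode, datanodes, "notes.txt", b"hello hdfs!")
    print(namenode.locate("notes.txt"))
    print(read_file(namenode, datanodes, "notes.txt", failed={"dn2"}))
```

Because every block has two replicas on different nodes, the file can still be read when one DataNode is marked as failed, which is the reliability property the answer above describes.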

 

e) Differentiate between NoSQL and SQL databases.

SQL databases use structured schemas and relational models, making them suitable for structured data. NoSQL databases support flexible schemas and are designed for scalability and high availability, making them ideal for Big Data applications.
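
The schema difference can be sketched in Python using only the standard library. The relational side uses the built-in sqlite3 module; the document side uses a plain list of dictionaries as a stand-in for a document store, which is an assumption made to keep the example self-contained. The users table and its fields are hypothetical.

```python
import sqlite3


def make_sql_db():
    """Relational model: the schema is fixed before any row is inserted."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
        [(1, "Asha", "Pune"), (2, "Ravi", "Delhi")],
    )
    return conn


def sql_names_in_city(conn, city):
    rows = conn.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    return [name for (name,) in rows]


def sql_rejects_incomplete_row(conn):
    """A row that violates the schema (missing city) is refused."""
    try:
        conn.execute("INSERT INTO users (id, name) VALUES (3, 'Meena')")
    except sqlite3.IntegrityError:
        return True
    return False


# Document model stand-in: each record carries its own structure.
DOCUMENTS = [
    {"_id": 1, "name": "Asha", "city": "Pune"},
    {"_id": 2, "name": "Ravi", "tags": ["premium"], "address": {"city": "Delhi"}},
]


def document_names_in_city(documents, city):
    # Records without a top-level "city" field are simply skipped.
    return [doc["name"] for doc in documents if doc.get("city") == city]


if __name__ == "__main__":
    conn = make_sql_db()
    print(sql_names_in_city(conn, "Pune"))
    print(sql_rejects_incomplete_row(conn))
    print(document_names_in_city(DOCUMENTS, "Pune"))
    conn.close()
```

The relational table rejects a row that does not fit its schema, while the document list accepts records of different shapes and leaves it to the query code to cope with missing fields.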

 

SECTION C – Long Answer (10 Marks Each) – Paragraph Style

 

a) What is MapReduce? Explain the working of various phases of MapReduce with example and diagram.

MapReduce is a distributed computing framework used for processing large datasets. The Map phase reads input data and converts it into key-value pairs. The Shuffle and Sort phase groups the values that share a key and routes each key to a reducer. The Reduce phase processes these grouped values to produce the final output. For example, in a word count program, the Map phase emits a (word, 1) pair for every word, the Shuffle and Sort phase gathers the pairs belonging to each word, and the Reduce phase sums them to give the total occurrences of that word.
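
The word count example can be sketched as a single-process Python simulation of the three phases. This is not Hadoop code: the function names are illustrative, and the real framework runs many mappers and reducers in parallel across the cluster, whereas here everything runs in one process.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Shuffle and Sort: group the values by key and order the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_and_sort(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped)


if __name__ == "__main__":
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
    # {'big': 2, 'data': 2, 'everywhere': 1, 'is': 2}
```

Note that the mapper does no counting of its own; it only emits a 1 per word, and the totals appear only after the reducer sums the grouped values.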

 

OR

 

b) Explain the working of Hive with proper steps and diagram.

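Apache Hive is a data warehousing tool built on Hadoop that allows querying large datasets using HiveQL, an SQL-like language. The working proceeds in steps: data is stored in HDFS, and the table definitions are kept in the Hive metastore; the user submits a query written in HiveQL; Hive parses and compiles the query into an execution plan and converts it into MapReduce jobs (newer versions can also run on Tez or Spark); the execution engine runs these jobs on the cluster and returns the results. This enables easy data analysis without complex programming.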
