(SEM VI) THEORY EXAMINATION 2023-24 BIG DATA


KCS061 – BIG DATA (B.Tech Sem VI, 2023–24)

The answers are written in plain, full-sentence prose rather than short points, and they follow the question paper (both pages) in order.



SECTION A

Attempt all questions in brief (2 × 10 = 20 marks)


(a) Types of digital data in Big Data with examples

Digital data in Big Data applications is broadly classified into structured, semi-structured, and unstructured data. Structured data is well organized in rows and columns, such as data stored in relational databases like student records or bank transactions. Semi-structured data does not follow a strict table format but contains tags or markers, such as XML and JSON files used in web applications. Unstructured data has no predefined structure and includes text documents, images, videos, audio files, emails, and social media posts.
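
The three categories can be illustrated with a small Python sketch. The student records, user profiles, and review text below are made-up sample values, not data from any real system; the point is only to show how each kind of data is accessed.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (made-up student records).
structured = io.StringIO("roll_no,name,marks\n101,Asha,82\n102,Ravi,74\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing keys, but records may differ in shape.
semi_structured = json.loads(
    '[{"user": "asha", "tags": ["hadoop", "hive"]},'
    ' {"user": "ravi", "location": {"city": "Pune"}}]'
)

# Unstructured: free text with no schema; any structure must be derived by processing.
unstructured = "Loved the new phone, but the battery drains too fast!"
word_count = len(unstructured.split())

print(rows[0]["name"], rows[0]["marks"])   # fields are addressed by column name
print(sorted(semi_structured[1].keys()))   # keys vary from record to record
print(word_count)                          # only derived features are available
```

Notice that the structured rows can be queried by column straight away, the JSON records have to be inspected key by key, and the free text yields nothing until some processing (here, a simple word count) is applied.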


(b) What constitutes a Big Data platform?

A Big Data platform consists of components for data ingestion, storage, processing, analysis, and visualization. It includes distributed storage systems like HDFS, processing frameworks such as MapReduce or Spark, resource management tools like YARN, analytics tools, and visualization or reporting tools. Together, these components enable handling of large, complex datasets efficiently.


(c) Hadoop Streaming

Hadoop Streaming is a utility that allows users to write MapReduce programs using any programming language that can read from standard input and write to standard output, such as Python or Perl, instead of using Java.
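
As an illustration, a word-count job for Hadoop Streaming could be sketched in Python as follows. The function names map_words and reduce_counts and the "map"/"reduce" command-line argument are choices made for this example, not part of Hadoop; the only contract Hadoop Streaming imposes is reading lines from standard input and writing tab-separated key/value lines to standard output.

```python
import sys
from itertools import groupby


def map_words(lines):
    """Mapper: emit one 'word<TAB>1' line for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reduce_counts(sorted_lines):
    """Reducer: sum the counts per word. Input must already be sorted by key,
    which is what the shuffle-and-sort phase guarantees between map and reduce."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines if line.strip())
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Run as "python wordcount.py map" or "python wordcount.py reduce" (hypothetical file name).
if __name__ == "__main__" and sys.argv[1:2] in (["map"], ["reduce"]):
    step = map_words if sys.argv[1] == "map" else reduce_counts
    for record in step(sys.stdin):
        print(record)
```

The same script can be checked locally by piping a text file through the map step, the Unix sort command, and then the reduce step. On a cluster, the two commands would be supplied through the -mapper and -reducer options of the Hadoop Streaming jar, whose location depends on the installation.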


(d) Data formats used in Hadoop environments

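Common data formats in Hadoop include text files, SequenceFiles, Avro, Parquet, and ORC. Text files (such as CSV or line-delimited JSON) are simple and human-readable but bulky and slow to parse. SequenceFiles store binary key-value pairs. Avro is a row-oriented binary format with strong support for schema evolution, while Parquet and ORC are columnar formats that compress well and speed up queries that read only a few columns.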


(e) File sizes, block sizes, and block abstraction in HDFS

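In HDFS, files are split into large fixed-size blocks. The default block size is 128 MB in Hadoop 2.x and later (64 MB in Hadoop 1.x), and it can be changed in the configuration. A file is stored as multiple blocks spread across different DataNodes, and each block is replicated, three times by default. The last block of a file may be smaller than the block size, and a file smaller than one block uses only as much disk space as its actual size. Block abstraction allows HDFS to manage storage and replication independently of the file structure, so a single file can be larger than any one disk, which improves scalability and fault tolerance.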
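
The arithmetic behind block splitting can be sketched in Python. The 128 MB block size and replication factor of 3 used here are the common Hadoop defaults and are assumptions for this example; both are configurable on a real cluster. The 300 MB file is a made-up value.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB   # assumed default block size; configurable
REPLICATION = 3         # assumed default replication factor; configurable


def block_layout(file_size, block_size=BLOCK_SIZE):
    """Return the sizes of the blocks a file of file_size bytes is split into."""
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def raw_storage(file_size, replication=REPLICATION):
    """Bytes consumed across the cluster once every block is replicated."""
    return file_size * replication


blocks = block_layout(300 * MB)
print(len(blocks), [size // MB for size in blocks])  # 3 blocks of 128, 128 and 44 MB
print(raw_storage(300 * MB) // MB)                   # 900 MB of raw disk space
```

The final block holds only the remaining 44 MB, which shows that the block size is an upper limit on each piece rather than a fixed amount of disk reserved per block.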


(f) Benefits and challenges of using HDFS

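HDFS provides high fault tolerance, scalability, and cost-effective storage on commodity hardware. However, it is not suitable for low-latency access, and it handles a large number of small files poorly because the NameNode keeps the metadata of every file and block in memory. Files are written once and then read many times, so HDFS works best with batch processing rather than real-time operations.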


(g) Fair Scheduler and Capacity Scheduler

The Fair Scheduler ensures that all applications get a fair share of cluster resources over time. It is useful in multi-user environments. The Capacity Scheduler divides resources into queues with guaranteed capacity, making it suitable for large organizations with multiple teams sharing a Hadoop cluster.
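
The difference between the two policies can be sketched with a deliberately simplified Python model. It is not how YARN implements either scheduler: it ignores weights, minimum shares, preemption, and queue elasticity, and the application and queue names are made up.

```python
def fair_shares(total_containers, active_apps):
    """Fair policy (simplified): running apps split the cluster equally."""
    if not active_apps:
        return {}
    share = total_containers / len(active_apps)
    return {app: share for app in active_apps}


def capacity_shares(total_containers, queue_percent):
    """Capacity policy (simplified): each queue is guaranteed its configured percentage."""
    if sum(queue_percent.values()) != 100:
        raise ValueError("queue capacities must sum to 100")
    return {queue: total_containers * pct / 100 for queue, pct in queue_percent.items()}


# One app alone gets the whole cluster; a second app arriving halves its share.
print(fair_shares(100, ["etl_job"]))
print(fair_shares(100, ["etl_job", "adhoc_query"]))

# Queues keep their guaranteed share regardless of how many apps each one runs.
print(capacity_shares(100, {"production": 70, "development": 30}))
```

In the fair model the share of each application changes as applications start and finish, while in the capacity model the guarantee belongs to the queue, which is why it fits organizations that divide one cluster among several teams.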


(h) YARN

YARN (Yet Another Resource Negotiator) is Hadoop’s resource management layer. It manages cluster resources and schedules applications, allowing multiple data processing engines to run on the same cluster.


(i) Apache Pig

Apache Pig is a high-level data processing platform used with Hadoop. It uses a scripting language called Pig Latin, which simplifies writing complex data transformations compared to raw MapReduce code.


(j) Grunt shell in Apache Pig

The Grunt shell is an interactive command-line interface of Apache Pig. It allows users to execute Pig Latin commands interactively, test scripts, and debug data processing logic.


SECTION B

Attempt any three (3 × 10 = 30 marks)


(a) Data analysis vs reporting in Big Data

Reporting focuses on summarizing historical data using predefined queries, dashboards, and charts. It answers questions like “what happened” and “when it happened.” Data analysis, especially advanced analytics, goes beyond reporting by exploring data to identify hidden patterns, correlations, and trends. Techniques such as machine learning, predictive analytics, and data mining help organizations make future-oriented decisions rather than just reviewing past performance.
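
A small Python sketch can make the contrast concrete. The monthly sales figures are made up, and a least-squares trend line is used here only as the simplest stand-in for predictive analytics; real analysis would use richer models and far more data.

```python
sales = [100, 110, 125, 130, 145, 150]  # made-up monthly sales figures

# Reporting: summarize what has already happened.
report = {"total": sum(sales), "best_month": sales.index(max(sales)) + 1}


def linear_trend(values):
    """Analysis: fit y = slope * month + intercept by ordinary least squares."""
    n = len(values)
    months = range(1, n + 1)
    mean_x = sum(months) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, values)) / sum(
        (x - mean_x) ** 2 for x in months
    )
    return slope, mean_y - slope * mean_x


slope, intercept = linear_trend(sales)
forecast = slope * (len(sales) + 1) + intercept  # projected sales for month 7

print(report)
print(round(slope, 2), round(forecast, 1))
```

The report only restates the past (a total of 760 with month 6 as the best month), whereas the trend line estimates growth of roughly 10 units per month and projects a value for a month that has not happened yet.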


(b) Apache Hadoop and its role in Big Data processing

Apache Hadoop is an open-source framework designed to store and process large datasets across clusters of computers. Its core components include HDFS for distributed storage, MapReduce for distributed processing, YARN for resource management, and Hadoop Common for utilities. These components work together to provide scalable, fault-tolerant Big Data processing.


(c) Core concepts of HDFS: NameNode and DataNode

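HDFS follows a master-slave architecture. The NameNode is the master: it maintains the file system metadata, such as file names, the directory tree, access permissions, and the mapping of each file to its blocks and their locations. DataNodes are the workers: they store the actual data blocks, serve read and write requests from clients, and send regular heartbeats and block reports to the NameNode. A client first asks the NameNode where the blocks are and then transfers data directly with the DataNodes. The NameNode coordinates data placement and re-replicates blocks when a DataNode fails, ensuring reliability and efficient data access across the cluster.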
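
The division of work between the two roles can be sketched with a toy Python model. The class ToyNameNode, its method names, and the round-robin placement are all inventions for this example; real HDFS uses rack-aware placement and a much richer metadata structure.

```python
class ToyNameNode:
    """Holds metadata only: which blocks make up a file and where the replicas live."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = list(datanodes)
        self.replication = min(replication, len(self.datanodes))
        self.block_map = {}  # filename -> list of (block_id, [datanode, ...])
        self._cursor = 0     # round-robin start position (real HDFS is rack-aware)

    def add_file(self, filename, num_blocks):
        """Choose DataNodes for every replica of every block of a new file."""
        blocks = []
        for index in range(num_blocks):
            replicas = [
                self.datanodes[(self._cursor + offset) % len(self.datanodes)]
                for offset in range(self.replication)
            ]
            self._cursor += 1
            blocks.append((f"{filename}#blk{index}", replicas))
        self.block_map[filename] = blocks
        return blocks

    def locate(self, filename):
        """What a client asks for before reading blocks directly from DataNodes."""
        return self.block_map[filename]


namenode = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
namenode.add_file("logs.txt", 2)
for block_id, replicas in namenode.locate("logs.txt"):
    print(block_id, replicas)
```

The model stores no file contents at all, which mirrors the real design: the NameNode answers "where is it?" and the DataNodes hold and serve the bytes.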
 

(d) NoSQL databases and their benefits

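NoSQL databases are non-relational databases designed to handle large volumes of unstructured and semi-structured data. They fall into four main types: key-value, document, column-family, and graph stores. They offer schema flexibility, horizontal scalability across many machines, high performance for simple read and write operations, and fault tolerance through replication. In exchange, many of them relax the strict consistency guarantees and join support of relational databases. For workloads that need flexible schemas and scale-out storage, NoSQL systems like MongoDB and Cassandra are often better suited to Big Data applications than traditional relational databases.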
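
Schema flexibility and horizontal scaling can be sketched with a toy document store in Python. The class ToyDocumentStore and the user records are made up, and simple hash-modulo sharding is used only for illustration; production systems such as Cassandra and MongoDB use more sophisticated partitioning schemes.

```python
import hashlib


class ToyDocumentStore:
    """Spreads schema-free documents over several shards by hashing the key."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]  # each dict stands in for one server

    def _shard_for(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, document):
        self.shards[self._shard_for(key)][key] = document

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)


store = ToyDocumentStore(num_shards=3)
# Documents in the same store need not share the same fields.
store.put("user:1", {"name": "Asha", "email": "asha@example.com"})
store.put("user:2", {"name": "Ravi", "tags": ["hadoop", "pig"], "age": 24})

print(store.get("user:2"))
print([len(shard) for shard in store.shards])
```

Because the shard is computed from the key alone, any client can find a document without a central lookup, and capacity grows by adding shards. With plain modulo hashing, changing the shard count moves most keys, which is one reason real systems use schemes such as consistent hashing.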
 

(e) Apache Hive architecture

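Apache Hive provides a data warehousing layer on Hadoop. Its architecture includes user interfaces (command line, JDBC/ODBC clients), a driver that manages the life cycle of a query, a HiveQL parser, a compiler, an optimizer, and an execution engine. A separate Metastore holds table schemas, partitions, and the location of the data in HDFS. Hive translates SQL-like queries into MapReduce, Tez, or Spark jobs, allowing users to analyze large datasets without writing complex code.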
 

SECTION C

Attempt any one (1 × 10 = 10 marks)
 

(a) The 5 Vs of Big Data

The 5 Vs of Big Data are Volume, Velocity, Variety, Veracity, and Value. Volume refers to the massive amount of data generated. Velocity describes the speed at which data is produced and processed. Variety represents different data types and formats. Veracity deals with data quality and reliability. Value focuses on extracting meaningful insights that support decision-making.
 

(b) Real-world applications of Big Data analytics

In healthcare, Big Data helps in disease prediction and personalized treatment. In finance, it is used for fraud detection and risk analysis. E-commerce platforms analyze customer behavior to recommend products. Transportation systems use Big Data for traffic management, route optimization, and predictive maintenance.

Uploader
SuGanta International