(SEM VI) THEORY EXAMINATION 2023-24 BIG DATA AND ANALYTICS

KDS601 – BIG DATA AND ANALYTICS (B.Tech Sem VI, 2023–24)


All answers are written in simple, clear language, in full sentences rather than short bullet points, and follow the uploaded question paper (Page 1).




SECTION A

Attempt all questions in brief (2 × 10 = 20 marks)


(a) Differences between structured, semi-structured, and unstructured data

Structured data is highly organized and stored in rows and columns, such as data in relational databases. Semi-structured data does not follow a fixed schema but uses tags or markers, for example JSON and XML files. Unstructured data has no predefined format and includes text documents, images, videos, emails, and social media content.
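As a rough illustration, the same kind of customer information can appear in all three forms. The sample records below are invented for this sketch and are not part of the question paper.

```python
import csv
import io
import json

# Structured: fixed columns, every row follows the same schema.
structured = io.StringIO("id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing keys, and fields may differ per record.
semi_structured = json.loads(
    '[{"id": 1, "name": "Asha", "tags": ["prime"]}, {"id": 2, "name": "Ravi"}]'
)

# Unstructured: free text with no schema; meaning has to be extracted.
unstructured = "Asha from Pune wrote: the delivery was late but support was helpful."

print(rows[0]["city"])                 # column access by a fixed name
print(semi_structured[1].get("tags"))  # an optional field may be missing
print(len(unstructured.split()))       # only generic text operations apply
```

The structured rows can be queried by column name, the JSON records need a check for missing keys, and the free text offers nothing beyond generic string operations until it is processed further.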


(b) Drivers of Big Data

Big Data is driven by the rapid growth of social media, mobile devices, IoT sensors, cloud computing, digital transactions, and online services. These sources continuously generate large volumes of diverse and fast-moving data.


(c) Core functionalities of Apache Hadoop

Apache Hadoop provides distributed data storage using HDFS, parallel data processing using MapReduce, resource management using YARN, and fault tolerance through data replication across multiple nodes.
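The MapReduce model mentioned above can be sketched in plain Python. This is a single-machine simulation of the map, shuffle, and reduce phases for a word count, not Hadoop's actual API; the function names and input lines are chosen only for illustration.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all values that share the same key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reduce: combine the grouped values for one key.
    return key, sum(values)

def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data needs big storage", "data drives decisions"])
print(counts)
```

In a real cluster the map calls would run in parallel on the nodes that hold the input blocks, and the shuffle would move data across the network.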


(d) Importance of Hadoop data format

Hadoop data formats such as Avro, Parquet, and ORC affect storage efficiency, compression, and processing speed. Optimized formats reduce disk usage and improve query performance, making large-scale data analysis faster and more efficient.
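One reason format choice matters is the difference between row-oriented storage (Avro) and column-oriented storage (Parquet, ORC). The sketch below mimics the two layouts with ordinary Python structures; the records are invented, and real files add encoding and compression that this example leaves out.

```python
# Invented sample records for illustration only.
rows = [
    {"id": 1, "city": "Pune", "amount": 250},
    {"id": 2, "city": "Delhi", "amount": 120},
    {"id": 3, "city": "Pune", "amount": 80},
]

# Row-oriented layout (Avro-style): all fields of a record sit together.
row_layout = [tuple(row.values()) for row in rows]

# Column-oriented layout (Parquet/ORC-style): all values of a field sit together.
column_layout = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one field touches a single column instead of every record.
total_amount = sum(column_layout["amount"])

print(row_layout)
print(column_layout)
print(total_amount)
```

Because similar values sit next to each other in a column, columnar formats also tend to compress better, which is where much of the disk saving comes from.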


(e) Steps to import data from RDBMS to Hadoop using Sqoop

First, Sqoop connects to the RDBMS through a JDBC driver. It then reads the table metadata, divides the rows into splits (by default on the primary key), launches parallel map tasks that each import one split, and writes the results to HDFS or to a Hive table.
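The splitting step can be sketched as follows. This is a simplified model of how a key range might be divided among mappers, not Sqoop's actual splitter; the table name `orders`, the column `id`, and the helper `split_ranges` are hypothetical.

```python
def split_ranges(min_id, max_id, num_mappers):
    """Divide the inclusive key range [min_id, max_id] into contiguous splits."""
    total = max_id - min_id + 1
    size, extra = divmod(total, num_mappers)
    ranges, start = [], min_id
    for i in range(num_mappers):
        # The first `extra` splits take one additional key each.
        length = size + (1 if i < extra else 0)
        if length == 0:
            continue
        end = start + length - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

splits = split_ranges(1, 1000, 4)
for low, high in splits:
    # Each map task would run a bounded query similar to this one.
    print(f"SELECT * FROM orders WHERE id >= {low} AND id <= {high}")
```

Each range is imported independently, which is why an evenly distributed split column gives the best parallelism.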


(f) How a file system works

A file system manages how data is stored, organized, and retrieved. It maintains file names, directories, permissions, and metadata, and ensures efficient access and storage of data on physical disks.


(g) Fair Scheduler vs Capacity Scheduler in Hadoop

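The Fair Scheduler shares resources so that, over time, every running application receives a roughly equal share, which keeps short jobs from starving behind long ones. The Capacity Scheduler divides the cluster into queues, each with a guaranteed share of capacity, so that different organizations or teams sharing one cluster are assured their minimum resources, while idle capacity can be lent to other queues.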
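The two policies can be contrasted with a small calculation. This is only a sketch of the allocation idea, with hypothetical application and queue names; the real YARN schedulers also handle weights, preemption, and borrowing of idle capacity.

```python
def fair_shares(total_containers, apps):
    """Equal split among active apps; leftover units go to the first apps."""
    base, extra = divmod(total_containers, len(apps))
    return {app: base + (1 if i < extra else 0) for i, app in enumerate(apps)}

def capacity_shares(total_containers, queue_percent):
    """Each queue is guaranteed its configured percentage of the cluster."""
    return {q: total_containers * pct // 100 for q, pct in queue_percent.items()}

# Fair Scheduler view: three running applications share 100 containers.
fair = fair_shares(100, ["etl", "report", "adhoc"])

# Capacity Scheduler view: three queues with configured percentages.
capacity = capacity_shares(100, {"engineering": 60, "marketing": 30, "default": 10})

print(fair)
print(capacity)
```

In the fair case the share depends on how many applications are running at the moment, whereas in the capacity case it depends on the percentages an administrator has configured per queue.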


(h) Data types used in MongoDB

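MongoDB stores documents in BSON and supports data types such as String, Integer (32-bit and 64-bit), Double, Boolean, Date, Array, Object (embedded document), Null, and ObjectId. Because each document carries its own fields and types, the schema can remain flexible.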


(i) HiveQL and its key features

HiveQL is a SQL-like query language used in Apache Hive. It supports table creation, querying, partitioning, and integration with Hadoop, allowing users to analyze big data without writing complex MapReduce code.
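Partitioning is worth a short illustration, since it is what lets a query skip most of the data. Hive stores each partition of a table in its own subdirectory (for example `year=2023`), and a filter on the partition column reads only the matching directories. The sketch below imitates that idea with a Python dictionary; the table contents and the `total_sales` helper are hypothetical.

```python
# Hypothetical sales table partitioned by year; keys mimic Hive's directory naming.
partitions = {
    "year=2022": [("tv", 300), ("phone", 200)],
    "year=2023": [("tv", 350), ("laptop", 900)],
    "year=2024": [("phone", 250)],
}

def total_sales(partitions, year):
    """Scan only the partition matching the filter, as partition pruning would."""
    rows = partitions.get(f"year={year}", [])
    return sum(amount for _, amount in rows), len(rows)

total, rows_scanned = total_sales(partitions, 2023)
print(total, rows_scanned)
```

Only the rows of one partition are scanned here, while the other two are never touched.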


(j) Data processing operators in Pig

Pig supports operators such as LOAD, FILTER, GROUP, JOIN, FOREACH, ORDER, DISTINCT, and STORE to perform data transformation and analysis.
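To show how these operators chain into a data flow, here is a Python analogue of a small Pig script. It is not Pig Latin, and the purchase records are invented; each step is labelled with the Pig operator it stands in for.

```python
from collections import defaultdict

# LOAD: read records (an in-memory list stands in for a file).
records = [
    ("asha", "books", 250),
    ("ravi", "music", 120),
    ("asha", "music", 80),
    ("ravi", "books", 40),
]

# FILTER: keep only purchases of at least 50.
filtered = [r for r in records if r[2] >= 50]

# GROUP: collect amounts by user.
grouped = defaultdict(list)
for user, category, amount in filtered:
    grouped[user].append(amount)

# FOREACH ... GENERATE: compute one output row per group.
totals = [(user, sum(amounts)) for user, amounts in grouped.items()]

# ORDER: sort by total, highest first.
ordered = sorted(totals, key=lambda row: row[1], reverse=True)

# STORE: print instead of writing to HDFS.
print(ordered)
```

In Pig each of these steps would be a single statement, and the engine would compile the whole flow into jobs that run on the cluster.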


SECTION B

Attempt any three (10 × 3 = 30 marks)


(a) Why Big Data is crucial for modern businesses and industries

Big Data helps organizations analyze customer behavior, optimize operations, improve decision-making, and gain competitive advantage. Industries use Big Data for fraud detection, predictive maintenance, personalized marketing, risk analysis, and innovation. Data-driven insights enable faster and more accurate business strategies.


(b) Hadoop Distributed File System (HDFS) and its working

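HDFS is a distributed storage system designed for large datasets. Files are divided into large blocks (128 MB by default in recent Hadoop versions) and stored across multiple DataNodes. The NameNode manages the metadata, such as which blocks make up a file and where they are located, while the DataNodes store the actual data. Each block is replicated on several nodes (three copies by default), so the data remains available even if a node fails.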
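The block and replica mechanism can be simulated in a few lines. The sketch assumes the common defaults of 128 MB blocks and three replicas, uses hypothetical DataNode names, and places replicas round-robin, which is a simplification of the rack-aware policy HDFS really uses. It also assumes the replication factor does not exceed the number of DataNodes.

```python
import math

def split_into_blocks(file_size_mb, block_size_mb=128):
    """Return the size of each block; only the last block may be smaller."""
    if file_size_mb <= 0:
        return []
    count = math.ceil(file_size_mb / block_size_mb)
    sizes = [block_size_mb] * (count - 1)
    sizes.append(file_size_mb - block_size_mb * (count - 1))
    return sizes

def place_replicas(num_blocks, datanodes, replication=3):
    """Round-robin placement; stands in for the NameNode's block-to-node map."""
    return {
        block: [datanodes[(block + r) % len(datanodes)] for r in range(replication)]
        for block in range(num_blocks)
    }

def readable_after_failure(placement, failed_node):
    """A file stays readable if every block has a replica on a surviving node."""
    return all(any(n != failed_node for n in nodes) for nodes in placement.values())

blocks = split_into_blocks(300)
metadata = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
print(blocks)
print(metadata)
print(readable_after_failure(metadata, "dn2"))
```

The `metadata` dictionary plays the role of the NameNode, and the lists inside it name the DataNodes that would hold each block.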


(c) Benefits and challenges of using HDFS

HDFS provides scalability, fault tolerance, and cost-effective storage on commodity hardware, and it supports parallel processing of big data. However, it handles large numbers of small files poorly because the NameNode keeps metadata for every file in memory, it is not designed for low-latency or real-time access, and it requires skilled administration.


(d) NoSQL databases vs traditional RDBMS

Traditional RDBMS use fixed schemas and SQL, and they typically scale vertically by adding resources to a single server. NoSQL databases offer flexible schemas, horizontal scaling across many servers, and high availability. NoSQL systems are well suited to big data applications such as social media, IoT, and real-time analytics.


(e) Role of ZooKeeper in Hadoop cluster monitoring

ZooKeeper coordinates distributed applications by providing configuration management, synchronization, leader election, and fault detection. It ensures high availability and consistency across Hadoop clusters.
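Leader election is the easiest of these services to picture. In the usual ZooKeeper recipe, each candidate registers a temporary entry that receives an increasing sequence number, and the candidate with the lowest number acts as leader. The sketch below simulates only that rule in plain Python, without a ZooKeeper client; the node names and numbers are hypothetical.

```python
def elect_leader(candidates):
    """Pick the candidate holding the lowest sequence number."""
    return min(candidates, key=candidates.get)

# Hypothetical node names mapped to the sequence number each received on registering.
candidates = {"node-a": 3, "node-b": 1, "node-c": 2}
leader = elect_leader(candidates)

# When the leader's session ends, its entry disappears and the next lowest takes over.
del candidates[leader]
new_leader = elect_leader(candidates)

print(leader, new_leader)
```

The removal of the failed leader's entry is what doubles as fault detection: the remaining nodes notice the change and agree on the successor without further negotiation.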


SECTION C


3(a) Big Data analytics vs traditional data analytics

Traditional analytics deals with structured, small-scale data using centralized systems. Big Data analytics handles massive, diverse, and fast-moving data using distributed systems. Tools like Hadoop, Spark, Hive, and Pig are used in Big Data analytics, while traditional analytics relies on RDBMS and data warehouses.


3(b) Big Data features: Security, Protection, and Auditing

Big Data security includes authentication, authorization, encryption, and access control. Data protection ensures confidentiality and integrity, while auditing tracks user actions and data access to ensure compliance and accountability.

Uploader: SuGanta International