(SEM VI) THEORY EXAMINATION 2024-25 BIG DATA
BCS061 – BIG DATA
Time: 3 Hours | Max Marks: 70
SECTION A – Short Answer Questions
(2 × 7 = 14 marks | Attempt ALL)
Write definition + 1–2 key points
1. List Any Five Big Data Platforms
Apache Hadoop
Apache Spark
Apache Flink
Apache Storm
Google BigQuery
2. How Does Apache Hadoop Help Process Data?
Apache Hadoop enables distributed storage and parallel processing of large datasets using HDFS and MapReduce, making big data processing scalable and fault-tolerant.
3. What is HDFS?
HDFS (Hadoop Distributed File System) stores large files across multiple nodes with replication for fault tolerance and high availability.
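As a rough illustration of the idea, the sketch below splits a file into fixed-size blocks and assigns each block to several nodes. The 128 MB block size and replication factor of 3 are the usual Hadoop defaults. The function names, node names, and round-robin placement are assumptions made for illustration; real HDFS uses a rack-aware placement policy.

```python
BLOCK_SIZE_MB = 128      # common HDFS default block size
REPLICATION = 3          # common HDFS default replication factor


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the size in MB of each block a file of the given size occupies."""
    full, rest = divmod(file_size_mb, block_size_mb)
    return [block_size_mb] * full + ([rest] if rest else [])


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes (hypothetical round-robin policy)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def surviving_blocks(placement, failed_node):
    """Return ids of blocks that still have a replica after one node fails."""
    return [
        block_id
        for block_id, holders in placement.items()
        if any(node != failed_node for node in holders)
    ]
```

Because every block lives on more than one node, losing a single node leaves all blocks readable, which is the fault tolerance the definition refers to.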
4. Difference Between Hive and Pig
| Hive | Pig |
|---|---|
| SQL-like (HiveQL) | Script-based (Pig Latin) |
| Used for querying | Used for data transformation |
| Declarative | Procedural |
5. Main Components of Hadoop Ecosystem
HDFS (Storage)
MapReduce (Processing)
YARN (Resource management)
Hive, Pig, HBase, Spark (Tools)
6. Role of YARN in Hadoop
YARN manages cluster resources, schedules jobs, and allows multiple processing engines to run on the same Hadoop cluster.
7. Role of YARN in Hadoop (Repeated in paper)
YARN splits cluster-wide resource management (the ResourceManager) from per-application job scheduling and monitoring (the ApplicationMaster), improving scalability and efficiency.
SECTION B – Medium Answer Questions
(7 × 3 = 21 marks | Attempt ANY THREE)
Write concept → explanation → diagram/example
1. Types of Analytics in Big Data
Descriptive: What happened?
Diagnostic: Why did it happen?
Predictive: What will happen?
Prescriptive: What should be done?
Used in business intelligence and decision making.
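The four types can be sketched on a toy dataset. The sales and ad-spend figures below are made up, and the function names are hypothetical. The least-squares slope used for the diagnostic step only hints at a relationship and does not prove a cause.

```python
from statistics import mean

# Hypothetical monthly sales and ad spend (made-up numbers for illustration).
sales = [100, 120, 140, 160]
ad_spend = [10, 12, 14, 16]


def descriptive(values):
    """What happened? Summarise the past with an average."""
    return mean(values)


def diagnostic(xs, ys):
    """Why did it happen? Measure how ys move with xs (least-squares slope)."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    variance = sum((x - mx) ** 2 for x in xs)
    return covariance / variance


def predictive(values):
    """What will happen? Extend a straight-line trend by one period."""
    periods = list(range(len(values)))
    slope = diagnostic(periods, values)
    intercept = mean(values) - slope * mean(periods)
    return intercept + slope * len(values)


def prescriptive(forecast, target):
    """What should be done? Turn the forecast into a simple recommendation."""
    return "increase ad spend" if forecast < target else "hold ad spend"
```

Each step builds on the previous one: the summary describes the data, the slope suggests a driver, the trend gives a forecast, and the forecast feeds a recommendation.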
2. Master–Slave vs Peer-to-Peer Replication
Master–Slave: One master node controls slaves
Easier management
Single point of failure
Peer-to-Peer: All nodes equal
High fault tolerance
Complex synchronization
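The trade-off can be seen in a minimal in-memory sketch. Both classes and their method names are hypothetical, and the peer-to-peer version deliberately leaves out conflict resolution between concurrent writes, which is where the complex synchronization comes from in practice.

```python
class MasterSlaveCluster:
    """Writes go to one master, which copies them to every slave."""

    def __init__(self, num_slaves):
        self.master = {}
        self.slaves = [{} for _ in range(num_slaves)]
        self.master_up = True

    def write(self, key, value):
        # Every write depends on the master: the single point of failure.
        if not self.master_up:
            raise RuntimeError("master is down: writes are unavailable")
        self.master[key] = value
        for slave in self.slaves:
            slave[key] = value

    def read(self, key, slave_index=0):
        return self.slaves[slave_index].get(key)


class PeerToPeerCluster:
    """Any live peer accepts a write and forwards it to the other live peers."""

    def __init__(self, num_peers):
        self.peers = [{} for _ in range(num_peers)]
        self.up = [True] * num_peers

    def write(self, key, value, via=0):
        if not self.up[via]:
            raise RuntimeError(f"peer {via} is down")
        for index, peer in enumerate(self.peers):
            if self.up[index]:
                peer[key] = value

    def read(self, key, via=0):
        return self.peers[via].get(key)
```

In the master-slave model a failed master blocks all writes, while in the peer-to-peer model a client can simply write through another live peer.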
3. Architecture of MapReduce
Input Split
Mapper → key-value pairs
Shuffle & Sort
Reducer → final output
Used for parallel data processing.
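The phases above can be imitated in plain Python with a word count, the usual introductory example. This is a single-process sketch with hypothetical function names, not the Hadoop API; in a real cluster the mappers and reducers run in parallel on different nodes.

```python
from collections import defaultdict


def mapper(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_and_sort(pairs):
    """Shuffle & sort: group values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reducer(key, values):
    """Reduce: combine all values for one key into a final result."""
    return key, sum(values)


def run_job(input_splits):
    """Run the three phases in order over a list of input splits (lines)."""
    mapped = [pair for split in input_splits for pair in mapper(split)]
    return [reducer(key, values) for key, values in shuffle_and_sort(mapped)]
```

For instance, `run_job(["big data big", "data node"])` treats each string as one input split and returns one (word, count) pair per distinct word.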
4. Hadoop Cluster Specification & Setup
Hardware: nodes, CPU, RAM, storage
Software: Java, Hadoop
Configuration: core-site.xml, hdfs-site.xml
Start NameNode, DataNode, ResourceManager
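To make the configuration step concrete, the sketch below generates minimal contents for the two files. `fs.defaultFS` and `dfs.replication` are standard Hadoop property names; the helper `build_site_xml`, the `hdfs://localhost:9000` address, and the replication value of 1 are assumptions suited to a single-node test setup.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Build a Hadoop *-site.xml document from a dict of property names to values."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed values for a single-node test setup; change host, port and
# replication factor for a real cluster.
core_site = build_site_xml({"fs.defaultFS": "hdfs://localhost:9000"})
hdfs_site = build_site_xml({"dfs.replication": 1})
```

Once the files are in place, the NameNode is typically formatted once with `hdfs namenode -format`, and the daemons are started with the `start-dfs.sh` and `start-yarn.sh` scripts that ship with Hadoop.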
5. Architecture & Data Flow in HIVE
User submits query
Driver receives query
Compiler converts to MapReduce
Execution engine runs job
Results stored in HDFS
(Hive sits on top of Hadoop)
SECTION C – Attempt ANY ONE
(7 marks)
1. 5 Vs of Big Data (Very Important)
Volume: Huge data size
Velocity: Speed of data generation
Variety: Structured, semi-structured, unstructured
Veracity: Data quality
Value: Useful insights
Example: Social media data.
2. Analysis vs Reporting in Big Data
| Analysis | Reporting |
|---|---|
| Insight-oriented | Summary-oriented |
| Predictive | Historical |
| Supports decisions | Shows results |