(SEM VI) THEORY EXAMINATION 2022-23 BIG DATA
BIG DATA – KCS-061
Section-wise Important Questions & Ready Answers
SECTION A
(Attempt all questions – 2 marks each)
(a) Five Big Data Platforms
Common Big Data platforms include Hadoop, Apache Spark, Apache Flink, Apache Storm, and Google BigQuery. These platforms support large-scale data storage, processing, and analytics.
(b) Importance of Hadoop in Big Data Analytics
Hadoop is important because it enables distributed storage and parallel processing of massive datasets using commodity hardware. Its fault tolerance, scalability, and cost effectiveness make it suitable for Big Data analytics.
(c) Three Benefits of MapReduce
MapReduce provides automatic parallelization of tasks, fault tolerance through re-execution of failed tasks, and scalability across large clusters, making large-scale data processing efficient.
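As a rough illustration of the model, the sketch below runs the map, shuffle, and reduce phases of a word count inside a single Python process. The function names (map_phase, shuffle, reduce_phase, word_count) are chosen for this example and are not part of the Hadoop API; in a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    # Each line could be mapped on a different node; here they run in sequence.
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

Because each map call depends only on its own line and each reduce call only on its own key, the framework can parallelize both phases and re-run any single failed task without repeating the whole job.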
(d) Heartbeat in HDFS
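A heartbeat is a periodic signal that each DataNode sends to the NameNode (every 3 seconds by default) to indicate that it is active and functioning properly. If the NameNode stops receiving heartbeats from a DataNode, it marks that node as dead and re-replicates its blocks on other DataNodes.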
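The liveness check can be sketched as a small tracker on the NameNode side. This is a toy model, not HDFS code: the HeartbeatMonitor class, its method names, and the timeout value are assumptions made for illustration, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Toy NameNode-side tracker of the last heartbeat time per DataNode."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # node id -> time of most recent heartbeat

    def receive_heartbeat(self, node_id, now):
        # Record that this DataNode reported in at time `now`.
        self.last_seen[node_id] = now

    def live_nodes(self, now):
        # A node is live if its last heartbeat is within the timeout window.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout_seconds
        )

    def dead_nodes(self, now):
        # A node is considered dead once it has been silent past the timeout.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=10)
    monitor.receive_heartbeat("dn1", now=0)
    monitor.receive_heartbeat("dn2", now=0)
    monitor.receive_heartbeat("dn1", now=9)  # dn2 stays silent
    print(monitor.live_nodes(now=12), monitor.dead_nodes(now=12))
```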
(e) Data Replication in HDFS
Data replication is the process of storing multiple copies of each data block (three by default) on different DataNodes to ensure fault tolerance and data availability.
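A minimal sketch of why replication gives fault tolerance: each block is copied to several distinct DataNodes, so it stays readable as long as one copy survives. The round-robin placement used here is a simplification assumed for this example (real HDFS placement is rack-aware), and the names place_replicas and readable_blocks are hypothetical.

```python
def place_replicas(block_ids, datanodes, replication_factor=3):
    """Assign each block to `replication_factor` distinct DataNodes (round-robin)."""
    if replication_factor > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    placement = {}
    for index, block in enumerate(block_ids):
        placement[block] = [
            datanodes[(index + offset) % len(datanodes)]
            for offset in range(replication_factor)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a healthy node."""
    return [
        block for block, nodes in placement.items()
        if any(node not in failed_nodes for node in nodes)
    ]


if __name__ == "__main__":
    placement = place_replicas(["blk_1", "blk_2"], ["dn1", "dn2", "dn3", "dn4"])
    print(placement)
    # Two of the four DataNodes fail; every block still has a surviving copy.
    print(readable_blocks(placement, failed_nodes={"dn1", "dn2"}))
```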
(f) Difference Between Flume and Sqoop
Flume is used to collect and transfer streaming data such as logs into HDFS, whereas Sqoop is used to transfer bulk data between relational databases and Hadoop.
(g) NoSQL vs Relational Databases
Relational databases use fixed schemas and SQL, while NoSQL databases support flexible schemas and horizontal scalability and are suited to unstructured or semi-structured data.
(h) Hadoop Schedulers
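Schedulers manage resource allocation among jobs in Hadoop. Common schedulers include the FIFO Scheduler (jobs run in order of submission), the Fair Scheduler (resources are shared evenly among running jobs), and the Capacity Scheduler (each queue is guaranteed a share of cluster capacity).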
(i) Pig vs MapReduce
Pig provides a high-level scripting language (Pig Latin) that simplifies data processing; Pig scripts are compiled into MapReduce jobs automatically. Writing MapReduce programs directly requires low-level programming, typically in Java.
(j) Metastore in Hive
The Hive metastore stores metadata about tables, schemas, partitions, and data locations, enabling Hive to manage and query data efficiently.
SECTION B
(Attempt any three – 10 marks each)
2(a) Hadoop Ecosystem in Detail
The Hadoop ecosystem consists of HDFS for distributed storage, MapReduce for batch processing, YARN for resource management, and tools like Hive, Pig, HBase, Sqoop, Flume, Oozie, and Spark. Together, these components support data ingestion, storage, processing, and analytics.
2(b) Master-Slave and Peer-to-Peer Replication
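In master-slave replication, one master node handles all writes and propagates them to multiple slave nodes, which serve reads. This scales reads well, but the master is a single point of failure for writes. In peer-to-peer replication, all nodes have equal roles and any node can accept reads and writes, which improves availability and scalability at the cost of possible write conflicts between nodes.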
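The master-slave arrangement can be sketched in a few lines. This is a toy in-memory model with hypothetical class names (Replica, MasterSlaveCluster); it copies writes to the slaves synchronously, whereas real systems often replicate asynchronously.

```python
class Replica:
    """One node holding a copy of the data."""

    def __init__(self):
        self.data = {}


class MasterSlaveCluster:
    def __init__(self, slave_count):
        self.master = Replica()
        self.slaves = [Replica() for _ in range(slave_count)]
        self._next_slave = 0

    def write(self, key, value):
        # Every write goes to the master first, then is copied to each slave.
        self.master.data[key] = value
        for slave in self.slaves:
            slave.data[key] = value

    def read(self, key):
        # Reads are spread across the slaves in round-robin order.
        slave = self.slaves[self._next_slave % len(self.slaves)]
        self._next_slave += 1
        return slave.data.get(key)


if __name__ == "__main__":
    cluster = MasterSlaveCluster(slave_count=2)
    cluster.write("course", "KCS-061")
    print(cluster.read("course"), cluster.read("course"))
```

A peer-to-peer version would drop the distinction between master and slaves and let every replica accept writes, which then requires some way to reconcile conflicting updates.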
2(c) Reading and Writing Data in HDFS
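When a client writes data, it asks the NameNode to create the file and allocate blocks; the NameNode returns a list of DataNodes for each block, and the client streams the block to the first DataNode, which forwards it along a pipeline to the other replicas. During reading, the client gets the block locations from the NameNode and reads each block directly from the nearest DataNode. In both cases the NameNode handles only metadata; file data never passes through it.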
2(d) CRUD Operations in MongoDB
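MongoDB supports Create (insertOne, insertMany), Read (find), Update (updateOne, updateMany), and Delete (deleteOne, deleteMany) operations on JSON-like (BSON) documents, enabling flexible data handling. The older insert, update, and remove shell methods are deprecated.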
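The four operations can be illustrated without a database server using a toy in-memory collection. ToyCollection below is a hypothetical class written for this example, not the MongoDB driver: its method names follow the PyMongo style (insert_one, find, update_one, delete_one), but it only supports equality filters and the $set operator, and it returns plain counts rather than driver result objects.

```python
class ToyCollection:
    """In-memory stand-in for a document collection (illustration only)."""

    def __init__(self):
        self.docs = []

    @staticmethod
    def _matches(doc, query):
        # A document matches when every queried field has the given value.
        return all(doc.get(field) == value for field, value in query.items())

    def insert_one(self, doc):  # Create
        self.docs.append(dict(doc))

    def find(self, query=None):  # Read
        query = query or {}
        return [dict(doc) for doc in self.docs if self._matches(doc, query)]

    def update_one(self, query, update):  # Update (supports only "$set")
        for doc in self.docs:
            if self._matches(doc, query):
                doc.update(update.get("$set", {}))
                return 1
        return 0

    def delete_one(self, query):  # Delete
        for index, doc in enumerate(self.docs):
            if self._matches(doc, query):
                del self.docs[index]
                return 1
        return 0


if __name__ == "__main__":
    students = ToyCollection()
    students.insert_one({"name": "Asha", "sem": 6})
    students.insert_one({"name": "Ravi", "sem": 5})
    students.update_one({"name": "Ravi"}, {"$set": {"sem": 6}})
    students.delete_one({"name": "Asha"})
    print(students.find())
```

Note that the two inserted documents need no predeclared schema, which is the flexibility the answer refers to.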
2(e) Architecture of Hive
Hive architecture includes the user interface, driver, compiler, optimizer, execution engine, and metastore. Queries written in HiveQL are parsed and compiled into an execution plan, which the execution engine runs as MapReduce, Tez, or Spark jobs.
SECTION C
3(a) Analysis vs Reporting in Big Data
Reporting focuses on summarizing historical data, while analysis involves exploring data to discover patterns and insights. Big Data analytics emphasizes real-time and predictive analysis over static reporting.
3(b) Components of Big Data Architecture
Big Data architecture includes data sources, ingestion layer, storage layer, processing layer, analytics layer, and visualization layer, ensuring end-to-end data handling.