(SEM V) THEORY EXAMINATION 2020-21 DATA ANALYTICS
SECTION A (Any 2 Questions Explained Briefly)
What are the different types of data?
Data can be classified into structured, semi-structured, and unstructured data. Structured data follows a fixed schema and is stored in rows and columns, as in relational database tables. Semi-structured data has no rigid schema but carries tags or keys that describe its fields, as in XML or JSON. Unstructured data has no predefined format and includes text, images, audio, and video. Knowing the data type matters because it determines which storage and analysis techniques can be applied.
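As a rough illustration of the three categories, the sketch below holds one small example of each in Python. All records, names, and values are invented for illustration and are not taken from the examination paper.

```python
import json

# Structured: fixed columns, every row has the same fields (like a database table).
# Columns here are (student_id, name, marks); the rows are made-up sample data.
structured_rows = [
    ("S001", "Asha", 82),
    ("S002", "Ravi", 74),
]

# Semi-structured: keys describe the fields, but records may differ in shape.
semi_structured = json.loads(
    '{"id": "S001", "name": "Asha", "contacts": {"email": "asha@example.com"}}'
)

# Unstructured: free text with no predefined schema.
unstructured_text = "Asha scored well in the analytics paper."

# Each type calls for a different kind of processing.
average_marks = sum(row[2] for row in structured_rows) / len(structured_rows)
email = semi_structured["contacts"]["email"]   # navigate by key
word_count = len(unstructured_text.split())    # crude tokenization
```

Structured rows can be aggregated directly by column, semi-structured records are navigated by key, and unstructured text needs preprocessing such as tokenization before it can be analyzed.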
Explain decision tree.
A decision tree is a supervised learning technique used for classification and prediction. It represents decisions as a tree structure in which each internal node tests an attribute, each branch corresponds to an outcome of that test, and each leaf node gives the predicted class or value. A prediction is made by following the path from the root to a leaf. Decision trees are easy to understand and interpret because each path can be read as an if-then rule.
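The structure described above can be sketched with a small hand-built tree. This is only an illustration: the attributes (`outlook`, `humidity`, `windy`), the class labels, and the `classify` helper are hypothetical, and the tree is written by hand rather than learned from data.

```python
# Internal nodes are dicts that test one attribute; branches map each
# outcome of the test to a subtree; leaves are plain class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "no", "normal": "yes"},
        },
        "overcast": "yes",
        "rainy": {
            "attribute": "windy",
            "branches": {True: "no", False: "yes"},
        },
    },
}


def classify(node, record):
    """Follow the branches from the root until a leaf label is reached."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node


sample = {"outlook": "sunny", "humidity": "high", "windy": False}
prediction = classify(tree, sample)
```

For the sample record, the path is outlook = sunny, then humidity = high, which ends at the leaf labelled "no".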
SECTION B (Any 2 Questions Explained)
Explain K-Means algorithm and its use.
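K-Means is an unsupervised clustering algorithm that partitions data into K clusters so that each point belongs to the cluster with the nearest center (centroid). It works by selecting K initial cluster centers, assigning each data point to the nearest center, recomputing each center as the mean of the points assigned to it, and repeating these two steps until the assignments stop changing. Because K must be specified in advance, the algorithm suits problems where the number of clusters is known or can be estimated. It is widely applied in market segmentation and pattern recognition.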
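The assign-and-update loop can be sketched in plain Python as follows. This is a minimal sketch under simplifying assumptions: the first K points are used as initial centers (real implementations usually choose them randomly or with a smarter scheme), distance is squared Euclidean, and the function name `kmeans` and the sample points are made up for illustration.

```python
def kmeans(points, k, iterations=10):
    """Cluster a list of equal-length numeric tuples into k groups."""
    centers = [points[i] for i in range(k)]  # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for cluster, old_center in zip(clusters, centers):
            if cluster:
                new_centers.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                new_centers.append(old_center)  # keep the center of an empty cluster
        if new_centers == centers:  # converged: centers no longer move
            break
        centers = new_centers
    return centers, clusters


data = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centers, clusters = kmeans(data, k=2)
```

With this sample data the three points near (1, 1) end up in one cluster and the three points near (9, 9) in the other.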
Explain Bayesian data analysis.
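Bayesian data analysis is a statistical approach that uses probability to represent uncertainty about unknown quantities such as parameters or hypotheses. It starts from a prior distribution that encodes existing knowledge and combines it with the likelihood of the observed data using Bayes’ theorem to obtain the posterior distribution, which is proportional to the likelihood multiplied by the prior. This approach is useful when data is limited, when prior knowledge is available, or when uncertainty needs to be quantified explicitly.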
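A single application of Bayes’ theorem to a yes/no hypothesis can be sketched as below. The scenario (a diagnostic test), the probabilities, and the function name `bayes_update` are assumptions chosen for illustration, not values from the paper.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) for a binary hypothesis."""
    numerator = p_evidence_if_true * prior
    # Total probability of seeing the evidence under both possibilities.
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


# Assumed numbers: 1% of people have the condition, the test detects it
# 95% of the time, and it gives a false positive 5% of the time.
after_one_positive = bayes_update(0.01, 0.95, 0.05)

# The posterior becomes the prior when a second positive result arrives.
after_two_positives = bayes_update(after_one_positive, 0.95, 0.05)
```

Under these assumed numbers, one positive result raises the probability from 0.01 to about 0.16, and a second positive result raises it to about 0.78, which shows how each new piece of evidence updates the previous belief.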
SECTION C (Any 2 Questions Explained)
Describe the architecture of HIVE.
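Hive is a data warehousing tool built on Hadoop that allows users to query large datasets using an SQL-like language called HiveQL. Its architecture includes the user interface (such as the command line or JDBC/ODBC clients), the driver that manages a query's lifecycle, the compiler that turns a HiveQL query into an execution plan, the metastore that holds table schemas and other metadata, the execution engine that runs the plan as jobs on the cluster, and the Hadoop Distributed File System (HDFS) where the data is stored. Hive simplifies data analysis by hiding the complexity of MapReduce programming.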
Brief about the main components of MapReduce.
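MapReduce consists of two main components: the Mapper and the Reducer. The Mapper processes input records and emits intermediate key-value pairs. The framework then shuffles and sorts these pairs so that all values for the same key reach the same Reducer, which aggregates them to produce the final results. Because many Mappers and Reducers run in parallel on different nodes, MapReduce can process large datasets efficiently.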
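The flow can be imitated on a single machine with a word-count example. This is a plain Python simulation, not Hadoop code: the function names `mapper`, `shuffle`, `reducer`, and `run_job` and the input lines are hypothetical, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Aggregate the values for one key into a final count."""
    return key, sum(values)


def run_job(lines):
    pairs = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


counts = run_job(["big data big insight", "data streams"])
```

Here the mapper output for the first line is four pairs, the shuffle step groups them by word, and the reducer sums each group, so "big" and "data" each end up with a count of 2.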
Most Questions in This PDF Are Related To
Most questions in the Data Analytics (KCS-051) paper are related to data types, data analytics life cycle, clustering and classification algorithms, Hadoop ecosystem (HDFS, Hive, MapReduce), data streams, sampling techniques, multivariate analysis, and machine learning concepts such as supervised and unsupervised learning. The paper mainly focuses on analytical techniques and big data processing frameworks.