(SEM VI) THEORY EXAMINATION 2022-23 DATA ANALYTICS
DATA ANALYTICS – KIT-601
Section-wise Important Questions & Ready Answers
SECTION A
(Attempt all questions – 2 marks each)
(a) Main Characteristics of Big Data
Big Data is characterized by Volume (huge amount of data), Velocity (high speed of data generation and processing), Variety (structured, semi-structured, and unstructured data types), Veracity (uncertainty about data quality and accuracy), and Value (usefulness of data). These characteristics differentiate Big Data from traditional data systems.
(b) Role of Analytical Tools in Big Data
Analytical tools help in collecting, cleaning, processing, analyzing, and visualizing large datasets. They transform raw data into meaningful insights that support decision-making, prediction, and optimization.
(c) Purposes of Regression Analysis
Regression analysis is used to study the relationship between dependent and independent variables. It helps in prediction, trend analysis, forecasting future values, and understanding the impact of variables.
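The prediction use can be illustrated with a minimal sketch of simple linear regression fitted by ordinary least squares. The data values and the names fit_line, spend, and sales are made up for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable (spend) and dependent variable (sales).
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(spend, sales)
forecast = intercept + slope * 6  # predicted sales for an unseen spend of 6
```

Here the slope measures the impact of the independent variable, and the fitted line is used to forecast a future value.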
(d) Fuzzy Qualitative Model
A fuzzy qualitative model represents uncertain or imprecise information using fuzzy logic instead of exact numerical values. It is useful when human reasoning or linguistic variables are involved.
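As a rough sketch of how a linguistic variable can be represented, the code below maps a temperature to degrees of membership in the terms "cold", "warm", and "hot" using triangular membership functions. The terms, their breakpoints, and the function names are assumptions chosen for illustration.

```python
def triangular(x, a, b, c):
    """Membership degree (0..1) in a fuzzy set that rises from a, peaks at b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical linguistic terms for temperature in degrees Celsius.
TERMS = {
    "cold": (-10, 0, 15),
    "warm": (10, 20, 30),
    "hot": (25, 35, 50),
}

def fuzzify(temp):
    """Return the membership degree of temp in every linguistic term."""
    return {term: triangular(temp, *abc) for term, abc in TERMS.items()}

memberships = fuzzify(12)  # 12 degrees is partly "cold" and partly "warm"
```

A single value can belong to more than one term at the same time, which is how fuzzy logic differs from exact numerical thresholds.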
(e) Association Rule
An association rule identifies relationships between items in a dataset. It is expressed in the form X → Y, meaning that transactions containing X are likely to also contain Y. The strength of a rule is measured by its support and confidence.
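The two standard measures of a rule X → Y are support (the fraction of transactions containing both X and Y) and confidence (the fraction of transactions containing X that also contain Y). A minimal sketch on a made-up set of market-basket transactions:

```python
# Hypothetical transactions, each a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """Estimated probability of Y given X: support(X and Y) / support(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

rule_support = support({"bread", "milk"}, transactions)          # 2 of 4 transactions
rule_confidence = confidence({"bread"}, {"milk"}, transactions)  # 2 of the 3 bread transactions
```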
(f) Benefits of Analytic Sandbox
An analytic sandbox provides an isolated environment where analysts can explore data, test models, and perform experiments without affecting production systems.
(g) Data Stream Management System (DSMS)
A DSMS processes continuous, real-time data streams instead of static datasets. It supports querying, filtering, and aggregation of streaming data.
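Aggregation over a stream is usually done on a window of recent values, because the full stream is never stored. The sketch below computes a sliding-window average in plain Python; it only illustrates the idea and is not the interface of any particular DSMS product.

```python
from collections import deque

def sliding_average(stream, window_size):
    """Yield the mean of the most recent window_size readings after each arrival."""
    window = deque(maxlen=window_size)
    total = 0.0
    for value in stream:
        if len(window) == window_size:
            total -= window[0]  # the oldest reading is about to be evicted
        window.append(value)
        total += value
        yield total / len(window)

# Hypothetical sensor readings arriving one at a time.
averages = list(sliding_average([10, 20, 30, 40], window_size=2))
```

Only the last window_size readings are kept in memory, so the same code works on an unbounded stream.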
(h) Response Modeling
Response modeling predicts how users or customers will respond to certain actions such as advertisements, offers, or campaigns using historical data.
(i) Benefits of Visual Data Exploration
Visual data exploration helps users identify patterns, trends, and anomalies quickly. It improves understanding, reduces analysis time, and supports better decisions.
(j) Main Goals of Hadoop
Hadoop aims to provide scalable storage, distributed processing, fault tolerance, and cost-effective handling of Big Data using commodity hardware.
SECTION B
(Attempt any three – 10 marks each)
2(a) Analysis vs Reporting in Data Analytics
Reporting focuses on presenting historical data in predefined formats such as dashboards and reports. Analysis goes beyond reporting by exploring data, identifying patterns, predicting outcomes, and supporting strategic decisions. Reporting answers "what happened", while analysis answers "why did it happen" and "what will happen next".
2(b) Neural Network and Its Use in Analytics
A neural network is a computational model inspired by the human brain. It consists of layers of interconnected neurons that process data. In analytics, neural networks are used for classification, prediction, image recognition, fraud detection, and pattern recognition due to their ability to learn complex relationships.
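The layered structure can be sketched with a tiny network of three neurons that computes XOR, a relationship a single neuron cannot represent. The weights here are set by hand for illustration; in practice they are learned from data, and the step activation is a simplifying assumption.

```python
def step(z):
    """Step activation: the neuron fires (1) only if its weighted input is positive."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of inputs plus bias, passed through the activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_network(x1, x2):
    # Hidden layer: one neuron behaves like OR, the other like AND.
    h_or = neuron([x1, x2], [1, 1], -0.5)
    h_and = neuron([x1, x2], [1, 1], -1.5)
    # Output layer: fires when OR is on but AND is off, which is XOR.
    return neuron([h_or, h_and], [1, -1], -0.5)

outputs = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
```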
2(c) Apriori Association Rule Mining Algorithm
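The Apriori algorithm identifies frequent itemsets using the principle that all subsets of a frequent itemset must also be frequent. Starting from single items, it generates candidate itemsets of size k from the frequent itemsets of size k-1, counts their support, and eliminates those below the minimum support threshold. This repeats until no new frequent itemsets are found. Strong association rules are then generated from the frequent itemsets by keeping those that meet the minimum confidence threshold.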
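A minimal sketch of the frequent-itemset part of Apriori is given below, assuming transactions are small enough to hold in memory. The function name apriori and the sample baskets are illustrative; generating rules from the result is not shown.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items})  # frequent 1-itemsets
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result

# Hypothetical market-basket data.
baskets = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
frequent_itemsets = apriori(baskets, min_support=0.5)
```

In this data the candidate {bread, milk, butter} is pruned without being counted, because its subset {milk, butter} is not frequent.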
2(d) Advantages and Disadvantages of K-Means Clustering
K-Means is simple, fast, and efficient for large datasets. However, it requires the number of clusters (k) to be specified in advance, is sensitive to the choice of initial centroids and to outliers, and performs poorly on non-spherical or unevenly sized clusters.
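The two alternating steps of K-Means (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched as follows. The initial centroids are passed in explicitly, which makes the dependence on initialization visible; the points and starting values are made up.

```python
def kmeans(points, centroids, max_iters=100):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        new_centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical data with two well-separated groups and k = 2.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
```

Note that k is fixed by the number of starting centroids, and a different starting choice may converge to a different grouping.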
2(e) HDFS and Handling of Big Data
HDFS is a distributed file system that stores large datasets across multiple machines. It divides data into blocks, replicates them for fault tolerance, and allows parallel access, making it suitable for Big Data storage.
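The block and replication arithmetic can be worked through with a small calculation. It assumes a block size of 128 MB and a replication factor of 3, which are common defaults in recent Hadoop versions but are configurable; the function name and the 1000 MB file are illustrative.

```python
import math

def hdfs_footprint(file_mb, block_mb=128, replication=3):
    """Estimate how a file is split into blocks and replicated (assumed defaults)."""
    blocks = math.ceil(file_mb / block_mb)  # the last block may be only partly filled
    return {
        "blocks": blocks,
        "block_replicas": blocks * replication,
        "raw_storage_mb": file_mb * replication,  # a partial block uses only its actual size
    }

footprint = hdfs_footprint(1000)
```

For a 1000 MB file this gives 8 blocks and 24 block replicas spread across the cluster, so losing one machine does not lose any block.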
SECTION C
3(a) Big Data Analytics Life Cycle
The Big Data analytics life cycle includes data discovery, data preparation, model planning, model building, result evaluation, and deployment. Each stage ensures structured handling of data from raw form to actionable insights.
(In exam, a neat labeled diagram is expected.)
3(b) Regression Modeling vs Bayesian Modeling
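Classical regression modeling treats the model parameters as fixed but unknown values and estimates them from the data alone, for example by least squares. Bayesian modeling treats parameters as random variables: it starts from a prior distribution and uses the observed data to update it into a posterior distribution. Because the result is a distribution rather than a single estimate, Bayesian models express parameter uncertainty explicitly and can incorporate prior knowledge, which is especially useful when data is limited.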
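The difference between a fixed point estimate and a prior-to-posterior update can be sketched on the simplest possible parameter, a success rate, rather than on regression coefficients. The sketch uses the standard Beta-Binomial conjugate update; the prior Beta(2, 2) and the observed counts are assumptions chosen for illustration.

```python
def point_estimate(successes, trials):
    """Classical estimate: the parameter is a fixed value estimated from data alone."""
    return successes / trials

def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Bayesian update: Beta(prior_a, prior_b) prior plus binomial data gives a Beta posterior."""
    a = prior_a + successes
    b = prior_b + (trials - successes)
    posterior_mean = a / (a + b)
    return a, b, posterior_mean

# Hypothetical data: 3 successes in 10 trials, with a mild prior belief centred on 0.5.
classical = point_estimate(3, 10)
post_a, post_b, bayesian_mean = beta_posterior(3, 10, prior_a=2.0, prior_b=2.0)
```

The posterior mean lies between the prior mean and the classical estimate, and it moves toward the classical estimate as more data is observed.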