(SEM V) THEORY EXAMINATION 2022-23 INTRODUCTION TO DATA ANALYTICS AND VISUALIZATION
SECTION A – Short Answer Type Questions (2 Marks each)
(a) Why Is There a Need for Data Analytics?
Data Analytics is the process of inspecting, cleaning, and modeling data to uncover useful insights, draw conclusions, and support decision-making.
Need for Data Analytics:
Informed Decision-Making: Helps organizations make evidence-based decisions using real-time insights.
Pattern Recognition: Identifies trends and hidden correlations from large datasets.
Process Optimization: Improves business efficiency and resource utilization.
Predictive Insights: Enables forecasting using machine learning and statistical models.
Customer Understanding: Provides deep insights into consumer behavior for better marketing strategies.
Example:
Retail companies like Amazon and Flipkart use data analytics for product recommendations and inventory optimization.
(b) What Are the Components of a Time Series?
A Time Series is a sequence of data points recorded at successive time intervals.
Components:
Trend (T): Long-term increase or decrease in data (e.g., sales growth over years).
Seasonal (S): Regular pattern repeating over fixed intervals (e.g., festive season sales).
Cyclical (C): Up-and-down movements of no fixed period, typically spanning more than a year, linked to economic cycles.
Irregular (I): Random variations due to unforeseen events like natural disasters.
Equation:
$Y_t = T_t + S_t + C_t + I_t$
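The additive model above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the quarterly series is made up, the cyclical and irregular parts are set to zero, and the helper name `centered_moving_average` is hypothetical. A centered moving average over one full seasonal period estimates the trend, and subtracting that estimate from the series leaves the seasonal component.

```python
def centered_moving_average(series, period):
    """Estimate the trend T_t; an even period uses a 2 x period window."""
    half = period // 2
    trend = [None] * len(series)  # None where the window does not fit
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        if period % 2 == 0:
            # Half weight on both end points so the window spans one full period.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


# Hypothetical quarterly data: a linear trend plus a seasonal pattern that sums to zero.
period = 4
true_trend = [10 + 2 * t for t in range(12)]
seasonal_pattern = [3, -1, -4, 2]
series = [true_trend[t] + seasonal_pattern[t % period] for t in range(12)]

estimated_trend = centered_moving_average(series, period)

# Y_t - T_t leaves S_t (C_t and I_t are zero in this toy series).
detrended = [
    None if trend_t is None else y_t - trend_t
    for y_t, trend_t in zip(series, estimated_trend)
]

print(estimated_trend[2:10])
print(detrended[2:10])
```

With real data the irregular component does not vanish, so the detrended values would normally be averaged per season to estimate the seasonal component.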
SECTION B – Long Answer Type Questions (10 Marks each)
(a) Differentiate Between Structured, Semi-Structured, and Unstructured Data.
| Category | Structured Data | Semi-Structured Data | Unstructured Data |
|---|---|---|---|
| Definition | Data organized in tables with predefined schema | Data with tags or markers but not strict schema | Data without predefined format |
| Storage | Relational databases (SQL) | NoSQL stores, XML/JSON files | File systems, data lakes, object storage |
| Example | Customer records, transactions | Web logs, JSON, XML | Images, videos, PDFs, audio, emails, social media posts |
| Query Language | SQL | XPath, XQuery, store-specific query APIs | No standard query language; NLP and AI-based tools |
| Flexibility | Low | Medium | High |
| Processing Tools | RDBMS, MySQL | MongoDB, Cassandra | Hadoop, Spark |
Conclusion:
In modern analytics, combining all three data types gives a more complete view of the business than any one type alone.
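The difference between the three categories in the table above shows up in how each one is read by a program. The following is a minimal sketch using only the Python standard library; the customer records, field names, and the sentence of free text are invented for illustration.

```python
import csv
import io
import json
import re

# Structured: every row follows the same predefined columns.
structured_source = "customer_id,name,amount\n101,Asha,250.0\n102,Ravi,120.5\n"
rows = list(csv.DictReader(io.StringIO(structured_source)))

# Semi-structured: keys act as tags, but records may differ in shape.
semi_structured_source = (
    '[{"customer_id": 101, "tags": ["loyal"]},'
    ' {"customer_id": 102, "address": {"city": "Delhi"}}]'
)
records = json.loads(semi_structured_source)
keys_per_record = [sorted(record.keys()) for record in records]

# Unstructured: free text has no fields, so information must be extracted.
unstructured_source = "Ravi emailed to say order 102 arrived late and he wants a refund."
order_ids = re.findall(r"order (\d+)", unstructured_source)

print(rows[0]["name"], keys_per_record, order_ids)
```

The structured rows can be addressed by column name directly, the JSON records need a check for which keys are present, and the free text needs a pattern (or an NLP tool) before anything can be queried.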
(b) Explain Market Basket Analysis with an Example.
Market Basket Analysis (MBA) is an association rule learning technique that identifies items frequently purchased together.
Key Metrics:
Support: Frequency of itemset occurrence.
- $\text{Support}(A \Rightarrow B) = \frac{\text{Transactions containing } (A,B)}{\text{Total Transactions}}$
Confidence: Probability that item B is purchased when A is purchased.
- $\text{Confidence}(A \Rightarrow B) = \frac{\text{Support}(A,B)}{\text{Support}(A)}$
Lift: Indicates the strength of association; a value above 1 means A and B occur together more often than expected if they were independent.
- $\text{Lift}(A \Rightarrow B) = \frac{\text{Confidence}(A \Rightarrow B)}{\text{Support}(B)}$
Example:
In a supermarket dataset:
Bread → Butter (Support = 30%, Confidence = 70%, Lift = 1.4)
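This means a customer who buys bread is 1.4 times as likely to buy butter as a customer chosen at random. Since Lift = Confidence / Support(Butter), these figures imply Support(Butter) = 70% / 1.4 = 50%.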
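The three metrics can be computed directly from a list of baskets. The sketch below assumes a tiny, made-up set of five transactions (not the supermarket dataset quoted above), and the function names `support`, `confidence`, and `lift` are chosen for illustration.

```python
# Hypothetical baskets; each set is one transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"milk", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    matches = sum(1 for basket in transactions if itemset <= basket)
    return matches / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support(A, B) / Support(A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence(A => B) / Support(B)."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
rule_lift = lift({"bread"}, {"butter"}, transactions)
print(rule_support, rule_confidence, rule_lift)
```

In this toy data, bread and butter appear together in 2 of 5 baskets, bread appears in 3, and butter appears in 3, so the rule Bread → Butter has a lift slightly above 1.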
Applications:
Cross-selling in e-commerce.
Store layout design and recommendation engines.
SECTION C – Very Long Answer Type Questions (10 Marks each)
(a) Describe All the Phases of the Data Analytics Life Cycle.
The Data Analytics Life Cycle is a systematic approach to extract insights from raw data.
Phases:
Discovery:
Understand business objectives and data availability. Identify key variables and success criteria.
Data Preparation:
Data collection, cleaning, and transformation. Handle missing values and outliers.
Model Planning:
Select analytical techniques such as regression, clustering, or decision trees.
Use visualization for exploratory data analysis (EDA).
Model Building:
Develop and test predictive models using algorithms (e.g., Random Forest, SVM).
Communication of Results:
Interpret insights and recommend actions for decision-makers.
Operationalization:
Deploy the model into production. Generate reports or dashboards for users.
Tools Used: Python, R, Tableau, Power BI, SAS, and Hadoop ecosystems.
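As an illustration of the Data Preparation phase, the sketch below handles a missing value and an outlier in one numeric column. It is one possible approach under stated assumptions, not a prescribed method: the readings are invented, the missing value is filled with the median, and outliers are flagged with the common 1.5 × IQR rule.

```python
import statistics

# Hypothetical readings; None marks a missing value.
raw_values = [12.0, 15.0, None, 14.0, 13.0, 95.0, 16.0]

# Missing values: fill with the median of the observed values.
observed = [v for v in raw_values if v is not None]
fill_value = statistics.median(observed)
filled = [fill_value if v is None else v for v in raw_values]

# Outliers: flag values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
q1, _, q3 = statistics.quantiles(filled, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in filled if v < lower or v > upper]
cleaned = [v for v in filled if lower <= v <= upper]

print(fill_value, outliers, cleaned)
```

Whether an outlier is dropped, capped, or kept depends on the business objective set in the Discovery phase; the sketch only flags and filters it.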
(b) Explain the Working of the k-Means Clustering Algorithm with an Example.
Definition:
The k-means clustering algorithm is an unsupervised learning method used to partition a dataset into k clusters, minimizing intra-cluster distance and maximizing inter-cluster distance.
Algorithm Steps:
1. Select the number of clusters k.
2. Initialize k random centroids.
3. Assign each data point to the nearest centroid.
4. Recalculate each centroid as the mean of its assigned points.
5. Repeat steps 3–4 until the centroids stabilize.
Mathematical Expression:
$J = \sum_{i=1}^{k} \sum_{x_j \in S_i} \|x_j - \mu_i\|^2$
where $\mu_i$ is the centroid of cluster $S_i$.
Example:
Given data points = {(2,3), (3,4), (10,12), (11,11)}, and k = 2:
Initial centroids: (2,3), (10,12)
Cluster 1: (2,3), (3,4); Cluster 2: (10,12), (11,11)
New centroids: (2.5, 3.5) and (10.5, 11.5). Reassigning the points to these centroids leaves both clusters unchanged, so the algorithm stops.
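The worked example above can be reproduced with a short sketch of the assign-and-update loop. It is a minimal illustration, not a production implementation: the function names `k_means` and `within_cluster_sum_of_squares` are hypothetical, and the initial centroids are passed in explicitly (matching the example) instead of being chosen at random.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid updates until centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its points.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # centroids have stabilized
        centroids = new_centroids
    return centroids, clusters


def within_cluster_sum_of_squares(centroids, clusters):
    """The objective J: total squared distance of points to their centroid."""
    return sum(
        squared_distance(p, c) for c, members in zip(centroids, clusters) for p in members
    )


points = [(2, 3), (3, 4), (10, 12), (11, 11)]
centroids, clusters = k_means(points, initial_centroids=[(2, 3), (10, 12)])
objective = within_cluster_sum_of_squares(centroids, clusters)
print(centroids, clusters, objective)
```

On these four points the loop converges on the second pass, with each point lying at squared distance 0.5 from its centroid.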
Applications:
Customer segmentation. Image compression. Anomaly detection.