(SEM V) THEORY EXAMINATION 2024-25 DATA ANALYTICS

B.Tech Engineering

Subject Code: BCS052
Maximum Marks: 70
Time: 3 Hours
Paper ID: 310908

Question Paper Overview

SECTION A (2 × 7 = 14 Marks)

(Short Answer / Conceptual Questions)

a. Differentiate between Predictive and Prescriptive Data Analytics.
b. Define the terms Data Lake, Database, and Data Warehouse.
c. Explain the concept of Outliers.
d. Describe the concept of Lasso Regression.
e. Differentiate between Stream Processing and Traditional Data Processing.
f. Write the two limitations of K-Means.
g. Discuss the various categories of clustering techniques.

SECTION B (Attempt any three × 7 = 21 Marks)

a. Explain the different categories of data analytics with examples.
b. Explore PCA (Principal Component Analysis).

Given data = {4, 8, 13, 7; 11, 4, 5, 14}.

Compute the principal components and reduce dimension from 2D to 1D.
c. Explain Market Basket Analysis.

Is it supervised or unsupervised?

How can a company use it to improve marketing strategies?
d. Differentiate between CLIQUE and ProCLUS clustering algorithms.
e. Differentiate between NoSQL and Relational Databases.

Identify when to use NoSQL instead of a Relational Database, with an example.

SECTION C (Attempt one part from each question × 7 = 35 Marks)

(a) Differentiate between Structured, Semi-Structured, and Unstructured Data.
OR
(b) Describe Big Data and its characteristics.

(a) Differentiate between Neural Network and Artificial Neural Network.
OR
(b) Given two fuzzy sets:

A = {(10, 0.2), (20, 0.4), (25, 0.7), (30, 0.9), (40, 1), (50, 0.4)}

B = {(10, 0.4), (20, 0.1), (25, 0.9), (30, 0.2), (40, 0.6), (50, 0.6)}

Apply Union, Intersection, Complement, Bold Union, and Bold Intersection operations.

(a) Apply the Flajolet-Martin Algorithm on the data stream:

S = 1, 3, 2, 1, 2, 3, 4, 3, 1, 2, 3, 1

Given: h(x) = (6x + 1) mod 5

Identify unique elements in the stream.
OR
(b) Discuss the concept of filtering in Data Stream Processing and explain Bloom Filtering in detail.

(a) Cluster the following eight points into three clusters using K-Means Algorithm:

A₁(2,10), A₂(2,5), A₃(8,4), A₄(5,8), A₅(7,5), A₆(6,4), A₇(1,2), A₈(4,9)

Initial centers: A₁(2,10), A₄(5,8), A₇(1,2)

Distance function:

P(a, b) = |x₂ − x₁| + |y₂ − y₁|

Find the final cluster centers.
OR
(b) A transaction database has 6 transactions, with minimum Support = 50% and minimum Confidence = 60%:

TID	Items Bought
10	Beer, Nuts, Diaper
20	Beer, Coffee, Diaper
30	Beer, Diaper, Eggs
40	Nuts, Eggs, Milk
50	Nuts, Coffee, Diaper, Eggs, Milk
60	Beer, Nuts, Diaper

i) Use Apriori Algorithm to find frequent itemsets.
ii) Show all strong association rules (with support & confidence).

(a) Briefly describe the main components of MapReduce.
OR
(b) Draw and explain the architecture of HIVE with its features.

Key Topics for Revision

1. Categories of Data Analytics

Type	Description	Example
Descriptive	Summarizes past data	Monthly sales reports
Diagnostic	Explains reasons behind trends	Root cause analysis
Predictive	Forecasts future trends	Predicting customer churn
Prescriptive	Suggests optimal actions	Recommending marketing offers

2. Data Storage Concepts

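Database: Structured, transactional data queried with SQL (OLTP).

Data Warehouse: Historical, integrated data organized for analytical queries (OLAP).

Data Lake: Raw data kept in its native format, whether structured, semi-structured, or unstructured (e.g., Hadoop HDFS, AWS S3).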

3. Outliers

Data points that deviate significantly from others. Detected using:

Z-score

IQR (Interquartile Range)

Visualization (Box Plot)
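The first two detection methods can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the sample values, the function names, the z-score cutoff of 3, and the 1.5 × IQR fences are all assumptions chosen because they are common defaults.

```python
import statistics

def z_score_outliers(data, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(data)
    std = statistics.pstdev(data)  # population standard deviation
    return [x for x in data if abs(x - mean) / std > threshold]

def iqr_outliers(data, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

# Hypothetical sample: eleven ordinary readings and one extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 95]

print(z_score_outliers(readings))
print(iqr_outliers(readings))
```

Both functions flag only the value 95 for this sample. Note that the z-score rule needs a reasonably large sample: with very few points, no value can reach a z-score of 3, so the IQR rule is often the safer choice for small data sets.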

4. Lasso Regression

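Regularized regression that adds an L1 penalty (λ times the sum of absolute coefficient values) to the loss.

Shrinks coefficients toward zero and can set some exactly to zero → performs feature selection.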
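Why the L1 penalty produces exact zeros is easiest to see in a special case. Assuming orthonormal features and the objective ½‖y − Xβ‖² + λ‖β‖₁, each lasso coefficient is the ordinary least-squares coefficient passed through a soft-thresholding function. The sketch below shows only that function; the coefficient values and λ are made up for illustration.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical least-squares coefficients for four features.
ols_coefficients = [2.0, -0.25, 1.0, 0.125]
lam = 0.5  # assumed penalty strength

lasso_coefficients = [soft_threshold(b, lam) for b in ols_coefficients]
print(lasso_coefficients)  # [1.5, 0.0, 0.5, 0.0]
```

The two small coefficients are set exactly to zero, which is the feature-selection effect; the larger ones survive but are shrunk by λ.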

5. Stream Processing vs Traditional Processing

Aspect	Stream Processing	Traditional (Batch) Processing
Data handling	Continuous, real-time data flow	Data collected and processed in batches
Frameworks	Apache Flink, Kafka Streams	Hadoop MapReduce, Spark (batch mode)
Example	IoT sensor data	Daily transaction logs

6. PCA (Principal Component Analysis)

Used for dimensionality reduction.

Steps:

Step 1: Standardize data (at minimum, subtract the mean of each feature).

Step 2: Compute covariance matrix.

Step 3: Calculate eigenvalues & eigenvectors.

Step 4: Project data onto principal components.
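The PCA steps can be traced on the Section B data set. The sketch below makes two assumptions that the question does not state outright: the first four numbers are feature x and the last four are feature y, and the data is only mean-centered (not scaled to unit variance), which is how this exercise is usually worked by hand. The function name is hypothetical, and the closed-form eigen-decomposition applies to the 2 × 2 case only.

```python
import math

xs = [4, 8, 13, 7]    # assumed first feature
ys = [11, 4, 5, 14]   # assumed second feature

def pca_2d_to_1d(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]          # mean-centered data
    dy = [y - mean_y for y in ys]

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum(a * a for a in dx) / (n - 1)
    syy = sum(b * b for b in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix from trace and determinant
    half_trace = (sxx + syy) / 2
    root = math.sqrt(half_trace ** 2 - (sxx * syy - sxy ** 2))
    lam1, lam2 = half_trace + root, half_trace - root

    # Unit eigenvector for the largest eigenvalue
    if sxy != 0:
        vx, vy = sxy, lam1 - sxx
    else:  # features uncorrelated: the axis with larger variance wins
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm

    # Project each centered point onto the first principal component
    scores = [a * vx + b * vy for a, b in zip(dx, dy)]
    return (lam1, lam2), (vx, vy), scores

eigenvalues, direction, scores = pca_2d_to_1d(xs, ys)
print([round(v, 3) for v in eigenvalues])
print([round(v, 3) for v in direction])
print([round(s, 3) for s in scores])
```

Under these assumptions the covariance matrix is [[14, −11], [−11, 23]], the eigenvalues are about 30.385 and 6.615, and the 1D coordinates are about 4.305, −3.736, −5.693, and 5.124. The sign of the eigenvector is arbitrary, so a hand solution may show all four coordinates with opposite signs.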

7. Market Basket Analysis

Unsupervised learning (association rule mining).

Commonly uses the Apriori Algorithm: finds frequent itemsets, e.g., {Beer, Diaper}, and derives rules from them, e.g., “Beer → Diaper.”

Applications: Retail recommendations, cross-selling, layout optimization.

8. Clustering

Partitioning methods: K-Means, K-Medoids.

Hierarchical methods: Agglomerative, Divisive.

Density-based: DBSCAN, OPTICS.

Grid-based: CLIQUE, STING.

9. NoSQL vs Relational Database

Feature	Relational	NoSQL
Schema	Fixed	Flexible
Scaling	Vertical	Horizontal
Use Case	Banking	Social media, IoT
Example	MySQL	MongoDB, Cassandra

10. Big Data Characteristics (5Vs)

Volume: Massive data size.

Velocity: Fast data generation and processing.

Variety: Structured, semi-structured, and unstructured data.

Veracity: Accuracy and trustworthiness of the data.

Value: Extracting useful insights.

11. Flajolet–Martin Algorithm

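Estimates the number of distinct elements in a data stream: each element is hashed, the largest number of trailing zero bits r seen in any hash value is tracked, and the estimate is 2^r.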

Efficient for large-scale streaming data.
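A minimal sketch of the algorithm, applied to the Section C stream with h(x) = (6x + 1) mod 5. One convention is assumed here: a hash value of 0 is treated as having zero trailing zeros (some texts handle this case differently). The function names are hypothetical.

```python
def trailing_zeros(value):
    """Count trailing zero bits; a hash of 0 counts as 0 (assumed convention)."""
    if value == 0:
        return 0
    count = 0
    while value % 2 == 0:
        value //= 2
        count += 1
    return count

def flajolet_martin(stream, hash_fn):
    """Estimate the number of distinct elements as 2**r, r = max trailing zeros."""
    max_r = 0
    for element in stream:
        max_r = max(max_r, trailing_zeros(hash_fn(element)))
    return 2 ** max_r

stream = [1, 3, 2, 1, 2, 3, 4, 3, 1, 2, 3, 1]
estimate = flajolet_martin(stream, lambda x: (6 * x + 1) % 5)
print(estimate)  # 4
```

Here the hash values are 2, 3, 4, and 0 for the elements 1, 2, 3, and 4; the binary value 100 (from element 3) has two trailing zeros, so r = 2 and the estimate is 2² = 4, which happens to match the true count. In practice, several hash functions are combined to reduce the variance of the estimate.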

12. Bloom Filtering

Probabilistic data structure for membership testing.

Space-efficient; allows false positives but never false negatives.

Used in caching, networking, and databases.
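A Bloom filter can be sketched as a bit array plus a few hash functions. The class below is an illustration under stated assumptions: the array size, the number of hashes, and the use of salted SHA-256 digests as the hash family are arbitrary choices, and the class and method names are hypothetical.

```python
import hashlib

class BloomFilter:
    def __init__(self, size=64, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))

allowed = BloomFilter()
for address in ["alice@example.com", "bob@example.com"]:
    allowed.add(address)

print(allowed.might_contain("alice@example.com"))  # True
print(allowed.might_contain("carol@example.com"))
```

An added item is always reported as possibly present, because all of its bits were set. An item that was never added is usually reported as absent, but it can be reported as present if other items happen to have set the same bits; that is the false-positive case.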

13. Apriori Algorithm

Step 1: Generate frequent itemsets using minimum support.

Step 2: Generate strong association rules using minimum confidence.

Formulas:

Support(A→B) = freq(A∪B) / total transactions

Confidence(A→B) = freq(A∪B) / freq(A)
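Both steps can be run on the six-transaction table from Section C with minimum support 50% and minimum confidence 60%. This is a compact sketch rather than an optimized implementation; the function names are hypothetical, and thresholds are compared using integer arithmetic to avoid floating-point rounding at the boundary.

```python
from itertools import combinations

transactions = [
    {"Beer", "Nuts", "Diaper"},
    {"Beer", "Coffee", "Diaper"},
    {"Beer", "Diaper", "Eggs"},
    {"Nuts", "Eggs", "Milk"},
    {"Nuts", "Coffee", "Diaper", "Eggs", "Milk"},
    {"Beer", "Nuts", "Diaper"},
]

def apriori(transactions, min_support_pct):
    """Return {frequent itemset: count}, built level by level."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent, k = {}, 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: s for c, s in counts.items() if s * 100 >= min_support_pct * n}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets, then prune candidates that
        # contain any infrequent (k-1)-subset.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

def strong_rules(frequent, n, min_conf_pct):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                antecedent = frozenset(subset)
                if count * 100 >= min_conf_pct * frequent[antecedent]:
                    rules.append((antecedent, itemset - antecedent,
                                  count / n, count / frequent[antecedent]))
    return rules

frequent = apriori(transactions, 50)
rules = strong_rules(frequent, len(transactions), 60)
for antecedent, consequent, support, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={support:.2f}", f"confidence={confidence:.2f}")
```

For this table the frequent itemsets are {Beer}, {Nuts}, {Diaper}, {Eggs}, {Beer, Diaper}, and {Nuts, Diaper}; {Beer, Nuts, Diaper} appears in only 2 of 6 transactions and is pruned. Four rules meet the confidence threshold: Beer → Diaper (100%), Diaper → Beer (80%), Nuts → Diaper (75%), and Diaper → Nuts (60%).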

14. K-Means Clustering

Iterative algorithm that partitions data into k clusters.

Limitations:

Sensitive to initial centroids.

Assumes spherical clusters.
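The iteration can be traced on the eight points from Section C, using the given initial centers and the Manhattan distance from the question. One assumption should be noted: centers are updated as the mean of their assigned points, as in the usual hand solution, even though a median update is the strict counterpart of Manhattan distance. The function names are hypothetical.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def k_means(points, centers, max_iter=100):
    centers = list(centers)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: manhattan(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers, clusters

points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
initial = [(2, 10), (5, 8), (1, 2)]  # A1, A4, A7

final_centers, final_clusters = k_means(points, initial)
print([(round(x, 3), round(y, 3)) for x, y in final_centers])
```

Under these assumptions the run converges to the clusters {A₁, A₄, A₈}, {A₃, A₅, A₆}, and {A₂, A₇}, with centers at about (3.667, 9), (7, 4.333), and (1.5, 3.5). Starting from different initial centers can give a different result, which illustrates the first limitation above.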

15. MapReduce Components

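Map Phase: Processes each input split and emits intermediate key-value pairs.

Shuffle & Sort: Groups the intermediate values by key and sorts the keys.

Reduce Phase: Aggregates the values for each key to produce the final output.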
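The three phases can be imitated in plain Python with the classic word-count example. This is a single-process sketch of the data flow only; it does not use Hadoop, and the function names and input splits are hypothetical.

```python
from collections import defaultdict

def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]

def shuffle_and_sort(pairs):
    """Group values by key and order the keys."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

splits = ["big data big insight", "data stream"]

intermediate = [pair for split in splits for pair in map_phase(split)]
word_counts = reduce_phase(shuffle_and_sort(intermediate))
print(word_counts)  # {'big': 2, 'data': 2, 'insight': 1, 'stream': 1}
```

In a real cluster, the map calls run in parallel on different nodes, and the shuffle moves each key's values across the network to the reducer responsible for that key.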

16. HIVE Architecture

Built on top of Hadoop for data querying (SQL-like interface).

Components:

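Driver: Receives queries and manages their lifecycle (compilation, optimization, execution).

Metastore: Stores table schemas and other metadata.

Execution Engine: Runs the compiled query plan as MapReduce jobs on Hadoop.

HiveQL: SQL-like query language.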

Uploader: SuGanta International
