(SEM VII) THEORY EXAMINATION 2023-24 DATA WAREHOUSING AND DATA MINING
SECTION A – Very Short Answer Type
(2 × 10 = 20 Marks)
a) Key steps of Data Mining
The key steps are:
Data cleaning
Data integration
Data selection
Data transformation
Data mining
Pattern evaluation
Knowledge presentation
These steps convert raw data into useful knowledge.
b) Support and Confidence
Support: Fraction of transactions in the database that contain the itemset.
Confidence: For a rule X → Y, the conditional probability that a transaction containing X also contains Y, i.e. support(X ∪ Y) / support(X).
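As a minimal sketch of both measures, the snippet below counts itemsets in a small, made-up list of market-basket transactions. The data and the function names (`count`, `support`, `confidence`) are illustrative assumptions, not part of the original answer.

```python
# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(x, y, transactions):
    """Estimate P(Y | X) as count(X union Y) / count(X)."""
    return count(set(x) | set(y), transactions) / count(x, transactions)

sup_bread_milk = support({"bread", "milk"}, transactions)
conf_bread_to_milk = confidence({"bread"}, {"milk"}, transactions)
print(sup_bread_milk, conf_bread_to_milk)
```

With this data, {bread, milk} appears in 3 of 5 transactions (support 0.6), and 3 of the 4 transactions containing bread also contain milk (confidence 0.75).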
c) Data Warehouse Process
It involves data extraction from multiple sources, transformation to a consistent format, and loading into a central repository for analysis.
d) Warehousing Strategy
Warehousing strategy defines how data is collected, stored, and accessed, such as enterprise warehouse, data mart, or virtual warehouse.
e) Statement of Apriori Algorithm
“All non-empty subsets of a frequent itemset must also be frequent.”
This property reduces the search space in association rule mining: any candidate itemset that has an infrequent subset can be pruned without counting its support.
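The pruning effect can be sketched as follows. This is a simplified illustration that enumerates all size-k combinations and discards those with an infrequent (k-1)-subset; the standard algorithm uses a more efficient join step. The itemsets and function names are assumptions made for the example.

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_smaller):
    """True if any (k-1)-subset of the candidate is not a known frequent itemset."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_smaller
        for subset in combinations(candidate, k - 1)
    )

def generate_candidates(frequent_smaller, k):
    """Brute-force size-k candidates that survive Apriori pruning."""
    items = sorted({item for itemset in frequent_smaller for item in itemset})
    return [
        frozenset(combo)
        for combo in combinations(items, k)
        if not has_infrequent_subset(combo, frequent_smaller)
    ]

# Hypothetical frequent 2-itemsets found in an earlier pass.
frequent_2 = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]}
candidates_3 = generate_candidates(frequent_2, 3)
print(candidates_3)
```

Only {A, B, C} survives here: {A, B, D}, {A, C, D}, and {B, C, D} each contain a pair that is not frequent, so their support never has to be counted.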
f) Drawbacks of K-Means Algorithm
Requires predefined number of clusters
Sensitive to initial centroids
Fails with non-spherical clusters
Affected by noise and outliers
g) Classification
Classification assigns data objects to predefined classes based on labeled training data, using algorithms like decision trees and Naive Bayes.
h) Clustering
Clustering groups similar data objects without predefined labels, aiming to maximize similarity within clusters and minimize similarity between clusters.
i) Need for Data Mining
Data mining helps discover hidden patterns, trends, and relationships in large datasets to support decision-making.
j) Binning
Binning is a data smoothing technique that reduces noise by grouping values into intervals or bins.
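One common variant is smoothing by bin means over equal-frequency bins. The sketch below assumes that variant; the function name and the sample values are illustrative.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of `bin_size`, and replace
    every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed

prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_bin_means(prices, 3)
print(smoothed)
```

The three bins (4, 8, 15), (21, 21, 24), and (25, 28, 34) are replaced by their means 9, 22, and 29.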
SECTION B – Long Answer Type
(Attempt any three – 10 Marks each)
2(a) Knowledge Extraction Process
Knowledge extraction is the process of discovering meaningful patterns from large datasets.
Steps:
Data Selection – Choose relevant data
Data Preprocessing – Clean noisy and missing data
Transformation – Normalize and aggregate data
Data Mining – Apply algorithms (classification, clustering, association)
Pattern Evaluation – Identify interesting patterns
Knowledge Presentation – Visualize results using charts and reports
This process converts raw data into actionable knowledge.
2(b) OLAP Functions, Tools, and Servers
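OLAP Functions:
Roll-up – Aggregates data to a higher level of a hierarchy
Drill-down – Moves from summarized data to more detailed data
Slice – Fixes one dimension to a single value
Dice – Selects a sub-cube by restricting two or more dimensions
Pivot – Rotates the axes of the view
OLAP Tools:
OLAP tools are commonly categorized by the storage architecture they rely on:
MOLAP (Multidimensional OLAP) – Stores data in multidimensional arrays
ROLAP (Relational OLAP) – Stores data in relational tables
HOLAP (Hybrid OLAP) – Combines relational and multidimensional storage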
OLAP Servers:
OLAP servers store and process multidimensional data, enabling fast analytical queries.
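Slice, dice, and pivot can be sketched on a tiny in-memory cube. This is an illustration under assumed data: the cube is a plain dictionary keyed by (year, city, product), which is not how a real OLAP server stores data, and the function names are hypothetical.

```python
# Hypothetical cube cells keyed by (year, city, product) -> units sold.
cube = {
    (2023, "Delhi", "Laptop"): 10,
    (2023, "Delhi", "Phone"): 25,
    (2023, "Mumbai", "Laptop"): 7,
    (2024, "Delhi", "Laptop"): 12,
    (2024, "Mumbai", "Phone"): 30,
}

def slice_cube(cube, year):
    """Slice: fix the year dimension to one value, leaving a 2-D sub-cube."""
    return {(city, product): v for (y, city, product), v in cube.items() if y == year}

def dice_cube(cube, years, cities, products):
    """Dice: keep only cells whose coordinates fall in the chosen value sets."""
    return {
        key: v for key, v in cube.items()
        if key[0] in years and key[1] in cities and key[2] in products
    }

def pivot(sub_cube):
    """Pivot: swap the two axes of a 2-D slice."""
    return {(b, a): v for (a, b), v in sub_cube.items()}

slice_2023 = slice_cube(cube, 2023)
delhi_laptops = dice_cube(cube, {2023, 2024}, {"Delhi"}, {"Laptop"})
pivoted = pivot(slice_2023)
```

Roll-up and drill-down are not shown here because they change the level of aggregation rather than select or rearrange existing cells.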
2(c) Database Schemas
Types:
Star Schema – Central fact table connected to dimension tables
Snowflake Schema – Variant of the star schema in which dimension tables are normalized into multiple related tables
Fact Constellation Schema – Multiple fact tables sharing dimensions
Example:
Sales fact table linked to time, product, and location dimensions.
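The example can be sketched with in-memory tables. All table contents, key names, and the `sales_by_category` helper are assumptions for illustration; a real warehouse would hold these as relational tables and answer the query with a join.

```python
# Dimension tables: surrogate key -> descriptive attributes (hypothetical data).
time_dim = {1: {"year": 2023, "quarter": "Q1"}, 2: {"year": 2023, "quarter": "Q2"}}
product_dim = {
    10: {"name": "Laptop", "category": "Electronics"},
    11: {"name": "Desk", "category": "Furniture"},
}
location_dim = {
    100: {"city": "Delhi", "country": "India"},
    101: {"city": "Mumbai", "country": "India"},
}

# Fact table: foreign keys into each dimension plus a numeric measure.
sales_fact = [
    {"time_id": 1, "product_id": 10, "location_id": 100, "amount": 500},
    {"time_id": 1, "product_id": 11, "location_id": 101, "amount": 200},
    {"time_id": 2, "product_id": 10, "location_id": 101, "amount": 700},
]

def sales_by_category(facts, products):
    """Join each fact row to the product dimension and total the measure."""
    totals = {}
    for row in facts:
        category = products[row["product_id"]]["category"]
        totals[category] = totals.get(category, 0) + row["amount"]
    return totals

category_totals = sales_by_category(sales_fact, product_dim)
print(category_totals)
```

The fact table holds only keys and measures; descriptive attributes such as `category` live in the dimension tables, which is what makes the layout a star.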
2(d) Statistical Measures in Classification
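Key measures include:
Mean
Median
Variance
Standard Deviation
Correlation
These measures summarize the data distribution. Classifiers use them to normalize attributes, to estimate class-conditional distributions (as in Naive Bayes), and to detect redundant attributes through correlation.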
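One way such measures feed a classifier is a Gaussian Naive Bayes model, which stores a mean and variance per class. The sketch below assumes a single numeric attribute, equal class priors, and made-up training values; it is an illustration, not a full implementation.

```python
import math
from statistics import mean, pvariance

# Hypothetical training values of one numeric attribute, grouped by class.
training = {"low": [1.0, 2.0, 3.0], "high": [7.0, 8.0, 9.0]}

def fit(training):
    """Summarize each class by the mean and population variance of its values."""
    return {label: (mean(xs), pvariance(xs)) for label, xs in training.items()}

def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean `mu` and variance `var`."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify(x, params):
    """Pick the class whose Gaussian gives x the highest density
    (equal class priors are assumed)."""
    return max(params, key=lambda label: gaussian_pdf(x, *params[label]))

params = fit(training)
print(classify(2.5, params), classify(7.2, params))
```

Because only two numbers per class are stored, the model stays small no matter how many training rows the database holds.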
2(e) Building a Data Warehouse
Steps include:
Business analysis
Data source identification
ETL design
Data modeling
Data loading
Testing and deployment
A data warehouse supports long-term decision making.
SECTION C – Descriptive Answer Type
3(a) Mapping Data Warehouse to Multiprocessor Architecture
Steps:
Partition data across processors
Assign fact and dimension tables
Enable parallel query processing
Synchronize data access
Optimize workload distribution
This improves performance and scalability.
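The partition-then-merge idea behind parallel query processing can be sketched as below. The round-robin partitioning, the worker count, and the sample rows are assumptions; Python threads stand in for separate processors here and do not give true CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, n_partitions):
    """Round-robin partitioning: spread rows evenly across partitions."""
    parts = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        parts[i % n_partitions].append(row)
    return parts

def partial_totals(rows):
    """Work done by one processor: aggregate only its own partition."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals

def parallel_totals(rows, n_workers=2):
    """Aggregate each partition independently, then merge the partial results."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_totals, parts))
    merged = {}
    for partial in partials:
        for region, amount in partial.items():
            merged[region] = merged.get(region, 0) + amount
    return merged

# Hypothetical (region, amount) fact rows.
sales = [("North", 100), ("South", 50), ("North", 25), ("East", 75), ("South", 10)]
totals = parallel_totals(sales)
print(totals)
```

The merge step is the only point where workers' results meet, which is why sums and counts parallelize well across partitions.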
3(b) Data Cubes with Example
A data cube represents data in multiple dimensions.
Example:
Sales analyzed by time, location, and product.
Each cell stores aggregated values like total sales.
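A minimal sketch of how cube cells could be filled from raw sales rows, assuming made-up data and a dictionary keyed by (time, location, product) as the cube:

```python
from collections import defaultdict

# Hypothetical raw rows: (time, location, product, sales amount).
sales_rows = [
    ("Q1", "Delhi", "Laptop", 500),
    ("Q1", "Delhi", "Laptop", 300),
    ("Q1", "Mumbai", "Phone", 200),
    ("Q2", "Delhi", "Phone", 400),
]

def build_cube(rows):
    """Aggregate rows so each (time, location, product) cell holds total sales."""
    cube = defaultdict(int)
    for time, location, product, amount in rows:
        cube[(time, location, product)] += amount
    return dict(cube)

cube = build_cube(sales_rows)
print(cube[("Q1", "Delhi", "Laptop")])
```

The two Q1 laptop sales in Delhi collapse into a single cell with a total of 800, so the four rows produce three cells.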
4(a) Concept Hierarchy
Concept hierarchy organizes data from general to specific levels.
Example:
Location → Country → State → City
It supports roll-up and drill-down operations.
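Roll-up along this hierarchy can be sketched with two lookup tables that map each level to its parent. The city sales figures and the `roll_up` helper are illustrative assumptions.

```python
# Each mapping links a level of the hierarchy to the level above it.
city_to_state = {"Mumbai": "Maharashtra", "Pune": "Maharashtra", "Chennai": "Tamil Nadu"}
state_to_country = {"Maharashtra": "India", "Tamil Nadu": "India"}

# Hypothetical sales totals at the most specific level (city).
city_sales = {"Mumbai": 120, "Pune": 80, "Chennai": 100}

def roll_up(totals, mapping):
    """Aggregate totals one level up the hierarchy using a child -> parent mapping."""
    rolled = {}
    for member, value in totals.items():
        parent = mapping[member]
        rolled[parent] = rolled.get(parent, 0) + value
    return rolled

state_sales = roll_up(city_sales, city_to_state)
country_sales = roll_up(state_sales, state_to_country)
print(state_sales, country_sales)
```

Drill-down is the reverse navigation: starting from the country total and returning to the stored state or city figures.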
4(b) Warehouse Management and Support Process
Includes:
Data refresh
Indexing
Backup and recovery
Query management
Performance tuning
Ensures smooth warehouse operation.
5(a) Data Mining System Integration Approaches
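| Approach | Description |
|---|---|
| No Coupling | Mining system uses no database or data warehouse functions; data is read from files |
| Loose Coupling | Mining system fetches data from a database or data warehouse and stores results there |
| Semi-tight Coupling | Some essential mining primitives (e.g., sorting, indexing, aggregation) are implemented in the database system |
| Tight Coupling | Mining system is fully integrated into the database or data warehouse system |
Semi-tight coupling performs better than no coupling or loose coupling; tight coupling offers the best performance but is the hardest to implement.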
5(b) Data Consolidation Statement Justification
Yes, data consolidation is a data modeling activity because it integrates data from multiple sources into a unified schema for analysis.
6(a) Measures of Central Tendency
Mean – Average value
Median – Middle value
Mode – Most frequent value
These summarize dataset characteristics.
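All three measures are available in Python's standard `statistics` module. The marks below are made-up sample data.

```python
from statistics import mean, median, mode

# Hypothetical exam marks with one high outlier.
marks = [55, 60, 60, 70, 95]

average = mean(marks)      # sum of values divided by their count
middle = median(marks)     # middle value of the sorted data
most_common = mode(marks)  # most frequently occurring value
print(average, middle, most_common)
```

Here the mean is 68 while the median and mode are both 60, which shows how a single outlier pulls the mean more than the other two measures.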
6(b) Quartiles and Histograms
Quartiles divide sorted data into four equal parts
Histograms graphically represent data distribution
Both help in understanding data spread.
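A short sketch of both ideas, assuming made-up data, `statistics.quantiles` with its default (exclusive) method for the quartiles, and equal-width bins of width 5 for the histogram counts:

```python
from collections import Counter
from statistics import quantiles

# Hypothetical sample of 11 values.
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18]

# Three cut points (Q1, Q2, Q3) that split the sorted data into four parts.
q1, q2, q3 = quantiles(data, n=4)

def histogram(values, width):
    """Count how many values fall into each equal-width bin,
    keyed by the bin's lower bound."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

bins = histogram(data, 5)
print(q1, q2, q3, bins)
```

For this sample the quartiles are 4, 8, and 12, and the bins starting at 0, 5, 10, and 15 hold 3, 4, 2, and 2 values. Other quartile conventions can give slightly different cut points.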
7(a) Distance-Based vs Decision Tree Algorithms
| Aspect | Distance-Based | Decision Tree |
|---|---|---|
| Method | Similarity measures | Rule-based |
| Interpretability | Low | High |
| Noise handling | Weak | Strong |
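The contrast in method and interpretability can be sketched with a 1-nearest-neighbour classifier next to a one-rule decision stump. The training points are made up, and the stump's threshold is hand-picked for illustration rather than learned from the data.

```python
import math

# Hypothetical labeled 2-D points.
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.5), "B"),
]

def nearest_neighbour(point, training):
    """Distance-based: return the label of the closest training point
    (Euclidean distance)."""
    _, label = min(training, key=lambda item: math.dist(point, item[0]))
    return label

def decision_stump(point, threshold=3.0):
    """Rule-based: a single readable test on the first attribute."""
    return "A" if point[0] <= threshold else "B"

print(nearest_neighbour((2.0, 2.0), training), decision_stump((2.0, 2.0)))
print(nearest_neighbour((5.5, 5.0), training), decision_stump((5.5, 5.0)))
```

Both classifiers agree on these points, but the stump's decision can be read directly as a rule, while the nearest-neighbour result depends on the stored training points and the chosen distance measure.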
7(b) Web Mining
Types:
Web Content Mining – Extracts text and media
Web Structure Mining – Analyzes link structure
Web Usage Mining – Studies user behavior
Used in recommendation systems and search engines.