(SEM VIII) THEORY EXAMINATION 2024-25 DATA WAREHOUSING & DATA MINING
SECTION A – Short Answers (2 Marks Each) – Paragraph Style
a) Explain Discretization.
Discretization is a data preprocessing technique used in data mining where continuous numerical data is converted into a finite number of intervals or categories. This helps reduce data complexity and improves the performance of data mining algorithms by making patterns easier to identify and analyze.
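As a minimal sketch of one common form of discretization, equal-width binning, the snippet below maps each value to an interval index. The function name `equal_width_bins`, the sample ages, and the choice of three bins are illustrative assumptions, and the sketch assumes the values are not all identical.

```python
def equal_width_bins(values, k):
    """Map each value to one of k equal-width interval indices (0..k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k  # assumes hi > lo
    labels = []
    for v in values:
        # The maximum value would fall into index k, so clamp it to the last bin.
        idx = min(int((v - lo) / width), k - 1)
        labels.append(idx)
    return labels


ages = [18, 22, 25, 31, 40, 47, 58, 60]  # hypothetical data
age_bins = equal_width_bins(ages, 3)
print(age_bins)
```

Each index can then be given a category label such as "young", "middle-aged", or "senior".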
b) Discuss issues to consider during data integration.
Data integration involves combining data from multiple sources into a unified dataset. During this process, issues such as schema conflicts, naming inconsistencies, data redundancy, and data value conflicts may arise. Resolving these issues is necessary to maintain data consistency and accuracy.
c) Explain the working of an Artificial Neuron.
An artificial neuron is the basic unit of a neural network and mimics the functioning of a biological neuron. It receives input signals, multiplies each by a weight, sums the weighted inputs together with a bias, and applies an activation function to the sum to produce an output. Learning consists of adjusting the weights and bias so that the output moves closer to the desired output.
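The computation described above can be sketched as follows. The sigmoid activation, the bias term, and the sample inputs and weights are assumptions chosen for illustration; other activation functions are equally valid.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs and weights.
output = neuron([1.0, 0.5], [0.4, -0.2], 0.1)
print(round(output, 4))
```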
d) Explain support and confidence in association rule mining.
Support is the fraction of transactions in a dataset that contain a given itemset. Confidence of a rule X → Y is the fraction of transactions containing X that also contain Y, that is, support(X ∪ Y) divided by support(X). A rule is usually considered interesting only if it meets both a minimum support threshold and a minimum confidence threshold.
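A small sketch of both measures on a hypothetical set of four transactions follows. The item names and the helper names `support` and `confidence` are made up for illustration.

```python
transactions = [  # hypothetical market-basket data
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


s = support({"bread", "milk"}, transactions)
c = confidence({"bread"}, {"milk"}, transactions)
print(s, round(c, 3))
```

Here the rule bread → milk has support 2/4 and confidence 2/3, because bread appears in three transactions and two of them also contain milk.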
e) Explain pivot operation in OLAP.
The pivot operation in OLAP allows users to rotate the data cube to view data from different perspectives. It helps in analyzing multidimensional data by rearranging rows and columns to gain better insights.
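For a two-dimensional view of a cube, pivoting amounts to swapping the row and column axes. The sketch below assumes a hypothetical region-by-quarter sales table stored as nested dictionaries; real OLAP tools perform this rotation on cube structures rather than on Python dictionaries.

```python
sales = {  # hypothetical view: rows are regions, columns are quarters
    "North": {"Q1": 100, "Q2": 120},
    "South": {"Q1": 80, "Q2": 90},
}


def pivot(table):
    """Swap the row and column axes of a nested-dictionary table."""
    rotated = {}
    for row_key, row in table.items():
        for col_key, value in row.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated


by_quarter = pivot(sales)  # rows are now quarters, columns are regions
print(by_quarter)
```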
f) Define Information Gain.
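Information Gain is a measure used in decision tree algorithms such as ID3 to determine the best attribute for splitting data. It is the entropy of the dataset minus the weighted average entropy of the partitions produced by splitting on an attribute. The attribute with the highest information gain is chosen for the split.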
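The calculation can be sketched as below, assuming each row is a tuple whose last element is the class label. The tiny weather-style dataset and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attr_index, label_index=-1):
    """Entropy before the split minus the weighted entropy after it."""
    labels = [r[label_index] for r in rows]
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attr_index], []).append(r[label_index])
    weighted = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return entropy(labels) - weighted


rows = [("sunny", "no"), ("sunny", "no"), ("rainy", "yes"), ("rainy", "yes")]
gain = information_gain(rows, 0)
print(gain)
```

In this data the attribute separates the classes perfectly, so the gain equals the full entropy of the labels (1 bit).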
g) Define Data Warehouse.
A data warehouse is a centralized repository of subject-oriented, integrated, time-variant (historical), and non-volatile data collected from multiple sources to support decision-making and analysis.
h) Explain Outlier Analysis.
Outlier analysis identifies data objects that significantly differ from the rest of the dataset. These unusual values may indicate errors, rare events, or important insights and must be carefully analyzed.
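One simple way to flag such objects is a z-score test, sketched below. The threshold of 2 standard deviations and the sample values are assumptions for illustration; distance-based and density-based methods are common alternatives.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if abs(v - mean) / stdev > threshold]


readings = [10, 12, 11, 13, 12, 95]  # hypothetical data with one extreme value
outliers = zscore_outliers(readings)
print(outliers)
```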
i) What do negative, positive, and zero correlation coefficients indicate?
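The correlation coefficient ranges from -1 to +1. A positive value indicates that the two variables tend to increase together, a negative value indicates that one variable tends to decrease as the other increases, and a value of zero means there is no linear relationship between the variables, although a non-linear relationship may still exist.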
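The three cases can be illustrated with Pearson's correlation coefficient, assuming that is the coefficient meant here. The data series below are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


xs = [1, 2, 3, 4]
r_positive = pearson(xs, [2, 4, 6, 8])   # rises with xs
r_negative = pearson(xs, [8, 6, 4, 2])   # falls as xs rises
r_zero = pearson(xs, [1, -1, -1, 1])     # no linear trend
print(r_positive, r_negative, r_zero)
```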
j) Explain the binning method for dealing with noisy data.
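Binning is a data smoothing technique in which the sorted data values are partitioned into bins and each value is replaced by a representative value of its bin, such as the bin mean, the bin median, or the nearest bin boundary. Because each value is smoothed using only its neighbours, this reduces noise while preserving the overall distribution.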
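A minimal sketch of smoothing by bin means with equal-frequency bins follows. The price list and the bin size of three are illustrative assumptions.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size, and replace each value by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # hypothetical data
smoothed_prices = smooth_by_bin_means(prices, 3)
print(smoothed_prices)
```

The three bins (4, 8, 15), (21, 21, 24), and (25, 28, 34) are replaced by their means 9, 22, and 29.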
SECTION B – Descriptive Answers (10 Marks Each) – Paragraph Style
a) Explain three-tier data warehousing architecture.
The three-tier data warehousing architecture consists of the bottom tier, middle tier, and top tier. The bottom tier is the data warehouse database server, usually a relational database, into which data from operational and external sources is extracted, cleaned, transformed, and loaded. The middle tier is an OLAP server, implemented as relational OLAP (ROLAP) or multidimensional OLAP (MOLAP), which processes queries and performs analytical operations. The top tier consists of front-end tools used for querying, reporting, analysis, and data mining. Separating storage, analytical processing, and presentation in this way lets each tier be scaled and managed independently.
b) Compare OLTP and OLAP systems.
OLTP systems are designed for routine transaction processing and handle large numbers of short, simple operations such as insert, update, and delete. OLAP systems, on the other hand, are used for analytical processing and complex queries involving large datasets. While OLTP focuses on operational efficiency, OLAP emphasizes data analysis and decision support.
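The difference in workload shape can be sketched with Python's built-in `sqlite3` module. The `orders` table and its columns are hypothetical, and in practice OLAP queries run against a separate warehouse rather than the operational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style work: a short transaction that inserts or updates a few rows.
with conn:
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("North", 100.0))
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("South", 80.0))
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("North", 50.0))
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (120.0, 1))

# OLAP-style work: a read-only query that scans and aggregates many rows.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```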
c) Discuss snowflake and fact constellation schemas.
A snowflake schema is an extension of the star schema in which dimension tables are normalized into additional tables to reduce redundancy. This saves storage space but requires more joins at query time. A fact constellation schema, also called a galaxy schema, consists of multiple fact tables sharing common dimension tables, which allows several related business processes to be modeled in one schema.
d) Explain Enterprise Warehouse and Data Mart.
An enterprise warehouse stores integrated data from the entire organization and supports strategic decision-making. A data mart is a subset of a data warehouse designed for a specific department or business function. Data marts are smaller, faster to implement, and easier to manage.
e) Explain non-linear separability using EX-OR functionality in neural networks.
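The EX-OR problem demonstrates non-linear separability: the inputs that produce output 1, (0,1) and (1,0), cannot be separated from those that produce output 0, (0,0) and (1,1), by a single straight line, so a single-layer perceptron cannot represent the function. A multi-layer perceptron solves the problem by using hidden neurons, each of which draws its own linear boundary, and combining their outputs into a non-linear decision boundary.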
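The sketch below shows one two-layer network that computes EX-OR. The weights and thresholds are hand-chosen for illustration rather than learned, and the step activation is an assumption: one hidden neuron acts as OR, the other as AND, and the output neuron fires when OR is on but AND is off.

```python
def step(z):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0


def xor_network(x1, x2):
    h_or = step(x1 + x2 - 0.5)        # hidden neuron 1: fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)       # hidden neuron 2: fires only if both inputs are 1
    return step(h_or - h_and - 0.5)   # output neuron: OR and not AND


truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```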
SECTION C – Long Answer (10 Marks) – Paragraph Style
a) Explain Laplacian correction in Naïve Bayesian classifier with an example.
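Laplacian correction (Laplace smoothing) is used in Naïve Bayesian classifiers to avoid zero probability values when an attribute value never occurs together with a class in the training data. Because the classifier multiplies the conditional probabilities of all attributes, a single zero would cancel out the evidence from every other attribute. The correction adds 1 to each frequency count and adds the number of distinct attribute values to the denominator, so that no probability becomes zero while the estimates change only slightly. This improves classification accuracy, especially when dealing with limited training data.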
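As a worked example under assumed counts, suppose a class contains 1000 training tuples, of which 0 have income = low, 990 have income = medium, and 10 have income = high. The sketch below applies add-one smoothing to these hypothetical counts; the function name and the `alpha` parameter are illustrative.

```python
def laplace_probabilities(counts, alpha=1):
    """Add alpha to every count and renormalize so that no estimate is zero."""
    total = sum(counts.values()) + alpha * len(counts)
    return {value: (count + alpha) / total for value, count in counts.items()}


income_counts = {"low": 0, "medium": 990, "high": 10}  # hypothetical counts within one class
probs = laplace_probabilities(income_counts)
print(probs)
```

Without the correction the estimate for income = low would be 0/1000 = 0. With it, the estimates become 1/1003, 991/1003, and 11/1003, which stay close to the uncorrected values while avoiding the zero.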
OR
b) Explain top-down and bottom-up approaches for hierarchical clustering.
The top-down approach, also known as divisive clustering, starts with all data points in a single cluster and recursively splits it into smaller clusters. The bottom-up approach, called agglomerative clustering, begins with each data point as an individual cluster and repeatedly merges the two closest clusters. Both approaches build a hierarchy of clusters, usually drawn as a dendrogram, that helps reveal the structure of the data, and both stop when a termination condition such as a desired number of clusters is reached.
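The bottom-up approach can be sketched on one-dimensional points as follows. Single-linkage distance (the closest pair of members), the stopping rule of a target cluster count, and the sample points are assumptions made for illustration.

```python
def agglomerative(points, target_clusters):
    """Repeatedly merge the two closest clusters (single linkage) until target_clusters remain."""
    clusters = [[p] for p in points]  # start with every point as its own cluster
    merges = []                       # record of each merged cluster, in order
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        merges.append(merged)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


points = [1.0, 2.0, 6.0, 7.0, 15.0]  # hypothetical data
clusters, merges = agglomerative(points, 2)
print(clusters)
print(merges)
```

The recorded merges correspond to the levels of the dendrogram. A divisive method would produce a hierarchy in the opposite direction, starting from the single cluster containing all five points.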