(SEM V) THEORY EXAMINATION 2023-24 DATA ANALYTICS

Subject Code: KCS051

Subject Name: Data Analytics

Course: B.Tech (Semester V)

Maximum Marks: 100

Duration: 3 Hours

Exam Year: 2023–24

Sections: A, B, and C

SECTION A – Short Answer Questions (2 × 10 = 20 Marks)

Attempt all questions briefly.

a. How have advancements in technology contributed to the scalability of analytics?
b. What are the sources of data in data analytics?
c. Elaborate on the mathematical foundations of Support Vector Machines (SVMs).
d. Discuss advantages of Bayesian methods in real-world applications.
e. Elaborate on methods used for filtering streams in real-time analytics.
f. What are key considerations when implementing sampling techniques for stream data?
g. Differentiate stream-based algorithms vs batch processing in frequent itemset mining.
h. What are challenges in Apriori algorithm under memory constraints?
i. Explain the role of Hive in the Hadoop ecosystem.
j. How does the MapReduce framework facilitate distributed processing?

Tips:

Revise Hadoop ecosystem (HDFS, MapReduce, Hive).

Learn SVM equations — hyperplane, kernel trick.

Understand real-time stream filtering and batch vs stream difference.
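
For the SVM tip above, the core equations can be summarized as follows. This is a standard textbook formulation, not taken from the exam paper; the symbols (w, b, x_i, y_i, alpha_i, phi, K) are assumed notation, with labels y_i in {-1, +1}.

```latex
\begin{align*}
&\text{Hyperplane:} && \mathbf{w}^{\top}\mathbf{x} + b = 0 \\
&\text{Hard-margin primal:} && \min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
  \quad \text{s.t.}\quad y_i(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1,\ i = 1,\dots,n \\
&\text{Margin width:} && \frac{2}{\lVert\mathbf{w}\rVert} \\
&\text{Kernel trick:} && K(\mathbf{x},\mathbf{x}') = \phi(\mathbf{x})^{\top}\phi(\mathbf{x}') \\
&\text{Decision function:} && f(\mathbf{x}) = \operatorname{sign}\Big(\sum_{i=1}^{n}\alpha_i y_i K(\mathbf{x}_i,\mathbf{x}) + b\Big)
\end{align*}
```

Minimizing the norm of w maximizes the margin; the kernel K replaces the explicit feature map phi, so the decision function never needs phi directly.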

SECTION B – Medium-Length Questions (10 × 3 = 30 Marks)

Attempt any three of the following.

Characteristics of Data: Volume, Variety, Velocity, Veracity, Value.

Impact: affects storage, scalability, and analytical model design.

Bayesian Networks:

Probabilistic graphical model based on conditional dependencies.

Used in medical diagnosis, predictive modeling, and uncertainty handling.
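
A minimal sketch of the medical-diagnosis use case, assuming the simplest possible network with two nodes (Disease -> Test). The function name and the probabilities are illustrative assumptions, not values from the exam paper.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(disease | positive test) using Bayes' rule."""
    # Total probability of a positive test (sick and healthy patients combined).
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive


# Assumed numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(p, 3))  # about 0.161
```

Even with an accurate test, the posterior stays low because the disease is rare; this is the kind of uncertainty handling the network makes explicit.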

Real-time Analytics Platforms:

Tools: Apache Storm, Flink, Kafka Streams.

Used for live data processing — IoT sensors, stock trading, etc.

Clustering Comparison:

K-Means: Efficient, assumes spherical clusters.

Hierarchical: Dendrogram-based; useful when the number of clusters is not known in advance, and can capture non-spherical clusters (for example, with single linkage).
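
The K-Means side of the comparison can be sketched in a few lines. This is a simplified one-dimensional version of the usual assign-then-update loop (Lloyd's algorithm); the function name, data, and starting centroids are assumptions for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Simplified 1-D K-Means: assign points to the nearest centroid, then recompute means."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each point joins its nearest centroid.
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


centroids, clusters = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [1.0, 12.0])
print(centroids)  # [2.0, 11.0]
```

The number of clusters is fixed by the starting centroids, which is the main practical difference from hierarchical clustering.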

Sharding in NoSQL:

Horizontal data partitioning to improve scalability.

Addresses challenges in distributed database management (MongoDB, Cassandra).
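
A minimal sketch of hash-based sharding, assuming a fixed number of shards. The function shard_for and the key format are hypothetical; this is not how MongoDB or Cassandra route requests internally, only the general idea of mapping a key to a partition.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index in the range 0..num_shards-1."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would route keys inconsistently.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


for user in ["user:1", "user:2", "user:3"]:
    print(user, "-> shard", shard_for(user, num_shards=4))
```

Plain modulo hashing moves most keys when the shard count changes, which is one reason production systems tend to use range-based or consistent-hashing schemes.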

SECTION C – Long / Analytical Questions (10 × 5 = 50 Marks)

Q3. Neural Networks and Fuzzy Models

a. Generalization in Neural Networks:

Balancing bias–variance trade-off using regularization and dropout.
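
Both techniques named above can be sketched without a framework. This assumes plain Python lists for weights and activations; the helper names l2_penalty and dropout are illustrative, and the dropout shown is the common "inverted" variant that rescales the kept units.

```python
import random


def l2_penalty(weights, lam):
    """L2 regularization term added to the loss: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)


def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each unit with probability drop_prob, rescale the rest."""
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the example is repeatable
print(l2_penalty([1.0, -2.0], lam=0.1))
print(dropout([1.0, 2.0, 3.0, 4.0], drop_prob=0.5, rng=rng))
```

The penalty discourages large weights (lower variance), while dropout prevents units from co-adapting; dropout is applied during training only.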
OR
b. Fuzzy Logic Models:

Fuzzy rules capture uncertainty better than crisp models.

Applied in expert systems and predictive modeling.
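
A minimal sketch of how a fuzzy model differs from a crisp one, assuming triangular membership functions. The temperature sets "cool" and "warm" and their breakpoints are made-up values for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


temperature = 22.0
cool = triangular(temperature, 5.0, 15.0, 25.0)   # degree of "cool"
warm = triangular(temperature, 15.0, 25.0, 35.0)  # degree of "warm"
print(cool, warm)  # roughly 0.3 and 0.7
```

A crisp model would label 22 degrees as either warm or not warm; the fuzzy model says it is partly both, and rules such as "IF warm THEN fan is fast" fire to that degree.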

Q4. Stream Data Analysis

a. Counting distinct elements in streams:

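Algorithms: Flajolet–Martin and HyperLogLog (probabilistic cardinality estimators). Bloom filters are related but answer set-membership queries rather than counting distinct elements.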

Used for real-time metrics, unique user counts, etc.
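
A simplified sketch of the Flajolet–Martin idea: hash each element, track the largest number of trailing zero bits seen, and estimate the distinct count as 2 raised to that number. The use of SHA-256 with a seed prefix, the 32-bit width, and the grouping parameters are all assumptions made for this example; the result is a rough estimate, not an exact count.

```python
import hashlib
import statistics


def trailing_zeros(n: int, width: int = 32) -> int:
    """Number of trailing zero bits in n (width if n is 0)."""
    if n == 0:
        return width
    return (n & -n).bit_length() - 1


def fm_estimate(stream, num_groups: int = 8, group_size: int = 8) -> float:
    """Estimate the number of distinct items using several seeded hash functions."""
    num_hashes = num_groups * group_size
    max_zeros = [0] * num_hashes
    for item in stream:
        for seed in range(num_hashes):
            data = f"{seed}:{item}".encode("utf-8")
            h = int.from_bytes(hashlib.sha256(data).digest()[:4], "big")
            max_zeros[seed] = max(max_zeros[seed], trailing_zeros(h))
    # One estimate (2^R) per hash; average within groups, then take the
    # median across groups to damp the effect of outliers.
    estimates = [2 ** r for r in max_zeros]
    group_means = [
        sum(estimates[g * group_size:(g + 1) * group_size]) / group_size
        for g in range(num_groups)
    ]
    return statistics.median(group_means)


visits = ["u1", "u2", "u3", "u1", "u2", "u4", "u1"]
print(fm_estimate(visits))
```

Memory use depends only on the number of hash functions, not on the stream length, and repeated elements never change the estimate.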
OR
b. Counting uniqueness in a window:

Tracks element frequency and diversity within time-based windows.
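
A minimal exact version of window-based counting, assuming timestamps arrive in non-decreasing order and the window is small enough to hold in memory. The class name WindowDistinctCounter is hypothetical.

```python
from collections import Counter, deque


class WindowDistinctCounter:
    """Tracks frequency and distinct count of items seen in the last window_seconds."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()    # (timestamp, item) in arrival order
        self.counts = Counter()  # item -> frequency inside the window

    def add(self, timestamp, item):
        self.events.append((timestamp, item))
        self.counts[item] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the window ending at `now`.
        while self.events and self.events[0][0] <= now - self.window:
            _, old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def distinct(self):
        return len(self.counts)


w = WindowDistinctCounter(window_seconds=10)
for ts, item in [(0, "a"), (3, "b"), (5, "a"), (12, "c")]:
    w.add(ts, item)
print(w.distinct())  # 3: the event at t=0 has expired, but "a" at t=5 remains
```

For windows too large to store exactly, approximate sketches are typically used instead of a full event queue.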

Q5. Clustering

a. CLIQUE vs ProCLUS:

CLIQUE: Grid-based subspace clustering; efficient for high-dimensional data.

ProCLUS: Uses projected clustering, better handling of noise and outliers.
OR
b. Non-Euclidean Clustering:

Uses Mahalanobis, cosine, or Manhattan distance instead of Euclidean.

Important for text, graphs, and categorical data.
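
Two of the distances named above can be sketched directly. This assumes equal-length numeric vectors with non-zero norms; the example vectors are made up.

```python
import math


def manhattan(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine_distance(u, v):
    """1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


u = [1.0, 0.0, 2.0]
v = [2.0, 0.0, 4.0]  # same direction as u, twice the length
print(manhattan(u, v))        # 3.0
print(cosine_distance(u, v))  # approximately 0
```

The two vectors are far apart by Manhattan distance but essentially identical by cosine distance, which is why cosine suits text vectors where document length should not matter.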

Q6. Interactive and NoSQL Systems

a. Interactive Techniques:

Visualization tools like Tableau, Power BI, D3.js.

Enable intuitive exploration of large datasets.
OR
b. NoSQL for Unstructured Data:

Databases: MongoDB, Cassandra, CouchDB.

Typically scale horizontally more easily than relational DBs and handle flexible or evolving schemas better.

Q7. Modern Analytics Tools

a. Analysis vs Reporting:

Analysis: Discover patterns (machine learning, clustering).

Reporting: Summarize data for decision-making (dashboards).
OR
b. Modern Tools:

Power BI, Tableau, Google Data Studio, Apache Spark, Hadoop, TensorFlow.

Revolutionized analytics via automation, scalability, and visualization.

Key Topics to Prepare

Core Concepts

Data lifecycle and sources

Big Data characteristics (5Vs)

Types of analytics: Descriptive, Predictive, Prescriptive

Machine Learning Models

SVM, Bayesian networks, Neural networks, Fuzzy logic

Real-time & Stream Processing

Algorithms for streaming data

Tools: Apache Kafka, Flink, Spark Streaming

Clustering & Mining

K-Means, Hierarchical, CLIQUE, ProCLUS, Apriori

Big Data Frameworks

Hadoop ecosystem: HDFS, Hive, MapReduce

NoSQL databases: MongoDB, Cassandra

Visualization & Tools

Power BI, Tableau, D3.js

Interactive analytics and dashboards

Study Tips

Understand key algorithms conceptually (not just formulas).

Practice diagram-based answers — network models, data flow in Hadoop.

Review use cases — predictive analytics, fraud detection, IoT.

Revise modern tools and frameworks — Spark, Kafka, Hive, Tableau.

Focus on conceptual clarity in clustering, streaming, and NoSQL.
