(SEM VII) THEORY EXAMINATION 2022-23 SPEECH PROCESSING
SECTION A (2 Marks Each)
(a) Speech Signal
A speech signal is a time-varying acoustic signal produced by the human vocal system, used for communication.
(b) Lossless Tube Model of Speech Signal
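In the lossless tube model, the vocal tract is approximated by a concatenation of short, uniform acoustic tubes with different cross-sectional areas. The tubes are assumed lossless (no friction, heat conduction, or wall vibration), so the sound is shaped only by reflections at the tube junctions.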
(c) Speech Spectrogram
A speech spectrogram is a time-frequency representation showing how speech energy varies with time and frequency.
(d) Short-Time Average Zero Crossing Rate
It is the average number of times the speech signal changes sign (crosses the zero-amplitude axis) within a short analysis window. Voiced sounds give a low rate and unvoiced sounds a high rate, so it is used to distinguish the two.
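A minimal sketch of the zero crossing rate for one frame. The function name, the 8 kHz sampling rate, and the two test tones are assumptions chosen for illustration; the low tone stands in for a voiced-like frame and the high tone for an unvoiced-like frame.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = [1 if v >= 0 else -1 for v in frame]
    crossings = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return crossings / (len(frame) - 1)

fs = 8000  # assumed sampling rate in Hz
n_samples = 400
# The half-sample offset keeps samples away from exact zeros.
low_tone = [math.sin(2 * math.pi * 100 * (n + 0.5) / fs) for n in range(n_samples)]
high_tone = [math.sin(2 * math.pi * 3000 * (n + 0.5) / fs) for n in range(n_samples)]

zcr_low = zero_crossing_rate(low_tone)
zcr_high = zero_crossing_rate(high_tone)
```

The high-frequency tone yields a much larger rate than the low-frequency tone, which is the property a voiced/unvoiced decision relies on.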
(e) Pitch Detection
Pitch detection is the process of estimating the fundamental frequency of a voiced speech signal.
(f) Correlation Function
Correlation measures similarity between signals. For speech, autocorrelation compares a signal with its delayed version to find periodicity.
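The use of autocorrelation for pitch detection can be sketched as follows. This is an illustrative example, not a complete pitch detector: the function names, the 50 to 400 Hz search range, and the synthetic two-harmonic test signal are assumptions.

```python
import math

def autocorrelation(x, lag):
    """Unnormalized short-time autocorrelation at one lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))

def estimate_pitch(x, fs, f_min=50.0, f_max=400.0):
    """Pick the lag with the largest autocorrelation inside the search range."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag = max(range(lag_min, lag_max + 1), key=lambda k: autocorrelation(x, k))
    return fs / best_lag

fs = 8000  # assumed sampling rate in Hz
# Synthetic voiced-like signal: 200 Hz fundamental plus its second harmonic.
signal = [math.sin(2 * math.pi * 200 * n / fs) + 0.5 * math.sin(2 * math.pi * 400 * n / fs)
          for n in range(800)]
pitch_hz = estimate_pitch(signal, fs)
```

The autocorrelation peaks at a lag of 40 samples, one period of the 200 Hz fundamental, so the estimate is 200 Hz.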
(g) Filter
A filter is a system that selectively allows or suppresses certain frequency components of a signal.
(h) Principle of Linear Predictive Coding (LPC)
LPC predicts the current speech sample as a linear combination of past samples, modeling the vocal tract efficiently.
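A minimal sketch of LPC analysis by the autocorrelation method with the Levinson-Durbin recursion. The function names and the second-order test filter are assumptions; the predictor convention is s_hat[n] = a_1 s[n-1] + ... + a_p s[n-p].

```python
def autocorr(x, max_lag):
    """Autocorrelation values R(0) .. R(max_lag)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return predictor coefficients [a_1 .. a_p] and the final prediction error energy."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

# Assumed test signal: impulse response of s[n] = 1.3 s[n-1] - 0.6 s[n-2] + delta[n].
true_a = [1.3, -0.6]
s = []
for n in range(200):
    sample = 1.0 if n == 0 else 0.0
    if n >= 1:
        sample += true_a[0] * s[n - 1]
    if n >= 2:
        sample += true_a[1] * s[n - 2]
    s.append(sample)

coeffs, error_energy = levinson_durbin(autocorr(s, 2), 2)
```

For this all-pole test signal the recovered coefficients match the filter that generated it, and the prediction error energy equals the energy of the unit impulse excitation.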
(i) Complex Cepstrum of Speech
The complex cepstrum is the inverse Fourier transform of the complex logarithm (log magnitude plus unwrapped phase) of the signal's Fourier transform. It is useful in deconvolution.
(j) Convolution vs Deconvolution of Speech
Convolution combines excitation and vocal tract response, while deconvolution separates excitation from the vocal tract effect.
SECTION B (10 Marks Each – Any Three)
(a) Mechanics of Speech Production and Acoustic Phonetics
Speech production involves air from lungs, vibration of vocal cords, and shaping by the vocal tract. Acoustic phonetics studies speech sounds based on frequency, amplitude, and duration. Voiced sounds result from vocal cord vibration, while unvoiced sounds are produced by airflow turbulence.
(b) Short-Time Energy and Average Magnitude
Short-time energy measures signal strength over short intervals using windowing. Average magnitude computes the mean absolute value of the signal. Both help in speech detection and segmentation.
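A minimal sketch of both measures with a rectangular window. The function names, frame length, hop size, and toy signal are assumptions for illustration.

```python
def short_time_energy(x, frame_len, hop):
    """Sum of squared samples in each rectangular-windowed frame."""
    return [sum(v * v for v in x[i:i + frame_len])
            for i in range(0, len(x) - frame_len + 1, hop)]

def short_time_magnitude(x, frame_len, hop):
    """Mean absolute value of the samples in each frame."""
    return [sum(abs(v) for v in x[i:i + frame_len]) / frame_len
            for i in range(0, len(x) - frame_len + 1, hop)]

# Toy signal (assumed): silence, a burst of alternating +1/-1 samples, silence.
signal = [0.0] * 100 + [1.0, -1.0] * 50 + [0.0] * 100
energy = short_time_energy(signal, frame_len=50, hop=50)
magnitude = short_time_magnitude(signal, frame_len=50, hop=50)
```

Both measures are zero in the silent frames and non-zero in the two frames covering the burst, which is how they mark segment boundaries. The magnitude grows linearly with amplitude while the energy grows with its square, so the magnitude is less sensitive to large samples.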
(c) Short-Time Fourier Analysis
STFT analyzes speech in short windowed segments, assuming the signal is stationary within each segment. Properties include the time-frequency resolution trade-off set by the window length, linearity, and the ability to represent non-stationary signals.
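A minimal sketch of short-time Fourier analysis using a Hamming window and a direct DFT per frame. The function names, the 64-sample frame, and the two-tone test signal are assumptions; a practical implementation would use an FFT.

```python
import cmath
import math

def stft_magnitude(x, frame_len, hop):
    """Magnitude spectrum (bins 0 .. frame_len/2) of each Hamming-windowed frame."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        seg = [x[start + n] * window[n] for n in range(frame_len)]
        spectrum = [abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                            for n in range(frame_len)))
                    for k in range(frame_len // 2 + 1)]
        frames.append(spectrum)
    return frames

def tone(freq, count, fs):
    return [math.sin(2 * math.pi * freq * n / fs) for n in range(count)]

fs = 8000  # assumed sampling rate; bin spacing is fs / 64 = 125 Hz
# Non-stationary test signal: 500 Hz followed by 2000 Hz.
signal = tone(500, 256, fs) + tone(2000, 256, fs)
frames = stft_magnitude(signal, frame_len=64, hop=64)
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in frames]
```

The peak bin moves from 4 (500 Hz) in the first four frames to 16 (2000 Hz) in the last four, showing how the STFT tracks a change that a single long Fourier transform would blur together.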
(d) Homomorphic System of Convolution
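In homomorphic processing, the Fourier transform turns time-domain convolution into multiplication, and the logarithm turns that multiplication into addition. The inverse transform of the log spectrum is the cepstrum, in which the excitation and vocal tract components are added and can be separated by linear filtering (liftering).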
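The conversion of convolution into addition can be checked numerically. The sketch below uses the real cepstrum (log magnitude only) and circular convolution so that the identity holds exactly at the DFT bins; the function names and the toy excitation and vocal tract sequences are assumptions.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(x):
    """Inverse DFT of the log magnitude spectrum."""
    n = len(x)
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def circular_convolve(a, b):
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]

N = 32
# Assumed toy components: a decaying "vocal tract" response and a two-pulse "excitation".
vocal_tract = [0.8 ** n for n in range(N)]
excitation = [0.0] * N
excitation[0] = 1.0
excitation[8] = 0.5

speech = circular_convolve(excitation, vocal_tract)
c_speech = real_cepstrum(speech)
c_sum = [a + b for a, b in zip(real_cepstrum(excitation), real_cepstrum(vocal_tract))]
max_diff = max(abs(a - b) for a, b in zip(c_speech, c_sum))
```

The cepstrum of the convolved signal equals the sum of the two component cepstra up to rounding error, which is what allows a linear operation in the cepstral domain to separate them.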
(e) Frequency Domain Interpretation of Prediction Error
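The mean squared prediction error is the energy of the mismatch between actual and predicted speech samples. In the frequency domain it is proportional to the average, over frequency, of the ratio of the speech power spectrum to the LPC all-pole model spectrum. Minimizing it therefore fits the model to the spectral envelope, matching spectral peaks more closely than valleys. The residual retains the pitch (excitation) information, and the minimum error sets the gain.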
SECTION C (10 Marks Each)
Q3
(a) Digital Models for Speech Signals
Digital speech models include the source-filter model, the LPC model, and tube models. These represent speech using excitation and vocal tract characteristics for analysis and synthesis.
(b) Need for Speech Processing
Speech processing is required for speech recognition, voice assistants, speaker identification, hearing aids, and communication systems.
Q4
(a) Pitch Period Estimation Using Parallel Processing
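The parallel processing method (Gold and Rabiner) works in the time domain. The low-pass filtered speech is reduced to six impulse trains derived from its peaks and valleys, each impulse train feeds a simple pitch period estimator, and the estimators run in parallel. The final pitch period is the value on which most estimators agree, which improves accuracy.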
(b) Speech vs Silence Discrimination
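Discrimination is based on energy level, zero crossing rate, and spectral features. Silence (background noise) has low energy and an irregular zero crossing rate, so energy and zero crossing thresholds are used together to locate the endpoints of speech.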
Q5
(a) Filter Bank Summation Method
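In the filter bank summation method, the short-time Fourier transform is viewed as the outputs of a bank of band-pass filters. Speech is reconstructed by modulating each channel back to its center frequency and summing the channel outputs.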
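The summation step can be sketched under the short-time Fourier transform view of the filter bank. The function name, the 8-band Hamming-window filter bank, and the test signal are assumptions; the sketch relies on the window being no longer than the number of bands, in which case the channel sum equals the input scaled by (number of bands) x window[0].

```python
import cmath
import math

def fbs_reconstruct(x, window, num_bands):
    """Sum the modulated channel outputs and undo the known scale factor."""
    win_len = len(window)
    out = []
    for n in range(len(x)):
        total = 0j
        for k in range(num_bands):
            w_k = 2 * math.pi * k / num_bands
            # Channel k: short-time Fourier transform at time n and frequency w_k.
            channel = sum(x[m] * window[n - m] * cmath.exp(-1j * w_k * m)
                          for m in range(max(0, n - win_len + 1), n + 1))
            # Modulate back to the channel center frequency before summing.
            total += channel * cmath.exp(1j * w_k * n)
        out.append(total.real / (num_bands * window[0]))
    return out

num_bands = 8
window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (num_bands - 1)) for n in range(num_bands)]
signal = [math.sin(0.3 * n) + 0.2 * n for n in range(40)]  # arbitrary test signal
rebuilt = fbs_reconstruct(signal, window, num_bands)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

The reconstruction error is at rounding level, illustrating that the summed channels carry the whole signal when the bands are spaced closely enough for the chosen window.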
(b) Vocoder and Channel Vocoder
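A vocoder analyzes speech into a small set of parameters and transmits them instead of the waveform. A channel vocoder divides the speech spectrum into frequency bands and transmits the envelope of each band together with the excitation (voicing and pitch) information.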
Q6
(a) Parallel Processing Time-Domain Pitch Detection and Homomorphic Deconvolution
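The parallel time-domain pitch detector takes peak and valley measurements from the filtered waveform and combines several simple period estimators. Autocorrelation and AMDF are other common time-domain pitch detectors. Homomorphic deconvolution separates excitation and vocal tract using the cepstrum.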
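A minimal sketch of the AMDF (average magnitude difference function) approach to pitch. The function names, the 120 to 400 Hz search range, and the synthetic test signal are assumptions; the range is kept narrow here so that only one period of the test signal falls inside it.

```python
import math

def amdf(x, lag):
    """Average magnitude difference between the signal and its delayed copy."""
    count = len(x) - lag
    return sum(abs(x[n] - x[n + lag]) for n in range(count)) / count

def amdf_pitch(x, fs, f_min=120.0, f_max=400.0):
    """Pick the lag with the deepest AMDF null inside the search range."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda k: amdf(x, k))
    return fs / best_lag

fs = 8000  # assumed sampling rate in Hz
voiced = [math.sin(2 * math.pi * 200 * n / fs) + 0.5 * math.sin(2 * math.pi * 400 * n / fs)
          for n in range(800)]
amdf_pitch_hz = amdf_pitch(voiced, fs)
```

Where autocorrelation shows a peak at the pitch period, the AMDF shows a null, and it needs only subtractions and absolute values rather than multiplications.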
(b) Homomorphic Vocoder
It uses an analyzer to separate the excitation from the vocal tract (system) response, and a synthesizer to reconstruct speech from the transmitted parameters.
Q7
(a) Multipulse LPC
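Multipulse LPC excites the LPC synthesis filter with several pulses per frame, whose positions and amplitudes are chosen by analysis-by-synthesis to minimize a perceptually weighted error. No voiced/unvoiced decision is needed, which improves speech quality and naturalness.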
(b) Computation of Gain
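Gain is computed from the minimum prediction error energy: the squared gain equals R(0) minus the sum, over all predictor coefficients, of each coefficient times the autocorrelation at its lag. It scales the excitation so that the synthesized speech has the same energy as the original.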
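A minimal sketch of the gain computation from autocorrelation values and predictor coefficients. The function name and the first-order test case (a filter excited by an impulse of amplitude 2) are assumptions.

```python
import math

def lpc_gain(r, a):
    """G = sqrt(R(0) - sum_k a_k R(k)), with a = [a_1, ..., a_p]."""
    error_energy = r[0] - sum(a_k * r[k + 1] for k, a_k in enumerate(a))
    return math.sqrt(error_energy)

# Assumed toy case: s[n] = 0.5 s[n-1] + 2 delta[n], so s[n] = 2 * 0.5**n.
s = [2.0 * 0.5 ** n for n in range(60)]
r = [sum(s[i] * s[i + k] for i in range(len(s) - k)) for k in range(2)]
gain = lpc_gain(r, [0.5])
```

The computed gain matches the amplitude of the impulse that drove the filter, so scaling a unit excitation by it restores the original signal energy.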