
<<<CP1234>>> SPEECH PROCESSING AND SYNTHESIS

{{{credits}}}

L T P C
3 0 0 3

Course Objectives

  • To explore the fundamentals of digital speech processing.
  • To understand the basic concepts and algorithms of speech processing.
  • To familiarize the students with the various speech signal representation, coding and recognition techniques.
  • To study the concepts and evaluation methods of speech synthesis.

{{{unit}}}

Unit I: Fundamentals of Digital Speech Processing (9)

Introduction: Discrete-Time signals and systems – Transform representation of Signals and systems – Fundamentals of digital filters – Sampling; Process of Speech Production – Acoustic theory of speech production – Digital models for speech signals.

{{{unit}}}

Unit II: Speech Signal Analysis in Time Domain (9)

Time-dependent processing of speech – Methods for extracting the parameters: Energy – Average magnitude – Zero-crossing rate; Silence discrimination using ZCR and energy – Short-time autocorrelation function – Pitch period estimation using autocorrelation function.
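The time-domain parameters listed in this unit can be sketched in a few lines. The following is a minimal illustration, not course material: the function names, the frame length, the lag search range, and the synthetic 100 Hz test tone are all assumptions chosen for the example.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)

def average_magnitude(frame):
    """Mean absolute sample value in the frame."""
    return sum(abs(x) for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def autocorrelation(frame, lag):
    """Short-time autocorrelation of the frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

def pitch_period(frame, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(frame, lag))

# Assumed test signal: a 100 Hz sine sampled at 8 kHz (true period: 80 samples).
fs = 8000
f0 = 100
tone = [math.sin(2 * math.pi * f0 * n / fs) for n in range(800)]
frame = frames(tone, 400, 200)[0]
period = pitch_period(frame, 20, 160)
estimated_f0 = fs / period
```

On this synthetic tone the estimated period is 80 samples, which corresponds to 100 Hz. Real speech would typically need windowing and a voiced/unvoiced decision (for example from energy and ZCR thresholds) before the pitch estimate is meaningful.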

{{{unit}}}

Unit III: Speech Signal Analysis in Frequency Domain (9)

Short-time Fourier analysis – Fourier transform and linear filtering interpretations – Sampling rates – Spectrographic displays – Formant extraction – Pitch extraction – Linear predictive coding: Autocorrelation method – Covariance method; Solution of LPC equations – Durbin’s recursive solution – Application of LPC parameters – Pitch detection.
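Durbin's recursive solution for the autocorrelation method can be sketched as follows. This is an illustrative version under stated assumptions: the function name `levinson_durbin`, the sign convention (predictor x̂[n] = Σ a[k]·x[n−k]), and the AR(1) test autocorrelation are choices made for this example.

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (coeffs, error), where coeffs[k-1] is the predictor coefficient
    a_k in x_hat[n] = sum_k a_k * x[n-k], and error is the final prediction
    error energy.
    """
    a = [0.0] * (order + 1)   # a[0] is unused; a[1..order] hold coefficients
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error
        # Update lower-order coefficients, then append the new one.
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error

# Assumed test case: autocorrelation of an AR(1) process with pole at 0.9.
r = [0.9 ** lag for lag in range(3)]
coeffs, err = levinson_durbin(r, 2)
```

For this input a second-order fit recovers a first coefficient of 0.9 and a second coefficient of approximately zero, as expected for a first-order process.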

{{{unit}}}

Unit IV: Speech Recognition (9)

Introduction – Preprocessing – Parametric representation – Speech segmentation – Dynamic time warping – Vector quantization – Hidden Markov Model – Language Models – Developing an isolated digit recognition system.
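As an illustration of dynamic time warping in a template-based isolated-word recognizer, here is a minimal sketch. It assumes one-dimensional feature sequences and absolute difference as the local distance; the `recognize` helper and the toy templates are hypothetical and only stand in for real feature vectors such as LPC or cepstral parameters.

```python
def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D sequences (local cost: |x - y|)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognize(templates, utterance):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(templates[label], utterance))

# Hypothetical toy templates; the utterance is a time-stretched version of "one".
templates = {"one": [1, 2, 3], "two": [3, 2, 1]}
label = recognize(templates, [1, 1, 2, 3, 3])
```

Because DTW allows samples to repeat along the alignment path, the stretched utterance matches the "one" template with zero cost.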

{{{unit}}}

Unit V: Speech Synthesis (9)

Attributes of speech synthesis – Formant speech synthesis – Concatenative speech synthesis – Prosodic modification of speech – Source filter models for prosody modification – Evaluation of TTS system.
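The source-filter idea behind formant synthesis and prosody modification can be sketched as an excitation signal passed through an all-pole filter. The example below is a simplified assumption-laden sketch: the function names, the impulse-train excitation, and the single-coefficient filter are illustrative, not a complete synthesizer.

```python
def impulse_train(n_samples, period):
    """Voiced excitation: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def all_pole_filter(excitation, a):
    """Apply y[n] = x[n] + sum_k a[k-1] * y[n-k] (vocal-tract style all-pole filter)."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, coeff in enumerate(a, start=1):
            if n - k >= 0:
                acc += coeff * y[n - k]
        y.append(acc)
    return y

# Changing `period` changes the pitch of the output while the filter
# (the spectral envelope) stays the same.
low_pitch = all_pole_filter(impulse_train(160, 80), [0.5])
high_pitch = all_pole_filter(impulse_train(160, 40), [0.5])
impulse_response = all_pole_filter([1.0, 0.0, 0.0, 0.0], [0.5])
```

Keeping the filter fixed and varying only the excitation period is the simplest form of pitch modification in a source-filter model.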

\hfill Total: 45

Course Outcomes

After the completion of this course, students will be able to:

  • Illustrate how speech production is modelled (K2)
  • Extract features from the speech signal in both the time and frequency domains (K3)
  • Develop a speech recognition system using a statistical approach (K3)
  • Compare the various methods of speech synthesis (K2)

References

  1. L. R. Rabiner and R. W. Schafer, “Digital Processing of Speech Signals”, Prentice Hall, 1978.
  2. Xuedong Huang, Alex Acero, Hsiao-Wuen Hon, “Spoken Language Processing – A Guide to Theory, Algorithm, and System Development”, Prentice Hall PTR, 2001.
  3. Lawrence Rabiner and Biing-Hwang Juang, “Fundamentals of Speech Recognition”, Prentice Hall Signal Processing Series, 1993.
  4. Thomas F. Quatieri, “Discrete-Time Speech Signal Processing”, Pearson Education, 2002.
  5. Ben Gold and Nelson Morgan, “Speech and Audio Signal Processing”, John Wiley and Sons Inc., Singapore, 2004.