Berouti Spectral Subtraction with a Gaussian Window for Improving the Accuracy of Noisy Speech Recognition

Fitrilina Fitrilina, Winda Alfin, Fajar Afriyansah

Abstract


The accuracy of a speech recognition system decreases when it is applied to noisy speech, so the system needs to be supported by a speech enhancement method. This study proposes a Berouti spectral subtraction method that uses a Gaussian window and minimum statistics noise estimation to improve the quality of noisy speech and thereby increase the accuracy of noisy speech recognition. The speech recognition system was built with the Hidden Markov Model Toolkit (HTK). The experiments varied three types of noise, five SNR levels, six oversubtraction values, and four Gaussian window sidelobe attenuation values, using 1500 speech signals. The improvement in recognition accuracy obtained with the Gaussian window was compared with that obtained with the Hamming window. The results show that the choice of sidelobe attenuation and oversubtraction values affects recognition accuracy. With the Gaussian window, the average improvement in recognition accuracy was about 36.4%, obtained at oversubtraction = 4.75 and sidelobe attenuation = 1.5; with the Hamming window, it was about 18.7%, obtained at oversubtraction = 2.5. Spectral subtraction with either window improves speech recognition accuracy, but the Gaussian window performs better than the Hamming window.

 

Keywords: Berouti spectral subtraction, Gaussian window, speech recognition
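
The enhancement stage summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, frame length, hop size, and spectral floor are hypothetical choices. The Gaussian window follows the common width-factor definition, and treating the reported sidelobe attenuation value (1.5) as that width factor is an assumption. The noise spectrum is estimated by averaging a few leading frames assumed to contain only noise, which is a simpler stand-in for the minimum statistics estimator used in the study.

```python
import numpy as np


def gaussian_window(n, alpha):
    """Gaussian window w[k] = exp(-0.5 * (alpha * k / ((n - 1) / 2))**2).

    `alpha` is a width factor; mapping the paper's "sidelobe attenuation"
    value onto it is an assumption made for this sketch.
    """
    k = np.arange(n) - (n - 1) / 2.0
    return np.exp(-0.5 * (alpha * k / ((n - 1) / 2.0)) ** 2)


def berouti_spectral_subtraction(noisy, frame_len=256, hop=128, alpha_win=1.5,
                                 oversub=4.75, floor=0.01, noise_frames=6):
    """Power spectral subtraction with oversubtraction and a spectral floor.

    Noise is estimated from the first `noise_frames` frames (assumed to be
    noise only); this replaces minimum statistics tracking for brevity.
    """
    noisy = np.asarray(noisy, dtype=float)
    if len(noisy) < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = gaussian_window(frame_len, alpha_win)
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Analysis: windowed frames -> magnitude-squared spectrum and phase.
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    power = np.abs(spectra) ** 2
    phase = np.angle(spectra)

    # Simplified noise estimate: mean power of the leading frames.
    noise_power = power[:noise_frames].mean(axis=0)

    # Berouti rule: subtract oversub * noise, then clamp to floor * noise.
    clean_power = np.maximum(power - oversub * noise_power,
                             floor * noise_power)

    # Synthesis: reuse the noisy phase, overlap-add, and divide by the
    # summed analysis windows because a Gaussian window does not sum to 1.
    clean_frames = np.fft.irfft(np.sqrt(clean_power) * np.exp(1j * phase),
                                n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += clean_frames[i]
        norm[start:start + frame_len] += window
    covered = norm > 1e-8
    out[covered] /= norm[covered]
    return out
```

The defaults `oversub=4.75` and `alpha_win=1.5` mirror the best-performing Gaussian window setting reported in the abstract. A comparison in the spirit of the study would swap `gaussian_window` for `np.hamming(frame_len)` and use `oversub=2.5`.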

 

 

Abstract (translated from Indonesian)

The accuracy of a speech recognition system decreases when it is used on noisy speech. The system therefore needs to be supported by a speech signal enhancement method. This study proposes a Berouti spectral subtraction method that applies a Gaussian window and minimum statistics noise estimation to improve the quality of the noisy signal and thus raise the accuracy of noisy speech recognition. The speech recognition system was built with the Hidden Markov Model Toolkit (HTK). The study varied three types of noise, five SNR levels, six oversubtraction values, and four Gaussian window sidelobe attenuation values, with 1500 speech signals. The improvement in recognition accuracy obtained with the Gaussian window was compared with that of the Hamming window. The results show that the choice of sidelobe attenuation and oversubtraction values affects recognition accuracy. The average improvement in recognition accuracy was 36.4%, obtained at an oversubtraction value of 4.75 and a sidelobe attenuation of 1.5. With the Hamming window, the average improvement was 18.7% at an oversubtraction value of 2.5. Spectral subtraction with either the Gaussian window or the Hamming window raises recognition accuracy, but the Gaussian window gives better results than the Hamming window.

Keywords: Berouti spectral subtraction, Gaussian window, speech recognition






DOI: https://doi.org/10.25077/jnte.v7n3.497.2018



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.