Impact of Semantic Coding of Emotional Speech on Speech Coding Performance

Firos A; Utpal Bhattacharjee

doi:10.14445/22312803/IJCTT-V57P102

Volume 57 | Number 1 | Year 2018 | Article Id. IJCTT-V57P102 | DOI : https://doi.org/10.14445/22312803/IJCTT-V57P102

Citation:

Firos A, Utpal Bhattacharjee, "Impact of Semantic Coding of Emotional Speech on Speech Coding Performance," International Journal of Computer Trends and Technology (IJCTT), vol. 57, no. 1, pp. 6-10, 2018. Crossref, https://doi.org/10.14445/22312803/IJCTT-V57P102

Abstract

This paper presents a technique for reducing the real-time computational cost of standard speech coders by coding speech at the semantic level while preserving its prosodic features. LPC analysis is performed to extract the features of the input speech. A Gaussian mixture model (GMM) is used to identify the semantic features and prosody of the input speech, and an artificial neural network (ANN) is used to select the best features for encoding. Such semantic-based coding is expected to greatly reduce the computational overhead of speech coders.
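
The abstract does not say how the LPC analysis is carried out. As a rough illustration only, the sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion on a single Hamming-windowed frame. The function name `lpc_coefficients`, the window choice, and the prediction order are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate LPC coefficients for one speech frame (illustrative sketch).

    Returns (a, err) where a[0] = 1 and the prediction error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the
    residual energy of the windowed frame after prediction.
    """
    frame = np.asarray(frame, dtype=float)
    # Assumption: a Hamming window is applied before the analysis.
    windowed = frame * np.hamming(len(frame))

    # Autocorrelation at lags 0..order.
    n = len(windowed)
    r = np.array([np.dot(windowed[: n - k], windowed[k:])
                  for k in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    if err == 0.0:
        # Silent frame: nothing to predict.
        return a, 0.0

    # Levinson-Durbin recursion.
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err
```

In a coder of the kind the abstract outlines, coefficients like these would be computed frame by frame and passed on to the later modelling stages; the frame length and order would have to be chosen for the codec in question.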

Keywords

Speech coding; G.723.1; iLBC; fuzzy clustering; windowing; ANN.
