Abstract:
Emotion recognition plays a major role in affective computing and adds value to
machine intelligence. While the emotional state of a person can be expressed in
different ways such as facial expressions, gestures, movements and postures, recognition
of emotion from speech has attracted more interest than the others. However,
after years of research, recognizing the emotional state of individuals from their
speech as accurately as possible still remains a challenging task. This motivates
an attempt to study the factors that influence Speech Emotion
Recognition (SER), such as gender, culture, dialect, education, social status and
age. The aim of this study is to investigate whether a SER system can identify
the emotional state of a person regardless of the language used. To investigate
the influence of languages in SER, we explored how spoken expressions of six selected
emotions (happiness, anger, sadness, neutral, fear and disgust) varied in two
languages of interest: English and Bangla. In addition, the perceptual outcomes
were studied to identify the advantage of speech emotion expression
produced by native speakers and also by bilingual speakers