Mark A. Hasegawa-Johnson (he/him/his)

Professor

Primary Affiliation: Biologically Informed Artificial Intelligence
Status: Full-time Faculty
Home Department: Electrical and Computer Engineering
Phone: 333-0925
Email: jhasegaw@illinois.edu
Address: 2011 Beckman Institute, 405 North Mathews Avenue

  • Biography

    Mark Hasegawa-Johnson is a professor in the Department of Electrical and Computer Engineering at the University of Illinois and a full-time faculty member in the Artificial Intelligence group at the Beckman Institute.

    Education

    • Ph.D., Massachusetts Institute of Technology, 1996

  • Honors
    • 2023: Fellow of the International Speech Communication Association for contributions to knowledge-constrained signal generation

    • 2020: Fellow of the IEEE, for contributions to speech processing of under-resourced languages

    • 2011: Fellow of the Acoustical Society of America, for contributions to vocal tract and speech modeling

    • 2009: Senior Member of the Association for Computing Machinery

    • 2004: Member, Articulograph International Steering Committee; CLSP Workshop leader, "Landmark-Based Speech Recognition"

    • 2004: Invited paper, NAACL Workshop on Linguistic and Higher-Level Knowledge Sources in Speech Recognition and Understanding

    • 2003: Named to the List of Teachers Ranked as Excellent by Their Students

    • 2002: NSF CAREER award

    • 1998: NIH National Research Service Award

  • Research

    Research Interests

    • Acoustic phonetics

    • Audio signal processing and speech recognition

    • Speech and auditory physiology

    Research Areas

    • Acoustics

    • Adaptive signal processing

    • Biomedical imaging

    • Computer vision and pattern recognition

    • Image, video, and multimedia processing and compression

    • Machine learning

    • Machine learning and pattern recognition

    • Natural language processing

    • Random processes

    • Robotics and motion planning

    • Signal detection and estimation

    • Signal processing

    • Speech recognition and processing

    Hasegawa-Johnson has been on the faculty of the University of Illinois since 1999. His research addresses automatic speech recognition, with a focus on the mathematization of linguistic concepts. His group has developed mathematical models of linguistic concepts, including a rudimentary model of pre-conscious speech perception (the landmark-based speech recognizer), a model that interprets pronunciation variability by inferring how the talker planned their speech movements (tracking tract variables from acoustics, and gestures from tract variables), and a model that uses the stress and rhythm of natural language (prosody) to disambiguate confusable sentences. Applications of his research include:

    • Speech recognition for talkers with cerebral palsy. The automatic system, suitably constrained, outperforms a human listener.

    • Provably correct unsupervised ASR, or ASR that can be trained using speech that has no associated text transcripts.

    • Equal Accuracy Ratio regularization: methods that reduce the error-rate gaps caused by gender, race, dialect, age, education, disability, and/or socioeconomic class (a rough sketch of the idea follows this list).

    • Automatic analysis of the social interactions between infant, father, mother, and older sibling during the first eighteen months of life.
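
    The sketch below illustrates the Equal Accuracy Ratio idea in Python. It is a rough illustration, not the published method: the function names, the group labels, and the lam weight are hypothetical, and the formulation actually used by the group appears in its papers.

        # Rough sketch (assumptions noted above): a fairness term that a
        # training objective could penalize alongside the usual task loss.
        def equal_accuracy_ratio(error_rates):
            """Ratio of the worst to the best per-group error rate.

            error_rates: dict mapping a group label (e.g., a dialect) to that
            group's measured error rate. A value of 1.0 means every group is
            recognized equally well; larger values mean a larger fairness gap.
            """
            rates = list(error_rates.values())
            return max(rates) / min(rates)  # assumes no group has a zero error rate

        def regularized_loss(task_loss, error_rates, lam=0.1):
            """Task loss plus a penalty that grows with the accuracy gap."""
            return task_loss + lam * (equal_accuracy_ratio(error_rates) - 1.0)

        # Example: word error rates measured separately for two dialect groups.
        print(regularized_loss(0.42, {"dialect_a": 0.12, "dialect_b": 0.18}))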

    Hasegawa-Johnson is currently Senior Area Editor of the journal IEEE Transactions on Audio, Speech and Language Processing and a member of the ISCA Diversity Committee. He has published 308 peer-reviewed journal articles, patents, and conference papers in the general area of automatic speech analysis, including machine learning models of articulatory and acoustic phonetics, prosody, dysarthria, non-speech acoustic events, audio source separation, and under-resourced languages.

  • Publications

    2016

    • Chen, W.; Hasegawa-Johnson, M.; Chen, N. F., Mismatched Crowdsourcing Based Language Perception for Under-Resourced Languages. Procedia Computer Science 2016, 81, 23-29, DOI: 10.1016/j.procs.2016.04.025.
    • Kong, X.; Jyothi, P.; Hasegawa-Johnson, M., Performance Improvement of Probabilistic Transcriptions with Language-Specific Constraints. Procedia Computer Science 2016, 81, 30-36, DOI: 10.1016/j.procs.2016.04.026.
    • Livescu, K.; Rudzicz, F.; Fosler-Lussier, E.; Hasegawa-Johnson, M.; Bilmes, J., Speech Production in Speech Technologies: Introduction to the CSL Special Issue. Computer Speech and Language 2016, 36, 165-172.

    2015

    • Hasegawa-Johnson, M.; Cole, J.; Jyothi, P.; Varshney, L. R., Models of Dataset Size, Question Design, and Cross-Language Speech Perception for Speech Crowdsourcing Applications. Laboratory Phonology 2015, 6, (3-4), 381-432.
    • Huang, P. S.; Kim, M.; Hasegawa-Johnson, M.; Smaragdis, P., Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing 2015, 23, (12), 2136-2147.

    2014

    • Chen, A.; Hasegawa-Johnson, M. A., Mixed Stereo Audio Classification Using a Stereo-Input Mixed-to-Panned Level Feature. IEEE/ACM Transactions on Audio, Speech, and Language Processing 2014, 22, (12), 2025-2033, DOI: 10.1109/TASLP.2014.2359628.
    • Huang, P.-S.; Kim, M.; Hasegawa-Johnson, M.; Smaragdis, P., Singing-Voice Separation from Monaural Recordings Using Deep Recurrent Neural Networks, Proceedings of the International Symposium on Music Information Retrieval, 2014, Taipei, Taiwan.
    • Huang, P. S.; Kim, M.; Hasegawa-Johnson, M.; Smaragdis, P., Deep Learning for Monaural Speech Separation. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014, Florence, Italy.
    • Jyothi, P.; Cole, J.; Hasegawa-Johnson, M.; Puri, V., An Investigation of Prosody in Hindi Narrative Speech, Proceedings of Speech Prosody 2014, Volume 7. Dublin, Ireland.
    • Khasanova, A.; Cole, J.; Hasegawa-Johnson, M., Detecting Articulatory Compensation in Acoustic Data through Linear Regression Modeling, Proceedings of Interspeech 2014, Singapore.
    • Kim, K.; Lin, K. H.; Walther, D. B.; Hasegawa-Johnson, M. A.; Huang, T. S., Automatic Detection of Auditory Salience with Optimized Linear Filters Derived from Human Annotation. Pattern Recognition Letters 2014, 38, 78-85, DOI: 10.1016/j.patrec.2013.11.010.

    2013

    • Bharadwaj, S.; Hasegawa-Johnson, M.; Ajmera, J.; Deshmukh, O.; Verma, A., Sparse Hidden Markov Models for Purer Clusters, In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, New York, 2013, 3098-3102.
    • Huang, P. S.; Deng, L.; Hasegawa-Johnson, M.; He, X. D., Random Features for Kernel Deep Convex Network, In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, New York, 2013, 3143-3147.
    • King, S.; Hasegawa-Johnson, M., Accurate Speech Segmentation by Mimicking Human Auditory Processing, In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, New York, 2013, 8096-8100.
    • Lin, K. H.; Zhuang, X. D.; Goudeseune, C.; King, S.; Hasegawa-Johnson, M.; Huang, T. S., Saliency-Maximized Audio Visualization and Efficient Audio-Visual Browsing for Faster-Than-Real-Time Human Acoustic Event Detection. ACM Transactions on Applied Perception 2013, 10, (4), DOI: 10.1145/2536764.2536773.
    • Mertens, R.; Huang, P.-S.; Gottlieb, L.; Friedland, G.; Divakaran, A.; Hasegawa-Johnson, M., On the Application of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video Soundtracks. International Journal of Multimedia Data Engineering and Management 2013, 3, (3), 1-19.
    • Sharma, H. V.; Hasegawa-Johnson, M., Acoustic Model Adaptation Using in-Domain Background Models for Dysarthric Speech Recognition. Computer Speech and Language 2013, 27, (6), 1147-1162, DOI: 10.1016/j.csl.2012.10.002.

    2012

    • Mahrt, T.; Cole, J.; Fleck, M.; Hasegawa-Johnson, M., F0 and the Perception of Prominence, Proceedings of Interspeech 2012, Portland, Oregon, 2012.
    • Mahrt, T.; Cole, J.; Fleck, M.; Hasegawa-Johnson, M., Modeling Speaker Variation in Cues to Prominence Using the Bayesian Information Criterion, Proceedings of Speech Prosody 2012, Shanghai, 2012.
    • Mathur, S.; Poole, M. S.; Pena-Mora, F.; Hasegawa-Johnson, M.; Contractor, N., Detecting Interaction Links in a Collaborating Group Using Manually Annotated Data. Social Networks 2012, 34, (4), 515-526, DOI: 10.1016/j.socnet.2012.04.002.
    • Nam, H.; Mitra, V.; Tiede, M.; Hasegawa-Johnson, M.; Espy-Wilson, C.; Saltzman, E.; Goldstein, L., A Procedure for Estimating Gestural Scores from Speech Acoustics. Journal of the Acoustical Society of America 2012, 132, (6), 3980-3989.
    • Ozbek, I. Y.; Hasegawa-Johnson, M.; Demirekler, M., On Improving Dynamic State Space Approaches to Articulatory Inversion with MAP-Based Parameter Estimation. IEEE Transactions on Audio, Speech, and Language Processing 2012, 20, (1), 67-81.
    • Rong, P. Y.; Loucks, T.; Kim, H.; Hasegawa-Johnson, M., Relationship between Kinematics, F2 Slope and Speech Intelligibility in Dysarthria Due to Cerebral Palsy. Clinical Linguistics & Phonetics 2012, 26, (9), 806-822.
    • Tang, H.; Chu, S. M.; Hasegawa-Johnson, M.; Huang, T. S., Partially Supervised Speaker Clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 2012, 34, (5), 959-971.

    2011

    • Kim, H.; Hasegawa-Johnson, M.; Perlman, A., Vowel Contrast and Speech Intelligibility in Dysarthria. Folia Phoniatrica Et Logopaedica 2011, 63, (4), 187-194.
    • Lobdell, B. E.; Allen, J. B.; Hasegawa-Johnson, M. A., Intelligibility predictors and neural representation of speech. Speech Communication 2011, 53, (2), 185-194.
    • Ozbek, I. Y.; Hasegawa-Johnson, M.; Demirekler, M., Estimation of Articulatory Trajectories Based on Gaussian Mixture Model (GMM) with Audio-Visual Information Fusion and Dynamic Kalman Smoothing. IEEE Transactions on Audio, Speech, and Language Processing 2011, 19, (5), 1180-1195.
    • Zhuang, X. D.; Zhou, X.; Hasegawa-Johnson, M. A.; Huang, T. S., Efficient Object Localization with Variation-Normalized Gaussianized Vectors, In Intelligent Video Event Analysis and Understanding; Zhang, J., Shao, L., Zhang, L., Jones, G. A., Eds. 2011; Vol. 332, 93-109.

    2010

    • Kim, H.; Martin, K.; Hasegawa-Johnson, M.; Perlman, A., Frequency of Consonant Articulation Errors in Dysarthric Speech. Clinical Linguistics & Phonetics 2010, 24, (10), 759-770.
    • Tang, H.; Hasegawa-Johnson, M.; Huang, T. S., Non-frontal View Facial Expression Recognition Based on Ergodic Hidden Markov Model Supervectors, IEEE International Conference on Multimedia & Expo, Singapore, 2010.
    • Tang, H.; Hasegawa-Johnson, M.; Huang, T., A Novel Vector Representation of Stochastic Signals Based on Adapted Ergodic HMMs. IEEE Signal Processing Letters 2010, 17, (8), 715-718.
    • Zhuang, X. D.; Zhou, X.; Hasegawa-Johnson, M. A.; Huang, T. S., Real-World Acoustic Event Detection. Pattern Recognition Letters 2010, 31, (12), 1543-1551.
    • Zu, Y. H.; Hasegawa-Johnson, M.; Perlman, A.; Yang, Z., A Mathematical Model of Swallowing. Dysphagia 2010, 25, (4), 397-398.

    2009

    • Huang, T. S.; Hasegawa-Johnson, M. A.; Chu, S. M.; Zeng, Z.; Tang, H., Sensitive Talking Heads. IEEE Signal Processing Magazine 2009, 26, (4), 67-72.
    • Yoon, P.; Huensch, A.; Juul, E.; Perkins, S.; Sproat, R.; Hasegawa-Johnson, M., Construction of a rated speech corpus of L2 learners' speech. CALICO Journal 2009, 26, (3), 662-673.

    2008

    • Chang, S. E.; Erickson, K. I.; Ambrose, N. G.; Hasegawa-Johnson, M. A.; Ludlow, C. L., Brain anatomy differences in childhood stuttering. Neuroimage 2008, 39, (3), 1333-1344.
    • Kim, L. H.; Hasegawa-Johnson, M.; Lim, J. S.; Sung, K. M., Acoustic model for robustness analysis of optimal multipoint room equalization. Journal of the Acoustical Society of America 2008, 123, (4), 2043-2053.
    • Tang, H.; Fu, Y.; Tu, J. L.; Hasegawa-Johnson, M.; Huang, T. S., Humanoid Audio-Visual Avatar With Emotive Text-to-Speech Synthesis. IEEE Transactions on Multimedia 2008, 10, (6), 969-981.
    • Yoon, T.; Cole, J.; Hasegawa-Johnson, M., Detecting Non-Modal Phonation in Telephone Speech, In Proceedings of Speech Prosody 2008, Campinas, Brazil, 2008.

    2007

    • Chen, K.; Hasegawa-Johnson, M.; Cole, J., A Factored Language Model for Prosody-Dependent Speech Recognition. In Speech Synthesis and Recognition, Kordic, V., Ed. Advanced Robotic Systems: 2007.
    • Cole, J.; Kim, H.; Choi, H.; Hasegawa-Johnson, M., Prosodic effects on acoustic cues to stop voicing and place of articulation: Evidence from Radio News speech. Journal of Phonetics 2007, 35, (2), 180-209.
    • Yoon, T.; Cole, J.; Hasegawa-Johnson, M., On the Edge: Acoustic Cues to Layered Prosodic Domains, In Proceedings of the International Congress of Phonetic Sciences, Saarbrücken, Germany, 2007.

    2006

    • Zhang, T.; Hasegawa-Johnson, M.; Levinson, S. E., Cognitive state classification in a spoken tutorial dialogue system. Speech Communication 2006, 48, (6), 616-632.
    • Zhang, T.; Hasegawa-Johnson, M.; Levinson, S. E., Extraction of pragmatic and semantic salience from spontaneous spoken English. Speech Communication 2006, 48, (3-4), 437-462.