Lyndon Kennedy, Ph.D.

Senior Research Scientist

Lyndon Kennedy is a Senior Research Scientist at FXPAL. Prior to joining FXPAL, he was a Senior Staff Researcher at Huawei R&D USA, where he investigated computer vision and machine learning techniques for robot perception and scene understanding. Before Huawei, he was a Senior Research Scientist at Yahoo Labs, where he worked on recommendation and discovery in large-scale social multimedia systems, particularly Flickr. His research interests span many areas in computer vision and machine learning and their application to tasks in image and video indexing, retrieval, and discovery. In particular, he is interested in media systems composed of contributions from many participants and in how machine learning and content analysis can be applied to understand the media objects themselves, the users, and the world at large. He holds a Ph.D. in Electrical Engineering from Columbia University, where he conducted research in the Digital Video and Multimedia Lab. He also holds B.S. and M.S. degrees in Electrical Engineering, both from Columbia University.



Publication Details
  • IEEE 2nd International Conference on Multimedia Information Processing and Retrieval
  • Mar 14, 2019


We present an approach to detect speech impairments from video of people with aphasia, a neurological condition that affects the ability to comprehend and produce speech. To counter inherent privacy issues, we propose a cross-media approach that uses only visual facial features to detect speech properties without listening to the audio content of speech. Our method uses facial landmark detections to measure facial motion over time. We show how to detect speech and pause instances based on temporal mouth shape analysis and how to identify repeating mouth patterns using a dynamic warping mechanism. We relate our developed features for pause frequency, mouth pattern repetitions, and pattern variety to actual symptoms of people with aphasia in the AphasiaBank dataset. Our evaluation shows that these features can reliably differentiate the dysfluent speech production of people with aphasia from the speech of people without aphasia, with an accuracy of 0.86. Combining these handcrafted features with further statistical measures on talking and repetition improves classification accuracy to 0.88.
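
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: finding pauses from mouth motion over time, and comparing mouth-movement segments with a warping-based distance. It assumes a per-frame mouth-opening value has already been computed from facial landmarks (for example, lip distance normalized by face size). The function names, the thresholds, and the use of classic dynamic time warping are illustrative assumptions, not the authors' method.

```python
from typing import List, Sequence, Tuple


def detect_pauses(mouth_opening: Sequence[float],
                  motion_threshold: float = 0.02,
                  min_pause_frames: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, where the mouth barely moves.

    Both thresholds are hypothetical defaults chosen for illustration.
    """
    pauses: List[Tuple[int, int]] = []
    start = None  # first frame of the current still run, if any
    for i in range(1, len(mouth_opening)):
        moving = abs(mouth_opening[i] - mouth_opening[i - 1]) > motion_threshold
        if not moving and start is None:
            start = i
        elif moving and start is not None:
            # A still run just ended; keep it only if it is long enough.
            if i - start >= min_pause_frames:
                pauses.append((start, i))
            start = None
    # Close a still run that lasts until the end of the signal.
    if start is not None and len(mouth_opening) - start >= min_pause_frames:
        pauses.append((start, len(mouth_opening)))
    return pauses


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping distance between two 1-D sequences.

    Used here as a stand-in for the "dynamic warping mechanism" in the abstract.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


# Hypothetical mouth-opening signal: movement, a six-frame hold, movement again.
signal = [0.0, 0.3, 0.1, 0.4] + [0.4] * 6 + [0.1, 0.5]
pauses = detect_pauses(signal)

# Two segments with the same shape at different speeds align at zero cost.
same_shape = dtw_distance([0, 1, 2], [0, 1, 1, 2])
```

In a sketch like this, pause frequency could be derived from the number of detected pauses per unit of time, and a low warping distance between two speech segments could flag a repeated mouth pattern. How the published features are actually computed is not stated in the abstract.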