Visible Speech

The experts below are selected from a list of 7,071 experts worldwide, ranked by the ideXlab platform.

Aslı Özyürek - One of the best experts on this subject based on the ideXlab platform.

  • Aging and working memory modulate the ability to benefit from Visible Speech and iconic gestures during Speech-in-noise comprehension
    Psychological Research, 2020
    Co-Authors: Louise Schubotz, Linda Drijvers, Judith Holler, Aslı Özyürek
    Abstract:

    When comprehending Speech-in-noise (SiN), younger and older adults benefit from seeing the speaker’s mouth, i.e. Visible Speech. Younger adults additionally benefit from manual iconic co-Speech gestures. Here, we investigate to what extent younger and older adults benefit from perceiving both visual articulators while comprehending SiN, and whether this is modulated by working memory and inhibitory control. Twenty-eight younger and 28 older adults performed a word recognition task in three visual contexts: mouth blurred (Speech-only), Visible Speech, or Visible Speech + iconic gesture. The Speech signal was either clear or embedded in multitalker babble. Additionally, there were two visual-only conditions (Visible Speech, Visible Speech + gesture). Accuracy levels for both age groups were higher when both visual articulators were present compared to either one or none. However, older adults received a significantly smaller benefit than younger adults, although they performed equally well in Speech-only and visual-only word recognition. Individual differences in verbal working memory and inhibitory control partly accounted for age-related performance differences. To conclude, perceiving iconic gestures in addition to Visible Speech improves younger and older adults’ comprehension of SiN. Yet, the ability to benefit from this additional visual information is modulated by age and verbal working memory. Future research will have to show whether these findings extend beyond the single word level.
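    The benefit described above is, at its core, an accuracy contrast between viewing conditions in noise. Purely as an illustration, the sketch below shows one way such per-participant benefit scores could be computed and related to a verbal working-memory measure; the column names, data layout, and correlation step are assumptions, not the authors' analysis pipeline.

```python
# Hypothetical sketch (not the authors' code): per-participant accuracy gains from
# adding visual articulators in noise, related to a verbal working-memory score.
# Column names and data layout are assumptions made for illustration.
import pandas as pd

def visual_benefit_scores(trials: pd.DataFrame) -> pd.DataFrame:
    """trials: one row per trial with columns participant, condition
    ('speech_only', 'visible_speech', 'speech_gesture'), noise ('clear',
    'babble'), correct (0 or 1)."""
    noisy = trials[trials["noise"] == "babble"]
    acc = (noisy.groupby(["participant", "condition"])["correct"]
                .mean()
                .unstack("condition"))
    # Gain from seeing the mouth, and the extra gain from an added iconic gesture.
    acc["visible_speech_benefit"] = acc["visible_speech"] - acc["speech_only"]
    acc["gesture_benefit"] = acc["speech_gesture"] - acc["visible_speech"]
    return acc.reset_index()

def correlate_with_wm(benefits: pd.DataFrame, wm: pd.Series) -> float:
    """Correlate the extra gesture benefit with a working-memory score indexed
    by participant."""
    merged = benefits.set_index("participant").join(wm.rename("wm"))
    return merged["gesture_benefit"].corr(merged["wm"])
```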

  • degree of language experience modulates visual attention to Visible Speech and iconic gestures during clear and degraded Speech comprehension
    Cognitive Science, 2019
    Co-Authors: Linda Drijvers, Julija Vaitonyte, Aslı Özyürek
    Abstract:

    Visual information conveyed by iconic hand gestures and Visible Speech can enhance Speech comprehension under adverse listening conditions for both native and non-native listeners. However, how a listener allocates visual attention to these articulators during Speech comprehension is unknown. We used eye-tracking to investigate whether and how native and highly proficient non-native listeners of Dutch allocated overt eye gaze to Visible Speech and gestures during clear and degraded Speech comprehension. Participants watched video clips of an actress uttering a clear or degraded (6-band noise-vocoded) action verb while performing a gesture or not, and were asked to indicate the word they heard in a cued-recall task. Gestural enhancement was the largest (i.e., a relative reduction in reaction time cost) when Speech was degraded for all listeners, but it was stronger for native listeners. Both native and non-native listeners mostly gazed at the face during comprehension, but non-native listeners gazed more often at gestures than native listeners. However, only native but not non-native listeners' gaze allocation to gestures predicted gestural benefit during degraded Speech comprehension. We conclude that non-native listeners might gaze at gesture more as it might be more challenging for non-native listeners to resolve the degraded auditory cues and couple those cues to phonological information that is conveyed by Visible Speech. This diminished phonological knowledge might hinder the use of semantic information that is conveyed by gestures for non-native compared to native listeners. Our results demonstrate that the degree of language experience impacts overt visual attention to visual articulators, resulting in different visual benefits for native versus non-native listeners.
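    The gestural enhancement measure above is described as a relative reduction in reaction-time cost under degraded speech. As an illustration only, and under assumed column names and trial structure (this is not the authors' analysis), it could be computed roughly as follows, together with a simple check of whether gaze allocation to gestures predicts it.

```python
# Illustrative sketch only: gestural enhancement as a relative reduction in the
# reaction-time cost of degradation, plus a simple linear fit of that enhancement
# on the proportion of fixations to the gesture region. Names are assumptions.
import numpy as np
import pandas as pd

def gestural_enhancement(rt: pd.DataFrame) -> pd.Series:
    """rt: one row per trial with columns participant, clarity
    ('clear'/'degraded'), gesture (True/False), reaction_time (ms)."""
    means = (rt.groupby(["participant", "clarity", "gesture"])["reaction_time"]
               .mean()
               .unstack(["clarity", "gesture"]))
    cost_without = means[("degraded", False)] - means[("clear", False)]
    cost_with = means[("degraded", True)] - means[("clear", True)]
    # Relative reduction of the degradation cost when a gesture is present.
    return (cost_without - cost_with) / cost_without

def gaze_slope(enhancement: pd.Series, gaze_to_gesture: pd.Series) -> float:
    """Slope of enhancement regressed on the proportion of gaze to gestures."""
    x, y = gaze_to_gesture.align(enhancement, join="inner")
    slope, _intercept = np.polyfit(x.to_numpy(), y.to_numpy(), 1)
    return float(slope)
```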

  • Non-native Listeners Benefit Less from Gestures and Visible Speech than Native Listeners During Degraded Speech Comprehension
    Language and Speech, 2019
    Co-Authors: Linda Drijvers, Aslı Özyürek
    Abstract:

    Native listeners benefit from both Visible Speech and iconic gestures to enhance degraded Speech comprehension (Drijvers & Ozyurek, 2017). We tested how highly proficient non-native listeners benefit from these visual articulators compared to native listeners. We presented videos of an actress uttering a verb in clear, moderately, or severely degraded Speech, while her lips were blurred, Visible, or Visible and accompanied by a gesture. Our results revealed that unlike native listeners, non-native listeners were less likely to benefit from the combined enhancement of Visible Speech and gestures, especially since the benefit from Visible Speech was minimal when the signal quality was not sufficient.

  • visual context enhanced: the joint contribution of iconic gestures and Visible Speech to degraded Speech comprehension
    Journal of Speech Language and Hearing Research, 2017
    Co-Authors: Linda Drijvers, Aslı Özyürek
    Abstract:

    Purpose This study investigated whether and to what extent iconic co-Speech gestures contribute to information from Visible Speech to enhance degraded Speech comprehension at different levels of no...

Linda Drijvers - One of the best experts on this subject based on the ideXlab platform.

  • Aging and working memory modulate the ability to benefit from Visible Speech and iconic gestures during Speech-in-noise comprehension
    Psychological Research, 2020
    Co-Authors: Louise Schubotz, Linda Drijvers, Judith Holler, Aslı Özyürek
    Abstract:

    When comprehending Speech-in-noise (SiN), younger and older adults benefit from seeing the speaker’s mouth, i.e. Visible Speech. Younger adults additionally benefit from manual iconic co-Speech gestures. Here, we investigate to what extent younger and older adults benefit from perceiving both visual articulators while comprehending SiN, and whether this is modulated by working memory and inhibitory control. Twenty-eight younger and 28 older adults performed a word recognition task in three visual contexts: mouth blurred (Speech-only), Visible Speech, or Visible Speech + iconic gesture. The Speech signal was either clear or embedded in multitalker babble. Additionally, there were two visual-only conditions (Visible Speech, Visible Speech + gesture). Accuracy levels for both age groups were higher when both visual articulators were present compared to either one or none. However, older adults received a significantly smaller benefit than younger adults, although they performed equally well in Speech-only and visual-only word recognition. Individual differences in verbal working memory and inhibitory control partly accounted for age-related performance differences. To conclude, perceiving iconic gestures in addition to Visible Speech improves younger and older adults’ comprehension of SiN. Yet, the ability to benefit from this additional visual information is modulated by age and verbal working memory. Future research will have to show whether these findings extend beyond the single word level.

  • degree of language experience modulates visual attention to Visible Speech and iconic gestures during clear and degraded Speech comprehension
    Cognitive Science, 2019
    Co-Authors: Linda Drijvers, Julija Vaitonyte, Aslı Özyürek
    Abstract:

    Visual information conveyed by iconic hand gestures and Visible Speech can enhance Speech comprehension under adverse listening conditions for both native and non-native listeners. However, how a listener allocates visual attention to these articulators during Speech comprehension is unknown. We used eye-tracking to investigate whether and how native and highly proficient non-native listeners of Dutch allocated overt eye gaze to Visible Speech and gestures during clear and degraded Speech comprehension. Participants watched video clips of an actress uttering a clear or degraded (6-band noise-vocoded) action verb while performing a gesture or not, and were asked to indicate the word they heard in a cued-recall task. Gestural enhancement was the largest (i.e., a relative reduction in reaction time cost) when Speech was degraded for all listeners, but it was stronger for native listeners. Both native and non-native listeners mostly gazed at the face during comprehension, but non-native listeners gazed more often at gestures than native listeners. However, only native but not non-native listeners' gaze allocation to gestures predicted gestural benefit during degraded Speech comprehension. We conclude that non-native listeners might gaze at gesture more as it might be more challenging for non-native listeners to resolve the degraded auditory cues and couple those cues to phonological information that is conveyed by Visible Speech. This diminished phonological knowledge might hinder the use of semantic information that is conveyed by gestures for non-native compared to native listeners. Our results demonstrate that the degree of language experience impacts overt visual attention to visual articulators, resulting in different visual benefits for native versus non-native listeners.

  • Non-native Listeners Benefit Less from Gestures and Visible Speech than Native Listeners During Degraded Speech Comprehension
    Language and Speech, 2019
    Co-Authors: Linda Drijvers, Aslı Özyürek
    Abstract:

    Native listeners benefit from both Visible Speech and iconic gestures to enhance degraded Speech comprehension (Drijvers & Ozyurek, 2017). We tested how highly proficient non-native listeners benefit from these visual articulators compared to native listeners. We presented videos of an actress uttering a verb in clear, moderately, or severely degraded Speech, while her lips were blurred, Visible, or Visible and accompanied by a gesture. Our results revealed that unlike native listeners, non-native listeners were less likely to benefit from the combined enhancement of Visible Speech and gestures, especially since the benefit from Visible Speech was minimal when the signal quality was not sufficient.

  • visual context enhanced: the joint contribution of iconic gestures and Visible Speech to degraded Speech comprehension
    Journal of Speech Language and Hearing Research, 2017
    Co-Authors: Linda Drijvers, Aslı Özyürek
    Abstract:

    Purpose This study investigated whether and to what extent iconic co-Speech gestures contribute to information from Visible Speech to enhance degraded Speech comprehension at different levels of no...

Barbara Wise - One of the best experts on this subject based on the ideXlab platform.

  • accurate Visible Speech synthesis based on concatenating variable length motion capture data
    IEEE Transactions on Visualization and Computer Graphics, 2006
    Co-Authors: Ronald A Cole, Bryan L Pellom, Wayne H Ward, Barbara Wise
    Abstract:

    We present a novel approach to synthesizing accurate Visible Speech based on searching and concatenating optimal variable-length units in a large corpus of motion capture data. Based on a set of visual prototypes selected on a source face and a corresponding set designated for a target face, we propose a machine learning technique to automatically map the facial motions observed on the source face to the target face. In order to model the long-distance coarticulation effects in Visible Speech, a large-scale corpus that covers the most common syllables in English was collected, annotated, and analyzed. For any input text, a search algorithm to locate the optimal sequences of concatenated units for synthesis is described. A new algorithm to adapt lip motions from a generic 3D face model to a specific 3D face model is also proposed. A complete, end-to-end Visible Speech animation system is implemented based on the approach. This system is currently used in more than 60 kindergarten through third-grade classrooms to teach students to read using a lifelike conversational animated agent. To evaluate the quality of the Visible Speech produced by the animation system, both subjective and objective evaluations are conducted. The evaluation results show that the proposed approach is accurate and powerful for Visible Speech synthesis.
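    As a rough sketch of the general unit-selection idea described above (choosing and concatenating variable-length units so that a combined target and join cost is minimal), the following dynamic-programming routine is illustrative only; the Unit structure and the cost functions are placeholders, not the paper's actual algorithm or cost definitions.

```python
# Generic unit-selection sketch (illustrative, not the paper's algorithm): pick one
# candidate motion-capture unit per target label so that the summed target and
# join costs are minimal, using Viterbi-style dynamic programming.
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

@dataclass
class Unit:
    label: str                                   # syllable or diviseme this unit realizes
    frames: list = field(default_factory=list)   # motion-capture frames (placeholder)

def select_units(
    targets: Sequence[str],
    candidates: Callable[[str], List[Unit]],
    target_cost: Callable[[str, Unit], float],
    join_cost: Callable[[Unit, Unit], float],
) -> List[Unit]:
    """Return the lowest-cost sequence of units, one unit per target label."""
    # Each entry is (accumulated cost, path of chosen units so far).
    best = [(target_cost(targets[0], u), [u]) for u in candidates(targets[0])]
    for label in targets[1:]:
        extended = []
        for unit in candidates(label):
            # Cheapest existing path when extended by this candidate unit.
            cost, path = min(
                ((c + join_cost(p[-1], unit), p) for c, p in best),
                key=lambda item: item[0],
            )
            extended.append((cost + target_cost(label, unit), path + [unit]))
        best = extended
    return min(best, key=lambda item: item[0])[1]
```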

  • accurate automatic Visible Speech synthesis of arbitrary 3d models based on concatenation of diviseme motion capture data
    Computer Animation and Virtual Worlds, 2004
    Co-Authors: Ronald A Cole, Bryan L Pellom, Wayne H Ward, Barbara Wise
    Abstract:

    We present a technique for accurate automatic Visible Speech synthesis from textual input. When provided with a Speech waveform and the text of a spoken sentence, the system produces accurate Visible Speech synchronized with the audio signal. To develop the system, we collected motion capture data from a speaker's face during production of a set of words containing all diviseme sequences in English. The motion capture points from the speaker's face are retargeted to the vertices of the polygons of a 3D face model. When synthesizing a new utterance, the system locates the required sequence of divisemes, shrinks or expands each diviseme based on the desired phoneme segment durations in the target utterance, then moves the polygons in the regions of the lips and lower face to correspond to the spatial coordinates of the motion capture data. The motion mapping is realized by a key-shape mapping function learned by a set of viseme examples in the source and target faces. A well-posed numerical algorithm estimates the shape blending coefficients. Time warping and motion vector blending at the juncture of two divisemes and the algorithm to search the optimal concatenated Visible Speech are also developed to provide the final concatenative motion sequence. Copyright © 2004 John Wiley & Sons, Ltd.
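    The key-shape mapping and time warping described above can be pictured with a minimal numerical sketch. The version below uses plain least squares as a stand-in for the paper's well-posed estimation algorithm and uniform linear interpolation as a stand-in for its time warping; the array shapes and function names are assumptions made for illustration.

```python
# Minimal key-shape blending and time-warping sketch (illustrative stand-ins for
# the techniques described above, not the authors' implementation).
import numpy as np

def blend_weights(source_keys: np.ndarray, frame: np.ndarray) -> np.ndarray:
    """Express a captured source-face frame as a blend of source viseme key shapes.
    source_keys: (n_keys, n_coords) stacked key shapes; frame: (n_coords,)."""
    weights, *_ = np.linalg.lstsq(source_keys.T, frame, rcond=None)
    return weights

def retarget_frame(weights: np.ndarray, target_keys: np.ndarray) -> np.ndarray:
    """Apply the source blend weights to the target face's key shapes.
    target_keys: (n_keys, n_coords); returns the retargeted frame (n_coords,)."""
    return weights @ target_keys

def time_warp(frames: np.ndarray, target_len: int) -> np.ndarray:
    """Uniformly stretch or shrink a diviseme's frames (shape (n_frames, n_coords))
    to a desired number of frames by per-coordinate linear interpolation."""
    src = np.linspace(0.0, 1.0, num=frames.shape[0])
    dst = np.linspace(0.0, 1.0, num=target_len)
    return np.stack(
        [np.interp(dst, src, frames[:, d]) for d in range(frames.shape[1])],
        axis=1,
    )
```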

Laura A Thompson - One of the best experts on this subject based on the ideXlab platform.

  • reliance on Visible Speech cues during multimodal language processing: individual and age differences
    Experimental Aging Research, 2007
    Co-Authors: Laura A Thompson, E Garcia, D Malloy
    Abstract:

    The current study demonstrates that when a strong inhibition process is invoked during multimodal (auditory-visual) language understanding, older adults perform worse than younger adults, Visible Speech does not benefit language-processing performance, and individual differences in measures of working memory for language do not predict performance. In contrast, in a task that does not invoke inhibition, adult age differences in performance are not obtained, Visible Speech benefits language performance, and individual differences in working memory predict performance. The results support a framework for investigating multimodal language processing that incorporates assumptions about general information processing, individual differences in working memory capacity, and adult cognitive aging.

  • attention resources and Visible Speech encoding in older and younger adults
    Experimental Aging Research, 2004
    Co-Authors: Laura A Thompson, Daniel M Malloy
    Abstract:

    Two experiments investigated adult age differences in the distribution of attention across a speaker's face during auditory-visual language processing. Dots were superimposed on the faces of speakers for 17-ms presentations, and participants reported the spatial locations of the dots. In Experiment 1, older adults showed relatively better detection performance at the mouth area than at the eye area, compared with younger adults. In Experiment 2, in the absence of audible language, neither age group focused differentially on the mouth area. The results are interpreted in light of Massaro's (1998, Perceiving talking faces: From Speech perception to a behavioral principle. Cambridge, MA: MIT Press) theoretical framework for understanding auditory-visual Speech perception. It is claimed that older adults' greater reliance on Visible Speech is due to a reallocation of resources away from the eyes and toward the mouth area of the face.

  • some limits on encoding Visible Speech and gestures using a dichotic shadowing task
    The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 1999
    Co-Authors: Laura A Thompson, Felipe A Guzman
    Abstract:

    Visible Speech and gestures are two forms of available language information that can be used by listeners to help them understand the speaker's meaning. Previous research has shown that older adults are particularly dependent on Visible Speech, yet seem to profit less than younger adults from the speaker's gestures. To understand how Visible Speech and gestures are used when listening becomes difficult, the authors conducted an experiment with a dichotic shadowing task. The experiment examined how accurately participants could shadow the right- or left-ear input when instructed to attend selectively to a particular ear and whether performance benefited from visual input. The results indicate that older adults' shadowing performance was unaffected by Visible Speech and gestures. Younger adults did benefit from both Visible Speech and gestures. Thus, under extremely attention-demanding listening conditions, older adults are unable to use a compensatory mechanism for encoding visual language.

  • Visible Speech improves human language understanding: implications for Speech processing systems
    Artificial Intelligence Review, 1995
    Co-Authors: Laura A Thompson, William C Ogden
    Abstract:

    Evidence from the study of human language understanding is presented suggesting that our ability to perceive Visible Speech can greatly influence our ability to understand and remember spoken language. A view of the speaker’s face can greatly aid in the perception of ambiguous or noisy Speech and can aid cognitive processing of Speech leading to better understanding and recall. Some of these effects have been replicated using computer synthesized visual and auditory Speech. Thus, it appears that when giving an interface a voice, it may be best to give it a face too.

  • encoding and memory for Visible Speech and gestures: a comparison between young and older adults
    Psychology and Aging, 1995
    Co-Authors: Laura A Thompson
    Abstract:

    Two experiments explored whether older adults have developed a strategy of compensating for slower speeds of language processing and hearing loss by relying more on the visual modality. Experiment 1 examined the influence of visual articulatory movements of the face (Visible Speech) in auditory-visual syllable classification in young adults and older adults. Older adults showed a significantly greater influence of Visible Speech. Experiment 2 examined immediate recall in three spoken-language sentence conditions: Speech alone, with Visible Speech, or with both Visible Speech and iconic gestures. Sentences also varied in meaningfulness and Speech rate. In the older adult group, recall was better for sentences containing Visible Speech compared with the Speech-alone sentences in the meaningful sentence condition. Older adults' recall showed no overall benefit from the presence of gestures. Young adults' recall on meaningful sentences was not higher in the Visible Speech condition than in the Speech-alone condition, whereas recall was significantly higher with the addition of iconic gestures. In the anomalous sentence condition, both young and older adults showed an advantage in recall from the presence of Visible Speech. The experiments provide converging evidence for older adults' greater reliance on Visible Speech while processing visual-spoken language.

Lawrence D Rosenblum - One of the best experts on this subject based on the ideXlab platform.

  • visibility of Speech articulation enhances auditory phonetic convergence
    Attention Perception & Psychophysics, 2016
    Co-Authors: James W Dias, Lawrence D Rosenblum
    Abstract:

    Talkers automatically imitate aspects of perceived Speech, a phenomenon known as phonetic convergence. Talkers have previously been found to converge to auditory and visual Speech information. Furthermore, talkers converge more to the Speech of a conversational partner who is seen and heard, relative to one who is just heard (Dias & Rosenblum Perception, 40, 1457-1466, 2011). A question raised by this finding is what visual information facilitates the enhancement effect. In the following experiments, we investigated the possible contributions of Visible Speech articulation to visual enhancement of phonetic convergence within the noninteractive context of a shadowing task. In Experiment 1, we examined the influence of the visibility of a talker on phonetic convergence when shadowing auditory Speech either in the clear or in low-level auditory noise. The results suggest that visual Speech can compensate for convergence that is reduced by auditory noise masking. Experiment 2 further established the visibility of articulatory mouth movements as being important to the visual enhancement of phonetic convergence. Furthermore, the word frequency and phonological neighborhood density characteristics of the words shadowed were found to significantly predict phonetic convergence in both experiments. Consistent with previous findings (e.g., Goldinger Psychological Review, 105, 251-279, 1998), phonetic convergence was greater when shadowing low-frequency words. Convergence was also found to be greater for low-density words, contrasting with previous predictions of the effect of phonological neighborhood density on auditory phonetic convergence (e.g., Pardo, Jordan, Mallari, Scanlon, & Lewandowski Journal of Memory and Language, 69, 183-195, 2013). Implications of the results for a gestural account of phonetic convergence are discussed.

  • hearing a face: cross-modal speaker matching using isolated Visible Speech
    Attention Perception & Psychophysics, 2006
    Co-Authors: Lawrence D Rosenblum, Nicolas M Smith, Sarah M Nichols, Steven Hale, Joanne Lee
    Abstract:

    An experiment was performed to test whether cross-modal speaker matches could be made using isolated Visible Speech movement information. Visible Speech movements were isolated using a point-light technique. In five conditions, subjects were asked to match a voice to one of two (unimodal) speaking point-light faces on the basis of speaker identity. Two of these conditions were designed to maintain the idiosyncratic Speech dynamics of the speakers, whereas three of the conditions deleted or distorted the dynamics in various ways. Some of these conditions also equated video frames across dynamically correct and distorted movements. The results revealed generally better matching performance in the conditions that maintained the correct Speech dynamics than in those conditions that did not, despite containing exactly the same video frames. The results suggest that Visible Speech movements themselves can support cross-modal speaker matching.

  • an audiovisual test of kinematic primitives for visual Speech perception
    Journal of Experimental Psychology: Human Perception and Performance, 1996
    Co-Authors: Lawrence D Rosenblum, Helena M Saldana
    Abstract:

    Isolated kinematic properties of Visible Speech can provide information for lip reading. Kinematic facial information is isolated by darkening an actor's face and attaching dots to various articulators so that only moving dots can be seen with no facial features present. To test the salience of these images, the authors conducted experiments to determine whether the images could visually influence the perception of discrepant auditory syllables. Results showed that these images can influence auditory Speech independently of participants' knowledge of the stimuli. In other experiments, single frozen frames of Visible syllables were presented with discrepant auditory syllables to test the salience of static facial features. Although the influence of the kinematic stimuli was perceptual, any influence of the static featural stimuli was likely based on participants' misunderstanding or postperceptual response bias.