Articulator

The Experts below are selected from a list of 6219 Experts worldwide ranked by the ideXlab platform.

Shrikanth S Narayanan - One of the best experts on this subject based on the ideXlab platform.

  • task dependence of Articulator synergies
    Journal of the Acoustical Society of America, 2019
    Co-Authors: Tanner Sorensen, Louis Goldstein, Asterios Toutios, Shrikanth S Narayanan
    Abstract:

    In speech production, the motor system organizes Articulators such as the jaw, tongue, and lips into synergies whose function is to produce speech sounds by forming constrictions at the phonetic places of articulation. The present study tests whether synergies for different constriction tasks differ in terms of inter-Articulator coordination. The test is conducted on utterances [ɑpɑ], [ɑtɑ], [ɑiɑ], and [ɑkɑ] with a real-time magnetic resonance imaging biomarker that is computed using a statistical model of the forward kinematics of the vocal tract. The present study is the first to estimate the forward kinematics of the vocal tract from speech production data. Using the imaging biomarker, the study finds that the jaw contributes least to the velar stop for [k], more to pharyngeal approximation for [ɑ], still more to palatal approximation for [i], and most to the coronal stop for [t]. Additionally, the jaw contributes more to the coronal stop for [t] than to the bilabial stop for [p]. Finally, the study investigates how this pattern of results varies by participant. The study identifies differences in inter-Articulator coordination by constriction task, which support the claim that inter-Articulator coordination differs depending on the active Articulator synergy.
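
    The study's statistical forward-kinematics model is not reproduced here, but the decomposition it enables can be illustrated under a simplifying assumption: if the constriction degree were a linear function of articulator parameters, each articulator's contribution to a constriction change would just be its displacement weighted by its sensitivity. The weights, displacements, and articulator set below are hypothetical.

```python
# Hypothetical sketch: decomposing a constriction change into per-articulator
# contributions under an assumed *linear* forward kinematics model.
import numpy as np

# Assumed linear forward map: constriction degree = w @ articulator_params
# (the actual study fits a statistical model of forward kinematics from rtMRI data).
articulators = ["jaw", "tongue", "lips"]
w = np.array([0.6, 1.1, 0.2])       # hypothetical sensitivity of the constriction to each articulator

# Hypothetical articulator displacements from vowel onset to consonant target (mm).
delta = np.array([0.8, 1.5, 0.1])

contrib = w * delta                  # each articulator's share of the constriction change
total = contrib.sum()

for name, c in zip(articulators, contrib):
    print(f"{name}: {c:.2f} ({100 * c / total:.0f}% of constriction change)")
```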

  • Feasibility of through-time spiral generalized autocalibrating partial parallel acquisition for low latency accelerated real-time MRI of speech.
    Magnetic Resonance in Medicine, 2017
    Co-Authors: Sajan Goud Lingala, Shrikanth S Narayanan, Asterios Toutios, Yongwan Lim, Yinghua Zhu, Nicole Seiberlich, Krishna S. Nayak
    Abstract:

    Purpose: To evaluate the feasibility of through-time spiral generalized autocalibrating partial parallel acquisition (GRAPPA) for low-latency accelerated real-time MRI of speech. Methods: Through-time spiral GRAPPA (spiral GRAPPA), a fast linear reconstruction method, is applied to spiral (k-t) data acquired from an eight-channel custom upper-airway coil. Fully sampled data were retrospectively down-sampled to evaluate spiral GRAPPA at undersampling factors R = 2 to 6. Pseudo-golden-angle spiral acquisitions were used for prospective studies. Three subjects were imaged while performing a range of speech tasks that involved rapid Articulator movements, including fluent speech and beat-boxing. Spiral GRAPPA was compared with view sharing and a parallel imaging and compressed sensing (PI-CS) method. Results: Spiral GRAPPA captured spatiotemporal dynamics of vocal tract Articulators at undersampling factors ≤4. Spiral GRAPPA at 18 ms/frame and 2.4 mm²/pixel outperformed view sharing in depicting rapidly moving Articulators. Spiral GRAPPA and PI-CS provided equivalent temporal fidelity. Reconstruction latency per frame was 14 ms for view sharing and 116 ms for spiral GRAPPA, using a single processor. Spiral GRAPPA kept up with the MRI data rate of 18 ms/frame with eight processors. PI-CS required 17 minutes to reconstruct 5 seconds of dynamic data. Conclusion: Spiral GRAPPA enabled 4-fold accelerated real-time MRI of speech with a low reconstruction latency. This approach is applicable to a wide range of speech RT-MRI experiments that benefit from real-time feedback while visualizing rapid Articulator movement. Magn Reson Med 78:2275–2282, 2017. © 2017 International Society for Magnetic Resonance in Medicine.
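
    The reconstruction itself (spiral GRAPPA weight calibration and application) is not sketched here; the snippet below only illustrates the retrospective evaluation step mentioned in the abstract, i.e., down-sampling fully sampled spiral k-t data by keeping every R-th interleaf. Array shapes and the per-frame shifting pattern are assumptions.

```python
# Illustrative only: retrospective down-sampling of fully sampled spiral k-t data
# to emulate undersampling factors R = 2..6 (data layout is assumed).
import numpy as np

def retrospective_undersample(kdata, R, shift_per_frame=True):
    """kdata: complex array [frames, interleaves, samples, coils]."""
    n_frames, n_intl = kdata.shape[:2]
    mask = np.zeros((n_frames, n_intl), dtype=bool)
    for f in range(n_frames):
        offset = f % R if shift_per_frame else 0   # interleaved pattern across frames
        mask[f, offset::R] = True
    under = np.where(mask[..., None, None], kdata, 0)
    return under, mask

# Example with synthetic data: 100 frames, 12 interleaves, 600 samples, 8 coils.
kdata = np.random.randn(100, 12, 600, 8) + 1j * np.random.randn(100, 12, 600, 8)
under, mask = retrospective_undersample(kdata, R=4)
print("kept fraction of interleaves:", mask.mean())   # ~1/R
```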

  • INTERSPEECH - A Real-Time MRI Study of Articulatory Setting in Second Language Speech
    2014
    Co-Authors: Andrés Benítez, Louis Goldstein, Vikram Ramanarayanan, Shrikanth S Narayanan
    Abstract:

    Previous work has shown that languages differ in their Articulatory setting, the postural configuration that the vocal tract Articulators tend to adopt when they are not engaged in any active speech gesture, and that this posture might be specified as part of the phonological knowledge speakers have of the language. This study tests whether the Articulatory setting of a language can be acquired by non-native speakers. Three native speakers of German who had learned English as a second language were imaged using real-time MRI of the vocal tract while reading passages in German and English, and features that capture vocal tract posture were extracted from the inter-speech pauses in their native and non-native languages. Results show that the speakers exhibit distinct inter-speech postures in each language, with a lower and more retracted tongue in English, consistent with classic descriptions of the differences between the German and the English Articulatory settings. This supports the view that non-native speakers may acquire relevant features of the Articulatory setting of a second language, and also lends further support to the idea that Articulatory setting is part of a speaker’s phonological competence in a language. Index Terms: Articulatory setting, speech production, second language speech acquisition, real-time MRI.

  • flexible retrospective selection of temporal resolution in real time speech mri using a golden ratio spiral view order
    Magnetic Resonance in Medicine, 2011
    Co-Authors: Shrikanth S Narayanan, Krishna S. Nayak
    Abstract:

    In speech production research using real-time magnetic resonance imaging (MRI), the analysis of Articulatory dynamics is performed retrospectively. A flexible selection of temporal resolution is highly desirable because of natural variations in speech rate and variations in the speed of different Articulators. The purpose of the study is to demonstrate a first application of golden-ratio spiral temporal view order to real-time speech MRI and investigate its performance by comparison with conventional bit-reversed temporal view order. Golden-ratio view order proved to be more effective at capturing the dynamics of rapid tongue tip motion. A method for automated blockwise selection of temporal resolution is presented that enables the synthesis of a single video from multiple temporal-resolution videos and potentially facilitates subsequent vocal tract shape analysis. Magn Reson Med 65:1365-1371, 2011. © 2010 Wiley-Liss, Inc.
    … during production of monophthongal vowel sounds or in the vicinity of pauses. Vocal tract variables such as tongue tip constriction, lip aperture, and velum aperture are dynamically controlled and coordinated to produce target words (11). The speeds among Articulators can also differ during the coordination of different Articulators, for example, the movement of the velum and the tongue tip during the production of the nasal consonant /n/. Current speech MRI protocols do not provide a mechanism for flexible selection of temporal resolution. This is of potential value, because higher temporal resolution is necessary for frames that reflect rapid Articulator motion, while lower temporal resolution is sufficient for capturing the frames that correspond to static postures. As recently shown by Winkelmann et al. (12), golden-ratio sampling enables flexible retrospective selection of temporal resolution. It may be suited for speech imaging, in which the motion patterning of Articulators varies significantly in time, and in which it is difficult to determine an appropriate temporal resolution a priori. In this manuscript, we present a first application of a spiral golden-ratio sampling scheme to real-time speech MRI and investigate its performance by comparison with a conventional bit-reversed temporal view order sampling scheme. Simulation studies are performed to compare the unaliased field-of-view (FOV) from spiral golden-ratio sampling with that from conventional bit-reversed sampling at different levels of temporal resolution after a retrospective selection. In vivo experiments are performed to qualitatively compare image signal-to-noise ratio (SNR), level of spatial aliasing, and degree of temporal fidelity. Finally, we present an automated technique in which a composite movie can be produced using data reconstructed at several different temporal resolutions. We demonstrate its effectiveness at improving Articulator visualization during production of the nasal consonant /n/.
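
    A minimal sketch of the two ingredients named above, under stated assumptions: spiral interleaf angles advanced by the golden-ratio fraction of a full rotation, and retrospective binning of acquisitions into frames at whatever temporal resolution the analyst chooses afterwards. The TR and frame durations are illustrative, not the paper's protocol values.

```python
# Sketch (assumptions marked): golden-ratio ordering of spiral interleaf angles
# and retrospective binning into frames at an arbitrary temporal resolution.
import numpy as np

PHI = (np.sqrt(5) - 1) / 2              # golden ratio conjugate, ~0.618

def golden_ratio_angles(n_acq):
    """Rotation angle of the n-th acquired spiral interleaf (radians)."""
    return np.mod(np.arange(n_acq) * PHI * 2 * np.pi, 2 * np.pi)

def bin_into_frames(acq_times_ms, frame_ms):
    """Group acquisition indices into frames of the chosen temporal resolution.
    Successive golden-ratio angles are near-uniformly distributed, so any
    contiguous group of interleaves gives roughly even angular coverage."""
    frame_idx = (acq_times_ms // frame_ms).astype(int)
    return [np.where(frame_idx == f)[0] for f in range(frame_idx.max() + 1)]

tr_ms = 6.0                              # assumed repetition time per interleaf
angles = golden_ratio_angles(1000)
acq_times = np.arange(1000) * tr_ms
frames_fine = bin_into_frames(acq_times, frame_ms=36)    # high temporal resolution
frames_coarse = bin_into_frames(acq_times, frame_ms=90)  # lower resolution, more interleaves/frame
print(np.degrees(angles[:4]).round(1), len(frames_fine), len(frames_coarse))
```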

  • analysis of inter Articulator correlation in acoustic to Articulatory inversion using generalized smoothness criterion
    Conference of the International Speech Communication Association, 2011
    Co-Authors: Prasanta Kumar Ghosh, Shrikanth S Narayanan
    Abstract:

    The movements of the different speech Articulators are known to be correlated to various degrees during speech production. In this paper, we investigate whether the inter-Articulator correlation is preserved among the Articulators estimated through acoustic-to-Articulatory inversion using the generalized smoothness criterion (GSC). GSC estimates each Articulator separately without explicitly using any correlation information between the Articulators. Theoretical analysis of inter-Articulator correlation in GSC reveals that the correlation between any two estimated Articulators may not be identical to that between the corresponding measured Articulatory trajectories; however, based on smoothness constraints provided by the real Articulatory data, we found that, in practice, the correlation among Articulators is approximately preserved in GSC-based inversion. To validate the theoretical analysis of inter-Articulator correlation, we propose a modified version of GSC where correlations among Articulators are explicitly imposed. We found that there is no significant benefit in inversion using such a modified GSC, which further strengthens the conclusions drawn from the theoretical analysis of inter-Articulator correlation. Index Terms: acoustic-to-Articulatory inversion, inter-Articulator correlation, generalized smoothness criterion
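
    The GSC inversion itself is not reproduced here; the snippet below only illustrates the comparison the abstract describes, computing inter-articulator (Pearson) correlation matrices for measured and estimated trajectories and checking how much they differ. The trajectories are synthetic stand-ins.

```python
# Illustrative check (not the GSC algorithm itself): compare inter-articulator
# correlation matrices of measured vs. inverted articulator trajectories.
import numpy as np

def correlation_matrix(trajectories):
    """trajectories: array [time, articulators] -> Pearson correlation matrix."""
    return np.corrcoef(trajectories, rowvar=False)

# Synthetic stand-ins for measured and estimated trajectories (T x K).
rng = np.random.default_rng(0)
measured = rng.standard_normal((500, 6)).cumsum(axis=0)
estimated = measured + 0.3 * rng.standard_normal((500, 6))   # imperfect inversion

diff = np.abs(correlation_matrix(measured) - correlation_matrix(estimated))
print("max |delta correlation|:", diff.max())
```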

Edward F. Chang - One of the best experts on this subject based on the ideXlab platform.

  • Speech synthesis from neural decoding of spoken sentences
    Nature, 2019
    Co-Authors: Gopala K. Anumanchipalli, Josh Chartier, Edward F. Chang
    Abstract:

    Technology that translates neural activity into speech would be transformative for people who are unable to communicate as a result of neurological impairments. Decoding speech from neural activity is challenging because speaking requires very precise and rapid multi-dimensional control of vocal tract Articulators. Here we designed a neural decoder that explicitly leverages kinematic and sound representations encoded in human cortical activity to synthesize audible speech. Recurrent neural networks first decoded directly recorded cortical activity into representations of Articulatory movement, and then transformed these representations into speech acoustics. In closed vocabulary tests, listeners could readily identify and transcribe speech synthesized from cortical activity. Intermediate Articulatory dynamics enhanced performance even with limited data. Decoded Articulatory representations were highly conserved across speakers, enabling a component of the decoder to be transferrable across participants. Furthermore, the decoder could synthesize speech when a participant silently mimed sentences. These findings advance the clinical viability of using speech neuroprosthetic technology to restore spoken communication. A neural decoder uses kinematic and sound representations encoded in human cortical activity to synthesize audible sentences, which are readily identified and transcribed by listeners.
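
    A minimal sketch of the two-stage decoding idea described above, written in PyTorch; layer sizes, feature counts, and recurrent architecture details are assumptions for illustration, not the authors' published configuration.

```python
# Toy two-stage decoder: stage 1 maps cortical activity to articulatory kinematics,
# stage 2 maps kinematics to acoustic features a vocoder could turn into audio.
import torch
import torch.nn as nn

class NeuralToKinematics(nn.Module):
    def __init__(self, n_electrodes=256, n_kinematic=33, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(n_electrodes, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_kinematic)

    def forward(self, ecog):                  # ecog: [batch, time, electrodes]
        h, _ = self.rnn(ecog)
        return self.out(h)                    # [batch, time, kinematic features]

class KinematicsToAcoustics(nn.Module):
    def __init__(self, n_kinematic=33, n_acoustic=32, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(n_kinematic, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, n_acoustic)

    def forward(self, kin):
        h, _ = self.rnn(kin)
        return self.out(h)                    # e.g. spectral / cepstral features

stage1, stage2 = NeuralToKinematics(), KinematicsToAcoustics()
ecog = torch.randn(4, 200, 256)               # 4 utterances, 200 time steps
acoustics = stage2(stage1(ecog))
print(acoustics.shape)                         # torch.Size([4, 200, 32])
```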

  • The auditory representation of speech sounds in human motor cortex
    eLife, 2016
    Co-Authors: Connie Cheung, Liberty S Hamilton, Keith A Johnson, Edward F. Chang
    Abstract:

    In humans, listening to speech evokes neural responses in the motor cortex. This has been controversially interpreted as evidence that speech sounds are processed as Articulatory gestures. However, it is unclear what information is actually encoded by such neural activity. We used high-density direct human cortical recordings while participants spoke and listened to speech sounds. Motor cortex neural patterns during listening were substantially different than during articulation of the same sounds. During listening, we observed neural activity in the superior and inferior regions of ventral motor cortex. During speaking, responses were distributed throughout somatotopic representations of speech Articulators in motor cortex. The structure of responses in motor cortex during listening was organized along acoustic features similar to auditory cortex, rather than along Articulatory features as during speaking. Motor cortex does not contain Articulatory representations of perceived actions in speech, but rather, represents auditory vocal information.

  • functional organization of human sensorimotor cortex for speech articulation
    Nature, 2013
    Co-Authors: Kristofer E Bouchard, Keith A Johnson, Nima Mesgarani, Edward F. Chang
    Abstract:

    Speaking is one of the most complex actions we perform, yet nearly all of us learn to do it effortlessly. Production of fluent speech requires the precise, coordinated movement of multiple Articulators (e.g., lips, jaw, tongue, larynx) over rapid time scales. Here, we used high-resolution, multi-electrode cortical recordings during the production of consonant-vowel syllables to determine the organization of speech sensorimotor cortex in humans. We found speech Articulator representations that were somatotopically arranged on ventral pre- and post-central gyri and partially overlapping at individual electrodes. These representations were temporally coordinated as sequences during syllable production. Spatial patterns of cortical activity revealed an emergent, population-level representation, which was organized by phonetic features. Over tens of milliseconds, the spatial patterns transitioned between distinct representations for different consonants and vowels. These results reveal the dynamic organization of speech sensorimotor cortex during the generation of multi-Articulator movements underlying our ability to speak.

Louis Goldstein - One of the best experts on this subject based on the ideXlab platform.

  • task dependence of Articulator synergies
    Journal of the Acoustical Society of America, 2019
    Co-Authors: Tanner Sorensen, Louis Goldstein, Asterios Toutios, Shrikanth S Narayanan
    Abstract:

    In speech production, the motor system organizes Articulators such as the jaw, tongue, and lips into synergies whose function is to produce speech sounds by forming constrictions at the phonetic places of articulation. The present study tests whether synergies for different constriction tasks differ in terms of inter-Articulator coordination. The test is conducted on utterances [ɑpɑ], [ɑtɑ], [ɑiɑ], and [ɑkɑ] with a real-time magnetic resonance imaging biomarker that is computed using a statistical model of the forward kinematics of the vocal tract. The present study is the first to estimate the forward kinematics of the vocal tract from speech production data. Using the imaging biomarker, the study finds that the jaw contributes least to the velar stop for [k], more to pharyngeal approximation for [ɑ], still more to palatal approximation for [i], and most to the coronal stop for [t]. Additionally, the jaw contributes more to the coronal stop for [t] than to the bilabial stop for [p]. Finally, the study investigates how this pattern of results varies by participant. The study identifies differences in inter-Articulator coordination by constriction task, which support the claim that inter-Articulator coordination differs depending on the active Articulator synergy.

  • INTERSPEECH - A Real-Time MRI Study of Articulatory Setting in Second Language Speech
    2014
    Co-Authors: Andrés Benítez, Louis Goldstein, Vikram Ramanarayanan, Shrikanth S Narayanan
    Abstract:

    Previous work has shown that languages differ in their Articulatory setting, the postural configuration that the vocal tract Articulators tend to adopt when they are not engaged in any active speech gesture, and that this posture might be specified as part of the phonological knowledge speakers have of the language. This study tests whether the Articulatory setting of a language can be acquired by non-native speakers. Three native speakers of German who had learned English as a second language were imaged using real-time MRI of the vocal tract while reading passages in German and English, and features that capture vocal tract posture were extracted from the inter-speech pauses in their native and non-native languages. Results show that the speakers exhibit distinct inter-speech postures in each language, with a lower and more retracted tongue in English, consistent with classic descriptions of the differences between the German and the English Articulatory settings. This supports the view that non-native speakers may acquire relevant features of the Articulatory setting of a second language, and also lends further support to the idea that Articulatory setting is part of a speaker’s phonological competence in a language. Index Terms: Articulatory setting, speech production, second language speech acquisition, real-time MRI.

  • Real‐time MRI tracking of articulation during grammatical and ungrammatical pauses in speech.
    The Journal of the Acoustical Society of America, 2009
    Co-Authors: Vikram Ramanarayanan, Louis Goldstein, Dani Byrd, Erik Bresch, Shrikanth S Narayanan
    Abstract:

    Grammatical pauses in speech generally occur at a clause boundary, presumably due to parsing and planning; however, pausing can occur at grammatically inappropriate locations when planning, production, and/or lexical access processes are disrupted. Real‐time MRI of spontaneous speech production (responses to queries like “tell me about your family,” etc.) was used for seven subjects to examine the Articulatory manifestations of grammatical and ungrammatical pauses (manually classified as such by two experimenters depending on the presence/absence of a clausal juncture). Measures quantifying the speed of Articulators were developed and applied during these pauses as well as their immediate neighborhoods. Results indicate a consistently higher Articulatory speed and spatial range for grammatical compared to ungrammatical pauses, and an appreciable drop in speed for grammatical pauses relative to their neighborhoods, suggesting that higher‐level cognitive mechanisms are at work in planning grammatical pauses...
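
    The exact speed measures used in the study are not specified in the abstract; the sketch below shows one plausible form, mean frame-to-frame displacement of tracked articulator points over a pause window versus its neighborhood. Frame rate, window boundaries, and the tracking data are hypothetical.

```python
# Rough sketch of an articulator-speed measure: per-frame displacement of tracked
# points, averaged over a pause interval and over its immediate neighborhood.
import numpy as np

def mean_speed(points, fps):
    """points: [frames, n_points, 2] pixel coordinates -> mean speed (px/s)."""
    step = np.linalg.norm(np.diff(points, axis=0), axis=-1)  # per-point displacement
    return step.mean() * fps

fps = 23.18                                    # assumed rtMRI frame rate
rng = np.random.default_rng(1)
track = rng.standard_normal((300, 20, 2)).cumsum(axis=0)     # synthetic contour track

pause = slice(100, 140)                        # hypothetical pause interval
neighborhood = slice(60, 100)
print("pause speed:", mean_speed(track[pause], fps))
print("neighborhood speed:", mean_speed(track[neighborhood], fps))
```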

  • An analysis‐by‐synthesis approach to modeling real‐time MRI Articulatory data using the task dynamic application framework.
    The Journal of the Acoustical Society of America, 2009
    Co-Authors: Erik Bresch, Louis Goldstein, Shrikanth S Narayanan
    Abstract:

    We report on a method of modeling real‐time MRI Articulatory speech data using the Haskins task dynamic application (TaDA) framework. TaDA models speech using a set of discrete dynamical regimes that control the formation of vocal tract constrictions (gestures). An utterance can be specified by a gestural score: the pattern of activation of these regimes in time. Individual model Articulator degrees of freedom are automatically coordinated according to the concurrent demands of the unfolding constrictions. Our modeling procedure consists of two stages: (1) After determining the outline of the midsagittal upper airway, time series of constriction measurements are derived which allow the estimation of the subject‐specific parameters relating the Articulator and constriction domains. (2) Gradient descent is utilized to adjust the activation intervals of the gestural score generated by TaDA for that utterance so that the synthesized vocal tract constriction evolution matches the observed MRI time series. Additio...
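
    The TaDA synthesizer and gestural scores cannot be reproduced in a few lines; the toy sketch below only illustrates the analysis-by-synthesis loop itself, fitting the onset, offset, and depth of a single activation interval so a synthesized constriction time series matches an observed one. A derivative-free optimizer stands in for the gradient descent mentioned in the abstract, and the "synthesizer" is a stand-in, not TaDA.

```python
# Schematic analysis-by-synthesis loop: adjust activation-interval parameters so a
# synthesized constriction time series matches an observed one.
import numpy as np
from scipy.optimize import minimize

t = np.linspace(0, 1, 200)

def synthesize(params):
    """Toy gesture model: a smooth constriction driven by one activation interval."""
    onset, offset, depth = params
    act = 1 / (1 + np.exp(-(t - onset) * 60)) * 1 / (1 + np.exp((t - offset) * 60))
    return depth * act

observed = synthesize([0.30, 0.55, 0.8]) + \
           0.02 * np.random.default_rng(2).standard_normal(t.size)

def loss(params):
    return np.mean((synthesize(params) - observed) ** 2)

fit = minimize(loss, x0=[0.2, 0.7, 0.5], method="Nelder-Mead")
print("recovered onset/offset/depth:", np.round(fit.x, 3))
```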

  • Inverting mappings from smooth paths through R^n to paths through R^m: A technique applied to recovering articulation from acoustics
    Speech Communication, 2007
    Co-Authors: John Hogden, Philip E. Rubin, E. Mcdermott, Shigeru Katagiri, Louis Goldstein
    Abstract:

    Motor theories, which postulate that speech perception is related to linguistically significant movements of the vocal tract, have guided speech perception research for nearly four decades but have had little impact on automatic speech recognition. In this paper, we describe a signal processing technique named MIMICRI that may help link motor theory with automatic speech recognition by providing a practical approach to recovering Articulator positions from acoustics. MIMICRI's name reflects three important operations it can perform on time-series data: it can reduce the dimensionality of a data set (manifold inference); it can blindly invert nonlinear functions applied to the data (mapping inversion); and it can use temporal context to estimate intermediate data (contextual recovery of information). In order for MIMICRI to work, the signals to be analyzed must be functions of unobservable signals that lie on a linear subspace of the set of all unobservable signals. For example, MIMICRI will typically work if the unobservable signals are band-pass and we know the pass-band, as is the case for Articulator motions. We discuss the abilities of MIMICRI as they relate to speech processing applications, particularly as they relate to inverting the mapping from speech Articulator positions to acoustics. We then present a mathematical proof that explains why MIMICRI can invert nonlinear functions, which it can do even in some cases in which the mapping from the unobservable variables to the observable variables is many-to-one. Finally, we show that MIMICRI is able to infer accurately the positions of the speech Articulators from speech acoustics for vowels. Five parameters estimated by MIMICRI were more linearly related to Articulator positions than 128 spectral energies.

Yves Laprie - One of the best experts on this subject based on the ideXlab platform.

  • Centerline Articulatory models of the velum and epiglottis for Articulatory synthesis of speech
    2018
    Co-Authors: Yves Laprie, Benjamin Elie, Anastasiia Tsukanova, Pierre-andré Vuissoz
    Abstract:

    This work concerns the construction of Articulatory models for synthesis of speech, and more specifically the velum and epiglottis. The direct application of principal component analysis to the contours of these Articulators extracted from MRI images results in unrealistic factors due to delineation errors. The approach described in this paper relies on the application of PCA to the centerline of the Articulator and a simple reconstruction algorithm to obtain the global Articulator contour. The complete Articulatory model was constructed from static Magnetic Resonance (MR) images because their quality is much better than that of dynamic MR images. We thus assessed the extent to which the model constructed from static images is capable of approaching the vocal tract shape in MR images recorded at 55 Hz for continuous speech. The analysis of reconstruction errors shows that it is necessary to add dynamic images to the database of static images, in particular to approach the tongue shape for the /l/ sound.
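
    A small sketch of the centerline idea under stated assumptions: apply PCA to centerline coordinates (rather than full contours) and rebuild a contour by adding a constant thickness around the reconstructed centerline. The data, component count, and thickness are illustrative; the paper's actual reconstruction algorithm is more elaborate.

```python
# Sketch: PCA on articulator centerlines via SVD, plus a naive constant-thickness
# contour reconstruction around the rebuilt centerline.
import numpy as np

def pca_model(centerlines, n_components=2):
    """centerlines: [n_images, n_points, 2] -> mean and principal components."""
    X = centerlines.reshape(len(centerlines), -1)
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components], s[:n_components]

def reconstruct_centerline(mean, components, scores):
    return (mean + scores @ components).reshape(-1, 2)

rng = np.random.default_rng(3)
data = rng.standard_normal((50, 30, 2)) * 0.1 + np.linspace(0, 1, 30)[None, :, None]
mean, comps, _ = pca_model(data)
centerline = reconstruct_centerline(mean, comps, np.array([0.5, -0.2]))

# Constant-thickness contour around the centerline (stand-in for the paper's method).
tangent = np.gradient(centerline, axis=0)
normal = tangent[:, ::-1] * np.array([1.0, -1.0])
normal /= np.linalg.norm(normal, axis=1, keepdims=True) + 1e-9
contour = np.vstack([centerline + 0.05 * normal, (centerline - 0.05 * normal)[::-1]])
print(contour.shape)
```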

  • Adding visual constraints for acoustic-to-Articulatory inversion [Adjonction de contraintes visuelles pour l'inversion acoustique-articulatoire]
    2016
    Co-Authors: Yves Laprie, Blaise Potard
    Abstract:

    The goal of this work is to investigate audiovisual-to-Articulatory inversion. It is well established that acoustic-to-Articulatory inversion is an under-determined problem. On the other hand, there is strong evidence that human speakers/listeners exploit the multimodality of speech, and more particularly the Articulatory cues: the view of visible Articulators, i.e. jaw and lips, improves speech intelligibility. It is thus interesting to add constraints provided by the direct visual observation of the speaker's face. Visible data were obtained by stereo-vision and enabled the 3D recovery of jaw and lip movements. These data were processed to fit the nature of the parameters of Maeda's Articulatory model. Inversion experiments show that constraints on visible Articulatory parameters enable relevant Articulatory trajectories to be recovered and substantially reduce the time required to explore the Articulatory codebook.
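
    One way to picture the constraint described above (a hypothetical sketch, not the authors' implementation): prune an articulatory codebook to entries whose visible (jaw/lip) parameters lie within a tolerance of the stereo-vision measurements before acoustic matching. Parameter indices, codebook size, and tolerance are assumptions.

```python
# Hedged illustration: restrict codebook exploration using visible articulatory parameters.
import numpy as np

rng = np.random.default_rng(5)
codebook = rng.uniform(-3, 3, size=(100000, 7))   # 7 Maeda-style parameters per entry
VISIBLE = [0, 5]                                   # assumed indices of jaw and lip parameters

def prune(codebook, observed_visible, tol=0.5):
    diff = np.abs(codebook[:, VISIBLE] - observed_visible)
    return codebook[np.all(diff < tol, axis=1)]

candidates = prune(codebook, observed_visible=np.array([0.8, -1.2]))
print(f"{len(candidates)} of {len(codebook)} codebook entries kept")
```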

  • Adapting visual data to a linear Articulatory model
    2016
    Co-Authors: Yves Laprie, Blaise Potard
    Abstract:

    The goal of this work is to investigate audiovisual-to-Articulatory inversion. It is well established that acoustic-to-Articulatory inversion is an underdetermined problem. On the other hand, there is strong evidence that human speakers/listeners exploit the multimodality of speech, and more particularly the Articulatory cues: the view of visible Articulators, i.e. jaw and lips, improves speech intelligibility. It is thus interesting to add constraints provided by the direct visual observation of the speaker's face. Visible data were obtained by stereo-vision and enabled the 3D recovery of jaw and lip movements. These data were processed to fit the nature of the parameters of Maeda's Articulatory model. Inversion experiments were conducted.

  • Articulatory copy synthesis from cine X-ray films
    Proceedings of the Annual Conference of the International Speech Communication Association INTERSPEECH, 2013
    Co-Authors: Yves Laprie, Matthieu Loosvelt, Rudolph Sock, Shinji Maeda, Fabrice Hirsch
    Abstract:

    This paper deals with Articulatory copy synthesis from X-ray films. The underlying Articulatory synthesizer uses aerodynamic and acoustic simulations that take target area functions, F0, and transition patterns from one area function to the next as input data. The Articulators, the tongue in particular, have been delineated by hand or semi-automatically from the X-ray films. Specific attention has been paid to the determination of the centerline of the vocal tract from the image and to the coordination between glottal area and vocal tract constrictions, since both aspects strongly affect the acoustics. Experiments show that good-quality speech can be resynthesized even if the interval between two images is 40 ms. The same approach could easily be applied to cine MRI data. Copyright © 2013 ISCA.
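
    The aerodynamic and acoustic simulation is not sketched here; the snippet below only illustrates the interpolation step implied by the abstract, resampling target area functions delineated 40 ms apart to the finer time step an acoustic simulation would need. Tube-section count and time steps are assumptions.

```python
# Minimal sketch (assumed shapes): linear interpolation between target area functions
# from successive images up to an acoustic-simulation time step.
import numpy as np

frame_dt, sim_dt = 0.040, 0.001            # 40 ms between images, 1 ms simulation step
areas = np.random.default_rng(4).uniform(0.2, 4.0, size=(10, 44))  # 10 frames x 44 tube sections (cm^2)

t_frames = np.arange(len(areas)) * frame_dt
t_sim = np.arange(0, t_frames[-1], sim_dt)
interp = np.stack([np.interp(t_sim, t_frames, areas[:, k]) for k in range(areas.shape[1])], axis=1)
print(interp.shape)                        # area functions at the simulation rate
```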

Christopher J Diorio - One of the best experts on this subject based on the ideXlab platform.

  • hidden Articulator markov models for speech recognition
    Speech Communication, 2003
    Co-Authors: Matthew Richardson, Jeff A Bilmes, Christopher J Diorio
    Abstract:

    Most existing automatic speech recognition systems today do not explicitly use knowledge about human speech production. We show that the incorporation of Articulatory knowledge into these systems is a promising direction for speech recognition, with the potential for lower error rates and more robust performance. To this end, we introduce the Hidden-Articulator Markov model (HAMM), a model which directly integrates Articulatory information into speech recognition. The HAMM is an extension of the Articulatory-feature model introduced by Erler in 1996. We extend the model by using diphone units, developing a new technique for model initialization, and constructing a novel Articulatory feature mapping. We also introduce a method to decrease the number of parameters, making the HAMM comparable in size to standard HMMs. We demonstrate that the HAMM can reasonably predict the movement of Articulators, which results in a decreased word error rate (WER). The Articulatory knowledge also proves useful in noisy acoustic conditions. When combined with a standard model, the HAMM reduces WER by 28–35% relative to the standard model alone.
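
    A toy sketch of the HAMM idea as summarized above, not the published model: states are discretized articulatory configurations, a hypothetical phoneme-to-configuration mapping supplies targets, transitions favor small articulatory moves, and a standard HMM forward pass scores an observation sequence.

```python
# Toy Hidden-Articulator Markov Model: articulatory-configuration states, a
# phoneme -> configuration mapping, locality-biased transitions, and a forward pass.
import itertools
import numpy as np

# Discretized articulatory features: (lip opening 0-2, tongue height 0-2, velum 0-1).
states = list(itertools.product(range(3), range(3), range(2)))

# Hypothetical phoneme -> target configuration mapping (illustrative values only).
phoneme_targets = {"p": (0, 1, 0), "a": (2, 0, 0), "n": (2, 2, 1)}

def dist(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

# Transitions favor small articulatory moves (articulators change gradually).
A = np.array([[np.exp(-dist(s, t)) for t in states] for s in states])
A /= A.sum(axis=1, keepdims=True)

# Toy emission model: each state emits observation frames near its own configuration.
def emission_prob(state, frame):
    return np.exp(-0.5 * np.sum((np.array(state) - frame) ** 2))

def forward_score(frames, start_config):
    alpha = np.array([emission_prob(s, frames[0]) * (dist(s, start_config) == 0)
                      for s in states], dtype=float)
    for frame in frames[1:]:
        b = np.array([emission_prob(s, frame) for s in states])
        alpha = (alpha @ A) * b
    return alpha.sum()

frames = [np.array(phoneme_targets[p], dtype=float) for p in "pan"]
print(forward_score(frames, phoneme_targets["p"]))
```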

  • hidden Articulator markov models performance improvements and robustness to noise
    Conference of the International Speech Communication Association, 2000
    Co-Authors: Matthew Richardson, Jeff A Bilmes, Christopher J Diorio
    Abstract:

    A Hidden-Articulator Markov Model (HAMM) is a Hidden Markov Model (HMM) in which each state represents an Articulatory configuration. Articulatory knowledge, known to be useful for speech recognition [4], is represented by specifying a mapping of phonemes to Articulatory configurations; vocal tract dynamics are represented via transitions between Articulatory configurations. In previous work [13], we extended the Articulatory-feature model introduced by Erler [7] by using diphone units and a new technique for model initialization. By comparing it with a purely random model, we showed that the HAMM can take advantage of Articulatory knowledge. In this paper, we extend that work in three ways. First, we decrease the number of parameters, making it comparable in size to standard HMMs. Second, we evaluate our model in noisy contexts, verifying that Articulatory knowledge can provide benefits in adverse acoustic conditions. Third, we use a corpus of side-by-side speech and Articulator trajectories to show that the HAMM can reasonably predict the movement of the Articulators.