Formant

The Experts below are selected from a list of 288 Experts worldwide ranked by ideXlab platform

Peter J. Bailey - One of the best experts on this subject based on the ideXlab platform.

  • Formant-Frequency Variation and Its Effects on Across-Formant Grouping in Speech Perception
    Advances in experimental medicine and biology, 2013
    Co-Authors: Brian Roberts, Robert J Summers, Peter J. Bailey
    Abstract:

    How speech is separated perceptually from other speech remains poorly understood. In a series of experiments, perceptual organisation was probed by presenting three-Formant (F1+F2+F3) analogues of target sentences dichotically, together with a competitor for F2 (F2C), or for F2+F3, which listeners must reject to optimise recognition. To control for energetic masking, the competitor was always presented in the opposite ear to the corresponding target Formant(s). Sine-wave speech was used initially, and different versions of F2C were derived from F2 using separate manipulations of its amplitude and frequency contours. F2Cs with time-varying frequency contours were highly effective competitors, whatever their amplitude characteristics, whereas constant-frequency F2Cs were ineffective. Subsequent studies used synthetic-Formant speech to explore the effects of manipulating the rate and depth of Formant-frequency change in the competitor. Competitor efficacy was not tuned to the rate of Formant-frequency variation in the target sentences; rather, the reduction in intelligibility increased with competitor rate relative to the rate for the target sentences. Therefore, differences in speech rate may not be a useful cue for separating the speech of concurrent talkers. Effects of competitors whose depth of Formant-frequency variation was scaled by a range of factors were explored using competitors derived either by inverting the frequency contour of F2 about its geometric mean (plausibly speech-like pattern) or by using a regular and arbitrary frequency contour (triangle wave, not plausibly speech-like) matched to the average rate and depth of variation for the inverted F2C. Competitor efficacy depended on the overall depth of frequency variation, not depth relative to that for the other Formants. Furthermore, the triangle-wave competitors were as effective as their more speech-like counterparts. Overall, the results suggest that Formant-frequency variation is critical for the across-frequency grouping of Formants but that this grouping does not depend on speech-specific constraints.

  • Effects of the Rate of Formant-Frequency Variation on the Grouping of Formants in Speech Perception
    Journal of the Association for Research in Otolaryngology, 2012
    Co-Authors: Robert J Summers, Peter J. Bailey, Brian Roberts
    Abstract:

    How speech is separated perceptually from other speech remains poorly understood. Recent research suggests that the ability of an extraneous Formant to impair intelligibility depends on the modulation of its frequency, but not its amplitude, contour. This study further examined the effect of Formant-frequency variation on intelligibility by manipulating the rate of Formant-frequency change. Target sentences were synthetic three-Formant (F1 + F2 + F3) analogues of natural utterances. Perceptual organization was probed by presenting stimuli dichotically (F1 + F2C + F3C; F2 + F3), where F2C + F3C constitute a competitor for F2 and F3 that listeners must reject to optimize recognition. Competitors were derived using Formant-frequency contours extracted from extended passages spoken by the same talker and processed to alter the rate of Formant-frequency variation, such that rate scale factors relative to the target sentences were 0, 0.25, 0.5, 1, 2, and 4 (0 = constant frequencies). Competitor amplitude contours were either constant, or time-reversed and rate-adjusted in parallel with the frequency contour. Adding a competitor typically reduced intelligibility; this reduction increased with competitor rate until the rate was at least twice that of the target sentences. Similarity in the results for the two amplitude conditions confirmed that Formant amplitude contours do not influence across-Formant grouping. The findings indicate that competitor efficacy is not tuned to the rate of the target sentences; most probably, it depends primarily on the overall rate of frequency variation in the competitor Formants. This suggests that, when segregating the speech of concurrent talkers, differences in speech rate may not be a significant cue for across-frequency grouping of Formants.

  • The intelligibility of noise-vocoded speech: spectral information available from across-channel comparison of amplitude envelopes
    Proceedings of The Royal Society B: Biological Sciences, 2011
    Co-Authors: Brian Roberts, Robert J Summers, Peter J. Bailey
    Abstract:

    Noise-vocoded (NV) speech is often regarded as conveying phonetic information primarily through temporal-envelope cues rather than spectral cues. However, listeners may infer the Formant frequencies in the vocal-tract output—a key source of phonetic detail—from across-band differences in amplitude when speech is processed through a small number of channels. The potential utility of this spectral information was assessed for NV speech created by filtering sentences into six frequency bands, and using the amplitude envelope of each band (≤30 Hz) to modulate a matched noise-band carrier (N). Bands were paired, corresponding to F1 (≈N1 + N2), F2 (≈N3 + N4) and the higher Formants (F3′ ≈ N5 + N6), such that the frequency contour of each Formant was implied by variations in relative amplitude between bands within the corresponding pair. Three-Formant analogues (F0 = 150 Hz) of the NV stimuli were synthesized using frame-by-frame reconstruction of the frequency and amplitude of each Formant. These analogues were less intelligible than the NV stimuli or analogues created using contours extracted from spectrograms of the original sentences, but more intelligible than when the frequency contours were replaced with constant (mean) values. Across-band comparisons of amplitude envelopes in NV speech can provide phonetically important information about the frequency contours of the underlying Formants.

  • The role of Formant‐frequency contours in the perceptual grouping of speech Formants.
    The Journal of the Acoustical Society of America, 2011
    Co-Authors: Brian Roberts, Robert J Summers, Peter J. Bailey
    Abstract:

    The perceptual organization of speech remains poorly understood. Recent research using sine‐wave speech suggests that the ability of an extraneous Formant to impair intelligibility depends on modulation of its frequency contour [Roberts et al., J. Acoust. Soc. Am. 128, 804–817]. This study examined the effect on intelligibility of manipulating the depth of this frequency variation. Three‐Formant (F1+F2+F3) analogues of natural sentences were synthesized using a monotonous glottal source (F0=140 Hz). Each Formant‐frequency contour was scaled to 50% depth about its geometric mean; this manipulation had relatively little impact on intelligibility. Perceptual organization was probed by presenting stimuli dichotically (F1+F2C; F2+F3), where F2C is a competitor for F2 that listeners must resist to optimize recognition. Different competitors were created by inverting the frequency contour of F2 about its geometric mean and varying its depth (100%‐0%, 25% steps). Adding F2C typically reduced intelligibility; this reduction was greatest for 100%‐depth, intermediate for 50%‐depth, and least for 0%‐depth (constant) F2Cs. These results indicate that competitor efficacy depends on overall depth of frequency variation, not depth relative to that of the other Formants, and suggest that frequency‐contour modulation influences across‐Formant grouping not only in sine‐wave analogues but also in more speech‐like simulations. [Work supported by EPSRC.]
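
The depth manipulation described in the abstracts above (scaling a Formant-frequency contour about its geometric mean, optionally inverting it) can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the scaling is applied on a log-frequency scale, and the function names and the example contour values are hypothetical.

```python
import math


def geometric_mean(freqs_hz):
    """Geometric mean of a list of positive frequencies in Hz."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))


def scale_depth(freqs_hz, depth, invert=False):
    """Rescale a frequency contour about its geometric mean.

    Assumption: the scaling acts on log frequency, so depth=1.0 leaves the
    contour unchanged and depth=0.0 gives a constant contour at the
    geometric mean. invert=True mirrors the contour about that mean.
    """
    gm = geometric_mean(freqs_hz)
    exponent = -depth if invert else depth
    return [gm * (f / gm) ** exponent for f in freqs_hz]


# Hypothetical F2 contour (Hz), sampled at a few frames.
f2 = [1200.0, 1800.0, 1500.0, 2100.0]

# Competitors at 100% to 0% depth in 25% steps, inverted about the mean.
competitors = {d: scale_depth(f2, d / 100.0, invert=True) for d in (100, 75, 50, 25, 0)}
```

On a log scale the inverted contour keeps the same geometric mean as the original, which is why the geometric rather than arithmetic mean is the natural pivot under this assumption.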

Brian Roberts - One of the best experts on this subject based on the ideXlab platform.

  • Effects of frequency region and number of Formants in an interferer on the informational masking of speech
    Journal of the Acoustical Society of America, 2017
    Co-Authors: Brian Roberts, Robert J Summers
    Abstract:

    This study explored whether the extent of informational masking depends on the frequency region and number of Formants in an interferer. Target Formants—monotonized three-Formant analogues of natural sentences—were presented monaurally, with the target ear assigned randomly on each trial. Interferers were presented contralaterally. In experiment 1, single-Formant interferers were created using the time-reversed F2 frequency contour and constant amplitude, RMS-matched to F2. Interferer center frequency was matched to that of F1, F2, or F3, while maintaining the extent of Formant-frequency variation (depth) on a log scale. In experiment 2, the interferer comprised either one Formant (F1) or all three, created using the time-reversed frequency contours of the corresponding targets and RMS-matched constant amplitudes. Including the higher Formants had little effect on interferer intensity. Interferer Formant-frequency variation was scaled to 0%, 50%, or 100% of the original depth. Adding an interferer lowered...

Robert J Summers - One of the best experts on this subject based on the ideXlab platform.

  • Effects of frequency region and number of Formants in an interferer on the informational masking of speech
    Journal of the Acoustical Society of America, 2017
    Co-Authors: Brian Roberts, Robert J Summers
    Abstract:

    This study explored whether the extent of informational masking depends on the frequency region and number of Formants in an interferer. Target Formants—monotonized three-Formant analogues of natural sentences—were presented monaurally, with the target ear assigned randomly on each trial. Interferers were presented contralaterally. In experiment 1, single-Formant interferers were created using the time-reversed F2 frequency contour and constant amplitude, RMS-matched to F2. Interferer center frequency was matched to that of F1, F2, or F3, while maintaining the extent of Formant-frequency variation (depth) on a log scale. In experiment 2, the interferer comprised either one Formant (F1) or all three, created using the time-reversed frequency contours of the corresponding targets and RMS-matched constant amplitudes. Including the higher Formants had little effect on interferer intensity. Interferer Formant-frequency variation was scaled to 0%, 50%, or 100% of the original depth. Adding an interferer lowered...

R. Carré - One of the best experts on this subject based on the ideXlab platform.

  • Perception of synthetic two-Formant vowel transitions
    Speech Communication, 1997
    Co-Authors: William A. Ainsworth, R. Carré
    Abstract:

    Speech analysis shows that the second Formant transitions in vowel–vowel utterances are not always of the same duration as those of the first Formant transitions nor are they always synchronised. Moreover, the Formant transitions often move initially in a different direction from their final target. In order to investigate whether these deviations from linearity and synchrony are perceptually significant, a series of listening tests has been conducted with the vowel pair /a/–/i/. It was found that delays between the first and second Formant transitions of up to 30 ms are not perceived, nor are differences in duration of up to 40 ms if the first and second Formants start or end simultaneously. If the second Formant transition is symmetric in time with respect to the first Formant, differences of up to 50 ms are tolerated. Excursions in second Formant transition shape of up to about 500 Hz are also not perceived. These results suggest that most of the deviations from linearity and synchrony found in natural vowel–vowel utterances are not perceptually significant.
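
The kind of stimulus described above, a two-Formant vowel transition in which the second Formant can be delayed or lengthened relative to the first, can be sketched as a pair of linear frequency tracks. This is only an illustration under stated assumptions: the transitions are taken to be linear in Hz, and the function names and the /a/ and /i/ Formant values used below are hypothetical, not taken from the study.

```python
def linear_transition(t_ms, start_hz, end_hz, onset_ms, duration_ms):
    """Frequency at time t_ms for a linear glide from start_hz to end_hz.

    The glide begins at onset_ms and lasts duration_ms; the frequency is
    held constant before and after the glide.
    """
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    if t_ms <= onset_ms:
        return start_hz
    if t_ms >= onset_ms + duration_ms:
        return end_hz
    fraction = (t_ms - onset_ms) / duration_ms
    return start_hz + fraction * (end_hz - start_hz)


def two_formant_track(times_ms, f2_delay_ms=0.0, f1_dur_ms=100.0, f2_dur_ms=100.0):
    """Return (F1, F2) pairs for an /a/-to-/i/-like transition.

    The vowel targets below are illustrative placeholders only.
    """
    f1_start, f1_end = 750.0, 300.0    # hypothetical F1 targets (Hz)
    f2_start, f2_end = 1200.0, 2200.0  # hypothetical F2 targets (Hz)
    return [
        (
            linear_transition(t, f1_start, f1_end, 0.0, f1_dur_ms),
            linear_transition(t, f2_start, f2_end, f2_delay_ms, f2_dur_ms),
        )
        for t in times_ms
    ]


# F2 onset delayed by 30 ms relative to F1, sampled every 10 ms.
track = two_formant_track(range(0, 160, 10), f2_delay_ms=30.0)
```

Varying `f2_delay_ms` and `f2_dur_ms` independently corresponds to the asynchrony and duration-difference conditions the abstract mentions.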

James L. Hieronymus - One of the best experts on this subject based on the ideXlab platform.

  • Formant normalisation for speech recognition and vowel studies
    Speech Communication, 1991
    Co-Authors: James L. Hieronymus
    Abstract:

    Vowel Formant target frequencies from different talkers depend on the details of the vocal tract, sex, regional accent, speaking habits and other factors. Good vowel recognition and studies of vowels from different talkers require an accurate method for compensating for speaker differences in these frequencies. The major variance seen in the data is between males and females. However, even within the same sex class, there are large variations in the Formant target frequencies for the same vowel in the same phonetic context. Various methods of compensating for speaker variation in Formants were studied. Bark scaled Formants and subtraction of Bark fundamental frequency from the first Formant was tried first. In spite of recent published papers on the efficacy of this technique, it was found inadequate. The transformations were incapable of improving the clusters of the cardinal vowels, for example. A modification of the Gerstman technique, determining the speaker's Formant range and then transforming into an “ideal” talker's range, was found to account for most of the variance due to different talkers given a small amount of training data. This technique was applied to vowel in context studies on American English. Formant ranges were studied for 125 talkers of General American English. Plots of Formant ranges for males and females showed interesting patterns. The lower limit of the second Formant was not very different, while the lower limit of the first Formant was lower for males. Both the first and second Formant maxima were larger for females. The modified Gerstman transformation was able to superimpose the Formant targets for the same vowel in the same context from different talkers into the same region of F1, F2 space. There remained some residual variance between male and female, even after the transformation. These trends are shown in a series of plots of vowel target frequency data.
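
The range-based normalisation described above (find a speaker's Formant range, then transform it into an "ideal" talker's range) can be sketched as a linear mapping. The abstract does not give the exact transformation, so the linear form, the function names, and every numeric value below are assumptions for illustration only.

```python
def formant_range(values_hz):
    """Return the (minimum, maximum) of a speaker's observed Formant values."""
    return min(values_hz), max(values_hz)


def map_to_ideal_range(f_hz, speaker_range, ideal_range):
    """Linearly map a Formant value from the speaker's range to the ideal range.

    Assumption: a simple linear rescaling, so the speaker's minimum lands on
    the ideal minimum and the speaker's maximum on the ideal maximum.
    """
    lo, hi = speaker_range
    ideal_lo, ideal_hi = ideal_range
    if hi == lo:
        raise ValueError("speaker range must have non-zero width")
    return ideal_lo + (f_hz - lo) * (ideal_hi - ideal_lo) / (hi - lo)


# Hypothetical training data: one speaker's F1 targets (Hz) across vowels.
speaker_f1 = [310.0, 450.0, 620.0, 880.0]
ideal_f1_range = (250.0, 800.0)  # hypothetical "ideal" talker F1 range

f1_range = formant_range(speaker_f1)
normalised_f1 = [map_to_ideal_range(f, f1_range, ideal_f1_range) for f in speaker_f1]
```

The same mapping would be applied separately to each Formant (F1, F2), each with its own speaker and ideal range, before plotting vowels in the normalised F1, F2 space.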