Videoendoscopy

The Experts below are selected from a list of 360 Experts worldwide ranked by ideXlab platform

Dimitar D Deliyski - One of the best experts on this subject based on the ideXlab platform.

  • Method for horizontal calibration of laser-projection transnasal fiberoptic high-speed Videoendoscopy
    Applied Sciences, 2021
    Co-Authors: Hamzeh Ghasemzadeh, Dimitar D Deliyski, Robert E Hillman, Daryush D Mehta
    Abstract:

    Objective Calibrated horizontal measurements (e.g., mm) from endoscopic procedures could be utilized for advancement of evidence-based practice and personalized medicine. However, the size of an object in endoscopic images is not readily calibrated and depends on multiple factors, including the distance between the endoscope and the target surface. Additionally, acquired images may have significant non-linear distortion that would further complicate calibrated measurements. This study used a recently developed in-vivo laser-projection fiberoptic laryngoscope and proposes a method for calibrated spatial measurements. Method A set of circular grids were recorded at multiple working distances. A statistical model was trained that would map from pixel length of the object, the working distance, and the spatial location of the target object into its mm length. Result A detailed analysis of the performance of the proposed method is presented. The analyses have shown that the accuracy of the proposed method does not depend on the working distance and length of the target object. The estimated average magnitude of error was 0.27 mm, which is three times lower than the existing alternative. Conclusion The presented method can achieve sub-millimeter accuracy in horizontal measurement. Significance Evidence-based practice and personalized medicine could significantly benefit from the proposed method. Implications of the findings for other endoscopic procedures are also discussed.
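The mapping described in the abstract above (pixel length, working distance, and image location to millimeter length) can be illustrated with an ordinary least-squares fit. This is only a sketch: the abstract does not specify the statistical model, so the feature set, the synthetic data, and the function names `design_matrix`, `fit_calibration`, and `predict_mm` are assumptions made for illustration.

```python
import numpy as np

def design_matrix(pixel_len, distance_mm, radius_px):
    """Hypothetical features: mm length is assumed to scale with pixel length,
    modulated by working distance and by squared radial position in the image."""
    p = np.asarray(pixel_len, dtype=float)
    d = np.asarray(distance_mm, dtype=float)
    r2 = np.asarray(radius_px, dtype=float) ** 2
    return np.column_stack([p, p * d, p * r2, p * d * r2])

def fit_calibration(pixel_len, distance_mm, radius_px, length_mm):
    """Least-squares fit of the assumed model to calibration-grid measurements."""
    X = design_matrix(pixel_len, distance_mm, radius_px)
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(length_mm, dtype=float), rcond=None)
    return coeffs

def predict_mm(coeffs, pixel_len, distance_mm, radius_px):
    """Convert pixel lengths to millimeters with the fitted coefficients."""
    return design_matrix(pixel_len, distance_mm, radius_px) @ coeffs

# Synthetic stand-in for recordings of a calibration grid (not real data).
rng = np.random.default_rng(0)
n = 500
pixel_len = rng.uniform(10, 200, n)
distance_mm = rng.uniform(5, 25, n)
radius_px = rng.uniform(0, 120, n)
true_mm = 0.002 * pixel_len * distance_mm * (1 + 1e-5 * radius_px**2)
measured_mm = true_mm + rng.normal(0, 0.05, n)  # simulated measurement noise

coeffs = fit_calibration(pixel_len, distance_mm, radius_px, measured_mm)
pred = predict_mm(coeffs, pixel_len, distance_mm, radius_px)
mae = float(np.mean(np.abs(pred - true_mm)))
print(f"mean absolute error on synthetic data: {mae:.3f} mm")
```

The radial term stands in for the non-linear distortion mentioned in the abstract. A real calibration would evaluate the error on held-out grid recordings rather than on the data used for fitting.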

  • Laser-calibrated system for transnasal fiberoptic laryngeal high-speed Videoendoscopy
    Journal of Voice, 2021
    Co-Authors: Dimitar D Deliyski, Alessandro De Alarcon, Daryush D Mehta, Matias Zanartu, Milen Shishkov, Hamzeh Ghasemzadeh, Brett E Bouma, Robert E Hillman
    Abstract:

    The design specifications and experimental characteristics of a newly developed laser-projection transnasal flexible endoscope coupled with a high-speed Videoendoscopy system are provided. The hardware and software design of the proposed system benefits from the combination of structured green light projection and laser triangulation techniques, which provide the capability of calibrated absolute measurements of the laryngeal structures along the horizontal and vertical planes during phonation. Visual inspection of in vivo acquired images demonstrated sharp contrast between laser points and background, confirming successful design of the system. Objective analyses were carried out for assessing the irradiance of the system and the penetration of the green laser light into the red and blue channels in the recorded images. The analysis showed that the system has irradiance of 372 W/m2 at a working distance of 20 mm, which is well within the safety limits, indicating minimal risk of usage of the device on human subjects. Additionally, the color penetration analysis showed that, with probability of 90%, the ratio of contamination of the red channel from the green laser light is less than 0.002. This indicates minimal effect of the laser projection on the measurements performed on the red data channel, making the system applicable for calibrated 3D spatial-temporal segmentation and data-driven subject-specific modeling, which is important for further advancing voice science and clinical voice assessment.

  • Quantitative analysis of vocal fold vibration using high-speed Videoendoscopy in children with and without bilateral lesions
    Journal of Voice, 2020
    Co-Authors: Stephanie Zacharias, Alessandro De Alarcon, Dimitar D Deliyski
    Abstract:

    Summary Objective To provide data on the measurable vocal fold vibratory differences in children with and without vocal fold lesions using high-speed Videoendoscopy. Design Prospective study, 24 participants (8 healthy; 16 with lesions) between the ages of 5 and 10. Methods Rigid high-speed Videoendoscopy at the rate of 8,000 frames per second was used to examine participants. Four objective vocal fold phase linearity measures were obtained to establish anterior-posterior contact and separation vibratory patterns. Results All objective measures showed a difference between nonlesion and bilateral vocal fold lesion groups. Contact-separation patterns in all nonlesion girls and young pre-pubertal boys exhibited an anterior-to-posterior contact and posterior-to-anterior separation; while older boys differed. The objective measures of open quotient, left-right relative phase asymmetry and speed index, showed linear anterior-posterior patterns within the nonlesion group; while the bilateral vocal fold lesion group displayed nonlinear patterns. Patterns in the posterior region of the vocal fold were similar in both groups; while patterns in the anterior region differed. Conclusions This study suggests lesions have an effect on the anterior aspect of vocal fold vibratory patterns specifically anterior to the lesions. Age-related differences for males are also evidenced, prompting further investigation of laryngeal development in males and females from childhood to adulthood. This study could serve as a basis for the development of objective clinical measurements of vocal fold vibration in presence of lesions. Further findings could help redefine the theoretical framework of pediatric voice.

  • Method for vertical calibration of laser-projection transnasal fiberoptic high-speed Videoendoscopy
    Journal of Voice, 2019
    Co-Authors: Hamzeh Ghasemzadeh, Dimitar D Deliyski, Robert E Hillman, David S Ford, James B Kobler, Daryush D Mehta
    Abstract:

    Summary The ability to provide absolute calibrated measurement of the laryngeal structures during phonation is of paramount importance to voice science and clinical practice. Calibrated three-dimensional measurement could provide essential information for modeling purposes, for studying the developmental aspects of vocal fold vibration, for refining functional voice assessment and treatment outcomes evaluation, and for more accurate staging and grading of laryngeal disease. Recently, a laser-calibrated transnasal fiberoptic endoscope compatible with high-speed Videoendoscopy (HSV) and capable of providing three-dimensional measurements was developed. The optical principle employed is to project a grid of 7 × 7 green laser points across the field of view (FOV) at an angle relative to the imaging axis, such that (after calibration) the position of each laser point within the FOV encodes the vertical distance from the tip of the endoscope to the laryngeal tissues. The purpose of this study was to develop a precise method for vertical calibration of the endoscope. Investigating the position of the laser points showed that, besides the vertical distance, they also depend on the parameters of the lens coupler, including the FOV position within the image frame and the rotation angle of the endoscope. The presented automatic calibration method was developed to compensate for the effect of these parameters. Statistical image processing and pattern recognition were used to detect the FOV, the center of FOV, and the fiducial marker. This step normalizes the HSV frames to a standard coordinate system and removes the dependence of the laser-point positions on the parameters of the lens coupler. Then, using a statistical learning technique, a calibration protocol was developed to model the trajectories of all laser points as the working distance was varied. Finally, a set of experiments was conducted to measure the accuracy and reliability of every step of the procedure. 
    The system was able to measure absolute vertical distance with mean percent error in the range of 1.7% to 4.7%, depending on the working distance.
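The optical principle in the abstract above, where the image position of a laser point encodes the working distance, can be sketched for a single laser point. The abstract says only that a statistical learning technique models the laser-point trajectories; the triangulation-style model `x = a + b / d`, the synthetic numbers, and the function names below are assumptions for illustration.

```python
import numpy as np

def fit_point_trajectory(x_px, distance_mm):
    """Fit x = a + b / d for one laser point (assumed triangulation-like model)."""
    inv_d = 1.0 / np.asarray(distance_mm, dtype=float)
    # np.polyfit returns the highest-degree coefficient first: slope b, then intercept a.
    b, a = np.polyfit(inv_d, np.asarray(x_px, dtype=float), deg=1)
    return a, b

def estimate_distance(x_px, a, b):
    """Invert the fitted model to recover the working distance in mm."""
    return b / (np.asarray(x_px, dtype=float) - a)

def mean_percent_error(estimated, reference):
    """Mean absolute error relative to the reference, in percent."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(estimated - reference) / reference) * 100)

# Synthetic calibration sweep over known working distances (not real data).
rng = np.random.default_rng(1)
cal_d = np.linspace(5, 25, 21)
cal_x = 300.0 - 2000.0 / cal_d + rng.normal(0, 0.2, cal_d.size)
a, b = fit_point_trajectory(cal_x, cal_d)

# Evaluate on distances that were not part of the calibration sweep.
test_d = np.array([7.5, 12.5, 17.5, 22.5])
test_x = 300.0 - 2000.0 / test_d
mpe = mean_percent_error(estimate_distance(test_x, a, b), test_d)
print(f"mean percent error on synthetic data: {mpe:.2f}%")
```

In the described system, a fit of this kind would be repeated for each point of the 7 × 7 grid, after the frames have been normalized to a standard coordinate system.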

  • Studying vocal fold non-stationary behavior during connected speech using high-speed Videoendoscopy
    Journal of the Acoustical Society of America, 2018
    Co-Authors: Maryam Naghibolhosseini, Dimitar D Deliyski, Alessandro De Alarcon, Stephanie R C Zacharias, Robert F Orlikoff
    Abstract:

    Studying voice production during running speech can provide new knowledge about the mechanisms of voice production with and without disorder. Laryngeal high-speed Videoendoscopy (HSV) systems are powerful tools for studying laryngeal function and, if coupled with flexible fiberoptic endoscopes, they can provide unique possibilities to measure vocal fold vibration with high temporal resolution during connected speech. Hence, we can measure the non-stationary behaviors of the vocal folds, such as the glottal attack and offset times in running speech. In this study, a custom-built flexible fiberoptic HSV system was used to record a “Rainbow Passage” production from a vocally normal female. Automated temporal and spatial segmentation algorithms were developed to determine the time stamps of the vibrating vocal folds and the edges of the vocal folds during phonation. The glottal attack time and offset times were then measured from the temporally and spatially segmented HSV images. The amplification ratio was computed during the phonation onset and the damping ratio was calculated at the offset of sustained portion of phonation. These measures can be used to describe the laryngeal mechanisms of voice production in connected speech.

Melda Kunduk - One of the best experts on this subject based on the ideXlab platform.

  • Interdependencies between acoustic and high-speed Videoendoscopy parameters
    PLOS ONE, 2021
    Co-Authors: Patrick Schlegel, Melda Kunduk, Michael Dollinger, Andreas M Kist, Stephan Durr, Anne Schutzenberger
    Abstract:

    In voice research, uncovering relations between the oscillating vocal folds, being the sound source of phonation, and the resulting perceived acoustic signal is of great interest. This is especially the case in the context of voice disorders, such as functional dysphonia (FD). We investigated 250 high-speed Videoendoscopy (HSV) recordings with simultaneously recorded acoustic signals (124 healthy females, 60 FD females, 44 healthy males, 22 FD males). 35 glottal area waveform (GAW) parameters and 14 acoustic parameters were calculated for each recording. Linear and non-linear relations between GAW and acoustic parameters were investigated using Pearson correlation coefficients (PCC) and distance correlation coefficients (DCC). Further, norm values for parameters obtained from 250 ms long sustained phonation data (vowel /i/) were provided. 26 PCCs in females (5.3%) and 8 in males (1.6%) were found to be statistically significant (|corr.| ≥ 0.3). Only minor differences were found between PCCs and DCCs, indicating presence of weak non-linear dependencies between parameters. Fundamental frequency was involved in the majority of all relevant PCCs between GAW and acoustic parameters (19 in females and 7 in males). The most distinct difference between correlations in females and males was found for the parameter Period Variability Index. The study shows only weak relations between investigated acoustic and GAW parameters. This indicates that the reduction of the complex 3D glottal dynamics to the 1D GAW may erase laryngeal dynamic characteristics that are reflected within the acoustic signal. Hence, other GAW parameters, 2D and 3D laryngeal dynamics, and vocal tract parameters should be further investigated towards potential correlations to the acoustic signal.
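The two measures named in the abstract above differ in what they detect: the Pearson coefficient captures linear dependence only, while the distance correlation also responds to non-linear dependence. A minimal NumPy sketch follows, using the standard sample definition of distance correlation (double-centered pairwise distance matrices). The function names and test signals are illustrative and not taken from the study.

```python
import numpy as np

def pearson_corr(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])

def _double_centered_distances(v):
    """Pairwise absolute distances with row, column, and grand means removed."""
    v = np.asarray(v, dtype=float)
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_corr(x, y):
    """Sample distance correlation between two 1-D samples of equal length."""
    A = _double_centered_distances(x)
    B = _double_centered_distances(y)
    dcov2 = max(float((A * B).mean()), 0.0)  # guard against tiny negative round-off
    denom = float(np.sqrt((A * A).mean() * (B * B).mean()))
    return float(np.sqrt(dcov2 / denom)) if denom > 0 else 0.0

# A purely quadratic relation: no linear trend, but clear dependence.
x = np.linspace(-1.0, 1.0, 201)
y = x**2
print("Pearson: ", round(pearson_corr(x, y), 3))
print("Distance:", round(distance_corr(x, y), 3))
```

When the two coefficients are close for a parameter pair, as the abstract reports for most pairs, there is little dependence beyond what the linear measure already captures.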

  • BAGLS, a multihospital benchmark for automatic glottis segmentation
    Scientific Data, 2020
    Co-Authors: Pablo Gomez, Andreas M Kist, Patrick Schlegel, David A Berry, Dinesh K Chhetri, Stephan Durr, Matthias Echternach, Aaron M Johnson, Stefan Kniesburges, Melda Kunduk
    Abstract:

    Laryngeal Videoendoscopy is one of the main tools in clinical examinations for voice disorders and voice research. Using high-speed Videoendoscopy, it is possible to fully capture the vocal fold oscillations, however, processing the recordings typically involves a time-consuming segmentation of the glottal area by trained experts. Even though automatic methods have been proposed and the task is particularly suited for deep learning methods, there are no public datasets and benchmarks available to compare methods and to allow training of generalizing deep learning models. In an international collaboration of researchers from seven institutions from the EU and USA, we have created BAGLS, a large, multihospital dataset of 59,250 high-speed Videoendoscopy frames with individually annotated segmentation masks. The frames are based on 640 recordings of healthy and disordered subjects that were recorded with varying technical equipment by numerous clinicians. The BAGLS dataset will allow an objective comparison of glottis segmentation methods and will enable interested researchers to train their own models and compare their methods.

  • Influence of spatial camera resolution in high-speed Videoendoscopy on laryngeal parameters
    2019
    Co-Authors: Patrick Schlegel, Melda Kunduk, Michael Dollinger, Marion Semmler, Christopher Bohr, Michael Stingl, Anne Schutzenberger
    Abstract:

    In laryngeal high-speed Videoendoscopy (HSV), the area between the vibrating vocal folds during phonation is of interest, being referred to as glottal area waveform (GAW). Varying camera resolution may influence parameters computed on the GAW and hence hinder the comparability between examinations. This study investigates the influence of spatial camera resolution on quantitative vocal fold vibratory function parameters obtained from the GAW. In total, 40 HSV recordings during sustained phonation (20 healthy males and 20 healthy females) were investigated. A clinically used Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512×256 pixels was applied. This initial resolution was reduced by pixel averaging to (1) a resolution of 256×128 pixels and (2) a resolution of 128×64 pixels, yielding three sets of recordings. The GAW was extracted and, in total, 50 vocal fold vibratory parameters representing different features of the GAW were computed. Statistical analysis was performed using SPSS Statistics, version 21. Fifteen parameters showing strong mathematical dependencies with other parameters were excluded from the main analysis but are given in the Supporting Information. Data analysis revealed a clear influence of spatial resolution on GAW parameters. Fundamental period measures and period perturbation measures were the least affected. Amplitude perturbation measures and mechanical measures were most strongly influenced. Most glottal dynamic characteristics and symmetry measures deviated significantly. Most energy perturbation measures changed significantly in males but were mostly unaffected in females. In females, 18 of the 35 remaining parameters (51%) and in males, 22 parameters (63%) changed significantly between spatial resolutions. This work represents the first step in studying the impact of video resolution on quantitative HSV parameters. Clear influences of spatial camera resolution on computed parameters were found. The study results suggest avoiding the use of the most strongly affected parameters. Further, the use of cameras with high resolution is recommended to analyze GAW measures in HSV data.
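The resolution reduction by pixel averaging described in the abstract above can be sketched as block averaging over non-overlapping 2 × 2 neighborhoods. The array orientation (512 rows by 256 columns), the random stand-in frame, and the function name are assumptions, since the abstract does not give implementation details.

```python
import numpy as np

def downsample_by_averaging(frame, factor=2):
    """Reduce resolution by averaging non-overlapping factor x factor pixel blocks."""
    frame = np.asarray(frame, dtype=float)
    h, w = frame.shape
    if h % factor or w % factor:
        raise ValueError("frame dimensions must be divisible by the factor")
    # Split each axis into (blocks, factor) and average over the two factor axes.
    return frame.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Stand-in grayscale frame at the original resolution (random values, not real data).
rng = np.random.default_rng(0)
full = rng.uniform(0, 255, size=(512, 256))
half = downsample_by_averaging(full)     # 256 x 128
quarter = downsample_by_averaging(half)  # 128 x 64
print(full.shape, half.shape, quarter.shape)
```

Because every block holds the same number of pixels, the mean intensity of a frame is preserved at each step, while fine glottal edge detail is progressively smoothed away.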

  • Influence of analyzed sequence length on parameters in laryngeal high-speed Videoendoscopy
    Applied Sciences, 2018
    Co-Authors: Patrick Schlegel, Melda Kunduk, Michael Dollinger, Marion Semmler, Christopher Bohr, Anne Schutzenberger
    Abstract:

    Laryngeal high-speed Videoendoscopy (HSV) allows objective quantification of vocal fold vibratory characteristics. However, it is unknown how the analyzed sequence length affects some of the computed parameters. To examine if varying sequence lengths influence parameter calculation, 20 HSV recordings of healthy females during sustained phonation were investigated. The clinically prevalent Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512 × 256 pixels was used to collect HSV data. The glottal area waveform (GAW), describing the increase and decrease of the area between the vocal folds during phonation, was extracted. Based on the GAW, 16 perturbation parameters were computed for sequences of 5, 10, 20, 50 and 100 consecutive cycles. Statistical analysis was performed using SPSS Statistics, version 21. Only three parameters (18.8%) were statistically significantly influenced by changing sequence lengths. Of these parameters, one changed until 10 cycles were reached, one until 20 cycles were reached and one, namely Amplitude Variability Index (AVI), changed between almost all groups of different sequence lengths. Moreover, visually observable, but not statistically significant, changes within parameters were observed. These changes were often most prominent between shorter sequence lengths. Hence, we suggest using a minimum sequence length of 20 cycles and discarding the parameter AVI.
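The sequence-length comparison in the abstract above can be illustrated by evaluating one perturbation measure on the first 5, 10, 20, 50 and 100 cycles of a recording. The sketch uses local jitter (mean absolute difference between consecutive periods, relative to the mean period) as a generic stand-in. It is not claimed to be one of the study's 16 parameters, and the synthetic period data are an assumption.

```python
import numpy as np

def jitter_percent(periods):
    """Local jitter: mean absolute consecutive period difference over mean period, in %."""
    periods = np.asarray(periods, dtype=float)
    if periods.size < 2:
        raise ValueError("need at least two cycles")
    return float(np.mean(np.abs(np.diff(periods))) / np.mean(periods) * 100)

def jitter_by_sequence_length(periods, lengths=(5, 10, 20, 50, 100)):
    """Evaluate the measure on the first n cycles for each requested length."""
    return {n: jitter_percent(periods[:n]) for n in lengths if n <= len(periods)}

# Synthetic cycle durations in frames: about 16 frames per cycle, i.e. roughly
# 250 Hz at 4000 fps, with small random cycle-to-cycle variation (not real data).
rng = np.random.default_rng(0)
periods = 16.0 + rng.normal(0, 0.2, size=120)
for n, value in jitter_by_sequence_length(periods).items():
    print(f"{n:>3} cycles: jitter = {value:.2f}%")
```

Estimates from only a few cycles rest on very few consecutive differences, which is one plausible reason why the changes reported in the abstract were most prominent between the shorter sequence lengths.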

  • Effects of volume, pitch, and phonation type on oscillation initiation and termination phases investigated with high-speed Videoendoscopy
    Journal of Voice, 2017
    Co-Authors: Melda Kunduk, Takeshi Ikuma, David C Blouin, Andrew J Mcwhorter
    Abstract:

    Summary Objectives This study aimed to investigate the effects of varying volume, pitch, and phonation types on the initiation and termination phases of vocal fold oscillation using high-speed digital Videoendoscopy. Specifically, it addressed the effects of the variation of volume, pitch, and phonation type (normal, pressed, and breathy) on the transient duration of the vibrating glottal length (length transient duration, T_len), the transient duration of the glottal area waveform (area transient duration, T_area), the time offset between the beginning (or the end) of the full-length vibration and the full-amplitude vibration, T_Δ, and the variation of the fundamental frequency during the vocal fold oscillation initiation and termination segments (pitch instability, %PI). Methods A female subject with no voice problem produced voices with varying pitch and loudness, including comfortable pitch and comfortable loudness, normal pitch loud, high pitch and comfortable loudness, and high pitch and loud. Breathy and pressed phonations were also recorded. Each of the six phonation types was recorded six times, which resulted in 72 transient segments (each recording included both initiation and termination phases). Mixed model statistical analyses were applied to the five objective high-speed digital Videoendoscopy parameters. Results Preliminary results showed significant voice type effects on the length and area transient durations for the oscillation initiation segment but not for the oscillation termination segment. Conclusions This study demonstrates that voice types appear to influence vibration initiation patterns more than the vibration termination patterns.

Andrew J Mcwhorter - One of the best experts on this subject based on the ideXlab platform.

  • Effects of volume, pitch, and phonation type on oscillation initiation and termination phases investigated with high-speed Videoendoscopy
    Journal of Voice, 2017
    Co-Authors: Melda Kunduk, Takeshi Ikuma, David C Blouin, Andrew J Mcwhorter
    Abstract:

    Summary Objectives This study aimed to investigate the effects of varying volume, pitch, and phonation types on the initiation and termination phases of vocal fold oscillation using high-speed digital Videoendoscopy. Specifically, it addressed the effects of the variation of volume, pitch, and phonation type (normal, pressed, and breathy) on the transient duration of the vibrating glottal length (length transient duration, T_len), the transient duration of the glottal area waveform (area transient duration, T_area), the time offset between the beginning (or the end) of the full-length vibration and the full-amplitude vibration, T_Δ, and the variation of the fundamental frequency during the vocal fold oscillation initiation and termination segments (pitch instability, %PI). Methods A female subject with no voice problem produced voices with varying pitch and loudness, including comfortable pitch and comfortable loudness, normal pitch loud, high pitch and comfortable loudness, and high pitch and loud. Breathy and pressed phonations were also recorded. Each of the six phonation types was recorded six times, which resulted in 72 transient segments (each recording included both initiation and termination phases). Mixed model statistical analyses were applied to the five objective high-speed digital Videoendoscopy parameters. Results Preliminary results showed significant voice type effects on the length and area transient durations for the oscillation initiation segment but not for the oscillation termination segment. Conclusions This study demonstrates that voice types appear to influence vibration initiation patterns more than the vibration termination patterns.

  • A spatiotemporal approach to the objective analysis of initiation and termination of vocal fold oscillation with high-speed Videoendoscopy
    Journal of Voice, 2016
    Co-Authors: Takeshi Ikuma, Melda Kunduk, Andrew J Mcwhorter, Daniel Fink
    Abstract:

    Summary High-speed Videoendoscopy excels in the ability to observe the vocal-fold oscillatory patterns during voice initiation and termination. The initial and most critical step in the analysis of these transient regions is to identify the locations of these transient periods, that is, determining when the vocal-fold oscillation is absent and when the oscillation has reached its steady-state behavior. The latter is more challenging as the "steady" oscillation during sustained phonation is not truly steady and is expected to vary over time. This variation may cause unreliable identification of the transient periods, possibly resulting in less accurate or less reliable parameter measurements. An oscillation feature that is relatively consistent in the steady state is the glottal length, that is, the extent of the oscillation along vocal folds. This paper proposes an autonomous algorithm to estimate the vocal-fold oscillation length and its use to detect four transient events: oscillation onset and offset, and attainment and loss of full-length oscillation. The detected event markers are intended to be used to improve the transient parameter measurements. The autonomous algorithm manipulates the set of glottal width waveforms spatiotemporally to estimate the oscillation length. Examples with in vivo high-speed Videoendoscopy recordings of both normal and pathological cases are included to show the efficacy of the proposed algorithm to identify the transient markers.

  • Objective quantification of pre- and postphonosurgery vocal fold vibratory characteristics using high-speed Videoendoscopy and a harmonic waveform model
    Journal of Speech Language and Hearing Research, 2014
    Co-Authors: Takeshi Ikuma, Melda Kunduk, Andrew J Mcwhorter
    Abstract:

    Purpose The model-based quantitative analysis of high-speed Videoendoscopy (HSV) data at a low frame rate of 2,000 frames per second was assessed for its clinical adequacy. Stepwise regression was ...

  • Preprocessing techniques for high-speed Videoendoscopy analysis
    Journal of Voice, 2013
    Co-Authors: Takeshi Ikuma, Melda Kunduk, Andrew J Mcwhorter
    Abstract:

    Summary One of the critical requirements for high-speed Videoendoscopy (HSV) to become a clinically useful tool is to pair it with a technique, which provides a quick overview of the vast amount of HSV data and rapidly identifies the best video segments for subjective and objective analyses. This article proposes intensity-based representations that are easily computed from the HSV data and can be used to identify the HSV features quickly. The first representation—termed the Quick Vibratory Profile (QVP)—is an HSV-based one-dimensional waveform that captures the vocal fold vibration as well as nonglottic activities. The QVP can be used in a wide range of experimental and clinical studies to select appropriate HSV recording segments quickly without extensive review of the actual video frames. Moreover, this article proposes a pair of spatial profiles to locate the vibrating vocal folds within the HSV frames. These profiles are useful in automation of objective assessments as their use together with the QVP are demonstrated in a proposed cyclewise three-dimensional glottal area segmentation. The article illustrates the usefulness of these proposed representations with examples.

  • Advanced waveform decomposition for high-speed Videoendoscopy analysis
    Journal of Voice, 2013
    Co-Authors: Takeshi Ikuma, Melda Kunduk, Andrew J Mcwhorter
    Abstract:

    This article presents a novel approach to analyze nonperiodic vocal fold behavior of high-speed Videoendoscopy (HSV) data. Although HSV can capture true vibrational motions of the vocal folds, its clinical advantage over the videostroboscopy has not widely been accepted. One of the key advantages of the HSV over the videostroboscopy is its ability to capture vocal folds' nonperiodic behavior, which is more prominent in pathological vocal folds. However, such nonperiodicity in the HSV data has not been fully explored quantitatively beyond simple perturbation analysis. This article presents an advanced waveform modeling and decomposition technique for HSV-based waveforms. Waveforms are modeled to have three components: harmonic signal, deterministic nonharmonic signal, and random nonharmonic signal. This decomposition is motivated by the fact that voice disorders introduce signal content that is nonharmonic but carries deterministic quality such as subharmonic or modulating content. The proposed model is aimed to isolate such disordered behaviors as deterministic nonharmonic signal and quantify them. In addition to the model, the article outlines model parameter estimation procedures and a family of harmonics-to-noise ratio (HNR) parameters. The proposed HNR parameters include harmonics-to-deterministic-noise ratio (HDNR) and harmonics-to-random-noise ratio. A preliminary study demonstrates the effectiveness of the extended model and its HNR parameters. Vocal folds with and without benign lesions (Nwith = 13; Nwithout = 20) were studied with HSV glottal area waveforms. All three HNR parameters significantly distinguished the disordered condition, and the HDNR reported the largest effect size (Cohen's d = 2.04).
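The harmonics-to-noise idea in the abstract above can be illustrated with a deliberately simplified two-component version: a harmonic series at a known fundamental frequency is fitted by least squares, and everything left over is treated as noise. The article's model additionally splits the nonharmonic part into deterministic and random components, which this sketch does not attempt. The known `f0`, the number of harmonics, and the synthetic waveform are assumptions.

```python
import numpy as np

def harmonic_fit(signal, fs, f0, n_harmonics=5):
    """Least-squares fit of a DC term plus n_harmonics sinusoid pairs at multiples of f0.

    Returns the fitted harmonic component and the residual (signal minus fit).
    """
    signal = np.asarray(signal, dtype=float)
    t = np.arange(signal.size) / fs
    columns = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        columns.append(np.cos(2 * np.pi * k * f0 * t))
        columns.append(np.sin(2 * np.pi * k * f0 * t))
    X = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(X, signal, rcond=None)
    harmonic = X @ coeffs
    return harmonic, signal - harmonic

def hnr_db(harmonic, residual):
    """Ratio of harmonic (AC) power to residual power, in decibels."""
    p_harmonic = np.mean((harmonic - np.mean(harmonic)) ** 2)
    p_residual = np.mean(np.asarray(residual, dtype=float) ** 2)
    return float(10 * np.log10(p_harmonic / p_residual))

# Synthetic waveform sampled at 4000 Hz: two harmonics of 200 Hz plus white noise.
rng = np.random.default_rng(0)
fs, f0 = 4000.0, 200.0
t = np.arange(1000) / fs
clean = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * f0 * t)
noisy = clean + rng.normal(0, 0.1, t.size)

harmonic, residual = harmonic_fit(noisy, fs, f0)
print(f"HNR of synthetic waveform: {hnr_db(harmonic, residual):.1f} dB")
```

In this simplified form, a subharmonic or modulating component would be lumped into the residual together with random noise. The separate harmonics-to-deterministic-noise ratio described in the abstract is intended to avoid exactly that.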

Daryush D Mehta - One of the best experts on this subject based on the ideXlab platform.

  • Method for horizontal calibration of laser-projection transnasal fiberoptic high-speed Videoendoscopy
    Applied Sciences, 2021
    Co-Authors: Hamzeh Ghasemzadeh, Dimitar D Deliyski, Robert E Hillman, Daryush D Mehta
    Abstract:

    Objective Calibrated horizontal measurements (e.g., mm) from endoscopic procedures could be utilized for advancement of evidence-based practice and personalized medicine. However, the size of an object in endoscopic images is not readily calibrated and depends on multiple factors, including the distance between the endoscope and the target surface. Additionally, acquired images may have significant non-linear distortion that would further complicate calibrated measurements. This study used a recently developed in-vivo laser-projection fiberoptic laryngoscope and proposes a method for calibrated spatial measurements. Method A set of circular grids were recorded at multiple working distances. A statistical model was trained that would map from pixel length of the object, the working distance, and the spatial location of the target object into its mm length. Result A detailed analysis of the performance of the proposed method is presented. The analyses have shown that the accuracy of the proposed method does not depend on the working distance and length of the target object. The estimated average magnitude of error was 0.27 mm, which is three times lower than the existing alternative. Conclusion The presented method can achieve sub-millimeter accuracy in horizontal measurement. Significance Evidence-based practice and personalized medicine could significantly benefit from the proposed method. Implications of the findings for other endoscopic procedures are also discussed.

  • Laser-calibrated system for transnasal fiberoptic laryngeal high-speed Videoendoscopy
    Journal of Voice, 2021
    Co-Authors: Dimitar D Deliyski, Alessandro De Alarcon, Daryush D Mehta, Matias Zanartu, Milen Shishkov, Hamzeh Ghasemzadeh, Brett E Bouma, Robert E Hillman
    Abstract:

    The design specifications and experimental characteristics of a newly developed laser-projection transnasal flexible endoscope coupled with a high-speed Videoendoscopy system are provided. The hardware and software design of the proposed system benefits from the combination of structured green light projection and laser triangulation techniques, which provide the capability of calibrated absolute measurements of the laryngeal structures along the horizontal and vertical planes during phonation. Visual inspection of in vivo acquired images demonstrated sharp contrast between laser points and background, confirming successful design of the system. Objective analyses were carried out for assessing the irradiance of the system and the penetration of the green laser light into the red and blue channels in the recorded images. The analysis showed that the system has irradiance of 372 W/m2 at a working distance of 20 mm, which is well within the safety limits, indicating minimal risk of usage of the device on human subjects. Additionally, the color penetration analysis showed that, with probability of 90%, the ratio of contamination of the red channel from the green laser light is less than 0.002. This indicates minimal effect of the laser projection on the measurements performed on the red data channel, making the system applicable for calibrated 3D spatial-temporal segmentation and data-driven subject-specific modeling, which is important for further advancing voice science and clinical voice assessment.

  • Bayesian estimation of vocal function measures using laryngeal high-speed Videoendoscopy and glottal airflow estimates: an in vivo case study
    Journal of the Acoustical Society of America, 2020
    Co-Authors: Gabriel A Alzamendi, Daryush D Mehta, Robert E Hillman, Rodrigo Manriquez, Paul J Hadwin, Jonathan J Deng, Sean D Peterson, Byron D Erath, Matias Zanartu
    Abstract:

    This study introduces the in vivo application of a Bayesian framework to estimate subglottal pressure, laryngeal muscle activation, and vocal fold contact pressure from calibrated transnasal high-speed Videoendoscopy and oral airflow data. A subject-specific, lumped-element vocal fold model is estimated using an extended Kalman filter and two observation models involving glottal area and glottal airflow. Model-based inferences using data from a vocally healthy male individual are compared with empirical estimates of subglottal pressure and reference values for muscle activation and contact pressure in the literature, thus providing baseline error metrics for future clinical investigations.

  • method for vertical calibration of laser projection transnasal fiberoptic high speed Videoendoscopy
    Journal of Voice, 2019
    Co-Authors: Hamzeh Ghasemzadeh, Dimitar D Deliyski, Robert E Hillman, David S Ford, James B Kobler, Daryush D Mehta
    Abstract:

    Summary The ability to provide absolute calibrated measurement of the laryngeal structures during phonation is of paramount importance to voice science and clinical practice. Calibrated three-dimensional measurement could provide essential information for modeling purposes, for studying the developmental aspects of vocal fold vibration, for refining functional voice assessment and treatment outcomes evaluation, and for more accurate staging and grading of laryngeal disease. Recently, a laser-calibrated transnasal fiberoptic endoscope compatible with high-speed Videoendoscopy (HSV) and capable of providing three-dimensional measurements was developed. The optical principle employed is to project a grid of 7 × 7 green laser points across the field of view (FOV) at an angle relative to the imaging axis, such that (after calibration) the position of each laser point within the FOV encodes the vertical distance from the tip of the endoscope to the laryngeal tissues. The purpose of this study was to develop a precise method for vertical calibration of the endoscope. Investigating the position of the laser points showed that, besides the vertical distance, they also depend on the parameters of the lens coupler, including the FOV position within the image frame and the rotation angle of the endoscope. The presented automatic calibration method was developed to compensate for the effect of these parameters. Statistical image processing and pattern recognition were used to detect the FOV, the center of FOV, and the fiducial marker. This step normalizes the HSV frames to a standard coordinate system and removes the dependence of the laser-point positions on the parameters of the lens coupler. Then, using a statistical learning technique, a calibration protocol was developed to model the trajectories of all laser points as the working distance was varied. Finally, a set of experiments was conducted to measure the accuracy and reliability of every step of the procedure. 
    The system was able to measure absolute vertical distance with mean percent error in the range of 1.7% to 4.7%, depending on the working distance.
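
The idea that the position of a laser point encodes working distance can be sketched as a lookup against a calibration trajectory. The example below is a simplified stand-in, not the statistical learning technique used in the study: it assumes a single laser point whose normalized pixel position changes monotonically with distance, and it interpolates linearly between calibration samples. The function name and the calibration values are hypothetical.

```python
from bisect import bisect_left


def distance_from_position(position, calib):
    """Interpolate working distance (mm) from a laser-point pixel position.

    calib: list of (pixel_position, distance_mm), sorted by pixel_position.
    """
    xs = [p for p, _ in calib]
    if not xs[0] <= position <= xs[-1]:
        raise ValueError("position outside the calibrated range")
    i = bisect_left(xs, position)
    if xs[i] == position:
        return calib[i][1]
    (x0, d0), (x1, d1) = calib[i - 1], calib[i]
    return d0 + (d1 - d0) * (position - x0) / (x1 - x0)


# Made-up trajectory of one laser point (illustrative values only).
trajectory = [(100, 5.0), (140, 10.0), (165, 15.0), (180, 20.0)]
print(distance_from_position(120, trajectory))
```

In practice each of the 49 laser points would have its own trajectory, and positions would first be normalized for the FOV offset and endoscope rotation described in the abstract.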

  • comparison of videostroboscopy to stroboscopy derived from high speed Videoendoscopy for evaluating patients with vocal fold mass lesions
    American Journal of Speech-language Pathology, 2016
    Co-Authors: Dimitar D Deliyski, Daryush D Mehta, Robert E Hillman, Maria E Powell, Steven M Zeitels, James A Burns
    Abstract:

    Purpose Videostroboscopy (VS) uses an indirect physiological signal to predict the phase of the vocal fold vibratory cycle for sampling. Simulated stroboscopy (SS) extracts the phase of the glottal cycle directly from the changing glottal area in the high-speed Videoendoscopy (HSV) image sequence. The purpose of this study is to determine the reliability of SS relative to VS for clinical assessment of vocal fold vibratory function in patients with mass lesions. Methods VS and SS recordings were obtained from 28 patients with vocal fold mass lesions before and after phonomicrosurgery and 17 controls who were vocally healthy. Two clinicians rated clinically relevant vocal fold vibratory features using both imaging techniques, indicated their internal level of confidence in the accuracy of their ratings, and provided reasons for low or no confidence. Results SS had fewer asynchronous image sequences than VS. Vibratory outcomes were able to be computed for more patients using SS. In addition, raters demonstra...
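
The frame selection behind simulated stroboscopy can be sketched as follows. The sketch assumes the start frame of each glottal cycle has already been detected from the glottal area in the HSV sequence, and it picks one frame per cycle at a phase that advances slightly from cycle to cycle. The function name and the `cycles_per_sweep` parameter are hypothetical and do not describe the software used in the study.

```python
def simulated_strobe_frames(cycle_starts, cycles_per_sweep=10):
    """Pick one HSV frame per glottal cycle at a slowly advancing phase.

    cycle_starts: frame indices at which consecutive glottal cycles begin.
    cycles_per_sweep: number of real cycles used to show one slow-motion cycle.
    """
    frames = []
    for k in range(len(cycle_starts) - 1):
        start, end = cycle_starts[k], cycle_starts[k + 1]
        step = k % cycles_per_sweep  # phase index within the current sweep
        frames.append(start + step * (end - start) // cycles_per_sweep)
    return frames


# Four cycles of 20 frames each, swept over 4 cycles.
print(simulated_strobe_frames([0, 20, 40, 60, 80], cycles_per_sweep=4))
```

Because the phase comes from the image data itself, the selected frames stay in step with the actual vibration even when the period changes between cycles.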

Michael Dollinger - One of the best experts on this subject based on the ideXlab platform.

  • openhsv an open platform for laryngeal high speed Videoendoscopy
    Scientific Reports, 2021
    Co-Authors: Andreas M Kist, Anne Schutzenberger, Stephan Durr, Michael Dollinger
    Abstract:

    High-speed Videoendoscopy is an important tool to study laryngeal dynamics, to quantify vocal fold oscillations, to diagnose voice impairments at laryngeal level and to monitor treatment progress. However, there is a significant lack of an open source, expandable research tool that features the latest hardware and data analysis. In this work, we propose an open research platform termed OpenHSV that is based on state-of-the-art, commercially available equipment and features a fully automatic data analysis pipeline. A publicly available, user-friendly graphical user interface implemented in Python is used to interface the hardware. Video and audio data are recorded in synchrony and are subsequently fully automatically analyzed. Video segmentation of the glottal area is performed using efficient deep neural networks to derive glottal area waveform and glottal midline. Established quantitative, clinically relevant video and audio parameters were implemented and computed. In a preliminary clinical study, we recorded video and audio data from 28 healthy subjects. Analyzing these data in terms of image quality and derived quantitative parameters, we show the applicability, performance and usefulness of OpenHSV. Therefore, OpenHSV provides a valid, standardized access to high-speed Videoendoscopy data acquisition and analysis for voice scientists, highlighting its use as a valuable research tool in understanding voice physiology. We envision that OpenHSV will serve as a basis for the next generation of clinical HSV systems.
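
Once each frame has been segmented, the glottal area waveform is in its simplest form the number of glottis pixels per frame. The sketch below assumes the segmentation masks are already available as nested lists of 0 and 1; the function name and the toy frames are hypothetical and are not part of the OpenHSV code base.

```python
def glottal_area_waveform(masks):
    """Return the number of glottis pixels (value 1) in each binary mask."""
    return [sum(sum(row) for row in mask) for mask in masks]


# Three toy 3x4 frames: closed, partly open, and fully open glottis.
frames = [
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0]],
    [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]],
]
print(glottal_area_waveform(frames))  # [0, 4, 6]
```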

  • interdependencies between acoustic and high speed Videoendoscopy parameters
    PLOS ONE, 2021
    Co-Authors: Patrick Schlegel, Melda Kunduk, Michael Dollinger, Andreas M Kist, Stephan Durr, Anne Schutzenberger
    Abstract:

    In voice research, uncovering relations between the oscillating vocal folds, being the sound source of phonation, and the resulting perceived acoustic signal is of great interest. This is especially the case in the context of voice disorders, such as functional dysphonia (FD). We investigated 250 high-speed Videoendoscopy (HSV) recordings with simultaneously recorded acoustic signals (124 healthy females, 60 FD females, 44 healthy males, 22 FD males). 35 glottal area waveform (GAW) parameters and 14 acoustic parameters were calculated for each recording. Linear and non-linear relations between GAW and acoustic parameters were investigated using Pearson correlation coefficients (PCC) and distance correlation coefficients (DCC). Further, norm values for parameters obtained from 250 ms long sustained phonation data (vowel /i/) were provided. 26 PCCs in females (5.3%) and 8 in males (1.6%) were found to be statistically significant (|corr.| ≥ 0.3). Only minor differences were found between PCCs and DCCs, indicating the presence of weak non-linear dependencies between parameters. Fundamental frequency was involved in the majority of all relevant PCCs between GAW and acoustic parameters (19 in females and 7 in males). The most distinct difference between correlations in females and males was found for the parameter Period Variability Index. The study shows only weak relations between investigated acoustic and GAW-parameters. This indicates that the reduction of the complex 3D glottal dynamics to the 1D-GAW may erase laryngeal dynamic characteristics that are reflected within the acoustic signal. Hence, other GAW parameters, 2D-, 3D-laryngeal dynamics and vocal tract parameters should be further investigated towards potential correlations to the acoustic signal.
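
The difference between the two coefficients can be seen on a small example. The sketch below implements the Pearson coefficient and the standard sample distance correlation (double-centered distance matrices) for one-dimensional data, using only the Python standard library. It is an illustration of the two measures, not the analysis code of the study, and the data are synthetic.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # equals the column means by symmetry
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation for one-dimensional data."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)

    def mean_prod(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dvar = mean_prod(a, a) * mean_prod(b, b)
    if dvar <= 0:
        return 0.0
    return math.sqrt(max(mean_prod(a, b), 0.0) / math.sqrt(dvar))


# A purely non-linear relation: Pearson misses it, distance correlation does not.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
print(pearson(x, y), distance_correlation(x, y))
```

When the two coefficients are close for a pair of parameters, as reported in the abstract, the dependence between them is mostly linear.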

  • Influence of spatial camera resolution in high-speed Videoendoscopy on laryngeal parameters
    2019
    Co-Authors: Patrick Schlegel, Melda Kunduk, Michael Dollinger, Marion Semmler, Christopher Bohr, Michael Stingl, Anne Schutzenberger
    Abstract:

    In laryngeal high-speed Videoendoscopy (HSV) the area between the vibrating vocal folds during phonation is of interest, being referred to as glottal area waveform (GAW). Varying camera resolution may influence parameters computed on the GAW and hence hinder the comparability between examinations. This study investigates the influence of spatial camera resolution on quantitative vocal fold vibratory function parameters obtained from the GAW. In total, 40 HSV recordings during sustained phonation (20 healthy males and 20 healthy females) were investigated. A clinically used Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512×256 pixels was applied. This initial resolution was reduced by pixel averaging to (1) a resolution of 256×128 and (2) to a resolution of 128×64 pixels, yielding three sets of recordings. The GAW was extracted and in total 50 vocal fold vibratory parameters representing different features of the GAW were computed. Statistical analysis using SPSS Statistics, version 21, was performed. Fifteen parameters showing strong mathematical dependencies with other parameters were excluded from the main analysis but are given in the Supporting Information. Data analysis revealed clear influence of spatial resolution on GAW parameters. Fundamental period measures and period perturbation measures were the least affected. Amplitude perturbation measures and mechanical measures were most strongly influenced. Most glottal dynamic characteristics and symmetry measures deviated significantly. Most energy perturbation measures changed significantly in males but were mostly unaffected in females. In females 18 of 35 remaining parameters (51%) and in males 22 parameters (63%) changed significantly between spatial resolutions. This work represents the first step in studying the impact of video resolution on quantitative HSV parameters. Clear influences of spatial camera resolution on computed parameters were found. The study results suggest avoiding the use of the most strongly affected parameters. Further, the use of cameras with high resolution is recommended to analyze GAW measures in HSV data.
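
Resolution reduction by pixel averaging can be sketched as block averaging: each output pixel is the mean of a 2×2 block of input pixels, so two passes take a 512×256 frame to 256×128 and then to 128×64. The abstract does not specify the block scheme, so this is an assumption, and the function name and the synthetic frame are hypothetical.

```python
def downsample_by_averaging(image, factor=2):
    """Average non-overlapping factor x factor blocks of a grayscale image.

    image: list of rows, each a list of pixel intensities.
    """
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image size must be divisible by the factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [image[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Synthetic frame with 256 rows and 512 columns (512x256 pixels).
frame = [[(r + c) % 256 for c in range(512)] for r in range(256)]
half = downsample_by_averaging(frame)    # 256x128 pixels
quarter = downsample_by_averaging(half)  # 128x64 pixels
print(len(half[0]), len(half), len(quarter[0]), len(quarter))
```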

  • influence of analyzed sequence length on parameters in laryngeal high speed Videoendoscopy
    Applied Sciences, 2018
    Co-Authors: Patrick Schlegel, Melda Kunduk, Michael Dollinger, Marion Semmler, Christopher Bohr, Anne Schutzenberger
    Abstract:

    Laryngeal high-speed Videoendoscopy (HSV) allows objective quantification of vocal fold vibratory characteristics. However, it is unknown how the analyzed sequence length affects some of the computed parameters. To examine if varying sequence lengths influence parameter calculation, 20 HSV recordings of healthy females during sustained phonation were investigated. The clinically prevalent Photron Fastcam MC2 camera with a frame rate of 4000 fps and a spatial resolution of 512 × 256 pixels was used to collect HSV data. The glottal area waveform (GAW), describing the increase and decrease of the area between the vocal folds during phonation, was extracted. Based on the GAW, 16 perturbation parameters were computed for sequences of 5, 10, 20, 50 and 100 consecutive cycles. Statistical analysis was performed using SPSS Statistics, version 21. Only three parameters (18.8%) were statistically significantly influenced by changing sequence lengths. Of these parameters, one changed until 10 cycles were reached, one until 20 cycles were reached and one, namely Amplitude Variability Index (AVI), changed between almost all groups of different sequence lengths. Moreover, visually observable, but not statistically significant, changes within parameters were observed. These changes were often most prominent between shorter sequence lengths. Hence, we suggest using a minimum sequence length of at least 20 cycles and discarding the parameter AVI.
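
How a perturbation measure can depend on sequence length is easy to try on synthetic data. The sketch below uses a generic jitter percentage (mean absolute difference between consecutive cycle periods, relative to the mean period) as a stand-in, since the abstract does not define the 16 parameters. The function name and the period values are hypothetical.

```python
def jitter_percent(periods):
    """Mean absolute period-to-period difference, in percent of the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


# Synthetic cycle periods in ms with a small deterministic wobble.
periods = [4.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(100)]
for n in (5, 10, 20, 50, 100):
    print(n, round(jitter_percent(periods[:n]), 2))
```

Running the same measure on the first 5, 10, 20, 50 and 100 cycles mirrors the comparison in the study and shows whether the estimate has stabilized.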

  • impact of phonatory frequency and intensity on glottal area waveform measurements derived from high speed Videoendoscopy
    Journal of the Acoustical Society of America, 2018
    Co-Authors: Rita R Patel, Michael Dollinger, Stefan Kniesburges
    Abstract:

    Measurements of glottal area waveform from high-speed Videoendoscopy were made on vocally healthy females (n = 41) and males (n = 25) during sustained /i/ production at typical pitch and loudness, high pitch, and soft phonation. Three trials of each condition were performed yielding 594 samples. Statistical analysis of glottal cycle quotients (open quotient (OQ), speed quotient (SQ), rate quotient (RQ), glottal gap index (GGI)), glottal cycle periodicity (amplitude, time (TP)), glottal cycle symmetry (phase asymmetry index, spatial symmetry index, and amplitude symmetry index), glottal area derivative (maximum area declination rate (MADR)), and mechanical stress measures (stiffness index (SI), amplitude-to-length ratio (ALR)) revealed that only SI varied systematically across pitch and loudness conditions for males and females. Variations in pitch and loudness result in changes in SI, ALR, RQ, MADR, and SQ for females, whereas variations in target pitch and loudness result in changes in SI, ALR, RQ, MAD...
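
As an illustration of one of the glottal cycle quotients, the open quotient is commonly described as the fraction of a glottal cycle during which the glottis is open. The sketch below assumes a single cycle of the glottal area waveform sampled at a constant frame rate and treats any area above a threshold as open. The function name, the threshold parameter, and the sample values are hypothetical, and the study may define OQ differently.

```python
def open_quotient(cycle, threshold=0.0):
    """Fraction of samples in one glottal cycle with area above the threshold."""
    open_samples = sum(1 for area in cycle if area > threshold)
    return open_samples / len(cycle)


# One synthetic cycle of glottal area values in pixels.
cycle = [0, 0, 1, 3, 5, 3, 1, 0, 0, 0]
print(open_quotient(cycle))  # 0.5
```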