The Experts below are selected from a list of 24483 Experts worldwide ranked by the ideXlab platform
Guodong Zhou - One of the best experts on this subject based on the ideXlab platform.
- Multi-Modal Language Analysis with Hierarchical Interaction-Level and Selection-Level Attentions
2019 IEEE International Conference on Multimedia and Expo (ICME), 2019
Co-Authors: Dong Zhang, Liangqing Wu, Shoushan Li, Guodong Zhou
Abstract: As an emerging research area in natural language processing, multi-modal human language analysis spans the language, vision and audio modalities. Understanding multi-modal language requires modeling not only the independent dynamics within each modality (intra-modal dynamics) but also, more importantly, the interactive dynamics among different modalities (inter-modal dynamics). In this paper, we propose a hierarchical approach to multi-modal language analysis with two levels of attention mechanism: interaction-level attention, which captures the intra-modal and inter-modal dynamics across modalities with multiple types of attention, and selection-level attention, which selects effective representations for the final prediction by computing the importance of each vector obtained from the interaction level. Empirical evaluation demonstrates the effectiveness of the proposed approach on multi-modal sentiment classification, sentiment regression and emotion recognition.
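The abstract gives no equations, but the selection-level step it describes (score each interaction-level vector by importance, then combine them for the final prediction) is a standard attention pooling. A minimal sketch, with all names, shapes, and the linear scoring form being our own assumptions rather than the paper's:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def selection_level_attention(vectors, w, b=0.0):
    """Weight each interaction-level vector by a learned importance score
    and return the weighted sum as the fused representation.

    vectors: (n, d) array of intra-/inter-modal representations
    w: (d,) scoring weights; b: scalar bias (both learned in practice)
    """
    scores = vectors @ w + b           # one importance score per vector
    alphas = softmax(scores)           # normalize to attention weights
    fused = alphas @ vectors           # (d,) representation for prediction
    return fused, alphas

# toy example: three interaction-level vectors (e.g. text/vision/audio mixes)
rng = np.random.default_rng(0)
vecs = rng.standard_normal((3, 4))
fused, alphas = selection_level_attention(vecs, rng.standard_normal(4))
```

The fused vector would then feed a task head (classifier or regressor); the attention weights `alphas` indicate which interaction-level representation dominated the prediction.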
Dong Zhang - One of the best experts on this subject based on the ideXlab platform.
Stephanie N Del Tufo - One of the best experts on this subject based on the ideXlab platform.
- Neurochemistry Predicts Convergence of Written and Spoken Language: A Proton Magnetic Resonance Spectroscopy Study of Cross-Modal Language Integration
Frontiers in Psychology, 2018
Co-Authors: Stephanie N Del Tufo, Stephen J Frost, Fumiko Hoeft, Laurie E Cutting, Peter J Molfese, Graeme F Mason, Douglas L Rothman
Abstract: Recent studies have provided evidence of associations between neurochemistry and reading (dis)ability (Pugh et al., 2014). Based on a long history of studies indicating that fluent reading entails the automatic convergence of the written and spoken forms of language, and on our recently proposed Neural Noise Hypothesis (Hancock et al., 2017), we hypothesized that individual differences in cross-modal integration would mediate, at least partially, the relationship between neurochemical concentrations and reading. Cross-modal integration was measured in 231 children using a two-alternative forced-choice cross-modal matching task with three language conditions (letters, words, and pseudowords) and two levels of difficulty within each condition. Neurometabolite concentrations of choline (Cho), glutamate (Glu), gamma-aminobutyric acid (GABA), and N-acetyl-aspartate (NAA) were then measured in a subset of this sample (n = 70) with magnetic resonance spectroscopy (MRS). A structural equation mediation model revealed that the effect of cross-modal word matching mediated the relationship between increased Glu (which has been proposed as an index of neural noise) and poorer reading ability. In addition, the effect of cross-modal word matching fully mediated a relationship between increased Cho and poorer reading ability. Multilevel mixed-effects models confirmed that lower Cho predicted faster cross-modal matching reaction time, specifically in the hard word condition. These findings with Cho are consistent with previous work in both adults and children showing a negative association between Cho and reading ability. We also found two novel neurochemical relationships with children's cross-modal integration: lower GABA and higher NAA predicted faster cross-modal matching reaction times.
We interpret these results within a biochemical framework in which the ability of neurochemistry to predict reading ability may be at least partially explained by cross-modal integration.
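The study fits a structural equation mediation model; the core logic of a single-mediator analysis (predictor → mediator → outcome) can be illustrated with the product-of-coefficients approach. This is a simplified sketch on simulated data, not the authors' SEM, and the variable names are illustrative only:

```python
import numpy as np

def simple_mediation(x, m, y):
    """Product-of-coefficients mediation for one mediator:
    path a: x -> m; path b: m -> y controlling for x;
    indirect effect = a * b; for OLS, total = direct + indirect."""
    def ols(X, target):
        X1 = np.column_stack([np.ones(len(target)), X])
        beta, *_ = np.linalg.lstsq(X1, target, rcond=None)
        return beta
    a = ols(x[:, None], m)[1]                     # x -> m
    bm = ols(np.column_stack([m, x]), y)
    b, direct = bm[1], bm[2]                      # m -> y | x ; x -> y | m
    total = ols(x[:, None], y)[1]                 # x -> y (total effect)
    return {"a": a, "b": b, "indirect": a * b,
            "direct": direct, "total": total}

# simulated data mimicking the reported direction of effects:
# higher "Glu" -> slower cross-modal matching -> poorer reading
rng = np.random.default_rng(1)
glu = rng.standard_normal(200)
rt = 0.6 * glu + 0.5 * rng.standard_normal(200)        # mediator
reading = -0.7 * rt + 0.5 * rng.standard_normal(200)   # outcome
est = simple_mediation(glu, rt, reading)
```

For ordinary least squares with a single mediator, the decomposition total = direct + indirect holds exactly, which makes it a useful sanity check on the fit.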
Douglas L Rothman - One of the best experts on this subject based on the ideXlab platform.
Bernd Girod - One of the best experts on this subject based on the ideXlab platform.
- Multi-Modal Language Models for Lecture Video Retrieval
ACM Multimedia, 2014
Co-Authors: Huizhong Chen, Matthew Cooper, Dhiraj Joshi, Bernd Girod
Abstract: We propose Multi-Modal Language Models (MLMs), which adapt latent-variable techniques from document analysis to explore co-occurrence relationships in multi-modal data. In this paper, we focus on applying MLMs to index text from slides and speech in lecture videos, and subsequently employ a multi-modal probabilistic ranking function for lecture video retrieval. The MLM achieves highly competitive results against well-established retrieval methods such as the Vector Space Model and Probabilistic Latent Semantic Analysis. When noise is present in the data, retrieval performance with MLMs is shown to improve with the quality of the spoken text extracted from the video.
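The abstract does not spell out the ranking function, so as a hedged illustration of the general idea, a query can be scored against a linear interpolation of two unigram language models, one from slide text and one from speech transcripts. The smoothing, mixing weight, and toy data below are all our own assumptions, not the paper's MLM:

```python
import math
from collections import Counter

def unigram_lm(tokens, vocab, mu=1.0):
    """Additively smoothed unigram language model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}

def score(query, slide_tokens, speech_tokens, vocab, lam=0.5):
    """Query log-likelihood under a mixture of the slide-text and
    speech-text models: P(w|video) = lam*P(w|slides) + (1-lam)*P(w|speech)."""
    p_slide = unigram_lm(slide_tokens, vocab)
    p_speech = unigram_lm(speech_tokens, vocab)
    return sum(math.log(lam * p_slide[w] + (1 - lam) * p_speech[w])
               for w in query)

# toy corpus of two "lecture videos": (slide tokens, speech tokens)
vocab = {"fourier", "transform", "matrix", "rank", "signal"}
v1 = (["fourier", "transform", "signal"], ["fourier", "signal", "signal"])
v2 = (["matrix", "rank"], ["matrix", "matrix", "rank"])
query = ["fourier", "signal"]
ranked = sorted([("v1", score(query, *v1, vocab)),
                 ("v2", score(query, *v2, vocab))],
                key=lambda t: -t[1])
```

Ranking videos by this interpolated likelihood lets evidence from either modality compensate for noise in the other, which is consistent with the abstract's observation that retrieval quality tracks the quality of the extracted spoken text.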