Structural Metadata

The Experts below are selected from a list of 93 Experts worldwide ranked by ideXlab platform

Naomi Dushay - One of the best experts on this subject based on the ideXlab platform.

  • localizing experience of digital content via Structural Metadata
    ACM IEEE Joint Conference on Digital Libraries, 2002
    Co-Authors: Naomi Dushay
    Abstract:

    With the increasing technical sophistication of both information consumers and providers, there is increasing demand for more meaningful experiences of digital information. We present a framework that separates digital object experience, or rendering, from digital object storage and manipulation, so the rendering can be tailored to particular communities of users. Our framework also accommodates extensible digital object behaviors and interoperability. The two key components of our approach are (1) exposing Structural Metadata associated with digital objects, that is, Metadata about labeled access points within a digital object, and (2) information intermediaries called context brokers that match Structural characteristics of digital objects with mechanisms that produce behaviors. These context brokers allow for localized rendering of digital information stored externally.

  • using Structural Metadata to localize experience of digital content
    arXiv: Digital Libraries, 2001
    Co-Authors: Naomi Dushay
    Abstract:

    With the increasing technical sophistication of both information consumers and providers, there is increasing demand for more meaningful experiences of digital information. We present a framework that separates digital object experience, or rendering, from digital object storage and manipulation, so the rendering can be tailored to particular communities of users. Our framework also accommodates extensible digital object behaviors and interoperability. The two key components of our approach are (1) exposing Structural Metadata associated with digital objects, that is, Metadata about the labeled access points within a digital object, and (2) information intermediaries called context brokers that match Structural characteristics of digital objects with mechanisms that produce behaviors. These context brokers allow for localized rendering of digital information stored externally.
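
The matching step described in the abstracts above can be illustrated with a minimal Python sketch. It assumes that structural metadata is exposed as a set of access-point labels and that a context broker keeps a registry of mechanisms keyed by the labels they require. The names DigitalObject, ContextBroker, register, and behaviors_for are hypothetical and are not taken from the authors' framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class DigitalObject:
    """A stored object plus its structural metadata (labeled access points)."""
    identifier: str
    access_points: FrozenSet[str]


# A mechanism produces a behavior (here just a description string) for an object.
Mechanism = Callable[[DigitalObject], str]


@dataclass
class ContextBroker:
    """Hypothetical broker: matches access-point labels to registered mechanisms."""
    registry: List[Tuple[FrozenSet[str], str, Mechanism]] = field(default_factory=list)

    def register(self, required: Set[str], name: str, mechanism: Mechanism) -> None:
        # Record which access points a mechanism needs before it can run.
        self.registry.append((frozenset(required), name, mechanism))

    def behaviors_for(self, obj: DigitalObject) -> Dict[str, Mechanism]:
        # A mechanism applies when every label it requires is exposed by the object.
        return {
            name: mechanism
            for required, name, mechanism in self.registry
            if required <= obj.access_points
        }


broker = ContextBroker()
broker.register({"page-image"}, "page_turner",
                lambda o: f"page view of {o.identifier}")
broker.register({"page-image", "ocr-text"}, "search_in_book",
                lambda o: f"search over {o.identifier}")

book = DigitalObject("book-1", frozenset({"page-image", "toc"}))
available = broker.behaviors_for(book)
print(sorted(available))
```

Because the broker holds the registry, a community could register its own mechanisms locally without changing how the object is stored, which is the separation the abstract argues for.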

E. Shriberg - One of the best experts on this subject based on the ideXlab platform.

  • enriching speech recognition with automatic detection of sentence boundaries and disfluencies
    IEEE Transactions on Audio Speech and Language Processing, 2006
    Co-Authors: Yang Liu, Mari Ostendorf, E. Shriberg, A. Stolcke, D. Hillard, Mary P Harper
    Abstract:

    Effective human and automatic processing of speech requires recovery of more than just the words. It also involves recovering phenomena such as sentence boundaries, filler words, and disfluencies, referred to as Structural Metadata. We describe a Metadata detection system that combines information from different types of textual knowledge sources with information from a prosodic classifier. We investigate maximum entropy and conditional random field models, as well as the predominant hidden Markov model (HMM) approach, and find that discriminative models generally outperform generative models. We report system performance on both broadcast news and conversational telephone speech tasks, illustrating significant performance differences across tasks and as a function of recognizer performance. The results represent the state of the art, as assessed in the NIST RT-04F evaluation.

  • Structural Metadata research in the EARS program
    Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 2005
    Co-Authors: E. Shriberg, A. Stolcke, B. Peskin, D. Hillard, M. Ostendorf, M. Tomalin, P. Woodland, M. Harper
    Abstract:

    Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on Structural Metadata extraction in the DARPA EARS rich transcription program. Tasks include detection of sentence boundaries, filler words, and disfluencies. Modeling approaches combine lexical, prosodic, and syntactic information, using various modeling techniques for knowledge source integration. The performance of these methods is evaluated by task, by data source (broadcast news versus spontaneous telephone conversations) and by whether transcriptions come from humans or from an (errorful) automatic speech recognizer. A representative sample of results shows that combining multiple knowledge sources (words, prosody, syntactic information) is helpful, that prosody is more helpful for news speech than for conversational speech, that word errors significantly impact performance, and that discriminative models generally provide benefit over maximum likelihood models. Important remaining issues, both technical and programmatic, are also discussed.

  • the icsi sri uw rt04 Structural Metadata extraction system
    2004
    Co-Authors: Yang Liu, E. Shriberg, A. Stolcke, Barbara Peskin, Mary Harper
    Abstract:

    Both human and automatic processing of speech require recognizing more than just the words. We describe the ICSI-SRI-UW Metadata detection system in both broadcast news and spontaneous telephone conversations, developed as part of the DARPA EARS Rich Transcription program. System tasks include sentence boundary detection, filler word detection, and detection/correction of disfluencies. To achieve best performance, we combine information from different types of textual knowledge sources (based on words, part-of-speech classes, and automatically induced classes) with information from a prosodic classifier. The prosodic classifier employs bagging and ensemble approaches to better estimate posterior probabilities. In addition to our previous HMM approach, we investigate using a maximum entropy (Maxent) and a conditional random field (CRF) approach for various tasks. Results using these techniques are presented for the 2004 NIST Rich Transcription Metadata tasks.
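
The abstracts above describe combining textual knowledge sources with a prosodic classifier that outputs posterior probabilities. The Python sketch below shows one simple way such scores could be merged at each word boundary: a log-linear interpolation of two posteriors, renormalized over the two outcomes. This is an illustrative stand-in, not the integration method used in the systems described; the function names, the weight parameter, and the threshold are all assumptions.

```python
import math
from typing import List, Sequence


def combine_posteriors(p_text: float, p_prosody: float, weight: float = 0.5) -> float:
    """Log-linear interpolation of two boundary posteriors.

    `weight` is the share given to the textual model (assumed, not from the source).
    The result is renormalized over {boundary, no boundary}.
    """
    eps = 1e-9  # keep the logarithms finite at 0 and 1
    p_text = min(max(p_text, eps), 1.0 - eps)
    p_prosody = min(max(p_prosody, eps), 1.0 - eps)
    log_yes = weight * math.log(p_text) + (1.0 - weight) * math.log(p_prosody)
    log_no = weight * math.log(1.0 - p_text) + (1.0 - weight) * math.log(1.0 - p_prosody)
    return 1.0 / (1.0 + math.exp(log_no - log_yes))


def detect_boundaries(p_text: Sequence[float], p_prosody: Sequence[float],
                      weight: float = 0.5, threshold: float = 0.5) -> List[int]:
    """Return indices of inter-word positions whose combined posterior exceeds the threshold."""
    return [
        i for i, (pt, pp) in enumerate(zip(p_text, p_prosody))
        if combine_posteriors(pt, pp, weight) > threshold
    ]


# Made-up posteriors for three inter-word positions, for illustration only.
text_scores = [0.1, 0.8, 0.2]
prosody_scores = [0.2, 0.9, 0.6]
print(detect_boundaries(text_scores, prosody_scores))
```

In a sketch like this, the weight would normally be tuned on held-out data, since the abstracts report that the usefulness of prosody differs between broadcast news and conversational speech.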

A. Stolcke - One of the best experts on this subject based on the ideXlab platform.

  • enriching speech recognition with automatic detection of sentence boundaries and disfluencies
    IEEE Transactions on Audio Speech and Language Processing, 2006
    Co-Authors: Yang Liu, Mari Ostendorf, E. Shriberg, A. Stolcke, D. Hillard, Mary P Harper
    Abstract:

    Effective human and automatic processing of speech requires recovery of more than just the words. It also involves recovering phenomena such as sentence boundaries, filler words, and disfluencies, referred to as Structural Metadata. We describe a Metadata detection system that combines information from different types of textual knowledge sources with information from a prosodic classifier. We investigate maximum entropy and conditional random field models, as well as the predominant hidden Markov model (HMM) approach, and find that discriminative models generally outperform generative models. We report system performance on both broadcast news and conversational telephone speech tasks, illustrating significant performance differences across tasks and as a function of recognizer performance. The results represent the state of the art, as assessed in the NIST RT-04F evaluation.

  • Structural Metadata research in the EARS program
    Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), 2005
    Co-Authors: E. Shriberg, A. Stolcke, B. Peskin, D. Hillard, M. Ostendorf, M. Tomalin, P. Woodland, M. Harper
    Abstract:

    Both human and automatic processing of speech require recognition of more than just words. In this paper we provide a brief overview of research on Structural Metadata extraction in the DARPA EARS rich transcription program. Tasks include detection of sentence boundaries, filler words, and disfluencies. Modeling approaches combine lexical, prosodic, and syntactic information, using various modeling techniques for knowledge source integration. The performance of these methods is evaluated by task, by data source (broadcast news versus spontaneous telephone conversations) and by whether transcriptions come from humans or from an (errorful) automatic speech recognizer. A representative sample of results shows that combining multiple knowledge sources (words, prosody, syntactic information) is helpful, that prosody is more helpful for news speech than for conversational speech, that word errors significantly impact performance, and that discriminative models generally provide benefit over maximum likelihood models. Important remaining issues, both technical and programmatic, are also discussed.

  • the icsi sri uw rt04 Structural Metadata extraction system
    2004
    Co-Authors: Yang Liu, E. Shriberg, A. Stolcke, Barbara Peskin, Mary Harper
    Abstract:

    Both human and automatic processing of speech require recognizing more than just the words. We describe the ICSI-SRI-UW Metadata detection system in both broadcast news and spontaneous telephone conversations, developed as part of the DARPA EARS Rich Transcription program. System tasks include sentence boundary detection, filler word detection, and detection/correction of disfluencies. To achieve best performance, we combine information from different types of textual knowledge sources (based on words, part-of-speech classes, and automatically induced classes) with information from a prosodic classifier. The prosodic classifier employs bagging and ensemble approaches to better estimate posterior probabilities. In addition to our previous HMM approach, we investigate using a maximum entropy (Maxent) and a conditional random field (CRF) approach for various tasks. Results using these techniques are presented for the 2004 NIST Rich Transcription Metadata tasks.

Yang Liu - One of the best experts on this subject based on the ideXlab platform.

  • enriching speech recognition with automatic detection of sentence boundaries and disfluencies
    IEEE Transactions on Audio Speech and Language Processing, 2006
    Co-Authors: Yang Liu, Mari Ostendorf, E. Shriberg, A. Stolcke, D. Hillard, Mary P Harper
    Abstract:

    Effective human and automatic processing of speech requires recovery of more than just the words. It also involves recovering phenomena such as sentence boundaries, filler words, and disfluencies, referred to as Structural Metadata. We describe a Metadata detection system that combines information from different types of textual knowledge sources with information from a prosodic classifier. We investigate maximum entropy and conditional random field models, as well as the predominant hidden Markov model (HMM) approach, and find that discriminative models generally outperform generative models. We report system performance on both broadcast news and conversational telephone speech tasks, illustrating significant performance differences across tasks and as a function of recognizer performance. The results represent the state of the art, as assessed in the NIST RT-04F evaluation.

  • the icsi sri uw rt04 Structural Metadata extraction system
    2004
    Co-Authors: Yang Liu, E. Shriberg, A. Stolcke, Barbara Peskin, Mary Harper
    Abstract:

    Both human and automatic processing of speech require recognizing more than just the words. We describe the ICSI-SRI-UW Metadata detection system in both broadcast news and spontaneous telephone conversations, developed as part of the DARPA EARS Rich Transcription program. System tasks include sentence boundary detection, filler word detection, and detection/correction of disfluencies. To achieve best performance, we combine information from different types of textual knowledge sources (based on words, part-of-speech classes, and automatically induced classes) with information from a prosodic classifier. The prosodic classifier employs bagging and ensemble approaches to better estimate posterior probabilities. In addition to our previous HMM approach, we investigate using a maximum entropy (Maxent) and a conditional random field (CRF) approach for various tasks. Results using these techniques are presented for the 2004 NIST Rich Transcription Metadata tasks.

Jiehsheng Lee - One of the best experts on this subject based on the ideXlab platform.

  • controlling patent text generation by Structural Metadata
    Conference on Information and Knowledge Management, 2020
    Co-Authors: Jiehsheng Lee
    Abstract:

    The ultimate goal of my long-term project is "Augmented Inventing." This work is a follow-up effort toward the goal. It leverages the Structural Metadata in patent documents and the text-to-text mappings between Metadata. The Structural Metadata includes patent title, abstract, independent claim, and dependent claim. By using the Structural Metadata, it is possible to control what kind of patent text to generate. By using the text-to-text mapping, it is possible to let a generative model generate one type of patent text from another type of patent text. Furthermore, through multiple mappings, it is possible to build a text generation flow, for example, generating from a few words to a patent title, from the title to an abstract, from the abstract to an independent claim, and from the independent claim to multiple dependent claims. The text generation flow can also go backward after training with bi-directional mappings. In addition to those above, the contributions of this work include: (1) releasing four generative models trained from scratch on a patent corpus, (2) releasing the sample code to demonstrate how to generate patent text bi-directionally, and (3) measuring the performance of the models by ROUGE and Universal Sentence Encoder as preliminary evaluations of text generation quality.

  • patenttransformer 2 controlling patent text generation by Structural Metadata
    arXiv: Computation and Language, 2020
    Co-Authors: Jiehsheng Lee, Jieh Hsiang
    Abstract:

    PatentTransformer is our codename for patent text generation based on Transformer-based models. Our goal is "Augmented Inventing." In this second version, we leverage more of the Structural Metadata in patents. The Structural Metadata includes patent title, abstract, and dependent claim, in addition to the independent claim used previously. The Metadata controls what kind of patent text the model generates. Also, we leverage the relation between Metadata to build a text-to-text generation flow, for example, from a few words to a title, the title to an abstract, the abstract to an independent claim, and the independent claim to multiple dependent claims. The text flow can go backward because the relation is trained bidirectionally. We release our GPT-2 models trained from scratch and our code for inference so that readers can verify and generate patent text on their own. As for generation quality, we measure it by both ROUGE and Google Universal Sentence Encoder.
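
The text-to-text generation flow described in the two abstracts above can be sketched as a chain of mappings in which each step is conditioned on the source and target metadata types. The Python sketch below is only an illustration of that control flow: the tag format in make_prompt, the FLOW list, and the run_flow helper are hypothetical, and a stub function stands in for the released GPT-2 models, whose actual prompt format is not given in the abstracts.

```python
from typing import Callable, Dict, List, Sequence

# Metadata types in the forward order named in the abstracts.
FLOW: List[str] = ["words", "title", "abstract", "independent_claim", "dependent_claim"]


def make_prompt(source_type: str, target_type: str, text: str) -> str:
    """Wrap the input text in hypothetical control tags naming the mapping to apply."""
    return f"<|{source_type}|>{text}<|to:{target_type}|>"


def run_flow(seed: str, generate: Callable[[str], str],
             flow: Sequence[str] = FLOW, backward: bool = False) -> Dict[str, str]:
    """Chain the mappings, feeding each generated text into the next step.

    With backward=True the same chain runs from dependent claim back to words,
    which assumes the model was trained on both directions of every mapping.
    """
    order = list(reversed(flow)) if backward else list(flow)
    outputs = {order[0]: seed}
    text = seed
    for source_type, target_type in zip(order, order[1:]):
        text = generate(make_prompt(source_type, target_type, text))
        outputs[target_type] = text
    return outputs


def stub_model(prompt: str) -> str:
    """Stand-in for a trained generator; it only reports the prompt length."""
    return f"generated text ({len(prompt)} prompt characters)"


forward = run_flow("foldable drone arm", stub_model)
for metadata_type, text in forward.items():
    print(metadata_type, "->", text)
```

Keeping the metadata types in the prompt is what lets a single model serve every mapping; swapping stub_model for a real generator would leave the flow logic unchanged.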