Structural Markup

The experts below are selected from a list of 1995 experts worldwide, ranked by the ideXlab platform.

Naber Daniel - One of the best experts on this subject based on the ideXlab platform.

  • Methods for the semantic analysis of document markup
    New York : ACM, 2016
    Co-Authors: Bayerl, Petra Saskia, Lüngen Harald, Goecke Daniela, Witt Andreas, Naber Daniel
    Abstract:

    We present an approach for investigating what kind of semantic information is regularly associated with the structural markup of scientific articles. This approach addresses the need for an explicit formal description of the semantics of text-oriented XML documents. The domain of our investigation is a corpus of scientific articles in psychology and linguistics, drawn from English and German journals available online. For our analyses, we provide XML markup representing two kinds of semantic levels: the thematic level (i.e. topics in the text world that the article is about) and the functional or rhetorical level. Our hypothesis is that these semantic levels correlate with the articles’ document structure, which is also represented in XML. Articles have been annotated with the appropriate information. Each of the three informational levels (thematic, rhetorical, and document structure) is modelled in a separate XML document, since in our domain the different description levels might conflict, making it impossible to model them within a single XML document. For comparing and mining the resulting multi-layered XML annotations of one article, a Prolog-based approach is used. It focusses on the comparison of XML markup that is distributed among different documents. Prolog predicates have been defined for inferring relations between levels of information that are modelled in separate XML documents. We demonstrate how the Prolog tool is applied in our corpus analyses.
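
The idea of comparing annotation layers held in separate XML documents can be illustrated with a minimal sketch. It is written in Python rather than Prolog and is not the authors' tool: the element names (`para`, `background`, `contribution`), the sample text, and the relation names are all assumptions made for illustration. Each layer marks up the same primary text, so elements can be compared by their character spans.

```python
import xml.etree.ElementTree as ET


def element_spans(xml_string):
    """Return {tag: (start, end)} character spans for one annotation layer.

    Assumes every tag occurs once per layer, which holds for this toy example.
    """
    root = ET.fromstring(xml_string)
    spans = {}

    def walk(elem, pos):
        start = pos
        pos += len(elem.text or "")
        for child in elem:
            pos = walk(child, pos)
            pos += len(child.tail or "")
        spans[elem.tag] = (start, pos)
        return pos

    walk(root, 0)
    return spans


def relation(a, b):
    """Classify how span a relates to span b (hypothetical relation names)."""
    (s1, e1), (s2, e2) = a, b
    if (s1, e1) == (s2, e2):
        return "identical"
    if s1 <= s2 and e2 <= e1:
        return "includes"
    if s2 <= s1 and e1 <= e2:
        return "included_in"
    if e1 <= s2 or e2 <= s1:
        return "disjoint"
    return "overlaps"


# Two layers over the same primary text, each in its own XML document.
structure_layer = (
    "<doc><para>Prior work exists. We extend it.</para>"
    "<para2> Results improve.</para2></doc>"
)
rhetorical_layer = (
    "<doc><background>Prior work exists.</background>"
    "<contribution> We extend it. Results improve.</contribution></doc>"
)

# Both layers must annotate exactly the same text for offsets to be comparable.
text_a = "".join(ET.fromstring(structure_layer).itertext())
text_b = "".join(ET.fromstring(rhetorical_layer).itertext())
assert text_a == text_b

structure = element_spans(structure_layer)
rhetoric = element_spans(rhetorical_layer)

for s_tag in ("para", "para2"):
    for r_tag in ("background", "contribution"):
        print(s_tag, relation(structure[s_tag], rhetoric[r_tag]), r_tag)
```

In this example the first paragraph overlaps the `contribution` unit without containing it, which is the kind of conflict that prevents both layers from being nested in a single well-formed XML document.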

Müller Christian - One of the best experts on this subject based on the ideXlab platform.

Bayerl, Petra Saskia - One of the best experts on this subject based on the ideXlab platform.

  • Methods for the semantic analysis of document markup
    New York : ACM, 2016
    Co-Authors: Bayerl, Petra Saskia, Lüngen Harald, Goecke Daniela, Witt Andreas, Naber Daniel

David M Nichols - One of the best experts on this subject based on the ideXlab platform.

  • Textual documents: the raw material
    How to Build a Digital Library (Second Edition), 2010
    Co-Authors: Ian H Witten, David Bainbridge, David M Nichols
    Abstract:

    This chapter discusses the representation of plain text documents and related issues in digital libraries. Electronic documents have two complementary aspects: structure and appearance. Structural markup makes certain aspects of the document structure explicit: section divisions, headings, subsection structure, enumerated and bulleted lists, emphasized and quoted text, footnotes, tabular material, and so on. Page description languages portray finished documents, ones that are not intended to be edited. In contrast, word processors represent documents in ways that are expressly designed to support interactive creation and editing. A comprehensive index, capable of rapidly accessing all documents that satisfy a particular query, is a large data structure. Size, as well as being a drawback in its own right, also affects retrieval time, for the computer must read and interpret appropriate parts of the index to locate the desired information. Modern markup languages use words enclosed in angle brackets as tags to annotate text. HTML has many more features. For example, locally defined link anchors permit navigation within a single document. Fonts, colors, and page backgrounds can be specified explicitly.
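
To make the distinction between structure and appearance concrete, here is a minimal sketch that reads the structural tags of a small HTML fragment and recovers an outline of headings and list items. It uses Python's standard `html.parser` module; the `OutlineParser` class and the sample fragment are hypothetical and are not taken from the chapter.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect the text of headings and list items, ignoring other tags."""

    STRUCTURAL = {"h1", "h2", "h3", "li"}

    def __init__(self):
        super().__init__()
        self.outline = []      # list of (tag, text) pairs in document order
        self._current = None   # structural tag currently open, if any
        self._parts = []       # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in self.STRUCTURAL:
            self._current = tag
            self._parts = []

    def handle_data(self, data):
        # Text inside inline tags such as <em> still belongs to the open item.
        if self._current is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.outline.append((tag, "".join(self._parts).strip()))
            self._current = None


sample = """
<h1>Textual documents</h1>
<h2>Markup</h2>
<ul>
  <li>Tags are <em>words</em> in angle brackets</li>
  <li>Structure is explicit</li>
</ul>
"""

parser = OutlineParser()
parser.feed(sample)
parser.close()

for tag, text in parser.outline:
    print(tag, "|", text)
```

The sketch does not handle nested lists or headings inside list items; it only shows that the tags alone are enough to rebuild the outline, independent of how a browser would render the fonts or colors.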

Lüngen Harald - One of the best experts on this subject based on the ideXlab platform.

  • Methods for the semantic analysis of document markup
    New York : ACM, 2016
    Co-Authors: Bayerl, Petra Saskia, Lüngen Harald, Goecke Daniela, Witt Andreas, Naber Daniel