Default Namespace

The Experts below are selected from a list of 3 Experts worldwide ranked by ideXlab platform

Flavio Rizzolo - One of the best experts on this subject based on the ideXlab platform.

  • WWW - Visualizing structural patterns in web collections
    Proceedings of the 16th international conference on World Wide Web - WWW '07, 2007
    Co-Authors: Mariano P. Consens, Flavio Rizzolo
    Abstract:

    We present DescribeX, a tool for exploring and visualizing the structural patterns present in collections of XML documents. Developers can use DescribeX to interactively discover the XPath expressions that will actually return elements present in a collection of XML files. The element structure of many collections of XML documents on the Web can be fairly unpredictable. This is the case even when the documents are validated by a schema, and it can happen for two main reasons. First, the documents may follow a schema that allows many elements to occur almost anywhere in the document (e.g., by extensive use of permissive constructs in XML Schema). Second, the default namespace and corresponding schema can be extended by incorporating elements from other namespaces with corresponding schemas (e.g., by using the XML Schema construct to allow open content models). A collection of RSS files provides a good example of both situations. The RSS schema is fairly permissive about where the basic elements occur. Multiple feed schemas can be present in the collection because of multiple versions (e.g., RSS 0.91 or 2.0) or formats (e.g., Atom or RSS 1.0/RDF). Finally, RSS encourages the use of extensions, so elements from several namespaces will be present (e.g., Dublin Core, iTunes podcast, Microsoft Simple List Extensions). Other web collections whose XML element structure can be fairly unpredictable are traces generated by web service requests and XML-ized versions of wikis (Wikipedia in particular). A major challenge in working with these kinds of web collections is to understand enough about their structure to be able to pose meaningful queries employing XPath patterns.
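
To make the namespace problem described in the abstract concrete, here is a minimal Python sketch. It is not DescribeX and does not reproduce the authors' method; it only counts root-to-element label paths over a tiny collection, using the standard library's `xml.etree.ElementTree`. The two sample documents, the function names `label_paths` and `summarize`, and the `atom` prefix are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature collection: an RSS 2.0 feed carrying a Dublin Core
# extension element, and an Atom feed that declares a default namespace.
RSS_DOC = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <dc:creator>Alice</dc:creator>
    </item>
    <item>
      <title>Second post</title>
    </item>
  </channel>
</rss>"""

ATOM_DOC = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First entry</title>
  </entry>
</feed>"""


def label_paths(xml_text):
    """Yield the root-to-element label path of every element in one document.

    ElementTree stores qualified names in Clark notation,
    "{namespace-uri}local", so elements from different namespaces never
    collapse into the same path.
    """
    root = ET.fromstring(xml_text)
    stack = [(root, "/" + root.tag)]
    while stack:
        elem, path = stack.pop()
        yield path
        for child in elem:
            stack.append((child, path + "/" + child.tag))


def summarize(collection):
    """Count how many elements in the whole collection share each label path."""
    counts = Counter()
    for doc in collection:
        counts.update(label_paths(doc))
    return counts


summary = summarize([RSS_DOC, ATOM_DOC])
for path, count in sorted(summary.items()):
    print(count, path)

# An unqualified path matches nothing in the Atom document, because every
# Atom element lives in the document's default namespace.
atom_root = ET.fromstring(ATOM_DOC)
ns = {"atom": "http://www.w3.org/2005/Atom"}  # prefix chosen for this sketch
unqualified = atom_root.findall("entry/title")
qualified = atom_root.findall("atom:entry/atom:title", ns)
print(len(unqualified), len(qualified))
```

The path listing shows which expressions have any chance of returning elements in this collection, and the final two queries show why a path written without regard to the default namespace can silently return nothing. A real structural summary would need to handle far larger collections and richer path semantics than this sketch does.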

Mariano P. Consens - One of the best experts on this subject based on the ideXlab platform.

  • WWW - Visualizing structural patterns in web collections
    Proceedings of the 16th international conference on World Wide Web - WWW '07, 2007
    Co-Authors: Mariano P. Consens, Flavio Rizzolo