Syntactic Level

The experts below are selected from a list of 32,970 experts worldwide, ranked by the ideXlab platform.

Houkuan Huang - One of the best experts on this subject based on the ideXlab platform.

  • A multi-layer text classification framework based on two-level representation model
    Expert Systems With Applications, 2012
    Co-Authors: Liping Jing, Jian Yu, Houkuan Huang
    Abstract:

    Text categorization is one of the most common themes in data mining and machine learning. Unlike structured data, unstructured text is more difficult to analyze because it carries both complicated syntactic and semantic information. In this paper, we propose a two-level representation model (2RM) for text data: one level represents syntactic information and the other semantic information. At the syntactic level, each document is represented as a term vector whose components are term frequency-inverse document frequency (tf-idf) values. At the semantic level, a document is represented by the Wikipedia concepts related to the terms at the syntactic level. We also design a multi-layer classification framework (MLCLA) to exploit the syntactic and semantic information captured by the 2RM model. The MLCLA framework contains three classifiers: two are applied in parallel, one at the syntactic level and one at the semantic level, and their outputs are combined and fed to a third classifier, which produces the final result. Experimental results on benchmark data sets (20Newsgroups, Reuters-21578 and Classic3) show that the proposed 2RM model plus the MLCLA framework improves text classification performance compared with existing flat text representation models (term-based VSM, Term Semantic Kernel Model, concept-based VSM, Concept Semantic Kernel Model and Term+Concept VSM) combined with existing classification methods.
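
    To make the architecture concrete, here is a minimal sketch of the two-level idea with a three-classifier stack in the spirit of MLCLA, using scikit-learn. The toy corpus, labels and the `term_to_concepts` mapping are invented stand-ins for the paper's Wikipedia concept linking, not its actual implementation:

    ```python
    # Sketch: two-level representation (terms + concepts) with a three-classifier
    # stack in the spirit of 2RM/MLCLA. The concept mapping is a toy stand-in
    # for Wikipedia concept linking.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    docs = ["the court ruled on the patent case",
            "the team won the championship game",
            "the judge dismissed the lawsuit",
            "the striker scored in the final match"]
    labels = np.array([0, 1, 0, 1])  # 0 = legal, 1 = sports

    # Syntactic level: tf-idf term vectors.
    term_vec = TfidfVectorizer()
    X_term = term_vec.fit_transform(docs)

    # Semantic level: replace terms with (hypothetical) linked concepts.
    term_to_concepts = {"court": "Law", "patent": "Law", "judge": "Law",
                        "lawsuit": "Law", "championship": "Sport",
                        "game": "Sport", "striker": "Sport", "match": "Sport"}
    concept_docs = [" ".join(term_to_concepts.get(w, "") for w in d.split())
                    for d in docs]
    X_concept = TfidfVectorizer().fit_transform(concept_docs)

    # Two parallel level classifiers.
    clf_term = LogisticRegression().fit(X_term, labels)
    clf_concept = LogisticRegression().fit(X_concept, labels)

    # Third classifier combines the two levels' outputs. (Training the meta
    # classifier on in-sample predictions is a simplification; proper stacking
    # would use held-out predictions.)
    meta_features = np.hstack([clf_term.predict_proba(X_term),
                               clf_concept.predict_proba(X_concept)])
    clf_meta = LogisticRegression().fit(meta_features, labels)
    print(clf_meta.predict(meta_features))
    ```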

  • Semantics-based representation model for multi-layer text classification
    Knowledge-Based and Intelligent Information and Engineering Systems, 2010
    Co-Authors: Jiali Yun, Liping Jing, Houkuan Huang
    Abstract:

    Text categorization is one of the most common themes in data mining and machine learning. Unlike structured data, unstructured text is more complicated to analyze because it carries several kinds of information at once, e.g. syntactic and semantic. In this paper, we propose a semantics-based model that represents text data at two levels: one for syntactic information and the other for semantic information. The syntactic level represents each document as a term vector whose components record the tf-idf value of each term. The semantic level represents the document with the Wikipedia concepts related to the terms at the syntactic level. The syntactic and semantic information are efficiently combined by our proposed multi-layer classification framework. Experimental results on a benchmark dataset (Reuters-21578) show that the proposed representation model, combined with the proposed classification framework, improves text classification performance compared with flat text representation models (term VSM, concept VSM, term+concept VSM) plus existing classification methods.
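
    For reference, a small hand-rolled computation of the tf-idf values that the syntactic level records, using the common log-scaled idf (implementations differ in smoothing and normalization; the toy corpus is illustrative only):

    ```python
    # Sketch: tf-idf for a tiny corpus, tfidf(t, d) = tf(t, d) * log(N / df(t)).
    import math
    from collections import Counter

    corpus = [["data", "mining", "text"],
              ["text", "classification"],
              ["machine", "learning", "text"]]
    N = len(corpus)
    df = Counter(t for doc in corpus for t in set(doc))  # document frequency

    def tfidf(doc):
        tf = Counter(doc)
        return {t: tf[t] * math.log(N / df[t]) for t in tf}

    print(tfidf(corpus[0]))  # "text" gets weight 0: it appears in every document
    ```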

Charles L A Clarke - One of the best experts on this subject based on the ideXlab platform.

  • A comparative evaluation of techniques for syntactic-level source code analysis
    Asia-Pacific Software Engineering Conference, 2000
    Co-Authors: Anthony Cox, Charles L A Clarke
    Abstract:

    Many program maintenance tools rely on traditional parsing techniques to obtain syntactic-level models of the code being maintained. When, for some reason, code cannot be parsed, software maintainers are forced to fall back on ad hoc tools and techniques, such as grep. As an alternative, hierarchical lexical analysis augmented with simple data structures can be used to extract an approximation of the abstract syntax for a source file. Experiments indicate that such an approach is feasible and produces results comparable to those obtained using a parser.
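
    A small sketch of the lexical-approximation idea the abstract describes: a heuristic regular expression recovers an approximate list of C function definitions without a full parse, even from code a parser would reject. The pattern and sample source are illustrative, not the authors' tool:

    ```python
    # Sketch: approximate syntactic-level analysis by lexical means. A regex
    # recovers likely C function definitions even from code a parser may reject.
    import re

    source = """
    int add(int a, int b) { return a + b; }
    void broken(int x { /* missing paren: a parser would choke here */
    static double scale(double v) {
        return v * 2.0;
    }
    """

    # Heuristic: one or more type-ish words, a name, a same-line argument
    # list, then an opening brace.
    func_def = re.compile(
        r"^[ \t]*(?:[A-Za-z_]\w*[ \t*]+)+([A-Za-z_]\w*)[ \t]*\([^)\n]*\)[ \t]*\{",
        re.MULTILINE)

    for match in func_def.finditer(source):
        print("function:", match.group(1))  # add, scale; the malformed line is skipped
    ```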

György Szaszák - One of the best experts on this subject based on the ideXlab platform.

  • Using prosody to improve automatic speech recognition
    Speech Communication, 2010
    Co-Authors: Klára Vicsi, György Szaszák
    Abstract:

    This paper addresses acoustic processing and modelling of the supra-segmental characteristics of speech, with the aim of incorporating advanced syntactic- and semantic-level processing of spoken language into speech recognition and understanding tasks. The proposed modelling approach is very similar to that of standard speech recognition: basic HMM units (most often acoustic phoneme models) are trained and then connected according to a dictionary and a grammar (language model) to obtain a recognition network, along which recognition can also be interpreted as an alignment process. Here, the HMM framework is used to model speech prosody and to perform initial syntactic- and/or semantic-level processing of the input speech in parallel with standard speech recognition. Fundamental frequency and energy are used as acoustic-prosodic features.

    A method was implemented for syntactic-level information extraction from speech. The method was designed for fixed-stress languages, and it segments the input speech into syntactically linked word groups, or even single words corresponding to a syntactic unit (such word groups are sometimes called phonological phrases in psycholinguistics and can consist of one or more words). These so-called word-stress units are marked by prosody and have an associated fundamental frequency and/or energy contour that allows them to be discovered: HMMs for the different types of word-stress unit contour were trained and then used to recognize and align such units in the input speech. This prosodic segmentation also allows word-boundary recovery and can be used for N-best lattice rescoring based on prosodic information. The syntactic-level segmentation algorithm was evaluated for Hungarian and Finnish, two languages with fixed stress on the first syllable (that is, if a word is stressed, the stress falls on its first syllable). N-best rescoring based on syntactic-level word-stress unit alignment was shown to increase the number of correctly recognized words.

    For further syntactic- and semantic-level processing of the input speech in ASR, clause and sentence boundary detection and modality (sentence type) recognition were implemented. Again, the classification was carried out by HMMs, which model the prosodic contour of each clause and/or sentence modality type. Clause (and hence sentence) boundary detection exploits the HMM's capacity to dynamically align the reference prosodic structure to the utterance at the ASR input; this also allows punctuation to be marked automatically. This semantic-level processing was investigated for Hungarian and German; the correctness of the recognized modality types was 69% for Hungarian and 78% for German.
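
    A minimal sketch of the contour-classification step, assuming the hmmlearn library and synthetic F0/energy tracks in place of real prosodic features; the two contour types and all data are invented for illustration and do not reproduce the paper's models:

    ```python
    # Sketch: one Gaussian HMM per prosodic contour type; a segment is assigned
    # the type whose model gives the highest log-likelihood. Features are
    # illustrative stand-ins for fundamental frequency (Hz) and energy.
    import numpy as np
    from hmmlearn.hmm import GaussianHMM

    rng = np.random.default_rng(0)

    def synthetic_contour(falling):
        """Toy (F0, energy) track: falling or rising pitch plus noise."""
        n = 30
        f0 = np.linspace(200, 120, n) if falling else np.linspace(120, 200, n)
        energy = np.linspace(1.0, 0.5, n)
        return np.column_stack([f0, energy]) + rng.normal(0, [5, 0.05], (n, 2))

    # Train one HMM per contour type on several example segments.
    models = {}
    for name, falling in [("falling", True), ("rising", False)]:
        segs = [synthetic_contour(falling) for _ in range(20)]
        X, lengths = np.vstack(segs), [len(s) for s in segs]
        models[name] = GaussianHMM(n_components=3, random_state=0).fit(X, lengths)

    # Classify an unseen segment by maximum log-likelihood.
    test = synthetic_contour(falling=True)
    print(max(models, key=lambda m: models[m].score(test)))  # -> "falling"
    ```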

Yinuo Guo - One of the best experts on this subject based on the ideXlab platform.

  • Meteor++ 2.0: adopt syntactic-level paraphrase knowledge into machine translation evaluation
    Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers Day 1), 2019
    Co-Authors: Yinuo Guo
    Abstract:

    This paper describes Meteor++ 2.0, our submission to the WMT19 Metrics Shared Task. The well-known Meteor metric improves machine translation evaluation by introducing paraphrase knowledge. However, it focuses only on the lexical level and utilizes consecutive n-gram paraphrases. In this work, we take syntactic-level paraphrase knowledge into consideration, which may take the form of skip-grams. We describe how such knowledge can be extracted from the Paraphrase Database (PPDB) and integrated into Meteor-based metrics. Experiments on the WMT15 and WMT17 evaluation datasets show that the newly proposed metric outperforms all previous versions of Meteor.
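
    A rough sketch of the skip-gram paraphrase-matching idea, with a two-entry hand-written table standing in for PPDB. The gap notation, the sentences and the counting are illustrative only and are not Meteor++ 2.0's actual scoring:

    ```python
    # Sketch: credit matches between hypothesis and reference via a paraphrase
    # table whose entries may be skip-grams, written with "_" for a gap of one
    # arbitrary token.
    import re

    # Toy stand-in for PPDB syntactic-level paraphrase pairs.
    paraphrases = [("took _ into account", "considered _"),
                   ("in spite of", "despite")]

    def to_regex(pattern):
        """Turn a gapped phrase into a regex; '_' matches one token."""
        parts = [r"\w+" if tok == "_" else re.escape(tok)
                 for tok in pattern.split()]
        return re.compile(r"\b" + r"\s+".join(parts) + r"\b")

    def paraphrase_matches(hyp, ref):
        """Count pairs where one side occurs in hyp and the other in ref."""
        count = 0
        for a, b in paraphrases:
            if (to_regex(a).search(hyp) and to_regex(b).search(ref)) or \
               (to_regex(b).search(hyp) and to_regex(a).search(ref)):
                count += 1
        return count

    hyp = "the model took context into account"
    ref = "the model considered the context"
    print(paraphrase_matches(hyp, ref))  # -> 1
    ```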

Suhyun Park - One of the best experts on this subject based on the ideXlab platform.

  • Syntactic-level integration and display of multiple domains' S-100-based data for e-navigation
    Cluster Computing, 2017
    Co-Authors: Daewon Park, Suhyun Park
    Abstract:

    In the maritime field, interest in utilizing data from multiple and various domains to provide relevant, accurate and timely information, ensuring the safety and security of navigation at sea, has been growing. For example, discussion of the implementation of e-navigation, a new maritime service paradigm introduced by the International Maritime Organization, is ongoing. E-navigation enables and facilitates the on- and off-shore exchange, sharing and utilization of marine and marine-related domains' data in support of users' decision-making. For consistent exchange and sharing of such data in the e-navigation environment, the International Hydrographic Organization's S-100 standard has been adopted as the baseline of the common maritime data structure. S-100 provides common data models for consistently defining the data elements that represent data contents, and thereby supports syntactic-level data interoperability among S-100-based e-navigation systems. However, the current S-100 does not provide methods or models for the formal representation of data semantics or for semantic-level harmonization of data; current e-navigation efforts therefore focus on utilizing multiple domains' data at the syntactic level. To provide relevant information for various marine activities, e-navigation systems should be able to handle various domains' data and integrate them at the syntactic level. In this paper, we introduce a method by which multiple domains' S-100-based data can be integrated according to the characteristics of such data, and we present a method for the consistent display and integration of those data. These methods promise to greatly facilitate S-100-based e-navigation systems' handling of data from multiple and various domains.
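
    A minimal sketch of what syntactic-level integration of two domains' records into one common feature model might look like. The classes, field names and sample records below are invented for illustration and are not part of the S-100 specification:

    ```python
    # Sketch: normalize two domains' records into one common feature model so a
    # single display pipeline can render both. The schema is a toy stand-in for
    # S-100's feature/attribute structure, not the real specification.
    from dataclasses import dataclass

    @dataclass
    class Feature:
        domain: str          # producing domain, e.g. "hydrography", "weather"
        feature_type: str    # common type name shared across domains
        position: tuple      # (latitude, longitude)
        attributes: dict     # domain attributes carried through unchanged

    # Domain-specific records as they might arrive from two producers.
    hydro_record = {"kind": "buoy", "lat": 35.1, "lon": 129.0, "colour": "red"}
    weather_record = {"phenomenon": "fog", "position": (35.2, 128.9),
                      "visibility_m": 400}

    def from_hydro(rec):
        return Feature("hydrography", rec["kind"], (rec["lat"], rec["lon"]),
                       {k: v for k, v in rec.items()
                        if k not in ("kind", "lat", "lon")})

    def from_weather(rec):
        return Feature("weather", rec["phenomenon"], rec["position"],
                       {k: v for k, v in rec.items()
                        if k not in ("phenomenon", "position")})

    # One integrated list, displayable by a single renderer.
    for f in [from_hydro(hydro_record), from_weather(weather_record)]:
        print(f"{f.domain}: {f.feature_type} at {f.position} {f.attributes}")
    ```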