Jaroslaw Protasiewicz - One of the best experts on this subject based on the ideXlab platform.
- A Bidirectional Iterative Algorithm for Nested Named Entity Recognition
IEEE Access, 2020
Co-Authors: Slawomir Dadas, Jaroslaw Protasiewicz
Abstract: Nested named entity recognition (NER) is a special case of structured prediction in which annotated sequences can be contained inside each other. It is a challenging and significant problem in natural language processing. In this paper, we propose a novel framework for nested named entity recognition tasks. Our approach is based on a deep learning model which can be called in an iterative way, expanding the set of predicted entity mentions with each subsequent iteration. The proposed framework combines two such models trained to identify named entities in different directions: from general to specific (outside-in), and from specific to general (inside-out). The predictions of both models are then aggregated by a selection policy. We propose and evaluate several selection policies which can be used with our algorithm. Our method does not impose any restrictions on the length of entity mentions, number of entity classes, depth, or structure of the predicted output. The framework has been validated experimentally on four well-known nested named entity recognition datasets: GENIA, NNE, PolEval, and GermEval. The datasets differ in terms of domain (biomedical, news, mixed), language (English, Polish, German), and the structure of nesting (simple, complex). Through extensive tests, we prove that the approach we have proposed outperforms existing methods for nested named entity recognition.
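The iterate-then-aggregate idea described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the abstract does not name its selection policies, so `union` and `intersection` are stand-ins, and the hypothetical `toy_model` replaces the deep learning models with a fixed lookup.

```python
from typing import Callable, FrozenSet, List, Set, Tuple

Mention = Tuple[int, int, str]  # (start, end_exclusive, label)
Model = Callable[[List[str], FrozenSet[Mention]], Set[Mention]]


def run_iteratively(model: Model, tokens: List[str], max_iters: int = 10) -> Set[Mention]:
    """Call the model repeatedly, growing the set of mentions until nothing new appears."""
    found: Set[Mention] = set()
    for _ in range(max_iters):
        new = model(tokens, frozenset(found)) - found
        if not new:
            break
        found |= new
    return found


def select(outside_in: Set[Mention], inside_out: Set[Mention], policy: str = "union") -> Set[Mention]:
    """Aggregate the two prediction sets. Both policies are illustrative assumptions."""
    if policy == "union":
        return outside_in | inside_out
    if policy == "intersection":
        return outside_in & inside_out
    raise ValueError(f"unknown policy: {policy}")


def toy_model(ordered: List[Mention]) -> Model:
    """Hypothetical stand-in for a trained model: emits one known mention per call."""
    def model(tokens: List[str], found: FrozenSet[Mention]) -> Set[Mention]:
        for mention in ordered:
            if mention not in found:
                return {mention}
        return set()
    return model


TOKENS = ["University", "of", "Warsaw"]
ORG: Mention = (0, 3, "ORG")
LOC: Mention = (2, 3, "LOC")

outside_in = toy_model([ORG, LOC])  # general to specific
inside_out = toy_model([LOC])       # specific to general; this toy misses the outer ORG

a = run_iteratively(outside_in, TOKENS)
b = run_iteratively(inside_out, TOKENS)
print(sorted(select(a, b, "union")))
print(sorted(select(a, b, "intersection")))
```

In this toy setup a union policy favors recall and an intersection policy favors precision; the policies evaluated in the paper may differ.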
Jakub Piskorski - One of the best experts on this subject based on the ideXlab platform.
- Named-Entity Recognition for Polish with SProUT
Lecture Notes in Computer Science, 2005
Co-Authors: Jakub Piskorski
Abstract: Although considerable work on named-entity recognition exists for a few major languages, research on this topic in the context of Slavonic languages has been almost neglected. This paper presents a rule-based named-entity recognition system for Polish built on top of SProUT, a novel multi-lingual NLP platform. We pinpoint the encountered difficulties and present some promising evaluation results.
- Named-Entity Recognition for Polish with SProUT
Intelligent Media Technology for Communicative Intelligence (IMTCI), 2004
Co-Authors: Jakub Piskorski
Abstract: Although considerable work on named-entity recognition exists for a few major languages, research on this topic in the context of Slavonic languages has been almost neglected. This paper presents a rule-based named-entity recognition system for Polish built on top of SProUT, a novel multi-lingual NLP platform. We pinpoint the encountered difficulties and present some promising evaluation results.
Kashif Riaz - One of the best experts on this subject based on the ideXlab platform.
- Rule-Based Named Entity Recognition in Urdu
Meeting of the Association for Computational Linguistics, 2010
Co-Authors: Kashif Riaz
Abstract: Named entity recognition or extraction (NER) is an important task for automated text processing for industries and academia engaged in the field of language processing, intelligence gathering, and bioinformatics. In this paper we discuss the general problem of named entity recognition, and more specifically the challenges of NER in languages that lack language resources such as large annotated corpora. We specifically address the challenges for Urdu NER and differentiate it from other South Asian (Indic) languages. We discuss the differences between Hindi and Urdu and conclude that the NER computational models for Hindi cannot be applied to Urdu. A rule-based Urdu NER algorithm is presented that outperforms the models that use statistical learning.
Christopher D Manning - One of the best experts on this subject based on the ideXlab platform.
- Nested Named Entity Recognition
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Volume 1 (EMNLP '09), 2009
Co-Authors: Jenny Rose Finkel, Christopher D Manning
Abstract: Many named entities contain other named entities inside them. Despite this fact, the field of named entity recognition has almost entirely ignored nested named entity recognition, for technological rather than ideological reasons. In this paper, we present a new technique for recognizing nested named entities by using a discriminative constituency parser. To train the model, we transform each sentence into a tree, with constituents for each named entity (and no other syntactic structure). We present results on both newspaper and biomedical corpora which contain nested named entities. In three out of four sets of experiments, our model outperforms a standard semi-CRF on the more traditional top-level entities. At the same time, we improve the overall F-score by up to 30% over the flat model, which is unable to recover any nested entities.
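The sentence-to-tree transformation mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the general idea (one constituent per entity span, no other structure), not the paper's preprocessing: the `Node` class, the `ROOT` label, the half-open span convention, and the example sentence are all assumptions.

```python
from typing import List, Tuple

Entity = Tuple[int, int, str]  # (start, end_exclusive, label)


class Node:
    def __init__(self, label: str, start: int, end: int):
        self.label, self.start, self.end = label, start, end
        self.children: List["Node"] = []


def build_tree(tokens: List[str], entities: List[Entity]) -> Node:
    """Nest entity spans under a ROOT node; spans must not cross each other."""
    root = Node("ROOT", 0, len(tokens))
    stack = [root]
    # Outer spans sort before the spans they contain.
    for start, end, label in sorted(entities, key=lambda e: (e[0], -e[1])):
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span out of range: {(start, end)}")
        # Pop finished spans until the top of the stack contains this one.
        while not (stack[-1].start <= start and end <= stack[-1].end):
            if start < stack[-1].end:
                raise ValueError("crossing spans cannot form a tree")
            stack.pop()
        node = Node(label, start, end)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def render(node: Node, tokens: List[str]) -> str:
    """Write the tree in bracketed notation, with tokens as leaves."""
    parts, pos = [], node.start
    for child in node.children:
        parts.extend(tokens[pos:child.start])
        parts.append(render(child, tokens))
        pos = child.end
    parts.extend(tokens[pos:node.end])
    return "(" + node.label + " " + " ".join(parts) + ")"


tokens = ["the", "University", "of", "Warsaw", "library"]
entities = [(3, 4, "LOC"), (1, 4, "ORG")]
print(render(build_tree(tokens, entities), tokens))
# (ROOT the (ORG University of (LOC Warsaw)) library)
```

Because a tree cannot represent crossing brackets, the sketch rejects overlapping spans that are not properly nested.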
- Joint Parsing and Named Entity Recognition
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2009
Co-Authors: Jenny Rose Finkel, Christopher D Manning
Abstract: For many language technology applications, such as question answering, the overall system runs several independent processors over the data (such as a named entity recognizer, a coreference system, and a parser). This easily results in inconsistent annotations, which are harmful to the performance of the aggregate system. We begin to address this problem with a joint model of parsing and named entity recognition, based on a discriminative feature-based constituency parser. Our model produces a consistent output, where the named entity spans do not conflict with the phrasal spans of the parse tree. The joint representation also allows the information from each type of annotation to improve performance on the other, and, in experiments with the OntoNotes corpus, we found improvements of up to 1.36% absolute F1 for parsing, and up to 9.0% F1 for named entity recognition.
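To make the notion of conflicting spans concrete, the following Python sketch checks whether entity spans cross phrasal spans, which is the kind of inconsistency independent pipeline components can produce. It is an illustrative check under assumed conventions (half-open `(start, end)` token spans, hypothetical function names), not part of the joint model described in the abstract.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end_exclusive) over token positions


def crosses(a: Span, b: Span) -> bool:
    """True when the spans overlap without one containing the other."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


def find_conflicts(entity_spans: List[Span], constituent_spans: List[Span]) -> List[Tuple[Span, Span]]:
    """List every (entity, constituent) pair whose brackets cross."""
    return [(e, c) for e in entity_spans for c in constituent_spans if crosses(e, c)]


# Hypothetical parse of a five-token sentence, given as constituent spans.
constituents = [(0, 5), (0, 2), (2, 5), (3, 5)]

print(find_conflicts([(0, 2)], constituents))  # entity aligned with a phrase
print(find_conflicts([(1, 4)], constituents))  # entity cutting across phrases
```

An output is consistent in the abstract's sense when `find_conflicts` returns an empty list.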