Document Analysis

The Experts below are selected from a list of 47256 Experts worldwide ranked by ideXlab platform

Volker Märgner - One of the best experts on this subject based on the ideXlab platform.

  • Tools and Metrics for Document Analysis Systems Evaluation
    Handbook of Document Image Processing and Recognition, 2014
    Co-Authors: Volker Märgner, Haikal El Abed
    Abstract:

    Any development within a specific research field like Document Analysis and recognition comes along with the need for data and corresponding measurement devices and test equipment. This chapter introduces the basic issues of evaluation methods for different kinds of Document Analysis systems and modules, with a special emphasis on the tools and metrics available and used today. This chapter is organized as follows: After a general introduction in section “Introduction,” including definitions of terms used in Document Analysis system evaluation and overviews of evaluation processes, different evaluation metrics are discussed in section “Evaluation Metrics.” These metrics cover the different aspects of the Document Analysis Handbook as presented in Chaps. 2 (Document Creation, Image Acquisition and Document Quality)–8 (Text Segmentation for Document Recognition), from image-processing evaluation metrics to special metrics for selected applications, e.g., character/text recognition. In section “Evaluation Tools,” an overview of ground-truth file structure and a selection of available ground-truth tools are presented. Performance evaluation tools and competitions organized in recent years are also listed in section “Evaluation Tools.”

  • ICDAR - International Conference on Document Analysis and Recognition (ICDAR 2011) - Competitions Overview
    2011 International Conference on Document Analysis and Recognition, 2011
    Co-Authors: Haikal El Abed, Liu Wenyin, Volker Märgner
    Abstract:

    The great success of and high number of participants in pattern recognition related competitions in recent years show an important improvement of recognition and classification approaches. This success is inconceivable without the availability of huge datasets of real-world data. Within the scope of the 11th International Conference on Document Analysis and Recognition (ICDAR 2011), a call for competitions was initiated. The aim of the competitions is the performance evaluation of algorithms and methods for a particular task of Document Analysis and recognition. 22 different teams submitted proposals. The subjects of these proposals cover the research field of Document Analysis and recognition, from pre-processing through Document Analysis to text recognition and writer identification. 16 competitions received enough participants (we set the threshold at 3 systems) to present their evaluation at ICDAR 2011.

  • On benchmarking of Document Analysis systems
    Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997
    Co-Authors: Volker Märgner, P. Karcher, A.-k. Pawlowski
    Abstract:

    This paper is a contribution to the development of evaluation methods in general and to the evaluation of Document Analysis systems (DAS) in particular. We give a general concept of benchmarking DAS and present a method to benchmark single modules of a DAS, focusing on the definition and generation of ground truth (GT), especially for the image processing modules of a DAS. The proposed method describes how to build a database with specific ground truth for the output of each module of interest. Another, more general method to build a database for benchmarking is to use synthetic data.

E. Valveny - One of the best experts on this subject based on the ideXlab platform.

  • ICDAR - A Review of Shape Descriptors for Document Analysis
    Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), 2007
    Co-Authors: O.r. Terrades, S. Tabbone, E. Valveny
    Abstract:

    Shape descriptors play an important role in many Document Analysis applications. In this paper we review some of the shape descriptors proposed in recent years from a new point of view. We propose definitions of descriptor and primitive and introduce the notion of a feature extraction method. With these definitions, we propose a new classification of shape descriptors that groups them according to their properties, pointing out their strengths and weaknesses.

Haikal El Abed - One of the best experts on this subject based on the ideXlab platform.

  • Tools and Metrics for Document Analysis Systems Evaluation
    Handbook of Document Image Processing and Recognition, 2014
    Co-Authors: Volker Märgner, Haikal El Abed
    Abstract:

    Any development within a specific research field like Document Analysis and recognition comes along with the need for data and corresponding measurement devices and test equipment. This chapter introduces the basic issues of evaluation methods for different kinds of Document Analysis systems and modules, with a special emphasis on the tools and metrics available and used today. This chapter is organized as follows: After a general introduction in section “Introduction,” including definitions of terms used in Document Analysis system evaluation and overviews of evaluation processes, different evaluation metrics are discussed in section “Evaluation Metrics.” These metrics cover the different aspects of the Document Analysis Handbook as presented in Chaps. 2 (Document Creation, Image Acquisition and Document Quality)–8 (Text Segmentation for Document Recognition), from image-processing evaluation metrics to special metrics for selected applications, e.g., character/text recognition. In section “Evaluation Tools,” an overview of ground-truth file structure and a selection of available ground-truth tools are presented. Performance evaluation tools and competitions organized in recent years are also listed in section “Evaluation Tools.”

  • ICDAR - International Conference on Document Analysis and Recognition (ICDAR 2011) - Competitions Overview
    2011 International Conference on Document Analysis and Recognition, 2011
    Co-Authors: Haikal El Abed, Liu Wenyin, Volker Märgner
    Abstract:

    The great success of and high number of participants in pattern recognition related competitions in recent years show an important improvement of recognition and classification approaches. This success is inconceivable without the availability of huge datasets of real-world data. Within the scope of the 11th International Conference on Document Analysis and Recognition (ICDAR 2011), a call for competitions was initiated. The aim of the competitions is the performance evaluation of algorithms and methods for a particular task of Document Analysis and recognition. 22 different teams submitted proposals. The subjects of these proposals cover the research field of Document Analysis and recognition, from pre-processing through Document Analysis to text recognition and writer identification. 16 competitions received enough participants (we set the threshold at 3 systems) to present their evaluation at ICDAR 2011.

Andreas Dengel - One of the best experts on this subject based on the ideXlab platform.

  • Document Analysis Systems - Attentive Tasks: Process-Driven Document Analysis for Multichannel Documents
    2012 10th IAPR International Workshop on Document Analysis Systems, 2012
    Co-Authors: Kristin Stamm, Andreas Dengel
    Abstract:

    The increasing amount of email data poses new challenges for many companies, whose employees now have to deal with information overload while managing multiple communication channels at the same time, e.g., email, mail, and phone. Moreover, emails can contain attachments, i.e., files with additional information. Most existing approaches for reducing email processing time require significant domain-specific customization efforts to achieve good performance and lack attachment handling. We aim at providing a more domain-independent approach by integrating the process context and using the information expectations of a process to guide the Document Analysis (DA) schedule for emails and their attachments. We rely on the concepts of Attentive Tasks (ATs) and the Specialist Board (SB). ATs are templates that describe all relevant and expected information about a process currently waiting for input. The SB provides a machine-readable description of DA methods, so-called specialists, that extract all relevant information for further processes. We present our approach and demonstrate the benefits for a domain-specific application, i.e., a financial institution.

  • ODA-based modeling for Document Analysis
    2011
    Co-Authors: Rainer Bleisinger, Rainer Hoch, Andreas Dengel
    Abstract:

    This article proposes the Document model of a hybrid knowledge-based Document Analysis system for business letters. The model combines the requirements of an object-oriented representation of both Documents and the knowledge necessary for Analysis tasks, and is based on the ODA platform. Model-driven Document Analysis increases the flexibility of a system because several Analysis specialists can cooperate to assist each other and improve the results of Analysis. The inherent modularity of the system allows knowledge sources and integral constituents of the architecture to be reused for other Document classes such as forms or cheques.

  • Document Analysis Systems - On benchmarking of invoice Analysis systems
    Document Analysis Systems VII, 2006
    Co-Authors: Bertin Klein, Stefan Agne, Andreas Dengel
    Abstract:

    An approach is presented to guide the benchmarking of invoice Analysis systems, a specific, applied subclass of Document Analysis systems. The state of the art of benchmarking of Document Analysis systems is presented, based on the processing levels: Document Page Segmentation, Text Recognition, Document Classification, and Information Extraction. The restriction to invoices enables and requires a more purposeful, i.e., detailed, targeting of the benchmarking procedures (acquisition of ground truth data, system runs, comparison of data, condensation into meaningful numbers). Therefore the processing of invoices is dissected. The data structures involved are elicited and presented. They are provided as the building blocks of the actual benchmarking of invoice Analysis systems.
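
The benchmarking procedure named in the abstract above (ground truth data, system runs, comparison of data, condensation into meaningful numbers) can be illustrated with a minimal sketch. This is not the authors' method: the function name `benchmark_fields`, the invoice ids, the field names, and the exact-match comparison are all assumptions made for illustration.

```python
from collections import defaultdict


def benchmark_fields(ground_truth, system_output):
    """Compare extracted invoice fields against ground truth, per field.

    Both arguments map an invoice id to a dict of field name -> value.
    Returns field name -> fraction of invoices where the value matched.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for invoice_id, truth in ground_truth.items():
        predicted = system_output.get(invoice_id, {})
        for field, expected in truth.items():
            total[field] += 1
            # Exact match after trimming whitespace (an assumption); a real
            # benchmark would likely normalise dates and amounts per field.
            if str(predicted.get(field, "")).strip() == str(expected).strip():
                correct[field] += 1
    return {field: correct[field] / total[field] for field in total}


# Hypothetical data, for illustration only.
ground_truth = {
    "inv-1": {"invoice_number": "A-100", "total": "119.00"},
    "inv-2": {"invoice_number": "A-101", "total": "59.50"},
}
system_output = {
    "inv-1": {"invoice_number": "A-100", "total": "119.00"},
    "inv-2": {"invoice_number": "A-101", "total": "59.60"},
}
scores = benchmark_fields(ground_truth, system_output)
```

Here the "condensation into meaningful numbers" step is reduced to one match rate per field; a missing field in the system output counts as a mismatch.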

  • Document Analysis Systems - smartFIX: A Requirements-Driven System for Document Analysis and Understanding
    Lecture Notes in Computer Science, 2002
    Co-Authors: Andreas Dengel, Bertin Klein
    Abstract:

    Although the internet offers a widespread platform for information interchange, day-to-day work in large companies still means processing tens of thousands of printed Documents every day. This paper presents smartFIX, a Document Analysis and understanding system developed by the DFKI spin-off INSIDERS. It permits the processing of Documents ranging from fixed-format forms to unstructured letters of any format. Apart from the architecture, the main components, and the system characteristics, we also show some results from applying smartFIX to medical bills and prescriptions.

  • ICDAR - On the evaluation of Document Analysis components by recall, precision, and accuracy
    Proceedings of the Fifth International Conference on Document Analysis and Recognition. ICDAR '99 (Cat. No.PR00318), 1999
    Co-Authors: M. Junker, Rainer Hoch, Andreas Dengel
    Abstract:

    In Document Analysis, it is common to prove the usefulness of a component by an experimental evaluation. By applying the respective algorithms to a test sample, effectiveness measures such as recall, precision, and accuracy are computed. The goal of such an evaluation is two-fold: on the one hand, it shows that the absolute effectiveness of the algorithm is acceptable for practical use. On the other hand, the evaluation can prove that the algorithm has a better or worse effectiveness than another algorithm. We argue that experimental evaluation on relatively small test sets, as is very common in Document Analysis, has to be taken with extreme care from a statistical point of view. In fact, it is surprising how weak the statements derived from such evaluations are.
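
To make the point about small test sets concrete, here is a minimal sketch that computes recall, precision, and accuracy from confusion-matrix counts and attaches a confidence interval to one of them. The abstract does not say which statistical tools the paper uses; the Wilson score interval is a stand-in chosen for this illustration, and the function names and counts are hypothetical.

```python
import math


def effectiveness(tp, fp, fn, tn):
    """Return (recall, precision, accuracy) from confusion-matrix counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return recall, precision, accuracy


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    if trials == 0:
        return 0.0, 1.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: 45 of 50 relevant items found, on a small test set.
recall, precision, accuracy = effectiveness(tp=45, fp=5, fn=5, tn=45)
small_low, small_high = wilson_interval(45, 50)
# Same observed recall of 0.9, but on a test set 100 times larger.
large_low, large_high = wilson_interval(4500, 5000)
```

With 50 relevant items the interval around the observed recall of 0.9 is more than ten percentage points wide, while with 5000 it shrinks to under two, which is why a difference of a few points between two algorithms on a small sample says little.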

O.r. Terrades - One of the best experts on this subject based on the ideXlab platform.

  • ICDAR - A Review of Shape Descriptors for Document Analysis
    Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), 2007
    Co-Authors: O.r. Terrades, S. Tabbone, E. Valveny
    Abstract:

    Shape descriptors play an important role in many Document Analysis applications. In this paper we review some of the shape descriptors proposed in recent years from a new point of view. We propose definitions of descriptor and primitive and introduce the notion of a feature extraction method. With these definitions, we propose a new classification of shape descriptors that groups them according to their properties, pointing out their strengths and weaknesses.
