African Languages

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 19746 Experts worldwide ranked by ideXlab platform

Laura Martinus - One of the best experts on this subject based on the ideXlab platform.

  • Benchmarking Neural Machine Translation for Southern African Languages
    Meeting of the Association for Computational Linguistics, 2019
    Co-Authors: Jade Z. Abbott, Laura Martinus
    Abstract:

    Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to obtain and discover. As a result, the data and code for existing research have rarely been shared, meaning researchers struggle to reproduce reported results, and almost no publicly available benchmarks or leaderboards for African machine translation models exist. To start to address these problems, we trained neural machine translation models for a subset of Southern African languages on publicly available datasets. We provide the code for training the models and evaluate the models on a newly released evaluation set, with the aim of starting a leaderboard for Southern African languages and spurring future research in the field.

  • Benchmarking Neural Machine Translation for Southern African Languages
    arXiv: Computation and Language, 2019
    Co-Authors: Laura Martinus, Jade Z. Abbott
    Abstract:

    Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to obtain and discover. As a result, the data and code for existing research have rarely been shared. This has led to a struggle to reproduce reported results, and few publicly available benchmarks for African machine translation models exist. To start to address these problems, we trained neural machine translation models for 5 Southern African languages on publicly available datasets. Code is provided for training the models and evaluating them on a newly released evaluation set, with the aim of spurring future research in the field for Southern African languages.

  • WNLP@ACL - Benchmarking Neural Machine Translation for Southern African Languages
    2019
    Co-Authors: Jade Z. Abbott, Laura Martinus
    Abstract:

    Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to obtain and discover. As a result, the data and code for existing research have rarely been shared, meaning researchers struggle to reproduce reported results, and almost no publicly available benchmarks or leaderboards for African machine translation models exist. To start to address these problems, we trained neural machine translation models for a subset of Southern African languages on publicly available datasets. We provide the code for training the models and evaluate the models on a newly released evaluation set, with the aim of starting a leaderboard for Southern African languages and spurring future research in the field.

  • A Focus on Neural Machine Translation for African Languages.
    arXiv: Computation and Language, 2019
    Co-Authors: Laura Martinus, Jade Z. Abbott
    Abstract:

    African languages are numerous, complex and low-resourced. The datasets required for machine translation are difficult to discover, and existing research is hard to reproduce. Minimal attention has been given to machine translation for African languages, so there is scant research regarding the problems that arise when using machine translation techniques. To begin addressing these problems, we trained models to translate English to five of the official South African languages (Afrikaans, isiZulu, Northern Sotho, Setswana, Xitsonga), making use of modern neural machine translation techniques. The results obtained show the promise of using neural machine translation techniques for African languages. By providing reproducible, publicly available data, code and results, this research aims to provide a starting point for other researchers in African machine translation to compare to and build upon.

Jade Z. Abbott - One of the best experts on this subject based on the ideXlab platform.

  • Benchmarking Neural Machine Translation for Southern African Languages
    Meeting of the Association for Computational Linguistics, 2019
    Co-Authors: Jade Z. Abbott, Laura Martinus
    Abstract:

    Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to obtain and discover. As a result, the data and code for existing research have rarely been shared, meaning researchers struggle to reproduce reported results, and almost no publicly available benchmarks or leaderboards for African machine translation models exist. To start to address these problems, we trained neural machine translation models for a subset of Southern African languages on publicly available datasets. We provide the code for training the models and evaluate the models on a newly released evaluation set, with the aim of starting a leaderboard for Southern African languages and spurring future research in the field.

  • Benchmarking Neural Machine Translation for Southern African Languages
    arXiv: Computation and Language, 2019
    Co-Authors: Laura Martinus, Jade Z. Abbott
    Abstract:

    Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to obtain and discover. As a result, the data and code for existing research have rarely been shared. This has led to a struggle to reproduce reported results, and few publicly available benchmarks for African machine translation models exist. To start to address these problems, we trained neural machine translation models for 5 Southern African languages on publicly available datasets. Code is provided for training the models and evaluating them on a newly released evaluation set, with the aim of spurring future research in the field for Southern African languages.

  • WNLP@ACL - Benchmarking Neural Machine Translation for Southern African Languages
    2019
    Co-Authors: Jade Z. Abbott, Laura Martinus
    Abstract:

    Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to obtain and discover. As a result, the data and code for existing research have rarely been shared, meaning researchers struggle to reproduce reported results, and almost no publicly available benchmarks or leaderboards for African machine translation models exist. To start to address these problems, we trained neural machine translation models for a subset of Southern African languages on publicly available datasets. We provide the code for training the models and evaluate the models on a newly released evaluation set, with the aim of starting a leaderboard for Southern African languages and spurring future research in the field.

  • A Focus on Neural Machine Translation for African Languages.
    arXiv: Computation and Language, 2019
    Co-Authors: Laura Martinus, Jade Z. Abbott
    Abstract:

    African languages are numerous, complex and low-resourced. The datasets required for machine translation are difficult to discover, and existing research is hard to reproduce. Minimal attention has been given to machine translation for African languages, so there is scant research regarding the problems that arise when using machine translation techniques. To begin addressing these problems, we trained models to translate English to five of the official South African languages (Afrikaans, isiZulu, Northern Sotho, Setswana, Xitsonga), making use of modern neural machine translation techniques. The results obtained show the promise of using neural machine translation techniques for African languages. By providing reproducible, publicly available data, code and results, this research aims to provide a starting point for other researchers in African machine translation to compare to and build upon.

Harouna Naroua - One of the best experts on this subject based on the ideXlab platform.

  • Evaluation of Virtual Keyboards for West-African Languages
    2017
    Co-Authors: Chantal Enguehard, Harouna Naroua
    Abstract:

    West African languages are written with alphabets that comprise non-classical Latin characters. It is possible to design virtual keyboards which allow the writing of such special characters with a combination of keys. During the last decade, many different virtual keyboards have been created, without any standardization to fix the correspondence between each character and the keys to press to obtain it. We define a grid to evaluate such keyboards and apply it to five virtual keyboards in relation to the five main languages of Niger (Fulfulde, Hausa, Kanuri, Songhai-Zarma, Tamashek), Bambara and Soninke from Mali and Dyoula from Burkina Faso. We conclude that the African LLACAN keyboard should be recommended in Niger because it covers all the characters used in the alphabets of the main languages of this country, it produces valid Unicode codes and it minimizes the number of keys to be pressed.
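
The three criteria named in the abstract above (alphabet coverage, valid Unicode output, and few key presses) can be illustrated with a small Python sketch. This is not the authors' evaluation grid: the function `evaluate_layout`, the report fields, the sample alphabet, and the `AltGr` key combinations are all hypothetical, and "valid Unicode" is interpreted here, as an assumption, to mean that every produced character is an assigned, non-private-use code point.

```python
import unicodedata
from typing import Dict, Tuple


def evaluate_layout(alphabet: str, layout: Dict[str, Tuple[str, ...]]) -> dict:
    """Score a virtual keyboard layout against an alphabet (hypothetical grid).

    `layout` maps each character the keyboard can produce to the sequence
    of keys that must be pressed to obtain it.
    """
    chars = list(alphabet)
    covered = [c for c in chars if c in layout]
    missing = [c for c in chars if c not in layout]
    # Assumption: a character is "invalid" if it is unassigned (Cn),
    # private-use (Co), or a surrogate (Cs) in the Unicode database.
    invalid = [c for c in covered
               if unicodedata.category(c) in ("Cn", "Co", "Cs")]
    avg_keys = (sum(len(layout[c]) for c in covered) / len(covered)
                if covered else 0.0)
    return {
        "coverage": len(covered) / len(chars) if chars else 0.0,
        "missing": missing,
        "invalid": invalid,
        "avg_keys": avg_keys,
    }


# Illustrative subset of letters, including hooked letters used in Hausa.
sample_alphabet = "abɓdɗkƙ"
# Hypothetical layout: hooked letters need AltGr plus the base key; ƙ is absent.
sample_layout = {
    "a": ("a",), "b": ("b",), "ɓ": ("AltGr", "b"),
    "d": ("d",), "ɗ": ("AltGr", "d"), "k": ("k",),
}
report = evaluate_layout(sample_alphabet, sample_layout)
print(report["missing"], round(report["avg_keys"], 2))
```

A layout that leaves letters in `missing`, or that emits private-use code points, would score poorly on the first two criteria regardless of how few keys it needs.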

  • On the Computerization of African Languages
    American Journal of Applied Sciences, 2016
    Co-Authors: Harouna Naroua, Lawaly Salifou
    Abstract:

    In this article, a computer tool for processing African languages has been designed. It is intended to be a contribution to the automatic processing of African languages. The current study is focused on West African languages, where five main languages from Niger, two from Mali and one from Burkina Faso are considered. After a brief review of African language processing, we designed a tool which uses minimal resources and operates essentially on a dictionary and the characteristics of the language alphabet. The dictionary is represented using a trie data structure. For the sake of application, the designed tool operates as a spell checker. To detect and correct spelling errors, the edit distance and the specificities of the language are used. Although African languages do not have processing tools of their own, it was shown that existing tools for computerized languages can be adapted to them efficiently. To extend the designed tool to any African language, we only need to provide an appropriate dictionary and alphabet.
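
The combination described in the abstract above, a dictionary stored in a trie and spelling correction by edit distance, can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the tool from the article: the class names, the `max_distance` parameter, and the four sample Hausa words are chosen here for demonstration, the distance used is plain Levenshtein distance, and the language-specific rules the abstract mentions are not modeled.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None  # set when a dictionary word ends here


class SpellChecker:
    """Dictionary stored as a trie; suggestions ranked by edit distance."""

    def __init__(self, words: List[str]) -> None:
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.word = word

    def contains(self, word: str) -> bool:
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.word is not None

    def suggest(self, word: str, max_distance: int = 1) -> List[Tuple[str, int]]:
        """Return (dictionary word, distance) pairs within max_distance."""
        first_row = list(range(len(word) + 1))
        results: List[Tuple[str, int]] = []
        for ch, child in self.root.children.items():
            self._search(child, ch, word, first_row, results, max_distance)
        return sorted(results, key=lambda pair: (pair[1], pair[0]))

    def _search(self, node, ch, word, prev_row, results, max_distance) -> None:
        # Compute one row of the Levenshtein table for the trie path so far.
        row = [prev_row[0] + 1]
        for col in range(1, len(word) + 1):
            insert_cost = row[col - 1] + 1
            delete_cost = prev_row[col] + 1
            replace_cost = prev_row[col - 1] + (word[col - 1] != ch)
            row.append(min(insert_cost, delete_cost, replace_cost))
        if node.word is not None and row[-1] <= max_distance:
            results.append((node.word, row[-1]))
        # Prune: descend only if some prefix alignment is still within budget.
        if min(row) <= max_distance:
            for next_ch, child in node.children.items():
                self._search(child, next_ch, word, row, results, max_distance)


# Sample words for illustration only (Hausa, with hooked letters).
checker = SpellChecker(["gida", "ruwa", "ƙasa", "ɓera"])
print(checker.contains("gida"))   # True
print(checker.suggest("kasa"))    # [('ƙasa', 1)]
```

Because the trie shares prefixes between words, each row of the distance table is computed once per prefix rather than once per dictionary word. Swapping in another language only requires a different word list, which matches the abstract's remark that extending the tool needs an appropriate dictionary and alphabet.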

  • LREC - Evaluation of Virtual Keyboards for West-African Languages
    2008
    Co-Authors: Chantal Enguehard, Harouna Naroua
    Abstract:

    West African languages are written with alphabets that comprise non-classical Latin characters. It is possible to design virtual keyboards which allow the writing of such special characters with a combination of keys. During the last decade, many different virtual keyboards have been created, without any standardization to fix the correspondence between each character and the keys to press to obtain it. We define a grid to evaluate such keyboards and apply it to five virtual keyboards in relation to the five main languages of Niger (Fulfulde, Hausa, Kanuri, Songhai-Zarma, Tamashek), Bambara and Soninke from Mali and Dyoula from Burkina Faso. We conclude that the African LLACAN keyboard should be recommended in Niger because it covers all the characters used in the alphabets of the main languages of this country, it produces valid Unicode codes and it minimizes the number of keys to be pressed.

Chantal Enguehard - One of the best experts on this subject based on the ideXlab platform.

  • Evaluation of Virtual Keyboards for West-African Languages
    2017
    Co-Authors: Chantal Enguehard, Harouna Naroua
    Abstract:

    West African languages are written with alphabets that comprise non-classical Latin characters. It is possible to design virtual keyboards which allow the writing of such special characters with a combination of keys. During the last decade, many different virtual keyboards have been created, without any standardization to fix the correspondence between each character and the keys to press to obtain it. We define a grid to evaluate such keyboards and apply it to five virtual keyboards in relation to the five main languages of Niger (Fulfulde, Hausa, Kanuri, Songhai-Zarma, Tamashek), Bambara and Soninke from Mali and Dyoula from Burkina Faso. We conclude that the African LLACAN keyboard should be recommended in Niger because it covers all the characters used in the alphabets of the main languages of this country, it produces valid Unicode codes and it minimizes the number of keys to be pressed.

  • LREC - Evaluation of Virtual Keyboards for West-African Languages
    2008
    Co-Authors: Chantal Enguehard, Harouna Naroua
    Abstract:

    West African languages are written with alphabets that comprise non-classical Latin characters. It is possible to design virtual keyboards which allow the writing of such special characters with a combination of keys. During the last decade, many different virtual keyboards have been created, without any standardization to fix the correspondence between each character and the keys to press to obtain it. We define a grid to evaluate such keyboards and apply it to five virtual keyboards in relation to the five main languages of Niger (Fulfulde, Hausa, Kanuri, Songhai-Zarma, Tamashek), Bambara and Soninke from Mali and Dyoula from Burkina Faso. We conclude that the African LLACAN keyboard should be recommended in Niger because it covers all the characters used in the alphabets of the main languages of this country, it produces valid Unicode codes and it minimizes the number of keys to be pressed.

Malte Zimmermann - One of the best experts on this subject based on the ideXlab platform.

  • Information structure in African Languages: corpora and tools
    Language Resources and Evaluation, 2011
    Co-Authors: Christian Chiarcos, Ines Fiedler, Mira Grubic, Katharina Hartmann, Julia Ritz, Anne Schwarz, Amir Zeldes, Malte Zimmermann
    Abstract:

    In this paper, we describe tools and resources for the study of African languages developed at the Collaborative Research Centre 632 “Information Structure”. These include deeply annotated data collections of 25 sub-Saharan languages that are described together with their annotation scheme, as well as the corpus tool ANNIS, which provides unified access to a broad variety of annotations created with a range of different tools. With the application of ANNIS to several African data collections, we illustrate its suitability for the purpose of language documentation, distributed access, and the creation of data archives.

  • Subject focus in West African Languages
    2010
    Co-Authors: Ines Fiedler, Katharina Hartmann, Anne Schwarz, Brigitte Reineke, Malte Zimmermann
    Abstract:

    In chapter 10 'Subject Focus in West African Languages', Ines Fiedler, Katharina Hartmann, Brigitte Reineke, Anne Schwarz, and Malte Zimmermann investigate the peculiarities of subject focus marking in three West African language groups. After a discussion of various strategies of focus realization, it is shown that most languages in the sample exhibit a subject/non-subject asymmetry with respect to focus marking: while focus on non-subjects can often go unmarked, subject focus must always be marked. The grammatical ways of subject focus marking vary widely across the languages under discussion. Strategies used include syntactic, morphological and prosodic focus marking, as well as the reorganization of the entire clause into a thetic statement. It is argued that the special status of focused subjects follows from the fact that the default interpretation of subjects is a topic interpretation. In order to avoid this default reading, a focused subject must be marked as such.