Native Language

The experts below were selected from a list of 360 experts worldwide, ranked by the ideXlab platform.

Andrea Lassmann - One of the best experts on this subject based on the ideXlab platform.

  • The Causal Impact of Common Native Language on International Trade: Evidence from a Spatial Regression Discontinuity Design
    The Economic Journal, 2015
    Co-Authors: Peter Egger, Andrea Lassmann
    Abstract:

    This paper studies the effect of sharing a common native language on international trade. Switzerland hosts three major native language groups which adjoin countries sharing the same native majority languages. In regions close to the internal language border the alternate major language is taught early on in school and not only understood but spoken by the residents. This setting allows for an assessment of the impact of common native rather than spoken language on transaction-level imports from neighbouring countries. Our findings point to an effect of common native language on extensive rather than on intensive margins of trade.

  • The Causal Impact of Common Native Language on International Trade: Evidence from a Spatial Regression Discontinuity Design
    Social Science Research Network, 2015
    Co-Authors: Peter Egger, Andrea Lassmann
    Abstract:

    This article studies the effect of sharing a common native language (CNL) on international trade. Switzerland hosts three major native language groups which adjoin countries sharing the same native majority languages. In regions close to the internal language border the alternate major language is taught early on in school and not only understood but spoken by the residents. This setting allows for an assessment of the impact of common native rather than spoken language on transaction-level imports from neighbouring countries. Our findings point to an effect of CNL on extensive rather than on intensive margins of trade.

  • The Causal Impact of Common Native Language on International Trade: Evidence from a Spatial Regression Discontinuity Design
    Research Papers in Economics, 2013
    Co-Authors: Peter Egger, Andrea Lassmann
    Abstract:

    This paper studies the causal effect of sharing a common native language on international trade. Switzerland is a multilingual country that hosts four official language groups, of which three are major (French, German, and Italian). These groups of native language speakers are geographically separated, with the corresponding regions bordering countries which share a majority of speakers of the same native language. All three main languages are understood and spoken by most Swiss citizens, especially those residing close to internal language borders in Switzerland. This unique setting allows for an assessment of the impact of common native (rather than spoken) language as a cultural aspect of language on trade from within country-pairs. We do so by exploiting the discontinuity in various international bilateral trade outcomes, based on Swiss transaction-level data, at historical language borders within Switzerland. The effect on various margins of imports is positive and significant. The results suggest that, on average, a common native language between regions biases the regional structure of the value of international imports towards them by 18 percentage points, and that of the number of import transactions by 20 percentage points. In addition, regions import 102 additional products from a neighboring country sharing a common native language compared to a different native language exporter. This effect is considerably lower than the overall estimate (using aggregate bilateral trade and no regression discontinuity design) of common official language on Swiss international imports in the same sample. The latter subsumes both the effect of common spoken language as a communication factor and of confounding economic and institutional factors, and is quantitatively well in line with the common official (spoken or native) language coefficient in many gravity model estimates of international trade.
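
    The border-discontinuity logic above lends itself to a compact illustration. Below is a minimal sketch of a local linear regression discontinuity estimate, assuming hypothetical transaction data with a signed border distance `dist` (negative on the side sharing the exporter's native language) and a log import outcome `y`; the column names, bandwidth, and specification are illustrative, not the authors' actual estimator.

        # Sketch of an RDD estimate of the jump in imports at the language border.
        # Assumes a pandas DataFrame with hypothetical columns `dist` and `y`.
        import pandas as pd
        import statsmodels.formula.api as smf

        def rdd_jump(df: pd.DataFrame, bandwidth: float = 25.0) -> float:
            """Local linear RDD: separate slopes on each side of dist == 0;
            the coefficient on `treated` is the discontinuity (the CNL effect)."""
            local = df[df["dist"].abs() <= bandwidth].copy()
            local["treated"] = (local["dist"] < 0).astype(int)  # same-native-language side
            fit = smf.ols("y ~ treated + dist + dist:treated", data=local).fit(cov_type="HC1")
            return fit.params["treated"]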

Peter Egger - One of the best experts on this subject based on the ideXlab platform.

  • The Causal Impact of Common Native Language on International Trade: Evidence from a Spatial Regression Discontinuity Design
    The Economic Journal, 2015
    Co-Authors: Peter Egger, Andrea Lassmann
    Abstract:

    This paper studies the effect of sharing a common native language on international trade. Switzerland hosts three major native language groups which adjoin countries sharing the same native majority languages. In regions close to the internal language border the alternate major language is taught early on in school and not only understood but spoken by the residents. This setting allows for an assessment of the impact of common native rather than spoken language on transaction-level imports from neighbouring countries. Our findings point to an effect of common native language on extensive rather than on intensive margins of trade.

  • The Causal Impact of Common Native Language on International Trade: Evidence from a Spatial Regression Discontinuity Design
    Social Science Research Network, 2015
    Co-Authors: Peter Egger, Andrea Lassmann
    Abstract:

    This article studies the effect of sharing a common native language (CNL) on international trade. Switzerland hosts three major native language groups which adjoin countries sharing the same native majority languages. In regions close to the internal language border the alternate major language is taught early on in school and not only understood but spoken by the residents. This setting allows for an assessment of the impact of common native rather than spoken language on transaction-level imports from neighbouring countries. Our findings point to an effect of CNL on extensive rather than on intensive margins of trade.

  • The Causal Impact of Common Native Language on International Trade: Evidence from a Spatial Regression Discontinuity Design
    Research Papers in Economics, 2013
    Co-Authors: Peter Egger, Andrea Lassmann
    Abstract:

    This paper studies the causal effect of sharing a common native language on international trade. Switzerland is a multilingual country that hosts four official language groups, of which three are major (French, German, and Italian). These groups of native language speakers are geographically separated, with the corresponding regions bordering countries which share a majority of speakers of the same native language. All three main languages are understood and spoken by most Swiss citizens, especially those residing close to internal language borders in Switzerland. This unique setting allows for an assessment of the impact of common native (rather than spoken) language as a cultural aspect of language on trade from within country-pairs. We do so by exploiting the discontinuity in various international bilateral trade outcomes, based on Swiss transaction-level data, at historical language borders within Switzerland. The effect on various margins of imports is positive and significant. The results suggest that, on average, a common native language between regions biases the regional structure of the value of international imports towards them by 18 percentage points, and that of the number of import transactions by 20 percentage points. In addition, regions import 102 additional products from a neighboring country sharing a common native language compared to a different native language exporter. This effect is considerably lower than the overall estimate (using aggregate bilateral trade and no regression discontinuity design) of common official language on Swiss international imports in the same sample. The latter subsumes both the effect of common spoken language as a communication factor and of confounding economic and institutional factors, and is quantitatively well in line with the common official (spoken or native) language coefficient in many gravity model estimates of international trade.

Mark Dras - One of the best experts on this subject based on the ideXlab platform.

  • Native Language Identification with Classifier Stacking and Ensembles
    Computational Linguistics, 2018
    Co-Authors: Shervin Malmasi, Mark Dras
    Abstract:

    Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes.
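
    The stacking architecture the abstract refers to can be sketched briefly. The base learners, feature views, and meta-classifier below are plausible choices for NLI, not the exact configuration evaluated in the paper.

        # Sketch of classifier stacking for NLI: base learners over different
        # feature views, with a meta-classifier trained on their cross-validated
        # predictions (scikit-learn's StackingClassifier).
        from sklearn.ensemble import StackingClassifier
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        base_learners = [
            ("word_svm", make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())),
            ("char_svm", make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(2, 4)), LinearSVC())),
            ("word_nb", make_pipeline(TfidfVectorizer(), MultinomialNB())),
        ]
        stack = StackingClassifier(
            estimators=base_learners,
            final_estimator=LogisticRegression(max_iter=1000),
            cv=5,
        )
        # stack.fit(train_texts, train_l1_labels); stack.predict(test_texts)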

  • Unsupervised Text Segmentation Based on Native Language Characteristics
    Meeting of the Association for Computational Linguistics, 2017
    Co-Authors: Shervin Malmasi, Mark Dras, Mark Johnson, Magdalena Wolska
    Abstract:

    Most work on segmenting text does so on the basis of topic changes, but it can be of interest to segment by other, stylistically expressed characteristics, such as change of authorship or native language. We propose a Bayesian unsupervised text segmentation approach to the latter. While baseline models achieve essentially random segmentation on our task, indicating its difficulty, a Bayesian model that incorporates appropriately compact language models and alternating asymmetric priors can achieve scores on the standard metrics around halfway to perfect segmentation.
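
    The scoring idea behind such a model can be illustrated on a toy case: pick the single boundary that maximizes the Dirichlet-multinomial marginal likelihood of the two resulting segments under compact unigram language models. The paper's model (multiple boundaries, alternating asymmetric priors) is richer; this sketch only shows the Bayesian scoring step.

        # Toy Bayesian segmentation: score each candidate boundary by the
        # Dirichlet-multinomial marginal likelihood of the two segments.
        from collections import Counter
        from math import lgamma

        def dm_log_marginal(tokens, vocab, alpha=0.1):
            """Log marginal likelihood of a segment under a symmetric
            Dirichlet(alpha) prior over a unigram language model."""
            counts, n, v = Counter(tokens), len(tokens), len(vocab)
            score = lgamma(v * alpha) - lgamma(v * alpha + n)
            for w in vocab:
                score += lgamma(alpha + counts[w]) - lgamma(alpha)
            return score

        def best_boundary(tokens):
            vocab = set(tokens)
            return max(range(1, len(tokens)),
                       key=lambda i: dm_log_marginal(tokens[:i], vocab)
                                   + dm_log_marginal(tokens[i:], vocab))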

  • Multilingual Native Language Identification
    Natural Language Engineering, 2017
    Co-Authors: Shervin Malmasi, Mark Dras
    Abstract:

    We present the first comprehensive study of Native Language Identification (NLI) applied to text written in languages other than English, using data from six languages. NLI is the task of predicting an author’s first language using only their writings in a second language, with applications in Second Language Acquisition and forensic linguistics. Most research to date has focused on English, but there is a need to apply NLI to other languages, not only to gauge its applicability but also to aid in teaching research for other emerging languages. With this goal, we identify six typologically very different sources of non-English second language data and conduct six experiments using a set of commonly used features. Our first two experiments evaluate our features and corpora, showing that the features perform well and at similar rates across languages. The third experiment compares non-native and native control data, showing that they can be discerned with 95 per cent accuracy. Our fourth experiment provides a cross-linguistic assessment of how the degree of syntactic data encoded in part-of-speech tags affects their efficiency as classification features, finding that most differences between first-language groups lie in the ordering of the most basic word categories. We also tackle two questions that have not previously been addressed for NLI. Other work in NLI has shown that ensembles of classifiers over feature types work well, and in our final experiment we use such an oracle classifier to derive an upper limit for classification accuracy with our feature set. We also present an analysis examining feature diversity, aiming to estimate the degree of overlap and complementarity between our chosen features employing an association measure for binary data. Finally, we conclude with a general discussion and outline directions for future work.
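
    The abstract does not name its association measure for binary data; a common choice for this kind of overlap analysis (assumed here, not confirmed by the abstract) is Yule's Q over the per-document correctness of classifiers trained on two different feature types.

        # Yule's Q between the correctness vectors of two classifiers:
        # +1 means the feature types err on the same texts (high overlap),
        # -1 means fully complementary errors.
        import numpy as np

        def yules_q(correct_a: np.ndarray, correct_b: np.ndarray) -> float:
            # correct_a, correct_b: boolean arrays over the evaluation set
            a = np.sum(correct_a & correct_b)    # both right
            b = np.sum(correct_a & ~correct_b)   # only A right
            c = np.sum(~correct_a & correct_b)   # only B right
            d = np.sum(~correct_a & ~correct_b)  # both wrong
            den = a * d + b * c
            return float(a * d - b * c) / float(den) if den else 0.0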

  • Oracle and Human Baselines for Native Language Identification
    Workshop on Innovative Use of NLP for Building Educational Applications, 2015
    Co-Authors: Shervin Malmasi, Joel Tetreault, Mark Dras
    Abstract:

    We examine different ensemble methods, including an oracle, to estimate the upper limit of classification accuracy for Native Language Identification (NLI). The oracle outperforms state-of-the-art systems by over 10%, and the results indicate that for many misclassified texts the correct class label receives a significant portion of the ensemble votes, often being the runner-up. We also present a pilot study of human performance for NLI, the first such experiment. While some participants achieved modest results on our simplified setup with 5 L1s, they did not outperform our NLI system, and this performance gap is likely to widen on the standard NLI setup.
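
    The oracle itself is simple to state: a text counts as correctly classified if any ensemble member predicts the true L1, which bounds what any vote-combination scheme over those members could achieve. A minimal sketch:

        # Oracle upper bound for an ensemble of NLI classifiers.
        import numpy as np

        def oracle_accuracy(member_predictions: np.ndarray, gold: np.ndarray) -> float:
            """member_predictions: (n_classifiers, n_texts) array of predicted labels;
            gold: (n_texts,) array of true L1 labels."""
            hit = (member_predictions == gold[None, :]).any(axis=0)
            return float(hit.mean())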

Shervin Malmasi - One of the best experts on this subject based on the ideXlab platform.

  • Portuguese Native Language Identification
    Processing of the Portuguese Language, 2018
    Co-Authors: Shervin Malmasi, Marcos Zampieri
    Abstract:

    This study presents the first Native Language Identification (NLI) study for L2 Portuguese. We used a subset of the NLI-PT dataset, containing texts written by speakers of five different native languages: Chinese, English, German, Italian, and Spanish. We explore the linguistic annotations available in NLI-PT to extract a range of (morpho-)syntactic features and apply NLI classification methods to predict the native language of the authors. The best results were obtained using an ensemble combination of the features, achieving 54.1% accuracy.

  • Native Language Identification with Classifier Stacking and Ensembles
    Computational Linguistics, 2018
    Co-Authors: Shervin Malmasi, Mark Dras
    Abstract:

    Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes.

  • A Portuguese Native Language Identification Dataset
    Workshop on Innovative Use of NLP for Building Educational Applications, 2018
    Co-Authors: Iria Gayo, Marcos Zampieri, Shervin Malmasi
    Abstract:

    In this paper we present NLI-PT, the first Portuguese dataset compiled for Native Language Identification (NLI), the task of identifying an author’s first language based on their second language writing. The dataset includes 1,868 student essays written by learners of European Portuguese, native speakers of the following L1s: Chinese, English, Spanish, German, Russian, French, Japanese, Italian, Dutch, Tetum, Arabic, Polish, Korean, Romanian, and Swedish. NLI-PT includes the original student text and four different types of annotation: POS, fine-grained POS, constituency parses, and dependency parses. NLI-PT can be used not only in NLI but also in research on several topics in the field of Second Language Acquisition and educational NLP. We discuss possible applications of this dataset and present the results obtained for the first lexical baseline system for Portuguese NLI.
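
    A lexical baseline of the kind reported in the paper can be sketched as word n-gram features with a linear classifier; the exact feature settings and evaluation protocol of the published baseline may differ.

        # Sketch of a lexical NLI baseline for NLI-PT-style data.
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import LinearSVC

        baseline = make_pipeline(CountVectorizer(ngram_range=(1, 2), min_df=2), LinearSVC())
        # essays: list of learner texts; l1_labels: their native languages
        # scores = cross_val_score(baseline, essays, l1_labels, cv=10)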

  • Unsupervised Text Segmentation Based on Native Language Characteristics
    Meeting of the Association for Computational Linguistics, 2017
    Co-Authors: Shervin Malmasi, Mark Dras, Mark Johnson, Magdalena Wolska
    Abstract:

    Most work on segmenting text does so on the basis of topic changes, but it can be of interest to segment by other, stylistically expressed characteristics, such as change of authorship or native language. We propose a Bayesian unsupervised text segmentation approach to the latter. While baseline models achieve essentially random segmentation on our task, indicating its difficulty, a Bayesian model that incorporates appropriately compact language models and alternating asymmetric priors can achieve scores on the standard metrics around halfway to perfect segmentation.

Boris Katz - One of the best experts on this subject based on the ideXlab platform.

  • Predicting Native Language from Gaze
    Meeting of the Association for Computational Linguistics, 2017
    Co-Authors: Yevgeni Berzak, Chie Nakamura, Suzanne Flynn, Boris Katz
    Abstract:

    In an embodiment, a method includes presenting, on a display, sample text in a given language to a user. The method further includes recording eye fixation times for each word of the sample text and recording saccade times between each pair of fixations. The method further includes comparing features of the user's gaze pattern to features of the gaze patterns of a plurality of training readers, each of whom has a known native language. The method further generates a probability of at least one estimated native language of the user based on the results of the comparison.

  • Predicting Native Language from Gaze
    arXiv: Computation and Language, 2017
    Co-Authors: Yevgeni Berzak, Chie Nakamura, Suzanne Flynn, Boris Katz
    Abstract:

    A fundamental question in language learning concerns the role of a speaker's first language in second language acquisition. We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. Using this methodology, we demonstrate for the first time that the native language of English learners can be predicted from their gaze fixations when reading English. We provide analysis of classifier uncertainty and learned features, which indicates that differences in English reading are likely to be rooted in linguistic divergences across native languages. The presented framework complements production studies and offers new ground for advancing research on multilingualism.
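
    A minimal sketch of the prediction setup, assuming each reader is summarized by a fixed-length vector of gaze features (mean fixation duration, saccade length, and the like); the feature representation and classifier are illustrative assumptions, not the authors' exact pipeline.

        # Predicting a reader's L1 from aggregate gaze features.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        def l1_from_gaze_accuracy(X: np.ndarray, y: np.ndarray) -> float:
            """X: (n_readers, n_gaze_features); y: native-language labels."""
            clf = LogisticRegression(max_iter=1000)
            return float(cross_val_score(clf, X, y, cv=5).mean())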

  • Reconstructing Native Language Typology from Foreign Language Usage
    arXiv: Computation and Language, 2014
    Co-Authors: Yevgeni Berzak, Roi Reichart, Boris Katz
    Abstract:

    Linguists and psychologists have long been studying cross-linguistic transfer, the influence of native language properties on linguistic performance in a foreign language. In this work we provide empirical evidence for this process in the form of a strong correlation between language similarities derived from structural features in English as a Second Language (ESL) texts and equivalent similarities obtained from the typological features of the native languages. We leverage this finding to recover native language typological similarity structure directly from ESL text, and perform prediction of typological features in an unsupervised fashion with respect to the target languages. Our method achieves 72.2% accuracy on the typology prediction task, a result that is highly competitive with equivalent methods that rely on typological resources.
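
    The reconstruction step can be sketched as follows, under the assumption that each native language is represented by a vector of structural features aggregated from its speakers' ESL texts; the nearest-neighbour prediction rule is a simplification for illustration, not the authors' full method.

        # Recover typology from ESL-derived language similarities: predict each
        # language's typological features from its most similar other language.
        import numpy as np
        from sklearn.metrics.pairwise import cosine_similarity

        def predict_typology(esl_vectors: np.ndarray, typology: np.ndarray) -> np.ndarray:
            """esl_vectors: (n_languages, n_esl_features) from learners' English;
            typology: (n_languages, n_binary_typological_features)."""
            sim = cosine_similarity(esl_vectors)
            np.fill_diagonal(sim, -np.inf)   # exclude the language itself
            nearest = sim.argmax(axis=1)     # most ESL-similar other language
            return typology[nearest]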

  • Reconstructing Native Language Typology from Foreign Language Usage
    Conference on Computational Natural Language Learning, 2014
    Co-Authors: Yevgeni Berzak, Roi Reichart, Boris Katz
    Abstract:

    This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF - 1231216.