Punjabi

14,000,000 Leading Edge Experts on the ideXlab platform

The Experts below are selected from a list of 14199 Experts worldwide ranked by ideXlab platform

Gurpreet Singh Lehal - One of the best experts on this subject based on the ideXlab platform.

  • automatic text summarization system for Punjabi language
    Journal of Emerging Technologies in Web Intelligence, 2013
    Co-Authors: Vishal Gupta, Gurpreet Singh Lehal
    Abstract:

    This paper presents a single-document, multi-news extractive summarizer for Punjabi. Although considerable research addresses multi-document news summarization, no prior work on single-document multi-news summarization was found in the literature for any language. This is the first such system for Punjabi, and it is available online at: http://pts.learnPunjabi.org/. Punjabi, the official language of the Indian state of Punjab, is an under-resourced language, so several linguistic resources were developed for the first time as part of this project: a Punjabi noun morphological analyser, a Punjabi stemmer, Punjabi named entity recognition, Punjabi keyword identification, and normalization of Punjabi nouns. A Punjabi document (such as a single page of a Punjabi e-newspaper) can contain hundreds of news items of varying length. Based on a compression ratio selected by the user, the system extracts the headline of each news item, the lines immediately following the headlines, and other lines according to their importance. Sentence selection is based on statistical and linguistic features. The system comprises two main phases: pre-processing and processing. The pre-processing phase produces a structured representation of the Punjabi text. In the processing phase, the features that decide sentence importance are determined and calculated. Statistical features include Punjabi keyword identification, relative sentence length, and numbered data. Linguistic features for selecting important sentences include identification of Punjabi headlines, of lines immediately following headlines, of Punjabi nouns, Punjabi proper nouns, and common English-Punjabi nouns, of Punjabi cue phrases, and of title keywords in sentences.
    Sentence scores are computed from a sentence-feature-weight equation, with the feature weights estimated by mathematical regression: the feature values of manually summarized Punjabi documents are treated as independent inputs and their corresponding fuzzy scores as dependent outputs. In the training phase, fifty news documents were summarized manually by assigning fuzzy scores to their sentences; regression was then applied to estimate the feature weights, and the weights were averaged. The highest-scoring sentences are selected for the final summary, and coherence is maintained by keeping them in the order in which they appear in the input text at the selected compression ratio. This extractive Punjabi summarizer is available online.
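
    The scheme described above (feature weights fitted by regression against fuzzy-scored training sentences, then a weighted-sum score per sentence and selection at a compression ratio) can be sketched as follows. This is an illustration assuming NumPy, not the authors' implementation; the feature values are hypothetical.

```python
import numpy as np

def learn_feature_weights(feature_matrix, fuzzy_scores):
    """Fit feature weights by least-squares regression, as described:
    feature values of manually summarized sentences are the inputs,
    human-assigned fuzzy scores the outputs."""
    X = np.asarray(feature_matrix, dtype=float)
    y = np.asarray(fuzzy_scores, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def summarize(sentences, features, weights, compression=0.5):
    """Score each sentence as a weighted feature sum, keep the top
    fraction, and re-emit the survivors in original document order."""
    scores = np.asarray(features, dtype=float) @ weights
    k = max(1, int(len(sentences) * compression))
    top = sorted(np.argsort(scores)[::-1][:k])  # indices back in order
    return [sentences[i] for i in top]
```

    For example, with two features per sentence (say a keyword score and a relative-length score), training on three fuzzy-scored sentences recovers per-feature weights, which then rank unseen sentences.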

  • automatic Punjabi text extractive summarization system
    International Conference on Computational Linguistics, 2012
    Co-Authors: Vishal Gupta, Gurpreet Singh Lehal
    Abstract:

    Text summarization condenses a source text into a shorter form while retaining its information content and overall meaning. The Punjabi text summarization system is an extraction-based summarizer that selects relevant sentences from Punjabi text on the basis of statistical and linguistic features. It is available online at http://pts.learnPunjabi.org/default.aspx and comprises two main phases: 1) pre-processing and 2) processing. Pre-processing builds a structured representation of the original Punjabi text and includes Punjabi word boundary identification, sentence boundary identification, stop-word elimination, a Punjabi stemmer for nouns and proper names, application of input restrictions, and elimination of duplicate sentences. In the processing phase, sentence features are calculated and the final score of each sentence is determined with a feature-weight equation. The top-ranked sentences, in their original order, are selected for the final summary. This demo paper concentrates on automatic Punjabi extractive text summarization.

  • Punjabi language stemmer for nouns and proper names
    Proceedings of the 2nd Workshop on South Southeast Asian Natural Language Processing (WSSANLP), 2011
    Co-Authors: Vishal Gupta, Gurpreet Singh Lehal
    Abstract:

    This paper concentrates on stemming Punjabi nouns and proper names. The purpose of stemming is to obtain the stem (radix) of words that are not found in the dictionary: if the stemmed word is present in the dictionary it is a genuine word; otherwise it may be a proper name or an invalid word. The stemmer obtains a candidate stem of a Punjabi word and checks it against a dictionary of Punjabi nouns and proper names. An in-depth analysis of a Punjabi news corpus identified the possible noun suffixes, such as īāṃ, iāṃ, ūāṃ, āṃ and īē, and rules for noun and proper-name stemming were generated from them. The stemmer is applied in Punjabi text summarization and achieves an accuracy of 87.37%.
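
    The dictionary-checked suffix stripping described above can be sketched as follows. The suffix list (shown in transliteration) and the tiny dictionary are illustrative stand-ins, not the paper's actual rule set.

```python
# Hypothetical suffix list and stem dictionary for illustration; the
# paper derives its rules from a Punjabi news corpus in Gurmukhi script.
NOUN_SUFFIXES = ["iān", "īān", "ān", "īē", "ī"]
DICTIONARY = {"kuṛ", "mund"}  # invented stems

def stem(word, suffixes=NOUN_SUFFIXES, dictionary=DICTIONARY):
    """Strip the longest matching suffix; accept the result only if the
    stem appears in the noun/proper-name dictionary, as the paper does.
    Words that never yield a dictionary stem are returned unchanged
    (they may be proper names or out-of-vocabulary words)."""
    for suf in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suf) and len(word) > len(suf):
            candidate = word[: -len(suf)]
            if candidate in dictionary:
                return candidate
    return word
```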

  • preprocessing phase of Punjabi language text summarization
    International Conference on Information Systems, 2011
    Co-Authors: Vishal Gupta, Gurpreet Singh Lehal
    Abstract:

    Punjabi text summarization is the process of condensing a source Punjabi text into a shorter version while preserving its information content and overall meaning. It comprises two phases: 1) pre-processing and 2) processing. Pre-processing builds a structured representation of the Punjabi text, and this paper concentrates on that phase. Its sub-phases are: Punjabi word boundary identification, Punjabi stop-word elimination, Punjabi noun stemming, finding common English-Punjabi noun words, finding Punjabi proper nouns, Punjabi sentence boundary identification, and identification of Punjabi cue phrases in a sentence.
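
    A minimal sketch of three of these sub-phases (sentence boundary identification, stop-word elimination, and duplicate-sentence elimination, the last listed in the companion system paper). The stop-word list is a made-up stand-in, and the real system splits on the Gurmukhi danda (।) among other terminators.

```python
import re

STOP_WORDS = {"de", "di", "da", "te"}  # illustrative placeholders

def split_sentences(text):
    """Sentence boundary identification on the danda and Western stops."""
    return [s.strip() for s in re.split(r"[।!?.]", text) if s.strip()]

def preprocess(text):
    """Tokenize on whitespace, drop stop words, and drop duplicate
    sentences, mirroring the sub-phases listed above."""
    seen, result = set(), []
    for sent in split_sentences(text):
        tokens = [t for t in sent.split() if t not in STOP_WORDS]
        key = tuple(tokens)
        if key not in seen:
            seen.add(key)
            result.append(tokens)
    return result
```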

  • features selection and weight learning for Punjabi text summarization
    2011
    Co-Authors: Vishal Gupta, Gurpreet Singh Lehal
    Abstract:

    This paper concentrates on feature selection and weight learning for Punjabi text summarization. Text summarization condenses the source text into a shorter version preserving its information content: important sentences are selected from the original document and concatenated into a shorter form. The importance of a sentence is decided by its statistical and linguistic features. For Punjabi, statistical features that increase a sentence's candidacy for inclusion in the summary include the sentence length feature, Punjabi keyword selection (a TF-ISF approach) and the number feature. Linguistic features include the Punjabi sentence headline feature, the next-line feature, the Punjabi noun feature, the Punjabi proper-noun feature, the common English-Punjabi noun feature, the cue-phrase feature and the presence of title keywords in a sentence. Mathematical regression is used to estimate the feature weights from fuzzy scores assigned to the sentences of 50 Punjabi news documents.
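
    The TF-ISF keyword feature weights a term by its frequency times the log-inverse of the number of sentences containing it, by analogy with TF-IDF at sentence granularity. A minimal sketch; the exact formula the authors use is an assumption here.

```python
import math
from collections import Counter

def tf_isf(sentences):
    """Score each term by TF-ISF: corpus term frequency times
    log(inverse sentence frequency). Top-scoring terms would be
    treated as keywords by the summarizer."""
    n = len(sentences)
    tf = Counter(t for s in sentences for t in s)
    sf = Counter()                     # in how many sentences each term occurs
    for s in sentences:
        for t in set(s):
            sf[t] += 1
    return {t: tf[t] * math.log(n / sf[t]) for t in tf}
```

    A term appearing in every sentence scores zero (it discriminates nothing), while a term confined to one sentence scores highest.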

Vishal Gupta - One of the best experts on this subject based on the ideXlab platform.

  • a novel hybrid text summarization system for Punjabi text
    Cognitive Computation, 2016
    Co-Authors: Vishal Gupta, Narvinder Kaur
    Abstract:

    Text summarization is the task of shortening text documents while retaining their overall meaning and information. A good summary should highlight the main concepts of the document. Many statistical, location-based and linguistic techniques are available for text summarization. This paper describes a novel hybrid technique for automatic summarization of Punjabi text. Punjabi is an official language of Punjab State in India, and very few linguistic resources are available for it. The proposed system is a hybrid of conceptual, statistical, location-based and linguistic features for Punjabi text. Four new location-based features and two new statistical features (an entropy measure and a Z score) are used, and the results are encouraging. A support vector machine classifier is used to separate Punjabi sentences into summary and non-summary sentences and to handle imbalanced data, with the synthetic minority over-sampling technique (SMOTE) applied to over-sample the minority class. The F score, precision, recall and ROUGE-2 score of the proposed system compare favourably with several baseline systems, and its summary quality is comparable to the gold summary.
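
    The two new statistical features, the entropy measure and the Z score, might be computed as below. The exact formulations the authors use are not stated in the abstract, so this is a hedged illustration: Shannon entropy of a sentence's token distribution, and standard Z-score normalization of raw feature values.

```python
import math
from collections import Counter

def sentence_entropy(tokens):
    """Shannon entropy (bits) of the token distribution of a sentence;
    an assumed reading of the paper's 'entropy measure' feature."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def z_scores(values):
    """Standard Z score of each raw feature value across sentences."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    sd = var ** 0.5 or 1.0  # avoid division by zero on constant input
    return [(v - mean) / sd for v in values]
```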

  • automatic stemming of words for Punjabi language
    SIRS, 2014
    Co-Authors: Vishal Gupta
    Abstract:

    The major task of a stemmer is to find the root of words that do not appear in the dictionary in their inflected form. After stemming, the stemmer looks the word up in the dictionary: if a match is found, the word is correct; otherwise it may be an incorrect word or a name. For any language, a stemmer is a basic linguistic resource required to develop accurate Natural Language Processing (NLP) applications such as machine translation, document classification, document clustering, question answering, topic tracking, text summarization and keyword extraction. This paper concentrates on complete automatic stemming of Punjabi words, covering Punjabi nouns, verbs, adjectives, adverbs, pronouns and proper names. A list of 18 suffixes for Punjabi nouns and proper names, a number of further suffixes for Punjabi verbs, adjectives and adverbs, and the corresponding stemming rules were generated after analysis of a Punjabi corpus. This is the first complete Punjabi stemmer covering all these word classes, and it will be useful for developing other high-accuracy Punjabi NLP applications. The portion covering proper names and nouns has been implemented as part of a Punjabi text summarizer (MS Access back end, ASP.NET front end) with 87.37% accuracy.

  • domain based classification of Punjabi text documents using ontology and hybrid based approach
    International Conference on Computational Linguistics, 2012
    Co-Authors: Nidhi Krail, Vishal Gupta
    Abstract:

    Classification of text documents has become a necessity owing to the increasing availability of electronic data on the internet, yet no text classifier has so far been available for Punjabi documents. The objective of this work is to find the best text classifier for the Punjabi language. Two new algorithms are proposed for Punjabi text classification: ontology-based classification, and a hybrid approach combining Naive Bayes with ontology-based classification. A corpus of 180 Punjabi news articles is used for training and testing the classifier. The experimental results show that ontology-based classification (85%) and the hybrid approach (85%) outperform the standard algorithms, centroid-based classification (71%) and Naive Bayes classification (64%).
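
    The hybrid idea can be sketched as follows: match a document against hand-built domain ontologies (reduced to keyword lists here), and fall back to a trained Naive Bayes model only when no ontology matches. The domain term lists are hypothetical, and the Naive Bayes model is abstracted to a callable rather than implemented.

```python
ONTOLOGY = {  # hypothetical domain term lists standing in for real ontologies
    "sports": {"match", "team", "goal"},
    "politics": {"vote", "party", "minister"},
}

def ontology_classify(tokens):
    """Count ontology-term hits per domain; None means no domain matched."""
    hits = {d: sum(t in terms for t in tokens) for d, terms in ONTOLOGY.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None

def hybrid_classify(tokens, nb_classify):
    """Ontology first, Naive Bayes (any trained classifier callable)
    as the fallback: the hybrid scheme the paper proposes."""
    label = ontology_classify(tokens)
    return label if label is not None else nb_classify(tokens)
```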

Parteek Kumar - One of the best experts on this subject based on the ideXlab platform.

  • Word sense disambiguation for Punjabi language using deep learning techniques
    Neural Computing and Applications, 2019
    Co-Authors: Varinder Pal Singh, Parteek Kumar
    Abstract:

    Word sense disambiguation (WSD) identifies the correct meaning of a word in a given context and is an indispensable component of many natural language processing tasks. In this paper, two deep learning techniques, the multilayer perceptron (MLP) and long short-term memory (LSTM), are evaluated individually on word vectors of 66 ambiguous Punjabi nouns for an explicit Punjabi WSD system. The inputs to both models are simple word vectors derived directly from a manually sense-tagged Punjabi corpus. The MLP outperformed the LSTM on this task. Six traditional supervised machine learning techniques were also tested on the same dataset using unigram and bigram feature sets; the comparison clearly indicates that the deep learning techniques using simple word vectors outperform the traditional ones.
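
    A drastically simplified stand-in for classifying sense-tagged context vectors (nearest sense centroid by cosine similarity, not the paper's MLP or LSTM) illustrates the data flow: per-sense training contexts in, a sense label out. All vectors and sense names below are invented.

```python
import math

def centroid(vectors):
    """Average a list of sparse {feature: weight} vectors."""
    dims = {k for v in vectors for k in v}
    return {d: sum(v.get(d, 0.0) for v in vectors) / len(vectors) for d in dims}

def cosine(a, b):
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def disambiguate(context_vec, sense_examples):
    """sense_examples: {sense: [context vectors from the tagged corpus]}.
    Pick the sense whose training centroid is most similar; a stand-in
    for the trained neural classifier."""
    cents = {s: centroid(vs) for s, vs in sense_examples.items()}
    return max(cents, key=lambda s: cosine(context_vec, cents[s]))
```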

  • Sense disambiguation for Punjabi language using supervised machine learning techniques
    Sādhanā, 2019
    Co-Authors: Varinder Pal Singh, Parteek Kumar
    Abstract:

    Automatic identification of the meaning of a word in context is termed Word Sense Disambiguation (WSD). It is a vital and hard artificial intelligence problem used in natural language processing applications such as machine translation, question answering and information retrieval. In this paper, an explicit WSD system for Punjabi using supervised techniques is analysed. A sense-tagged corpus of 150 ambiguous Punjabi nouns was prepared manually. Six supervised machine learning techniques, Decision List, Decision Tree, Naive Bayes, K-Nearest Neighbour (K-NN), Random Forest and Support Vector Machines (SVM), are investigated. Every classifier uses the same feature space encompassing lexical (unigram, bigram, collocation and co-occurrence) and syntactic (part-of-speech) count-based features. Semantic features of Punjabi were derived from unlabelled Punjabi Wikipedia text using the word2vec continuous-bag-of-words and skip-gram shallow neural network models. Two deep learning classifiers, a multilayer perceptron and a long short-term memory network, were also applied to the WSD of Punjabi words. The word embedding features were tried with all six classifiers, and their use enhanced the performance of the supervised classifiers on the Punjabi WSD task. The best result, 84% accuracy, was achieved by the LSTM classifier using word embedding features.
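
    The lexical part of such a feature space (unigrams in a window around the ambiguous word, plus the adjacent bigrams, i.e. collocations) might be extracted as below. The window size and feature-naming scheme are assumptions for illustration.

```python
def extract_features(tokens, target_index, window=2):
    """Window unigrams around the ambiguous word at target_index, plus
    the bigrams it forms with its immediate neighbours."""
    i = target_index
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    feats = {f"uni={t}" for j, t in enumerate(tokens[lo:hi], lo) if j != i}
    if i > 0:
        feats.add(f"bi={tokens[i-1]}_{tokens[i]}")
    if i + 1 < len(tokens):
        feats.add(f"bi={tokens[i]}_{tokens[i+1]}")
    return feats
```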

  • development of Punjabi wordnet
    CSI Transactions on ICT, 2013
    Co-Authors: Ashish Narang, R K Sharma, Parteek Kumar
    Abstract:

    Natural language processing (NLP) tasks such as word sense disambiguation, machine translation (MT) and part-of-speech tagging require large-scale lexical resources. Such resources already exist for digitally advanced languages such as English, but have yet to be developed for widely spoken but digitally young languages such as Punjabi. WordNet is one such lexical resource, usable for a variety of NLP tasks ranging from digital dictionaries to automated MT. Its basic building block is the synset: a word sense with which one or more synonymous words are associated. Each synset is linked to other synsets by lexical and semantic relations; lexical relations hold between word forms, whereas semantic relations hold between whole synsets. This paper presents the lexical and semantic relations of Punjabi WordNet, including synonymy, hypernymy/hyponymy, meronymy/holonymy, entailment and troponymy. It also illustrates the process of creating Punjabi synsets from Hindi synsets, and the design of the synset and semantic databases that implement the semantic relations for Punjabi WordNet.
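
    The synset-with-relations structure can be sketched as a small data type: a sense shared by synonymous words, linked to other synsets by semantic relations (hypernymy shown). The sense identifiers below are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    """One word sense with its synonymous words, gloss, and links to
    more general synsets (hypernyms)."""
    sense_id: str
    words: list
    gloss: str = ""
    hypernyms: list = field(default_factory=list)  # links to other Synsets

def hypernym_chain(synset):
    """Walk generalization links upward, as a WordNet browser would."""
    chain = [synset.sense_id]
    while synset.hypernyms:
        synset = synset.hypernyms[0]
        chain.append(synset.sense_id)
    return chain
```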

  • Punjabi deconverter for generating Punjabi from universal networking language
    Journal of Zhejiang University Science C, 2013
    Co-Authors: Parteek Kumar, R K Sharma
    Abstract:

    DeConverter is the core software in a Universal Networking Language (UNL) system, which has EnConverter and DeConverter as its two major components: EnConverter converts a natural language sentence into an equivalent UNL expression, and DeConverter generates a natural language sentence from an input UNL expression. This paper presents the design and development of a Punjabi DeConverter and describes its five phases, i.e., UNL parser, lexeme selection, morphology generation, function word insertion, and syntactic linearization, with a special focus on the linearization issues of the Punjabi DeConverter. Syntactic linearization is the process of defining the arrangement of words in the generated output. Algorithms and pseudocode are given for linearizing a simple UNL graph, a UNL graph with scope nodes, and nodes with un-traversed or multiple parents, along with Punjabi-specific cases for the UNL relations 'and', 'or', 'fmt', 'cnt' and 'seq'. The DeConverter has been tested on 1000 UNL expressions, using a Spanish UNL language server and agricultural-domain threads developed by the Indian Institute of Technology (IIT) Bombay, India, as gold standards. The proposed system generates 89.0% grammatically correct sentences and 92.0% sentences faithful to the originals, with a fluency score of 3.61 and an adequacy score of 3.70 on a 4-point scale, and achieves a bilingual evaluation understudy (BLEU) score of 0.72.
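
    Relation-driven linearization of a simple, acyclic UNL graph can be sketched as a depth-first traversal in which each relation decides whether the dependent surfaces before or after its head. The relation-to-position table below is illustrative (a rough SOV default for Punjabi), not the paper's rule set.

```python
def linearize(graph, root):
    """Linearize a simple UNL graph given as {head: [(relation, child), ...]}.
    Relations in CHILD_FIRST place the dependent's words before the head
    (SOV-style); all other relations place them after."""
    CHILD_FIRST = {"agt", "obj", "mod"}  # assumed ordering table
    words = []
    for rel, child in graph.get(root, []):
        if rel in CHILD_FIRST:
            words.extend(linearize(graph, child))
    words.append(root)
    for rel, child in graph.get(root, []):
        if rel not in CHILD_FIRST:
            words.extend(linearize(graph, child))
    return words
```

    Scope nodes and multi-parent nodes, which the paper handles with dedicated algorithms, are outside this sketch.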

  • Punjabi to unl enconversion system
    Sadhana-academy Proceedings in Engineering Sciences, 2012
    Co-Authors: Parteek Kumar, Rajendra K. Sharma
    Abstract:

    This paper reports the work for the EnConversion of input Punjabi sentences to an interlingua representation called Universal Networking Language (UNL). The UNL system consists of two main components, namely, EnConverter (used for converting the text from a source language to UNL) and DeConverter (used for converting the text from UNL to a target language). This paper discusses the framework for designing the EnConverter for Punjabi language with a special focus on generation of UNL attributes and relations from Punjabi source text. It also describes the working of Punjabi Shallow Parser used for the processing of the input sentence, which performs the tasks of Tokenizer, Morph-analyzer, Part-of-Speech Tagger and Chunker. This paper also considers the seven phases used in the process of EnConversion of input Punjabi text to UNL representation. The paper highlights the EnConversion analysis rules used for the EnConverter and indicates its usage in the generation of UNL expressions. This paper also covers the results of implementation of Punjabi EnConverter and its evaluation on sample UNL sentences available at Spanish Language Server. The accuracy of the developed system has also been presented in this paper.

Collette Clifford - One of the best experts on this subject based on the ideXlab platform.

  • validation of the Punjabi version of the edinburgh postnatal depression scale epds
    International Journal of Nursing Studies, 2006
    Co-Authors: Julie Werrett, Collette Clifford
    Abstract:

    This study reports a project to validate a Punjabi translation of the Edinburgh postnatal depression scale (EPDS). The study involved three points of data collection: bilingual (Punjabi- and English-speaking) new mothers completed the English and Punjabi EPDS on two separate occasions and underwent a diagnostic interview. At a threshold of 12.5, the Punjabi scale yielded a sensitivity of 71.4% and a specificity of 93.7%. Analysis suggests criterion and conceptual equivalence between the two scales. A user evaluation indicates that the tool was acceptable to the majority of mothers; however, the scale may be more applicable to mothers for whom Punjabi is their first language.
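
    Sensitivity and specificity, the two figures reported at the 12.5 threshold, are simple ratios over the diagnostic confusion counts (the counts below are invented for illustration).

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN): fraction of true cases the scale
    flags. Specificity = TN / (TN + FP): fraction of non-cases it
    correctly passes."""
    return tp / (tp + fn), tn / (tn + fp)
```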

  • A cross-cultural analysis of the use of the Edinburgh Post-Natal Depression Scale (EPDS) in health visiting practice.
    Journal of Advanced Nursing, 1999
    Co-Authors: Collette Clifford, Julie Werrett
    Abstract:

    This report describes a project that developed and undertook initial validation of a Punjabi version of the Edinburgh Post-Natal Depression Scale (EPDS). A multi-disciplinary, multi-ethnic project team translated the EPDS from English to Punjabi. A pilot study indicated a high correlation between the two scales, opening the way for a larger study in which a total of 98 bilingual women completed both the English and Punjabi versions of the scale 6-8 weeks after delivery of their child; of these, a further 52 completed the scales on a second occasion, 16-18 weeks post-partum. A small sub-group (n = 15) underwent independent clinical assessment by a community psychiatric nurse (CPN) to determine their mental state, enabling the outcome of the assessment to be compared with the EPDS score. The scores of the English and Punjabi versions were analysed using the Spearman correlation coefficient and the Bland-Altman test, and a high correlation was found between overall scores and most individual items on the scale. Furthermore, the independent assessment indicated that a number of the women scoring 12 or above on the EPDS (the cut-off point for risk of post-natal depression (PND)) were diagnosed as having a post-natal depressive disorder by the CPN assessing them independently. While the results to date are promising, further work is needed to determine the validity, sensitivity and specificity of the Punjabi EPDS against international classifications of depressive disorders, and to establish optimal cut-off scores for the Punjabi version.
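
    The Spearman coefficient used to compare the English and Punjabi scores is the Pearson correlation of the ranked values (with ties given averaged ranks). A self-contained sketch:

```python
def spearman(x, y):
    """Spearman rank correlation: rank both samples (averaging tied
    ranks), then compute the Pearson correlation of the ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
                j += 1                      # extend over a run of ties
            avg = (i + j) / 2 + 1           # averaged 1-based rank
            for k in range(i, j + 1):
                r[order[k]] = avg
            i = j + 1
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den if den else 0.0
```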

Kamaldeep Bhui - One of the best experts on this subject based on the ideXlab platform.

  • assessing the prevalence of depression in Punjabi and english primary care attenders the role of culture physical illness and somatic symptoms
    Transcultural Psychiatry, 2004
    Co-Authors: Kamaldeep Bhui, Dinesh Bhugra, David Goldberg, Justin Sauer, Andre Tylee
    Abstract:

    Previous studies exploring the prevalence of depression among South Asians have reported inconsistent findings; research artefacts due to sampling bias, measurement errors and a failure to include ethnographic methods may all explain this. We estimated the prevalence of depression, and its variation with culture, cultural adaptation, somatic symptoms and physical disability, in a cross-sectional primary care survey of Punjabi and English attendees. We included a culture-specific screening instrument, culturally adapted the instruments and offered bilingual interviews. We found that, compared with their English counterparts, depressive diagnoses were more common among Punjabis, Punjabi women, Punjabis with physical complaints and, contrary to expectation, even Punjabis with low scores for somatic symptoms.

  • causal explanations of distress and general practitioners assessments of common mental disorder among Punjabi and english attendees
    Social Psychiatry and Psychiatric Epidemiology, 2002
    Co-Authors: Kamaldeep Bhui, Dinesh Bhugra, David Goldberg
    Abstract:

    Background: The literature on the primary care assessment of mental distress among patients of Indian subcontinent origin suggests frequent presentations to general practitioners, but rarely for recognisable psychiatric disorders. This study investigates whether cultural variations in patients' causal explanatory models account for cultural variations in the assessment of non-psychotic mental disorders in primary care. Methods: In a two-phase survey, 272 Punjabi and 269 English subjects were screened; the second phase was completed by 209 and 180 subjects, respectively. Causal explanatory models were elicited as explanations of two vignette scenarios, one emphasising a somatic presentation and the other anxiety symptoms. Psychiatric disorder was assessed by GPs on a Likert scale and by a psychiatrist on the Clinical Interview Schedule. Results: Punjabis more commonly expressed medical/somatic and religious beliefs. General practitioners were more likely to assess any subject giving psychological explanations to vignette A, and English subjects giving religious explanations to vignette B, as having a significant psychiatric disorder. Where medical/somatic explanations of distress were most prevalent in response to the somatic vignette, psychological, religious and work explanations were less prevalent among Punjabis but not among English subjects. Causal explanations did not fully explain cultural differences in assessments. Conclusions: General practitioners' assessments and causal explanations are related and influenced by culture, but causal explanations do not fully explain cultural differences in assessments.

  • cultural influences on the prevalence of common mental disorder general practitioners assessments and help seeking among Punjabi and english people visiting their general practitioner
    Psychological Medicine, 2001
    Co-Authors: Kamaldeep Bhui, Dinesh Bhugra, David Goldberg, G Dunn, Manisha Desai
    Abstract:

    Background. Culture influences symptom presentation and help-seeking and may influence the general practitioner's assessment. Methods. We recruited Punjabi and English GP attenders to a two-phase survey in London (UK) using the Amritsar Depression Inventory and the General Health Questionnaire as screening instruments. The Clinical Interview Schedule was the criterion measure. General practitioners completed Likert assessments. Results. The second phase was completed by 209 Punjabi and 180 English subjects. The prevalence of common mental disorders was not influenced by culture. Punjabi cases more often had ‘poor concentration and memory’ and ‘depressive ideas’ but were not more likely to have somatic symptoms. General practitioners were more likely to assess Punjabis with common mental disorder as having ‘physical and somatic’ symptoms or ‘sub-clinical disorders’. Punjabi cases with depressive ideas were less likely to be detected compared with English ones. In comparison to English men, English women were under-detected by Asian general practitioners. Help-seeking English subjects were more likely to be correctly identified as cases. Conclusions. The prevalence of common mental disorders and somatic symptoms does not differ across cultures. Among English subjects, general practitioners were more likely to identify correctly pure psychiatric illness and mixed pathology; but Punjabi subjects with common mental disorders were more often assessed as having ‘sub-clinical disorders’ and ‘physical and somatic’ disorders. English women were less well detected than English men. English help-seeking cases were more likely to be detected.