The Experts below are selected from a list of 8799 Experts worldwide ranked by the ideXlab platform.
Sivaji Bandyopadhyay - One of the best experts on this subject based on the ideXlab platform.
-
INEX - A Hybrid QA System with Focused IR and Automatic Summarization for INEX 2011
Focused Retrieval of Content and Structure, 2012
Co-Authors: Pinaki Bhaskar, Somnath Banerjee, Snehasis Neogi, Sivaji Bandyopadhyay
Abstract: The article presents the experiments carried out as part of our participation in the QA track of INEX 2011. We submitted two runs. The INEX QA task has two main subtasks, Focused IR and Automatic Summarization. In the Focused IR system, we first preprocess the Wikipedia documents and then index them using Nutch. Stop words are removed from each query tweet and all the remaining tweet words are stemmed using the Porter stemmer. The stemmed tweet words form the query for retrieving the most relevant document using the index. The Automatic Summarization system takes as input the query tweet along with the tweet’s text and the title from the most relevant text document. The most relevant sentences are retrieved from the associated document based on the TF-IDF of the matching query tweet, tweet’s text and title words. Each retrieved sentence is assigned a ranking score in the Automatic Summarization system. The answer passage includes the top ranked retrieved sentences, with a limit of 500 words. The two runs differ in the way in which the relevant sentences are retrieved from the associated document. Our first run got the highest score, 432.2, in the Relaxed metric of the Readability evaluation among all the participants.
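The sentence-ranking step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the smoothed IDF formula, and the function names `tokenize`, `rank_sentences`, and `build_answer` are all assumptions, and the Porter stemming and Nutch indexing stages are left out to keep the example self-contained.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list; the abstract does not say which list was used.
STOP_WORDS = {"a", "an", "the", "of", "in", "is", "and", "to", "for", "on"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words (no stemming here)."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]


def rank_sentences(query, sentences):
    """Score each sentence by the summed TF-IDF of the query terms it contains."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: in how many sentences each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))
    scored = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        # Smoothed IDF (an assumption): terms present in every sentence score 0.
        score = sum(
            tf[t] * math.log((1 + n) / (1 + df[t]))
            for t in query_terms
            if t in tf
        )
        scored.append((score, index))
    # Highest score first; ties keep document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


def build_answer(query, sentences, word_limit=500):
    """Greedily add top-ranked sentences that still fit in the word budget."""
    chosen, used = [], 0
    for score, index in rank_sentences(query, sentences):
        length = len(sentences[index].split())
        if score <= 0 or used + length > word_limit:
            continue
        chosen.append(index)
        used += length
    # Emit the selected sentences in their original order for readability.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `build_answer(tweet_text, document_sentences)` returns the matching sentences in document order, capped at 500 words by default. Because stemming is omitted here, "index" and "indexes" do not match; a real pipeline would stem both the query and the sentences first.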
-
PACLIC - A Query Focused Multi Document Automatic Summarization
Pacific Asia Conference on Language Information and Computation, 2010
Co-Authors: Pinaki Bhaskar, Sivaji Bandyopadhyay
Abstract: The present paper describes the development of a query focused multi-document Automatic Summarization system. A graph is constructed in which the nodes are sentences of the documents and the edge scores reflect the correlation measure between the nodes. The system clusters similar texts having related topical features from the graph using the edge scores. Next, query dependent weights for each sentence are added to the edge score of the sentence and accumulated with the corresponding cluster score. The top ranked sentence of each cluster is identified and compressed using a dependency parser. The compressed sentences are included in the output summary. The inter-document clusters are revisited in order for as long as the length of the summary remains below the maximum limit. The summarizer has been tested on the standard TAC 2008 test data sets of the Update Summarization Track. Evaluation of the summarizer yielded accuracy scores of 0.10317 (ROUGE-2) and 0.13998 (ROUGE-SU-4).
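A rough sketch of the graph-and-cluster idea in this abstract is given below. Several pieces are stand-ins rather than the paper's method: cosine similarity over raw word counts replaces the unspecified correlation measure, thresholded connected components replace the clustering step, and the dependency-parser compression is omitted. The names `cluster_sentences`, `summarize`, `threshold`, and `max_words` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Word-count vector for a sentence (lowercased, alphanumeric tokens)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_sentences(sentences, threshold=0.3):
    """Build the sentence graph and group nodes into connected components."""
    vectors = [bag_of_words(s) for s in sentences]
    n = len(sentences)
    # edges[i][j] holds the edge score between sentence nodes i and j.
    edges = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for other in range(n):
                if other not in seen and edges[node][other] >= threshold:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(component))
    return edges, clusters


def summarize(query, sentences, max_words=100, threshold=0.3):
    """Pick the top sentence of each cluster, best cluster first, within max_words."""
    edges, clusters = cluster_sentences(sentences, threshold)
    q = bag_of_words(query)
    # Sentence score = sum of its edge scores plus a query-dependent weight.
    score = [
        sum(edges[i]) + cosine(q, bag_of_words(s)) for i, s in enumerate(sentences)
    ]
    # Cluster score = accumulated score of its sentences.
    ranked = sorted(clusters, key=lambda c: -sum(score[i] for i in c))
    summary, used = [], 0
    for cluster in ranked:
        best = max(cluster, key=lambda i: score[i])
        length = len(sentences[best].split())
        if used + length > max_words:
            break
        summary.append(sentences[best])
        used += length
    return summary
```

The sketch visits each cluster once; a fuller version would revisit the clusters in order and take their next-best sentences while the length budget allows.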
Pinaki Bhaskar - One of the best experts on this subject based on the ideXlab platform.
-
INEX - A Hybrid QA System with Focused IR and Automatic Summarization for INEX 2011
Focused Retrieval of Content and Structure, 2012
Co-Authors: Pinaki Bhaskar, Somnath Banerjee, Snehasis Neogi, Sivaji Bandyopadhyay
-
PACLIC - A Query Focused Multi Document Automatic Summarization
Pacific Asia Conference on Language Information and Computation, 2010
Co-Authors: Pinaki Bhaskar, Sivaji Bandyopadhyay
Mohamed El Bachir Menai - One of the best experts on this subject based on the ideXlab platform.
-
Automatic Summarization of scientific articles: A survey
Journal of King Saud University - Computer and Information Sciences, 2020
Co-Authors: Nouf Ibrahim Altmami, Mohamed El Bachir Menai
Abstract: The scientific research process generally starts with the examination of the state of the art, which may involve a vast number of publications. Automatically summarizing scientific articles would help researchers in their investigation by speeding up the research process. The Automatic Summarization of scientific articles differs from the Summarization of generic texts due to their specific structure and inclusion of citation sentences. Most of the valuable information in scientific articles is presented in tables, figures, and algorithm pseudocode. These elements, however, do not usually appear in a generic text. Therefore, several approaches that consider the particularity of a scientific article structure were proposed to enhance the quality of the generated summary, resulting in ad hoc Automatic summarizers. This paper provides a comprehensive study of the state of the art in this field and discusses some future research directions. It particularly presents a review of approaches developed during the last decade, the corpora used, and their evaluation methods. It also discusses their limitations and points out some open problems. The conclusions of this study highlight the prevalence of extractive techniques for the Automatic Summarization of single monolingual articles using a combination of statistical, natural language processing, and machine learning techniques. The absence of benchmark corpora and gold standard summaries for scientific articles remains the main issue for this task.
-
AIMSA - Semantic Graph Based Automatic Summarization of Multiple Related Work Sections of Scientific Articles
Artificial Intelligence: Methodology Systems and Applications, 2018
Co-Authors: Nouf Ibrahim Altmami, Mohamed El Bachir Menai
Abstract: The Summarization of scientific articles and particularly their related work sections would support the researchers in their investigation by allowing them to summarize a large number of articles. Scientific articles differ from generic text due to their specific structure and inclusion of citation sentences. Related work sections of scientific articles generally describe the most important facts of prior related work. Automatically summarizing these sections would support research development by speeding up the research process and consequently enhancing research quality. However, these sections may overlap syntactically and semantically. This research proposes to explore the Automatic Summarization of multiple related work sections. More specifically, the research goals of this work are to reduce the redundancy of citation sentences and enhance the readability of the generated summary by investigating a semantic graph-based approach and cross-document structure theory. These approaches have proven successful in the field of abstractive document Summarization.
Xiaodong Yan - One of the best experts on this subject based on the ideXlab platform.
-
ISCTCS - A Uighur Automatic Summarization Method Based on Sub-theme Division
Trustworthy Computing and Services, 2015
Co-Authors: Xiaodong Yan
Abstract: As a very important research focus of natural language processing, Automatic Summarization can be used in many fields, whether to improve the quality of results on a search engine or as a means of public opinion analysis. This paper proposes a method for Uighur Automatic Summarization that is based on sub-theme division and weight values. Experiments show that it achieves good precision and recall rates.
-
IMIS - A New Uighur Automatic Summarization Method
2015 9th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, 2015
Co-Authors: Xiaodong Yan
Abstract: Research on Automatic Summarization has become a hot spot in natural language processing. With the rapid development of the Internet, many Uighur websites and communication platforms have been established and improved, and the volume of Uighur network text has increased sharply. How to use this text effectively is a problem people now face. This paper proposes a method for Uighur Automatic Summarization that is based on sub-theme division and weight values. Experiments show that it achieves better precision and recall rates than the traditional statistics-based method.
Nouf Ibrahim Altmami - One of the best experts on this subject based on the ideXlab platform.
-
Automatic Summarization of scientific articles: A survey
Journal of King Saud University - Computer and Information Sciences, 2020
Co-Authors: Nouf Ibrahim Altmami, Mohamed El Bachir Menai
-
AIMSA - Semantic Graph Based Automatic Summarization of Multiple Related Work Sections of Scientific Articles
Artificial Intelligence: Methodology Systems and Applications, 2018
Co-Authors: Nouf Ibrahim Altmami, Mohamed El Bachir Menai