Prior Art - Explore the Science & Experts

The Experts below are selected from a list of 5736 Experts worldwide ranked by ideXlab platform

W. Bruce Croft - One of the best experts on this subject based on the ideXlab platform.

SIGIR - Transforming patents into prior-art queries

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '09, 2009

Co-Authors: Xiaobing Xue, W. Bruce Croft

Abstract:

Searching for prior-art patents is an essential step for the patent examiner to validate or invalidate a patent application. In this paper, we consider the whole patent as the query, which reduces the burden on the user, and also makes many more potential search features available. We explore how to automatically transform the query patent into an effective search query, especially focusing on the effect of different patent fields. Experiments show that the background summary of a patent is the most useful source of terms for generating a query, even though most previous work used the patent claims.

Gareth J F Jones - One of the best experts on this subject based on the ideXlab platform.

Studying machine translation technologies for large-data CLIR tasks: a patent prior-art search case study

Information Retrieval, 2013

Co-Authors: Walid Magdy, Gareth J F Jones

Abstract:

Prior-art search in patent retrieval is concerned with finding all existing patents relevant to a patent application. Since patents often appear in different languages, cross-language information retrieval (CLIR) is an essential component of effective patent search. In recent years machine translation (MT) has become the dominant approach to translation in CLIR. Standard MT systems focus on generating proper translations that are morphologically and syntactically correct. Development of effective MT systems of this type requires large training resources and high computational power for training and translation. This is an important issue for patent CLIR, where queries are typically very long, sometimes taking the form of a full patent application, meaning that query translation using MT systems can be very slow. However, in contrast to MT, the focus for information retrieval (IR) is on the conceptual meaning of the search words regardless of their surface form or the linguistic structure of the output. Thus much of the complexity of MT is not required for effective CLIR. We present an adapted MT technique specifically designed for CLIR. In this method, IR text pre-processing in the form of stop word removal and stemming is applied to the MT training corpus prior to the training phase. Applying this step leads to a significant decrease in the MT computational and training resource requirements. Experimental application of the new approach to the cross-language patent retrieval task from CLEF-IP 2010 shows the new technique to be up to 23 times faster than standard MT for query translations, while maintaining IR effectiveness statistically indistinguishable from standard MT when large training resources are used. Furthermore, the new method is significantly better than standard MT when only limited translation training resources are available, which can be a significant issue for translation in specialized domains. The new MT technique also enables patent document translation in a practical amount of time, with a resulting significant improvement in retrieval effectiveness.
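
The pre-processing step described in this abstract (stop word removal and stemming applied to the MT training text before training) might look roughly like the sketch below. It is only an illustration under stated assumptions: the stop word list is a tiny hypothetical sample, and `toy_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's; the abstract does not say which tools the authors used. Each language side of a parallel corpus would need its own stop word list and stemmer.

```python
import re

# Hypothetical, deliberately tiny English stop word list (assumption).
STOPWORDS = {"a", "an", "the", "of", "to", "in", "is", "and", "for"}

# Suffixes tried in order; a toy stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(token):
    """Strip the first matching suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(sentence):
    """Lowercase, tokenize, drop stop words, and stem one training sentence."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return " ".join(toy_stem(t) for t in tokens if t not in STOPWORDS)


example = preprocess("The rotating shafts are connected to the housing")
print(example)
```

The shortened, normalized sentences are what would then be fed to MT training, which is where the reduction in vocabulary size and training cost described above would come from.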

ECIR - Simple vs. sophisticated approaches for patent prior-art search

European Conference on Information Retrieval (Lecture Notes in Computer Science), 2011

Co-Authors: Walid Magdy, Patrice Lopez, Gareth J F Jones

Abstract:

Patent prior-art search is concerned with finding all filed patents relevant to a given patent application. We report a comparison between two search approaches representing the state of the art in patent prior-art search. The first approach uses simple and straightforward information retrieval (IR) techniques, while the second uses much more sophisticated techniques which try to model the steps taken by a patent examiner in patent search. Experiments show that the retrieval effectiveness using both techniques is statistically indistinguishable when patent applications contain some initial citations. However, the advanced search technique is statistically better when no initial citations are provided. Our findings suggest that less time and effort can be exerted by applying simple IR approaches when initial citations are provided.

United we fall, divided we stand: a study of query segmentation and PRF for patent prior art search

Proceedings of the 4th workshop on Patent information retrieval - PaIR '11, 2011

Co-Authors: Debasis Ganguly, Johannes Leveling, Gareth J F Jones

Abstract:

Previous research in patent search has shown that reducing queries by extracting a few key terms is ineffective, primarily because of the vocabulary mismatch between patent applications used as queries and existing patent documents. This finding has led to the use of full patent applications as queries in patent prior art search. In addition, standard information retrieval (IR) techniques such as query expansion (QE) do not work effectively with patent queries, principally because of the presence of noise terms in the massive queries. In this study, we take a new approach to QE for patent search. Text segmentation is used to decompose a patent query into self-coherent sub-topic blocks. Each of these much shorter sub-topic blocks, which is representative of a specific aspect or facet of the invention, is then used as a query to retrieve documents. Documents retrieved using the different resulting sub-queries or query streams are interleaved to construct a final ranked list. This technique can exploit the potential benefit of QE since the segmented queries are generally more focused and less ambiguous than the full patent query. Experiments on the CLEF-IP 2010 prior-art search task show that the proposed method outperforms the retrieval effectiveness achieved when using a single full patent application text as the query, and also demonstrate the potential benefits of QE to alleviate the vocabulary mismatch problem in patent search.
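
The final merging step in this abstract, where the result lists of the sub-queries are interleaved into one ranking, could be sketched as follows. The abstract does not specify the interleaving scheme, so plain round-robin with duplicate removal is an assumption here, and the function name and document IDs are hypothetical.

```python
from itertools import zip_longest


def interleave(ranked_lists, limit=None):
    """Round-robin merge of ranked lists, keeping each document's first occurrence."""
    merged = []
    seen = set()
    # Each "tier" holds the documents at the same rank across all sub-query lists.
    for tier in zip_longest(*ranked_lists):
        for doc_id in tier:
            if doc_id is None or doc_id in seen:
                continue  # list exhausted at this rank, or document already placed
            seen.add(doc_id)
            merged.append(doc_id)
            if limit is not None and len(merged) >= limit:
                return merged
    return merged


# One ranked list per query segment (hypothetical document IDs).
segment_results = [
    ["EP-1", "EP-2", "EP-3"],
    ["EP-2", "EP-4"],
    ["EP-5", "EP-1", "EP-6"],
]
final_ranking = interleave(segment_results)
print(final_ranking)
```

Round-robin gives every segment an equal voice at each rank; a score-based merge would be an alternative if the sub-query scores were comparable.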

Applying the KISS principle for the CLEF-IP 2010 prior art candidate patent search task

2010 Working Notes for CLEF Conference CLEF 2010, 2010

Co-Authors: Walid Magdy, Gareth J F Jones

Abstract:

We present our experiments and results for the DCU CNGL participation in the CLEF-IP 2010 Candidate Patent Search Task. Our work applied standard information retrieval (IR) techniques to patent search. In addition, a very simple citation extraction method was applied to improve the results. This was our second consecutive participation in the CLEF-IP tasks. Our experiments in 2009 showed that many sophisticated approaches to IR do not improve the retrieval effectiveness for this task. For this reason we decided to apply only simple methods in 2010. These were demonstrated to be highly competitive with other participants. DCU submitted three runs for the Prior Art Candidate Search Task; two of these runs achieved the second and third ranks among the 25 runs submitted by nine different participants. Our best run achieved MAP of 0.203, recall of 0.618, and PRES of 0.523.

Carla P Gomes - One of the best experts on this subject based on the ideXlab platform.

IJCAI - Ranking structured documents: a large margin based approach for patent prior art search

International Joint Conference on Artificial Intelligence, 2009

Co-Authors: Yunsong Guo, Carla P Gomes

Abstract:

We propose an approach for automatically ranking structured documents applied to patent prior art search. Our model, SVM Patent Ranking (SVMPR), incorporates margin constraints that directly capture the specificities of patent citation ranking. Our approach combines patent domain knowledge features with meta-score features from several different general Information Retrieval methods. The training algorithm is an extension of the Pegasos algorithm with performance guarantees, effectively handling hundreds of thousands of patent-pair judgements in a high dimensional feature space. Experiments on a homogeneous essential wireless patent dataset show that SVMPR performs on average 30%-40% better than many other state-of-the-art general-purpose Information Retrieval methods in terms of the NDCG measure at different cut-off positions.
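
For readers unfamiliar with the evaluation measure mentioned here, NDCG at a cut-off position can be computed along the lines of the sketch below. This is a generic illustration, not the authors' evaluation code: it assumes linear gains with a log2 rank discount (an exponential-gain variant also exists, and the abstract does not say which was used), and it assumes the input list contains the relevance grades of all judged documents in ranked order.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k items (rank 1 has discount log2(2) = 1)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG normalized by the DCG of the ideal (descending) ordering of the same gains."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Relevance grades of the ranked results for one query patent (hypothetical data):
# 1 = cited patent, 0 = not cited.
ranking_gains = [1, 0, 1, 0, 0]
score = ndcg_at_k(ranking_gains, k=3)
print(round(score, 4))
```

Evaluating at several values of `k` corresponds to the "different cut-off positions" mentioned in the abstract.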

Walid Magdy - One of the best experts on this subject based on the ideXlab platform.

Studying machine translation technologies for large-data CLIR tasks: a patent prior-art search case study

Information Retrieval, 2013

Co-Authors: Walid Magdy, Gareth J F Jones

ECIR - Simple vs. sophisticated approaches for patent prior-art search

European Conference on Information Retrieval, 2011

Co-Authors: Walid Magdy, Patrice Lopez, Gareth J F Jones

Applying the KISS principle for the CLEF-IP 2010 prior art candidate patent search task

2010 Working Notes for CLEF Conference CLEF 2010, 2010

Co-Authors: Walid Magdy, Gareth J F Jones

Suzan Verberne - One of the best experts on this subject based on the ideXlab platform.

Combining document representations for prior art retrieval

CLEF (Notebook Papers Labs Workshop), 2011

Co-Authors: Eva D'hondt, Suzan Verberne, Wouter Alink, Roberto Cornacchia

Abstract:

In this paper we report on our participation in the CLEF-IP 2011 prior art retrieval task. We investigated whether adding syntactic information in the form of dependency triples to a bag-of-words representation could lead to improvements in patent retrieval. In our experiments, we investigated this effect on the title, abstract and first 400 words of the description section. The experiments were conducted in the Spinque framework, with which we tried to optimize the combinations of text representation and document sections. We found that adding triples did not improve overall MAP scores compared to the baseline bag-of-words approach, but did result in slightly higher set recall scores. In future work we will extend our experiments to use all the text sections of the patent documents and fine-tune the mixture weights.

Re-ranking based on Syntactic Dependencies in Prior-Art Retrieval

2010

Co-Authors: Eva D'hondt, Suzan Verberne, Nelleke Oostdijk, Lou Boves

Abstract:

In this paper we present an experiment using syntax (in the form of dependency triplets) to rerank retrieval results in the patent domain. This work is a follow-up experiment to our participation in the first CLEF-IP track, which focussed on prior art retrieval. We shall first describe the work done in our participation in the CLEF-IP track and then go on to show why improving Mean Average Precision (MAP) is important to the patent searcher community. We then introduce an additional reranking step to our BOW retrieval approach which is based on syntactic information. Using syntactic structures called Dependency Triplets as index terms, we perform a second retrieval step within the retrieved result sets and examine whether the ranking of the relevant documents (captured by the MAP score) can be improved for prior art search.

CLEF (Notebook Papers/LABs/Workshops) - CLEF-IP 2010: Prior Art Retrieval using the different sections in patent documents

CLEF-IP 2010. Proceedings of the Conference on Multilingual and Multimodal Information Access Evaluation (CLEF 2010) CLEF-IP workshop, 2010

Co-Authors: Eva D'hondt, Suzan Verberne

Abstract:

In this paper we describe our participation in the 2010 CLEF-IP Prior Art Retrieval task, where we examined the impact of information in different sections of patent documents, namely the title, abstract, claims, description and IPC-R sections, on the retrieval and re-ranking of patent documents. Using a standard bag-of-words approach in Lemur, we found that the IPC-R sections are the most informative for patent retrieval. We then performed a re-ranking of the retrieved documents using a Logistic Regression Model, trained on the retrieved documents in the training set. We found indications that the information contained in the text sections of the patent document can contribute to a better ranking of the retrieved documents. The official results have shown that among the nine groups that participated in the Prior Art Retrieval task we achieved the eighth rank in terms of both Mean Average Precision (MAP) and Recall.

CLEF (Working Notes) - Prior Art retrieval using the claims section as a bag of words

Lecture Notes in Computer Science, 2010

Co-Authors: Suzan Verberne, Eva D'hondt

Abstract:

In this paper we describe our participation in the 2009 CLEF-IP task, which was targeted at prior-art search for topic patent documents. We opted for a baseline approach to get a feeling for the specifics of the task and the documents used. Our system retrieved patent documents based on a standard bag-of-words approach for both the Main Task and the English Task. In both runs, we extracted the claim sections from all English patents in the corpus and saved them in the Lemur index format with the patent IDs as DOCIDs. These claims were then indexed using Lemur's BuildIndex function. In the topic documents we also focused exclusively on the claims sections. These were extracted and converted to queries by removing stopwords and punctuation. We did not perform any term selection or query expansion. We retrieved 100 patents per topic using Lemur's RetEval function, retrieval model TF-IDF. Compared to the other runs submitted to the track, we obtained good results in terms of nDCG (0.46) and moderate results in terms of MAP (0.054).
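
The pipeline in this abstract (claims text as a bag of words, stopwords and punctuation removed, TF-IDF ranking, top 100 per topic) can be approximated in a few lines of plain Python. This sketch does not use Lemur and does not reproduce its TF-IDF variant; the stop word list, the idf formula, the scoring function and the patent IDs are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Illustrative stop word list (assumption); real runs would use a fuller list.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "said", "wherein"}


def tokenize(text):
    """Lowercase, strip punctuation, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def build_index(claims_by_id):
    """Return per-document term counts and a smoothed idf weight per term."""
    doc_terms = {doc_id: Counter(tokenize(text)) for doc_id, text in claims_by_id.items()}
    doc_freq = Counter(term for counts in doc_terms.values() for term in counts)
    n_docs = len(doc_terms)
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    return doc_terms, idf


def retrieve(query_text, doc_terms, idf, top_n=100):
    """Rank documents by the dot product of TF-IDF vectors of query and document."""
    query = Counter(tokenize(query_text))
    scores = {}
    for doc_id, counts in doc_terms.items():
        score = sum(
            qtf * counts[term] * idf[term] ** 2
            for term, qtf in query.items()
            if term in counts
        )
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


# Hypothetical claims sections keyed by patent ID.
corpus = {
    "P1": "A wireless antenna comprising a radiating element and a ground plane.",
    "P2": "A method of brewing coffee, wherein the water is heated.",
    "P3": "An antenna array for a wireless base station.",
}
doc_terms, idf = build_index(corpus)
topic_claims = "The antenna of claim 1, wherein the radiating element is a patch."
results = retrieve(topic_claims, doc_terms, idf)
print(results)
```

As in the abstract, no term selection or query expansion is applied: every surviving token of the topic's claims contributes to the score.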
