Biographical Information

The experts below are selected from a list of 324 experts worldwide ranked by the ideXlab platform.

Kristen Cook - One of the best experts on this subject based on the ideXlab platform.

Meg Galasso - One of the best experts on this subject based on the ideXlab platform.

Hongsu Wang - One of the best experts on this subject based on the ideXlab platform.

  • Mining local gazetteers of literary Chinese with CRF and pattern based methods for biographical information in Chinese history
    2015 IEEE International Conference on Big Data (Big Data), 2015
    Co-Authors: Chihkai Huang, Hongsu Wang
    Abstract:

    Person names and location names are essential building blocks for identifying events and social networks in historical documents written in literary Chinese. We take the lead in exploring the algorithmic recognition of named entities in literary Chinese for historical studies with language-model based and conditional-random-field based methods, and extend our work to mining the document structures in historical documents. Practical evaluations were conducted with texts extracted from more than 220 volumes of local gazetteers (Difangzhi, 地方志). Difangzhi is a huge collection, and the single most important one, containing information about officers who served in local government in Chinese history. Our methods performed very well on these realistic tests. Thousands of names and addresses were identified from the texts. A good portion of the extracted names match the biographical information currently recorded in the China Biographical Database (CBDB) of Harvard University, and many others can be verified by historians and will become new additions to CBDB.

  • Toward Algorithmic Discovery of Biographical Information in Local Gazetteers of Ancient China
    Pacific Asia Conference on Language, Information and Computation (PACLIC), 2015
    Co-Authors: Chihkai Huang, Hongsu Wang
    Abstract:

    Difangzhi (地方志) is a large collection of local gazetteers compiled by local governments of China, and the documents provide invaluable information about the host locality. This paper reports the current status of using natural language processing and text mining methods to identify biographical information about government officers so that we can add the information to the China Biographical Database (CBDB), which is hosted by Harvard University. Information offered by CBDB is instrumental for human historians, and serves as a core foundation for automatic tagging systems, like MARKUS of Leiden University. Mining texts in Difangzhi is not easy, partly because there is little knowledge about the grammars of literary Chinese so far. We employed techniques of language modeling and conditional random fields to find person and location names and their relationships. The methods were evaluated with realistic Difangzhi data of more than 2 million Chinese characters written in literary Chinese. Experimental results indicate that useful information was discovered from the current dataset.
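
    The abstracts above mention pattern based methods for finding person and location names but do not give the patterns themselves. As a rough illustration only, the sketch below matches one common biographical formula in literary Chinese, "name, courtesy name (字), native place (人)". The pattern, the function name `extract_bio_records`, and the sample sentence are assumptions made for this example; they are not taken from the papers or from Difangzhi, and real gazetteer text would need far more patterns plus the CRF-based tagging the authors describe.

```python
import re

# Hypothetical pattern (not from the papers):
#   <name>，字<courtesy name>，<native place>人
# CJK ideographs are approximated by the range U+4E00..U+9FFF.
HAN = r"[\u4e00-\u9fff]"
BIO_PATTERN = re.compile(
    rf"(?P<name>{HAN}{{2,4}})，字(?P<courtesy>{HAN}{{1,3}})，(?P<place>{HAN}{{2,6}})人"
)


def extract_bio_records(text):
    """Return one dict per match with the name, courtesy name, and place."""
    return [m.groupdict() for m in BIO_PATTERN.finditer(text)]


# Constructed sample sentence for illustration, not a gazetteer excerpt.
sample = "王安石，字介甫，臨川人。蘇軾，字子瞻，眉山人。"
records = extract_bio_records(sample)

for rec in records:
    print(rec["name"], rec["courtesy"], rec["place"])
```

    A single regular expression like this is brittle: it misses records that omit the courtesy name or use a different word order, which is one reason a statistical tagger is usually combined with hand-written patterns.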

Christina Connor - One of the best experts on this subject based on the ideXlab platform.

Cynthia Johnson - One of the best experts on this subject based on the ideXlab platform.