Extract Metadata

The Experts below are selected from a list of 4341 Experts worldwide, ranked by the ideXlab platform.

Krisda Khankasikam - One of the best experts on this subject based on the ideXlab platform.

  • Metadata Extraction Using Case-based Reasoning for Heterogeneous Thai Documents
    2011
    Co-Authors: Krisda Khankasikam
    Abstract:

    This paper reports our experience with a human-assisted process to Extract Metadata from Thai documents. Nowadays, an increasing number of Thai archives are placed online for sharing because the Internet infrastructure is now powerful; preserving and sharing knowledge, however, require appropriate processes. Metadata, data about data, is a very useful technology today because it helps users differentiate significant documents from non-significant ones. Manually harvesting these Metadata elements is highly labor-intensive, costly, and time-consuming, so automation is a key to successful preservation. As an experiment, a prototype system that uses a Case-based Reasoning algorithm for Metadata Extraction is introduced. Case-based Reasoning is an approach in artificial intelligence that differs from other approaches. Thai Metadata Extraction was performed on Thai articles whose content relates to the sufficiency economy and Thai folk wisdom, and the approach was evaluated using the standard precision, recall, and F-measure indices. The study illustrated that this approach helps knowledge workers in a domain come together, share educational material, and greatly reduce the labor of the Metadata creation process.
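
    A minimal sketch of the retrieve-and-reuse cycle that case-based reasoning implies: find the stored case most similar to a new document and apply its extraction rule. The case structure and the Jaccard similarity below are illustrative assumptions, not the paper's actual design.

        # Hypothetical case-based metadata extraction (not the paper's code).
        from dataclasses import dataclass

        @dataclass
        class Case:
            layout_tokens: set   # features describing a known document layout
            rules: dict          # metadata field -> line index holding its value

        def similarity(tokens, case):
            # Jaccard overlap between the new document and a stored case.
            union = tokens | case.layout_tokens
            return len(tokens & case.layout_tokens) / len(union) if union else 0.0

        def extract_metadata(lines, tokens, case_base):
            best = max(case_base, key=lambda c: similarity(tokens, c))  # retrieve
            return {f: lines[i] for f, i in best.rules.items() if i < len(lines)}  # reuse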

  • Thai Metadata Extraction by Using Case-based Reasoning
    2010
    Co-Authors: Krisda Khankasikam
    Abstract:

    This paper reports our experience with a human-assisted process to Extract Metadata from Thai documents. Nowadays, an increasing number of Thai archives are placed online for sharing because the Internet infrastructure is now powerful; preserving and sharing knowledge, however, require appropriate processes. Metadata, data about data, is a very useful technology today because it helps users differentiate significant documents from non-significant ones. Manually harvesting these Metadata elements is highly labor-intensive, costly, and time-consuming, so automation is a key to successful preservation. As an experiment, a prototype system that uses a Case-based Reasoning algorithm for Metadata Extraction is introduced. Case-based Reasoning is an approach in artificial intelligence that differs from other approaches. Thai Metadata Extraction was performed on Thai articles whose content relates to the sufficiency economy and Thai folk wisdom, and the approach was evaluated using the standard precision, recall, and F-measure indices. The study illustrated that this approach helps knowledge workers in a domain come together, share educational material, and greatly reduce the labor of the Metadata creation process.
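
    Both abstracts evaluate extraction with precision, recall, and F-measure; a small sketch of those standard indices over extracted versus gold metadata fields (the field values are invented for illustration):

        # Precision / recall / F1 over sets of (field, value) pairs.
        def prf(extracted, gold):
            tp = len(extracted & gold)                      # correctly extracted
            p = tp / len(extracted) if extracted else 0.0
            r = tp / len(gold) if gold else 0.0
            f1 = 2 * p * r / (p + r) if p + r else 0.0
            return p, r, f1

        # Illustrative values only: precision 1.0, recall 0.67, F1 0.8.
        print(prf({("title", "Thai Folk Wisdom"), ("year", "2010")},
                  {("title", "Thai Folk Wisdom"), ("year", "2010"),
                   ("author", "K. Khankasikam")}))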

  • A Combined Template-Based and Case-Based Metadata Extraction for Heterogeneous Thai Documents
    2009 International Conference on Advanced Computer Control, 2009
    Co-Authors: Krisda Khankasikam, Nopasit Chakpitak, Thana Udomsripaiboon
    Abstract:

    Nowadays, the number of universities, laboratories, government agencies, and companies placing their documents online and making them searchable is increasing because the Internet infrastructure for global data access is fully functional. However, a large number of organizations have documents that lack Metadata. The lack of Metadata hinders not only the discovery and dissemination of these documents over the Internet, but also their connectivity with other documents. Unfortunately, manual Metadata Extraction is expensive and time-consuming for large document collections, and most existing automated Metadata Extraction approaches have focused on specific domains and homogeneous documents. In this paper, we propose a combined case-based and template-based Metadata Extraction approach to solve these issues. The key idea for handling the heterogeneity is to classify documents into equivalent groups so that each document group contains only similar documents. Then, for each document group, we keep a template from previous cases that contains a process to Extract Metadata from documents in the group.
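
    A sketch of the two-stage idea: classify a document into a group of similar documents, then apply that group's stored template. The grouping heuristic and the template layouts are assumptions made up for illustration, not the paper's implementation.

        # Hypothetical template-based extraction after group classification.
        TEMPLATES = {
            # group -> which line of the document holds each field (assumed layouts)
            "thesis":  {"title": 0, "author": 1, "year": 2},
            "article": {"title": 0, "journal": 1, "author": 2},
        }

        def classify(lines):
            # Toy grouping rule; a real system would compare layout/content features.
            return "thesis" if any("university" in ln.lower() for ln in lines) else "article"

        def extract(lines):
            template = TEMPLATES[classify(lines)]
            return {field: lines[i] for field, i in template.items() if i < len(lines)}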

Nitin Agarwal - One of the best experts on this subject based on the ideXlab platform.

  • Leveraging Social Network Analysis and Cyber Forensics Approaches to Study Cyber Propaganda Campaigns
    2019
    Co-Authors: Samer Alkhateeb, Muhammad Nihal Hussain, Nitin Agarwal
    Abstract:

    In today’s information technology age, our political discourse is shrinking to fit our smartphone screens. Further, with the availability of inexpensive and ubiquitous mass communication tools like social media, disseminating false information and propaganda is both convenient and effective. Groups use social media to coordinate cyber propaganda campaigns in order to achieve strategic and political goals, influence mass thinking, and steer behaviors or perspectives about an event. In this research, we study the online deviant groups (ODGs) who created a large volume of cyber propaganda directed against NATO’s Trident Juncture Exercise 2015 (TRJE 2015) on both Twitter and blogs. Anti-NATO narratives were observed on social media websites and grew stronger as the TRJE 2015 event approached. Calls for civil disobedience, planned protests, and direct action against TRJE 2015 propagated on social media websites. We employ computational social network analysis and cyber-forensics-informed methodologies to study information competitors who seek to take the initiative and the strategic message away from NATO in order to further their own agenda. Through social cyber forensics tools, e.g., Maltego, we Extract Metadata associated with propaganda-riddled websites. The Extracted Metadata helps in the collection of social network information (i.e., friends and followers) and communication network information (i.e., networks depicting the flow of information such as tweets, retweets, mentions, and hyperlinks). Through computational social network analysis, we identify influential users and powerful groups (or focal structures) coordinating the cyber propaganda campaigns. The study examines 21 blogs with over 18,000 blog posts dating back to 1997 and over 9,000 Twitter users for the period between August 3, 2014, and September 12, 2015. These blogs were identified, crawled, and stored in our database, which is accessible through the Blogtrackers tool. Blogtrackers further helped us identify the activity and keyword patterns of blogs, measure the influence a blog or blogger has on the community, and analyze sentiment diffusion in the community.
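
    The pipeline described above (Extract site Metadata, build the communication network, rank influential users) might look roughly like the sketch below. The networkx call is a real library function, but the interaction data and the choice of degree centrality as the influence measure are illustrative assumptions.

        # Sketch: build a communication network from (source, target) interactions
        # such as retweets, mentions, or hyperlinks, then rank users by centrality.
        import networkx as nx

        interactions = [                  # hypothetical "who linked to whom" pairs
            ("blogger_a", "blogger_b"),
            ("blogger_a", "blogger_c"),
            ("blogger_b", "blogger_c"),
            ("blogger_d", "blogger_a"),
        ]

        g = nx.DiGraph()
        g.add_edges_from(interactions)

        # Most "influential" users under a simple centrality proxy.
        ranking = sorted(nx.degree_centrality(g).items(), key=lambda kv: -kv[1])
        print(ranking[:3])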

  • Social Cyber Forensics Approach to Study Twitter’s and Blogs’ Influence on Propaganda Campaigns
    International Conference on Social Computing, 2017
    Co-Authors: Samer Alkhateeb, Muhammad Nihal Hussain, Nitin Agarwal
    Abstract:

    In today’s information technology age, our political discourse is shrinking to fit our smartphone screens. Online Deviant Groups (ODGs) use social media to coordinate cyber propaganda campaigns to achieve strategic and political goals, influence mass thinking, and steer behaviors. In this research, we study the ODGs who conducted cyber propaganda campaigns against NATO’s Trident Juncture Exercise 2015 (TRJE 2015) and how they used Twitter and blogs to drive the campaigns. Using a blended Social Network Analysis (SNA) and Social Cyber Forensics (SCF) approach, “anti-NATO” narratives were identified on blogs. The narratives intensified as TRJE 2015 approached. The most influential narrative identified by the proposed methodology called for civil disobedience and direct actions against TRJE 2015 specifically and NATO in general. We use SCF analysis to Extract Metadata associated with propaganda-riddled websites. The Metadata helps in the collection of social and communication network information. By applying SNA to the data, we identify influential users and powerful groups (or focal structures) coordinating the propaganda campaigns. Data for this research (including blogs and Metadata) is accessible through our in-house developed Blogtrackers tool.
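
    For the "focal structures" step (finding powerful coordinating groups), generic community detection is one simple stand-in; greedy modularity below is not the authors' focal structure analysis algorithm, just an illustration of the idea on a toy network.

        # Sketch: find tightly coordinating groups in an interaction network.
        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities

        g = nx.Graph()
        g.add_edges_from([
            ("u1", "u2"), ("u2", "u3"), ("u1", "u3"),   # one dense cluster
            ("u4", "u5"), ("u5", "u6"), ("u4", "u6"),   # another dense cluster
            ("u3", "u4"),                               # weak bridge between them
        ])

        for group in greedy_modularity_communities(g):
            print(sorted(group))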

Sridhar Maradugu - One of the best experts on this subject based on the ideXlab platform.

  • A web services integration to manage invoice identification, Metadata Extraction, storage and retrieval in a multi-tenancy SaaS application
    IEEE International Conference on e-Business Engineering (ICEBE'08) Workshops: AiR'08, EM2I'08, SOAIC'08, SOKM'08, BIMA'08, DKEEE'08, 2008
    Co-Authors: Thomas Kwok, Jim Laredo, Sridhar Maradugu
    Abstract:

    In most invoice transaction, processing, and payment solutions that provide invoice services to businesses, extensible markup language (XML) invoice data are fed to the invoice services management system using Web services, while files composed of a stack of scanned copies of the original invoices are uploaded separately. Values of invoice Metadata are then manually entered into the system as a link reference. However, this manual process is tedious, costly, and error-prone. In this paper, we describe a novel method to automatically identify each individual invoice copy, Extract its Metadata values, and separate it from the stack. The separated invoice copies are then associated and linked with their corresponding XML invoice data by matching their Metadata values using Web services integration. Storage and retrieval of these invoice data and files are managed through Web services in an "Invoice to Cash" multi-tenancy Software as a Service (SaaS) invoice services management application. This Web application can present XML invoice data along with the corresponding original invoice copy to the customer for review and printing. The service is enhanced with a resolution interface that allows manual processing of invoices that mismatched the original invoice data or were not recognized by the system. Attaching the original invoice copy greatly improves invoice clarity and accuracy, reduces the number of disputes, and improves customer satisfaction.
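
    A sketch of the matching step described above: link each separated invoice copy to its XML record by comparing Metadata values. The XML tags, field names, and invoice numbers are hypothetical, not the application's actual schema.

        # Hypothetical matching of scanned-invoice metadata to XML invoice records.
        import xml.etree.ElementTree as ET

        XML_FEED = """<invoices>
          <invoice><number>INV-001</number><amount>120.00</amount></invoice>
          <invoice><number>INV-002</number><amount>75.50</amount></invoice>
        </invoices>"""

        def match_scans(scanned):
            records = {inv.findtext("number"): inv.findtext("amount")
                       for inv in ET.fromstring(XML_FEED).iter("invoice")}
            matched, unresolved = {}, []
            for scan in scanned:            # unmatched copies go to manual resolution
                amount = records.get(scan.get("number"))
                if amount is not None:
                    matched[scan["file"]] = amount
                else:
                    unresolved.append(scan["file"])
            return {"matched": matched, "unresolved": unresolved}

        print(match_scans([{"file": "page1.png", "number": "INV-001"},
                           {"file": "page2.png", "number": "INV-999"}]))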

Guangping Fu - One of the best experts on this subject based on the ideXlab platform.

  • Metadata Extraction and correction for large-scale traffic surveillance videos
    2014 IEEE International Conference on Big Data (Big Data), 2014
    Co-Authors: Xiaomeng Zhao, Yi Tang, Haitao Zhang, Guangping Fu
    Abstract:

    Metadata is widely used to facilitate user-defined queries and high-level event recognition applications in traffic surveillance videos. Current Metadata Extraction approaches rely on computer vision algorithms that are not accurate enough in real-world traffic scenes and do not handle big surveillance data efficiently. In this paper, we design a novel Metadata Extraction and correction system. First, we define the structure of the Metadata to determine which attributes (e.g., vehicle entry time, license plate number, vehicle type) we need to Extract. Based on this structure, we employ a three-phase method to Extract Metadata. Second, we propose a graph-based Metadata correction approach to compensate for the inaccuracy of the Metadata Extraction method. It fuses the big Metadata of the whole camera network, automatically detects suspicious Metadata, and corrects it based on the spatial-temporal relationships among the Metadata and on image similarity. As a centralized framework may not be able to cope with the huge amount of data generated by a traffic surveillance system, our system is implemented in a distributed fashion using Hadoop and HBase. Finally, experimental results on real-world traffic surveillance videos demonstrate the efficiency of our system and show that Metadata quality is significantly improved after correction.
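
    A toy version of the correction idea: when one camera's plate reading disagrees with spatially and temporally linked readings of the same vehicle, take a majority vote across the camera network. The record format is invented, and a real system would also weigh image similarity as the abstract notes.

        # Hypothetical majority-vote correction of license plate readings
        # across linked detections of one vehicle track.
        from collections import Counter

        def correct_plates(track):
            # track: detections of one vehicle ordered by time.
            consensus, _ = Counter(d["plate"] for d in track).most_common(1)[0]
            for d in track:
                if d["plate"] != consensus:       # flag and fix suspicious readings
                    d["suspicious"] = True
                    d["plate"] = consensus
            return track

        print(correct_plates([
            {"camera": "C1", "time": 10, "plate": "AB123"},
            {"camera": "C2", "time": 14, "plate": "A8123"},   # likely OCR misread
            {"camera": "C3", "time": 19, "plate": "AB123"},
        ]))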
