Semantic Concept

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 8205 Experts worldwide, ranked by the ideXlab platform

Mei-ling Shyu - One of the best experts on this subject based on the ideXlab platform.

  • Supporting Semantic Concept Retrieval with Negative Correlations in a Multimedia Big Data Mining System
    International Journal of Semantic Computing, 2016
    Co-Authors: Mei-ling Shyu
    Abstract:

    With the extensive use of smart devices and the blooming popularity of social media websites such as Flickr, YouTube, Twitter, and Facebook, we have witnessed an explosion of multimedia data. The amount of data nowadays is formidable without effective big data technologies. It is well acknowledged that multimedia high-level Semantic Concept mining and retrieval has become an important research topic, while the Semantic gap (i.e., the gap between low-level features and high-level Concepts) makes it even more challenging. Addressing these challenges requires joint research efforts from both the big data mining and multimedia areas. In particular, the correlations among the classes can provide important context cues to help bridge the Semantic gap. However, correlation discovery is computationally expensive due to the huge amount of data. In this paper, a novel multimedia big data mining system based on the MapReduce framework is proposed to discover negative correlations for Semantic Concept mining and retrieval. Furthermore, the proposed system includes a big data processing platform that uses Mesos for efficient resource management and Cassandra for handling data across multiple data centers. Experimental results on the TRECVID benchmark datasets demonstrate the feasibility and effectiveness of the proposed multimedia big data mining system with negative correlation discovery for Semantic Concept mining and retrieval.
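
    The negative-correlation idea can be illustrated with a small single-machine sketch (the paper distributes this over MapReduce; the concepts, annotations, and threshold below are hypothetical): compute the correlation between each pair of Concepts' binary annotations and keep the strongly negative pairs.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def negative_correlations(labels, threshold=-0.3):
    """Return Concept pairs whose binary annotations are negatively
    correlated at or below `threshold` (a hypothetical cut-off)."""
    pairs = []
    for c1, c2 in combinations(sorted(labels), 2):
        r = pearson(labels[c1], labels[c2])
        if r <= threshold:
            pairs.append((c1, c2, round(r, 3)))
    return pairs

# Toy annotation matrix: 1 = Concept present in a video shot.
labels = {
    "indoor":  [1, 1, 0, 0, 1, 0],
    "outdoor": [0, 0, 1, 1, 0, 1],
    "person":  [1, 0, 1, 0, 1, 1],
}
print(negative_correlations(labels))  # → [('indoor', 'outdoor', -1.0)]
```

    In the MapReduce setting, each mapper would emit per-pair co-occurrence counts and reducers would aggregate them into the same correlation statistic.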

  • ICSC - Negative Correlation Discovery for Big Multimedia Data Semantic Concept Mining and Retrieval
    2016 IEEE Tenth International Conference on Semantic Computing (ICSC), 2016
    Co-Authors: Mei-ling Shyu
    Abstract:

    With massive amounts of data being produced every day in almost every field, traditional data processing techniques have become more and more inadequate. However, research on effectively managing and retrieving these big data is still under development. Multimedia high-level Semantic Concept mining and retrieval in big data is one of the most challenging research topics, requiring joint efforts from researchers in both the big data mining and multimedia domains. In order to bridge the Semantic gap between high-level Concepts and low-level visual features, correlation discovery in Semantic Concept mining is worth exploring. Meanwhile, correlation discovery is a computationally intensive task in the sense that it requires a deep analysis of very large and growing repositories. This paper presents a novel system for discovering negative correlations for Semantic Concept mining and retrieval. It is designed to fit the Hadoop MapReduce framework and is further extended to utilize Spark, a more efficient and general cluster computing engine. The experimental results demonstrate the feasibility of utilizing big data technologies in negative correlation discovery.

  • Gaussian Mixture Model-Based Subspace Modeling for Semantic Concept Retrieval
    2015 IEEE International Conference on Information Reuse and Integration, 2015
    Co-Authors: Chao Chen, Mei-ling Shyu, Shu-ching Chen
    Abstract:

    Data mining and machine learning methods have been playing an important role in searching and retrieving multimedia information from all kinds of multimedia repositories. Although some of these methods have proven useful, it remains an interesting and active research area to effectively and efficiently retrieve multimedia information under difficult scenarios, e.g., detecting rare events or learning from imbalanced datasets. In this paper, we propose a novel subspace modeling framework that is able to effectively retrieve Semantic Concepts from highly imbalanced datasets. The proposed framework builds positive subspace models on a set of positive training sets, each of which is generated by a Gaussian Mixture Model (GMM) that partitions the data instances of a target Concept (i.e., the original positive set of the target Concept) into several subsets, each of which is later merged with the original positive data instances. Afterwards, a joint-scoring method is proposed to fuse the final ranking scores from all such positive subspace models and the negative subspace model. Experimental results on a publicly available benchmark dataset show that the proposed subspace modeling framework is able to outperform peer methods commonly used for Semantic Concept retrieval.
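
    A minimal sketch of the partition-merge-fuse idea, with a tiny k-means standing in for the paper's GMM partitioner and a per-model centroid standing in for a full subspace model (the data points and parameters below are toy assumptions):

```python
from math import dist  # Euclidean distance (Python 3.8+)

def kmeans(points, k, iters=20):
    """Tiny k-means, used here as a stand-in for the GMM partitioner."""
    centers = points[:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: dist(p, centers[i]))].append(p)
        centers = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return clusters

def build_positive_models(positives, k=2):
    """Partition the positive set, merge each subset back with all
    positives, and keep one centroid per merged set (a crude stand-in
    for a full subspace model)."""
    return [tuple(sum(c) / len(m) for c in zip(*m))
            for m in (subset + positives
                      for subset in kmeans(positives, k) if subset)]

def joint_score(x, models):
    """Fuse per-model scores; here simply the best (smallest) distance."""
    return -min(dist(x, m) for m in models)

positives = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
models = build_positive_models(positives, k=2)
# A point near the positive data scores higher than a far-away one.
print(joint_score((0.1, 0.0), models) > joint_score((10.0, 10.0), models))
```

    The real framework scores each instance against every positive subspace model and a negative subspace model before fusing; the max-distance fusion here is only illustrative.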

  • Sparse Linear Integration of Content and Context Modalities for Semantic Concept Retrieval
    IEEE Transactions on Emerging Topics in Computing, 2015
    Co-Authors: Mei-ling Shyu
    Abstract:

    The Semantic gap between low-level visual features and high-level Semantics is a well-known challenge in content-based multimedia information retrieval. With the rapid popularization of social media, which allows users to assign tags to describe images and videos, attention is naturally drawn to taking advantage of these metadata in order to bridge the Semantic gap. This paper proposes a sparse linear integration (SLI) model that focuses on integrating visual content and its associated metadata, referred to as the content and context modalities, respectively, for Semantic Concept retrieval. An optimization problem is formulated to approximate an instance using a sparse linear combination of other instances while minimizing the difference between them. The prediction score of a Concept for a test instance measures how well it can be reconstructed by the positive instances of that Concept. Two benchmark image data sets and their associated tags are used to evaluate the SLI model. Experimental results show promising performance compared with approaches based on a single modality and approaches based on popular fusion methods.
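
    The reconstruction-based scoring can be sketched with plain least squares over two positive instances, omitting the sparsity penalty of the actual SLI model (the feature vectors and Concept names below are hypothetical):

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reconstruction_score(x, b1, b2):
    """Least-squares reconstruction of x from two basis instances
    (Cramer's rule on the 2x2 normal equations); the paper adds an
    L1 sparsity penalty, omitted here for brevity."""
    a11, a12, a22 = dot(b1, b1), dot(b1, b2), dot(b2, b2)
    c1, c2 = dot(b1, x), dot(b2, x)
    det = a11 * a22 - a12 * a12
    w1 = (c1 * a22 - c2 * a12) / det
    w2 = (a11 * c2 - a12 * c1) / det
    residual = sqrt(sum((xi - w1 * u - w2 * v) ** 2
                        for xi, u, v in zip(x, b1, b2)))
    return 1.0 / (1.0 + residual)  # high score = well reconstructed

# A test instance resembling the "beach" positives is reconstructed
# better by them than by unrelated "city" positives.
beach = [(1.0, 0.1, 0.0), (0.9, 0.2, 0.1)]
city = [(0.0, 0.1, 1.0), (0.1, 0.0, 0.9)]
x = [0.95, 0.15, 0.05]
print(reconstruction_score(x, *beach) > reconstruction_score(x, *city))  # → True
```

    The full model solves this jointly over content and context modalities and over many instances rather than a fixed pair.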

M. Naphade - One of the best experts on this subject based on the ideXlab platform.

  • Semi-supervised cross feature learning for Semantic Concept detection in videos
    2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005
    Co-Authors: M. Naphade
    Abstract:

    For large-scale automatic Semantic video characterization, it is necessary to learn and model a large number of Semantic Concepts. A major obstacle to this is the insufficiency of labeled training samples. Multi-view semi-supervised learning algorithms such as co-training may help by incorporating a large amount of unlabeled data. However, one of their assumptions, requiring that each view be sufficient for learning, is usually violated in Semantic Concept detection. In this paper, we propose a novel multi-view semi-supervised learning algorithm called semi-supervised cross feature learning (SCFL). The proposed algorithm has two advantages over co-training. First, SCFL can theoretically guarantee that its performance is not significantly degraded even when the assumption of view sufficiency fails. Second, SCFL can handle additional views of unlabeled data even when these views are absent from the training data. As demonstrated in the TRECVID '03 Semantic Concept extraction task, the proposed SCFL algorithm not only significantly outperforms conventional co-training algorithms, but also comes close to the performance achieved when the unlabeled set is manually annotated and used for training along with the labeled data set.
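
    The generic two-view co-training loop that SCFL improves upon can be sketched as below, with toy nearest-centroid classifiers per view (SCFL's theoretical guarantees and cross-feature machinery are not reproduced; all data are hypothetical):

```python
from math import dist

def centroid_classifier(data):
    """Per-class centroid model for one feature view (a toy stand-in
    for the per-view classifiers used in co-training)."""
    cents = {}
    for x, y in data:
        cents.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs))
            for y, xs in cents.items()}

def predict(model, x):
    """Return (label, confidence): confidence is the margin between
    the two nearest class centroids."""
    d = sorted((dist(x, c), y) for y, c in model.items())
    margin = d[1][0] - d[0][0] if len(d) > 1 else d[0][0]
    return d[0][1], margin

def cotrain(labeled_v1, labeled_v2, unlabeled, rounds=2, conf=0.5):
    """Each round, the views label the unlabeled points they are
    confident about and share those labels with each other."""
    for _ in range(rounds):
        m1 = centroid_classifier(labeled_v1)
        m2 = centroid_classifier(labeled_v2)
        remaining = []
        for x1, x2 in unlabeled:
            y1, c1 = predict(m1, x1)
            y2, c2 = predict(m2, x2)
            if max(c1, c2) >= conf:          # accept the more confident view
                y = y1 if c1 >= c2 else y2
                labeled_v1.append((x1, y))
                labeled_v2.append((x2, y))
            else:
                remaining.append((x1, x2))
        unlabeled = remaining
    return centroid_classifier(labeled_v1), centroid_classifier(labeled_v2)

labeled_v1 = [((0.0, 0.0), "neg"), ((5.0, 5.0), "pos")]
labeled_v2 = [((0.1, 0.0), "neg"), ((4.9, 5.0), "pos")]
unlabeled = [((0.5, 0.5), (0.4, 0.6)), ((4.5, 4.5), (4.6, 4.4))]
m1, m2 = cotrain(labeled_v1, labeled_v2, unlabeled)
print(predict(m1, (1.0, 1.0))[0])  # prints: neg
```

    SCFL's contribution is precisely that it relaxes the view-sufficiency assumption this loop depends on.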

Shu-ching Chen - One of the best experts on this subject based on the ideXlab platform.

  • Semantic Concept Detection Using Weighted Discretization Multiple Correspondence Analysis for Disaster Information Management
    2016 IEEE 17th International Conference on Information Reuse and Integration (IRI), 2016
    Co-Authors: Samira Pouyanfar, Shu-ching Chen
    Abstract:

    Multimedia Semantic Concept detection has emerged as a research area in recent years. One of the prominent challenges in multimedia Concept detection is data imbalance. In this study, a multimedia data mining framework for detecting Concepts of interest in videos is presented. First, the Minimum Description Length (MDL) discretization algorithm is extended to handle imbalanced data. Thereafter, a novel Weighted Discretization Multiple Correspondence Analysis (WD-MCA) algorithm based on the Multiple Correspondence Analysis (MCA) approach is proposed to maximize the correlation between feature-value pairs and Concept classes by incorporating the discretization information captured by the MDL module. The proposed framework achieves promising performance on videos containing disaster events. The experimental results demonstrate the effectiveness of the WD-MCA algorithm, particularly for imbalanced datasets, compared to several existing methods.

  • Correlation-based deep learning for multimedia Semantic Concept detection
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015
    Co-Authors: Hsin-yu Ha, Haiman Tian, Yimin Yang, Samira Pouyanfar, Shu-ching Chen
    Abstract:

    Nowadays, Concept detection from multimedia data is considered an emerging topic due to its applicability to various applications in both academia and industry. However, there are some inevitable challenges, including the high volume and variety of multimedia data as well as its skewed distribution. To cope with these challenges, in this paper, a novel framework is proposed to integrate two correlation-based methods, Feature-Correlation Maximum Spanning Tree (FC-MST) and Negative-based Sampling (NS), with a well-known deep learning algorithm called Convolutional Neural Network (CNN). First, FC-MST is introduced to select the most relevant low-level features, which are extracted from multiple modalities, and to decide the input layer dimension of the CNN. Second, NS is adopted to improve batch sampling in the CNN. Using the NUS-WIDE image data set as a web-based application, the experimental results demonstrate the effectiveness of the proposed framework for Semantic Concept detection compared to other well-known classifiers.
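
    The maximum-spanning-tree step over feature correlations can be illustrated with a small Prim's-algorithm sketch on a hypothetical correlation matrix (the paper's actual FC-MST selection criteria are richer than this):

```python
def max_spanning_tree(corr):
    """Prim's algorithm on a dense correlation matrix; returns the
    edges of the maximum spanning tree, i.e., the strongest
    correlation backbone connecting all features."""
    n = len(corr)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        best = max(((corr[i][j], i, j)
                    for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: e[0])
        edges.append((best[1], best[2], best[0]))
        in_tree.add(best[2])
    return edges

# Hypothetical absolute correlations among four low-level features.
corr = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]
print(max_spanning_tree(corr))  # → [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.7)]
```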

  • Gaussian Mixture Model-Based Subspace Modeling for Semantic Concept Retrieval
    2015 IEEE International Conference on Information Reuse and Integration, 2015
    Co-Authors: Chao Chen, Mei-ling Shyu, Shu-ching Chen
    Abstract:

    Data mining and machine learning methods have been playing an important role in searching and retrieving multimedia information from all kinds of multimedia repositories. Although some of these methods have proven useful, it remains an interesting and active research area to effectively and efficiently retrieve multimedia information under difficult scenarios, e.g., detecting rare events or learning from imbalanced datasets. In this paper, we propose a novel subspace modeling framework that is able to effectively retrieve Semantic Concepts from highly imbalanced datasets. The proposed framework builds positive subspace models on a set of positive training sets, each of which is generated by a Gaussian Mixture Model (GMM) that partitions the data instances of a target Concept (i.e., the original positive set of the target Concept) into several subsets, each of which is later merged with the original positive data instances. Afterwards, a joint-scoring method is proposed to fuse the final ranking scores from all such positive subspace models and the negative subspace model. Experimental results on a publicly available benchmark dataset show that the proposed subspace modeling framework is able to outperform peer methods commonly used for Semantic Concept retrieval.

  • IRI - Correlation-based re-ranking for Semantic Concept detection
    Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014), 2014
    Co-Authors: Hsin-yu Ha, Shu-ching Chen, Fausto C. Fleites, Min Chen
    Abstract:

    Semantic Concept detection is among the most important and challenging topics in multimedia research. Its objective is to effectively identify high-level Semantic Concepts from low-level features for multimedia data analysis and management. In this paper, a novel re-ranking method is proposed based on correlations among Concepts to automatically refine detection results and improve detection accuracy. Specifically, multiple correspondence analysis (MCA) is utilized to capture the relationship between a targeted Concept and all other Semantic Concepts. This relationship is then used as a transaction weight to refine the detection ranking scores. To demonstrate its effectiveness in refining Semantic Concept detection, the proposed re-ranking method is applied to the detection scores of the TRECVID 2011 benchmark data set, and its performance is compared with other state-of-the-art re-ranking approaches.
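
    The re-ranking step can be sketched as a weighted combination of a target Concept's raw scores with the scores of correlated Concepts (the weights below are hypothetical; in the paper they are derived from MCA):

```python
def rerank(target_scores, other_scores, weights):
    """Refine the target Concept's detection scores with a weighted
    sum of correlated Concepts' scores."""
    return [s + sum(weights[c] * other_scores[c][i] for c in weights)
            for i, s in enumerate(target_scores)]

# Shot 1 truly contains "road"; the positively correlated "car"
# detector boosts it, while "indoor" suppresses shot 2.
target = [0.40, 0.45]                      # raw "road" scores for two shots
others = {"car": [0.9, 0.1], "indoor": [0.1, 0.8]}
weights = {"car": 0.3, "indoor": -0.2}     # hypothetical MCA-derived weights
print([round(s, 2) for s in rerank(target, others, weights)])  # → [0.65, 0.32]
```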

Yufeng Chen - One of the best experts on this subject based on the ideXlab platform.

  • Improving Performance of NMT Using Semantic Concept of WordNet Synset
    Communications in Computer and Information Science, 2019
    Co-Authors: Jinan Xu, Yufeng Chen, Gouyi Miao, Yujie Zhang
    Abstract:

    Neural machine translation (NMT) has shown promising progress in recent years. However, to reduce computational complexity, NMT typically needs to limit its vocabulary to a fixed or relatively acceptable size, which leads to the problem of rare and out-of-vocabulary (OOV) words. In this paper, we show that the Semantic Concept information of a word can help NMT learn better Semantic representations of words and improve translation accuracy. The key idea is to utilize the external Semantic knowledge base WordNet to replace rare words and OOVs with the Semantic Concepts of their WordNet synsets. More specifically, we propose two Semantic similarity models to obtain the most similar Concepts of rare words and OOVs. Experimental results on four translation tasks (English-to-German, German-to-English, English-to-Chinese, and Chinese-to-English) show that our method outperforms the baseline RNNSearch by 2.38–2.88 BLEU points. Furthermore, a hybrid method combining BPE with our proposed method gains a further 0.39–0.97 BLEU points over BPE. Experiments and analysis presented in this study also demonstrate that the proposed method can significantly improve the translation quality of OOVs in NMT.
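
    The replacement step can be sketched as follows, with a toy dictionary standing in for WordNet and for the paper's Semantic similarity models (all words and synset labels below are illustrative):

```python
# Toy stand-in for WordNet: maps rare/OOV words to a synset Concept
# label. The paper instead queries WordNet and picks the most similar
# Concept with one of two Semantic similarity models.
SYNSET_OF = {
    "cheetah": "big_cat.n.01",
    "lynx": "big_cat.n.01",
    "schooner": "sailing_vessel.n.01",
}

def replace_oov(tokens, vocab):
    """Replace each out-of-vocabulary token with its synset Concept,
    falling back to a generic <unk> token."""
    return [t if t in vocab else SYNSET_OF.get(t, "<unk>") for t in tokens]

vocab = {"a", "ran", "across", "the", "road"}
print(replace_oov("a cheetah ran across the road".split(), vocab))
# → ['a', 'big_cat.n.01', 'ran', 'across', 'the', 'road']
```

    The NMT system then trains and decodes over this Concept-augmented vocabulary, so rare surface forms share the statistics of their synset.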

  • Improving Performance of NMT Using Semantic Concept of WordNet Synset
    EasyChair Preprints, 2018
    Co-Authors: Jinan Xu, Yufeng Chen, Gouyi Miao, Yujie Zhang
    Abstract:

    Neural machine translation (NMT) has shown promising progress in recent years. However, to reduce computational complexity, NMT typically needs to limit its vocabulary to a fixed or relatively acceptable size, which leads to the problem of rare and out-of-vocabulary (OOV) words. In this paper, we show that the Semantic Concept information of a word can help NMT learn better Semantic representations of words and improve translation accuracy. The key idea is to utilize the external Semantic knowledge base WordNet to replace rare words and OOVs with the Semantic Concepts of their WordNet synsets. More specifically, we propose two Semantic similarity models to obtain the most similar Concepts of rare words and OOVs. Experimental results on four translation tasks show that our method outperforms the baseline RNNSearch by 2.38–2.88 BLEU points. Furthermore, a hybrid method combining BPE with our proposed method gains a further 0.39–0.97 BLEU points over BPE. Experiments and analysis presented in this study also demonstrate that the proposed method can significantly improve the translation quality of OOVs in NMT.

  • NLPCC - A Semantic Concept Based Unknown Words Processing Method in Neural Machine Translation
    Natural Language Processing and Chinese Computing, 2018
    Co-Authors: Shaotong Li, Jinan Xu, Guoyi Miao, Yujie Zhang, Yufeng Chen
    Abstract:

    Unknown words in neural machine translation (NMT) not only affect the Semantic integrity of the source sentences but also adversely affect the generation of the target sentences. Traditional methods usually replace unknown words according to the similarity of word vectors, but such approaches have difficulty dealing with rare words and polysemous words. Therefore, this paper proposes a new method of unknown-word processing in NMT based on the Semantic Concepts of the source language. Firstly, we use the Semantic Concepts of a source-language Semantic dictionary to find candidate in-vocabulary words. Secondly, we propose a method to calculate Semantic similarity by integrating the source language model and the Semantic Concept network, in order to obtain the best replacement word. Experiments on an English-to-Chinese translation task demonstrate that our proposed method can achieve more than 2.6 BLEU points over the conventional NMT method. Compared with the traditional method based on word vector similarity, our method also obtains an improvement of nearly 0.8 BLEU points.

  • CWMT - An Unknown Word Processing Method in NMT by Integrating Syntactic Structure and Semantic Concept
    Communications in Computer and Information Science, 2017
    Co-Authors: Guoyi Miao, Shaotong Li, Jinan Xu, Yancui Li, Yufeng Chen
    Abstract:

    The unknown words in neural machine translation (NMT) may undermine the integrity of sentence structure, increase ambiguity, and have an adverse effect on the translation. In order to solve this problem, we propose a method of processing unknown words in NMT that integrates syntactic structure and Semantic Concepts. Firstly, the Semantic Concept network is used to construct the set of in-vocabulary synonyms corresponding to the unknown words. Secondly, a Semantic similarity calculation method based on syntactic structure and Semantic Concepts is proposed. The best substitute is selected from the set of in-vocabulary synonyms by calculating the Semantic similarity between the unknown words and their candidate substitutes. English-Chinese translation experiments demonstrate that this method can maintain the Semantic integrity of the source-language sentences. Meanwhile, in performance, our proposed method obtains an improvement of 2.9 BLEU points compared with the conventional NMT method, and an improvement of 0.95 BLEU points compared with the traditional method of positioning the UNK character based on word alignment information.

Yong Man Ro - One of the best experts on this subject based on the ideXlab platform.

  • Near-Duplicate Video Clip Detection Using Model-Free Semantic Concept Detection and Adaptive Semantic Distance Measurement
    IEEE Transactions on Circuits and Systems for Video Technology, 2012
    Co-Authors: Jae-young Choi, Wesley De Neve, Yong Man Ro
    Abstract:

    Motivated by the observation that content transformations tend to preserve the Semantic information conveyed by video clips, this paper introduces a novel technique for near-duplicate video clip (NDVC) detection, leveraging model-free Semantic Concept detection and adaptive Semantic distance measurement. In particular, model-free Semantic Concept detection is realized by taking advantage of the collective knowledge in an image folksonomy (which is an unstructured collection of user-contributed images and tags), facilitating the use of an unrestricted Concept vocabulary. Adaptive Semantic distance measurement is realized by means of the signature quadratic form distance (SQFD), making it possible to flexibly measure the similarity between video shots that contain a varying number of Semantic Concepts, and where these Semantic Concepts may also differ in terms of relevance and nature. Experimental results obtained for the MIRFLICKR-25000 image set (used as a source of collective knowledge) and the TRECVID 2009 video set (used to create query and reference video clips) demonstrate that model-free Semantic Concept detection and SQFD can be successfully used for the purpose of identifying NDVCs.
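
    The signature quadratic form distance itself is straightforward to sketch: concatenate both signatures, negate the second signature's weights, and evaluate the quadratic form under a similarity kernel (the Laplacian-style kernel and the toy signatures below are assumptions, not the paper's exact choices):

```python
from math import sqrt, exp, dist

def sqfd(sig1, sig2, alpha=1.0):
    """Signature quadratic form distance between two Concept
    signatures [(center, weight), ...] of possibly different lengths.
    Similarity kernel: exp(-alpha * d(ci, cj))."""
    centers = [c for c, _ in sig1] + [c for c, _ in sig2]
    weights = [w for _, w in sig1] + [-w for _, w in sig2]
    q = sum(weights[i] * weights[j] * exp(-alpha * dist(centers[i], centers[j]))
            for i in range(len(centers)) for j in range(len(centers)))
    return sqrt(max(q, 0.0))  # clamp tiny negative rounding errors

s1 = [((0.0, 0.0), 0.6), ((1.0, 1.0), 0.4)]
s2 = [((0.0, 0.1), 0.5), ((1.0, 0.9), 0.5)]
print(round(sqfd(s1, s1), 6))  # identical signatures → 0.0
print(sqfd(s1, s2) > 0)        # distinct signatures → positive distance
```

    Because the two signatures need not have the same number of entries, this matches the paper's setting where shots carry varying numbers of Semantic Concepts with varying relevance weights.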

  • ICIP - Towards a better understanding of model-free Semantic Concept detection for annotation and near-duplicate video clip detection
    2011 18th IEEE International Conference on Image Processing, 2011
    Co-Authors: Jae-young Choi, Wesley De Neve, Yong Man Ro
    Abstract:

    Given the observation that content transformations tend to preserve Semantic information, we demonstrated in previous research that model-free Semantic Concept detection can be successfully leveraged for identifying NDVCs. In this paper, we seek a better understanding of the usefulness of model-free Semantic Concept detection for both the task of annotation and NDVC detection. In particular, through extensive experiments, we demonstrate that the problem of detecting Semantic Concepts for the goal of identifying NDVCs is more relaxed than the problem of detecting Semantic Concepts for annotation purposes: whereas incorrectly detected Semantic Concepts negatively affect the effectiveness of annotation, they do not negatively affect the effectiveness of NDVC detection, as long as the same incorrect Semantic Concepts are detected for both the reference and near-duplicate video clips. This observation has practical implications for the design of a video management system that makes use of model-free Semantic Concept detection for both the purpose of annotation and NDVC detection.

  • PCM (1) - Training strategy of Semantic Concept detectors using support vector machine in naked image classification
    Advances in Multimedia Information Processing - PCM 2010, 2010
    Co-Authors: Jae-hyun Jeon, Jae-young Choi, Yong Man Ro
    Abstract:

    Recently, on the Web and online social networking sites, the classification and filtering of naked images have been receiving a significant amount of attention. In our previous work, Semantic features were found to be more useful in this application than low-level visual features alone. In this paper, we further investigate effective training strategies when making use of a Support Vector Machine (SVM) to generate Semantic Concept detectors. The proposed training strategy aims at increasing the performance of the Semantic Concept detectors, thereby boosting naked-image classification performance. Extensive and comparative experiments have been carried out to assess the effectiveness of the proposed training strategy. In our experiments, each Semantic Concept detector is trained with 600 images and tested with 300 images. In addition, three data sets comprising 600 training images and 1000 testing images are used to test naked-image classification performance. The experimental results show that the proposed training strategy improves Semantic Concept detection performance compared to the conventional SVM training strategy. In addition, by using our training strategy, one can improve overall naked-image classification performance when making use of Semantic features.