Graph Database

The Experts below are selected from a list of 27288 Experts worldwide ranked by ideXlab platform

Toyotaro Suzumura - One of the best experts on this subject based on the ideXlab platform.

  • System G Distributed Graph Database.
    arXiv: Databases, 2018
    Co-Authors: Gabriel Tanase, Toyotaro Suzumura, Jinho Lee, Chun-fu Chen, Jason Crawford, Hiroki Kanezashi, Song Zhang, Warut D. Vijitbenjaronk
    Abstract:

    Motivated by the need to extract knowledge and value from interconnected data, graph analytics on big data is a very active area of research in both industry and academia. To support graph analytics efficiently, a large number of in-memory graph libraries, graph processing systems, and graph databases have emerged. Projects in each of these categories focus on particular aspects such as static versus dynamic graphs, offline versus online processing, small versus large graphs, etc. While there has been much progress in graph processing in the past decades, there is still a need for fast graph processing using a cluster of machines with distributed storage. In this paper, we discuss a novel distributed graph database called System G, designed for efficient graph data storage and processing on modern computing architectures. In particular, we describe a single-node graph database and a runtime and communication layer that allows us to compose a distributed graph database from multiple single-node instances. From various industry requirements, we find that fast insertions and large-volume concurrent queries are critical parts of graph databases, and we optimize our database for such features. We experimentally show the efficiency of System G for storing data and processing graph queries on state-of-the-art platforms.

  • Acacia-RDF: An X10-Based Scalable Distributed RDF Graph Database Engine
    International Conference on Cloud Computing, 2016
    Co-Authors: Miyuru Dayarathna, Isuru Herath, Yasima Dewmini, Gayan Mettananda, Sameera Nandasiri, Sanath Jayasena, Toyotaro Suzumura
    Abstract:

    Linked data mining has become one of the key questions in high-performance graph mining in recent years. However, the existing Resource Description Framework (RDF) database engines are not scalable and are less reliable in heterogeneous clouds. In this paper, we describe the design and implementation of Acacia-RDF, a scalable distributed RDF graph database engine developed with the X10 programming language to solve this issue. Acacia-RDF partitions RDF data sets into subgraphs following the vertex-cut paradigm. The partitioned data sets are persisted on secondary storage across X10 places. We developed a scalable SPARQL (an RDF query language) processor for Acacia-RDF which operates on top of the partitioned RDF data. Furthermore, we designed and implemented a replication-based fault tolerance mechanism for Acacia-RDF. We present performance results gathered from Acacia with different scales of LUBM (Lehigh University Benchmark) RDF benchmark data sets. We compare Acacia-RDF's performance against the Neo4j graph database server. From the scalability experiments conducted on up to 16 X10 places, we observed that Acacia-RDF scales well with LUBM data sets. Acacia-RDF reported elapsed times of less than ten seconds on 16 places for running the first and the third queries of the LUBM benchmark on the LUBM 160-university data set, which has 3.6 million vertices and 28.5 million edges and is 1.7 GB in size. Through this work, we describe and demonstrate the use of the X10 language for development of scalable RDF graph data management systems.

  • Introducing Acacia-RDF: An X10-Based Scalable Distributed RDF Graph Database Engine
    International Parallel and Distributed Processing Symposium, 2016
    Co-Authors: Miyuru Dayarathna, Isuru Herath, Yasima Dewmini, Gayan Mettananda, Sameera Nandasiri, Sanath Jayasena, Toyotaro Suzumura
    Abstract:

    Linked data mining has become one of the key questions in HPC graph mining in recent years. However, the existing RDF database engines are not scalable and are less reliable in heterogeneous clouds. In this paper, we describe the design and implementation of Acacia-RDF, a scalable distributed RDF graph database engine developed with the X10 programming language to solve this issue. Acacia-RDF partitions RDF data sets into subgraphs following the vertex-cut paradigm. The partitioned data sets are persisted on secondary storage across X10 places. We developed a scalable SPARQL processor for Acacia-RDF which operates on top of the partitioned RDF data. Furthermore, we demonstrate the implementation of scalable graph algorithms such as triangle counting with such partitioned data sets. We present performance results gathered from Acacia with different scales of LUBM RDF benchmark data sets and compare Acacia's performance against the Neo4j graph database server. From the scalability experiments conducted on up to 16 X10 places, we observed that Acacia-RDF scales well with LUBM data sets. Acacia-RDF reported an elapsed time of approximately 2 seconds on 4 places for running the first and third queries of the LUBM benchmark on the LUBM scale 40 data set. Through this work, we introduce the use of the X10 language for scalable RDF graph data management.

  • Graph Database benchmarking on cloud environments with XGDBench
    Automated Software Engineering, 2014
    Co-Authors: Miyuru Dayarathna, Toyotaro Suzumura
    Abstract:

    Online graph database service providers have started migrating their operations to public clouds due to the increasing demand for low-cost, ubiquitous graph data storage and analysis. However, there is little support available for benchmarking graph database systems in cloud environments. We describe XGDBench, a graph database benchmarking platform for cloud computing systems. XGDBench has been designed with the aim of creating an extensible platform for graph database benchmarking, which makes it suitable for benchmarking future HPC systems. We extend the Yahoo! Cloud Serving Benchmark (YCSB) to the area of graph database benchmarking through the creation of XGDBench. The benchmarking platform is written in X10, a PGAS (partitioned global address space) language intended for programming future HPC systems. We describe the architecture of XGDBench and explain how it differs from the current state of the art. We conduct a performance evaluation of five well-known graph data stores (AllegroGraph, Fuseki, Neo4j, OrientDB, and Titan) using XGDBench on the Tsubame 2.0 HPC cloud environment.

  • Towards Scalable Distributed Graph Database Engine for Hybrid Clouds
    Proceedings of the 5th International Workshop on Data-Intensive Computing in the Clouds, 2014
    Co-Authors: Miyuru Dayarathna, Toyotaro Suzumura
    Abstract:

    Large graph data management and mining in clouds has become an important issue in recent times. We propose Acacia, a distributed graph database engine for scalable handling of such large graph data. Acacia operates across the boundary between private and public clouds. Acacia partitions and stores the graph data in the private cloud during its initial deployment. Acacia bursts into the public cloud when the resources of the private cloud are insufficient to maintain its service-level agreements. We implement Acacia using the X10 programming language. We describe how Top-K PageRank has been implemented in Acacia. We report preliminary results of experiments conducted with Acacia on a small compute cluster. Acacia is able to upload the LiveJournal social network data set, with 69 million edges, in about 10 minutes. Furthermore, Acacia calculates the average out-degree of the vertices of the LiveJournal graph in 2 minutes. These results indicate Acacia's potential for handling large graphs.
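
The Acacia abstract above names two graph computations, the average out-degree of vertices and Top-K PageRank. As a rough single-machine illustration of what those quantities are (not Acacia's distributed X10 implementation, which the abstract does not detail), the following Python sketch computes both from an edge list. The function names, the damping factor of 0.85, and the fixed iteration count are assumptions made for this example.

```python
from collections import defaultdict


def average_out_degree(edges):
    """Mean number of outgoing edges per vertex (sinks count as zero)."""
    vertices = {v for edge in edges for v in edge}
    return len(edges) / len(vertices) if vertices else 0.0


def top_k_pagerank(edges, k, damping=0.85, iterations=50):
    """Return the k highest-ranked vertices using plain power iteration."""
    vertices = sorted({v for edge in edges for v in edge})
    if not vertices:
        return []
    out_neighbors = defaultdict(list)
    for src, dst in edges:
        out_neighbors[src].append(dst)
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(iterations):
        # Rank held by vertices without out-edges is spread over all vertices.
        dangling = sum(rank[v] for v in vertices if not out_neighbors[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in vertices}
        for src in vertices:
            targets = out_neighbors[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return sorted(rank.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical toy graph: a three-vertex cycle plus one extra vertex pointing into it.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
print(average_out_degree(edges))
print(top_k_pagerank(edges, k=2))
```

A distributed engine would partition the edge list and exchange rank contributions between partitions on each iteration; the arithmetic per vertex stays the same.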

Wei Wang - One of the best experts on this subject based on the ideXlab platform.

  • Power Grid Modeling and Topology Analysis Based on Graph Database Conforming with CIM/E
    2020
    Co-Authors: Hua Huang, Mingyu Zhai, Xin Shan, Yi Wang, Wei Wang
    Abstract:

    Power systems are growing ever larger and are operated more frequently, which places higher requirements on the real-time performance of analysis and calculation. The graph database is a new type of database that originated from the parallel processing of massive data on the Internet. Its data model can visually express the topology of the grid and readily supports parallel traversal queries. First, the characteristics of graph databases are introduced in terms of data model and data query, and the potential advantages of applying them to large-scale power system analysis and calculation are analyzed. Second, a design method for power system modeling is presented that satisfies the guidelines of integrity, consistency, and efficiency while conforming with the CIM/E standard. Finally, a parallel power network topology analysis algorithm is implemented based on the graph database model for a provincial grid. The calculation results for the actual large-scale provincial power grid show that the proposed method can significantly improve the topology search efficiency.

  • Graph Database Indexing Using Structured Graph Decomposition
    Proceedings - International Conference on Data Engineering, 2007
    Co-Authors: David W. Williams, Jun Huan, Wei Wang
    Abstract:

    We introduce a novel method of indexing graph databases in order to facilitate subgraph isomorphism and similarity queries. The index comprises two major data structures. The primary structure is a directed acyclic graph which contains a node for each of the unique, induced subgraphs of the database graphs. The secondary structure is a hash table which cross-indexes each subgraph for fast isomorphic lookup. In order to create a hash key independent of isomorphism, we utilize a code-based canonical representation of adjacency matrices, which we have further refined to improve computation speed. We validate the concept by demonstrating its effectiveness in answering queries for two practical datasets. Our experiments show that for subgraph isomorphism queries, our method outperforms existing methods by more than an order of magnitude.
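
The indexing abstract above relies on a hash key that is the same for all isomorphic copies of a subgraph, derived from a canonical code of the adjacency matrix. The sketch below illustrates that idea in its simplest brute-force form, assuming small undirected, unlabeled graphs: it tries every vertex ordering and keeps the lexicographically smallest upper-triangle bit string. This is not the authors' refined coding scheme, which the abstract does not spell out; `canonical_code` and `build_index` are hypothetical names for this illustration.

```python
from itertools import permutations


def canonical_code(num_vertices, edges):
    """Smallest upper-triangle adjacency bit string over all vertex orderings.

    Exponential in num_vertices, so only suitable for tiny graphs.
    """
    edge_set = {frozenset(edge) for edge in edges}
    best = None
    for order in permutations(range(num_vertices)):
        # order[i] is the original vertex placed at row/column i.
        bits = "".join(
            "1" if frozenset((order[i], order[j])) in edge_set else "0"
            for i in range(num_vertices)
            for j in range(i + 1, num_vertices)
        )
        if best is None or bits < best:
            best = bits
    return best


def build_index(graphs):
    """Hash table from canonical code to the ids of graphs having that code."""
    index = {}
    for graph_id, (num_vertices, edges) in graphs.items():
        index.setdefault(canonical_code(num_vertices, edges), []).append(graph_id)
    return index


# Two differently numbered 3-vertex paths and one triangle (toy data).
graphs = {
    "path_a": (3, [(0, 1), (1, 2)]),
    "path_b": (3, [(0, 1), (0, 2)]),
    "triangle": (3, [(0, 1), (1, 2), (0, 2)]),
}
print(build_index(graphs))
```

Because both paths map to the same key, an isomorphism lookup becomes a single hash-table probe once the query graph's code has been computed.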

Shrikrishna A. Khaparde - One of the best experts on this subject based on the ideXlab platform.

  • A Common Information Model Oriented Graph Database Framework for Power Systems
    IEEE Transactions on Power Systems, 2017
    Co-Authors: Gelli Ravikumar, Shrikrishna A. Khaparde
    Abstract:

    The Common Information Model (CIM) is widely adopted by many utilities since it offers interoperability through standard information models. Storing, processing, retrieving, and providing concurrent access to large power network models for the various power system applications in the CIM framework are current challenges faced by utility operators. As power network models resemble largely connected data sets, the design of a CIM-oriented database has to support high-speed retrieval of the connected data and efficient storage for processing. The graph database is gaining wide acceptance for storing and processing largely connected data for various applications. This paper presents a design of a CIM-oriented graph database (CIMGDB) for storing and processing the largely connected data of power system applications. Three significant advantages of the CIMGDB are efficient data retrieval and storage, agility to adapt to dynamic changes in the CIM profile, and greater flexibility in modeling the CIM unified modeling language (UML) in the graph database (GDB). The CIMGDB does not need a predefined database schema. Therefore, the CIM semantics need to be added to the artifacts of the GDB for every instance of CIM object storage. A CIM-based object-graph mapping methodology is proposed to automate the process. The integration of the CIMGDB and power system applications is discussed through an implementation architecture. The data-intensive network topology processing (NTP) is implemented and demonstrated for six IEEE test networks and one practical 400 kV Maharashtra network. Results such as the computation time of executing network topology processing are used to evaluate the performance of the CIMGDB.

  • CIM Oriented Graph Database for Network Topology Processing and Applications Integration
    2015 50th International Universities Power Engineering Conference (UPEC), 2015
    Co-Authors: Gelli Ravikumar, Shrikrishna A. Khaparde
    Abstract:

    Though CIM integrates the proprietary software applications of a utility, managing the sheer volume of data exchanged between these software application systems is still challenging for utility operators. With the advent of high-performance database technologies and real-time big-data processing tools, a CIM-oriented database can play an important role in the operation and processing of power system data. As the power system resembles connected data, the choice of database technology has to be based on performance measures such as high-speed retrieval of the connected data and efficient storage. This paper presents a CIM-oriented graph database (CIMGDB) built using an object-graph mapping methodology. The integration of the CIMGDB with power system applications is discussed through the developed implementation architecture. The network topology processing (NTP) application is implemented on the CIMGDB. The NTP is tested on six IEEE test systems and on one practical power system network. The six IEEE test systems are used to evaluate and compare the time complexity of the CIMGDB with that of the CIM-oriented relational database framework. The practical power system network is used to demonstrate the implementation architecture and the CIMGDB integration with power system applications.
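
Network topology processing, as mentioned in the two abstracts above, can be pictured as a graph traversal: connectivity nodes joined by closed switching devices collapse into one electrical bus. The sketch below shows that core step with a union-find structure over a toy in-memory graph. It is a simplified illustration only, not the CIMGDB implementation; the dictionary layout (`from`, `to`, `closed`) and the node names are hypothetical stand-ins for what an object-graph mapping might store as edge properties.

```python
def topological_nodes(connectivity_nodes, switches):
    """Group connectivity nodes that are joined by closed switches.

    Each switch is a dict with hypothetical keys "from", "to", and "closed".
    Returns a sorted list of sorted node groups (one group per electrical bus).
    """
    parent = {node: node for node in connectivity_nodes}

    def find(node):
        # Follow parent links to the group representative, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for switch in switches:
        if switch["closed"]:
            root_a, root_b = find(switch["from"]), find(switch["to"])
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for node in connectivity_nodes:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Toy substation: two closed breakers and one open disconnector between them.
nodes = ["cn1", "cn2", "cn3", "cn4"]
switches = [
    {"from": "cn1", "to": "cn2", "closed": True},
    {"from": "cn2", "to": "cn3", "closed": False},
    {"from": "cn3", "to": "cn4", "closed": True},
]
print(topological_nodes(nodes, switches))
```

In a graph database the same grouping would typically be expressed as a traversal that follows only edges whose switch status is closed, which is why connected-data retrieval speed matters for this application.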

Miyuru Dayarathna - One of the best experts on this subject based on the ideXlab platform.

  • Acacia-RDF: An X10-Based Scalable Distributed RDF Graph Database Engine
    International Conference on Cloud Computing, 2016
    Co-Authors: Miyuru Dayarathna, Isuru Herath, Yasima Dewmini, Gayan Mettananda, Sameera Nandasiri, Sanath Jayasena, Toyotaro Suzumura

  • Introducing Acacia-RDF: An X10-Based Scalable Distributed RDF Graph Database Engine
    International Parallel and Distributed Processing Symposium, 2016
    Co-Authors: Miyuru Dayarathna, Isuru Herath, Yasima Dewmini, Gayan Mettananda, Sameera Nandasiri, Sanath Jayasena, Toyotaro Suzumura

  • Graph Database benchmarking on cloud environments with XGDBench
    Automated Software Engineering, 2014
    Co-Authors: Miyuru Dayarathna, Toyotaro Suzumura

  • Towards Scalable Distributed Graph Database Engine for Hybrid Clouds
    Proceedings of the 5th International Workshop on Data-Intensive Computing in the Clouds, 2014
    Co-Authors: Miyuru Dayarathna, Toyotaro Suzumura

Claudio Gutierrez - One of the best experts on this subject based on the ideXlab platform.

  • Survey of Graph Database Models
    ACM Computing Surveys, 2008
    Co-Authors: Renzo Angles, Claudio Gutierrez
    Abstract:

    Graph database models can be defined as those in which data structures for the schema and instances are modeled as graphs or generalizations of them, and data manipulation is expressed by graph-oriented operations and type constructors. These models took off in the eighties and early nineties alongside object-oriented models. Their influence gradually died out with the emergence of other database models, in particular geographical, spatial, semistructured, and XML. Recently, the need to manage information with graph-like nature has reestablished the relevance of this area. The main objective of this survey is to present the work that has been conducted in the area of graph database modeling, concentrating on data structures, query languages, and integrity constraints.

  • Querying RDF Data from a Graph Database Perspective
    European Semantic Web Conference, 2005
    Co-Authors: Renzo Angles, Claudio Gutierrez
    Abstract:

    This paper studies the RDF model from a database perspective. From this point of view, it is compared with other database models, particularly with graph database models, which are very close in motivations and use cases to RDF. We concentrate on query languages, analyze current RDF trends, and propose the incorporation into RDF query languages of primitives which are not present today, based on the experience and techniques of graph database research.
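
To make the graph view of RDF concrete: a set of (subject, predicate, object) triples can be read as a directed graph whose edges are labeled by predicates. The sketch below shows one graph-oriented operation on such data, reachability along a chosen predicate, using breadth-first search. The abstract does not list which primitives the authors propose, so this is only an assumed example of the general kind of operation graph databases offer; the `reachable` function and the sample triples are hypothetical.

```python
from collections import deque


def reachable(triples, start, predicate):
    """Resources reachable from start by following one or more predicate edges."""
    successors = {}
    for subject, pred, obj in triples:
        if pred == predicate:
            successors.setdefault(subject, []).append(obj)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in successors.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Toy triples using plain strings instead of full IRIs.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "likes", "dave"),
]
print(sorted(reachable(triples, "alice", "knows")))
```

A single triple pattern matches paths of fixed length only, whereas this traversal follows paths of any length, which is the sort of difference a graph database perspective brings into focus.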