Data Consolidation

14,000,000 Leading Edge Experts on the ideXlab platform


The Experts below are selected from a list of 55623 Experts worldwide ranked by ideXlab platform

Panagiotis Kokkinos - One of the best experts on this subject based on the ideXlab platform.

  • Data Consolidation and Information Aggregation in Grid Networks
    Advances in Grid Computing, 2011
    Co-Authors: Panagiotis Kokkinos, Emmanouel Varvarigos
    Abstract:

    Grids consist of geographically distributed and heterogeneous computational and storage resources that may belong to different administrative domains, but are shared among users by establishing a global resource management architecture. A variety of applications can benefit from Grid computing; among them are Data-intensive applications that perform computations on large Datasets stored at geographically distributed resources. In this context, we identify two important issues: i) Data Consolidation, which relates to the handling of these Data-intensive applications, and ii) information aggregation, which relates to the summarization of resource information and the provision of information confidentiality among the different administrative domains. Data Consolidation (DC) applies to Data-intensive applications that need more than one piece of Data to be transferred to an appropriate site before the application can start its execution at that site. Although an application/task may not need all the Datasets at the time it starts executing, it is usually beneficial both for the network and for the application to perform the Dataset transfers concurrently and before the task's execution. The DC problem consists of three interrelated sub-problems: (i) the selection of the replica of each Dataset (i.e., the Data repository site from which to obtain the Dataset) that will be used by the task, (ii) the selection of the site where these pieces of Data will be gathered and the task will be executed, and (iii) the selection of the paths the Datasets will follow in order to be transferred concurrently to the Data consolidating site. Furthermore, the delay required for transferring the output Data files to the originating user (or to a site the user specifies) should also be accounted for. In most cases the task's required Datasets will not be located at a single site, and a Data Consolidation operation is therefore required.
    Generally, a number of algorithms or policies can be used for solving these three sub-problems, either separately or jointly. Moreover, the order in which these sub-problems are handled may differ, and the performance optimization criteria used may also vary. The algorithms or policies for solving these sub-problems comprise a DC scheme. We will present a number of DC schemes: some consider only the computational or only the communication requirements of the tasks, while others consider both kinds of requirements. We will also describe DC schemes based on Minimum Spanning Trees (MST), which route the Datasets concurrently so as to reduce the congestion that these transfers may later cause. Our results strengthen our belief that DC is an important problem that needs to be addressed in the design of Data grids.
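
    The three sub-problems above can be illustrated with a small sketch. This is a hypothetical toy implementation, not the authors' algorithm: the topology, link costs, Dataset sizes and all names (`choose_consolidation_site`, etc.) are assumptions. A candidate site is scored by the total size-weighted shortest-path cost of fetching the cheapest replica of each Dataset (sub-problems i and ii); path selection is left to plain shortest paths.

```python
import heapq

def dijkstra(graph, src):
    """Shortest-path costs from src in a weighted graph {node: {neighbor: cost}}."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def choose_consolidation_site(graph, datasets, candidate_sites):
    """For each candidate execution site, pick the cheapest replica of each
    Dataset and score the site by the total transfer cost.
    datasets: {name: {"size": units, "replicas": [site, ...]}}"""
    best = None
    for site in candidate_sites:
        dist = dijkstra(graph, site)  # symmetric links assumed
        total, picks = 0.0, {}
        for name, ds in datasets.items():
            cost, rep = min((ds["size"] * dist[r], r) for r in ds["replicas"])
            picks[name] = rep
            total += cost
        if best is None or total < best[0]:
            best = (total, site, picks)
    return best

# Hypothetical 4-site topology with link costs and two Datasets.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 3},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 3, "C": 1},
}
datasets = {
    "d1": {"size": 10, "replicas": ["A", "D"]},
    "d2": {"size": 5, "replicas": ["C"]},
}
print(choose_consolidation_site(graph, datasets, list(graph)))
```

    With these assumed numbers, site D wins: it holds a replica of `d1` locally and sits one cheap hop from `d2`'s only replica, so the gather cost is minimal.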

  • Efficient Data Consolidation in grid networks and performance analysis
    Future Generation Computer Systems, 2011
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, Emmanouel Varvarigos
    Abstract:

    We examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task concurrently requests multiple pieces of Data, possibly scattered throughout the grid network, that have to be present at a selected site before the task's execution starts. In such a case, the scheduler and the Data manager must select (i) the Data replicas to be used, (ii) the site where these Data will be gathered for the task to be executed, and (iii) the routing paths to be followed, assuming that the selected Datasets are transferred concurrently to the execution site. The algorithms or policies for selecting the Data replicas, the Data consolidating site and the corresponding paths comprise a Data Consolidation scheme. We propose and experimentally evaluate several DC schemes with a polynomial number of operations that attempt to estimate the cost of the concurrent Data transfers, to avoid the congestion that these transfers may cause and to provide fault tolerance. Our simulation results strengthen our belief that DC is an important problem that needs to be addressed in the design of Data grids, and can lead, if performed efficiently, to significant benefits in terms of task delay, network load and other performance parameters.
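
    One way such a scheme can estimate the cost of concurrent transfers is to score a candidate consolidation site by the maximum load the simultaneous transfers would place on any single link. The sketch below is an illustrative assumption, not the paper's actual scheme; the function names and example topology are hypothetical.

```python
import heapq
from collections import defaultdict

def shortest_paths(graph, src):
    """Dijkstra on {node: {neighbor: cost}}, returning a predecessor map
    so paths toward src can be reconstructed."""
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return dist, prev

def congestion_score(graph, site, transfers):
    """Max Data volume placed on any one link if all transfers to `site`
    run concurrently along shortest paths. transfers: [(replica_site, size)]."""
    _, prev = shortest_paths(graph, site)
    load = defaultdict(float)
    for rep, size in transfers:
        v = rep
        while v != site:                 # walk replica -> site along the SP tree
            u = prev[v]                  # prev is rooted at `site`
            load[frozenset((u, v))] += size
            v = u
    return max(load.values(), default=0.0)

# Hypothetical topology: two transfers converge on the B-A link.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 3},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 3, "C": 1},
}
print(congestion_score(graph, "A", [("D", 10.0), ("C", 5.0)]))
```

    A scheduler could compute this score for every candidate site and prefer the one whose concurrent transfers load the busiest link least.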

  • Data Consolidation: A task scheduling and Data migration technique for grid networks
    Proceedings CCGRID 2008 - 8th IEEE International Symposium on Cluster Computing and the Grid, 2008
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, A. Kretsis
    Abstract:

    In this work we examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task needs two or more pieces of Data for its execution, possibly scattered throughout the grid network. In such a case, the scheduler and the Data manager must select the Data replicas to be used and the site where these will be gathered for the task to be executed. The policies for selecting the Data replicas and the Data consolidating site comprise a Data Consolidation scheme. We propose and experimentally evaluate a number of DC techniques. Our simulation results strengthen our belief that DC is an important technique for Data grids, since it can substantially improve task delay, network load and other performance-related parameters.

  • CCGRID - Data Consolidation: A Task Scheduling and Data Migration Technique for Grid Networks
    2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), 2008
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, A. Kretsis, Emmanouel Varvarigos
    Abstract:

    In this work we examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task needs two or more pieces of Data for its execution, possibly scattered throughout the grid network. In such a case, the scheduler and the Data manager must select the Data replicas to be used and the site where these will be gathered for the task to be executed. The policies for selecting the Data replicas and the Data consolidating site comprise a Data Consolidation scheme. We propose and experimentally evaluate a number of DC techniques. Our simulation results strengthen our belief that DC is an important technique for Data grids, since it can substantially improve task delay, network load and other performance-related parameters.

Emmanouel Varvarigos - One of the best experts on this subject based on the ideXlab platform.

  • Data Consolidation and Information Aggregation in Grid Networks
    Advances in Grid Computing, 2011
    Co-Authors: Panagiotis Kokkinos, Emmanouel Varvarigos
    Abstract:

    Grids consist of geographically distributed and heterogeneous computational and storage resources that may belong to different administrative domains, but are shared among users by establishing a global resource management architecture. A variety of applications can benefit from Grid computing; among them are Data-intensive applications that perform computations on large Datasets stored at geographically distributed resources. In this context, we identify two important issues: i) Data Consolidation, which relates to the handling of these Data-intensive applications, and ii) information aggregation, which relates to the summarization of resource information and the provision of information confidentiality among the different administrative domains. Data Consolidation (DC) applies to Data-intensive applications that need more than one piece of Data to be transferred to an appropriate site before the application can start its execution at that site. Although an application/task may not need all the Datasets at the time it starts executing, it is usually beneficial both for the network and for the application to perform the Dataset transfers concurrently and before the task's execution. The DC problem consists of three interrelated sub-problems: (i) the selection of the replica of each Dataset (i.e., the Data repository site from which to obtain the Dataset) that will be used by the task, (ii) the selection of the site where these pieces of Data will be gathered and the task will be executed, and (iii) the selection of the paths the Datasets will follow in order to be transferred concurrently to the Data consolidating site. Furthermore, the delay required for transferring the output Data files to the originating user (or to a site the user specifies) should also be accounted for. In most cases the task's required Datasets will not be located at a single site, and a Data Consolidation operation is therefore required.
    Generally, a number of algorithms or policies can be used for solving these three sub-problems, either separately or jointly. Moreover, the order in which these sub-problems are handled may differ, and the performance optimization criteria used may also vary. The algorithms or policies for solving these sub-problems comprise a DC scheme. We will present a number of DC schemes: some consider only the computational or only the communication requirements of the tasks, while others consider both kinds of requirements. We will also describe DC schemes based on Minimum Spanning Trees (MST), which route the Datasets concurrently so as to reduce the congestion that these transfers may later cause. Our results strengthen our belief that DC is an important problem that needs to be addressed in the design of Data grids.
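
    An MST-based scheme of the kind mentioned above can be sketched as follows: build a minimum spanning tree of the network and route every Dataset along its unique tree path to the consolidation site, so that concurrent transfers share the same low-cost edges in a controlled way. This is a hypothetical illustration, not the authors' algorithm; the topology, weights and function names are assumed.

```python
def kruskal_mst(nodes, edges):
    """Kruskal's algorithm with union-find. edges: [(weight, u, v)].
    Returns the MST as an adjacency map {node: set(neighbors)}."""
    parent = {n: n for n in nodes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x
    tree = {n: set() for n in nodes}
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                       # edge joins two components
            parent[ru] = rv
            tree[u].add(v)
            tree[v].add(u)
    return tree

def tree_path(tree, src, dst):
    """The unique path between two nodes of a tree, via iterative DFS."""
    stack, seen = [(src, [src])], {src}
    while stack:
        u, path = stack.pop()
        if u == dst:
            return path
        for v in tree[u]:
            if v not in seen:
                seen.add(v)
                stack.append((v, path + [v]))
    return None

# Hypothetical 4-site network; the MST keeps only the three unit-cost links.
nodes = ["A", "B", "C", "D"]
edges = [(1, "A", "B"), (4, "A", "C"), (1, "B", "C"), (3, "B", "D"), (1, "C", "D")]
mst = kruskal_mst(nodes, edges)
# Route replicas from D and C to a consolidation site "A" along the tree.
print(tree_path(mst, "D", "A"), tree_path(mst, "C", "A"))
```

    Both transfers traverse the B-A tree edge, which illustrates the trade-off the chapter alludes to: tree routing concentrates traffic on few cheap links, so the scheme must weigh that concentration against the congestion of routing each Dataset independently.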

  • Efficient Data Consolidation in grid networks and performance analysis
    Future Generation Computer Systems, 2011
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, Emmanouel Varvarigos
    Abstract:

    We examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task concurrently requests multiple pieces of Data, possibly scattered throughout the grid network, that have to be present at a selected site before the task's execution starts. In such a case, the scheduler and the Data manager must select (i) the Data replicas to be used, (ii) the site where these Data will be gathered for the task to be executed, and (iii) the routing paths to be followed, assuming that the selected Datasets are transferred concurrently to the execution site. The algorithms or policies for selecting the Data replicas, the Data consolidating site and the corresponding paths comprise a Data Consolidation scheme. We propose and experimentally evaluate several DC schemes with a polynomial number of operations that attempt to estimate the cost of the concurrent Data transfers, to avoid the congestion that these transfers may cause and to provide fault tolerance. Our simulation results strengthen our belief that DC is an important problem that needs to be addressed in the design of Data grids, and can lead, if performed efficiently, to significant benefits in terms of task delay, network load and other performance parameters.

  • CCGRID - Data Consolidation: A Task Scheduling and Data Migration Technique for Grid Networks
    2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), 2008
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, A. Kretsis, Emmanouel Varvarigos
    Abstract:

    In this work we examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task needs two or more pieces of Data for its execution, possibly scattered throughout the grid network. In such a case, the scheduler and the Data manager must select the Data replicas to be used and the site where these will be gathered for the task to be executed. The policies for selecting the Data replicas and the Data consolidating site comprise a Data Consolidation scheme. We propose and experimentally evaluate a number of DC techniques. Our simulation results strengthen our belief that DC is an important technique for Data grids, since it can substantially improve task delay, network load and other performance-related parameters.

A. Kretsis - One of the best experts on this subject based on the ideXlab platform.

  • Data Consolidation: A task scheduling and Data migration technique for grid networks
    Proceedings CCGRID 2008 - 8th IEEE International Symposium on Cluster Computing and the Grid, 2008
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, A. Kretsis
    Abstract:

    In this work we examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task needs two or more pieces of Data for its execution, possibly scattered throughout the grid network. In such a case, the scheduler and the Data manager must select the Data replicas to be used and the site where these will be gathered for the task to be executed. The policies for selecting the Data replicas and the Data consolidating site comprise a Data Consolidation scheme. We propose and experimentally evaluate a number of DC techniques. Our simulation results strengthen our belief that DC is an important technique for Data grids, since it can substantially improve task delay, network load and other performance-related parameters.

  • CCGRID - Data Consolidation: A Task Scheduling and Data Migration Technique for Grid Networks
    2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), 2008
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, A. Kretsis, Emmanouel Varvarigos
    Abstract:

    In this work we examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task needs two or more pieces of Data for its execution, possibly scattered throughout the grid network. In such a case, the scheduler and the Data manager must select the Data replicas to be used and the site where these will be gathered for the task to be executed. The policies for selecting the Data replicas and the Data consolidating site comprise a Data Consolidation scheme. We propose and experimentally evaluate a number of DC techniques. Our simulation results strengthen our belief that DC is an important technique for Data grids, since it can substantially improve task delay, network load and other performance-related parameters.

Kostas Christodoulopoulos - One of the best experts on this subject based on the ideXlab platform.

  • Efficient Data Consolidation in grid networks and performance analysis
    Future Generation Computer Systems, 2011
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, Emmanouel Varvarigos
    Abstract:

    We examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task concurrently requests multiple pieces of Data, possibly scattered throughout the grid network, that have to be present at a selected site before the task's execution starts. In such a case, the scheduler and the Data manager must select (i) the Data replicas to be used, (ii) the site where these Data will be gathered for the task to be executed, and (iii) the routing paths to be followed, assuming that the selected Datasets are transferred concurrently to the execution site. The algorithms or policies for selecting the Data replicas, the Data consolidating site and the corresponding paths comprise a Data Consolidation scheme. We propose and experimentally evaluate several DC schemes with a polynomial number of operations that attempt to estimate the cost of the concurrent Data transfers, to avoid the congestion that these transfers may cause and to provide fault tolerance. Our simulation results strengthen our belief that DC is an important problem that needs to be addressed in the design of Data grids, and can lead, if performed efficiently, to significant benefits in terms of task delay, network load and other performance parameters.

  • Data Consolidation: A task scheduling and Data migration technique for grid networks
    Proceedings CCGRID 2008 - 8th IEEE International Symposium on Cluster Computing and the Grid, 2008
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, A. Kretsis
    Abstract:

    In this work we examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task needs two or more pieces of Data for its execution, possibly scattered throughout the grid network. In such a case, the scheduler and the Data manager must select the Data replicas to be used and the site where these will be gathered for the task to be executed. The policies for selecting the Data replicas and the Data consolidating site comprise a Data Consolidation scheme. We propose and experimentally evaluate a number of DC techniques. Our simulation results strengthen our belief that DC is an important technique for Data grids, since it can substantially improve task delay, network load and other performance-related parameters.

  • CCGRID - Data Consolidation: A Task Scheduling and Data Migration Technique for Grid Networks
    2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID), 2008
    Co-Authors: Panagiotis Kokkinos, Kostas Christodoulopoulos, A. Kretsis, Emmanouel Varvarigos
    Abstract:

    In this work we examine a task scheduling and Data migration problem for grid networks, which we refer to as the Data Consolidation (DC) problem. DC arises when a task needs two or more pieces of Data for its execution, possibly scattered throughout the grid network. In such a case, the scheduler and the Data manager must select the Data replicas to be used and the site where these will be gathered for the task to be executed. The policies for selecting the Data replicas and the Data consolidating site comprise a Data Consolidation scheme. We propose and experimentally evaluate a number of DC techniques. Our simulation results strengthen our belief that DC is an important technique for Data grids, since it can substantially improve task delay, network load and other performance-related parameters.

Alexander V Bogdanov - One of the best experts on this subject based on the ideXlab platform.

  • ICCSA (4) - Storage Database System in the Cloud Data Processing on the Base of Consolidation Technology
    Computational Science and Its Applications -- ICCSA 2015, 2015
    Co-Authors: Alexander V Bogdanov, Thurein Kyaw Lwin, Elena N. Stankova
    Abstract:

    In this article we study the types of architectures used for cloud processing and storage of Data, Data Consolidation and enterprise storage. Special attention is given to the use of large Data sets in the computational process. Based on theoretical analysis and experimental study of computer system architectures (including heterogeneous ones), special Data-processing techniques, models of the relevant architectures for large volumes of information, and methods for optimizing software for heterogeneous systems, we show that it is possible to integrate computer systems so as to support computations on very large Data sets.

  • constructing virtual private supercomputer using virtualization and cloud technologies
    International Conference on Computational Science and Its Applications, 2014
    Co-Authors: Ivan Gankevich, Vladimir Korkhov, Serob Balyan, Vladimir Gaiduchok, Dmitry Gushchanskiy, Yuri Tipikin, Alexander B Degtyarev, Alexander V Bogdanov
    Abstract:

    One of the most efficient ways to conduct experiments on HPC platforms is to create custom virtual computing environments tailored to the requirements of users and their applications. In this paper we investigate the virtual private supercomputer, an approach based on virtualization, Data Consolidation, and cloud technologies. Virtualization is used to abstract applications from the underlying hardware and operating system, while Data Consolidation is applied to store Data in a distributed storage system. Both the virtualization and Data Consolidation layers offer APIs for distributed computations and Data processing. Combined, these APIs shift the focus from supercomputing technologies to the problems being solved. Based on these concepts, we propose an approach that constructs virtual clusters with the help of cloud computing technologies, to be used as on-demand private supercomputers, and we evaluate the performance of this solution.

  • ICCSA (6) - Constructing Virtual Private Supercomputer Using Virtualization and Cloud Technologies
    Computational Science and Its Applications – ICCSA 2014, 2014
    Co-Authors: Ivan Gankevich, Vladimir Korkhov, Serob Balyan, Vladimir Gaiduchok, Dmitry Gushchanskiy, Yuri Tipikin, Alexander B Degtyarev, Alexander V Bogdanov
    Abstract:

    One of the most efficient ways to conduct experiments on HPC platforms is to create custom virtual computing environments tailored to the requirements of users and their applications. In this paper we investigate the virtual private supercomputer, an approach based on virtualization, Data Consolidation, and cloud technologies. Virtualization is used to abstract applications from the underlying hardware and operating system, while Data Consolidation is applied to store Data in a distributed storage system. Both the virtualization and Data Consolidation layers offer APIs for distributed computations and Data processing. Combined, these APIs shift the focus from supercomputing technologies to the problems being solved. Based on these concepts, we propose an approach that constructs virtual clusters with the help of cloud computing technologies, to be used as on-demand private supercomputers, and we evaluate the performance of this solution.