The experts below are selected from a list of 13068 experts worldwide ranked by the ideXlab platform.
Youngjae Kim - One of the best experts on this subject based on the ideXlab platform.
-
An Integrated Indexing and Search Service for Distributed File Systems
IEEE Transactions on Parallel and Distributed Systems, 2020
Co-Authors: Hyogi Sim, Awais Khan, Sudharshan S. Vazhkudai, Seung-hwan Lim, Ali R. Butt, Youngjae Kim
Abstract: Data services such as search, discovery, and management in scalable distributed environments have traditionally been decoupled from the underlying file systems, and are often deployed using external databases and indexing services. However, modern data production rates, looming data movement costs, and the lack of metadata, entail revisiting the decoupled file system-data services design philosophy. In this article, we present TagIt, a scalable data management service framework aimed at scientific datasets, which can be integrated into prevalent distributed file system architectures. A key feature of TagIt is a scalable, distributed metadata indexing framework, which facilitates a flexible tagging capability to support data discovery. Furthermore, the tags can also be associated with an active operator, for pre-processing, filtering, or automatic metadata extraction, which we seamlessly offload to file servers in a load-aware fashion. We have integrated TagIt into two popular distributed file systems, i.e., GlusterFS and CephFS. Our evaluation demonstrates that TagIt can expedite data search operation by up to 10× over the extant decoupled approach.
-
SC - TagIt: an integrated indexing and search service for file systems
Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis, 2017
Co-Authors: Hyogi Sim, Sudharshan S. Vazhkudai, Seung-hwan Lim, Youngjae Kim, Geoffroy Vallée, Ali R. Butt
Abstract: Data services such as search, discovery, and management in scalable distributed environments have traditionally been decoupled from the underlying file systems, and are often deployed using external databases and indexing services. However, modern data production rates, looming data movement costs, and the lack of metadata, entail revisiting the decoupled file system-data services design philosophy. In this paper, we present TagIt, a scalable data management service framework aimed at scientific datasets, which is tightly integrated into a shared-nothing distributed file system. A key feature of TagIt is a scalable, distributed metadata indexing framework, using which we implement a flexible tagging capability to support data discovery. The tags can also be associated with an active operator, for pre-processing, filtering, or automatic metadata extraction, which we seamlessly offload to file servers in a load-aware fashion. Our evaluation shows that TagIt can expedite data search by up to 10× over the extant decoupled approach.
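The tagging and search idea described in the TagIt abstracts above can be illustrated with a small sketch. It assumes that each file server keeps an inverted index only for the files it stores (a shared-nothing layout) and that a search is sent to every server and the results merged. The class names, the hash-based placement, and the `(key, value)` tag format are all hypothetical and chosen for illustration; this is not TagIt's actual design or API.

```python
from collections import defaultdict
import zlib


class ShardIndex:
    """Tag index for the files held by one (hypothetical) file server."""

    def __init__(self):
        # (tag key, tag value) -> set of file paths stored on this server
        self._by_tag = defaultdict(set)

    def tag(self, path, key, value):
        self._by_tag[(key, value)].add(path)

    def find(self, key, value):
        # .get avoids creating empty entries for tags that were never set
        return set(self._by_tag.get((key, value), ()))


class TaggedNamespace:
    """Routes tag updates to one shard and fans searches out to all shards."""

    def __init__(self, num_servers):
        self.shards = [ShardIndex() for _ in range(num_servers)]

    def _shard_for(self, path):
        # Assumption: files are placed by a stable hash of their path.
        return self.shards[zlib.crc32(path.encode()) % len(self.shards)]

    def tag(self, path, key, value):
        self._shard_for(path).tag(path, key, value)

    def search(self, key, value):
        hits = set()
        for shard in self.shards:  # each server answers from its local index
            hits |= shard.find(key, value)
        return sorted(hits)
```

For example, after `ns = TaggedNamespace(4)` and a few `ns.tag(path, "variable", "temperature")` calls, `ns.search("variable", "temperature")` returns the tagged paths without consulting any index outside the file servers themselves.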
Ali R. Butt - One of the best experts on this subject based on the ideXlab platform.
-
An Integrated Indexing and Search Service for Distributed File Systems
IEEE Transactions on Parallel and Distributed Systems, 2020
Co-Authors: Hyogi Sim, Awais Khan, Sudharshan S. Vazhkudai, Seung-hwan Lim, Ali R. Butt, Youngjae Kim
-
SC - TagIt: an integrated indexing and search service for file systems
Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis, 2017
Co-Authors: Hyogi Sim, Sudharshan S. Vazhkudai, Seung-hwan Lim, Youngjae Kim, Geoffroy Vallée, Ali R. Butt
Ajay Mohindra - One of the best experts on this subject based on the ideXlab platform.
-
Server recovery using naturally replicated state: a case study
International Conference on Distributed Computing Systems, 1995
Co-Authors: Murthy V Devarakonda, B Kish, Ajay Mohindra
Abstract: This paper describes design and preliminary measurements of a file server recovery scheme that uses naturally replicated state among clients. This scheme, implemented in the Calypso file system, is truly transparent to the user and avoids the overhead of explicit replication. A three-phase protocol reconstructs the server state either on a backup node (if disks are multi-ported) or on the rebooted server node. Measurements show that the recovery time is about 21 seconds for a busy 10-node cluster. However, the time to rebuild the distributed state is only about 1.5 seconds, and most of the recovery time is spent in replaying the write-ahead log of the underlying file system. Fortunately, the log redo time is bounded by the log size.
-
ICDCS - Server recovery using naturally replicated state: a case study
Proceedings of 15th International Conference on Distributed Computing Systems
Co-Authors: Murthy V Devarakonda, B Kish, Ajay Mohindra
Hyogi Sim - One of the best experts on this subject based on the ideXlab platform.
-
An Integrated Indexing and Search Service for Distributed File Systems
IEEE Transactions on Parallel and Distributed Systems, 2020
Co-Authors: Hyogi Sim, Awais Khan, Sudharshan S. Vazhkudai, Seung-hwan Lim, Ali R. Butt, Youngjae Kim
-
SC - TagIt: an integrated indexing and search service for file systems
Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis, 2017
Co-Authors: Hyogi Sim, Sudharshan S. Vazhkudai, Seung-hwan Lim, Youngjae Kim, Geoffroy Vallée, Ali R. Butt
Robert Ross - One of the best experts on this subject based on the ideXlab platform.
-
Optimizing I/O forwarding techniques for extreme-scale event tracing
Cluster Computing, 2014
Co-Authors: Thomas Ilsche, Robert Ross, Joseph Schuchart, Jason Cope, Dries Kimpe, Terry Jones, Andreas Knüpfer, Kamil Iskra, Wolfgang E. Nagel, Stephen Poole
Abstract: Programming development tools are a vital component for understanding the behavior of parallel applications. Event tracing is a principal ingredient to these tools, but new and serious challenges place event tracing at risk on extreme-scale machines. As the quantity of captured events increases with concurrency, the additional data can overload the parallel file system and perturb the application being observed. In this work we present a solution for event tracing on extreme-scale machines. We enhance an I/O forwarding software layer to aggregate and reorganize log data prior to writing to the storage system, significantly reducing the burden on the underlying file system. Furthermore, we introduce a sophisticated write buffering capability to limit the impact. To validate the approach, we employ the Vampir tracing toolset using these new capabilities. Our results demonstrate that the approach increases the maximum traced application size by a factor of 5× to more than 200,000 processes.
-
SC - Characterization and modeling of PIDX parallel I/O for performance optimization
Proceedings of the International Conference for High Performance Computing Networking Storage and Analysis on - SC '13, 2013
Co-Authors: Sidharth Kumar, Robert Latham, Avishek Saha, Venkatram Vishwanath, Philip Carns, John A. Schmidt, Giorgio Scorzelli, Hemanth Kolla, Ray W. Grout, Robert Ross
Abstract: Parallel I/O library performance can vary greatly in response to user-tunable parameter values such as aggregator count, file count, and aggregation strategy. Unfortunately, manual selection of these values is time consuming and dependent on characteristics of the target machine, the underlying file system, and the dataset itself. Some characteristics, such as the amount of memory per core, can also impose hard constraints on the range of viable parameter values. In this work we address these problems by using machine learning techniques to model the performance of the PIDX parallel I/O library and select appropriate tunable parameter values. We characterize both the network and I/O phases of PIDX on a Cray XE6 as well as an IBM Blue Gene/P system. We use the results of this study to develop a machine learning model for parameter space exploration and performance prediction.
-
HPDC - Enabling event tracing at leadership-class scale through I/O forwarding middleware
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing - HPDC '12, 2012
Co-Authors: Thomas Ilsche, Robert Ross, Joseph Schuchart, Jason Cope, Dries Kimpe, Terry Jones, Andreas Knüpfer, Kamil Iskra, Wolfgang E. Nagel, Stephen W. Poole
Abstract: Event tracing is an important tool for understanding the performance of parallel applications. As concurrency increases in leadership-class computing systems, the quantity of performance log data can overload the parallel file system, perturbing the application being observed. In this work we present a solution for event tracing at leadership scales. We enhance the I/O forwarding system software to aggregate and reorganize log data prior to writing to the storage system, significantly reducing the burden on the underlying file system for this type of traffic. Furthermore, we augment the I/O forwarding system with a write buffering capability to limit the impact of artificial perturbations from log data accesses on traced applications. To validate the approach, we modify the Vampir tracing toolset to take advantage of this new capability and show that the approach increases the maximum traced application size by a factor of 5× to more than 200,000 processes.
-
On the duality of data-intensive file system design: reconciling HDFS and PVFS
IEEE International Conference on High Performance Computing Data and Analytics, 2011
Co-Authors: Wittawat Tantisiriroj, Swapnil Patil, Samuel Lang, Garth A Gibson, Robert Ross
Abstract: Data-intensive applications fall into two computing styles: Internet services (cloud computing) or high-performance computing (HPC). In both categories, the underlying file system is a key component for scalable application performance. In this paper, we explore the similarities and differences between PVFS, a parallel file system used in HPC at large scale, and HDFS, the primary storage system used in cloud computing with Hadoop. We integrate PVFS into Hadoop and compare its performance to HDFS using a set of data-intensive computing benchmarks. We study how HDFS-specific optimizations can be matched using PVFS and how consistency, durability, and persistence tradeoffs made by these file systems affect application performance. We show how to embed multiple replicas into a PVFS file, including a mapping with a complete copy local to the writing client, to emulate HDFS's file layout policies. We also highlight implementation issues with HDFS's dependence on disk bandwidth and benefits from pipelined replication.
-
PVM/MPI - Implementing MPI-IO shared file pointers without file system support
Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2005
Co-Authors: Robert Latham, Robert Ross, Rajeev Thakur, Brian Toonen
Abstract: The ROMIO implementation of the MPI-IO standard provides a portable infrastructure for use on top of any number of different underlying storage targets. These targets vary widely in their capabilities, and in some cases additional effort is needed within ROMIO to support all MPI-IO semantics. The MPI-2 standard defines a class of file access routines that use a shared file pointer. These routines require communication internal to the MPI-IO implementation in order to allow processes to atomically update this shared value. We discuss a technique that leverages MPI-2 one-sided operations and can be used to implement this concept without requiring any features from the underlying file system. We then demonstrate through a simulation that our algorithm adds reasonable overhead for independent accesses and very small overhead for collective accesses.
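The shared file pointer semantics mentioned in the last abstract can be illustrated with a small simulation. Each writer atomically reserves a byte range by advancing a shared offset, then writes into its own range without further coordination. This sketch uses Python threads and a `threading.Lock` as a stand-in for the atomic update; it does not reproduce the paper's algorithm, which is built on MPI-2 one-sided operations, and the names `SharedFilePointer`, `write_shared`, and `run_writers` are hypothetical.

```python
import threading


class SharedFilePointer:
    """A single offset shared by all writers, advanced atomically."""

    def __init__(self):
        self._offset = 0
        self._lock = threading.Lock()  # stand-in for an atomic remote update

    def reserve(self, nbytes):
        # Fetch the current offset and advance it in one atomic step.
        with self._lock:
            start = self._offset
            self._offset += nbytes
        return start


def write_shared(buf, pointer, data):
    """Reserve a range, then write into it; ranges never overlap."""
    start = pointer.reserve(len(data))
    buf[start:start + len(data)] = data
    return start


def run_writers(records):
    """Write each record from its own thread into one shared buffer."""
    buf = bytearray(sum(len(r) for r in records))
    pointer = SharedFilePointer()
    threads = [
        threading.Thread(target=write_shared, args=(buf, pointer, r))
        for r in records
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bytes(buf)
```

The order in which records land depends on thread scheduling, but every record occupies one contiguous range and no bytes are lost or overwritten, which is the property a shared file pointer is meant to provide.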