The experts below are selected from a list of 258 experts worldwide, ranked by the ideXlab platform.
Wenmei W Hwu - One of the best experts on this subject based on the ideXlab platform.
-
ASPLOS - FlatFlash: Exploiting the Byte-Accessibility of SSDs within a Unified Memory-Storage Hierarchy
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019. Co-Authors: Ahmed Abulila, Vikram Sharma Mailthody, Zaid Qureshi, Jian Huang, Nam Sung Kim, Jinjun Xiong, Wenmei W Hwu. Abstract: Using flash-based solid state drives (SSDs) as main memory has been proposed as a practical solution towards scaling memory capacity for data-intensive applications. However, almost all existing approaches rely on the paging mechanism to move data between SSDs and host DRAM, which inevitably incurs significant performance overhead and extra I/O traffic. Thanks to the byte-addressability supported by the PCIe interconnect and the internal memory in SSD controllers, it is feasible to access SSDs at both byte and block granularity today. Exploiting the benefits of SSD byte-accessibility in today's memory-storage hierarchy is, however, challenging, because systems support and program abstractions for it are lacking. In this paper, we present FlatFlash, an optimized unified memory-storage hierarchy that efficiently uses a byte-addressable SSD as part of main memory. We extend virtual memory management to provide a unified memory interface so that programs can seamlessly access data across the SSD and DRAM at byte granularity. We propose a lightweight, adaptive page promotion mechanism between the SSD and DRAM to gain the benefits of both the large byte-addressable SSD and fast DRAM concurrently and transparently, while avoiding unnecessary page movements. Furthermore, we propose an abstraction of byte-granular data persistence to exploit the persistent nature of SSDs, upon which we rethink the crash-consistency design primitives of several representative software systems that require data persistence, such as file systems and databases. Our evaluation with a variety of applications demonstrates that, compared to current unified memory-storage systems, FlatFlash improves the performance of memory-intensive applications by up to 2.3x, reduces the tail latency of latency-critical applications by up to 2.8x, scales the throughput of transactional databases by up to 3.0x, and decreases the metadata persistence overhead of file systems by up to 18.9x. FlatFlash also improves cost-effectiveness by up to 3.8x compared to DRAM-only systems, while significantly enhancing SSD lifetime.
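To make the page-promotion idea concrete, below is a minimal Python sketch of an access-counting promotion policy. FlatFlash itself is an OS-level system implemented in the memory-management stack; the class name, threshold, and DRAM-capacity figure here are illustrative assumptions, not the paper's actual mechanism.

    # Illustrative sketch (not FlatFlash's code): a simple access-counting policy
    # that decides when to promote a page from a byte-addressable SSD to DRAM.
    # Threshold and capacity values are invented for this example.
    from collections import defaultdict

    class PromotionPolicy:
        def __init__(self, threshold=8, dram_capacity_pages=4):
            self.threshold = threshold              # accesses before promotion
            self.dram_capacity = dram_capacity_pages
            self.access_counts = defaultdict(int)   # per-page access counters
            self.in_dram = set()                    # pages currently held in DRAM

        def on_access(self, page_id):
            """Called on every load/store; returns where the access is served."""
            if page_id in self.in_dram:
                return "DRAM"
            # Serve the access from the SSD over PCIe at byte granularity, so no
            # page fault or full-page transfer is needed just to touch a few bytes.
            self.access_counts[page_id] += 1
            if (self.access_counts[page_id] >= self.threshold
                    and len(self.in_dram) < self.dram_capacity):
                self.in_dram.add(page_id)           # hot page: promote to DRAM
                return "SSD -> DRAM promotion"
            return "SSD (byte access)"

    policy = PromotionPolicy()
    for _ in range(10):
        print(policy.on_access(page_id=42))

The sketch only captures the intuition that cold pages stay on the SSD and are accessed in place, while pages that prove hot are promoted so later accesses hit DRAM.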
Ahmed Abulila - One of the best experts on this subject based on the ideXlab platform.
-
ASPLOS - FlatFlash: Exploiting the Byte-Accessibility of SSDs within a Unified Memory-Storage Hierarchy
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019. Co-Authors: Ahmed Abulila, Vikram Sharma Mailthody, Zaid Qureshi, Jian Huang, Nam Sung Kim, Jinjun Xiong, Wenmei W Hwu. Abstract: Using flash-based solid state drives (SSDs) as main memory has been proposed as a practical solution towards scaling memory capacity for data-intensive applications. However, almost all existing approaches rely on the paging mechanism to move data between SSDs and host DRAM, which inevitably incurs significant performance overhead and extra I/O traffic. Thanks to the byte-addressability supported by the PCIe interconnect and the internal memory in SSD controllers, it is feasible to access SSDs at both byte and block granularity today. Exploiting the benefits of SSD byte-accessibility in today's memory-storage hierarchy is, however, challenging, because systems support and program abstractions for it are lacking. In this paper, we present FlatFlash, an optimized unified memory-storage hierarchy that efficiently uses a byte-addressable SSD as part of main memory. We extend virtual memory management to provide a unified memory interface so that programs can seamlessly access data across the SSD and DRAM at byte granularity. We propose a lightweight, adaptive page promotion mechanism between the SSD and DRAM to gain the benefits of both the large byte-addressable SSD and fast DRAM concurrently and transparently, while avoiding unnecessary page movements. Furthermore, we propose an abstraction of byte-granular data persistence to exploit the persistent nature of SSDs, upon which we rethink the crash-consistency design primitives of several representative software systems that require data persistence, such as file systems and databases. Our evaluation with a variety of applications demonstrates that, compared to current unified memory-storage systems, FlatFlash improves the performance of memory-intensive applications by up to 2.3x, reduces the tail latency of latency-critical applications by up to 2.8x, scales the throughput of transactional databases by up to 3.0x, and decreases the metadata persistence overhead of file systems by up to 18.9x. FlatFlash also improves cost-effectiveness by up to 3.8x compared to DRAM-only systems, while significantly enhancing SSD lifetime.
Suren Byna - One of the best experts on this subject based on the ideXlab platform.
-
HiPC - Analysis in the Data Path of an Object-Centric Data Management System
2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC), 2019. Co-Authors: Richard Warren, Suren Byna, Jerome Soumagne, Houjun Tang, Bin Dong, Quincey Koziol. Abstract: Emerging high performance computing (HPC) systems are expected to be deployed with an unprecedented level of complexity due to a deep system memory and storage hierarchy. Efficient and scalable methods of data management and movement through the multi-level storage hierarchy of upcoming HPC systems will be critical for scientific applications at exascale. In this paper, we propose in-locus analysis, which allows registering user-defined functions (UDFs) and running those functions automatically while data is moving between levels of a storage hierarchy. We implement this analysis-in-the-data-path approach in our object-centric data management system, Proactive Data Containers (PDC). The transparent invocation of analysis functions as part of PDC object mapping is an optimized approach to minimizing the latency of accessing data as it moves within the storage hierarchy. Because a user-defined analysis or transform function is invoked automatically by the PDC runtime, the user simply registers their functions with PDC, identifying the function name as well as the required list of actual parameters. To demonstrate the validity and flexibility of this analysis approach, we have implemented several scientific analysis kernels and compared them against other HPC analysis-oriented approaches.
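The following Python sketch illustrates the registration idea described above: a user-defined function is attached to a named object and invoked automatically when that object's data moves between tiers. PDC's real interface is a C library, and register_analysis, move_object, and the tier names below are invented for this illustration.

    # Conceptual sketch only: a registry of user-defined analysis functions that
    # a runtime invokes while data moves through the storage hierarchy.
    _registry = {}   # object name -> list of user-defined analysis functions

    def register_analysis(obj_name, func):
        """User registers a function; the runtime calls it automatically."""
        _registry.setdefault(obj_name, []).append(func)

    def move_object(obj_name, data, src_tier, dst_tier):
        """Simulated data movement between tiers with in-path analysis."""
        for func in _registry.get(obj_name, []):
            data = func(data)                   # analysis runs in the data path
        print(f"{obj_name}: {src_tier} -> {dst_tier}, {len(data)} elements")
        return data

    # Example: report the mean of a particle field as it drains to disk.
    def mean_filter(values):
        avg = sum(values) / len(values)
        print(f"in-path analysis: mean = {avg:.2f}")
        return values

    register_analysis("particle_energy", mean_filter)
    move_object("particle_energy", [1.0, 2.0, 3.0, 4.0],
                "burst buffer", "parallel file system")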
-
IC2E - Toward Transparent Data Management in Multi-Layer Storage Hierarchy of HPC Systems
2018 IEEE International Conference on Cloud Engineering (IC2E), 2018. Co-Authors: Bharti Wadhwa, Suren Byna, Ali R Butt. Abstract: Upcoming exascale high performance computing (HPC) systems are expected to comprise a multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk- and block-based interfaces and file systems face severe challenges in utilizing the capabilities of storage hierarchies due to the lack of hierarchy support and semantic interfaces. Object-based and semantically rich data abstractions for scientific data management on large-scale systems offer a sustainable solution to these challenges. Such data abstractions can also simplify users' involvement in data movement. In this paper, we take the first steps toward realizing such an object abstraction and explore storage mechanisms for these objects to enhance I/O performance, especially for scientific applications. We explore how an object-based interface can facilitate next-generation scalable computing systems by presenting the mapping of data I/O from two real-world HPC scientific use cases: a plasma physics simulation code (VPIC) and a cosmology simulation code (HACC). Our storage model stores data objects in different physical organizations to support data movement across layers of the memory/storage hierarchy. Our implementation scales well to 16K parallel processes, and compared to the state of the art, such as MPI-IO and HDF5, our object-based data abstractions and data placement strategy in a multi-level storage hierarchy achieve up to 7× I/O performance improvement for scientific data.
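As a rough illustration of placing objects in different layers of a storage hierarchy, the following Python sketch picks a tier from simple heuristics. The tier names, size threshold, and write-frequency rule are assumptions made for this example, not the placement strategy evaluated in the paper.

    # Hypothetical sketch of object-centric placement: the same logical object
    # can be mapped to different tiers (and physical organizations) depending on
    # its size and how often it is updated. All rules here are illustrative.
    TIERS = ["node-local NVRAM", "burst buffer", "parallel file system"]

    def place_object(name, size_bytes, write_frequency):
        """Pick a storage layer for an object based on simple heuristics."""
        if write_frequency > 100:          # hot, frequently updated object
            tier = TIERS[0]
        elif size_bytes < 64 * 2**20:      # small or medium result object
            tier = TIERS[1]
        else:                              # large, write-once checkpoint data
            tier = TIERS[2]
        print(f"object '{name}' ({size_bytes} B) -> {tier}")
        return tier

    place_object("vpic_particles_t100", 8 * 2**30, write_frequency=1)
    place_object("hacc_halo_catalog", 32 * 2**20, write_frequency=5)
    place_object("analysis_scratch", 4 * 2**20, write_frequency=500)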
-
CCGrid - Toward scalable and asynchronous object-centric data management for HPC
2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2018. Co-Authors: Houjun Tang, Suren Byna, Jerome Soumagne, Bin Dong, Quincey Koziol, Francois Tessier, Teng Wang, Venkatram Vishwanath, Jialin Liu. Abstract: Emerging high performance computing (HPC) systems are expected to be deployed with an unprecedented level of complexity due to a deep system memory and storage hierarchy. Efficient and scalable methods of data management and movement through this hierarchy are critical for scientific applications using exascale systems. Moving toward new paradigms for scalable I/O in the extreme-scale era, we introduce novel object-centric data abstractions and storage mechanisms that take advantage of the deep storage hierarchy, named Proactive Data Containers (PDC). In this paper, we formulate object-centric PDCs and their mappings to different levels of the storage hierarchy. PDC adopts a client-server architecture with a set of servers managing data movement across storage layers. To demonstrate the effectiveness of the proposed PDC system, we measured the performance of benchmarks and I/O kernels from scientific simulation and analysis applications using the PDC programming interface, and compared the results with existing highly tuned I/O libraries. Using asynchronous I/O along with data and metadata optimizations, PDC demonstrates up to a 23X speedup over HDF5 and PLFS in writing and reading data from a plasma physics simulation. PDC achieves performance comparable to HDF5 and PLFS in reading and writing the data of a single timestep at small scale, and outperforms them at scales larger than ten thousand cores. In contrast to existing storage systems, PDC offers user-space data management with the flexibility to allocate the number of PDC servers depending on the workload.
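The sketch below illustrates the asynchronous, client-server flavor of object writes described above, using a background thread as a stand-in for a PDC server. PDC's actual API is a C library; async_put, pdc_server, and the completion-event mechanism are illustrative assumptions only.

    # Minimal sketch of asynchronous object writes: the client hands data to a
    # server and keeps computing, waiting only when the result must be durable.
    import threading, time, queue

    request_queue = queue.Queue()

    def pdc_server():
        """Stand-in for a server draining write requests to a storage tier."""
        while True:
            obj_name, data, done = request_queue.get()
            if obj_name is None:           # shutdown sentinel
                break
            time.sleep(0.1)                # pretend to write to storage
            print(f"server: persisted object '{obj_name}' ({len(data)} bytes)")
            done.set()                     # signal completion to the client

    def async_put(obj_name, data):
        """Client returns immediately; completion is signaled via an event."""
        done = threading.Event()
        request_queue.put((obj_name, data, done))
        return done

    server = threading.Thread(target=pdc_server, daemon=True)
    server.start()

    handle = async_put("timestep_0042", b"x" * 1024)
    print("client: overlapping computation with I/O ...")
    handle.wait()                          # block only when the data is needed durable
    request_queue.put((None, None, None))  # stop the server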
Ali R Butt - One of the best experts on this subject based on the ideXlab platform.
-
IC2E - Toward Transparent Data Management in Multi-Layer Storage Hierarchy of HPC Systems
2018 IEEE International Conference on Cloud Engineering (IC2E), 2018. Co-Authors: Bharti Wadhwa, Suren Byna, Ali R Butt. Abstract: Upcoming exascale high performance computing (HPC) systems are expected to comprise a multi-tier storage hierarchy, and thus will necessitate innovative storage and I/O mechanisms. Traditional disk- and block-based interfaces and file systems face severe challenges in utilizing the capabilities of storage hierarchies due to the lack of hierarchy support and semantic interfaces. Object-based and semantically rich data abstractions for scientific data management on large-scale systems offer a sustainable solution to these challenges. Such data abstractions can also simplify users' involvement in data movement. In this paper, we take the first steps toward realizing such an object abstraction and explore storage mechanisms for these objects to enhance I/O performance, especially for scientific applications. We explore how an object-based interface can facilitate next-generation scalable computing systems by presenting the mapping of data I/O from two real-world HPC scientific use cases: a plasma physics simulation code (VPIC) and a cosmology simulation code (HACC). Our storage model stores data objects in different physical organizations to support data movement across layers of the memory/storage hierarchy. Our implementation scales well to 16K parallel processes, and compared to the state of the art, such as MPI-IO and HDF5, our object-based data abstractions and data placement strategy in a multi-level storage hierarchy achieve up to 7× I/O performance improvement for scientific data.
Yanbin Liu - One of the best experts on this subject based on the ideXlab platform.
-
ICWS - Effectiveness Assessment of Solid-State Drive Used in Big Data Services
2014 IEEE International Conference on Web Services (ICWS), 2014. Co-Authors: Wei Tan, Liana Fong, Yanbin Liu. Abstract: Big data poses challenges to the technologies required to process data of high volume, velocity, variety, and veracity. Among these challenges, the storage and computing required by big data analytics are usually huge, and as a result big data capabilities are often provisioned in the cloud and delivered in the form of Web-based services. The solid-state drive (SSD) is widely used nowadays as an elementary hardware feature in cloud infrastructure for big data services. For example, Amazon Web Services (AWS) offers EC2 instances with SSD storage, and its key-value data store, DynamoDB, is backed by SSD for superior performance. Compared to the hard disk drive (HDD), the SSD prevails in both access latency and bandwidth. In the foreseeable future, SSDs will be readily available on commodity servers, though their capacity will be neither large enough nor cost-effective enough to accommodate big data on their own. Therefore, it is essential to investigate how to efficiently leverage the SSD as one layer in a storage hierarchy alongside the HDD. In this paper, we investigate the effectiveness of using SSDs in three workloads, namely standalone Hadoop MapReduce jobs, Hive jobs, and HBase queries. First, we devise an approach that gives the Hadoop Distributed File System (HDFS) an SSD-HDD storage hierarchy. Second, we investigate the I/O involved in different phases of Hadoop jobs and design different schemes to place data selectively in this storage hierarchy. The effectiveness of the different schemes is then evaluated with respect to job run time. Finally, we summarize best practices of data placement for the examined workloads in an SSD-HDD storage hierarchy.
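A minimal Python sketch of a phase-aware placement rule in an SSD-HDD hierarchy follows. The data kinds use common Hadoop terminology, but the specific rules and the free-space cutoff are assumptions for illustration, not the schemes evaluated in the paper.

    # Simplified sketch of phase-aware data placement in an SSD-HDD hierarchy.
    # The placement rules below are illustrative, not the paper's schemes.
    def choose_device(data_kind, ssd_free_gb):
        """Map a kind of MapReduce data to SSD or HDD."""
        # Intermediate spill/shuffle data is short-lived and latency sensitive,
        # so it benefits most from the SSD when space allows.
        if data_kind in ("map_spill", "shuffle") and ssd_free_gb > 10:
            return "SSD"
        # Large, sequentially streamed input splits and final output go to HDD,
        # whose bandwidth is adequate for that access pattern.
        return "HDD"

    for kind in ("input_split", "map_spill", "shuffle", "reduce_output"):
        print(f"{kind:>14} -> {choose_device(kind, ssd_free_gb=50)}")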