Parallel File System

The Experts below are selected from a list of 8763 Experts worldwide ranked by ideXlab platform

Alok Choudhary - One of the best experts on this subject based on the ideXlab platform.

  • A high-performance distributed Parallel File System for data-intensive computations
    Journal of Parallel and Distributed Computing, 2004
    Co-Authors: X. Shen, Alok Choudhary
    Abstract:

    One of the challenges brought by large-scale scientific applications is how to avoid remote storage access by collectively using sufficient local storage resources to hold the huge amounts of data generated by the simulation while providing high-performance I/O. DPFS, a distributed parallel file system, is designed and implemented to address this problem. DPFS collects locally distributed, unused storage resources as a supplement to the internal storage of parallel computing systems to satisfy the storage capacity requirements of large-scale applications. In addition, like other parallel file systems, DPFS provides striping mechanisms that divide a file into small pieces and distribute them across multiple storage devices for parallel data access. The unique feature of DPFS is that it provides three file levels, each corresponding to a file striping method. In addition to the traditional linear striping method, DPFS also provides a novel multidimensional striping method that can solve performance problems of linear striping for many popular access patterns. Other issues such as load balancing and user interface are also addressed in DPFS.

  • ICPP - DPFS: a distributed Parallel File System
    International Conference on Parallel Processing 2001., 2001
    Co-Authors: Xiaohui Shen, Alok Choudhary
    Abstract:

    One of the challenges brought by large-scale scientific applications is how to avoid remote storage access by collectively using enough local storage resources to hold the huge amounts of data generated by the simulation while providing high-performance I/O. DPFS, a distributed parallel file system, is designed and implemented to address this problem. DPFS collects locally distributed, unused storage resources as a supplement to the internal storage of parallel computing systems to satisfy the storage capacity requirements of large-scale applications. In addition, like other parallel file systems, DPFS provides striping mechanisms that divide a file into small pieces and distribute them across multiple storage devices for parallel data access. The unique feature of DPFS is that it provides three file levels, each corresponding to a file striping method. In addition to the traditional linear striping method, DPFS also provides a novel multidimensional striping method that can solve performance problems of linear striping for many popular access patterns. Other issues such as load balancing and user interface are also addressed in DPFS.

  • Implementation and evaluation of prefetching in the Intel Paragon Parallel File System
    1996
    Co-Authors: Meenakshi Arunachalam, Alok Choudhary, Brad Rullman
    Abstract:

    The significant difference between the speeds of the I/O system (e.g., disks) and the compute processors in parallel systems creates a bottleneck that lowers the performance of applications that perform a considerable number of disk accesses. A major portion of the compute processors' time is wasted waiting for I/O to complete. This problem can be addressed to a certain extent if the necessary data can be fetched from the disk before the I/O call to the disk is issued. Fetching data ahead of time, known as prefetching, depends a great deal in a multiprocessor environment on the application's access pattern. The subject of this paper is the implementation and performance evaluation of a prefetching prototype in a production parallel file system on the Intel Paragon. Specifically, this paper presents the design and implementation of a prefetching strategy in the parallel file system, and performance measurements and evaluation of the file system with and without prefetching. The prototype is designed at the operating system level for the PFS. It is implemented in the PFS subsystem of the Intel Paragon operating system. It is observed that in many cases prefetching provides considerable performance improvements. In some other cases, no improvement or some performance degradation is observed due to the overheads incurred in prefetching.

  • The design of VIP-FS: a virtual, Parallel File System for high performance Parallel and distributed computing
    ACM SIGOPS Operating Systems Review, 1995
    Co-Authors: Michael Harry, J.m. Del Rosario, Alok Choudhary
    Abstract:

    In the past couple of years, significant progress has been made in the development of message-passing libraries for parallel and distributed computing, and in the area of high-speed networking. Both technologies have evolved to the point where programmers and scientists are now porting many applications previously executed exclusively on parallel machines into distributed programs for execution on more readily available networks of workstations. Such advances in computing technology have also led to a tremendous increase in the amount of data being manipulated and produced by scientific and commercial application programs. Despite their popularity, message-passing libraries provide only part of the support necessary for most high-performance distributed computing applications; support for high-speed parallel I/O is still lacking. In this paper, we provide an overview of the conceptual design of a parallel and distributed I/O file system, the Virtual Parallel File System (VIP-FS), and describe its implementation. VIP-FS makes use of message-passing libraries to provide a parallel and distributed file system that can execute over multiprocessor machines or heterogeneous network environments.

  • ICMCS, Vol. 2 - Design and evaluation of a multimedia integrated Parallel File System
    Proceedings IEEE International Conference on Multimedia Computing and Systems, 1
    Co-Authors: Jesus Carretero, Weiyu Zhu, Xiaohui Shen, Alok Choudhary
    Abstract:

    This paper presents the design of MiPFS, a parallel file system intended to be used as a low-level platform on which to develop more complex I/O entities. This parallel file system relies on the idea that the user should execute I/O operations using the data types stored in each object, so that each file can be managed as a typed object. However, because we want to provide a mechanism and not policies, MiPFS includes functions to manage fixed- and variable-length records and their associated indexes. This approach, together with the quality-of-service functionality, allows MiPFS to be used as a continuous-media parallel file system.
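
The contrast between linear and multidimensional striping drawn in the DPFS abstracts above can be illustrated with a small sketch. The abstracts do not describe the actual DPFS scheme, so the code below makes its own assumptions: a 16 x 16 row-major array, striping units of 16 elements, and square 4 x 4 tiles as a stand-in for a multidimensional unit. All function names are hypothetical. It counts how many distinct striping units a row access and a column access would touch under each layout.

```python
# Illustrative sketch only: square tiles stand in for "multidimensional
# striping"; the DPFS abstracts do not specify the real scheme.

def linear_unit(row, col, n_cols, unit_elems):
    """Striping unit holding element (row, col) of a row-major array."""
    return (row * n_cols + col) // unit_elems


def tiled_unit(row, col, n_cols, tile):
    """Striping unit holding element (row, col) when units are tile x tile blocks."""
    tiles_per_row = (n_cols + tile - 1) // tile
    return (row // tile) * tiles_per_row + col // tile


def units_touched(unit_of, cells):
    """Number of distinct striping units an access pattern touches."""
    return len({unit_of(r, c) for r, c in cells})


N = 16                                   # assumed 16 x 16 array
one_row = [(0, c) for c in range(N)]     # read the first row
one_col = [(r, 0) for r in range(N)]     # read the first column


def linear(r, c):
    return linear_unit(r, c, N, unit_elems=16)


def tiled(r, c):
    return tiled_unit(r, c, N, tile=4)


row_linear = units_touched(linear, one_row)
col_linear = units_touched(linear, one_col)
row_tiled = units_touched(tiled, one_row)
col_tiled = units_touched(tiled, one_col)
print(row_linear, col_linear, row_tiled, col_tiled)
```

Under these assumptions the linear layout serves a row from one unit but scatters a column over sixteen units, while the tiled layout touches four units for either pattern, which is the kind of access-pattern sensitivity the abstracts attribute to linear striping.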

Jeanpierre Prost - One of the best experts on this subject based on the ideXlab platform.

  • Performance of the Vesta Parallel File System
    International Parallel Processing Symposium, 1995
    Co-Authors: Dror G Feitelson, Peter F Corbett, Jeanpierre Prost
    Abstract:

    Vesta is an experimental parallel file system implemented on the IBM SP1. Its main features are support for parallel access from multiple application processes to a file, and the ability to partition and re-partition the file data among these processes. This paper reports on a set of experiments designed to evaluate Vesta's performance. These include basic single-node performance, and performance using parallel access with different file partitioning schemes. The results are that bandwidth scales with the number of I/O nodes accessed, and that orthogonal partitioning schemes achieve essentially the same performance. In many cases performance equals the disk hardware limit. This is often attributed to prefetching and write-behind in the I/O nodes.

  • IPPS - Performance of the Vesta Parallel File System
    Proceedings of 9th International Parallel Processing Symposium, 1
    Co-Authors: Dror G Feitelson, Peter F Corbett, Jeanpierre Prost
    Abstract:

    Vesta is an experimental parallel file system implemented on the IBM SP1. Its main features are support for parallel access from multiple application processes to a file, and the ability to partition and re-partition the file data among these processes. This paper reports on a set of experiments designed to evaluate Vesta's performance. These include basic single-node performance, and performance using parallel access with different file partitioning schemes. The results are that bandwidth scales with the number of I/O nodes accessed, and that orthogonal partitioning schemes achieve essentially the same performance. In many cases performance equals the disk hardware limit. This is often attributed to prefetching and write-behind in the I/O nodes.

Dror G Feitelson - One of the best experts on this subject based on the ideXlab platform.

  • The Vesta Parallel File System
    ACM Transactions on Computer Systems, 1996
    Co-Authors: Peter F Corbett, Dror G Feitelson
    Abstract:

    The Vesta parallel file system is designed to provide parallel file access to application programs running on multicomputers with parallel I/O subsystems. Vesta uses a new abstraction of files: a file is not a sequence of bytes, but rather it can be partitioned into multiple disjoint sequences that are accessed in parallel. The partitioning, which can also be changed dynamically, reduces the need for synchronization and coordination during the access. Some control over the layout of data is also provided, so the layout can be matched with the anticipated access patterns. The system is fully implemented and forms the basis for the AIX Parallel I/O File System on the IBM SP2. The implementation does not compromise scalability or parallelism. In fact, all data accesses are done directly to the I/O node that contains the requested data, without any indirection or access to shared metadata. Disk mapping and caching functions are confined to each I/O node, so there is no need to keep data coherent across nodes. Performance measurements show good scalability with increased resources. Moreover, different access patterns are shown to achieve similar performance.

  • Performance of the Vesta Parallel File System
    International Parallel Processing Symposium, 1995
    Co-Authors: Dror G Feitelson, Peter F Corbett, Jeanpierre Prost
    Abstract:

    Vesta is an experimental parallel file system implemented on the IBM SP1. Its main features are support for parallel access from multiple application processes to a file, and the ability to partition and re-partition the file data among these processes. This paper reports on a set of experiments designed to evaluate Vesta's performance. These include basic single-node performance, and performance using parallel access with different file partitioning schemes. The results are that bandwidth scales with the number of I/O nodes accessed, and that orthogonal partitioning schemes achieve essentially the same performance. In many cases performance equals the disk hardware limit. This is often attributed to prefetching and write-behind in the I/O nodes.

  • Design and implementation of the Vesta Parallel File System
    IEEE International Conference on High Performance Computing Data and Analytics, 1994
    Co-Authors: Peter F Corbett, Dror G Feitelson
    Abstract:

    The Vesta parallel file system is designed to provide parallel file access to application programs running on multicomputers with parallel I/O subsystems. Vesta uses a new abstraction of files: a file is not a sequence of bytes, but rather it can be partitioned into multiple disjoint sequences that are accessed in parallel. The partitioning, which can also be changed dynamically, reduces the need for synchronization and coordination during the access. Some control over the layout of data is also provided, so the layout can be matched with the anticipated access patterns. The system is fully implemented, and is beginning to be used by application programmers. The implementation does not compromise scalability or parallelism. In fact, all data accesses are done directly to the I/O node that contains the requested data, without any indirection or access to shared metadata. There are no centralized control points in the system.

  • Overview of the Vesta Parallel File System
    ACM SIGARCH Computer Architecture News, 1993
    Co-Authors: Peter F Corbett, Sandra Johnson Baylor, Dror G Feitelson
    Abstract:

    The Vesta parallel file system provides parallel access from compute nodes to files distributed across I/O nodes in a massively parallel computer. Vesta is intended to solve the I/O problems of massively parallel computers executing numerically intensive scientific applications. Vesta has three interesting characteristics. First, it provides a user-defined parallel view of file data, and allows user-defined partitioning and repartitioning of files without moving data among I/O nodes. The parallel file access semantics of Vesta directly support the operations required by parallel language I/O libraries. Second, Vesta is scalable to a very large number (many hundreds) of I/O and compute nodes and does not contain any sequential bottlenecks in the data-access path. Third, it provides user-directed checkpointing of files during continuing program execution with very little processing overhead.

  • IPPS - Performance of the Vesta Parallel File System
    Proceedings of 9th International Parallel Processing Symposium, 1
    Co-Authors: Dror G Feitelson, Peter F Corbett, Jeanpierre Prost
    Abstract:

    Vesta is an experimental parallel file system implemented on the IBM SP1. Its main features are support for parallel access from multiple application processes to a file, and the ability to partition and re-partition the file data among these processes. This paper reports on a set of experiments designed to evaluate Vesta's performance. These include basic single-node performance, and performance using parallel access with different file partitioning schemes. The results are that bandwidth scales with the number of I/O nodes accessed, and that orthogonal partitioning schemes achieve essentially the same performance. In many cases performance equals the disk hardware limit. This is often attributed to prefetching and write-behind in the I/O nodes.
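
The Vesta abstracts above describe a file as something that can be partitioned into disjoint sequences and repartitioned without moving data. A minimal sketch of that idea follows. It is not Vesta's interface: the round-robin block scheme, the function names `view` and `is_disjoint_cover`, and the record counts are assumptions chosen for illustration. Each partition is represented only as a list of record indices, so changing the partitioning changes the index mapping and leaves the stored records where they are.

```python
def view(n_records, n_parts, block, part):
    """Indices of the records that partition `part` sees when blocks of `block`
    records are dealt round-robin to `n_parts` partitions (hypothetical scheme)."""
    return [i for i in range(n_records) if (i // block) % n_parts == part]


def is_disjoint_cover(views, n_records):
    """True if the views are pairwise disjoint and together cover every record."""
    seen = [i for v in views for i in v]
    return sorted(seen) == list(range(n_records))


n = 12
# Two processes, each seeing blocks of three consecutive records.
two_way = [view(n, n_parts=2, block=3, part=p) for p in range(2)]
# Repartition the same 12 records for four processes, one record per block.
four_way = [view(n, n_parts=4, block=1, part=p) for p in range(4)]

print(two_way)
print(four_way)
```

Because every record index belongs to exactly one view in each partitioning, processes working on different views need no coordination over which bytes they own, which is the property the abstracts credit with reducing synchronization.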

Dae-wha Seo - One of the best experts on this subject based on the ideXlab platform.

  • CLUSTER - Table-comparison prefetching in VIA-based Parallel File System
    Proceedings 2001 IEEE International Conference on Cluster Computing, 2001
    Co-Authors: Yoon-young Lee, Chei-yol Kim, Dae-wha Seo
    Abstract:

    A parallel file system is normally used to support the excessive file requests from parallel applications in a cluster system, whereas prefetching is useful for improving file system performance. This paper proposes a table-comparison prefetching policy that is particularly suitable for parallel scientific applications and multimedia web services in a VIA-based parallel file system. VIA relieves the communication overhead of traditional communication protocols, such as TCP/IP. The proposed policy introduces a table-comparison method to predict data for prefetching. In addition, it includes an algorithm to determine whether and when prefetching is performed using the currently available I/O bandwidth. Experimental results confirmed that the use of the proposed prefetching policy in a VIA-based parallel file system produced higher file system performance.

  • CLUSTER - Adaptive Dual-Cache Scheme with Dynamic Prefetching Scheme in Parallel File System
    Cluster Computing, 2000
    Co-Authors: Chei-yol Kim, Jong-hyun Cho, Dae-wha Seo
    Abstract:

    An adaptive Dual-Cache Scheme (DCS) and a dynamic prefetching scheme were designed to improve the performance of the Parallel File System for Linux (PFSL). PFSL is a parallel file system for a clustering environment and is implemented using a multi-threaded programming technique with the POSIX thread libraries supported by Linux. The proposed adaptive DCS and dynamic prefetching scheme both reflect the current file request. The former reduces the first file request response time, whereas the latter improves file system performance by adjusting the amount of data for prefetching.

  • Design and Implementation of a Communication Module of the Parallel Operating File System based on MISIX
    Journal of KIISE:Computing Practices and Letters, 2000
    Co-Authors: Sung-kn Jin, Jong-hyun Cho, Hae-jin Kim, Dae-wha Seo
    Abstract:

    This paper is concerned with the development of a communication module for POFS (Parallel Operating File System), a parallel file system to be operated on the SPAX computer. SPAX is a multiprocessor computer with a clustered SMP architecture that is being developed by ETRI. The operating system for SPAX is MISIX, which is based on the Chorus microkernel. POFS basically has a client/server architecture, so the design of the communication module is important. The communication module is so easily affected by the network environment that a bad design is the major cause of reduced portability and performance in a parallel file system. This paper describes the structure and performance of the communication module of POFS. These issues arose in the course of designing and developing POFS. The communication module of POFS was designed to support the portability and the architecture of the parallel file system.

  • CLUSTER - Adaptive message management using hybrid channel model in Parallel File System
    Proceedings. IEEE International Conference on Cluster Computing, 1
    Co-Authors: Joon-hyung Hwangbo, Sang-ki Lee, Yoon-young Lee, Dae-wha Seo
    Abstract:

    A parallel file system is used to support the excessive file requests that result from a parallel application in a cluster system. It uses traditional communication protocols such as TCP/IP or UDP/IP that were designed for wide area networks (WANs). For a cluster system, however, these protocols are inappropriate because of their large network overhead. To address this problem, we propose a Hybrid Channel Model (HCM) for the inter-cluster communication protocol. In a parallel file system, messages can be classified as control messages and file data blocks. Therefore, we divided the message channel into two parts, a control message channel and a data channel. The former is used for transferring the control messages, while the latter is used for transferring the file data blocks. For the control message channel, TCP/IP is used as the communication protocol, and Virtual Interface Architecture (VIA) is used for data blocks. In tests, the proposed channel model exhibited considerably improved performance.
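
The message classification behind the Hybrid Channel Model in the last abstract above can be sketched as a simple dispatcher. This is an assumption-laden illustration, not the authors' implementation: the `Message` and `HybridChannel` names are hypothetical, and two in-memory queues stand in for the TCP/IP control channel and the VIA data channel, since no real network transport is involved here.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass
class Message:
    kind: str        # "control" or "data"
    payload: bytes


class HybridChannel:
    """Routes each message to one of two channels according to its kind."""

    def __init__(self):
        self.control = Queue()   # stand-in for the TCP/IP control channel
        self.data = Queue()      # stand-in for the VIA data channel

    def send(self, msg):
        if msg.kind == "control":
            self.control.put(msg)
        elif msg.kind == "data":
            self.data.put(msg)
        else:
            raise ValueError(f"unknown message kind: {msg.kind!r}")


ch = HybridChannel()
ch.send(Message("control", b"open /data/a"))   # small request or reply
ch.send(Message("data", b"\x00" * 4096))       # file data block
ch.send(Message("data", b"\x01" * 4096))       # file data block
print(ch.control.qsize(), ch.data.qsize())
```

Keeping the two kinds of traffic apart lets each channel use the transport suited to it: small, infrequent control messages on a general-purpose protocol and bulk data blocks on a low-overhead one.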

Walter F Tichy - One of the best experts on this subject based on the ideXlab platform.

  • Integrating collective I/O and cooperative caching into the ClusterFile Parallel File System
    International Conference on Supercomputing, 2004
    Co-Authors: Florin Isaila, Guido Malpohl, Vlad Olaru, Gabor Szeder, Walter F Tichy
    Abstract:

    This paper presents the integration of two collective I/O techniques into the ClusterFile parallel file system: disk-directed I/O and two-phase I/O. We show that global cooperative cache management improves the collective I/O performance. The solution focuses on integrating disk parallelism with other types of parallelism: memory (by buffering and caching on several nodes), network (by parallel I/O scheduling strategies) and processors (by redistributing the I/O-related computation over several nodes). The performance results show considerable throughput increases over ROMIO's extended two-phase I/O.

  • ClusterFile: a flexible physical layout Parallel File System
    Concurrency and Computation: Practice and Experience, 2003
    Co-Authors: Florin Isaila, Walter F Tichy
    Abstract:

    This paper presents ClusterFile, a parallel file system that provides parallel file access on a cluster of computers. We introduce a file partitioning model that has been used in the design of ClusterFile. The model uses a data representation that is optimized for multidimensional array partitioning while allowing arbitrary partitions. The paper shows how the file model can be employed for file partitioning into both physical subfiles and logical views. We also present how the conversion between two partitions of the same file is implemented using a general memory redistribution algorithm. We show how we use the algorithm to optimize non-contiguous read and write operations. The experimental results include performance comparisons with the Parallel Virtual File System (PVFS) and an MPI-IO implementation for PVFS. Copyright © 2003 John Wiley & Sons, Ltd.

  • ClusterFile: a flexible physical layout Parallel File System
    Foundations of Computer Science, 2001
    Co-Authors: Florin Isaila, Walter F Tichy
    Abstract:

    This paper presents ClusterFile, a parallel file system that provides parallel file access on a cluster of computers. Existing parallel file systems offer little control over matching the I/O access patterns and file data layout. Without this matching, applications may face the following problems: contention at I/O nodes, fragmentation of file data, false sharing, small network messages, and high overhead of scattering/gathering the data. ClusterFile addresses some of these inefficiencies. Parallel applications can physically partition a file in arbitrary patterns. They can also set arbitrary views on a file. Views hide the parallel structure of the file and ease the programmer's burden of computing complex access indices. The intersections between views and layouts are computed by a memory redistribution algorithm. Read and write operations are optimized by pre-computing the direct mapping between access patterns and disks. ClusterFile uses the same data representation for file layouts, access patterns, and the mappings between them.

  • CLUSTER - ClusterFile: a flexible physical layout Parallel File System
    Proceedings 2001 IEEE International Conference on Cluster Computing, 2001
    Co-Authors: Florin Isaila, Walter F Tichy
    Abstract:

    This paper presents ClusterFile, a parallel file system that provides parallel file access on a cluster of computers. Existing parallel file systems offer little control over matching the I/O access patterns and file data layout. Without this matching, applications may face the following problems: contention at I/O nodes, fragmentation of file data, false sharing, small network messages, and high overhead of scattering/gathering the data. ClusterFile addresses some of these inefficiencies. Parallel applications can physically partition a file in arbitrary patterns. They can also set arbitrary views on a file. Views hide the parallel structure of the file and ease the programmer's burden of computing complex access indices. The intersections between views and layouts are computed by a memory redistribution algorithm. Read and write operations are optimized by pre-computing the direct mapping between access patterns and disks. ClusterFile uses the same data representation for file layouts, access patterns, and the mappings between them.
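
The ClusterFile abstracts above mention pre-computing the direct mapping between access patterns and disks. The sketch below shows one simple form such a precomputed table could take. It is a deliberately restricted illustration: it assumes a fixed round-robin block layout over subfiles, whereas the abstracts describe arbitrary layouts handled by a general redistribution algorithm, and the function names and sizes are hypothetical.

```python
def physical_location(file_off, block, n_subfiles):
    """Map a logical file offset to (subfile, offset inside that subfile),
    assuming blocks of `block` bytes dealt round-robin over the subfiles."""
    blk = file_off // block
    subfile = blk % n_subfiles
    local = (blk // n_subfiles) * block + file_off % block
    return subfile, local


def precompute_view_mapping(view_offsets, block, n_subfiles):
    """Table from position in the view to physical location, built once so
    that later reads and writes need no per-access index arithmetic."""
    return [physical_location(o, block, n_subfiles) for o in view_offsets]


# A view that matches the layout: consecutive blocks alternate between subfiles.
matched = precompute_view_mapping([0, 2, 4, 6], block=2, n_subfiles=2)
# A strided view that mismatches it: every access lands on subfile 0.
strided = precompute_view_mapping([0, 4, 8, 12], block=2, n_subfiles=2)

print(matched)
print(strided)
```

In this toy setting the strided view sends all of its accesses to a single subfile, a small instance of the I/O node contention the abstracts list as a consequence of mismatched access patterns and layouts.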