Secondary Storage

The Experts below are selected from a list of 240 Experts worldwide ranked by the ideXlab platform

Srinivas Aluru - One of the best experts on this subject based on the ideXlab platform.

  • Obtaining provably good performance from suffix trees in Secondary Storage
    Lecture Notes in Computer Science, 2006
    Co-Authors: Srinivas Aluru
    Abstract:

    Designing external memory data structures for string databases is of significant recent interest due to the proliferation of biological sequence data. The suffix tree is an important indexing structure that provides optimal algorithms for memory bound data. However, string B-trees provide the best known asymptotic performance in external memory for substring search and update operations. Work on external memory variants of suffix trees has largely focused on constructing suffix trees in external memory or layout schemes for suffix trees that preserve link locality. In this paper, we present a new suffix tree layout scheme for Secondary Storage and present construction, substring search, insertion and deletion algorithms that are competitive with the string B-tree. For a set of strings of total length $n$, a pattern $p$ and disk blocks of size $B$, we provide a substring search algorithm that uses $O(|p|/B + \log_B n)$ disk accesses. We present algorithms for insertion and deletion of all suffixes of a string of length $m$ that take $O(m \log_B (n + m))$ and $O(m \log_B n)$ disk accesses, respectively. Our results demonstrate that suffix trees can be directly used as efficient Secondary Storage data structures for string and sequence data.
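To give a feel for the disk-access bounds quoted in this abstract, the following sketch simply evaluates the expressions inside the O() for concrete numbers. The function names and the sample values are illustrative assumptions, not taken from the paper, and the constant factors hidden by the asymptotic notation are ignored, so the results indicate orders of magnitude only.

```python
import math


def search_disk_accesses(pattern_len, n, block_size):
    """Evaluate |p|/B + log_B n, ignoring the constant hidden by the O()."""
    return math.ceil(pattern_len / block_size) + math.ceil(math.log(n, block_size))


def insert_disk_accesses(m, n, block_size):
    """Evaluate m * log_B(n + m), again ignoring constant factors."""
    return m * math.ceil(math.log(n + m, block_size))


if __name__ == "__main__":
    # Hypothetical workload: 10^9 indexed characters, blocks of 1024 characters.
    print(search_disk_accesses(1000, 10**9, 1024))  # pattern of 1000 characters
    print(insert_disk_accesses(500, 10**9, 1024))   # inserting a string of length 500
```

With a large block size the logarithmic term stays small even for very large n, which is why the search cost is dominated by reading the pattern-length portion of the index.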

  • CPM - Obtaining provably good performance from suffix trees in Secondary Storage
    Combinatorial Pattern Matching, 2006
    Co-Authors: Srinivas Aluru
    Abstract:

    Designing external memory data structures for string databases is of significant recent interest due to the proliferation of biological sequence data. The suffix tree is an important indexing structure that provides optimal algorithms for memory bound data. However, string B-trees provide the best known asymptotic performance in external memory for substring search and update operations. Work on external memory variants of suffix trees has largely focused on constructing suffix trees in external memory or layout schemes for suffix trees that preserve link locality. In this paper, we present a new suffix tree layout scheme for Secondary Storage and present construction, substring search, insertion and deletion algorithms that are competitive with the string B-tree. For a set of strings of total length $n$, a pattern $p$ and disk blocks of size $B$, we provide a substring search algorithm that uses $O(|p|/B + \log_B n)$ disk accesses. We present algorithms for insertion and deletion of all suffixes of a string of length $m$ that take $O(m \log_B (n + m))$ and $O(m \log_B n)$ disk accesses, respectively. Our results demonstrate that suffix trees can be directly used as efficient Secondary Storage data structures for string and sequence data.

  • Suffix trees and suffix arrays in primary and Secondary Storage
    Co-Authors: Srinivas Aluru
    Abstract:

    In recent years the volume of string data has increased exponentially, and the speed at which these data are being generated has also increased. Examples of string data include biological sequences, internet webpages, and digitized documents. The indexing of biological sequence data is especially challenging due to the lack of natural word and sentence boundaries. Although many algorithms are able to deal with this lack of natural boundaries, they are not able to process the large quantity of data in reasonable time. To speed up these algorithms, suffix trees and suffix arrays are routinely used to generate a set of starting positions quickly and/or narrow down the set of possibilities that need to be considered. The first contribution of this dissertation is a linear time algorithm to sort all the suffixes of a string over a large alphabet of integers. The sorted order of suffixes of a string is also called the suffix array, a data structure introduced by Manber and Myers that has numerous applications in pattern matching, string processing, and computational biology. Though the suffix tree of a string can be constructed in linear time and the sorted order of suffixes derived from it, a direct algorithm for suffix sorting is of great interest due to the space requirements of suffix trees. Our result is one of the first linear time suffix array construction algorithms, which improve upon the previously known $O(n \log n)$ time direct algorithms for suffix sorting. It can also be used to derive a different linear time construction algorithm for suffix trees. Apart from being simple and applicable to alphabets not necessarily of fixed size, this method of constructing suffix trees is more space efficient. The second contribution of this dissertation is a new suffix tree layout scheme for Secondary Storage, together with construction, substring search, insertion and deletion algorithms that use this layout scheme. For a set of strings of total length $n$, a pattern $p$ and disk blocks of size $B$, we provide a substring search algorithm that uses $O(|p|/B + \log_B n)$ disk accesses. We present algorithms for insertion and deletion of all suffixes of a string of length $m$ that take $O(m \log_B (n + m))$ and $O(m \log_B n)$ disk accesses, respectively. Our results demonstrate that suffix trees can be directly used as efficient Secondary Storage data structures for string and sequence data. The last contribution of this dissertation is a self-adjusting variant of our layout scheme for suffix trees in Secondary Storage that provides an optimal number of disk accesses for a sequence of string or substring queries. This has been an open problem since Sleator and Tarjan presented their splaying technique to create self-adjusting binary search trees in 1985. In addition to resolving this open problem, our scheme provides two additional advantages: (1) the partitions are slowly readjusted, requiring fewer disk accesses than splaying methods, and (2) the initial state of the layout is balanced, making it useful even when the sequence of queries is not highly skewed. Our layout scheme and its self-adjusting variant are also applicable to PATRICIA trees, and potentially to other data structures.
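The abstract defines the suffix array as the sorted order of all suffixes of a string. The sketch below illustrates that definition and how the sorted order narrows down candidate starting positions for a pattern. It deliberately uses a naive comparison sort, not the linear time algorithm the dissertation contributes, and the function names are hypothetical.

```python
def build_suffix_array(text):
    """Return suffix start positions in sorted order of the suffixes.

    Naive construction by comparison sorting; this is NOT the linear
    time algorithm described in the abstract, only the definition.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return all start positions of pattern, using two binary searches."""
    k = len(pattern)

    # Leftmost suffix whose first k characters are >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # First suffix whose first k characters are > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(find_occurrences(text, sa, "ana"))
```

All suffixes that begin with the pattern form one contiguous range of the suffix array, which is what makes the binary searches sufficient.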

  • SPIRE - Optimal self-adjusting trees for dynamic string data in Secondary Storage
    String Processing and Information Retrieval
    Co-Authors: Srinivas Aluru
    Abstract:

    We present a self-adjusting layout scheme for suffix trees in Secondary Storage that provides an optimal number of disk accesses for a sequence of string or substring queries. This has been an open problem since Sleator and Tarjan presented their splaying technique to create self-adjusting binary search trees in 1985. In addition to resolving this open problem, our scheme provides two additional advantages: (1) the partitions are slowly readjusted, requiring fewer disk accesses than splaying methods, and (2) the initial state of the layout is balanced, making it useful even when the sequence of queries is not highly skewed. Our method is also applicable to PATRICIA trees, and potentially to other data structures.

Michal Welnicki - One of the best experts on this subject based on the ideXlab platform.

  • HYDRAstor: a Scalable Secondary Storage
    File and Storage Technologies, 2009
    Co-Authors: Cezary Dubnicki, Leszek Gryz, Lukasz Heldt, Michal Kaczmarczyk, Wojciech Kilian, Przemyslaw Strzelczak, Jerzy Szczepkowski, Cristian Ungureanu, Michal Welnicki
    Abstract:

    HYDRAstor is a scalable, Secondary Storage solution aimed at the enterprise market. The system consists of a back-end architectured as a grid of Storage nodes built around a distributed hash table; and a front-end consisting of a layer of access nodes which implement a traditional file system interface and can be scaled in number for increased performance. This paper concentrates on the back-end which is, to our knowledge, the first commercial implementation of a scalable, high-performance content-addressable Secondary Storage delivering global duplicate elimination, per-block user-selectable failure resiliency, self-maintenance including automatic recovery from failures with data and network overlay rebuilding. The back-end programming model is based on an abstraction of a sea of variable-sized, content-addressed, immutable, highly-resilient data blocks organized in a DAG (directed acyclic graph). This model is exported with a low-level API allowing clients to implement new access protocols and to add them to the system on-line. The API has been validated with an implementation of the file system interface. The critical factor for meeting the design targets has been the selection of proper data organization based on redundant chains of data containers. We present this organization in detail and describe how it is used to deliver required data services. Surprisingly, the most complex to deliver turned out to be on-demand data deletion, followed (not surprisingly) by the management of data consistency and integrity.
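The programming model described in this abstract (immutable, content-addressed blocks organized in a DAG, with duplicate elimination) can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions: the class name `BlockStore`, its methods, and the choice of SHA-256 are hypothetical and are not HYDRAstor's actual API, and nothing here models the distributed hash table, resiliency, or deletion.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: a block's address is a hash of its content."""

    def __init__(self):
        self._blocks = {}  # address -> (data, tuple of child addresses)

    def put(self, data, children=()):
        """Store an immutable block and return its address.

        Children must already be stored, so the block graph is acyclic
        by construction. Writing identical content twice yields the same
        address and stores the block only once (duplicate elimination).
        """
        children = tuple(children)
        for child in children:
            if child not in self._blocks:
                raise KeyError(f"unknown child block: {child}")
        digest = hashlib.sha256()
        digest.update(len(data).to_bytes(8, "big"))  # separate data from pointers
        digest.update(data)
        for child in children:
            digest.update(child.encode("ascii"))
        address = digest.hexdigest()
        self._blocks.setdefault(address, (data, children))
        return address

    def get(self, address):
        """Return (data, child addresses) for a stored block."""
        return self._blocks[address]

    def __len__(self):
        return len(self._blocks)


if __name__ == "__main__":
    store = BlockStore()
    first = store.put(b"backup chunk")
    second = store.put(b"backup chunk")  # duplicate write
    root = store.put(b"directory", children=[first])
    print(first == second, len(store))
```

Because an address depends only on content, two clients writing the same data converge on one stored block without coordinating, which is the essence of global duplicate elimination.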

Ji-quan Shi - One of the best experts on this subject based on the ideXlab platform.

  • A Methodology to Assess Increased Storage Capacity Provided by Fracture Networks at CO2 Storage Sites: Application to In Salah Storage Site
    Energy Procedia, 2013
    Co-Authors: James Richard Smith, Sevket Durucan, Anna Korre, Ji-quan Shi
    Abstract:

    The presence of fractures in the Storage reservoir at CO2 Storage sites may increase the reservoir permeability and subsequently cause the CO2 plume extent to increase. Similarly, fractures in the caprock could provide regions of Secondary Storage if CO2 escapes from the reservoir. An important factor influencing the degree of these effects is whether the fractures form a continuously connected, or percolating, pathway. A methodology assessing the existence of a percolating network of fractures, which incorporates the uncertainties in measured fracture properties around wells, was applied to assess Secondary Storage in the lower caprock at the In Salah Storage Site. It is demonstrated that Secondary Storage will occur if the fracture line density is equal to or greater than 2 m⁻¹, and it is further shown which length distributions will provide Secondary Storage if the line density is less than 2 m⁻¹.

Sung Jo Kim - One of the best experts on this subject based on the ideXlab platform.

  • A Hybrid Swapping Scheme Based On Per-Process Reclaim for Performance Improvement of Android Smartphones (August 2018)
    IEEE Access, 2018
    Co-Authors: Junyeong Han, Sungeun Kim, Sungyoung Lee, Jaehwan Lee, Sung Jo Kim
    Abstract:

    As a way to increase the actual main memory capacity of Android smartphones, most of them make use of zRAM swapping, but the capacity gain is limited because zRAM itself uses main memory. Unfortunately, they cannot use Secondary Storage as a swap space because of its long response time and wear-out problem. In this paper, we propose a hybrid swapping scheme based on per-process reclaim that supports both Secondary-Storage swapping and zRAM swapping. It attempts to swap out all the pages in the working set of a process to a zRAM swap space rather than killing the process selected by a low-memory killer, and to swap out the least recently used pages into a Secondary Storage swap space. The rationale is that frequently swapped-in/out pages use the zRAM swap space while less frequently swapped-in/out pages use the Secondary Storage swap space, which reduces the page operation cost. Our scheme resolves both the response time and wear-out problems of Secondary-Storage swapping and zSWAP, and overcomes the size limitation of the zRAM swap space. According to performance measurements, it also increased the extension ratio of main memory by 15 to 17% and 6 to 17%, and reduced the page operation cost by 9 to 22% and 18 to 28%, compared with zRAM swapping and zSWAP, respectively.

Thomas N. Seyfried - One of the best experts on this subject based on the ideXlab platform.

  • Bis(monoacylglycero)phosphate: a Secondary Storage lipid in the gangliosidoses
    Journal of Lipid Research, 2015
    Co-Authors: Zeynep Akgoc, Miguel Sena-esteves, Douglas R. Martin, Xianlin Han, Alessandra D'azzo, Thomas N. Seyfried
    Abstract:

    Bis(monoacylglycero)phosphate (BMP) is a negatively charged glycerophospholipid with an unusual sn-1;sn-1′ structural configuration. BMP is primarily enriched in endosomal/lysosomal membranes. BMP is thought to play a role in glycosphingolipid degradation and cholesterol transport. Elevated BMP levels have been found in many lysosomal Storage diseases (LSDs), suggesting an association with lysosomal Storage material. The gangliosidoses are a group of neurodegenerative LSDs involving the accumulation of either GM1 or GM2 gangliosides resulting from inherited deficiencies in β-galactosidase or β-hexosaminidase, respectively. Little information is available on BMP levels in gangliosidosis brain tissue. Our results showed that the content of BMP in brain was significantly greater in humans and in animals (mice, cats, American black bears) with either GM1 or GM2 ganglioside Storage diseases than in brains of normal subjects. The Storage of BMP and ganglioside GM2 in brain were reduced similarly following adeno-associated viral-mediated gene therapy in Sandhoff disease mice. We also found that C22:6, C18:0, and C18:1 were the predominant BMP fatty acid species in gangliosidosis brains. The results show that BMP accumulates as a Secondary Storage material in the brain of a broad range of mammals with gangliosidoses. Citation: Akgoc, Z., M. Sena-Esteves, D. R. Martin, X. Han, A. d'Azzo, and T. N. Seyfried. Bis(monoacylglycero)phosphate: a Secondary Storage lipid in the gangliosidoses. J. Lipid Res. 2015. 56: 1006-1013.

  • Bis(monoacylglycero)phosphate: a Secondary Storage lipid in the gangliosidoses
    Journal of Lipid Research, 2015
    Co-Authors: Zeynep Akgoc, Miguel Sena-esteves, Douglas R. Martin, Xianlin Han, Alessandra D'azzo, Thomas N. Seyfried
    Abstract:

    Bis(monoacylglycero)phosphate (BMP) is a negatively charged glycerophospholipid with an unusual sn-1;sn-1′ structural configuration. BMP is primarily enriched in endosomal/lysosomal membranes. BMP is thought to play a role in glycosphingolipid degradation and cholesterol transport. Elevated BMP levels have been found in many lysosomal Storage diseases (LSDs), suggesting an association with lysosomal Storage material. The gangliosidoses are a group of neurodegenerative LSDs involving the accumulation of either GM1 or GM2 gangliosides resulting from inherited deficiencies in beta-galactosidase or beta-hexosaminidase, respectively. Little information is available on BMP levels in gangliosidosis brain tissue. Our results showed that the content of BMP in brain was significantly greater in humans and in animals (mice, cats, American black bears) with either GM1 or GM2 ganglioside Storage diseases than in brains of normal subjects. The Storage of BMP and ganglioside GM2 in brain were reduced similarly following adeno-associated viral-mediated gene therapy in Sandhoff disease mice. We also found that C22:6, C18:0, and C18:1 were the predominant BMP fatty acid species in gangliosidosis brains. The results show that BMP accumulates as a Secondary Storage material in the brain of a broad range of mammals with gangliosidoses.