Workload Characterization

The Experts below are selected from a list of 3777 Experts worldwide ranked by the ideXlab platform

Paolo Romano - One of the best experts on this subject based on the ideXlab platform.

  • automated Workload Characterization in cloud based transactional data grids
    International Parallel and Distributed Processing Symposium, 2012
    Co-Authors: Bruno Ciciani, Diego Didona, Sebastiano Peluso, Pierangelo Di Sanzo, Francesco Quaglia, Roberto Palmieri, Paolo Romano
    Abstract:

    Cloud computing represents a cost-effective paradigm to deploy a wide class of large-scale distributed applications, for which the pay-per-use model combined with automatic resource provisioning promises to reduce the cost of dependability and scalability. However, a key challenge to be addressed to materialize the advantages promised by Cloud computing is the design of effective auto-scaling and self-tuning mechanisms capable of ensuring pre-determined QoS levels at minimum cost in the face of changing Workload conditions. This is one of the key goals being pursued by the Cloud-TM project, a recent EU project that is developing a novel, self-optimizing transactional data platform for the cloud. In this paper we present the key design choices underlying the development of Cloud-TM's Workload Analyzer (WA), a crucial component of the Cloud-TM platform that is in charge of three key functionalities: aggregating, filtering and correlating the streams of statistical data gathered from the various nodes of the Cloud-TM platform; building detailed Workload profiles of applications deployed on the Cloud-TM platform, characterizing their present and future demands in terms of both logical (i.e. data) and physical (e.g. hardware-related) resources; and triggering alerts in the presence of violations (or risks of future violations) of pre-determined SLAs.
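
The abstract above describes the Workload Analyzer in terms of three functionalities: aggregating per-node statistics, building Workload profiles, and raising SLA alerts. As a minimal illustrative sketch (the field names, metrics, and thresholds below are hypothetical placeholders, not the actual Cloud-TM WA interface), those steps could be structured as follows:

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

# Hypothetical per-node statistics sample; the field names are illustrative,
# not the actual Cloud-TM Workload Analyzer schema.
@dataclass
class NodeSample:
    node_id: str
    tx_per_sec: float
    abort_rate: float        # fraction of aborted transactions
    avg_response_ms: float
    cpu_util: float          # 0.0 - 1.0

def aggregate(samples: List[NodeSample]) -> Dict[str, float]:
    """Aggregate the per-node statistic streams into a platform-wide profile."""
    return {
        "throughput_tx_s": sum(s.tx_per_sec for s in samples),
        "abort_rate": mean(s.abort_rate for s in samples),
        "avg_response_ms": mean(s.avg_response_ms for s in samples),
        "cpu_util": mean(s.cpu_util for s in samples),
    }

def check_sla(profile: Dict[str, float], sla: Dict[str, float]) -> List[str]:
    """Trigger alerts when the aggregated profile violates (or nears) an SLA bound."""
    alerts = []
    for metric, limit in sla.items():
        value = profile.get(metric, 0.0)
        if value > limit:
            alerts.append(f"VIOLATION: {metric}={value:.2f} exceeds {limit}")
        elif value > 0.9 * limit:
            alerts.append(f"WARNING: {metric}={value:.2f} approaching {limit}")
    return alerts

if __name__ == "__main__":
    samples = [
        NodeSample("node-1", 1200, 0.04, 18.0, 0.71),
        NodeSample("node-2", 950, 0.06, 25.0, 0.83),
    ]
    for alert in check_sla(aggregate(samples), {"avg_response_ms": 20.0, "abort_rate": 0.10}):
        print(alert)
```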

Martin Arlitt - One of the best experts on this subject based on the ideXlab platform.

  • web Workload Characterization ten years later
    2005
    Co-Authors: Adepele Williams, Martin Arlitt, Carey Williamson, Ken Barker
    Abstract:

    In 1996, Arlitt and Williamson [Arlitt et al., 1997] conducted a comprehensive Workload Characterization study of Internet Web servers. By analyzing access logs from 6 Web sites (3 academic, 2 research, and 1 industrial) in 1994 and 1995, the authors identified 10 invariants: Workload characteristics common to all the sites that are likely to persist over time. In the present work, we revisit the 1996 study by Arlitt and Williamson, repeating many of the same analyses on new data sets collected in 2004. In particular, we study access logs from the same 3 academic sites used in the 1996 paper. Despite a 30-fold increase in overall traffic volume from 1994 to 2004, our main conclusion is that there have been no dramatic changes in Web server Workload characteristics over the last 10 years. Although there have been many changes in Web technologies (e.g., new protocols, scripting languages, caching infrastructures), most of the 1996 invariants still hold true today. We postulate that these invariants will continue to hold in the future, because they represent fundamental characteristics of how humans organize, store, and access information on the Web.
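
The invariants in this line of work are obtained by analyzing raw server access logs. The sketch below shows, purely as an illustration, how two of the characterized quantities (document type distribution and the fraction of documents referenced only once) might be computed from a Common Log Format trace; the regular expression and summary fields are assumptions, not the scripts used in the 1996 or 2004 studies.

```python
import re
from collections import Counter

# Minimal Common Log Format matcher; captures the requested URL, status code,
# and response size. Illustrative only.
CLF = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (\S+)[^"]*" (\d{3}) (\d+|-)')

def characterize(log_lines):
    requests = Counter()   # per-URL request counts
    types = Counter()      # requests grouped by file extension ("document type")
    for line in log_lines:
        m = CLF.match(line)
        if not m:
            continue
        url, status, size = m.groups()
        requests[url] += 1
        ext = url.rsplit(".", 1)[-1].lower() if "." in url else "(none)"
        types[ext] += 1
    one_timers = sum(1 for c in requests.values() if c == 1)
    return {
        "total_requests": sum(requests.values()),
        "distinct_documents": len(requests),
        "one_time_referenced_frac": one_timers / max(len(requests), 1),
        "top_types": types.most_common(5),
    }

if __name__ == "__main__":
    sample = [
        '1.2.3.4 - - [01/Jun/2004:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 5120',
        '1.2.3.5 - - [01/Jun/2004:10:00:01 +0000] "GET /logo.gif HTTP/1.1" 200 20480',
        '1.2.3.4 - - [01/Jun/2004:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 5120',
    ]
    print(characterize(sample))
```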

  • a Workload Characterization study of the 1998 world cup web site
    IEEE Network, 2000
    Co-Authors: Martin Arlitt
    Abstract:

    This article presents a detailed Workload Characterization study of the 1998 World Cup Web site. Measurements from this site were collected over a three-month period. During this time the site received 1.35 billion requests, making this the largest Web Workload analyzed to date. By examining this extremely busy site and through comparison with existing Characterization studies, we are able to determine how Web server Workloads are evolving. We find that improvements in the caching architecture of the World Wide Web are changing the Workloads of Web servers, but major improvements to that architecture are still necessary. In particular, we uncover evidence that a better consistency mechanism is required for World Wide Web caches.

  • Workload Characterization of a web proxy in a cable modem environment
    Measurement and Modeling of Computer Systems, 1999
    Co-Authors: Martin Arlitt, Rich Friedrich, Tai Jin
    Abstract:

    This paper presents a detailed Workload Characterization study of a World-Wide Web proxy. Measurements from a proxy within an Internet Service Provider (ISP) environment were collected. This ISP allows clients to access the Web using high-speed cable modems rather than traditional dialup modems. By examining this site we are able to evaluate the effects that cable modems have on proxy Workloads. This paper focuses on Workload characteristics such as file type distribution, file size distribution, file referencing behaviour and turnover in the active set of files. We find that when presented with faster access speeds users are willing to download extremely large files. A widespread increase in the transfer of these large files would have a significant impact on the Web. This behaviour increases the importance of caching for ensuring the scalability of the Web.
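
One concrete consequence the abstract points to is that a small number of very large transfers can dominate byte traffic. A toy calculation of that effect, with made-up sizes and thresholds (not figures from the paper), might look like this:

```python
# How much of the total byte traffic do the largest transfers account for?
def large_transfer_share(sizes_bytes, top_fraction=0.01):
    """Fraction of total bytes contributed by the largest `top_fraction` of transfers."""
    ordered = sorted(sizes_bytes, reverse=True)
    k = max(1, int(len(ordered) * top_fraction))
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0

if __name__ == "__main__":
    # Hypothetical trace: 990 small objects (~10 KB) plus 10 large downloads (~100 MB)
    sizes = [10_000] * 990 + [100_000_000] * 10
    print(f"top 1% of transfers carry {large_transfer_share(sizes):.0%} of bytes")
```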

  • internet web servers Workload Characterization and performance implications
    IEEE ACM Transactions on Networking, 1997
    Co-Authors: Martin Arlitt, Carey Williamson
    Abstract:

    This paper presents a Workload Characterization study for Internet Web servers. Six different data sets are used in the study: three from academic environments, two from scientific research organizations, and one from a commercial Internet provider. These data sets represent three different orders of magnitude in server activity, and two different orders of magnitude in time duration, ranging from one week of activity to one year. The Workload Characterization focuses on the document type distribution, the document size distribution, the document referencing behavior, and the geographic distribution of server requests. Throughout the study, emphasis is placed on finding Workload characteristics that are common to all the data sets studied. Ten such characteristics are identified. The paper concludes with a discussion of caching and performance issues, using the observed Workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.

  • web server Workload Characterization the search for invariants
    Measurement and Modeling of Computer Systems, 1996
    Co-Authors: Martin Arlitt, Carey Williamson
    Abstract:

    The phenomenal growth in popularity of the World Wide Web (WWW, or the Web) has made WWW traffic the largest contributor to packet and byte traffic on the NSFNET backbone. This growth has triggered recent research aimed at reducing the volume of network traffic produced by Web clients and servers, by using caching, and at reducing the latency for WWW users, by using improved protocols for Web interaction. Fundamental to the goal of improving WWW performance is an understanding of WWW Workloads. This paper presents a Workload Characterization study for Internet Web servers. Six different data sets are used in this study: three from academic (i.e., university) environments, two from scientific research organizations, and one from a commercial Internet provider. These data sets represent three different orders of magnitude in server activity, and two different orders of magnitude in time duration, ranging from one week of activity to one year of activity. Throughout the study, emphasis is placed on finding Workload invariants: observations that apply across all the data sets studied. Ten invariants are identified. These invariants are deemed important since they (potentially) represent universal truths for all Internet Web servers. The paper concludes with a discussion of caching and performance issues, using the invariants to suggest performance enhancements that seem most promising for Internet Web servers.
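
The caching discussion in these server studies is typically grounded by replaying a request trace against a cache model to estimate hit ratios. The following is a hedged sketch of such a replay using a simple LRU policy; the trace, the capacity, and the choice of LRU are illustrative assumptions, not details taken from the paper.

```python
from collections import OrderedDict

def lru_hit_ratio(trace, capacity):
    """Replay a document request trace against an LRU cache of `capacity` entries."""
    cache = OrderedDict()   # document -> None, ordered by recency of use
    hits = 0
    for doc in trace:
        if doc in cache:
            hits += 1
            cache.move_to_end(doc)          # refresh recency on a hit
        else:
            cache[doc] = None
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used entry
    return hits / len(trace) if trace else 0.0

if __name__ == "__main__":
    trace = ["/index.html", "/logo.gif", "/index.html", "/a.html", "/index.html", "/logo.gif"]
    print(f"hit ratio with capacity 2: {lru_hit_ratio(trace, 2):.2f}")
```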

Hossein Asadi - One of the best experts on this subject based on the ideXlab platform.

  • reca an efficient reconfigurable cache architecture for storage systems with online Workload Characterization
    arXiv: Performance, 2018
    Co-Authors: Reza Salkhordeh, Shahriar Ebrahimi, Hossein Asadi
    Abstract:

    In recent years, SSDs have gained tremendous attention in computing and storage systems due to significant performance improvement over HDDs. The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer to store performance-critical data blocks and reduce the number of accesses to the disk subsystem. Due to characteristics of Flash-based SSDs such as limited write endurance and long latency on write operations, employing caching algorithms at the Operating System (OS) level necessitates taking such characteristics into consideration. Previous caching techniques are optimized towards only one type of application, which affects both generality and applicability. In addition, they are not adaptive when the Workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems using a comprehensive Workload Characterization to find an optimal cache configuration for I/O intensive applications. For this purpose, we first investigate various types of I/O Workloads and classify them into five major classes. Based on this Characterization, an optimal cache configuration is presented for each class of Workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime, and the cache organization is reconfigured if the application changes from one class to another. The cache reconfiguration is done online, and Workload classes can be extended to cover emerging I/O Workloads so that the approach remains effective as the characteristics of I/O requests evolve. Experimental results obtained by implementing ReCA in a server running Linux show that the proposed architecture improves performance and lifetime by up to 24% and 33%, respectively.
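
The abstract outlines a monitor-classify-reconfigure loop: extract workload features at runtime, map them to one of a few classes, and reconfigure the cache only when the class changes. A schematic version of that loop is sketched below; the feature names, class labels, decision rules, and cache configurations are placeholders chosen for illustration, not ReCA's actual classifier or parameters.

```python
from dataclasses import dataclass

@dataclass
class IOFeatures:
    read_ratio: float       # fraction of read requests
    random_ratio: float     # fraction of non-sequential accesses
    avg_request_kb: float

def classify(f: IOFeatures) -> str:
    """Map observed features to a coarse workload class (hypothetical rules)."""
    if f.read_ratio > 0.9:
        return "read-mostly"
    if f.read_ratio < 0.3:
        return "write-heavy"
    if f.random_ratio > 0.7 and f.avg_request_kb < 16:
        return "random-small"
    if f.random_ratio < 0.3 and f.avg_request_kb > 128:
        return "sequential-large"
    return "mixed"

# One cache configuration per class; the values are illustrative placeholders.
CONFIGS = {
    "read-mostly":      {"write_policy": "write-around",  "line_kb": 8},
    "write-heavy":      {"write_policy": "write-back",    "line_kb": 64},
    "random-small":     {"write_policy": "write-back",    "line_kb": 4},
    "sequential-large": {"write_policy": "write-through", "line_kb": 256},
    "mixed":            {"write_policy": "write-back",    "line_kb": 16},
}

def monitor_loop(windows):
    """Reconfigure only when the observed class changes between monitoring windows."""
    current = None
    for f in windows:
        cls = classify(f)
        if cls != current:
            current = cls
            print(f"reconfigure -> {cls}: {CONFIGS[cls]}")

if __name__ == "__main__":
    monitor_loop([
        IOFeatures(0.95, 0.8, 4),   # classified as read-mostly
        IOFeatures(0.2, 0.5, 32),   # class change -> reconfigure
        IOFeatures(0.2, 0.5, 32),   # unchanged -> no reconfiguration
    ])
```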

  • ReCA: An Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization
    IEEE Transactions on Parallel and Distributed Systems, 2018
    Co-Authors: Reza Salkhordeh, Shahriar Ebrahimi, Hossein Asadi
    Abstract:

    In recent years, Solid-State Drives (SSDs) have gained tremendous attention in computing and storage systems due to significant performance improvement over Hard Disk Drives (HDDs). The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer to store performance-critical data blocks in order to reduce the number of accesses to the HDD-based disk subsystem. Due to characteristics of Flash-based SSDs such as limited write endurance and long latency on write operations, employing caching algorithms at the Operating System (OS) level necessitates taking such characteristics into consideration. Previous OS-level caching techniques are optimized towards only one type of application, which affects both generality and applicability. In addition, they are not adaptive when the Workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems using a comprehensive Workload Characterization to find an optimal cache configuration for I/O intensive applications. For this purpose, we first investigate various types of I/O Workloads and classify them into five major classes. Based on this Characterization, an optimal cache configuration is presented for each class of Workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime, and the cache organization is reconfigured if the application changes from one class to another. The cache reconfiguration is done online, and Workload classes can be extended to cover emerging I/O Workloads so that the approach remains effective as the characteristics of I/O requests evolve. Experimental results obtained by implementing ReCA in a 4U rackmount server with SATA 6Gb/s disk interfaces running Linux 3.17.0 show that the proposed architecture improves performance and lifetime by up to 24 and 33 percent, respectively.

  • Operating system level data tiering using online Workload Characterization
    The Journal of Supercomputing, 2015
    Co-Authors: Reza Salkhordeh, Hossein Asadi, Shahriar Ebrahimi
    Abstract:

    Over the past decade, storage has been the performance bottleneck in I/O-intensive programs such as online transaction processing applications. To alleviate this bottleneck with minimal cost penalty, cost-effective design of a high-performance disk subsystem is of decisive importance in enterprise applications. Data tiering is an efficient way to optimize cost, performance, and reliability in storage servers. With the promising advantages of solid-state drives (SSDs) over hard disk drives (HDDs), such as lower power consumption and higher performance, traditional data tiering techniques should be revisited to use SSDs in a more efficient way. Previously proposed tiering solutions have attempted to enhance performance based on different parameters such as request size or randomness. These solutions, however, are mostly optimized towards one type of I/O Workload and are not applicable to Workloads with different characteristics. This paper presents an online data tiering technique at the operating system level with a linear weighted formulation to enhance I/O performance with minimal cost overhead. The proposed technique characterizes the Workload access pattern with respect to metadata versus user data, frequency of accesses, random versus sequential accesses, and read versus write accesses. To evaluate the proposed technique, it is implemented on Linux 3.1.4 with the ext2 filesystem. The experimental results over I/O-intensive Workloads show that the proposed technique improves performance by up to 30% compared to previous techniques while imposing negligible memory overhead on the system.
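
The "linear weighted formulation" mentioned above can be pictured as a per-block score computed as a weighted sum of the listed access-pattern features, with the highest-scoring blocks promoted to the SSD tier. The weights, feature encoding, and example values in the sketch below are illustrative assumptions, not the ones used in the paper.

```python
from dataclasses import dataclass

@dataclass
class BlockStats:
    block_id: int
    is_metadata: bool
    access_freq: float    # accesses per interval, normalized to [0, 1]
    random_ratio: float   # fraction of random (non-sequential) accesses
    read_ratio: float     # fraction of reads

# Hypothetical weights for the linear combination of features.
WEIGHTS = {"metadata": 0.4, "freq": 0.3, "random": 0.2, "read": 0.1}

def score(b: BlockStats) -> float:
    """Linear weighted score: higher means the block benefits more from the SSD tier."""
    return (WEIGHTS["metadata"] * b.is_metadata
            + WEIGHTS["freq"] * b.access_freq
            + WEIGHTS["random"] * b.random_ratio
            + WEIGHTS["read"] * b.read_ratio)

def pick_ssd_resident(blocks, ssd_capacity_blocks):
    """Place the highest-scoring blocks on the SSD tier; the rest stay on HDD."""
    ranked = sorted(blocks, key=score, reverse=True)
    return {b.block_id for b in ranked[:ssd_capacity_blocks]}

if __name__ == "__main__":
    blocks = [
        BlockStats(0, True, 0.9, 0.8, 0.7),    # hot metadata
        BlockStats(1, False, 0.2, 0.1, 0.9),   # cold sequential data
        BlockStats(2, False, 0.8, 0.9, 0.5),   # hot random data
    ]
    print("SSD-resident blocks:", pick_ssd_resident(blocks, 2))
```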

Bruno Ciciani - One of the best experts on this subject based on the ideXlab platform.

  • automated Workload Characterization in cloud based transactional data grids
    International Parallel and Distributed Processing Symposium, 2012
    Co-Authors: Bruno Ciciani, Diego Didona, Sebastiano Peluso, Pierangelo Di Sanzo, Francesco Quaglia, Roberto Palmieri, Paolo Romano
    Abstract:

    Cloud computing represents a cost-effective paradigm to deploy a wide class of large-scale distributed applications, for which the pay-per-use model combined with automatic resource provisioning promises to reduce the cost of dependability and scalability. However, a key challenge to be addressed to materialize the advantages promised by Cloud computing is the design of effective auto-scaling and self-tuning mechanisms capable of ensuring pre-determined QoS levels at minimum cost in the face of changing Workload conditions. This is one of the key goals being pursued by the Cloud-TM project, a recent EU project that is developing a novel, self-optimizing transactional data platform for the cloud. In this paper we present the key design choices underlying the development of Cloud-TM's Workload Analyzer (WA), a crucial component of the Cloud-TM platform that is in charge of three key functionalities: aggregating, filtering and correlating the streams of statistical data gathered from the various nodes of the Cloud-TM platform; building detailed Workload profiles of applications deployed on the Cloud-TM platform, characterizing their present and future demands in terms of both logical (i.e. data) and physical (e.g. hardware-related) resources; and triggering alerts in the presence of violations (or risks of future violations) of pre-determined SLAs.

Reza Salkhordeh - One of the best experts on this subject based on the ideXlab platform.

  • reca an efficient reconfigurable cache architecture for storage systems with online Workload Characterization
    arXiv: Performance, 2018
    Co-Authors: Reza Salkhordeh, Shahriar Ebrahimi, Hossein Asadi
    Abstract:

    In recent years, SSDs have gained tremendous attention in computing and storage systems due to significant performance improvement over HDDs. The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer to store performance-critical data blocks and reduce the number of accesses to the disk subsystem. Due to characteristics of Flash-based SSDs such as limited write endurance and long latency on write operations, employing caching algorithms at the Operating System (OS) level necessitates taking such characteristics into consideration. Previous caching techniques are optimized towards only one type of application, which affects both generality and applicability. In addition, they are not adaptive when the Workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems using a comprehensive Workload Characterization to find an optimal cache configuration for I/O intensive applications. For this purpose, we first investigate various types of I/O Workloads and classify them into five major classes. Based on this Characterization, an optimal cache configuration is presented for each class of Workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime, and the cache organization is reconfigured if the application changes from one class to another. The cache reconfiguration is done online, and Workload classes can be extended to cover emerging I/O Workloads so that the approach remains effective as the characteristics of I/O requests evolve. Experimental results obtained by implementing ReCA in a server running Linux show that the proposed architecture improves performance and lifetime by up to 24% and 33%, respectively.

  • ReCA: An Efficient Reconfigurable Cache Architecture for Storage Systems with Online Workload Characterization
    IEEE Transactions on Parallel and Distributed Systems, 2018
    Co-Authors: Reza Salkhordeh, Shahriar Ebrahimi, Hossein Asadi
    Abstract:

    In recent years, Solid-State Drives (SSDs) have gained tremendous attention in computing and storage systems due to significant performance improvement over Hard Disk Drives (HDDs). The cost per capacity of SSDs, however, prevents them from entirely replacing HDDs in such systems. One approach to effectively take advantage of SSDs is to use them as a caching layer to store performance-critical data blocks in order to reduce the number of accesses to the HDD-based disk subsystem. Due to characteristics of Flash-based SSDs such as limited write endurance and long latency on write operations, employing caching algorithms at the Operating System (OS) level necessitates taking such characteristics into consideration. Previous OS-level caching techniques are optimized towards only one type of application, which affects both generality and applicability. In addition, they are not adaptive when the Workload pattern changes over time. This paper presents an efficient Reconfigurable Cache Architecture (ReCA) for storage systems using a comprehensive Workload Characterization to find an optimal cache configuration for I/O intensive applications. For this purpose, we first investigate various types of I/O Workloads and classify them into five major classes. Based on this Characterization, an optimal cache configuration is presented for each class of Workloads. Then, using the main features of each class, we continuously monitor the characteristics of an application during system runtime, and the cache organization is reconfigured if the application changes from one class to another. The cache reconfiguration is done online, and Workload classes can be extended to cover emerging I/O Workloads so that the approach remains effective as the characteristics of I/O requests evolve. Experimental results obtained by implementing ReCA in a 4U rackmount server with SATA 6Gb/s disk interfaces running Linux 3.17.0 show that the proposed architecture improves performance and lifetime by up to 24 and 33 percent, respectively.

  • Operating system level data tiering using online Workload Characterization
    The Journal of Supercomputing, 2015
    Co-Authors: Reza Salkhordeh, Hossein Asadi, Shahriar Ebrahimi
    Abstract:

    Over the past decade, storage has been the performance bottleneck in I/O-intensive programs such as online transaction processing applications. To alleviate this bottleneck with minimal cost penalty, cost-effective design of a high-performance disk subsystem is of decisive importance in enterprise applications. Data tiering is an efficient way to optimize cost, performance, and reliability in storage servers. With the promising advantages of solid-state drives (SSDs) over hard disk drives (HDDs), such as lower power consumption and higher performance, traditional data tiering techniques should be revisited to use SSDs in a more efficient way. Previously proposed tiering solutions have attempted to enhance performance based on different parameters such as request size or randomness. These solutions, however, are mostly optimized towards one type of I/O Workload and are not applicable to Workloads with different characteristics. This paper presents an online data tiering technique at the operating system level with a linear weighted formulation to enhance I/O performance with minimal cost overhead. The proposed technique characterizes the Workload access pattern with respect to metadata versus user data, frequency of accesses, random versus sequential accesses, and read versus write accesses. To evaluate the proposed technique, it is implemented on Linux 3.1.4 with the ext2 filesystem. The experimental results over I/O-intensive Workloads show that the proposed technique improves performance by up to 30% compared to previous techniques while imposing negligible memory overhead on the system.