Packet Processing

The experts below are selected from a list of 21,453 experts worldwide ranked by the ideXlab platform.

Hakim Weatherspoon - One of the best experts on this subject based on the ideXlab platform.

  • NetSlices: Scalable Multi-Core Packet Processing in User-Space
    ACM IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), 2012
    Co-Authors: Tudor Marian, Ki Suh Lee, Hakim Weatherspoon
    Abstract:

    Modern commodity operating systems do not provide developers with user-space abstractions for building high-speed packet processing applications. The conventional raw socket is inefficient and unable to take advantage of emerging hardware such as multi-core processors and multi-queue network adapters. In this paper we present the NetSlice operating system abstraction. Unlike the conventional raw socket, NetSlice tightly couples the hardware and software packet processing resources, and provides the application with control over these resources. To reduce shared resource contention, NetSlice performs domain-specific, coarse-grained, spatial partitioning of CPU cores, memory, and NICs. Moreover, it provides a streamlined communication channel between NICs and user-space. Although backward compatible with the conventional socket API, the NetSlice API also provides batched (multi-) send/receive operations to amortize the cost of protection domain crossings. We show that complex user-space packet processors (such as a protocol accelerator and an IPsec gateway) built from commodity components can scale linearly with the number of cores and operate at 10 Gbps network line speeds.
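
The amortization argument behind batched send/receive can be illustrated with a small cost model. This is only a sketch and not the NetSlice API: the function names and the nanosecond figures are hypothetical, chosen to show how one protection domain crossing shared by a whole batch lowers the average per-packet cost.

```python
def per_packet_cost_ns(crossing_ns: float, work_ns: float, batch_size: int) -> float:
    """Average cost per packet when one user/kernel crossing carries a whole batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # The crossing is paid once per batch; the per-packet work is paid every time.
    return crossing_ns / batch_size + work_ns


def max_packets_per_second(crossing_ns: float, work_ns: float, batch_size: int) -> float:
    """Upper bound on single-core throughput under this simple model."""
    return 1e9 / per_packet_cost_ns(crossing_ns, work_ns, batch_size)


if __name__ == "__main__":
    # Hypothetical costs, for illustration only (not measured values).
    CROSSING_NS = 1000.0  # assumed cost of one protection domain crossing
    WORK_NS = 100.0       # assumed per-packet processing cost
    for batch in (1, 8, 64):
        cost = per_packet_cost_ns(CROSSING_NS, WORK_NS, batch)
        rate = max_packets_per_second(CROSSING_NS, WORK_NS, batch)
        print(f"batch={batch:3d}  cost={cost:8.3f} ns/packet  rate={rate / 1e6:6.2f} Mpps")
```

Under these assumed numbers the crossing dominates at a batch size of 1, while at larger batch sizes the cost approaches the per-packet work alone, which is why batching pays off most when crossings are expensive relative to processing.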

R. Iyer - One of the best experts on this subject based on the ideXlab platform.

  • Architectural characterization of TCP/IP Packet Processing on the Pentium® M microprocessor
    High-Performance Computer Architecture, 2004
    Co-Authors: S. Makineni, R. Iyer
    Abstract:

    A majority of current and next-generation server applications (Web services, e-commerce, storage, etc.) employ TCP/IP as the communication protocol of choice. As a result, the performance of these applications is heavily dependent on efficient TCP/IP packet processing within the termination nodes. This dependency becomes even greater as the bandwidth needs of these applications grow from 100 Mbps to 1 Gbps to 10 Gbps in the near future. Motivated by this, we focus on the following: (a) to understand the performance behavior of the various modes of TCP/IP processing, (b) to analyze the underlying architectural characteristics of TCP/IP packet processing, and (c) to quantify the computational requirements of the TCP/IP packet processing component within realistic workloads. We achieve these goals by performing an in-depth analysis of packet processing performance on Intel's state-of-the-art low-power Pentium® M microprocessor running the Microsoft Windows Server 2003 operating system. Some of our key observations are (i) that the mode of TCP/IP operation can significantly affect the performance requirements, (ii) that transmit-side processing is largely compute-intensive, whereas receive-side processing is more memory-bound, and (iii) that the computational requirements for sending/receiving packets can form a substantial component (28% to 40%) of commercial server workloads. From our analysis, we also discuss architectural as well as stack-related improvements that can help achieve higher server network throughput and result in improved application performance.
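
To make the growth from 100 Mbps to 10 Gbps concrete, the sketch below works out how many Ethernet frames per second each link speed implies and how many CPU cycles that leaves per packet. It is back-of-the-envelope arithmetic and not taken from the paper: the 2 GHz clock is an assumption for illustration, and the function names are hypothetical. The 20 bytes of per-frame wire overhead (preamble, start delimiter, inter-frame gap) are standard Ethernet framing.

```python
# Per-frame overhead on the wire that is not part of the frame itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
ETHERNET_WIRE_OVERHEAD_BYTES = 20


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Maximum frame rate on an Ethernet link for a given frame size (FCS included)."""
    wire_bits = (frame_bytes + ETHERNET_WIRE_OVERHEAD_BYTES) * 8
    return link_bps / wire_bits


def cycles_per_packet(cpu_hz: float, link_bps: float, frame_bytes: int) -> float:
    """CPU cycles available per packet if one core must keep up with line rate."""
    return cpu_hz / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    ASSUMED_CPU_HZ = 2.0e9  # hypothetical clock rate, not a figure from the paper
    for name, bps in (("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)):
        for frame in (64, 1518):  # minimum and maximum standard Ethernet frame sizes
            pps = packets_per_second(bps, frame)
            budget = cycles_per_packet(ASSUMED_CPU_HZ, bps, frame)
            print(f"{name:>8}  {frame:4d} B frames: {pps:13,.0f} pps, {budget:10,.1f} cycles/packet")
```

At 10 Gbps with minimum-size frames this comes to roughly 14.88 million packets per second, which leaves only on the order of a hundred cycles per packet on the assumed core.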

  • Performance characterization of TCP/IP Packet Processing in commercial server workloads
    2003 IEEE International Conference on Communications (Cat. No.03CH37441), 1
    Co-Authors: S. Makineni, R. Iyer
    Abstract:

    TCP/IP is the communication protocol of choice for many current and next-generation server applications (Web services, e-commerce, storage, etc.). As a result, the performance of these applications can be heavily dependent on efficient TCP/IP packet processing within the termination nodes. Motivated by this, our work presented in this paper focuses on analyzing the underlying architectural characteristics of the TCP/IP packet processing component within server workloads. Our analysis and characterization methodology is based on in-depth measurement experiments of TCP/IP packet processing performance on Intel's state-of-the-art low-power Pentium® M microprocessor running the Microsoft Windows Server 2003 operating system. We start by analyzing the impact of NIC features such as Large Segment Offload and the use of Jumbo frames on TCP/IP packet processing performance. We then show that the architectural characteristics of transmit-side processing (largely compute-bound) are significantly different from those of receive-side processing (mostly memory-bound). Finally, we quantify the computational requirements for sending/receiving packets within commercial workloads (SPECweb99, TPC-C and TPC-W) and show that they can form a substantial component.
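
One way to see why Large Segment Offload and Jumbo frames matter is to count how many TCP segments a single large send turns into. The sketch below is illustrative arithmetic only, not the paper's methodology: it assumes IPv4 and TCP headers without options (20 bytes each), a 64 KiB application send, and hypothetical function names.

```python
IPV4_HEADER_BYTES = 20  # assumes no IP options
TCP_HEADER_BYTES = 20   # assumes no TCP options (e.g. no timestamps)


def mss_for_mtu(mtu_bytes: int) -> int:
    """Maximum TCP payload per segment for a given link MTU."""
    mss = mtu_bytes - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    if mss <= 0:
        raise ValueError("MTU too small to carry any TCP payload")
    return mss


def segments_needed(payload_bytes: int, mtu_bytes: int) -> int:
    """Number of TCP segments required to carry payload_bytes (ceiling division)."""
    mss = mss_for_mtu(mtu_bytes)
    return -(-payload_bytes // mss)


if __name__ == "__main__":
    SEND_BYTES = 64 * 1024  # assumed size of one application send
    for label, mtu in (("standard 1500-byte MTU", 1500), ("9000-byte Jumbo MTU", 9000)):
        print(f"{label}: MSS={mss_for_mtu(mtu)} B, segments={segments_needed(SEND_BYTES, mtu)}")
```

Without offload, the host stack builds each of those segments itself, so Jumbo frames cut the per-send segment count. With Large Segment Offload, the stack hands the NIC one large buffer and the NIC performs the segmentation, so the host-side per-packet work shrinks even at the standard MTU. Actual offload size limits vary by NIC and driver.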

Laurent Mathy - One of the best experts on this subject based on the ideXlab platform.

  • Fast userspace Packet Processing
    ANCS 2015 - 11th 2015 ACM IEEE Symposium on Architectures for Networking and Communications Systems, 2015
    Co-Authors: Tom Barbette, C. Soldani, Laurent Mathy
    Abstract:

    In recent years, we have witnessed the emergence of high-speed packet I/O frameworks, bringing unprecedented network performance to userspace. Using the Click modular router, we first review and quantitatively compare several such packet I/O frameworks, showing their superiority to kernel-based forwarding. We then reconsider the issue of software packet processing, in the context of modern commodity hardware with hardware multi-queues, multi-core processors and non-uniform memory access. Through a combination of existing techniques and improvements of our own, we derive modern general principles for the design of software packet processors. Our implementation of a fast packet processor framework, integrating a faster Click with both Netmap and DPDK, exhibits up to about 2.3x speed-up compared to other software implementations, when used as an IP router.
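
Hardware multi-queues are typically paired with multi-core processors by steering every packet of a flow to the same queue, so that one core handles the flow without sharing state. The sketch below illustrates that idea only and is not how Click, Netmap, or DPDK implement it: `FlowKey` and `queue_for_flow` are hypothetical names, and CRC32 stands in for the keyed hash (commonly Toeplitz) that real NICs use for receive-side scaling.

```python
import zlib
from collections import Counter
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The classic 5-tuple identifying a transport-layer flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # e.g. 6 for TCP, 17 for UDP


def queue_for_flow(flow: FlowKey, num_queues: int) -> int:
    """Map a flow to a queue index; the same flow always lands on the same queue."""
    if num_queues < 1:
        raise ValueError("num_queues must be at least 1")
    key = "|".join(str(field) for field in flow).encode("ascii")
    # CRC32 is a stand-in hash for this sketch, not what NIC hardware computes.
    return zlib.crc32(key) % num_queues


if __name__ == "__main__":
    NUM_QUEUES = 4  # assumed: one receive queue per core
    flows = [FlowKey("10.0.0.1", "10.0.0.2", 40000 + i, 80, 6) for i in range(1000)]
    load = Counter(queue_for_flow(f, NUM_QUEUES) for f in flows)
    for queue in range(NUM_QUEUES):
        print(f"queue {queue}: {load[queue]} flows")
```

Because the mapping depends only on the flow's 5-tuple, packets of one flow never cross cores, while distinct flows spread across the available queues.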

  • Understanding the Packet Processing Capabilities of Multi-core Servers
    2009
    Co-Authors: Norbert Egi, Mihai Dobrescu, Katerina Argyraki, Byung-gon Chun, Kevin Fall, Gianluca Iannaccone, Allan D. Knies, Maziar Manesh, Laurent Mathy
    Abstract:

    Compared to specialized network equipment, software routers running on commodity servers allow programmers to rapidly build and (re)program networks using the software and hardware platforms they tend to be most familiar with, namely those of the general-purpose computer. Unfortunately, the Achilles' heel of software routers has been performance; commodity servers have traditionally proven incapable of high-speed packet processing, thereby motivating an entire industry around the development of specialized network hardware and software. However, recent advances in PC technology promise significant speed-ups for applications amenable to parallelization; router workloads appear ideally suited to exploit these advances. This leads us to question whether it is now (or soon will be) plausible to scale software routers to current high-speed networks. As a first step toward answering this question, we study the packet-processing capability of current commodity multi-core servers: we identify performance bottlenecks, evaluate tradeoffs between performance and programmability, and discuss what changes are needed to further scale the packet-processing capability of general-purpose servers.

Tudor Marian - One of the best experts on this subject based on the ideXlab platform.

  • NetSlices: Scalable Multi-Core Packet Processing in User-Space
    ACM IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), 2012
    Co-Authors: Tudor Marian, Ki Suh Lee, Hakim Weatherspoon

S. Makineni - One of the best experts on this subject based on the ideXlab platform.

  • Architectural characterization of TCP/IP Packet Processing on the Pentium® M microprocessor
    High-Performance Computer Architecture, 2004
    Co-Authors: S. Makineni, R. Iyer

  • Performance characterization of TCP/IP Packet Processing in commercial server workloads
    2003 IEEE International Conference on Communications (Cat. No.03CH37441), 1
    Co-Authors: S. Makineni, R. Iyer