Iterative Routing

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 3480 Experts worldwide ranked by ideXlab platform

Theobald Martin - One of the best experts on this subject based on the ideXlab platform.

  • AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing
    2020
    Co-Authors: Venugopal, Vinu E., Theobald Martin, Chaychi Samira, Tawakuli Amal
    Abstract:

    Distributed Stream Processing Systems (DSPSs) are among the currently most emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. The major market players in this domain are clearly represented by Apache Spark and Flink, which provide a variety of frontend APIs for SQL, statistical inference, machine learning, stream processing, and many others. Yet rather few details are reported on the integration of these engines into the underlying High-Performance Computing (HPC) infrastructure and the communication protocols they use. Spark and Flink, for example, are implemented in Java and still rely on a dedicated master node for managing their control flow among the worker nodes in a compute cluster. In this paper, we describe the architecture of our AIR engine, which is designed from scratch in C++ using the Message Passing Interface (MPI), pthreads for multithreading, and is directly deployed on top of a common HPC workload manager such as SLURM. AIR implements a light-weight, dynamic sharding protocol (referred to as "Asynchronous Iterative Routing"), which facilitates a direct and asynchronous communication among all client nodes and thereby completely avoids the overhead induced by the control flow with a master node that may otherwise form a performance bottleneck. Our experiments over a variety of benchmark settings confirm that AIR outperforms Spark and Flink in terms of latency and throughput by a factor of up to 15; moreover, we demonstrate that AIR scales out much better than existing DSPSs to clusters consisting of up to 8 nodes and 224 cores.Comment: 16 pages, 6 figures, 15 plot

  • AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing
    2020
    Co-Authors: Ellampallil Venugopal Vinu, Theobald Martin, Chaychi Samira, Tawakuli Amal
    Abstract:

    Distributed Stream Processing Engines (DSPEs) are currently among the most emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. In this paper, we describe the architecture of our AIR engine, which is designed from scratch in C++ using the Message Passing Interface (MPI), pthreads for multithreading, and is directly deployed on top of a common HPC workload manager such as SLURM. AIR implements a light-weight, dynamic sharding protocol (referred to as “Asynchronous Iterative Routing”), which facilitates a direct and asynchronous communication among all worker nodes and thereby completely avoids any additional communication overhead with a dedicated master node. With its unique design, AIR fills the gap between the prevalent scale-out (but Java-based) architectures like Apache Spark and Flink, on one hand, and recent scale-up (and C++ based) prototypes such as StreamBox and PiCo, on the other hand. Our experiments over various benchmark settings confirm that AIR performs as good as the best scale-up SPEs on a single-node setup, while it outperforms existing scale-out DSPEs in terms of processing latency and sustainable throughput by a factor of up to 15 in a distributed setting

  • Asynchronous Stream Data Processing using a Light-Weight and High-Performance Dataflow Engine
    2020
    Co-Authors: Ellampallil Venugopal Vinu, Theobald Martin
    Abstract:

    Processing high-throughput data-streams has become a major challenge in areas such as real-time event monitoring, complex dataflow processing, and big data analytics. While there has been tremendous progress in distributed stream processing systems in the past few years, the high-throughput and low-latency (a.k.a. high sustainable-throughput) requirement of modern applications is pushing the limits of traditional data processing infrastructures. This paper introduces a new distributed stream data processing engine (DSPE), called “Asynchronous Iterative Routing” or simply AIR, which implements a light-weight, dynamic sharding protocol. AIR expedites a direct and asynchronous communication among all the worker nodes via multiple Message Passing Interface (MPI) communication channels and thereby completely avoids any additional communication overhead with a dedicated master node. With its unique design, AIR scales out to clusters consisting of up to 8 nodes and 224 cores, performing much better than existing DSPEs, and it performs up to 15 times better than Spark and Flink in terms of sustainable-throughput

Tawakuli Amal - One of the best experts on this subject based on the ideXlab platform.

  • AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing
    2020
    Co-Authors: Venugopal, Vinu E., Theobald Martin, Chaychi Samira, Tawakuli Amal
    Abstract:

    Distributed Stream Processing Systems (DSPSs) are among the currently most emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. The major market players in this domain are clearly represented by Apache Spark and Flink, which provide a variety of frontend APIs for SQL, statistical inference, machine learning, stream processing, and many others. Yet rather few details are reported on the integration of these engines into the underlying High-Performance Computing (HPC) infrastructure and the communication protocols they use. Spark and Flink, for example, are implemented in Java and still rely on a dedicated master node for managing their control flow among the worker nodes in a compute cluster. In this paper, we describe the architecture of our AIR engine, which is designed from scratch in C++ using the Message Passing Interface (MPI), pthreads for multithreading, and is directly deployed on top of a common HPC workload manager such as SLURM. AIR implements a light-weight, dynamic sharding protocol (referred to as "Asynchronous Iterative Routing"), which facilitates a direct and asynchronous communication among all client nodes and thereby completely avoids the overhead induced by the control flow with a master node that may otherwise form a performance bottleneck. Our experiments over a variety of benchmark settings confirm that AIR outperforms Spark and Flink in terms of latency and throughput by a factor of up to 15; moreover, we demonstrate that AIR scales out much better than existing DSPSs to clusters consisting of up to 8 nodes and 224 cores.Comment: 16 pages, 6 figures, 15 plot

  • AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing
    2020
    Co-Authors: Ellampallil Venugopal Vinu, Theobald Martin, Chaychi Samira, Tawakuli Amal
    Abstract:

    Distributed Stream Processing Engines (DSPEs) are currently among the most emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. In this paper, we describe the architecture of our AIR engine, which is designed from scratch in C++ using the Message Passing Interface (MPI), pthreads for multithreading, and is directly deployed on top of a common HPC workload manager such as SLURM. AIR implements a light-weight, dynamic sharding protocol (referred to as “Asynchronous Iterative Routing”), which facilitates a direct and asynchronous communication among all worker nodes and thereby completely avoids any additional communication overhead with a dedicated master node. With its unique design, AIR fills the gap between the prevalent scale-out (but Java-based) architectures like Apache Spark and Flink, on one hand, and recent scale-up (and C++ based) prototypes such as StreamBox and PiCo, on the other hand. Our experiments over various benchmark settings confirm that AIR performs as good as the best scale-up SPEs on a single-node setup, while it outperforms existing scale-out DSPEs in terms of processing latency and sustainable throughput by a factor of up to 15 in a distributed setting

G Schiavon - One of the best experts on this subject based on the ideXlab platform.

  • capsule and convolutional neural network based sar ship classification in sentinel 1 data
    Active and Passive Microwave Remote Sensing for Environmental Monitoring III, 2019
    Co-Authors: Leonardo De Laurentiis, Andrea Pomente, Fabio Del Frate, G Schiavon
    Abstract:

    Synthetic Aperture Radar (SAR) constitutes a fundamental asset for wide-areas monitoring with high-resolution requirements. The first SAR sensors have given rise to coarse coastal and maritime monitoring applications, including oil spill, ship and ice floes detection. With the upgrade to very high-resolution sensors in the recent years, with relatively new SAR missions such as Sentinel-1, a great deal of data providing a stronger information content has been released, enabling more refined studies on general targets features and thus permitting complex classifications, as for ship classification, which has become increasingly relevant given the growing need for coastal surveillance in commercial and military segments. In the last decade, several works focused on this topic have been presented, generally based on radiometric features processing; furthermore, in the very recent years a significant amount of research works have focused on emerging deep learning techniques, in particular on Convolutional Neural Networks (CNN). Recently Capsule Neural Networks (CapsNets) have been presented, demonstrating a notable improvement in capturing the properties of given entities, improving the use of spatial informations, in particular of spatial dependence between features, a severely lacking feature in CNNs. In fact, CNNs pooling operations have been criticized for losing spatial relations, thus special capsules, along with a new Iterative Routing-by-agreement mechanism, have been proposed. In this work a comparison between Capsule and CNNs potential in the ship classification application domain is shown, by leveraging the OpenSARShip, a SAR Sentinel-1 ship chips dataset; in particular, a performance comparison between capsule and various convolutional architectures is built, demonstrating better performances of CapsNet in classifying ships within a small dataset.

Schiavon Giovanni - One of the best experts on this subject based on the ideXlab platform.

  • Capsule and convolutional neural network-based SAR ship classification in Sentinel-1 data
    'SPIE-Intl Soc Optical Eng', 2019
    Co-Authors: De Laurentiis Leonardo, Pomente Andrea, Del Frate Fabio, Schiavon Giovanni
    Abstract:

    Synthetic Aperture Radar (SAR) constitutes a fundamental asset for wide-areas monitoring with high-resolution requirements. The first SAR sensors have given rise to coarse coastal and maritime monitoring applications, including oil spill, ship and ice floes detection. With the upgrade to very high-resolution sensors in the recent years, with relatively new SAR missions such as Sentinel-1, a great deal of data providing a stronger information content has been released, enabling more refined studies on general targets features and thus permitting complex classifications, as for ship classification, which has become increasingly relevant given the growing need for coastal surveillance in commercial and military segments. In the last decade, several works focused on this topic have been presented, generally based on radiometric features processing; furthermore, in the very recent years a significant amount of research works have focused on emerging deep learning techniques, in particular on Convolutional Neural Networks (CNN). Recently Capsule Neural Networks (CapsNets) have been presented, demonstrating a notable improvement in capturing the properties of given entities, improving the use of spatial informations, in particular of spatial dependence between features, a severely lacking feature in CNNs. In fact, CNNs pooling operations have been criticized for losing spatial relations, thus special capsules, along with a new Iterative Routing-by-agreement mechanism, have been proposed. In this work a comparison between Capsule and CNNs potential in the ship classification application domain is shown, by leveraging the OpenSARShip, a SAR Sentinel-1 ship chips dataset; in particular, a performance comparison between capsule and various convolutional architectures is built, demonstrating better performances of CapsNet in classifying ships within a small dataset.Comment: Please check out the original SPIE paper for a complete list of figures, tables, references and general conten

Ellampallil Venugopal Vinu - One of the best experts on this subject based on the ideXlab platform.

  • AIR: A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing
    2020
    Co-Authors: Ellampallil Venugopal Vinu, Theobald Martin, Chaychi Samira, Tawakuli Amal
    Abstract:

    Distributed Stream Processing Engines (DSPEs) are currently among the most emerging topics in data management, with applications ranging from real-time event monitoring to processing complex dataflow programs and big data analytics. In this paper, we describe the architecture of our AIR engine, which is designed from scratch in C++ using the Message Passing Interface (MPI), pthreads for multithreading, and is directly deployed on top of a common HPC workload manager such as SLURM. AIR implements a light-weight, dynamic sharding protocol (referred to as “Asynchronous Iterative Routing”), which facilitates a direct and asynchronous communication among all worker nodes and thereby completely avoids any additional communication overhead with a dedicated master node. With its unique design, AIR fills the gap between the prevalent scale-out (but Java-based) architectures like Apache Spark and Flink, on one hand, and recent scale-up (and C++ based) prototypes such as StreamBox and PiCo, on the other hand. Our experiments over various benchmark settings confirm that AIR performs as good as the best scale-up SPEs on a single-node setup, while it outperforms existing scale-out DSPEs in terms of processing latency and sustainable throughput by a factor of up to 15 in a distributed setting

  • Asynchronous Stream Data Processing using a Light-Weight and High-Performance Dataflow Engine
    2020
    Co-Authors: Ellampallil Venugopal Vinu, Theobald Martin
    Abstract:

    Processing high-throughput data-streams has become a major challenge in areas such as real-time event monitoring, complex dataflow processing, and big data analytics. While there has been tremendous progress in distributed stream processing systems in the past few years, the high-throughput and low-latency (a.k.a. high sustainable-throughput) requirement of modern applications is pushing the limits of traditional data processing infrastructures. This paper introduces a new distributed stream data processing engine (DSPE), called “Asynchronous Iterative Routing” or simply AIR, which implements a light-weight, dynamic sharding protocol. AIR expedites a direct and asynchronous communication among all the worker nodes via multiple Message Passing Interface (MPI) communication channels and thereby completely avoids any additional communication overhead with a dedicated master node. With its unique design, AIR scales out to clusters consisting of up to 8 nodes and 224 cores, performing much better than existing DSPEs, and it performs up to 15 times better than Spark and Flink in terms of sustainable-throughput