The experts below are selected from a list of 96 experts worldwide ranked by the ideXlab platform.
Hong Tian - One of the best experts on this subject based on the ideXlab platform.
- A parallel solution to finding nodal neighbors in generic meshes
MethodsX, 2020
Co-Authors: Gang Mei, Hong Tian
Abstract: In this paper we present a parallel solution to finding the one-ring neighboring nodes and elements for each vertex in generic meshes. Finding nodal neighbors is computationally straightforward but expensive for large meshes. To improve efficiency, parallelism is adopted by using a modern graphics processing unit (GPU). The presented parallel solution depends heavily on parallel sorting, scan, and reduction. Our parallel solution is efficient and easy to implement, but requires the allocation of large device memory.
• Our parallel solution achieves speedups of approximately 55 and 90 over the serial solution when finding the neighboring nodes and elements, respectively.
• It is easy to implement because it does not need to perform mesh coloring before finding neighbors.
• There are no complex data structures; only integer arrays are needed, which makes our parallel solution very effective.
- A Parallel Solution to Finding Nodal Neighbors in Generic Meshes
arXiv: Computational Geometry, 2016
Co-Authors: Gang Mei, Hong Tian
Abstract: In this paper we present a parallel solution to finding the one-ring neighboring nodes and elements for each vertex in generic meshes. Finding nodal neighbors is computationally straightforward but expensive for large meshes. To improve efficiency, parallelism is adopted by using a modern graphics processing unit (GPU). The presented parallel solution depends heavily on parallel sorting, scan, and reduction, and can be applied to determine both the neighboring nodes and the neighboring elements. To evaluate the performance, the parallel solution is compared to the corresponding serial solution. Experimental results show that our parallel solution achieves speedups of approximately 55 and 90 over the corresponding serial solution for finding neighboring nodes and elements, respectively. Our parallel solution is efficient and easy to implement, but requires the allocation of large device memory.
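The abstracts above name sorting, scan, and reduction as the building blocks for finding one-ring neighbors, using only integer arrays. The following is a minimal CPU sketch of that idea in NumPy, not the authors' GPU implementation. The function names `node_neighbors` and `node_elements` are hypothetical, and the node-to-node version assumes every pair of nodes in an element is connected (true for triangles and tetrahedra, not for quadrilaterals or hexahedra).

```python
import numpy as np


def node_elements(elements):
    """Map each node to the elements that contain it (integer arrays only)."""
    elements = np.asarray(elements)
    m, k = elements.shape
    node_ids = elements.ravel()              # node of every (element, corner) slot
    elem_ids = np.repeat(np.arange(m), k)    # owning element of every slot
    order = np.argsort(node_ids, kind="stable")            # sort by node id
    nodes, counts = np.unique(node_ids[order], return_counts=True)  # reduction
    offsets = np.concatenate(([0], np.cumsum(counts)[:-1]))         # exclusive scan
    groups = np.split(elem_ids[order], offsets[1:])
    return {int(n): g for n, g in zip(nodes, groups)}


def node_neighbors(elements):
    """Map each node to its one-ring neighboring nodes.

    Assumption: all nodes of one element are mutually adjacent.
    """
    elements = np.asarray(elements)
    k = elements.shape[1]
    i, j = np.where(~np.eye(k, dtype=bool))  # all ordered corner pairs, i != j
    pairs = np.stack([elements[:, i].ravel(), elements[:, j].ravel()], axis=1)
    pairs = np.unique(pairs, axis=0)         # sort pairs and drop duplicates
    nodes, counts = np.unique(pairs[:, 0], return_counts=True)
    offsets = np.concatenate(([0], np.cumsum(counts)[:-1]))
    groups = np.split(pairs[:, 1], offsets[1:])
    return {int(n): g for n, g in zip(nodes, groups)}


# Two triangles sharing the edge (1, 2).
mesh = [[0, 1, 2], [1, 2, 3]]
neighbors = node_neighbors(mesh)
incident = node_elements(mesh)
```

On a GPU, the same three steps would typically map to library primitives for sort, scan, and reduce-by-key; the sketch only illustrates why no mesh coloring or pointer-based structure is required.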
Hessel P Idzenga - One of the best experts on this subject based on the ideXlab platform.
- Structural Decomposition and Serial Solution of SPN Models of the ATM GAUSS Switch
Lecture Notes in Computer Science, 1999
Co-Authors: Boudewijn R Haverkort, Hessel P Idzenga
Abstract: We address the performance, in particular the cell loss ratio, of the ATM GAUSS switch under a variety of realistic video and constant bit rate traffic patterns.
- Structural Decomposition and Serial Solution of SPN Models of the ATM GAUSS Switch
Application of Petri Nets to Communication Networks, 1999
Co-Authors: Boudewijn R Haverkort, Hessel P Idzenga
Abstract: We address the performance, in particular the cell loss ratio, of the ATM GAUSS switch under a variety of realistic video and constant bit rate traffic patterns.
Gang Mei - One of the best experts on this subject based on the ideXlab platform.
- A parallel solution to finding nodal neighbors in generic meshes
MethodsX, 2020
Co-Authors: Gang Mei, Hong Tian
Abstract: see the MethodsX entry under Hong Tian above.
- Efficient Parallel Algorithms for 3D Laplacian Smoothing on the GPU
Applied Sciences, 2019
Co-Authors: Lei Xiao, Guoxiang Yang, Kunyang Zhao, Gang Mei
Abstract: In numerical modeling, mesh quality is one of the decisive factors that strongly affect the accuracy of calculations and the convergence of iterations. To improve mesh quality, the Laplacian mesh smoothing method, which repositions nodes to the barycenter of adjacent nodes without changing the mesh topology, has been widely used. However, smoothing a large-scale three-dimensional mesh is computationally expensive, and few studies have focused on accelerating the Laplacian mesh smoothing method by using the graphics processing unit (GPU). This paper presents a GPU-accelerated parallel algorithm for Laplacian smoothing in three dimensions by considering the influence of different data layouts and iteration forms. To evaluate the efficiency of the GPU implementation, the parallel solution is compared with the original serial solution. Experimental results show that our parallel implementation is up to 46 times faster than the serial version.
- A Parallel Solution to Finding Nodal Neighbors in Generic Meshes
arXiv: Computational Geometry, 2016
Co-Authors: Gang Mei, Hong Tian
Abstract: see the arXiv entry under Hong Tian above.
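The Applied Sciences abstract in this list describes Laplacian smoothing as moving each node to the barycenter of its adjacent nodes while leaving the topology unchanged. A minimal serial sketch of one such update is given below. The function name `laplacian_smooth`, the edge-list input, and the `fixed` parameter are assumptions made for illustration, and the update shown reads only the previous iteration's positions, which is one of several possible iteration forms and not necessarily the one used in the paper.

```python
import numpy as np


def laplacian_smooth(points, edges, fixed=(), iterations=1):
    """Move each free node to the barycenter of its adjacent nodes.

    points: (n, 3) node coordinates; edges: (e, 2) undirected node pairs;
    fixed: indices of nodes that must not move (for example boundary nodes).
    """
    points = np.asarray(points, dtype=float).copy()
    edges = np.asarray(edges)
    n = len(points)
    # Store every undirected edge in both directions.
    src = np.concatenate([edges[:, 0], edges[:, 1]])
    dst = np.concatenate([edges[:, 1], edges[:, 0]])
    degree = np.bincount(src, minlength=n)
    movable = degree > 0
    movable[np.asarray(list(fixed), dtype=int)] = False
    for _ in range(iterations):
        sums = np.zeros_like(points)
        np.add.at(sums, src, points[dst])    # sum of neighbor coordinates
        updated = points.copy()
        updated[movable] = sums[movable] / degree[movable, None]
        points = updated                     # all nodes updated from old positions
    return points


# Four fixed outer nodes and one interior node connected to all of them.
pts = [[0, 0, 0], [2, 0, 0], [2, 2, 0], [0, 2, 4], [0.5, 1.5, 3.0]]
edg = [[4, 0], [4, 1], [4, 2], [4, 3]]
smoothed = laplacian_smooth(pts, edg, fixed=[0, 1, 2, 3])
```

Because each node's new position depends only on the previous positions of its neighbors, the per-node updates are independent, which is what makes this form a natural candidate for a GPU.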
H. C. Kamath - One of the best experts on this subject based on the ideXlab platform.
- Estimation of multiple heat-flux components at the metal/mold interface in bar and plate aluminum alloy castings
Metallurgical and Materials Transactions B: Process Metallurgy and Materials Processing Science, 2004
Co-Authors: T. S. Prasanna Kumar, H. C. Kamath
Abstract: In the present investigation, a serial solution of the inverse heat-conduction problem (IHCP) is extended for Al-3 pct Cu-4.5 pct Si alloy square bars and rectangular plates cast in metal molds. The metal/mold interface was divided into three segments, viz., the half face, the quarter face, and the corner. The heat-flux transients during casting solidification were then estimated at these segments. Three configurations were considered, viz., (1) one boundary segment for the whole length on the interface, (2) two boundary segments delineating two heat-flux components, and (3) three boundary segments leading to three heat-flux components. In order to identify the most acceptable spatial distribution of interface heat flux, two types of analyses were performed on the results of the IHCP, viz., convergence of the absolute error between the computed and the measured temperatures at the sensor locations, and total heat energy transferred across the boundary from the casting to the mold. The error convergence at the thermocouple locations was nearly identical for all three cases in both bars and plates. However, the total heat absorbed by the mold was found to be a minimum for the one-segment model in bars and for the three-segment model in plates. This indicated that the interface heat flux did not show any spatial distribution in the case of bars, while a distinct spatial distribution could be identified in the case of plates. The individual heat fluxes at the different interface segments for the plate casting showed a peak within 3 to 3.5 seconds of pouring, after which they decreased and reached stable values in about 200 seconds. The maximum heat flux occurred at the corner segment. The analysis of heat-flux gradients showed that the air gap initiated at the corner and spread toward the center.
- Estimation of multiple heat-flux components at the metal/mold interface in bar and plate aluminum alloy castings
Metallurgical and Materials Transactions B, 2004
Co-Authors: T. S. Prasanna Kumar, H. C. Kamath
Abstract: identical to the abstract of the preceding entry.
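The abstract above compares segment configurations using two quantities: the absolute error between computed and measured sensor temperatures, and the total heat transferred across the interface. A rough sketch of how such quantities could be evaluated from sampled data follows. The function names, the trapezoidal integration, and the sample values are illustrative assumptions; none of them come from the paper.

```python
import numpy as np


def total_heat(times, flux, area):
    """Heat (J) through one segment: area * integral of q(t) dt (trapezoidal rule).

    times in s, flux in W/m^2, area in m^2.
    """
    times = np.asarray(times, dtype=float)
    flux = np.asarray(flux, dtype=float)
    return area * np.sum(0.5 * (flux[1:] + flux[:-1]) * np.diff(times))


def mean_abs_error(computed, measured):
    """Mean absolute difference between computed and measured temperatures."""
    computed = np.asarray(computed, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean(np.abs(computed - measured)))


# Made-up numbers: a constant 1000 W/m^2 flux for 10 s over 0.01 m^2.
t = np.linspace(0.0, 10.0, 11)
q = np.full_like(t, 1000.0)
heat = total_heat(t, q, area=0.01)
err = mean_abs_error([500.0, 480.0], [498.0, 483.0])
```

With several segments, the total for a configuration would be the sum of `total_heat` over its segments, each with its own flux history and area.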
Miriam Furst - One of the best experts on this subject based on the ideXlab platform.
- Fast evaluation of a time-domain non-linear cochlear model on GPUs
Journal of Computational Physics, 2014
Co-Authors: Doron Sabo, Shlomo Weiss, Oded Barzelay, Miriam Furst
Abstract: We present a parallel algorithm that solves a time-domain non-linear mathematical model of the cochlea. The previously known serial solution of the cochlear model is based on LU decomposition in the longitudinal dimension and is iterative in the time dimension. These two characteristics of the serial solution limit parallelism and prevent efficient computations on a massively parallel processor. We introduce a novel parallel algorithm that successfully overcomes the challenges posed by the cochlear model. We present performance results of a parallel implementation of the algorithm that shortens the computation time by a typical factor of 160-180, which makes the proposed algorithm of practical value for applications such as clinical audiological diagnosis.
- A Parallel Algorithm for a Physiological Non-linear Model of the Cochlea
Procedia Computer Science (ICCS), 2013
Co-Authors: Doron Sabo, Shlomo Weiss, Miriam Furst
Abstract: We present a parallel algorithm that solves a time-domain non-linear mathematical model of the cochlea. The previously known serial solution of the cochlear model is recursive in the longitudinal dimension and iterative in the time dimension. These two characteristics of the serial solution limit parallelism and prevent efficient computations on a massively parallel processor. We introduce a novel parallel algorithm that successfully overcomes the challenges posed by the cochlear model. We present performance results of a parallel implementation of the algorithm that shortens the computation time by a typical factor of 160 to 180, which makes the proposed algorithm of practical value for applications such as clinical audiological diagnosis.
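The two abstracts above describe the serial cochlear solver as based on LU decomposition, and recursive, along the longitudinal dimension. As an illustration of why such a solve resists parallelization, the sketch below shows the Thomas algorithm, which is LU decomposition specialized to a tridiagonal system. That the cochlear system is tridiagonal is an assumption here, since the abstracts do not say so, and `thomas_solve` is a hypothetical name.

```python
import numpy as np


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system; all inputs have length n.

    lower[i] is the sub-diagonal entry of row i (lower[0] is unused) and
    upper[i] is the super-diagonal entry of row i (upper[n-1] is unused).
    """
    lower = np.asarray(lower, dtype=float)
    diag = np.asarray(diag, dtype=float)
    upper = np.asarray(upper, dtype=float)
    rhs = np.asarray(rhs, dtype=float)
    n = len(diag)
    c = np.zeros(n)
    d = np.zeros(n)
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: step i needs the result of step i - 1.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: step i needs the result of step i + 1.
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Diagonally dominant test system, so no pivoting is needed.
n = 5
lo = np.full(n, -1.0)
di = np.full(n, 4.0)
up = np.full(n, -1.0)
b = np.arange(1.0, n + 1.0)
x = thomas_solve(lo, di, up, b)
```

Both loops carry a dependency from one index to the next, so neither can be split across threads as written. This kind of sequential chain is the limitation the abstracts attribute to the serial solution.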