Serial Solution

The Experts below are selected from a list of 96 Experts worldwide ranked by ideXlab platform

Hong Tian - One of the best experts on this subject based on the ideXlab platform.

  • A parallel Solution to finding nodal neighbors in generic meshes
    MethodsX, 2020
    Co-Authors: Gang Mei, Hong Tian
    Abstract:

    In this paper we present a parallel Solution for finding the one-ring neighboring nodes and elements of each vertex in generic meshes. Finding nodal neighbors is computationally straightforward but expensive for large meshes. To improve efficiency, the computation is parallelized on a modern Graphics Processing Unit (GPU). The presented parallel Solution relies heavily on parallel sorting, scan, and reduction. It is efficient and easy to implement, but requires the allocation of a large amount of device memory.

    • Our parallel Solution achieves speedups of approximately 55 and 90 over the Serial Solution when finding the neighboring nodes and elements, respectively.
    • It is easy to implement because it does not need to perform mesh coloring before finding neighbors.
    • There are no complex data structures; only integer arrays are needed, which makes our parallel Solution very effective.

  • A Parallel Solution to Finding Nodal Neighbors in Generic Meshes
    arXiv: Computational Geometry, 2016
    Co-Authors: Gang Mei, Hong Tian
    Abstract:

    In this paper we present a parallel Solution for finding the one-ring neighboring nodes and elements of each vertex in generic meshes. Finding nodal neighbors is computationally straightforward but expensive for large meshes. To improve efficiency, the computation is parallelized on a modern Graphics Processing Unit (GPU). The presented parallel Solution relies heavily on parallel sorting, scan, and reduction, and can be applied to determine both the neighboring nodes and the neighboring elements. To evaluate its performance, the parallel Solution is compared with the corresponding Serial Solution. Experimental results show that the parallel Solution achieves speedups of approximately 55 and 90 over the corresponding Serial Solution for finding neighboring nodes and elements, respectively. The parallel Solution is efficient and easy to implement, but requires the allocation of a large amount of device memory.
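
    The sort, scan, and reduction steps named in the abstract can be illustrated with a small serial NumPy sketch. This is not the authors' GPU implementation: the function name `nodal_neighbors`, the packed-key encoding, and the CSR-style output are assumptions made for illustration, and elements are assumed to be polygons whose consecutive nodes share an edge (for example triangles or quadrilaterals).

```python
import numpy as np

def nodal_neighbors(elements):
    """Return CSR-style (offsets, neighbors) of one-ring neighboring nodes.

    elements: (n_elems, k) integer array; row i lists the nodes of element i
    in cyclic order (assumption: consecutive nodes share an edge).
    """
    elements = np.asarray(elements, dtype=np.int64)
    n_nodes = int(elements.max()) + 1

    # Emit both directions of every element edge as (node, neighbor) pairs.
    src = elements.ravel()
    dst = np.roll(elements, -1, axis=1).ravel()
    owner = np.concatenate([src, dst])
    other = np.concatenate([dst, src])

    # Sort: pack each pair into one integer key so equal pairs become adjacent.
    keys = np.sort(owner * n_nodes + other)

    # Drop duplicates (edges shared by two elements) by comparing neighbors.
    keep = np.ones(keys.size, dtype=bool)
    keep[1:] = keys[1:] != keys[:-1]
    keys = keys[keep]
    owner = keys // n_nodes
    neighbors = keys % n_nodes

    # Reduction: count neighbors per node. Scan: prefix sum gives the offsets.
    counts = np.bincount(owner, minlength=n_nodes)
    offsets = np.concatenate([[0], np.cumsum(counts)])
    return offsets, neighbors


if __name__ == "__main__":
    # Two triangles sharing the edge (1, 2).
    offsets, neighbors = nodal_neighbors([[0, 1, 2], [1, 3, 2]])
    for node in range(len(offsets) - 1):
        print(node, neighbors[offsets[node]:offsets[node + 1]])
```

    The neighbors of node `i` are `neighbors[offsets[i]:offsets[i + 1]]`. Only flat integer arrays are used, which matches the abstract's remark that no complex data structures are needed; on a GPU each of the three steps would map to a parallel primitive.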

Hessel P Idzenga - One of the best experts on this subject based on the ideXlab platform.

Gang Mei - One of the best experts on this subject based on the ideXlab platform.

  • A parallel Solution to finding nodal neighbors in generic meshes
    MethodsX, 2020
    Co-Authors: Gang Mei, Hong Tian
    Abstract:

    In this paper we present a parallel Solution for finding the one-ring neighboring nodes and elements of each vertex in generic meshes. Finding nodal neighbors is computationally straightforward but expensive for large meshes. To improve efficiency, the computation is parallelized on a modern Graphics Processing Unit (GPU). The presented parallel Solution relies heavily on parallel sorting, scan, and reduction. It is efficient and easy to implement, but requires the allocation of a large amount of device memory.

    • Our parallel Solution achieves speedups of approximately 55 and 90 over the Serial Solution when finding the neighboring nodes and elements, respectively.
    • It is easy to implement because it does not need to perform mesh coloring before finding neighbors.
    • There are no complex data structures; only integer arrays are needed, which makes our parallel Solution very effective.

  • Efficient Parallel Algorithms for 3D Laplacian Smoothing on the GPU
    Applied Sciences, 2019
    Co-Authors: Lei Xiao, Guoxiang Yang, Kunyang Zhao, Gang Mei
    Abstract:

    In numerical modeling, mesh quality is one of the decisive factors that strongly affect the accuracy of calculations and the convergence of iterations. To improve mesh quality, the Laplacian mesh smoothing method, which repositions each node to the barycenter of its adjacent nodes without changing the mesh topology, has been widely used. However, smoothing a large-scale three-dimensional mesh is computationally expensive, and few studies have focused on accelerating Laplacian mesh smoothing with the graphics processing unit (GPU). This paper presents a GPU-accelerated parallel algorithm for Laplacian smoothing in three dimensions that considers the influence of different data layouts and iteration forms. To evaluate the efficiency of the GPU implementation, the parallel Solution is compared with the original Serial Solution. Experimental results show that the parallel implementation is up to 46 times faster than the Serial version.
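
    The barycenter update described in the abstract can be sketched serially in NumPy as follows. This is an illustrative sketch rather than the paper's GPU algorithm: the function name `laplacian_smooth`, the CSR-style `offsets`/`neighbors` adjacency, the `fixed` argument, and the choice of a Jacobi-style update (all new positions computed from the previous iteration) are assumptions.

```python
import numpy as np

def laplacian_smooth(points, offsets, neighbors, fixed=(), iterations=1):
    """Move each free node to the barycenter of its adjacent nodes.

    points:    (n, 3) node coordinates
    offsets:   (n + 1,) CSR offsets into `neighbors`
    neighbors: flat array of adjacent node indices
    fixed:     indices of nodes that must not move (e.g. boundary nodes)
    """
    pts = np.asarray(points, dtype=float).copy()
    offsets = np.asarray(offsets, dtype=np.int64)
    neighbors = np.asarray(neighbors, dtype=np.int64)

    counts = np.diff(offsets)                      # neighbors per node
    owner = np.repeat(np.arange(len(pts)), counts) # node owning each entry

    movable = counts > 0                           # isolated nodes stay put
    movable[np.asarray(fixed, dtype=np.int64)] = False

    for _ in range(iterations):
        # Sum neighbor coordinates per node, reading only the old positions.
        sums = np.zeros_like(pts)
        np.add.at(sums, owner, pts[neighbors])
        new_pts = pts.copy()
        new_pts[movable] = sums[movable] / counts[movable][:, None]
        pts = new_pts
    return pts


if __name__ == "__main__":
    # Four fixed corners of a unit square and one off-center interior node.
    points = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [0.3, 0.7, 0]]
    offsets = [0, 0, 0, 0, 0, 4]   # only node 4 lists neighbors here
    neighbors = [0, 1, 2, 3]
    print(laplacian_smooth(points, offsets, neighbors)[4])
```

    Because every node reads only the previous iteration's positions, the per-node updates are independent of each other, which is the property a parallel implementation can exploit. The mesh connectivity is never modified.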

  • A Parallel Solution to Finding Nodal Neighbors in Generic Meshes
    arXiv: Computational Geometry, 2016
    Co-Authors: Gang Mei, Hong Tian
    Abstract:

    In this paper we present a parallel Solution for finding the one-ring neighboring nodes and elements of each vertex in generic meshes. Finding nodal neighbors is computationally straightforward but expensive for large meshes. To improve efficiency, the computation is parallelized on a modern Graphics Processing Unit (GPU). The presented parallel Solution relies heavily on parallel sorting, scan, and reduction, and can be applied to determine both the neighboring nodes and the neighboring elements. To evaluate its performance, the parallel Solution is compared with the corresponding Serial Solution. Experimental results show that the parallel Solution achieves speedups of approximately 55 and 90 over the corresponding Serial Solution for finding neighboring nodes and elements, respectively. The parallel Solution is efficient and easy to implement, but requires the allocation of a large amount of device memory.

H. C. Kamath - One of the best experts on this subject based on the ideXlab platform.

  • Estimation of multiple heat-flux components at the metal/mold interface in bar and plate aluminum alloy castings
    Metallurgical and Materials Transactions B: Process Metallurgy and Materials Processing Science, 2004
    Co-Authors: T. S. Prasanna Kumar, H. C. Kamath
    Abstract:

    In the present investigation, a Serial Solution of the inverse heat-conduction problem (IHCP) is extended for Al-3 pct Cu-4.5 pct Si alloy square bars and rectangular plates cast in metal molds. The metal/mold interface was divided into three segments, viz., the half face, the quarter face, and the corner. The heat-flux transients during casting solidification were then estimated at these segments. Three configurations were considered, viz., (1) one boundary segment for the whole length on the interface, (2) two boundary segments delineating two heat-flux components, and (3) three boundary segments leading to three heat-flux components. In order to identify the most acceptable spatial distribution of interface heat flux, two types of analyses were performed on the results of the IHCP, viz., convergence of absolute error in the computed and the measured temperatures at the sensor locations and total heat energy transferred across the boundary from the casting to the mold. The error convergence at the thermocouple locations was more or less identical for all the three cases in both bars and plates. However, the total heat absorbed by the mold, in the case of the one-segment model in bars and the three-segment model in plates, was found to be a minimum. This indicated that the interface heat flux did not show any spatial distribution in the case of bars, while a distinct spatial distribution could be identified in the case of plates. The individual heat fluxes at the different interface segments for the plate casting showed a peak within 3 to 3.5 seconds of pouring, after which it reduced and reached stable values in about 200 seconds. The maximum heat flux occurred at the corner segment. The analysis of heat-flux gradients showed that the air gap initiated at the corner and spread toward the center.

  • Estimation of multiple heat-flux components at the metal/mold interface in bar and plate aluminum alloy castings
    Metallurgical and Materials Transactions B, 2004
    Co-Authors: T. S. Prasanna Kumar, H. C. Kamath
    Abstract:

    In the present investigation, a Serial Solution of the inverse heat-conduction problem (IHCP) is extended for Al-3 pct Cu-4.5 pct Si alloy square bars and rectangular plates cast in metal molds. The metal/mold interface was divided into three segments, viz., the half face, the quarter face, and the corner. The heat-flux transients during casting solidification were then estimated at these segments. Three configurations were considered, viz., (1) one boundary segment for the whole length on the interface, (2) two boundary segments delineating two heat-flux components, and (3) three boundary segments leading to three heat-flux components. In order to identify the most acceptable spatial distribution of interface heat flux, two types of analyses were performed on the results of the IHCP, viz., convergence of absolute error in the computed and the measured temperatures at the sensor locations and total heat energy transferred across the boundary from the casting to the mold. The error convergence at the thermocouple locations was more or less identical for all the three cases in both bars and plates. However, the total heat absorbed by the mold, in the case of the one-segment model in bars and the three-segment model in plates, was found to be a minimum. This indicated that the interface heat flux did not show any spatial distribution in the case of bars, while a distinct spatial distribution could be identified in the case of plates. The individual heat fluxes at the different interface segments for the plate casting showed a peak within 3 to 3.5 seconds of pouring, after which it reduced and reached stable values in about 200 seconds. The maximum heat flux occurred at the corner segment. The analysis of heat-flux gradients showed that the air gap initiated at the corner and spread toward the center.

Miriam Furst - One of the best experts on this subject based on the ideXlab platform.

  • Fast evaluation of a time-domain non-linear cochlear model on GPUs
    Journal of Computational Physics, 2014
    Co-Authors: Doron Sabo, Shlomo Weiss, Oded Barzelay, Miriam Furst
    Abstract:

    We present a parallel algorithm that solves a time-domain non-linear mathematical model of the cochlea. The previously known Serial Solution of the cochlear model is based on LU decomposition in the longitudinal dimension and is iterative in the time dimension. These two characteristics of the Serial Solution limit parallelism and prevent efficient computations on a massively parallel processor. We introduce a novel parallel algorithm that successfully overcomes the challenges posed by the cochlear model. We present performance results of a parallel implementation of the algorithm that shortens the computation time by a typical factor of 160-180, which makes the proposed algorithm of practical value for applications such as clinical audiological diagnosis.
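
    To illustrate why an LU-based solve limits parallelism, the sketch below shows the Thomas algorithm, which is LU decomposition specialized to a tridiagonal system. It is a generic illustration, not the cochlear model itself: the abstract does not state the structure of the longitudinal system, so the tridiagonal form, the function name `thomas_solve`, and the array layout are all assumptions.

```python
import numpy as np

def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system A x = rhs by LU-style elimination.

    All inputs have length n. lower[i] is A[i, i-1] (lower[0] is unused),
    diag[i] is A[i, i], and upper[i] is A[i, i+1] (upper[n-1] is unused).
    No pivoting is performed (assumption: A is diagonally dominant).
    """
    n = len(diag)
    c = np.zeros(n)  # modified upper diagonal
    d = np.zeros(n)  # modified right-hand side

    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: step i needs the result of step i - 1.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    x = np.zeros(n)
    x[-1] = d[-1]
    # Back substitution: step i needs the result of step i + 1.
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    n = 6
    lower = np.full(n, -1.0)
    diag = np.full(n, 2.0)
    upper = np.full(n, -1.0)
    rhs = np.arange(1.0, n + 1)
    print(thomas_solve(lower, diag, upper, rhs))
```

    Both loops carry a dependency from one index to the next, so neither sweep can be distributed across threads as written. An outer time-stepping loop that needs the previous step's result adds a second sequential dimension, which is the kind of obstacle the abstract attributes to the Serial Solution.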

  • ICCS - A Parallel Algorithm for a Physiological Non-linear Model of the Cochlea
    Procedia Computer Science, 2013
    Co-Authors: Doron Sabo, Shlomo Weiss, Miriam Furst
    Abstract:

    We present a parallel algorithm that solves a time-domain non-linear mathematical model of the cochlea. The previously known Serial Solution of the cochlear model is recursive in the longitudinal dimension and iterative in the time dimension. These two characteristics of the Serial Solution limit parallelism and prevent efficient computations on a massively parallel processor. We introduce a novel parallel algorithm that successfully overcomes the challenges posed by the cochlear model. We present performance results of a parallel implementation of the algorithm that shortens the computation time by a typical factor of 160-180, which makes the proposed algorithm of practical value for applications such as clinical audiological diagnosis.