Bandwidth Memory

The Experts below are selected from a list of 37125 Experts worldwide ranked by ideXlab platform

Sungjoo Hong - One of the best experts on this subject based on the ideXlab platform.

Alejandro Duran - One of the best experts on this subject based on the ideXlab platform.

  • Effective use of large high-Bandwidth Memory caches in HPC stencil computation via temporal wave-front tiling
    IEEE International Conference on High Performance Computing Data and Analytics, 2016
    Co-Authors: Charles R Yount, Alejandro Duran
    Abstract:

    Stencil computation is an important class of algorithms used in a large variety of scientific-simulation applications. The performance of stencil calculations is often bounded by Memory Bandwidth. High-Bandwidth Memory (HBM) on devices such as those in the Intel® Xeon Phi™ x200 processor family (code-named Knights Landing) can thus provide additional performance. In a traditional sequential time-step approach, the additional Bandwidth can be best utilized when the stencil data fits into the HBM, restricting the problem sizes that can be undertaken and under-utilizing the larger DDR Memory on the platform. As problem sizes become significantly larger than the HBM, the effective Bandwidth approaches that of the DDR, degrading performance. This paper explores the use of temporal wave-front tiling to add an additional layer of cache-blocking to allow efficient use of both the HBM Bandwidth and the DDR capacity. Details of the cache-blocking and wave-front tiling algorithms are given, and results on a Xeon Phi processor are presented, comparing performance across problem sizes and among four experimental configurations. Analyses of the Bandwidth utilization and HBM-cache hit rates are also provided, illustrating the correlation between these metrics and performance. It is demonstrated that temporal wave-front tiling can provide a 2.4x speedup compared to using HBM cache without temporal tiling and a 3.3x speedup compared to only using DDR Memory for large problem sizes.
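
The idea of blocking in time as well as in space can be illustrated with a much smaller case than the one in the paper. The sketch below is not the authors' algorithm: it is a minimal 1-D, three-point averaging stencil with fixed end points, using a simple time-skewed tiling. The names `naive_sweep`, `tiled_sweep`, `tile` and `t_block` are hypothetical and chosen only for this illustration. Each spatial tile is advanced through several time steps before the next tile is touched, and the tile is shifted left by one point per step so that every value it reads has already been computed.

```python
def naive_sweep(u0, steps):
    """Reference version: update the whole grid once per time step."""
    n = len(u0)
    src, dst = list(u0), list(u0)
    for _ in range(steps):
        for i in range(1, n - 1):
            dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
        src, dst = dst, src
    return src


def tiled_sweep(u0, steps, tile, t_block):
    """Time-skewed tiling (illustrative assumption, not the paper's code).

    Each spatial tile is advanced through up to t_block time steps before
    moving to the next tile, so its data can stay in fast memory.
    """
    n = len(u0)
    # Two ping-pong buffers; both carry the fixed end points.
    bufs = [list(u0), list(u0)]
    for t0 in range(0, steps, t_block):
        nt = min(t_block, steps - t0)
        for s in range(1, n - 1, tile):
            e = min(s + tile, n - 1)
            for t in range(1, nt + 1):
                shift = t - 1  # stencil radius is 1, so skew by 1 per step
                lo = max(1, s - shift)
                # The last tile keeps its right edge at the domain boundary.
                hi = n - 1 if e == n - 1 else max(1, e - shift)
                src = bufs[(t0 + t - 1) % 2]
                dst = bufs[(t0 + t) % 2]
                for i in range(lo, hi):
                    dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0
    return bufs[steps % 2]


if __name__ == "__main__":
    grid = [float((7 * i) % 11) for i in range(40)]
    same = tiled_sweep(grid, steps=9, tile=8, t_block=4) == naive_sweep(grid, 9)
    print("tiled result matches naive result:", same)
```

Both versions perform the same arithmetic on each point, so they should agree exactly; only the order of traversal differs. In a real implementation the benefit would come from choosing the tile and time-block sizes so that a tile's working set fits in the faster Memory level.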

  • PMBS@SC - Effective use of large high-Bandwidth Memory caches in HPC stencil computation via temporal wave-front tiling
    2016
    Co-Authors: Charles R Yount, Alejandro Duran
    Abstract:

    Stencil computation is an important class of algorithms used in a large variety of scientific-simulation applications. The performance of stencil calculations is often bounded by Memory Bandwidth. High-Bandwidth Memory (HBM) on devices such as those in the Intel® Xeon Phi™ x200 processor family (code-named Knights Landing) can thus provide additional performance. In a traditional sequential time-step approach, the additional Bandwidth can be best utilized when the stencil data fits into the HBM, restricting the problem sizes that can be undertaken and under-utilizing the larger DDR Memory on the platform. As problem sizes become significantly larger than the HBM, the effective Bandwidth approaches that of the DDR, degrading performance. This paper explores the use of temporal wave-front tiling to add an additional layer of cache-blocking to allow efficient use of both the HBM Bandwidth and the DDR capacity. Details of the cache-blocking and wave-front tiling algorithms are given, and results on a Xeon Phi processor are presented, comparing performance across problem sizes and among four experimental configurations. Analyses of the Bandwidth utilization and HBM-cache hit rates are also provided, illustrating the correlation between these metrics and performance. It is demonstrated that temporal wave-front tiling can provide a 2.4x speedup compared to using HBM cache without temporal tiling and a 3.3x speedup compared to only using DDR Memory for large problem sizes.

Sang Jin Byeon - One of the best experts on this subject based on the ideXlab platform.

Jun Hyun Chun - One of the best experts on this subject based on the ideXlab platform.

  • High Bandwidth Memory (HBM) with TSV technique
    International SoC Design Conference, 2016
    Co-Authors: Young Jun Ku, Jun Hyun Chun, Ki Hun Kwon, Chunseok Jeong, Sangmuk Oh, Young Jae Choi, Jonghoon Oh
    Abstract:

    This paper introduces HBM DRAM built with the TSV technique. It covers general TSV features and techniques, including TSV architecture, TSV reliability, TSV open/short testing, and TSV repair. HBM DRAM, a representative DRAM product that uses TSVs, is then presented in detail, with emphasis on its use and features.

  • An exact measurement and repair circuit of TSV connections for 128GB/s high-Bandwidth Memory (HBM) stacked DRAM
    Symposium on VLSI Circuits, 2014
    Co-Authors: Sang Jin Byeon, Jun Hyun Chun, Sungjoo Hong
    Abstract:

    For the heterogeneous-structured high Bandwidth Memory (HBM) DRAM, it is important to guarantee the reliability of TSV connections. An exact TSV current scan and repair method is proposed that uses an approach similar to correlated double sampling. The register-based pre-repair method improves testability. Measurement results for thousands of TSVs show an impedance distribution under 0.1 ohm. The methods are integrated in an 8Gb HBM stacked DRAM using a 29nm process.

  • VLSIC - An exact measurement and repair circuit of TSV connections for 128GB/s high-Bandwidth Memory(HBM) stacked DRAM
    2014 Symposium on VLSI Circuits Digest of Technical Papers, 2014
    Co-Authors: Dong Uk Lee, Jinhee Cho, Kang-seol Lee, Hanho Jin, Sang Jin Byeon, Kyung Whan Kim, Kwan Weon Kim, Sang Kyun Nam, Jae-jin Lee, Jun Hyun Chun
    Abstract:

    For the heterogeneous-structured high Bandwidth Memory (HBM) DRAM, it is important to guarantee the reliability of TSV connections. An exact TSV current scan and repair method is proposed that uses an approach similar to correlated double sampling. The register-based pre-repair method improves testability. Measurement results for thousands of TSVs show an impedance distribution under 0.1 ohm. The methods are integrated in an 8Gb HBM stacked DRAM using a 29nm process.

Joungho Kim - One of the best experts on this subject based on the ideXlab platform.

  • Processing-in-Memory in High Bandwidth Memory (PIM-HBM) Architecture with Energy-efficient and Low Latency Channels for High Bandwidth System
    2019 IEEE 28th Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS), 2019
    Co-Authors: Seongguk Kim, Subin Kim, Kyungjun Cho, Taein Shin, Hyunwook Park, Daehwan Lho, Shinyoung Park, Kyungjune Son, Gapyeol Park, Joungho Kim
    Abstract:

    In this paper, for the first time, we propose a processing-in-Memory in high Bandwidth Memory (PIM-HBM) architecture for high Bandwidth systems with low dynamic random-access Memory (DRAM) access costs. The main concept of the proposed PIM-HBM architecture is to embed processing units into a logic base of high Bandwidth Memory (HBM) to decrease the energy consumption and latency of interconnections as the physical length between core and DRAM decreases. To verify the proposed PIM-HBM architecture, we designed on-chip and on-interposer I/O channels using a CMOS 0.18 µm process. We extracted channel parasitics using an electromagnetic (EM) solver and performed a SPICE simulation to compare the system performance of the proposed architecture with the conventional HBM. As a result, the proposed PIM-HBM architecture is successfully verified to reduce the energy consumption and latency of interconnections by 77% and 79%, respectively, compared to the conventional HBM system.

  • Signal Integrity Design and Analysis of Silicon Interposer for GPU Memory Channels in High Bandwidth Memory Interface
    IEEE Transactions on Components Packaging and Manufacturing Technology, 2018
    Co-Authors: Kyungjun Cho, Subin Kim, Hyunsuk Lee, Heegon Kim, Sumin Choi, Youngwoo Kim, Jinwook Song, Junyong Park, Seongsoo Lee, Joungho Kim
    Abstract:

    In this paper, for the first time, we designed and analyzed channels between a graphics processing unit and Memory in a silicon interposer for a 3-D stacked high Bandwidth Memory (HBM). We thoroughly analyzed and verified the electrical characteristics of the silicon interposer considering various design parameters, such as the channel width and space, redistribution layer via, and under bump metallurgy pads. In particular, we also considered the meshed ground planes used for the proposed transmission lines, which are microstrip and strip lines. Signal integrity (SI) of the proposed channels in the silicon interposer was successfully analyzed and verified using a full 3-D electromagnetic solver and circuit simulations. Based on the extracted lumped circuit resistance, inductance, conductance and capacitance parameters, we thoroughly analyzed the channel characteristics and identified the parameters that dominantly affect SI in relation to each frequency range. From the analyzed insertion loss and far end crosstalk, we verified SI of the silicon interposer by eye-diagram simulations in terms of eye-height voltage and timing jitter in the time domain. In the worst case, the eye-height voltage and timing jitter of the proposed microstrip lines are 0.911 V and 36.8 ps, respectively, with 72 mV of signal coupling. The eye-height voltage and timing jitter of the proposed strip line are 0.887 V and 42.1 ps, respectively, with 34 mV of signal coupling. We show that the proposed channels of the silicon interposer can successfully transfer data at a 2-Gb/s data rate. Finally, we propose concepts and solutions for the next-generation HBM interface with higher data rates up to 8 Gb/s.
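
To put the reported jitter figures in context, they can be expressed as a fraction of the unit interval (UI), the duration of one bit. The short sketch below assumes one bit per UI on each line (simple NRZ signalling), which the abstract does not state explicitly; the function names are hypothetical.

```python
def unit_interval_ps(data_rate_gbps):
    """Duration of one bit in picoseconds, assuming one bit per UI."""
    return 1000.0 / data_rate_gbps


def jitter_fraction_of_ui(jitter_ps, data_rate_gbps):
    """Timing jitter expressed as a fraction of the unit interval."""
    return jitter_ps / unit_interval_ps(data_rate_gbps)


if __name__ == "__main__":
    # Figures quoted in the abstract: 2 Gb/s, 36.8 ps and 42.1 ps.
    for label, jitter in (("microstrip", 36.8), ("strip line", 42.1)):
        frac = jitter_fraction_of_ui(jitter, 2.0)
        print(f"{label}: {frac:.2%} of a {unit_interval_ps(2.0):.0f} ps UI")
```

Under that assumption the UI at 2 Gb/s is 500 ps, and it shrinks to 125 ps at 8 Gb/s, so the same absolute jitter would take up four times as much of the eye.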

  • Design of an On-Silicon-Interposer Passive Equalizer for Next Generation High Bandwidth Memory With Data Rate Up To 8 Gb/s
    IEEE Transactions on Circuits and Systems I: Regular Papers, 2018
    Co-Authors: Yeseul Jeon, Heegon Kim, Joungho Kim
    Abstract:

    In this paper, we propose a new on-silicon-interposer passive equalizer for next generation high Bandwidth Memory (HBM) with 1024 I/O lines and 8-Gb/s data transmission, which is four times higher than the data rate of HBM generation 2. The proposed equalizer meets the three requirements for the implementation of an ultra-high Bandwidth interface with wide I/O lines: 1) small area; 2) fine pitch; and 3) low power. The proposed equalizer is embedded in a ground plane on an interposer to reduce additional area consumption. By staggering the equalizers in two rows, a 7-µm pitch of the channel can be maintained. The equalizer consumes only 8.24 mW at the data rate of 8 Gb/s since it adopts a passive equalization methodology. Robust performance that is independent of insertion location provides design flexibility. The proposed design process for the equalizer helps to reduce manufacturing time and cost. We have verified the performance of the proposed equalizer using simulation and measurement. By applying the proposed equalizer, the eye diagram, which was completely closed, is successfully opened with an eye height of 11.5% of $V_{\mathrm{TX,output}}$ and an eye width of 57.8% of the unit interval at a bit-error rate of $10^{-12}$.
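
The interface width and per-line data rate quoted above translate directly into a peak aggregate Bandwidth. The sketch below is a back-of-the-envelope calculation only: it ignores protocol overhead, and it assumes that the HBM generation 2 comparison point uses the same 1024 lines at one quarter of the data rate (2 Gb/s), which the abstract implies but does not spell out. The function name is hypothetical.

```python
def peak_bandwidth_gbytes_per_s(io_lines, gbps_per_line):
    """Peak aggregate bandwidth in GB/s: lines x Gb/s per line / 8 bits."""
    return io_lines * gbps_per_line / 8.0


if __name__ == "__main__":
    proposed = peak_bandwidth_gbytes_per_s(1024, 8.0)
    hbm2 = peak_bandwidth_gbytes_per_s(1024, 8.0 / 4)  # assumed 2 Gb/s
    print(f"next generation: {proposed:.0f} GB/s")
    print(f"HBM generation 2 (assumed): {hbm2:.0f} GB/s")
```

With these inputs the proposed interface would reach about 1 TB/s per stack against 256 GB/s for the assumed generation 2 figure, which is the same factor of four as the per-line data rate.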

  • Estimation and analysis of crosstalk effects in high-Bandwidth Memory channel
    2018 IEEE International Symposium on Electromagnetic Compatibility and 2018 IEEE Asia-Pacific Symposium on Electromagnetic Compatibility (EMC APEMC), 2018
    Co-Authors: Sumin Choi, Kyungjun Cho, Heegon Kim, Jaemin Lim, Junyong Park, Daniel H Jung, Dong-hyun Kim, Joungho Kim
    Abstract:

    In this paper, we present an efficient crosstalk-included eye-diagram estimation with simulation and measurement results. Crosstalk level, total jitter, and eye height are analyzed from the obtained eye diagram. Crosstalk effects show a substantial impact on the eye diagram in the HBM channel.

  • Design and signal integrity analysis of high Bandwidth Memory (HBM) interposer in 2.5D terabyte/s Bandwidth graphics module
    Electrical Design of Advanced Packaging and Systems Symposium, 2016
    Co-Authors: Hyunsuk Lee, Kyungjun Cho, Heegon Kim, Sumin Choi, Jaemin Lim, Hyunwoo Shim, Joungho Kim
    Abstract:

    Spurred by industrial demand for terabyte/s Bandwidth graphics modules, high Bandwidth Memory (HBM) has emerged to overcome the limitations of conventional DRAMs. Additionally, due to the fine-pitch, high-density interconnect routing between the GPU and 4 HBMs in a 2.5D terabyte/s Bandwidth graphics module, the HBM interposer has also come to the fore. However, several signal integrity issues arise in the HBM interposer due to manufacturing process constraints. In this paper, we design the HBM interposer using a 6-layer redistribution layer (RDL) and TSVs in a 2.5D terabyte/s Bandwidth graphics module. Then, the electrical performance of the designed HBM interposer channels using the M1, M3, and M5 layers is analyzed by simulation in the frequency and time domains. The simulation results show that the designed HBM interposer has good signal integrity.