Bit Manipulation - Explore the Science & Experts | ideXlab

Bit Manipulation

The Experts below are selected from a list of 297 Experts worldwide ranked by ideXlab platform

Yedidya Hilewitz – 1st expert on this subject based on the ideXlab platform

  • A New Basis for Shifters in General-Purpose Processors for Existing and Advanced Bit Manipulations
    IEEE Transactions on Computers, 2009
    Co-Authors: Yedidya Hilewitz

    Abstract:

    This paper describes a new basis for the implementation of the shifter functional unit in microprocessors that can implement new advanced bit manipulations as well as standard shifter operations. Our design is based on the inverse butterfly and butterfly data path circuits, rather than the barrel shifter or log-shifter designs currently used. We show how this new shifter can implement the standard shift and rotate operations, as well as more advanced extract, deposit, and mix operations found in some processors. Furthermore, it can perform important new classes of even more advanced bit manipulation instructions, such as arbitrary bit permutations, bit gather (or parallel extract), and bit scatter (or parallel deposit) instructions. Thus, our new functional unit performs the functionality of three functional units: the basic shifter, the multimedia-mix unit, and the advanced bit manipulation functional unit, while having a latency only slightly longer than that of the log-shifter. For performing only the existing functions of a shifter, it has significantly smaller area.
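
As a rough software illustration of the standard operations this abstract lists (rotate, extract, and deposit), the sketch below models them on a fixed-width word. The 32-bit width and the function names rotl, extract, and deposit are assumptions made for this example; they are not taken from the paper, which describes a hardware unit.

```python
WIDTH = 32                      # assumed word size for this sketch
MASK = (1 << WIDTH) - 1         # keeps results inside WIDTH bits


def rotl(x, r):
    """Rotate the WIDTH-bit word x left by r positions."""
    x &= MASK
    r %= WIDTH
    return ((x << r) | (x >> (WIDTH - r))) & MASK


def extract(x, pos, length):
    """Return the length-bit field of x starting at bit pos, right-justified."""
    return (x >> pos) & ((1 << length) - 1)


def deposit(dst, src, pos, length):
    """Overwrite the length-bit field of dst at bit pos with the low bits of src."""
    field = ((1 << length) - 1) << pos
    return ((dst & ~field) | ((src << pos) & field)) & MASK
```

For instance, extract(0xABCD1234, 8, 8) selects the byte 0x12, and deposit(0, 0xAB, 8, 8) places 0xAB into bits 8 to 15.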

  • Fast Bit Gather, Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors
    Journal of Signal Processing Systems, 2008
    Co-Authors: Yedidya Hilewitz

    Abstract:

    Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence of instructions needed to emulate these complicated bit operations. As these bit manipulation operations are relevant to applications that are becoming increasingly important, we propose direct support for them in microprocessors. In particular, we propose fast bit gather (or parallel extract), bit scatter (or parallel deposit) and bit permutation instructions (including group, butterfly and inverse butterfly). We show that all these instructions can be implemented efficiently using both the fast butterfly and inverse butterfly network datapaths. Specifically, we show that parallel deposit can be mapped onto a butterfly circuit and parallel extract can be mapped onto an inverse butterfly circuit. We define static, dynamic and loop-invariant versions of the instructions, with static versions utilizing a much simpler functional unit. We show how a hardware decoder can be implemented for the dynamic and loop-invariant versions to generate, dynamically, the control signals for the butterfly and inverse butterfly datapaths. The simplest functional unit we propose is smaller and faster than an ALU. We also show that these instructions yield significant speedups over a basic RISC architecture for a variety of different application kernels taken from application domains including bioinformatics, steganography, coding, compression and random number generation.
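
To make the gather and scatter semantics concrete, here is a minimal bit-at-a-time emulation of the kind of long instruction sequence the abstract says software needs without direct hardware support. The names bit_gather and bit_scatter are chosen for this sketch, and the loop is only a reference model, not the butterfly-network mapping the paper proposes.

```python
def bit_gather(x, mask):
    """Parallel extract: pack the bits of x selected by mask into the low bits."""
    result = 0
    out = 0                      # next free position in the packed result
    while mask:
        low = mask & -mask       # isolate the lowest set bit of mask
        if x & low:
            result |= 1 << out
        out += 1
        mask ^= low              # clear that bit and continue
    return result


def bit_scatter(x, mask):
    """Parallel deposit: spread the low bits of x to the positions set in mask."""
    result = 0
    src = 0                      # next low-order bit of x to place
    while mask:
        low = mask & -mask
        if (x >> src) & 1:
            result |= low
        src += 1
        mask ^= low
    return result
```

Scattering a gathered value back through the same mask recovers the selected bits, that is, bit_scatter(bit_gather(x, m), m) equals x & m.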

  • Performing Advanced Bit Manipulations Efficiently in General-Purpose Processors
    18th IEEE Symposium on Computer Arithmetic (ARITH '07), 2007
    Co-Authors: Yedidya Hilewitz

    Abstract:

    This paper describes a new basis for the implementation of a shifter functional unit. We present a design based on the inverse butterfly and butterfly datapath circuits that performs the standard shift and rotate operations, as well as more advanced extract, deposit and mix operations found in some processors. Additionally, it supports important new classes of even more advanced bit manipulation instructions recently proposed: these include arbitrary bit permutations, bit scatter and bit gather instructions. The new functional unit’s datapath is comparable in latency to that of the classic barrel shifter. It replaces two existing functional units, the shifter and the mix unit, with a much more powerful one.
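
The butterfly and inverse butterfly datapaths named here can be modeled in software as log2(n) stages of conditional bit swaps. The sketch below assumes an 8-bit word and represents each stage's control bits as a mask over the lower bit of every pair; these conventions and the function names are assumptions for illustration, and the listing does not describe how the paper generates its control bits.

```python
def butterfly_stage(x, dist, swap_mask):
    """Swap bit i with bit i + dist wherever swap_mask has bit i set.

    swap_mask may only have bits set at the lower position of each pair.
    """
    t = ((x >> dist) ^ x) & swap_mask    # 1 where the paired bits differ
    return x ^ t ^ (t << dist)           # flip both bits of each such pair


def butterfly(x, controls, n=8):
    """Apply the stages with pair distances n/2, n/4, ..., 1."""
    dist = n // 2
    for swap_mask in controls:
        x = butterfly_stage(x, dist, swap_mask)
        dist //= 2
    return x


def inverse_butterfly(x, controls, n=8):
    """Apply the stages in the opposite order: distances 1, 2, ..., n/2."""
    dist = 1
    for swap_mask in controls:
        x = butterfly_stage(x, dist, swap_mask)
        dist *= 2
    return x
```

With every swap enabled (masks 0x0F, 0x33, 0x55 for n = 8), the butterfly reverses the bit order of the word. Because each stage is its own inverse, feeding the reversed control list to inverse_butterfly undoes the permutation.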

Myung Hoon Sunwoo – 2nd expert on this subject based on the ideXlab platform

  • Bit Manipulation Accelerator for Communication Systems Digital Signal Processor
    EURASIP Journal on Advances in Signal Processing, 2005
    Co-Authors: Sug Hyun Jeong, Myung Hoon Sunwoo, Seong K. Oh

    Abstract:

    This paper proposes application-specific instructions and their bit manipulation unit (BMU), which efficiently support scrambling, convolutional encoding, puncturing, interleaving, and bit stream multiplexing. The proposed DSP employs the BMU supporting parallel shift and XOR (exclusive-OR) operations and bit insertion/extraction operations on multiple data. The proposed architecture has been modeled in VHDL and synthesized using the SEC 0.18 μm standard cell library; the gate count of the BMU is only about 1700 gates. Performance comparisons show that the number of clock cycles can be reduced by about 40% to 80% for scrambling, convolutional encoding, and interleaving compared with existing DSPs.
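
Scrambling is a typical shift-and-XOR workload of the kind this abstract targets. The sketch below is a bit-serial additive scrambler built on a 7-bit linear feedback shift register; the polynomial x^7 + x^4 + 1 and the all-ones seed are assumptions chosen for illustration and are not taken from the paper, which does not specify them in this abstract.

```python
def scramble(bits, state=0b1111111):
    """Additive scrambler: XOR each data bit with an LFSR output bit.

    state holds the 7 register bits; bit 6 is the x^7 tap and bit 3
    is the x^4 tap (assumed polynomial x^7 + x^4 + 1).
    """
    out = []
    for b in bits:
        fb = ((state >> 6) ^ (state >> 3)) & 1   # feedback = x7 XOR x4
        state = ((state << 1) | fb) & 0x7F       # shift the feedback bit in
        out.append(b ^ fb)                       # whiten the data bit
    return out
```

Because the LFSR sequence does not depend on the data, running the scrambled bits through scramble again with the same seed restores the original bits.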

  • Novel bit manipulation unit for communication digital signal processors
    2004 IEEE International Symposium on Circuits and Systems (IEEE Cat. No.04CH37512), 2004
    Co-Authors: Sug Hyun Jeong, Myung Hoon Sunwoo

    Abstract:

    This paper proposes application-specific instructions and their bit manipulation unit (BMU), which efficiently support scrambling, convolutional encoding, puncturing, and interleaving. The proposed DSP employs the BMU supporting parallel shift and XOR (exclusive-OR) operations and bit insertion/extraction operations on multiple data. The proposed architecture has been modeled in VHDL and synthesized using the SEC 0.18 μm standard cell library; the gate count of the BMU is only about 1700 gates. Performance comparisons show that the number of clock cycles can be reduced by about 40% to 80% for scrambling, convolutional encoding and interleaving compared with existing DSPs.
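
Convolutional encoding and puncturing, two of the kernels named in this abstract, also reduce to shifts, masks, and XOR parity. The sketch below assumes a rate-1/2 code with constraint length 3 and generators 7 and 5 (octal), plus a hypothetical puncturing pattern; none of these parameters come from the paper.

```python
def parity(x):
    """XOR of all bits of x."""
    return bin(x).count("1") & 1


def conv_encode(bits, g0=0b111, g1=0b101, k=3):
    """Rate-1/2 convolutional encoder (assumed generators 7 and 5 octal)."""
    state = 0
    out = []
    for b in bits:
        state = ((state << 1) | b) & ((1 << k) - 1)   # shift in the new bit
        out.append(parity(state & g0))                # first output stream
        out.append(parity(state & g1))                # second output stream
    return out


def puncture(coded, pattern):
    """Keep coded[i] only where the repeating pattern holds a 1."""
    return [c for i, c in enumerate(coded) if pattern[i % len(pattern)]]
```

Under these assumptions, encoding the bits 1, 0, 1, 1 yields the pairs 11 10 00 01, and the pattern [1, 1, 1, 0] drops every fourth coded bit, raising the rate from 1/2 to 2/3.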

Akash Kumar – 3rd expert on this subject based on the ideXlab platform

  • Improving autonomous soft-error tolerance of FPGA through LUT configuration bit manipulation
    2013 23rd International Conference on Field Programmable Logic and Applications (FPL), 2013
    Co-Authors: Shyamsundar Venkataraman, Akash Kumar

    Abstract:

    Soft errors in the LUT configuration bits of FPGAs can alter the functionality of an implemented design, rendering it useless unless it is re-programmed. This paper proposes a technique to improve the autonomous fault-masking capabilities of a design by maximizing the number of zeros or ones in LUTs. The technique utilizes spare resources (XOR gates and carry chain) of FPGA devices to selectively manipulate LUT contents using two operations: LUT restructuring and LUT decomposition. Experiments conducted with a wide set of benchmarks from the MCNC, IWLS 2005 and ITC99 benchmark suites on a Xilinx Virtex 6 FPGA board demonstrate that the proposed methodology maximizes logic 0/1 of LUTs by an average of 20%, achieving 80% fault-masking with no area overhead. The fault rate of the entire design is reduced by 60% on average compared to existing techniques. Further, an additional 5% fault-masking can be achieved with a 7% increase in slice usage.
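
To illustrate what a LUT configuration bit is and how a single upset changes behavior, the sketch below models a k-input LUT as an integer truth table. It is only a toy model under that assumption, with hypothetical function names; it does not implement the restructuring or decomposition operations the paper proposes.

```python
def lut_eval(config, inputs):
    """Output of a LUT whose truth table is stored in the integer config.

    inputs is the integer formed by the input signals (the table address).
    """
    return (config >> inputs) & 1


def count_ones(config):
    """Number of configuration bits that hold logic 1."""
    return bin(config).count("1")


def affected_inputs(config, upset_bit, k):
    """Input patterns whose output changes when one configuration bit flips."""
    faulty = config ^ (1 << upset_bit)           # model a single-bit upset
    return [a for a in range(1 << k)
            if lut_eval(config, a) != lut_eval(faulty, a)]
```

For a 2-input AND gate stored as 0b1000, flipping configuration bit 0 changes the output only for the input pattern 0.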
