Machine Code

The Experts below are selected from a list of 63123 Experts worldwide ranked by ideXlab platform

Tom Reps - One of the best experts on this subject based on the ideXlab platform.

  • model assisted Machine Code synthesis
    Conference on Object-Oriented Programming Systems Languages and Applications, 2017
    Co-Authors: Venkatesh Srinivasan, Ara Vartanian, Tom Reps
    Abstract:

    Binary rewriters are tools that are used to modify the functionality of binaries lacking source Code. Binary rewriters can be used to rewrite binaries for a variety of purposes including optimization, hardening, and extraction of executable components. To rewrite a binary based on semantic criteria, an essential primitive to have is a Machine-Code synthesizer: a tool that synthesizes an instruction sequence from a specification of the desired behavior, often given as a formula in quantifier-free bit-vector logic (QFBV). However, state-of-the-art Machine-Code synthesizers such as McSynth++ employ naive search strategies for synthesis: McSynth++ merely enumerates candidates of increasing length without performing any form of prioritization. This inefficient search strategy is compounded by the huge number of unique instruction schemas in instruction sets (e.g., around 43,000 in Intel's IA-32) and the exponential cost inherent in enumeration. The effect is slow synthesis: even for relatively small specifications, McSynth++ might take several minutes or a few hours to find an implementation. In this paper, we describe how we use machine learning to make the search in McSynth++ smarter and potentially faster. We converted the linear search in McSynth++ into a best-first search over the space of instruction sequences. The cost heuristic for the best-first search comes from two models, used together, built from a corpus of 〈QFBV-formula, instruction-sequence〉 pairs: (i) a language model that favors useful instruction sequences, and (ii) a regression model that correlates features of instruction sequences with features of QFBV formulas, and favors instruction sequences that are more likely to implement the input formula. Our experiments for IA-32 showed that our model-assisted synthesizer enables synthesis of Code for 6 out of 50 formulas on which McSynth++ times out, speeding up the synthesis time by at least 549X, and for the remaining formulas, speeds up synthesis by 4.55X.
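
The best-first search described in this abstract can be illustrated with a small sketch. It is not McSynth++: the 8-bit single-register machine, the four-instruction set, and the `MODEL_COST` table (a stand-in for the scores that the learned models would supply) are all assumptions made for illustration, and exhaustive testing replaces the solver-based equivalence check.

```python
import heapq
from itertools import count

MASK = 0xFF  # toy 8-bit machine with a single register (assumption)

# Hypothetical mini instruction set; a real ISA has far more schemas.
INSTRUCTIONS = {
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "shl": lambda x: (x << 1) & MASK,
    "not": lambda x: ~x & MASK,
}

# Made-up per-instruction costs standing in for a learned model's score
# (lower means "judged more promising").
MODEL_COST = {"inc": 1.0, "dec": 2.0, "shl": 1.0, "not": 3.0}


def run(seq, x):
    """Execute an instruction sequence on the register value x."""
    for name in seq:
        x = INSTRUCTIONS[name](x)
    return x


def implements(seq, spec):
    """Check the candidate against the spec on every 8-bit input."""
    return all(run(seq, x) == spec(x) for x in range(MASK + 1))


def best_first_synthesize(spec, max_len=4):
    """Expand the cheapest candidate first instead of enumerating by length."""
    tie = count()  # tie-breaker so the heap never compares sequences
    frontier = [(0.0, next(tie), ())]
    while frontier:
        cost, _, seq = heapq.heappop(frontier)
        if seq and implements(seq, spec):
            return seq
        if len(seq) < max_len:
            for name in INSTRUCTIONS:
                heapq.heappush(
                    frontier, (cost + MODEL_COST[name], next(tie), seq + (name,))
                )
    return None  # nothing within max_len implements the spec


print(best_first_synthesize(lambda x: (2 * x + 2) & MASK))
```

The only change relative to plain enumeration is the priority queue: candidates are popped in order of accumulated model cost, so sequences the model considers likely are tried before longer or less plausible ones.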

  • model assisted Machine Code synthesis
    Proceedings of the ACM on Programming Languages, 2017
    Co-Authors: Venkatesh Srinivasan, Ara Vartanian, Tom Reps
    Abstract:

    Binary rewriters are tools that are used to modify the functionality of binaries lacking source Code. Binary rewriters can be used to rewrite binaries for a variety of purposes including optimization, hardening, and extraction of executable components. To rewrite a binary based on semantic criteria, an essential primitive to have is a Machine-Code synthesizer: a tool that synthesizes an instruction sequence from a specification of the desired behavior, often given as a formula in quantifier-free bit-vector logic (QFBV). However, state-of-the-art Machine-Code synthesizers such as McSynth++ employ naive search strategies for synthesis: McSynth++ merely enumerates candidates of increasing length without performing any form of prioritization. This inefficient search strategy is compounded by the huge number of unique instruction schemas in instruction sets (e.g., around 43,000 in Intel's IA-32) and the exponential cost inherent in enumeration. The effect is slow synthesis: even for relatively small specifications, McSynth++ might take several minutes or a few hours to find an implementation. In this paper, we describe how we use machine learning to make the search in McSynth++ smarter and potentially faster. We converted the linear search in McSynth++ into a best-first search over the space of instruction sequences. The cost heuristic for the best-first search comes from two models, used together, built from a corpus of 〈QFBV-formula, instruction-sequence〉 pairs: (i) a language model that favors useful instruction sequences, and (ii) a regression model that correlates features of instruction sequences with features of QFBV formulas, and favors instruction sequences that are more likely to implement the input formula. Our experiments for IA-32 showed that our model-assisted synthesizer enables synthesis of Code for 6 out of 50 formulas on which McSynth++ times out, speeding up the synthesis time by at least 549X, and for the remaining formulas, speeds up synthesis by 4.55X.

  • Synthesis of Machine Code from Semantics
    2015
    Co-Authors: Venkatesh Srinivasan, Tom Reps
    Abstract:

    In this paper, we present a technique to synthesize Machine-Code instructions from a semantic specification, given as a Quantifier-Free Bit-Vector (QFBV) logic formula. Our technique uses an instantiation of the Counter-Example Guided Inductive Synthesis (CEGIS) framework, in combination with search-space pruning heuristics, to synthesize instruction sequences. To counter the exponential cost inherent in enumerative synthesis, our technique uses a divide-and-conquer strategy to break the input QFBV formula into independent sub-formulas, and synthesize instructions for the sub-formulas. Synthesizers created by our technique could be used to create semantics-based binary rewriting tools such as optimizers, partial evaluators, program obfuscators/de-obfuscators, etc. Our experiments for Intel’s IA-32 instruction set show that, in comparison to our baseline algorithm, our search-space pruning heuristics reduce the synthesis time by a factor of 473, and our divide-and-conquer strategy reduces the synthesis time by a further 3 to 5 orders of magnitude.
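
The CEGIS loop named in this abstract can be sketched on a toy scale. Everything below is an assumption for illustration rather than the authors' tool: the 8-bit single-register machine, the four operations in `OPS`, and the exhaustive check in `find_counterexample`, which stands in for the solver query a real synthesizer would issue. The pruning heuristics and the divide-and-conquer step are not modelled.

```python
from itertools import product

MASK = 0xFF  # toy 8-bit machine with a single register (assumption)

# Hypothetical instruction set used only for this sketch.
OPS = {
    "inc": lambda x: (x + 1) & MASK,
    "shl": lambda x: (x << 1) & MASK,
    "neg": lambda x: -x & MASK,
    "not": lambda x: ~x & MASK,
}


def run(seq, x):
    """Execute an instruction sequence on the register value x."""
    for op in seq:
        x = OPS[op](x)
    return x


def candidates(max_len):
    """Enumerate sequences by increasing length."""
    for n in range(1, max_len + 1):
        yield from product(OPS, repeat=n)


def find_counterexample(seq, spec):
    """Verifier: return an input where seq and spec disagree, else None."""
    for x in range(MASK + 1):
        if run(seq, x) != spec(x):
            return x
    return None


def cegis(spec, max_len=3):
    examples = [0]  # finite set of test inputs, grown by counterexamples
    while True:
        # Inductive step: first candidate that agrees with spec on the examples.
        cand = next(
            (c for c in candidates(max_len)
             if all(run(c, x) == spec(x) for x in examples)),
            None,
        )
        if cand is None:
            return None, examples
        # Verification step: either accept or learn a new counterexample.
        cex = find_counterexample(cand, spec)
        if cex is None:
            return cand, examples
        examples.append(cex)


program, tests = cegis(lambda x: (x - 1) & MASK)
print(program, tests)
```

Each counterexample eliminates at least the candidate that produced it, so the loop terminates on this finite search space; most wrong candidates are rejected by the cheap example check and never reach the verifier.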

  • tsl a system for generating abstract interpreters and its application to Machine Code analysis
    ACM Transactions on Programming Languages and Systems, 2013
    Co-Authors: Tom Reps
    Abstract:

    This article describes the design and implementation of a system, called TSL (for Transformer Specification Language), that provides a systematic solution to the problem of creating retargetable tools for analyzing Machine Code. TSL is a tool generator (that is, a metatool) that automatically creates different abstract interpreters for Machine-Code instruction sets. The most challenging technical issue that we faced in designing TSL was how to automate the generation of the set of abstract transformers for a given abstract interpretation of a given instruction set. From a description of the concrete operational semantics of an instruction set, together with the datatypes and operations that define an abstract domain, TSL automatically creates the set of abstract transformers for the instructions of the instruction set. TSL advances the state-of-the-art in program analysis because it provides two dimensions of parameterizability: (i) a given analysis component can be retargeted to different instruction sets; (ii) multiple analysis components can be created automatically from a single specification of the concrete operational semantics of the language to be analyzed. TSL is an abstract-transformer-generator generator. The article describes the principles behind TSL, and discusses how one uses TSL to develop different abstract interpreters.
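
The idea of writing the instruction semantics once and obtaining different interpreters by swapping the domain can be sketched as follows. This is not TSL syntax or its generated code: the three-instruction toy ISA, the `Concrete` and `Sign` classes, and the function names are hypothetical, and registers hold unbounded integers rather than bit-vectors so that the sign abstraction stays sound.

```python
class Concrete:
    """Concrete domain: ordinary integers."""
    @staticmethod
    def const(n): return n
    @staticmethod
    def add(a, b): return a + b
    @staticmethod
    def mul(a, b): return a * b


class Sign:
    """Abstract domain: '-', '0', '+', or 'T' (sign unknown)."""
    @staticmethod
    def const(n): return "0" if n == 0 else ("+" if n > 0 else "-")
    @staticmethod
    def add(a, b):
        if a == "0": return b
        if b == "0": return a
        return a if a == b else "T"
    @staticmethod
    def mul(a, b):
        if a == "0" or b == "0": return "0"
        if "T" in (a, b): return "T"
        return "+" if a == b else "-"


def step(dom, instr, state):
    """Single semantics definition, written against the domain's operations."""
    op, dst, src = instr
    new = dict(state)
    if op == "movi":            # dst := immediate
        new[dst] = dom.const(src)
    elif op == "add":           # dst := dst + src
        new[dst] = dom.add(state[dst], state[src])
    elif op == "mul":           # dst := dst * src
        new[dst] = dom.mul(state[dst], state[src])
    else:
        raise ValueError(f"unknown instruction: {op}")
    return new


def interpret(dom, program, state):
    for instr in program:
        state = step(dom, instr, state)
    return state


# r0 := r0 * r0 + 3
PROGRAM = [("movi", "r1", 3), ("mul", "r0", "r0"), ("add", "r0", "r1")]

print(interpret(Concrete, PROGRAM, {"r0": -4}))
print(interpret(Sign, PROGRAM, {"r0": "-"}))
```

Retargeting to another instruction set would mean rewriting `step` only, and a new analysis would mean supplying another class with the same three operations, which mirrors the two dimensions of parameterizability described in the abstract.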

  • directed proof generation for Machine Code
    Computer Aided Verification, 2010
    Co-Authors: Aditya Thakur, Amanda Burton, Evan Driscoll, Matt Elder, Tycho Andersen, Tom Reps
    Abstract:

    We present the algorithms used in McVeto (Machine-Code VErification TOol), a tool to check whether a stripped Machine-Code program satisfies a safety property. The verification problem that McVeto addresses is challenging because it cannot assume that it has access to (i) certain structures commonly relied on by source-Code verification tools, such as control-flow graphs and call-graphs, and (ii) meta-data, such as information about variables, types, and aliasing. It cannot even rely on out-of-scope local variables and return addresses being protected from the program's actions. What distinguishes McVeto from other work on software model checking is that it shows how verification of Machine-Code can be performed, while avoiding conventional techniques that would be unsound if applied at the Machine-Code level.

Magnus O Myreen - One of the best experts on this subject based on the ideXlab platform.

  • The Reflective Milawa Theorem Prover is Sound (Down to the Machine Code that Runs it)
    Journal of Automated Reasoning, 2015
    Co-Authors: Jared Davis, Magnus O Myreen
    Abstract:

    This paper presents, we believe, the most comprehensive evidence of a theorem prover’s soundness to date. Our subject is the Milawa theorem prover. We present evidence of its soundness down to the Machine Code. Milawa is a theorem prover styled after NQTHM and ACL2. It is based on an idealised version of ACL2’s computational logic and provides the user with high-level tactics similar to ACL2’s. In contrast to NQTHM and ACL2, Milawa has a small kernel that is somewhat like an LCF-style system. We explain how the Milawa theorem prover is constructed as a sequence of reflective extensions from its kernel. The kernel establishes the soundness of these extensions during Milawa’s bootstrapping process. Going deeper, we explain how we have shown that the Milawa kernel is sound using the HOL4 theorem prover. In HOL4, we have formalized its logic, proved the logic sound, and proved that the source Code for the Milawa kernel (1,700 lines of Lisp) faithfully implements this logic. Going even further, we have combined these results with the x86 Machine-Code level verification of the Lisp runtime Jitawa. Our top-level theorem states that Milawa can never claim to prove anything that is false when it is run on this Lisp runtime.

  • a verified runtime for a verified theorem prover
    Interactive Theorem Proving, 2011
    Co-Authors: Magnus O Myreen, Jared Davis
    Abstract:

    Theorem provers, such as ACL2, HOL, Isabelle and Coq, rely on the correctness of runtime systems for programming languages like ML, OCaml or Common Lisp. These runtime systems are complex and critical to the integrity of the theorem provers. In this paper, we present a new Lisp runtime which has been formally verified and can run the Milawa theorem prover. Our runtime consists of 7,500 lines of Machine Code and is able to complete a 4 gigabyte Milawa proof effort. When our runtime is used to carry out Milawa proofs, less unverified Code must be trusted than with any other theorem prover. Our runtime includes a just-in-time compiler, a copying garbage collector, a parser and a printer, all of which are HOL4-verified down to the concrete x86 Code. We make heavy use of our previously developed tools for Machine-Code verification. This work demonstrates that our approach to Machine-Code verification scales to non-trivial applications.

  • formal verification of Machine Code programs
    2011
    Co-Authors: Magnus O Myreen
    Abstract:

    Formal program verification provides mathematical means of increasing assurance for the correctness of software. Most approaches to program verification are either fully automatic and prove only weak properties, or alternatively are manual and labour intensive to apply; few target realistically modelled Machine Code. The work presented in this dissertation aims to ease the effort required in proving properties of programs on top of detailed models of Machine Code. The contributions are novel approaches for both verification of existing programs and methods for automatically constructing correct Code. For program verification, this thesis presents a new approach based on translation: the problem of proving properties of programs is reduced, via fully-automatic deduction, to a problem of proving properties of recursive functions. The translation from programs to recursive functions is shown to be implementable in a theorem prover both for simple while-programs as well as real Machine Code. This verification-after-translation approach has several advantages over established approaches of verification condition generation. In particular, the new approach does not require annotating the program with assertions. More importantly, the proposed approach separates the verification proof from the underlying model so that specific resource names, some instruction orderings and certain control-flow structures become irrelevant. As a result, proof reuse is enabled to a greater extent than in currently used methods. The scalability of this new approach is illustrated through the verification of ARM, x86 and PowerPC implementations of a copying garbage collector. For construction of correct Code, this thesis presents a new compiler which maps functions from logic, via proof, down to multiple carefully modelled commercial Machine languages. Unlike previously published work on compilation from higher-order logic, this compiler allows input functions to be partially specified and supports a broad range of user-defined extensions. These features enabled the production of formally verified Machine-Code implementations of a LISP interpreter, as a case study. The automation and proofs have been implemented in the HOL4 theorem prover, using a new Machine-Code Hoare triple instantiated to detailed specifications of ARM, x86 and PowerPC instruction set architectures.

    “Making formal methods into normal methods.” (Peter Homeier)
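
The verification-after-translation idea in this abstract, reducing a program to a recursive function and reasoning about the function instead, can be illustrated informally. In the dissertation the translation is automatic and carried out by proof inside a theorem prover; in the sketch below the three-instruction loop, the tiny interpreter `execute`, and the hand-written function `loop` are all assumptions, registers are unbounded integers, and agreement is only tested, not proved.

```python
# Toy program: add r1 to r0, decrement r1, repeat while r1 is non-zero.
PROGRAM = [
    ("add", "r0", "r1"),   # 0: r0 := r0 + r1
    ("subi", "r1", 1),     # 1: r1 := r1 - 1
    ("jnz", "r1", 0),      # 2: if r1 != 0 goto 0
]


def execute(r0, r1):
    """Operational view: step a program counter through PROGRAM."""
    regs = {"r0": r0, "r1": r1}
    pc = 0
    while pc < len(PROGRAM):
        op, a, b = PROGRAM[pc]
        if op == "add":
            regs[a] += regs[b]
            pc += 1
        elif op == "subi":
            regs[a] -= b
            pc += 1
        elif op == "jnz":
            pc = b if regs[a] != 0 else pc + 1
    return regs["r0"], regs["r1"]


def loop(r0, r1):
    """Functional view: the same loop as a tail-recursive function."""
    r0, r1 = r0 + r1, r1 - 1
    return loop(r0, r1) if r1 != 0 else (r0, r1)


# Properties are now stated about `loop`, with no program counter,
# register file, or instruction ordering in sight.
for n in range(1, 20):
    assert execute(5, n) == loop(5, n) == (5 + n * (n + 1) // 2, 0)
print("functional and operational views agree on the sampled inputs")
```

The recursive function mentions no program counter or instruction encoding, which is the sense in which a proof about it does not depend on resource names or instruction ordering.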

  • verified just in time compiler on x86
    Symposium on Principles of Programming Languages, 2010
    Co-Authors: Magnus O Myreen
    Abstract:

    This paper presents a method for creating formally correct just-in-time (JIT) compilers. The tractability of our approach is demonstrated through what we believe is the first verification of a JIT compiler with respect to a realistic semantics of self-modifying x86 Machine Code. Our semantics includes a model of the instruction cache. Two versions of the verified JIT compiler are presented: one generates all of the Machine Code at once; the other is incremental, i.e., it produces Code on demand. All proofs have been performed inside the HOL4 theorem prover.

  • verified lisp implementations on arm x86 and powerpc
    Theorem Proving in Higher Order Logics, 2009
    Co-Authors: Magnus O Myreen, Michael J. C. Gordon
    Abstract:

    This paper reports on a case study, which we believe is the first to produce a formally verified end-to-end implementation of a functional programming language running on commercial processors. Interpreters for the core of McCarthy's LISP 1.5 were implemented in ARM, x86 and PowerPC Machine Code, and proved to correctly parse, evaluate and print LISP s-expressions. The proof of evaluation required working on top of verified implementations of memory allocation and garbage collection. All proofs are mechanised in the HOL4 theorem prover.

Jonathan Katz - One of the best experts on this subject based on the ideXlab platform.

  • secure computation of mips Machine Code
    European Symposium on Research in Computer Security, 2016
    Co-Authors: Xiao Wang, Dov S Gordon, Allen Mcintosh, Jonathan Katz
    Abstract:

    Existing systems for secure computation require programmers to express the program to be securely computed as a circuit, or in a domain-specific language that can be compiled to a form suitable for applying known protocols. We propose a new system that can securely execute native MIPS Code with no special annotations. Our system allows programmers to use a language of their choice to express their programs, together with any off-the-shelf compiler to MIPS; it can be used for secure computation of “legacy” MIPS Code as well.

  • secure computation of mips Machine Code
    IACR Cryptology ePrint Archive, 2015
    Co-Authors: Xiao Wang, Dov S Gordon, Allen Mcintosh, Jonathan Katz
    Abstract:

    Existing systems for secure computation require programmers to express the program to be securely computed as a circuit, or in a domain-specific language that can be compiled to a form suitable for applying known protocols. We propose a new system that can securely execute native MIPS Code with no special annotations. Our system allows programmers to use a language of their choice to express their programs, together with any off-the-shelf compiler to MIPS; it can be used for secure computation of “legacy” MIPS Code as well. Our system uses oblivious RAM for fetching instructions and performing load/store operations in memory, and garbled universal circuits for the execution of a MIPS CPU in each instruction step. We also explore various optimizations based on an offline analysis of the MIPS Code to be executed, in order to minimize the overhead of executing each instruction while still maintaining security.
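
The role oblivious RAM plays here, hiding which instruction or memory word is being accessed, can be illustrated with the simplest possible construction: a linear scan that touches every cell on every access. This is a swapped-in simplification, not the scheme used by the system (the abstract does not say which ORAM it uses, and practical schemes avoid the linear cost). The function names and the `trace` list are hypothetical, and in a real protocol the comparison and selection would be evaluated inside the secure computation rather than in plain Python.

```python
def oblivious_read(memory, index, trace):
    """Read memory[index] while touching every cell in a fixed order."""
    result = 0
    for i, word in enumerate(memory):
        trace.append(i)            # record the physical access
        hit = int(i == index)      # 1 for the wanted cell, 0 otherwise
        result += hit * word       # keeps the wanted word, adds 0 for the rest
    return result


def oblivious_write(memory, index, value, trace):
    """Write value to memory[index], rewriting every cell in a fixed order."""
    for i in range(len(memory)):
        trace.append(i)
        hit = int(i == index)
        memory[i] = hit * value + (1 - hit) * memory[i]


memory = [10, 20, 30, 40]
trace_a, trace_b = [], []
a = oblivious_read(memory, 1, trace_a)
b = oblivious_read(memory, 3, trace_b)
print(a, b, trace_a == trace_b)
```

Because the sequence of touched addresses is the same for every index, an observer of the access pattern learns nothing about which word was fetched; the price is work proportional to the memory size on each access.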

Xiao Wang - One of the best experts on this subject based on the ideXlab platform.

  • secure computation of mips Machine Code
    European Symposium on Research in Computer Security, 2016
    Co-Authors: Xiao Wang, Dov S Gordon, Allen Mcintosh, Jonathan Katz
    Abstract:

    Existing systems for secure computation require programmers to express the program to be securely computed as a circuit, or in a domain-specific language that can be compiled to a form suitable for applying known protocols. We propose a new system that can securely execute native MIPS Code with no special annotations. Our system allows programmers to use a language of their choice to express their programs, together with any off-the-shelf compiler to MIPS; it can be used for secure computation of “legacy” MIPS Code as well.

  • secure computation of mips Machine Code
    IACR Cryptology ePrint Archive, 2015
    Co-Authors: Xiao Wang, Dov S Gordon, Allen Mcintosh, Jonathan Katz
    Abstract:

    Existing systems for secure computation require programmers to express the program to be securely computed as a circuit, or in a domain-specific language that can be compiled to a form suitable for applying known protocols. We propose a new system that can securely execute native MIPS Code with no special annotations. Our system allows programmers to use a language of their choice to express their programs, together with any off-the-shelf compiler to MIPS; it can be used for secure computation of “legacy” MIPS Code as well. Our system uses oblivious RAM for fetching instructions and performing load/store operations in memory, and garbled universal circuits for the execution of a MIPS CPU in each instruction step. We also explore various optimizations based on an offline analysis of the MIPS Code to be executed, in order to minimize the overhead of executing each instruction while still maintaining security.

Francois Dupressoir - One of the best experts on this subject based on the ideXlab platform.

  • certified computer aided cryptography efficient provably secure Machine Code from high level implementations
    Computer and Communications Security, 2013
    Co-Authors: Jose B Almeida, Manuel Barbosa, Gilles Barthe, Francois Dupressoir
    Abstract:

    We present a computer-aided framework for proving concrete security bounds for cryptographic Machine Code implementations. The front-end of the framework is an interactive verification tool that extends the EasyCrypt framework to reason about relational properties of C-like programs extended with idealised probabilistic operations in the style of Code-based security proofs. The framework also incorporates an extension of the CompCert certified compiler to support trusted libraries providing complex arithmetic calculations or instantiating idealised components such as sampling operations. This certified compiler allows us to carry to executable Code the security guarantees established at the high-level, and is also instrumented to detect when compilation may interfere with side-channel countermeasures deployed in source Code. We demonstrate the applicability of the framework by applying it to the RSA-OAEP encryption scheme, as standardized in PKCS#1 v2.1. The outcome is a rigorous analysis of the advantage of an adversary to break the security of assembly implementations of the algorithms specified by the standard. The example also provides two contributions of independent interest: it bridges the gap between computer-assisted security proofs and real-world cryptographic implementations as described by standards such as PKCS, and demonstrates the use of the CompCert certified compiler in the context of cryptographic software development.

  • certified computer aided cryptography efficient provably secure Machine Code from high level implementations
    IACR Cryptology ePrint Archive, 2013
    Co-Authors: Jose B Almeida, Manuel Barbosa, Gilles Barthe, Francois Dupressoir
    Abstract:

    We present a computer-aided framework for proving concrete security bounds for cryptographic Machine Code implementations. The front-end of the framework is an interactive verification tool that extends the EasyCrypt framework to reason about relational properties of C-like programs extended with idealised probabilistic operations in the style of Code-based security proofs. The framework also incorporates an extension of the CompCert certified compiler to support trusted libraries providing complex arithmetic calculations or instantiating idealised components such as sampling operations. This certified compiler allows us to carry to executable Code the security guarantees established at the high-level, and is also instrumented to detect when compilation may interfere with side-channel countermeasures deployed in source Code. We demonstrate the applicability of the framework with the RSA-OAEP encryption scheme, as standardized in PKCS#1 v2.1. The outcome is a rigorous analysis of the advantage of an adversary to break the security of assembly implementations of the algorithms specified by the standard. The example also provides two contributions of independent interest: it is the first application of computer-aided cryptographic tools to real-world security, and the first application of CompCert to cryptographic software.