Undefined Behavior - Explore the Science & Experts

The Experts below are selected from a list of 270 Experts worldwide ranked by ideXlab platform

John Regehr - One of the best experts on this subject based on the ideXlab platform.

Random testing for C and C++ compilers with YARPGen

Proceedings of the ACM on Programming Languages, 2020

Co-Authors: Vsevolod Livinskii, Babokin Dmitry Yurievich, John Regehr

Abstract:

Compilers should not crash and they should not miscompile applications. Random testing is an effective method for finding compiler bugs that have escaped other kinds of testing. This paper presents Yet Another Random Program Generator (YARPGen), a random test-case generator for C and C++ that we used to find and report more than 220 bugs in GCC, LLVM, and the Intel® C++ Compiler. Our research contributions include a method for generating expressive programs that avoid Undefined Behavior without using dynamic checks, and generation policies, a mechanism for increasing diversity of generated code and for triggering more optimizations. Generation policies decrease the testing time to find hard-to-trigger compiler bugs and, for the kinds of scalar optimizations YARPGen was designed to stress-test, increase the number of times these optimizations are applied by the compiler by an average of 20% for LLVM and 40% for GCC. We also created tools for automating most of the common tasks related to compiler fuzzing; these tools are also useful for fuzzers other than ours.

15 days free trial to Access Article
Practical verification of peephole optimizations with Alive

Communications of The ACM, 2018

Co-Authors: Nuno P. Lopes, David Menendez, Santosh Nagarakatte, John Regehr

Abstract:

Compilers should not miscompile. Peephole optimizations, which perform local rewriting of the input program to improve the efficiency of generated code, are a persistent source of compiler bugs. We created Alive, a domain-specific language for writing optimizations and for automatically either proving them correct or else generating counterexamples. Furthermore, Alive can be automatically translated into C++ code that is suitable for inclusion in an LLVM optimization pass. Alive is based on an attempt to balance usability and formal methods; for example, it captures---but largely hides---the detailed semantics of the various kinds of Undefined Behavior. Alive has found numerous bugs in the LLVM compiler and is being used by LLVM developers.

15 days free trial to Access Article
Taming Undefined Behavior in LLVM

ACM SIGPLAN Notices, 2017

Co-Authors: Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-kil Hur, Sanjoy Das, David Majnemer, John Regehr, Nuno P. Lopes

Abstract:

A central concern for an optimizing compiler is the design of its intermediate representation (IR) for code. The IR should make it easy to perform transformations, and should also afford efficient and precise static analysis. In this paper we study an aspect of IR design that has received little attention: the role of Undefined Behavior. The IR for every optimizing compiler we have looked at, including GCC, LLVM, Intel's, and Microsoft's, supports one or more forms of Undefined Behavior (UB), not only to reflect the semantics of UB-heavy programming languages such as C and C++, but also to model inherently unsafe low-level operations such as memory stores and to avoid over-constraining IR semantics to the point that desirable transformations become illegal. The current semantics of LLVM's IR fails to justify some cases of loop unswitching, global value numbering, and other important "textbook" optimizations, causing long-standing bugs. We present solutions to the problems we have identified in LLVM's IR and show that most optimizations currently in LLVM remain sound, and that some desirable new transformations become permissible. Our solutions do not degrade compile time or performance of generated code.

15 days free trial to Access Article
PLDI - Taming Undefined Behavior in LLVM

Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2017

Co-Authors: Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-kil Hur, Sanjoy Das, David Majnemer, John Regehr, Nuno P. Lopes

Abstract:

A central concern for an optimizing compiler is the design of its intermediate representation (IR) for code. The IR should make it easy to perform transformations, and should also afford efficient and precise static analysis. In this paper we study an aspect of IR design that has received little attention: the role of Undefined Behavior. The IR for every optimizing compiler we have looked at, including GCC, LLVM, Intel's, and Microsoft's, supports one or more forms of Undefined Behavior (UB), not only to reflect the semantics of UB-heavy programming languages such as C and C++, but also to model inherently unsafe low-level operations such as memory stores and to avoid over-constraining IR semantics to the point that desirable transformations become illegal. The current semantics of LLVM's IR fails to justify some cases of loop unswitching, global value numbering, and other important "textbook" optimizations, causing long-standing bugs. We present solutions to the problems we have identified in LLVM's IR and show that most optimizations currently in LLVM remain sound, and that some desirable new transformations become permissible. Our solutions do not degrade compile time or performance of generated code.

15 days free trial to Access Article
Provably correct peephole optimizations with alive

ACM SIGPLAN Notices, 2015

Co-Authors: Nuno P. Lopes, David Menendez, Santosh Nagarakatte, John Regehr

Abstract:

Compilers should not miscompile. Our work addresses problems in developing peephole optimizations that perform local rewriting to improve the efficiency of LLVM code. These optimizations are individually difficult to get right, particularly in the presence of Undefined Behavior; taken together they represent a persistent source of bugs. This paper presents Alive, a domain-specific language for writing optimizations and for automatically either proving them correct or else generating counterexamples. Furthermore, Alive can be automatically translated into C++ code that is suitable for inclusion in an LLVM optimization pass. Alive is based on an attempt to balance usability and formal methods; for example, it captures---but largely hides---the detailed semantics of three different kinds of Undefined Behavior in LLVM. We have translated more than 300 LLVM optimizations into Alive and, in the process, found that eight of them were wrong.

15 days free trial to Access Article

Grigore Rosu - One of the best experts on this subject based on the ideXlab platform.

Dealing With C's Original Sin

IEEE Software, 2019

Co-Authors: Chris Hathhorn, Grigore Rosu

Abstract:

In the very early days of C, the compiler written by Dennis Ritchie and supplied with the UNIX operating system entirely defined the language. As the number of users and C implementations grew, however, so too did the need for a language standard-a contract between users and implementers about what should and should not count as C. This effort began in 1983 with the formation of a committee tasked with producing "an unambiguous and machine-independent definition of the language C" and led to the ANSI C Standard in 1989.1 In retrospect, it was not until this date, 17 years after the first compiler, when C's most notorious language feature slithered into the world: Undefined Behavior .

15 days free trial to Access Article
RV - Runtime Verification at Work: A Tutorial

Runtime Verification, 2016

Co-Authors: Philip Daian, Chris Hathhorn, Dwight Guth, Manasvi Saxena, Edgar Pek, Traian Florin Şerbănuţă, Grigore Rosu

Abstract:

We present a suite of runtime verification tools developed by Runtime Verification Inc.: RV-Match, RV-Predict, and RV-Monitor. RV-Match is a tool for checking C programs for Undefined Behavior and other common programmer mistakes. It is extracted from the most complete formal semantics of the C11 language and beats many similar tools in its ability to catch a broad range of undesirable Behaviors. RV-Predict is a dynamic data race detector for Java and C/C++ programs. It is perhaps the only tool that is both sound and maximal: it only reports real races and it can find all races that can be found by any other sound data race detector analyzing the same execution trace. RV-Monitor is a runtime monitoring tool that checks and enforces safety and security properties during program execution. Our tools focus on reporting no false positives and are free for non-commercial use.

15 days free trial to Access Article
CAV (1) - RV-match: Practical semantics-based program analysis

Computer Aided Verification, 2016

Co-Authors: Dwight Guth, Chris Hathhorn, Manasvi Saxena, Grigore Rosu

Abstract:

We present RV-Match, a tool for checking C programs for Undefined Behavior and other common programmer mistakes. Our tool is extracted from the most complete formal semantics of the C11 language. Previous versions of this tool were used primarily for testing the correctness of the semantics, but we have improved it into a tool for doing practical analysis of real C programs. It beats many similar tools in its ability to catch a broad range of undesirable Behaviors. We demonstrate this with comparisons based on a third-party benchmark.

15 days free trial to Access Article
PLDI - Defining the Undefinedness of C

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015

Co-Authors: Chris Hathhorn, Chucky Ellison, Grigore Rosu

Abstract:

We present a ``negative'' semantics of the C11 language---a semantics that does not just give meaning to correct programs, but also rejects Undefined programs. We investigate Undefined Behavior in C and discuss the techniques and special considerations needed for formally specifying it. We have used these techniques to modify and extend a semantics of C into one that captures Undefined Behavior. The amount of semantic infrastructure and effort required to achieve this was unexpectedly high, in the end nearly doubling the size of the original semantics. From our semantics, we have automatically extracted an Undefinedness checker, which we evaluate against other popular analysis tools, using our own test suite in addition to a third-party test suite. Our checker is capable of detecting examples of all 77 categories of core language Undefinedness appearing in the C11 standard, more than any other tool we considered. Based on this evaluation, we argue that our work is the most comprehensive and complete semantic treatment of Undefined Behavior in C, and thus of the C language itself.

15 days free trial to Access Article

Nuno P. Lopes - One of the best experts on this subject based on the ideXlab platform.

Practical verification of peephole optimizations with Alive

Communications of The ACM, 2018

Co-Authors: Nuno P. Lopes, David Menendez, Santosh Nagarakatte, John Regehr

Abstract:

Compilers should not miscompile. Peephole optimizations, which perform local rewriting of the input program to improve the efficiency of generated code, are a persistent source of compiler bugs. We created Alive, a domain-specific language for writing optimizations and for automatically either proving them correct or else generating counterexamples. Furthermore, Alive can be automatically translated into C++ code that is suitable for inclusion in an LLVM optimization pass. Alive is based on an attempt to balance usability and formal methods; for example, it captures---but largely hides---the detailed semantics of the various kinds of Undefined Behavior. Alive has found numerous bugs in the LLVM compiler and is being used by LLVM developers.

15 days free trial to Access Article
Taming Undefined Behavior in LLVM

ACM SIGPLAN Notices, 2017

Co-Authors: Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-kil Hur, Sanjoy Das, David Majnemer, John Regehr, Nuno P. Lopes

Abstract:

A central concern for an optimizing compiler is the design of its intermediate representation (IR) for code. The IR should make it easy to perform transformations, and should also afford efficient and precise static analysis. In this paper we study an aspect of IR design that has received little attention: the role of Undefined Behavior. The IR for every optimizing compiler we have looked at, including GCC, LLVM, Intel's, and Microsoft's, supports one or more forms of Undefined Behavior (UB), not only to reflect the semantics of UB-heavy programming languages such as C and C++, but also to model inherently unsafe low-level operations such as memory stores and to avoid over-constraining IR semantics to the point that desirable transformations become illegal. The current semantics of LLVM's IR fails to justify some cases of loop unswitching, global value numbering, and other important "textbook" optimizations, causing long-standing bugs. We present solutions to the problems we have identified in LLVM's IR and show that most optimizations currently in LLVM remain sound, and that some desirable new transformations become permissible. Our solutions do not degrade compile time or performance of generated code.

15 days free trial to Access Article
PLDI - Taming Undefined Behavior in LLVM

Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2017

Co-Authors: Juneyoung Lee, Yoonseung Kim, Youngju Song, Chung-kil Hur, Sanjoy Das, David Majnemer, John Regehr, Nuno P. Lopes

Abstract:

A central concern for an optimizing compiler is the design of its intermediate representation (IR) for code. The IR should make it easy to perform transformations, and should also afford efficient and precise static analysis. In this paper we study an aspect of IR design that has received little attention: the role of Undefined Behavior. The IR for every optimizing compiler we have looked at, including GCC, LLVM, Intel's, and Microsoft's, supports one or more forms of Undefined Behavior (UB), not only to reflect the semantics of UB-heavy programming languages such as C and C++, but also to model inherently unsafe low-level operations such as memory stores and to avoid over-constraining IR semantics to the point that desirable transformations become illegal. The current semantics of LLVM's IR fails to justify some cases of loop unswitching, global value numbering, and other important "textbook" optimizations, causing long-standing bugs. We present solutions to the problems we have identified in LLVM's IR and show that most optimizations currently in LLVM remain sound, and that some desirable new transformations become permissible. Our solutions do not degrade compile time or performance of generated code.

15 days free trial to Access Article
Provably correct peephole optimizations with alive

ACM SIGPLAN Notices, 2015

Co-Authors: Nuno P. Lopes, David Menendez, Santosh Nagarakatte, John Regehr

Abstract:

Compilers should not miscompile. Our work addresses problems in developing peephole optimizations that perform local rewriting to improve the efficiency of LLVM code. These optimizations are individually difficult to get right, particularly in the presence of Undefined Behavior; taken together they represent a persistent source of bugs. This paper presents Alive, a domain-specific language for writing optimizations and for automatically either proving them correct or else generating counterexamples. Furthermore, Alive can be automatically translated into C++ code that is suitable for inclusion in an LLVM optimization pass. Alive is based on an attempt to balance usability and formal methods; for example, it captures---but largely hides---the detailed semantics of three different kinds of Undefined Behavior in LLVM. We have translated more than 300 LLVM optimizations into Alive and, in the process, found that eight of them were wrong.

15 days free trial to Access Article
PLDI - Provably correct peephole optimizations with alive

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2015, 2015

Co-Authors: Nuno P. Lopes, David Menendez, Santosh Nagarakatte, John Regehr

Abstract:

Compilers should not miscompile. Our work addresses problems in developing peephole optimizations that perform local rewriting to improve the efficiency of LLVM code. These optimizations are individually difficult to get right, particularly in the presence of Undefined Behavior; taken together they represent a persistent source of bugs. This paper presents Alive, a domain-specific language for writing optimizations and for automatically either proving them correct or else generating counterexamples. Furthermore, Alive can be automatically translated into C++ code that is suitable for inclusion in an LLVM optimization pass. Alive is based on an attempt to balance usability and formal methods; for example, it captures---but largely hides---the detailed semantics of three different kinds of Undefined Behavior in LLVM. We have translated more than 300 LLVM optimizations into Alive and, in the process, found that eight of them were wrong.

15 days free trial to Access Article

Chris Hathhorn - One of the best experts on this subject based on the ideXlab platform.

Dealing With C's Original Sin

IEEE Software, 2019

Co-Authors: Chris Hathhorn, Grigore Rosu

Abstract:

In the very early days of C, the compiler written by Dennis Ritchie and supplied with the UNIX operating system entirely defined the language. As the number of users and C implementations grew, however, so too did the need for a language standard-a contract between users and implementers about what should and should not count as C. This effort began in 1983 with the formation of a committee tasked with producing "an unambiguous and machine-independent definition of the language C" and led to the ANSI C Standard in 1989.1 In retrospect, it was not until this date, 17 years after the first compiler, when C's most notorious language feature slithered into the world: Undefined Behavior .

15 days free trial to Access Article
RV - Runtime Verification at Work: A Tutorial

Runtime Verification, 2016

Co-Authors: Philip Daian, Chris Hathhorn, Dwight Guth, Manasvi Saxena, Edgar Pek, Traian Florin Şerbănuţă, Grigore Rosu

Abstract:

We present a suite of runtime verification tools developed by Runtime Verification Inc.: RV-Match, RV-Predict, and RV-Monitor. RV-Match is a tool for checking C programs for Undefined Behavior and other common programmer mistakes. It is extracted from the most complete formal semantics of the C11 language and beats many similar tools in its ability to catch a broad range of undesirable Behaviors. RV-Predict is a dynamic data race detector for Java and C/C++ programs. It is perhaps the only tool that is both sound and maximal: it only reports real races and it can find all races that can be found by any other sound data race detector analyzing the same execution trace. RV-Monitor is a runtime monitoring tool that checks and enforces safety and security properties during program execution. Our tools focus on reporting no false positives and are free for non-commercial use.

15 days free trial to Access Article
CAV (1) - RV-match: Practical semantics-based program analysis

Computer Aided Verification, 2016

Co-Authors: Dwight Guth, Chris Hathhorn, Manasvi Saxena, Grigore Rosu

Abstract:

We present RV-Match, a tool for checking C programs for Undefined Behavior and other common programmer mistakes. Our tool is extracted from the most complete formal semantics of the C11 language. Previous versions of this tool were used primarily for testing the correctness of the semantics, but we have improved it into a tool for doing practical analysis of real C programs. It beats many similar tools in its ability to catch a broad range of undesirable Behaviors. We demonstrate this with comparisons based on a third-party benchmark.

15 days free trial to Access Article
PLDI - Defining the Undefinedness of C

Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2015

Co-Authors: Chris Hathhorn, Chucky Ellison, Grigore Rosu

Abstract:

We present a ``negative'' semantics of the C11 language---a semantics that does not just give meaning to correct programs, but also rejects Undefined programs. We investigate Undefined Behavior in C and discuss the techniques and special considerations needed for formally specifying it. We have used these techniques to modify and extend a semantics of C into one that captures Undefined Behavior. The amount of semantic infrastructure and effort required to achieve this was unexpectedly high, in the end nearly doubling the size of the original semantics. From our semantics, we have automatically extracted an Undefinedness checker, which we evaluate against other popular analysis tools, using our own test suite in addition to a third-party test suite. Our checker is capable of detecting examples of all 77 categories of core language Undefinedness appearing in the C11 standard, more than any other tool we considered. Based on this evaluation, we argue that our work is the most comprehensive and complete semantic treatment of Undefined Behavior in C, and thus of the C language itself.

15 days free trial to Access Article
Defining the Undefinedness of C11 : practical semantics-based program analysis

1

Co-Authors: Chris Hathhorn

Abstract:

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] This thesis extends the work of Ellison and Ros,u [13, 12] but focuses on the "negative" semantics of the C11 language--the semantics required to not just give meaning to correct programs, but also to reject Undefined programs. We investigate Undefined Behavior in C and discuss the techniques and special considerations needed to formally specify it. Using these techniques, we have modified and extended a semantics of C into one that captures Undefined Behavior. The amount of semantic infrastructure and effort required to achieve this was unexpectedly high, in the end more than tripling the size of the original semantics. From our semantics, we automatically extract kcc, a tool for checking realworld C programs for Undefined Behavior and other common programmer mistakes. Previous versions of this tool were used primarily for testing the correctness of the semantics, but we have improved it into a tool for doing practical analysis of real C programs. It beats many similar tools in its ability to catch a broad range of undesirable Behaviors. We demonstrate this with comparisons based on our own test suite in addition to third-party benchmarks. Our checker is capable of detecting examples of all 77 categories of core language Undefinedness appearing in the C11 standard, more than any other tool we considered. Based on this evaluation, we argue that our work is the most comprehensive and complete semantic treatment of Undefined Behavior in C, and thus of the C language itself.

15 days free trial to Access Article

Anton A. Kutsenko - One of the best experts on this subject based on the ideXlab platform.

Programming Infinite Machines

Erkenntnis, 2019

Co-Authors: Anton A. Kutsenko

Abstract:

For infinite machines that are free from the classical Thomson’s lamp paradox, we show that they are not free from its inverted-in-time version. We provide a program for infinite machines and an infinite mechanism that demonstrate this paradox. While their finite analogs work predictably, the program and the infinite mechanism demonstrate an Undefined Behavior. As in the case of infinite Davies machines (Davies in Br J Philos Sci 52(4):671–682, 2001), our examples are free from infinite masses, infinite velocities, infinite forces, etc. Only infinite divisibility of space and time is assumed. Thus, the infinite devices considered are possible in a Newtonian Universe and they do not conflict with Newtonian mechanics. Note that the classical Thomson’s lamp paradox leads to infinite velocities which may not be producible in acceptable models of Newtonian mechanics. Finally, it is shown that the “paradox of predictability” is similar to the inverted Thomson’s lamp paradox.

15 days free trial to Access Article
Programming infinite machines

arXiv: Logic in Computer Science, 2018

Co-Authors: Anton A. Kutsenko

Abstract:

For infinite machines which are free from the classical Thompson's lamp paradox we show that they are not free from its inverted version. We provide a program for infinite machines and an infinite mechanism which simulate this paradox. While their finite analogs work predictably, the program and the infinite mechanism demonstrate an Undefined Behavior. As in the case of infinite Davies's machines, our examples are free from infinite masses, infinite velocities, infinite forces, etc. Only infinite divisibility of space and timeis assumed. Thus, the considered infinite devices are possible in a continuous Newtonian Universe and they do not conflict with continuous Newtonian mechanics. Some possible applications to the analysis of the Navier-Stokes equations are discussed.

15 days free trial to Access Article

Discover everything there is to know about the scientific topic Undefined Behavior with ideXlab!

John Regehr - One of the best experts on this subject based on the ideXlab platform.

Random testing for C and C++ compilers with YARPGen

Practical verification of peephole optimizations with Alive

Taming Undefined Behavior in LLVM

PLDI - Taming Undefined Behavior in LLVM

Provably correct peephole optimizations with alive

Grigore Rosu - One of the best experts on this subject based on the ideXlab platform.

Dealing With C's Original Sin

RV - Runtime Verification at Work: A Tutorial

CAV (1) - RV-match: Practical semantics-based program analysis

PLDI - Defining the Undefinedness of C

Nuno P. Lopes - One of the best experts on this subject based on the ideXlab platform.

Practical verification of peephole optimizations with Alive

Taming Undefined Behavior in LLVM

PLDI - Taming Undefined Behavior in LLVM

Provably correct peephole optimizations with alive

PLDI - Provably correct peephole optimizations with alive

Chris Hathhorn - One of the best experts on this subject based on the ideXlab platform.

Dealing With C's Original Sin

RV - Runtime Verification at Work: A Tutorial

CAV (1) - RV-match: Practical semantics-based program analysis

PLDI - Defining the Undefinedness of C

Defining the Undefinedness of C11 : practical semantics-based program analysis

Anton A. Kutsenko - One of the best experts on this subject based on the ideXlab platform.

Programming Infinite Machines

Programming infinite machines