Document number: | P2784R0 | |
---|---|---|
Date: | 2023-02-09 | |
Audience: | SG21 | |
Reply-to: | Andrzej Krzemieński <akrzemi1 at gmail dot com> |
This paper explores the possibility of not stopping the program after a contract violation has been detected at run-time. This is a feature to be added to the contract support framework.
Contract annotations are tools for expressing what constitutes a program that is running against its specification ("specification" understood as information provided in contract annotations). If a runtime check based on such an annotation returns `false`, we can be sure that this is happening right now: the program runs against the specified intentions. If a program is allowed to continue at this point, one of the likely consequences is that it will either crash or start behaving in an unpredictable way (e.g. by hitting undefined behavior at the language level):

    int * p = 0;
    [[assert: p != 0]];
    return *p > 3; // bad things!

For this reason papers like [P2388R4] suggest immediately halting the program at this point which, while being a harsh reaction, gives a guaranteed repeatable result with an upper limit on the possible consequences.
While this seems a good default, there are situations where stopping the program this way is not the optimum solution. These include:
`main()` as such submodule. We may know that a bug in function `main()` may not affect the correctness of the second call to `main()` if we somehow managed to restart it from within the program.
The first three cases have one thing in common: while we do not want the program to halt, it is allowed, and even desired, to resume the program execution from a different place than where the contract violation was observed. In this paper we only focus on these use cases. This sounds like a job for the exception handling mechanism. We consider it as one of the options, however this paper also proposes another.
The problem of transferring control to a different location upon "failure" is addressed in C++ by the exception handling mechanism. This mechanism blends nicely with other parts of the language by:
`if`-statements.
This is why [P0542R5] allowed a "mode" where, upon a detected contract violation, a programmer-installed violation handler is invoked, and this handler, as one of the possible options, can throw an exception. This way we get a two-fold guarantee:
This is also what [P2698R0] proposes under a new translation mode Eval_and_throw.
Using exceptions to recover from contract violations, however, is problematic for a number of reasons.
The first issue is conceptual. Exceptions were meant to handle the situation where a correct program responds in an exceptional way to an exceptional situation. This exceptional recovery still executes some code (primarily destructors) and the executed code also assumes that the program is correct and it has its own preconditions.
Consider a class with an invariant expressed in the source code via preconditions.

    class State {
      std::vector<Column*> _columns;
      unsigned _theColumn;

    public:
      bool invariant() const noexcept {
        return _theColumn < _columns.size() && _columns[_theColumn] != nullptr;
      }

      void alter() [[pre: invariant()]] [[post: invariant()]];
      ~State() [[pre: invariant()]] { delete _columns[_theColumn]; }
    };

If upon calling `state.alter()` its precondition is violated and this gets turned into an exception, during the stack unwinding we will need to call the destructor of `state`. The destructor also has a precondition which would also be violated. This would trigger a second exception to be thrown, which in C++ normally results in calling `std::terminate`.
We could put it another way: destructors also have preconditions, and they may call other functions with contract annotations. If any of these checks fail, we get a throwing destructor, which likely aborts the program.
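The double-violation scenario can be approximated in today's C++ without contract syntax. The following is only a sketch under stated assumptions: `check_or_throw` is a hypothetical stand-in for a contract check in a throwing mode, and `State` is simplified to two integers. Instead of throwing from the destructor (which would end in `std::terminate`), the destructor merely records that its own check would have failed while an exception was already in flight.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical stand-in for a contract check in a throwing mode.
// Not a real API: it only mimics "violation becomes an exception".
inline void check_or_throw(bool condition) {
    if (!condition) throw std::logic_error("contract violation");
}

class State {
    int _theColumn;             // simplified: index into a conceptual column list
    int _columnCount;           // simplified: number of columns
    bool* _violationSeenInDtor; // lets the caller observe the destructor

public:
    State(int column, int count, bool* flag)
        : _theColumn(column), _columnCount(count), _violationSeenInDtor(flag) {}

    bool invariant() const noexcept { return _theColumn < _columnCount; }

    void alter() { check_or_throw(invariant()); /* real work omitted */ }

    ~State() {
        // The destructor's precondition is the same invariant. Throwing here
        // while another exception is in flight would call std::terminate,
        // so this sketch only records that the check would have failed.
        if (std::uncaught_exceptions() > 0 && !invariant())
            *_violationSeenInDtor = true;
    }
};

// Returns true if the destructor ran during unwinding with a broken invariant.
bool second_violation_during_unwinding() {
    bool seen = false;
    try {
        State state(5, 3, &seen); // invariant already broken: 5 >= 3
        state.alter();            // "precondition violation" becomes an exception
    } catch (const std::logic_error&) {
        // reached only because the destructor did not throw
    }
    return seen;
}
```

Calling `second_violation_during_unwinding()` reports that the destructor was entered during stack unwinding with its invariant already broken, which is exactly the point at which a throwing destructor precondition would terminate the program.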
The second issue is more pragmatic: the interaction with `noexcept` functions, which can also have contract annotations. If contract checks are allowed to throw, we have a new question to answer, one that [P2388R4] doesn't have to answer: are contract conditions evaluated inside or outside the function? This now becomes a visible property:

    void fun() noexcept [[pre: false]];

    void test() {
      fun(); // ok if precondition evaluated outside the function
             // std::terminate() if precondition evaluated inside the function
    }

Note that even if we said that the precondition must be evaluated outside the function, this does not solve the problem of reporting contract violations via throw from nested `noexcept` functions:

    void fun() noexcept [[pre: false]];
    void gun() noexcept { fun(); }

    void test() {
      gun(); // std::terminate()

      void(*pf)() noexcept = &fun;
      pf(); // std::terminate()
    }

And in fact the issue is bigger than just `noexcept` functions. What should operator `noexcept` return?

    void fun() noexcept [[pre: false]];
    constexpr bool mystery = noexcept(fun());

Remember that operator `noexcept` tests full expressions along with invisible things like conversions and destructors. Should its value be dependent on the translation mode?
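To make the "invisible things" concrete, here is a small sketch in today's C++ (no contract syntax) showing how operator `noexcept` already looks past the called function's own exception specification. The names `plain`, `Wrapper`, `takes_wrapper` and `may_throw` are illustrative only and do not come from any proposal.

```cpp
#include <cassert>

void plain() noexcept {}            // declared noexcept, takes no arguments

struct Wrapper {
    Wrapper(int) {}                 // converting constructor, not noexcept
};
void takes_wrapper(Wrapper) noexcept {}

void may_throw() {}                 // no exception specification

// The noexcept operator inspects the whole expression without evaluating it.
constexpr bool plain_is_nothrow      = noexcept(plain());
constexpr bool conversion_is_nothrow = noexcept(takes_wrapper(1));
constexpr bool may_throw_is_nothrow  = noexcept(may_throw());

static_assert(plain_is_nothrow, "a noexcept function called directly");
static_assert(!conversion_is_nothrow,
              "the implicit int -> Wrapper conversion may throw");
static_assert(!may_throw_is_nothrow, "no exception specification");
```

All three values are fixed at compile time. If a throwing contract check were considered part of the call expression, a value such as `plain_is_nothrow` would have to either ignore the check or change with the translation mode, which is the question raised above.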
The scope of the problem is even wider. It is not limited to functions declared `noexcept` but extends to functions that provide a no-fail guarantee, even if this is not statically checkable. This is how exception-safety (or failure-safety) guarantees work: you can only provide a strong (commit or rollback) guarantee when you know that some operations never throw. When upon contract violation they nonetheless start to throw, no function can provide the declared level of exception safety. One could argue that if a precondition is violated, by definition, no guarantee is provided. But on the other hand, one of the goals of contracts is to offer some guarantees even if the contract is violated.
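The dependence of the strong guarantee on a no-throw operation can be illustrated with the usual copy-then-commit idiom. This is a minimal sketch, assuming a made-up `Account` type and an `update` function that are not part of the paper; the validation failure on negative entries is likewise hypothetical and only serves to trigger the rollback path.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

struct Account {
    std::string owner;
    std::vector<int> history;
};

// The commit step below is only safe because swapping never throws.
static_assert(std::is_nothrow_swappable_v<Account>, "commit step must not throw");

// Strong guarantee: on exception, `acc` is left exactly as it was.
void update(Account& acc, const std::string& new_owner, int entry) {
    Account tmp = acc;                 // may throw; acc untouched
    tmp.owner = new_owner;             // may throw; acc untouched
    if (entry < 0)                     // hypothetical validation failure
        throw std::invalid_argument("negative entry");
    tmp.history.push_back(entry);      // may throw; acc untouched
    std::swap(acc, tmp);               // commit: relies on the no-throw swap
}

// Returns true if a failed update left the account unchanged.
bool failed_update_rolls_back() {
    Account acc{"alice", {1, 2}};
    try {
        update(acc, "bob", -1);
    } catch (const std::invalid_argument&) {
        // the update was rejected before the commit step
    }
    return acc.owner == "alice" && acc.history == std::vector<int>{1, 2};
}
```

The rollback works only because the final `std::swap` is known not to throw. If a contract check inside that commit step could start throwing, `update` could no longer promise "commit or rollback", regardless of how carefully the rest of it is written.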
The third issue stems from the fact that a huge fraction of C++ programs is compiled with exceptions disabled. Yet, these programs have the same problem to solve: how not to halt the program upon contract violation.
As a potential solution to the above problem we propose a mechanism that is harsher than stack unwinding, but softer than `std::abort()`. In fact we are proposing a stricter and simpler version of the `setjmp`/`longjmp` mechanism. It is composed of two proposed Standard Library functions with special powers:
template <std::invocable F> void abortable_component(F&& f);
This invokes the passed function `f` in almost a regular way, except that it is treated as "being executed in a separate component" as described below. Function `std::abortable_component` is exception-neutral: whatever the evaluation of `f()` throws is thrown out of function `std::abortable_component`. The goal of this function is to tell the compiler what the programmer considers a component boundary.
[[noreturn]] void abort_component() noexcept;
Calling this function initiates the process of leaving the function call stack, without calling destructors of automatic objects and function parameters, until a component boundary, indicated by `std::abortable_component`, is reached. Then the program resumes just after the call to `std::abortable_component`. If no component boundary is found in the call stack then `std::abort` is called. Example:

    struct Guard {
      ~Guard() { std::printf("A"); }
    };

    int fun() {
      Guard g;
      std::abort_component(); // (2) abort sequence starts
      std::printf("B");       // (3) this is skipped, "B" is never printed
    }                         // (4) destructor is skipped, "A" is never printed

    int main() {
      std::abortable_component(&fun); // (1) launching `fun` as a subcomponent
      std::printf("C");               // (5) getting out of subcomponent, "C" is printed
    }

We could say that this mechanism is similar to stack unwinding except that:
Alternatively, we could say that this is like `setjmp`/`longjmp`, except that the proposed approach is more structured: subcomponents have to nest. There is also no way to convey any information other than the fact that we are aborting. And there is no undefined behavior related to skipping destructors.
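The comparison with `setjmp`/`longjmp` can be made tangible with a rough emulation in today's C++. This is only a sketch and not the proposed facility: the `sketch` namespace and the `current_boundary` variable are assumptions made for illustration, and, unlike the proposal, jumping over a frame that holds an automatic object with a non-trivial destructor is undefined behavior here. The demo therefore keeps its state in a namespace-scope variable and uses only trivially destructible locals.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstdlib>

namespace sketch {

// Innermost active component boundary of the current thread, if any.
inline thread_local std::jmp_buf* current_boundary = nullptr;

template <typename F>
void abortable_component(F&& f) {
    std::jmp_buf here;
    std::jmp_buf* const outer = current_boundary; // never modified after setjmp
    if (setjmp(here) == 0) {
        current_boundary = &here;
        try {
            f();
        } catch (...) {
            current_boundary = outer; // exception-neutral: restore and rethrow
            throw;
        }
    }
    // Reached after a normal return and after an abort alike.
    current_boundary = outer;
}

[[noreturn]] inline void abort_component() noexcept {
    if (current_boundary == nullptr) std::abort(); // no boundary on the stack
    std::longjmp(*current_boundary, 1);
}

} // namespace sketch

int last_step = 0; // namespace scope, so its value is well defined after the jump

void inner_component() {
    last_step = 2;
    sketch::abort_component();
    last_step = 3; // skipped
}

void outer_component() {
    last_step = 1;
    sketch::abortable_component(&inner_component); // nested boundary
    last_step += 10; // runs: the inner abort stops at the inner boundary
}

int run_demo() {
    last_step = 0;
    sketch::abortable_component(&outer_component);
    return last_step;
}
```

In this emulation the nesting rule falls out of the saved `outer` pointer: an abort always lands at the innermost boundary, and with no boundary on the stack the call degrades to `std::abort()`. What the emulation cannot provide is the guarantee that skipping destructors is well defined, which is the part that would need the "special powers" mentioned earlier.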
As one can observe, aborting a component can easily cause resource leaks, as destructors of automatic objects and function parameters are not executed. It may be more than just leaks. Not calling the destructor of `scoped_lock` can cause concurrency issues in other parts of the program.
However, we are talking about a tool for minimizing damage in a desperate situation:
The goal is no longer to get everything right, but to minimize damage. Continuing without cleanup can cause further bugs, but so can continuing after a detected bug. This will be a dangerous feature, not recommended to be used for other situations.
Putting `std::abortable_component` in the program would mean that the programmer considers it reasonably safe to continue the execution of the program from that point even if arbitrary parts of the called function were skipped. This could be an option when invoking a "plugin" that is an optional part of the program, experimental and not required to fulfil the main program task.
The other part of the interface, function `std::abort_component()`, would not even have to be exposed to programmers. For the purpose of the contract support framework it would be enough to say that in Eval_and_abort translation mode, when the predicate evaluates to `false`, the effect is as if `std::abort_component()` was called. We wouldn't need a third translation mode: the effect of not aborting would be achieved by putting `std::abortable_component` in your program. Thus, when you want your program not to abort upon contract violation, you have to indicate a place (or places) where it is safe to resume the program from.
This feature can be introduced after the MVP in a backward-compatible fashion.
For the MVP we can simply say that upon contract violation `std::abort()` is called. Then, after the MVP, we can change it to calling `std::abort_component()`, which is indistinguishable from `std::abort()` as long as you have no call to `std::abortable_component()` in the program.
Finally, the proposed interface is very modest: there is no information conveyed about the point of and the reason for calling `std::abort_component()`. There is no information whether we returned normally or via the abort from `std::abortable_component()`. The mechanism could be extended to satisfy these expectations. However, the goal of this paper is to show the main idea behind the feature.
Similarly, if this mechanism is used to handle contract violation, it could be combined with logging the information about the point of failure before calling `std::abort_component()`.
The use cases serviced by this feature.
Suppose function `solve()` evaluates one of the user-provided plugins. It may have a bug, but if it fails, even if it leaked some resources, the rest of the program, or other user plugins, may still work fine.

    int solve(); // user plugin

    bool call_user_plugin = true; // my flag for protecting against calling plugin twice, if it is proven buggy

    std::optional<int> fun() {
      std::optional<int> solution;
      if (call_user_plugin) {
        std::abortable_component([&]{ solution = solve(); });
        if (!solution)
          call_user_plugin = false;
      }
      return solution;
    }

Suppose that you run a plugin in a separate thread, and in case of failure, you only want to kill that thread.

    int solve(); // user plugin

    int main() {
      std::optional<int> solution;
      std::jthread t {[&]{
        try {
          std::abortable_component([&]{ solution = solve(); });
        }
        catch(...) {
          // swallow
        }
      }};
      // main thread ...
    }

Suppose you want to check, in a program executing unit tests, that function `Sqrt` has a precondition detecting negative inputs. This assumes the program is compiled in Eval_and_abort translation mode.

    auto test_precondition_of_sqrt() {
      bool function_finished = false;
      std::abortable_component([&]{
        (void)Sqrt(-1.0);
        function_finished = true;
      });
      EXPECT(!function_finished);
    }

Suppose that after a detected failure, you want to restart the program, but from within the program.

    int main() {
      bool finished = false;
      std::abortable_component([&]{ finished = main_program_loop(); });

      while (!finished) {
        std::abortable_component([&]{
          necessary_critical_cleanup();
          finished = main_program_loop();
        });
      }
    }

In this paper we demonstrated that there is at least one way to add support for not aborting the program upon detecting a contract violation (while at the same time not letting the code dependent on the declared contract execute) after the MVP, while retaining backwards compatibility. "After the MVP" may mean "still in the same release cycle". This solution does not require the introduction of a new translation mode. Other ways of addressing the same problem, also suitable for a post-MVP addition, are possible; for instance, the ability to install a custom violation handler where throwing an exception is one of the options.
Joshua Berne has reviewed this paper and contributed to its quality.