◀︎

D3939R0: tail recursion enforcement

This paper propose standardization of an attribute (or other mechanism) to enforce tail recursion on a return expression, to make this type of code portable and usable for unoptimized builds and also in constant evaluator.

Introduction

Attribute [[clang::musttail]] in clang and [[gnu::musttail]] in GCC are already existing practice to enforce compiler emitting tail recursion.

While C++ doesn't define notion of a system stack or recursion limits, generally a program can't have arbitrary deep recursion without running to portability problems.

In practice compilers provide a hard limit for the recursion depth only in the constant evaluator (in clang the option is set with -fconstexpr-depth, with default value 512).

Motivation


struct interpreter { 
  // ...
  using instruction_type = void(*)(interpreter &);
  using symbol_type = ...;
  using stack_type = std::stack<symbol_type>;
  
  stack_type stack{};
  
  symbol_type read_symbol();
  instruction_type read_instruction();
	
  constexpr void execute() {
    auto instr = read_instruction();
    [[musttail]] return instr(*this); // BEFORE: crash in runtime with unoptimized build,
                                      // unable to interpret longer input with a regex library in constant evaluator
                                      // AFTER: safe, portable, debuggable, and quick
  }
};

namespace bytecode {
  constexpr void nop(interpreter & state) {
    [[musttail]] return state.read_instruction()(state); // BEFORE: maybe a crash
                                                         // AFTER: fine
  }

  constexpr void push(interpreter & state) {
    state.stack.push(state.read_symbol());
    [[musttail]] return state.read_instruction()(state); // BEFORE: maybe a crash
                                                         // AFTER: fine
  }
}

One can argue that there are different approaches how to write equivalent code. Computed go-to is not portable, change the code into a loop reading the instruction and then call the instruction, return back to loop, introduce another another control flows and calling / returning from a function. And we can't depend on optimized to optimize it without checking.

Security & safety

MISRA C and C++ forbids recursive direct and indirect calls of a function itself, main concern is unbounded memory usage and potential application crash. In some cases a recursion is the most elegant solution and changing the code to be non-recursive can lead to worse readable code.

This proposal provides quarantee of bounded memory usage in such cases. When this is needed usually a manual inspection of generated code is needed that the tail recursion is present, but this provides a false sense of safety as once you forget to check after changing compiler or platform, your code suddenly is insecure in a sneaky way.

Crash in optimized build due avoidable stack overflow

As mentioned in previous subsection, currently compilers are very good at optimizing tail-recursion, but this optimization is obviously not present in non-optimized builds, leading to different behavior.

In both existing implementations when the attribute [[gnu::musttail]] or [[clang::musttail]] is used the resulting code contains no recursion.

Portability

All modern hardware platforms provide tail-recursion support, this is not a case of some embed platforms with otherwise very limited CPUs. These platforms usually are not support fully conforming C++ already, like exceptions. certainly developers on these platform won't be upset that they can't use tail-recursion optimized calls.

As a developer I prefer to be informed about my inability to use the tail recursion over false positive silent failure to provide memory constraints.

What modern platform don't support tail-recursion call optimization?

WASM didn't support tail-recursion until recently, and now is a baseline support since January 2025. Another modern platform not supporting tail recursion is JVM.

There is a related GCC bug 94794 from Iain when he was implementing coroutine symmetric transfer which is somehow similar thing.

Constant evaluator using same code path

Let's pretend we don't mind not having the guarantees in runtime as we always optimize our programs and never use debug build. There is still no way around recursion limit in constant evaluator, as current implementations use recursive AST walking and this approach is using a lot of memory and somehow deep recursion is simple impossible.

We already established we don't want users to write code twice just to have it compatible with constant evaluator as writing two code paths means larger chance of bugs in the code. With more usage of reflection it will be even more common situation.

In the example I sketch how an internal interpreter of a regular expression engine could look like, and it can reach even higher speed than current existing CTRE, but I can't change CTRE's internal to do it, as I'm hitting the recursion limits and I want keep the code portable.

Syntax options

Clang and GCC used attribute syntax [[gnu::musttail]], [[clang::musttail]], and older __attribute__((musttail)). This attribute is put before return keyword and not on the call itself.

First possible option is to provide a standard attribute [[musttail]] (or any form EWG wishes). Potential problem is the ignorability of attributes. So other approaches are possible:

Keyword

return __must_tail fnc() or __must_tail return fnc(). Or even new type of return: tailexpr_return fnc().

An interesting idea from a fellow committee member is return goto fnc() as technically tail call is a jump.

Annotation

C++26's gave us annotation syntax [[=value]] to add unignorable annotation. So maybe this is EWG's prefered way to ask LEWG to create a library constant and get:

[[=std::must_tail]] return fnc()

Design & behavior

Compiler supporting this paper's enforcement mechanism (in any form), must guarantee tail-recursion call for only call expression directly in its return statement, which means:

  • the call replaces current stack frame
  • current stack frames and all objects in it must be destroyed or be trivially destroyable
  • the call must not reference any object inside current stack frame
  • the call must not happen in a try handler if the callee is not marked noexcept

If compiler can make sure the tail-recursion can be provided the compilation must end with an error.

Compilers can provide implementation specific requirements, in practice similar or same call signature can be required.

Lifetime


struct S { ~S(){} };
void f() {
  S s;
  [[musttail]] return f(); // ERROR: object `s` is not trivially destructible and is still alive
}

Attribute or annotation on a call without return

Additional option is to allow the annotation / attribute on a last call inside void returning function without writing return keyword:


void f() {
  [[musttail]] g(); // ??
}

Currently clang doesn't support it and errors out, GCC on other side will happily insert a call and silently continues ( bug 123059 reported here).

Implementation

As said before, there is existing implementation in production GCC and Clang and it's used.

Constant implementation

I have already implemented a prototype of this functionality in Clang's constant evaluation (TODO: find which branch contains it :).

Basic functionality is to use existing [[clang::musttail]] strict checking facility and then just detect the attribute's presence and when the only call inside ReturnStmt is reached, prepare everything, and store prepared call in a special slot (one for whole constant evaluator) which can be implemented at worse with a std::function or something similar, or as just preallocated slot.

Then when the function call is stored, instead of evaluating, just return to current's function caller, and when all cleaning is done, and then in HandleFunctionCall where we are preparing to return to parent's AST node, we just loop as long there is something in the prepared call slot, and evaluate the call. This basically transform recursion into a loop in the interpreter.

Constexpr steps are still counted so the constant evaluator won't run forever, but this limit is much larger, as this is no longer bounded by system's stack size.

Wording

Will be added after we settle on syntax.

Feature test macro

15.11 Predefined macro names [cpp.predefined]

__cpp_enforced_tail_recursion 2026??
draft