December 2012

Best of 2012: Clang @ GoingNative -- Chandler Carruth

As we're all unwinding at the end of the year, your blog staff would like to recommend watching again one of the year's most hilariously entertaining and deeply informative talks about the world's hottest new C++ compiler -- a holiday video that's genuinely fun:

Clang: Defending C++ from Murphy's Million Monkeys

by Chandler Carruth
at GoingNative 2012

[... Clang] provides fantastic diagnostics, static and dynamic program analysis, advanced rewriting and refactoring functionality, and language extensibility. Together with improvements to the language in C++11 these help programmers cope with today's code and write better code tomorrow. Clang also makes it easier than ever before to evolve and evaluate new language features and extensions to make C++ itself better.

Through this talk I'll give some background on the Clang compiler, what it does today to make writing C++ better, and how we're using it to help shape the C++ language going forward.

Even your non-technical family and friends will probably enjoy the first five minutes.

Thanks again to Chandler, GoingNative, and Channel 9 for an engaging and illuminating presentation.

 

Clang 3.2 released

Chris Lattner has announced the release of version 3.2 of the Clang C++ compiler, as part of LLVM 3.2. Read the release notes here.

Highlights include:

  • Improved error and warning diagnostics.
  • Doxygen-like documentation comment support.
  • Improved Python bindings.
  • Standard C11 _Alignof support
  • The same strong Standard C++11 support as in the spring release, Clang 3.1. See the C++98 and C++11 Support in Clang status page for details.

Quick Q: Simultaneously iterating over and modifying an unordered_set? -- StackOverflow

From StackOverflow [c++11]:

Consider the following code:

unordered_set<T> S = ...;
for (const auto& x : S)
   if (...)
       S.insert(...);

This is broken correct? If we insert something into S then the iterators may be invalidated (due to a rehash), which will break the range-for because under the hood it is using S.begin ... S.end.

Is there some pattern to deal with this?

Continue reading...

Three Optimization Tips for C++ -- Andrei Alexandrescu

For your Friday viewing pleasure, from the always-engaging master, Andrei Alexandrescu:

Three Optimization Tips for C++ (video) (PDF slides)

Andrei Alexandrescu

Facebook NYC, December 4, 2012

 

From an early slide in the talk: "Intuition:

  • Ignores aspects of a complex reality
  • Makes narrow/wrong/obsolete assumptions

[...]

  • The only good intuition: "I should time this."

​Enjoy.

(Bonus: The tips are also applicable in other languages. They just happen to be easier to use and control in C++.)

An implementation of generic lambdas (request for feedback) -- Faisal Vali

This week, Faisal Vali shared an initial "alpha" implementation of generic lambdas in Clang. Faisal is the lead author of the proposal (N3418), with Herb Sutter and Dave Abrahams.

To read and participate in the active discussion, see the message thread on std-proposals.

Here is a copy of Faisal's announcement:

Motivated by the positive feedback we received regarding Generic Lambdas (during the October 2012 ISO C++ Standards Meeting in Portland), I have implemented a reasonably complete (unless I am missing something obvious) implementation (http://faisalv.github.com/clang-glambda/) of most of the proposal (named lambda syntax for functions has not been attempted) using a fork of Clang.

 

We would like to encourage developers who are interested in this feature, to either compile the code or download the binaries (sorry, only Windows for now) and play with the implementation and provide us with feedback so that we can incorporate it within our next revision of the document.

The following link: http://faisalv.github.com/clang-glambda/, has some instructions (towards the bottom) on how to compile the code or use the binaries for Windows.

 

Any and all constructive feedback will be greatly appreciated.

The current version (12/2012) implements subproposals 2.1, 2.2, 2.3 and 2.5.

 

2.1 Allow the type-specifier within a parameter declaration of a lambda to be auto (i.e. auto is mandatory)

auto Sum = [](auto a, decltype(a) b) { return a + b; };
int i = Sum(3, 4);
double d = Sum(3.14, 2.77);

 

2.2 Allow the use of familiar template syntax in lambda expressions

auto NumElements = []<int N>(auto (&a)[N]) { return N; };
int arri[]{1, 2, 3};
double arrd[]{3.14, 2.77, 6.626};
auto total = NumElements(arri) + NumElements(arrd);

 

2.3 Permit a lambda body to be an expression

int local = 10;
auto L = [&](auto a) a + ++local;

 

2.5 Autogenerate a conversion to function pointer in captureless generic lambdas

auto L = [](auto a, decltype(a) b) { return a + b; };
int (*fp)(int, int) = L;

 

Thank you and looking forward to the feedback!

Is C++11 uniform initialization a replacement for the old style syntax? -- Programmers.StackExchange

Here is a question from Programmer's StackExchange (tags [c++] and [c++11]) which most C++ developers, whatever their level, will ask as soon as they are introduced to the new Uniform Initialization syntax:

Is it recommended now to use uniform initialization in all cases? What should the general approach be for this new feature as far as coding style goes and general usage? What are some reasons to not use it?

The accepted answer clearly provides very good reasons to use this new syntax as much as possible (if your compiler supports it already): minimizing redundant typenames and avoiding the Most Vexing Parse. It also points some reasons to not use this syntax, in particular in case you're trying to call a standard container constructor.

There is one other reason not to:

std::vector<int> v{100};

What does this do? It could create a vector<int> with one hundred default-constructed items. Or it could create a vector<int> with one item whose value is 100. ... In actuality, it does the latter.

Read the full QA.

Read Stroustrup's FAQ about Uniform Initialization syntax.

Bjarne Stroustrup interview: From the Foundation and C++11, to portability and the C++ resurgence

Last Monday, Bjarne Stroustrup gave a live interview to David Intersimone of Embarcadero to kick off their CodeRage 7 conference. Bjarne discusss the new Standard C++ Foundation, the ISO C++11 standard, new language features, how C++11 builds on C++’s strengths, application portability, and C++’s ubiquitous presence in the markets.

The video is now on YouTube. Enjoy.

 

Constexpr Unions -- Andrzej KrzemieĊ„ski

On using constexpr and unions, with insights into the design of the currently-proposed std::optional<T>:

Constexpr Unions

by Andrzej Krzemieński

I assume you are already familiar with constexpr functions. (If not, see a short introduction here.) [Ed.: We linked to that article recently, so you've seen it if you've been following isocpp.org.]

In this post I wanted to share my experience with using unions in constant expressions. Unions are not very popular due to type-safety hole they open, but they offer some capabilities that I found priceless when working with Fernando Cacciola on std::optional proposal.

Continue reading...

Quick Q: Why does std::map not have a const accessor? -- StackOverflow

To help balance out "introductory," "intermediate," and "advanced" content, we're trying an experiment to highlight interesting bite-sized tidbits that are in the first two categories.

Here's today's tidbit from StackOverflow's [c++11] tag:

The declaration for the [] operator on a std::map is this:

T& operator[] ( const key_type& x );

Is there a reason it isn't this?

T& operator[] ( const key_type& x );

const T& operator[] const ( const key_type& x );

Because that would be incredibly useful any time you need to access a member map in a const method.

As the two top answers show, the answer is different in C++98 and C++11, and C++11 is where it's "at" (pardon).

Read answers on StackOverflow...

A C++11 Name-Lookup Problem: a Long-Term Solution and a Temporary Work-Around

A C++11 Name-Lookup Problem: a Long-Term Solution and a Temporary Work-Around

by Eric Niebler

[This article has been changed since it was first published. New information from the standardization committee, which is working to fix the issue, is included.]

[Disclaimer: This is an advanced article.]

Both beginner programmers and experienced library developers got new tools when C++11 came out. As a library developer, I was particularly excited by variadic templates, an advanced feature that makes it possible to write a function that takes an arbitrary number of arguments. As it turns out, the nature of the language feature, together with the existing name lookup rules of C++, can steer you down a dark alley of frustration. This article sheds some light on the issue and shows a light at the end of the tunnel in the form of a language simplification coming in C++14. Until then, there is a workaround. But first, a word about variadic templates.

Imagine you're trying to write a sum() function that takes an arbitrary number of arguments and adds them all together. In pseudo-code, you want something like this:

// pseudo-code
auto sum( t ) { return t; }
auto sum( t, ...u ) { return t + sum( u... ); }

You have two overloads: one that takes a single argument and returns it; and the other that takes multiple arguments and adds the first to the resulting of summing the rest. Simple enough, right? With variadic templates, this isn't too hard, although the syntax takes some getting used to. Let's assume we're summing ints for now:

// real code
int sum( int i )
{
  return i;
}

template< typename ...Ints >
int sum( int i, Ints ...is )
{
  return i + sum( is... );
}

OK, that wasn't so bad. It's a little strange that we had to parameterize all the arguments after the first. We would have preferred a signature like "int sum(int i, int... is)". Variadic templates don't let us do that, but it's not a huge deal.

The other thing we notice is that variadic templates can only ever expand into a comma-separated list of things. Above, we expand is... into the (comma-separated) argument list of a recursive invocation of sum(). If we could expand the argument pack into a list of things separated with '+', then we could have done all this in one shot. Instead, we're forced by the nature of variadic templates to implement sum() recursively. This may not seem like a big deal right now, but is.

Let's check to see that our sum() function works like we expect:

int i = sum( 1 );       // OK, i == 1
int j = sum( 1, 2 );    // OK, j == 3
int k = sum( 1, 2, 3 ); // OK, k == 6

Now we're feeling bold, and we'd like to generalize. Maybe we want to sum longs, or even concatenate std::strings! We need to make two changes: replace int with a template parameter, and automatically deduce the result type of the addition. We want to use type deduction so that if we, say, add an int and a long, the return type is long as it should be. C++11 makes this possible as well, although again the syntax takes some getting used to:

template< typename T >
T sum( T t )
{
  return t;
}

template< typename T, typename ...Us >
auto sum( T t, Us ...us ) -> decltype( t + sum( us... ) )
{
  return t + sum( us... );
}

Above, we use decltype to deduce the return type, and we use the new trailing return type declaration syntax. The code duplication in the trailing return type declaration is a code smell, but we can live with it for now. (FWIW, the standardization committee smells it too, and wants to clean this up. Read to the end for a glimpse of the smell-free future of C++.) Let's give it a test drive:

auto i = sum( 1 );                                     // OK, i == 1
auto j = sum( 1, 2 );                                  // OK, i == 3
auto greeting = sum( std::string("hello "), "world" ); // Cool! greeting == "hello world"
auto k = sum( 1, 2, 3 );                               // ERROR! Doesn't compile.

The last line gives us the following error on Clang:

>  main.cpp:25:14: error: no matching function for call to 'sum'
>      auto k = sum( 1, 2, 3 ); // ERROR! Doesn't compile.
>               ^~~
>  main.cpp:16:6: note: candidate template ignored: substitution
>  failure [with T = int, Us = <int, int>]: call to function
>  'sum' that is neither visible in the template definition nor
>  found by argument-dependent lookup
>  auto sum( T t, Us ...us ) -> decltype( t + sum( us... ) )
>       ^                                     ~~~
>  main.cpp:11:3: note: candidate function template not viable:
>  requires single argument 't', but 3 arguments were provided
>  T sum( T t )
>    ^
>  1 error generated.

What's going on here? Calls to sum() work for one argument and two arguments, but fail for three. In the error message, we see that Clang really tried to call the variadic overload of sum() but failed because "'sum' ... is neither visible in the template definition nor found by argument-dependent lookup." Huh?

I'll skip to the chase and tell you what's going on (though you might take a moment to figure it out for yourself before reading further). Look again at the definition of the variadic sum() overload:

template< typename T, typename ...Us >
auto sum( T t, Us ...us ) -> decltype( t + sum( us... ) )
{
  return t + sum( us... );
}

sum() appears twice, once in the function declaration and once in the function body. It's the one in the function declaration that's causing the problem. Consider that the above is (roughly) equivalent to the following:

template< typename T, typename ...Us >
decltype( T() + sum( Us()... ) ) sum( T t, Us ...us )
{
  return t + sum( us... );
}

The main difference is that we moved the return type from the back to the front, but this makes it a little more clear that in the return type the second sum() overload hasn't been seen yet! Even in the trailing return type formulation, the compiler pretends like it hasn't seen the second overload of sum() yet. And if it hasn't seen it, it can't call it. Picky, picky.

Name Lookup for Dummies

Name lookup happens in two phases. Let's just consider what happens in the case of the non-qualified function call in the return type of sum(). The compiler builds a set of overloads by adding a dash of functions from lookup phase 1 and then another dash from lookup phase 2. In phase 1, which happens when sum() is first parsed, functions with the right name that are in scope are tossed in the overload bucket. In our case, that's the single-argument overload of sum(), since that's the only one that is currently in scope. This phase happens before the argument types are known. Phase 2 is the argument-dependent lookup (ADL) phase and happens when the argument types are known; we check the associated namespaces of the arguments and look for any sum() overloads in those namespaces. None are found, because the arguments are of type int which has no associated namespaces. Bummer, our bucket only has only one overload of sum() in it, and it's not the one we need.

If you want to see something interesting about the way ADL works, try this. Define "struct String : std::string {};" at global scope. Now do this:

auto k = sum( "", String(), "" ); // OK!

Why does it compile when one of the arguments is a String? Because now ADL is forced to look in String's associated namespaces. Since String is defined in the global scope, ADL must check the global scope during the phase 2, and presto! there's another overload of sum() hiding in plain sight.

What's curious about our case is that we're trying to use a function in the declaration of that same function. There's just no way to get a function in scope until after its declaration, so trying to get phase 1 lookup to find it is a lost cause.

Recall what we said earlier about variadic templates: recursive algorithms are pretty much the only way to get non-trivial things done with them. There's no way around it. If the return type depends on the type returned by the recursive invocation -- and it often does -- we will run into this name lookup problem. Every time. The committee is already on its way with a language simplification that will make this problem go away (more about that below). For now, we have to find a workaround. Help us, phase 2 lookup. You're our only hope.

Relax, It's Just a Phase

How do we get name lookup to cool its jets and wait for phase 2? Let's back up a second and ask ourselves why lookup is done in two phases in the first place. When the compiler first sees the definition of a function template, some things are known and some aren't. For instance, the name of the function is known, but not the types of the arguments. Anything that depends on the template parameters is called, er, dependent. Name lookup in dependent types and dependent expressions happens during phase 2. So, in order to make the sum() function call resolve correctly, we need to make it depend on a template parameter.

Once again skipping ahead a bit, here is the hackish workaround that I've found. Explanation to follow.

namespace detail
{
  // The implementation of sum() is in a function object,
  // which is really just a holder for overloads.
  struct sum_impl
  {
    template< typename T >
    T operator()( T t ) const
    {
      return t;
    }

    // Sneaky trick below to make the function call lookup
    // happen in phase 2 instead of phase 1:
    template< typename T, typename ...Us, typename Impl = sum_impl >
    auto operator()( T t, Us ...us ) const -> decltype( t + Impl()( us... ) )
    {
      return t + Impl()( us... );
    }
  };
}

constexpr detail::sum_impl sum{};

In the above code, it looks like we simply moved the implementation of the sum() function into a sum_impl function object. But look closely, because there's more going on here than at first glance. The magic happens here:

template< typename T, typename ...Us, typename Impl = sum_impl >
// ----------------------- MAGIC HERE ^^^^^^^^^^^^^^^^^^^^^^^^
auto operator()( T t, Us ...us ) const -> decltype( t + Impl()( us... ) )

The "typename Impl = sum_impl" is a default function template parameter, new in C++11. If you don't specify it explicitly, Impl defaults to sum_impl. We won't specify it explicitly, but when the compiler first sees this code (in phase 1), it doesn't know that. Instead, when the compiler sees this...

decltype( t + Impl()( us... ) )

... the compiler must be patient and wait until phase 2, when it knows what Impl is, before it can resolve the function call. And that's precisely the point.

Sum-mary

With the above trick, our sum() function now compiles for three or more arguments. In short, when the compiler isn't finding the thing it should be finding, and you're yelling at the computer, "BUT IT'S RIGHT THERE!!!", pause and consider that maybe you too have been bitten by the phases of name lookup. Try the default function template parameter trick to make your expressions dependent.

A Look to C++14

The good news is that the problem described above is about to disappear. The C++ committee is in the process of approving return type deduction for regular functions the same way as it already works for lambdas, with the happy result that you won't even have to bother to write sum's return type at all (including not having to write decltype) because it will be deduced from the return statement in the body, at which point everything is happily in scope and Just Works.

In particular, the variadic sum() will be simply:

template< typename T, typename ...Us >
auto sum( T t, Us ...us )         // look ma, no "->"
{
  return t + sum( us... );
}

See Jason Merrill's paper N3386 for details. This proposal is on track to be adopted for C++14, and has already been implemented in the current development branch of GCC; to try it out, get the current development snapshot of GCC (or wait for GCC 4.8 in a few months) and compile with the -std=c++1y switch.

Automatic return type deduction, already becoming available in GCC and soon other compilers, is another example of how C++ continues to become simpler and remove the need for knowing about old-C++ corner cases. As simpler alternatives are added, we can just stop using the older complex ways in our new code, even while enjoying C++'s great backward compatibility for our older existing code.