According to Hoyle...

C++0x Part 5: Rvalue References

macCompanion

February 2010

by Jonathan Hoyle

http://www.jonhoyle.com

Four months ago, we began our series on the upcoming changes to the C++ language. For those who missed the previous installments, feel free to visit these links:

            • C++0x Part 1: What is It (and Does It Even Matter)?
            • C++0x Part 2: A Step Forward
            • C++0x Part 3: Making Coding Easier
            • C++0x Part 4: Smart Pointers

We continue by examining another advanced feature of C++0x, namely Rvalue References.

In C, function parameters are always passed by value; that is, the function always receives a copy of the argument, never the caller's actual variable. To modify a variable in C, the caller must pass a pointer to it, and the function must then dereference that pointer to modify the data:

void foo(int valueParameter, int *pointerParameter)
{
    ++valueParameter;
    // parameter passed by value, so mods are local to this copy

    ++*pointerParameter;
    // dereferencing pointer, so mods are permanent

    ++pointerParameter;
    // pointer passed by value, so mods are local to this copy
    // (done last: the pointer no longer refers to the caller's int)
}
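To see the effect from the caller's side, here is a minimal sketch. The names incrementThroughPointer and callerView are hypothetical, chosen only for this illustration, and the code is written as C++ although the mechanism is plain C.

```cpp
#include <cassert>

// Hypothetical helper: the first parameter is a copy, the second is
// the address of the caller's variable.
void incrementThroughPointer(int valueParameter, int *pointerParameter)
{
    ++valueParameter;     // changes only the local copy
    ++*pointerParameter;  // changes the caller's variable
}

// Hypothetical demo: returns the sum of both variables as the caller
// sees them after the call.
int callerView()
{
    int a = 10;
    int b = 20;
    incrementThroughPointer(a, &b);  // the caller must pass the address explicitly
    assert(a == 10);                 // untouched: only a copy was incremented
    assert(b == 21);                 // modified through the pointer
    return a + b;
}
```

Note that the caller has to write &b at the call site, which is the extra ceremony that references remove.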

One of the powerful new features that C++ introduced over C was the reference, declared with the & symbol. Function parameters in C++ can be passed by reference, allowing the caller's data to be modified directly without the need for a pointer:

void foo(int valueParameter, int &referenceParameter)
{
    ++valueParameter;      // passed by val, mods local to this copy
    ++referenceParameter;  // passed by ref, mods are permanent
}

Non-const references must bind to lvalues, that is, expressions that name an object, such as variables. Rvalues, such as literals and temporary results, cannot be used:

int myIntA = 10;
int myIntB = 20;

foo(myIntA, myIntB); // myIntA stays at 10, myIntB becomes 21
foo(1, myIntA);      // 1 passed in by value, myIntA becomes 11
foo(myIntA, 1);      // Error: 1 is an rvalue and can't be passed
foo(0, myIntB + 1);  // Error: myIntB + 1 is an rvalue
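The lvalue/rvalue distinction can also be checked at compile time. The sketch below assumes a C++11 compiler and relies on the rule that decltype((expr)) yields a reference type when expr is an lvalue and a plain type when it is a temporary result. The function name lvalueCheckDemo is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

void foo(int valueParameter, int &referenceParameter)
{
    ++valueParameter;      // local copy only
    ++referenceParameter;  // caller's variable
}

// Hypothetical demo: packs the final values of both variables into one
// int (myIntA * 100 + myIntB) so they can be checked together.
int lvalueCheckDemo()
{
    int myIntA = 10;
    int myIntB = 20;

    // A named variable is an lvalue ...
    static_assert(std::is_lvalue_reference<decltype((myIntB))>::value,
                  "a named variable is an lvalue");
    // ... while an arithmetic result or a literal is not.
    static_assert(!std::is_reference<decltype((myIntB + 1))>::value,
                  "myIntB + 1 is a temporary result");
    static_assert(!std::is_reference<decltype((1))>::value,
                  "a literal is an rvalue");

    foo(myIntA, myIntB);  // myIntA stays at 10, myIntB becomes 21
    foo(1, myIntA);       // myIntA becomes 11
    return myIntA * 100 + myIntB;
}
```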

Occasionally, it is useful to pass a parameter by reference even when there is no desire to modify its contents. This is particularly true when a large class or struct is being passed to the function, and you wish to avoid creating a copy of the large object:

void foo(BigClass valueParam, const BigClass &constRefParam)
{
    // (assuming BigClass defines a non-const operator++)
    ++valueParam;     // passed by value, mods are temporary
    ++constRefParam;  // compiler error, cannot modify a const
}
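The saving can be made visible by counting copy constructions. The following is a sketch only: this BigClass is a hypothetical stand-in with a copy counter, and readByValue, readByConstRef, copiesMadeByValue and copiesMadeByConstRef are names invented for the illustration. It also shows that, unlike a plain reference, a const reference is allowed to bind to a temporary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a large class; counts its copy constructions.
struct BigClass
{
    static int copies;         // number of copy constructions so far
    int payload[1024];

    BigClass() : payload() {}  // zero-fills the array
    BigClass(const BigClass &other)
    {
        for (std::size_t i = 0; i < 1024; ++i)
            payload[i] = other.payload[i];
        ++copies;
    }
};

int BigClass::copies = 0;

int readByValue(BigClass valueParam)              { return valueParam.payload[0]; }
int readByConstRef(const BigClass &constRefParam) { return constRefParam.payload[0]; }

// Passing a named object by value invokes the copy constructor once.
int copiesMadeByValue()
{
    BigClass big;
    int before = BigClass::copies;
    readByValue(big);
    return BigClass::copies - before;
}

// Passing by const reference makes no copy, for a variable or a temporary.
int copiesMadeByConstRef()
{
    BigClass big;
    int before = BigClass::copies;
    readByConstRef(big);
    readByConstRef(BigClass());  // a const reference may bind to a temporary
    return BigClass::copies - before;
}
```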

In C++0x, a new type of reference is defined, that of an rvalue reference (the familiar type of reference from C++98 is now referred to as an lvalue reference). Rvalue references can bind to temporary data and act on it directly, without the need for a copy. The && symbol indicates that a reference is an rvalue reference:

void foo(int valueParam, int &lvalRefParam, int &&rvalRefParam)
{
    ++valueParam;    // passed by value: mods local to this copy
    ++lvalRefParam;  // lvalue ref: changes permanent
    ++rvalRefParam;  // rvalue ref: changes local without a copy
}

foo(0, myIntA, myIntB + 1);
// The temporary value myIntB + 1 is not copied; rvalRefParam binds to it directly
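Because lvalue and rvalue references are distinct types, a function can be overloaded on them, and the compiler picks the overload that matches the kind of argument. A minimal sketch, assuming a C++11 compiler; describe, bumpTemporary and describeDemo are hypothetical names:

```cpp
#include <cassert>
#include <string>

// Overloads distinguished only by the kind of reference they accept.
std::string describe(int &)  { return "lvalue"; }
std::string describe(int &&) { return "rvalue"; }

// Accepts temporaries only; bumpTemporary(someVariable) would not compile.
int bumpTemporary(int &&rvalRefParam)
{
    ++rvalRefParam;  // modifies the temporary itself, no copy involved
    return rvalRefParam;
}

// Hypothetical demo: records which overload each call selects.
std::string describeDemo()
{
    int myIntB = 20;
    return describe(myIntB) + "," + describe(myIntB + 1) + "," + describe(1);
}
```

This overload selection is the mechanism that lets a class offer both a copy constructor and a move constructor.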

One of the chief benefits of rvalue references is the ability to take advantage of Move Semantics, that is, moving data from variable to variable without copying. A class can define a Move Constructor instead of, or in addition to, a Copy Constructor like so:

// Class definition
class X
{
    public:
      X();            // Default Constructor
      X(const X &x); // Copy Constructor (lvalue ref)
      X(X &&x);       // Move Constructor (rvalue ref)
};

// Utility function returning X
X bar();

X   x1;            // Default construction of x1
X   x2(x1);        // x2 created as a copy of x1
X   x3(bar());     // bar() returns a temporary X, whose memory is
                   // moved into x3 (the compiler may even skip the move)
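The class X above only declares its constructors. As a rough sketch of what a move constructor typically does, here is a hypothetical Buffer class that owns a heap array: the copy constructor duplicates the array, while the move constructor takes over the source's pointer and leaves the source empty. Buffer and both demo functions are invented for this illustration. std::move (from <utility>) is used to treat a named variable as an rvalue; it is equivalent to a cast to Buffer &&.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class owning a heap-allocated array of ints.
class Buffer
{
public:
    explicit Buffer(std::size_t size)
        : size_(size), data_(new int[size]()) {}

    // Copy Constructor: allocates new storage and duplicates the contents.
    Buffer(const Buffer &other)
        : size_(other.size_), data_(new int[other.size_])
    {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Move Constructor: steals the pointer, then empties the source so
    // that its destructor does not free the storage a second time.
    Buffer(Buffer &&other)
        : size_(other.size_), data_(other.data_)
    {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    ~Buffer() { delete[] data_; }

    // Assignment is left out of this sketch.
    Buffer &operator=(const Buffer &) = delete;

    std::size_t size() const { return size_; }
    const int *data() const  { return data_; }

private:
    std::size_t size_;
    int *data_;
};

// True if moving hands the original storage over to the new object.
bool moveStealsStorage()
{
    Buffer source(1000);
    const int *original = source.data();
    Buffer target(std::move(source));
    return target.data() == original && target.size() == 1000
        && source.data() == nullptr && source.size() == 0;
}

// True if copying leaves two objects with separate storage.
bool copyDuplicatesStorage()
{
    Buffer source(1000);
    Buffer target(source);
    return target.data() != source.data() && target.size() == source.size();
}
```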

The primary motivation behind Move Semantics is improving performance. As an example, let us suppose you have two vectors of strings which you would like to swap data between. Using standard Copy Semantics, an implementation might look like this:

void SwapData(vector<string> &v1, vector<string> &v2)
{
    vector<string> temp = v1;  // A new copy of v1
    v1 = v2;                   // A new copy of v2
    v2 = temp;                 // A new copy of temp
}

Using Move Semantics, you can bypass all of that copying:

void SwapData(vector<string> &v1, vector<string> &v2)
{
    vector<string> temp = (vector<string> &&) v1;
    // temp takes over v1's data

    v1 = (vector<string> &&) v2;    // v1 takes over v2's data
    v2 = (vector<string> &&) temp;  // v2 takes over temp's data
}
// No strings are copied, only pointers are exchanged!
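The same idea is more commonly written with std::move from <utility>, which performs the cast to an rvalue reference. The sketch below uses hypothetical names (SwapDataMoved, swapExchangesBuffers) and checks that the two vectors end up holding each other's storage. The pointer comparison reflects how mainstream standard library implementations behave with the default allocator; treat it as an assumption of this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant of SwapData using std::move instead of casts.
void SwapDataMoved(std::vector<std::string> &v1, std::vector<std::string> &v2)
{
    std::vector<std::string> temp = std::move(v1);  // temp takes over v1's data
    v1 = std::move(v2);                             // v1 takes over v2's data
    v2 = std::move(temp);                           // v2 takes over temp's data
}

// True if the contents were swapped and the underlying arrays simply
// changed owners rather than being copied.
bool swapExchangesBuffers()
{
    std::vector<std::string> a(3, "alpha");
    std::vector<std::string> b(5, "beta");
    const std::string *aData = a.data();
    const std::string *bData = b.data();

    SwapDataMoved(a, b);

    return a.size() == 5 && a[0] == "beta"
        && b.size() == 3 && b[0] == "alpha"
        && a.data() == bData && b.data() == aData;
}
```

In everyday code, std::swap(v1, v2) or v1.swap(v2) gives the same result without writing the function by hand.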

The obvious question is: how likely is this feature to be used? Well, there are many parts of the C++ language which are of limited appeal, but this one in particular seems rather obscure. Move Semantics is a big language change to address a very narrow situation. Of course it is useful when this very specific situation arises, and I suppose it can be easily ignored when you don't want it. But it almost seems like using cannonballs to shoot down butterflies ... is this just too much?

And it's not just that it solves a particular problem ... it's also that the cognitive friction is high. When you walk away from the concept and come back, you constantly need to remind yourself of the rules and syntax. And misusing it can be devastating. Could a big feature, which is arguably unneeded, cause potential developers to ignore C++0x? Only time will tell. But admittedly, if this feature were dropped, I wouldn't lose sleep over it.

Coming Up Next Month: C++0x Part 6: Final Thoughts.

To see a list of all the According to Hoyle columns, visit: http://www.jonhoyle.com/maccompanion

http://www.maccompanion.com/macc/archives/February2010/Columns/AccordingtoHoyle.htm