According to Hoyle...
C++0x Part 5: Rvalue References
macCompanion
January 2010
by Jonathan Hoyle
jonhoyle@mac.com
http://www.jonhoyle.com
Four months ago, we began our series on the upcoming
changes to the C++ language. For those who missed the previous installments,
feel free to visit these links:
• C++0x Part 1: What is It (and Does It Even Matter)?
• C++0x Part 2: A Step Forward
• C++0x Part 3: Making Coding Easier
• C++0x Part 4: Smart Pointers
We continue by examining another advanced feature of
C++0x, namely Rvalue References.
In C, function parameters are always passed by value;
that is, a copy of the argument is always passed, never the actual argument itself. Therefore,
to modify a caller's variable in C, the caller must pass a pointer to it, and the function
then dereferences that pointer to modify the data:
void foo(int valueParameter, int *pointerParameter)
{
    ++valueParameter;
    // parameter passed by value, so mods are local to this copy
    ++*pointerParameter;
    // dereferencing pointer, so mods are permanent
    ++pointerParameter;
    // pointer passed by value, so mods are local to this copy
    // (done last, so the dereference above still targets the caller's int)
}
One of the powerful features that C++ introduced
over C was the reference, declared with the
ampersand (&). Functions
in C++ can take parameters by reference, allowing the caller's data
to be modified directly without the need for a pointer:
void foo(int valueParameter, int &referenceParameter)
{
    ++valueParameter;      // passed by val, mods local to this copy
    ++referenceParameter;  // passed by ref, mods are permanent
}
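To see the two approaches side by side, here is a minimal sketch. The names fooPointer, fooReference and demoPassing are hypothetical, chosen only so that both versions can live in the same file.

```cpp
#include <cassert>

// Hypothetical names, used only for illustration.
void fooPointer(int valueParameter, int *pointerParameter)
{
    ++valueParameter;      // changes only the local copy
    ++*pointerParameter;   // changes the caller's variable through the pointer
}

void fooReference(int valueParameter, int &referenceParameter)
{
    ++valueParameter;      // changes only the local copy
    ++referenceParameter;  // changes the caller's variable directly
}

// Returns true if both calls behaved as the text describes.
bool demoPassing()
{
    int a = 10;
    int b = 20;
    fooPointer(a, &b);     // a stays at 10, b becomes 21
    fooReference(a, b);    // a stays at 10, b becomes 22
    return a == 10 && b == 22;
}
```

Note that the pointer version makes the caller write &b, while the reference version hides that detail in the function's signature.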
A non-const reference must bind to an lvalue, that is, an object
with a name and a location in memory, such as a variable. Rvalues, such as literals
and the temporary results of expressions, cannot be bound to such a reference:
int myIntA = 10;
int myIntB = 20;
foo(myIntA, myIntB); // myIntA stays at 10, myIntB becomes 21
foo(1, myIntA); // 1 passed in by value, myIntA becomes 11
foo(myIntA, 1); // Compiler Error: 1 is an rvalue
foo(0, myIntB + 1); // Compiler Error: myIntB + 1 is an rvalue
Occasionally, it is useful to pass a parameter by reference
even when there is no desire to modify its contents. This is particularly true when a
large class or struct is
being passed to the function, and you wish to avoid creating a copy of the large object:
void foo(BigClass valueParam, const BigClass &constRefParam)
{
    // assumes BigClass defines operator++
    ++valueParam;     // passed by value, mods are temporary
    ++constRefParam;  // Compiler Error: cannot modify a const
}
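The savings can be made visible by counting copies. What follows is only a sketch: this BigClass is a hypothetical stand-in with a copy counter bolted on, and the helper function names are invented for the illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for BigClass. It counts how many times its
// copy constructor runs, so the cost of each call is visible.
struct BigClass
{
    static int copies;     // number of copy constructions so far
    int payload[1024];     // the "large" data

    BigClass() : payload() {}  // value-initialize: zero the payload
    BigClass(const BigClass &other)
    {
        for (int i = 0; i < 1024; ++i)
            payload[i] = other.payload[i];
        ++copies;
    }
};

int BigClass::copies = 0;

int readByValue(BigClass valueParam) { return valueParam.payload[0]; }
int readByConstRef(const BigClass &constRefParam) { return constRefParam.payload[0]; }

// Number of copies made by one by-value call with an lvalue argument.
int copiesForByValueCall()
{
    BigClass big;
    BigClass::copies = 0;
    readByValue(big);      // the argument is copied into valueParam
    return BigClass::copies;
}

// Number of copies made by one const-reference call.
int copiesForByConstRefCall()
{
    BigClass big;
    BigClass::copies = 0;
    readByConstRef(big);   // only a reference is passed
    return BigClass::copies;
}
```

Unlike a non-const reference, a const reference will also accept a temporary, so readByConstRef(BigClass()) compiles as well.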
In C++0x, a new type of reference is defined:
the rvalue reference (the familiar type of reference from C++98 is now
referred to as an lvalue reference). An rvalue reference can bind to temporary
data and act on it directly, without the need for a
copy. The && declarator
indicates that a reference is an rvalue reference:
void foo(int valueParam, int &lValRefParam, int &&rValRefParam)
{
    ++valueParam;    // passed by value: mods local to this copy
    ++lValRefParam;  // lvalue ref: changes permanent
    ++rValRefParam;  // rvalue ref: changes the temporary itself, no copy made
}
foo(0, myIntA, myIntB + 1);
// The temporary holding myIntB + 1 is bound directly to rValRefParam, not copied
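One way to watch the binding rules at work is to overload a function on both kinds of reference and see which one the compiler picks. This is a sketch only; whichOverload, passAlong and passAlongMoved are hypothetical names. It also shows a subtlety that is easy to forget: inside a function, a named rvalue reference parameter is itself treated as an lvalue, and std::move (from the header <utility>) is needed to turn it back into an rvalue.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair: reports which kind of reference was bound.
std::string whichOverload(int &)  { return "lvalue reference"; }
std::string whichOverload(int &&) { return "rvalue reference"; }

// The parameter has a name, so the expression rValRefParam is an lvalue
// here and the int& overload is selected.
std::string passAlong(int &&rValRefParam)
{
    return whichOverload(rValRefParam);
}

// std::move casts the parameter back to an rvalue, so the int&& overload
// is selected.
std::string passAlongMoved(int &&rValRefParam)
{
    return whichOverload(std::move(rValRefParam));
}
```

With int myIntB = 20, the call whichOverload(myIntB) selects the lvalue version, while whichOverload(myIntB + 1) selects the rvalue version.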
One of the chief benefits of rvalue references is the
ability to take advantage of Move Semantics, that is, moving data from one object to another
without copying it. A class can define a Move Constructor, instead of (or in addition
to) a Copy Constructor, like so:
// Class definition
class X
{
public:
    X();            // Default Constructor
    X(const X &x);  // Copy Constructor (lvalue ref)
    X(X &&x);       // Move Constructor (rvalue ref)
};
X bar();      // Utility function returning X
X x1;         // Default construction of x1
X x2(x1);     // x2 created as a copy of x1
X x3(bar());  // bar() returns a temporary X, so x3 is move-constructed
              // (or the compiler elides the move entirely)
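The class X above only declares its constructors. As a rough sketch of what a Move Constructor might actually do, consider a hypothetical Buffer class that owns a heap array (the class and its members are invented for this illustration). The copy constructor allocates and duplicates, while the move constructor simply takes the pointer and leaves the source empty.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class owning a heap-allocated array of ints.
class Buffer
{
public:
    explicit Buffer(std::size_t size)
        : size_(size), data_(new int[size]()) {}   // zero-filled array

    // Copy Constructor: allocates new storage and duplicates the contents.
    Buffer(const Buffer &other)
        : size_(other.size_), data_(new int[other.size_])
    {
        for (std::size_t i = 0; i < size_; ++i)
            data_[i] = other.data_[i];
    }

    // Move Constructor: takes over the source's storage, then leaves the
    // source empty so that its destructor has nothing to free.
    Buffer(Buffer &&other)
        : size_(other.size_), data_(other.data_)
    {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Assignment is left out to keep the sketch short.
    Buffer &operator=(const Buffer &) = delete;

    ~Buffer() { delete[] data_; }

    std::size_t size() const { return size_; }
    const int *data() const { return data_; }

private:
    std::size_t size_;
    int *data_;
};
```

Given Buffer a(4), the statement Buffer b(a) makes a second array, whereas Buffer c(std::move(a)) hands a's array to c and leaves a with a size of zero.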
The primary motivation behind Move Semantics
is improving performance. As an example, let us suppose you have
two vectors of strings which
you would like to swap data between. Using standard Copy Semantics, an implementation
might look something like this:
void SwapData(vector<string> &v1, vector<string> &v2)
{
    vector<string> temp = v1;  // A new copy of v1, placed into temp
    v1 = v2;                   // A new copy of v2, placed into v1
    v2 = temp;                 // A new copy of temp, placed into v2
}
Using Move Semantics, you can bypass all of that copying:
void SwapData(vector<string> &v1, vector<string> &v2)
{
    vector<string> temp = (vector<string> &&) v1;
    // temp takes over v1's data (the cast must be to &&, or a copy is made)
    v1 = (vector<string> &&) v2;    // v1 takes over v2's data
    v2 = (vector<string> &&) temp;  // v2 takes over temp's data
}
// No strings are copied, only the vectors' internal pointers are exchanged!
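The casts to && are what the standard library's std::move helper (declared in the header <utility>) does for you, and it reads a little more clearly. Here is the same idea as a sketch; the function name SwapDataWithMove is hypothetical, chosen to avoid clashing with SwapData.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical name. Same three steps as the cast-based version, with
// std::move doing the cast to an rvalue reference.
void SwapDataWithMove(std::vector<std::string> &v1,
                      std::vector<std::string> &v2)
{
    std::vector<std::string> temp = std::move(v1);  // temp takes v1's storage
    v1 = std::move(v2);                             // v1 takes v2's storage
    v2 = std::move(temp);                           // v2 takes temp's storage
}
```

In everyday code you would not write this function at all: std::swap(v1, v2) and v1.swap(v2) already exchange the two vectors without copying their elements.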
The obvious question is: how likely is this feature
to be used? Well, there are many parts of the C++ language which are of
limited appeal, but this one in particular seems rather obscure. Move Semantics
is a big language change to address a very narrow situation. Of course, it is
useful when this very specific situation arises, and I suppose it can be easily ignored
when you don't need it. But it almost seems like using cannonballs to shoot
down butterflies ... is this just too much?
And it's not just that it solves a particular problem ... it's
also that the cognitive friction is high. When you walk away from the concept
and come back, you constantly need to remind yourself of the rules and syntax. And
misusing it can be devastating. Could a big feature, which is arguably unneeded,
cause potential developers to ignore C++0x? Only time will tell. But admittedly,
if this feature were dropped, I wouldn't lose sleep over it.
Coming Up Next Month: C++0x Part 6: Final Thoughts.
http://www.maccompanion.com/macc/archives/February2010/Columns/AccordingtoHoyle.htm