Thursday, December 13, 2012

C++11 Standard Explained: 3. Static Assertion

C++11 comes with a very nice feature: static assertion.
static_assert enables programmers to check conditions during compilation.
The syntax is:
static_assert(constant-expression, string-literal);
More explanation for the details may be found here:
http://publib.boulder.ibm.com/infocenter/zos/v1r12/index.jsp?topic=%2Fcom.ibm.zos.r12.cbclx01%2Fsadec.htm
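
As a minimal sketch of the syntax in use, a static_assert can appear at namespace scope, inside a class, or inside a function body. The names kBufferSize, Header and bufferWords are made up for this illustration and do not come from any library:

```cpp
#include <cstddef>

// Hypothetical names used only for this illustration.
constexpr std::size_t kBufferSize = 64;

// Namespace scope: checked when this translation unit is compiled.
static_assert(kBufferSize % 8 == 0, "kBufferSize must be a multiple of 8");

struct Header {
    char tag[4];
    // Class scope: checked when the class definition is compiled.
    static_assert(sizeof(char) == 1, "sizeof(char) is always 1");
};

std::size_t bufferWords() {
    // Block scope: the condition must still be a constant expression.
    static_assert(kBufferSize >= sizeof(Header), "buffer cannot hold a Header");
    return kBufferSize / 8;
}
```

If any of the conditions is false, compilation stops and the compiler prints the string literal as part of the error.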

I will talk about some useful applications of it.
If you have some code for byte hacking, you may want to check the size of a type. The following structure will enable you to do that.
I will check whether long double is 64 bits wide on the system:

#include <stdint.h>

template <typename T>
struct CheckSize {
        static_assert(sizeof(int64_t) == sizeof(T), "type is not 8 bytes wide");
};
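Later, you may check the size of long double by doing the following:
    CheckSize<long double> check1;
If long double is not 8 bytes wide on the target (with g++ on x86 Linux it is wider), compilation stops with the message from the static_assert.

The other useful thing I found was that you can check different system limits at compile time, to make sure that the behavior is what you expect. You may check the file path limit like this: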

#include <limits.h>  // defines PATH_MAX on POSIX systems such as Linux

static_assert(PATH_MAX <= 4096, "PATH_MAX is larger than 4096");
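
The same idea extends to other platform properties. The following is only a sketch: the particular requirements (8-bit bytes, at least 32-bit int, IEEE 754 double) are assumptions chosen for illustration, and unsignedBits is a hypothetical helper name.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Assumptions a hypothetical code base makes about its platform.
static_assert(CHAR_BIT == 8, "bytes must be 8 bits wide");
static_assert(sizeof(std::int64_t) == 8, "int64_t must occupy 8 bytes");
static_assert(std::numeric_limits<int>::max() >= 2147483647,
              "int must be at least 32 bits wide");
static_assert(std::numeric_limits<double>::is_iec559,
              "double must follow IEEE 754");

// Returns the number of value bits in an unsigned int on this platform.
constexpr int unsignedBits() {
    return std::numeric_limits<unsigned int>::digits;
}
```

Because std::numeric_limits members are constexpr in C++11, they can be used directly inside static_assert.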

C++11 Standard Explained: 2. Constant Expression

In this article I will try a different approach, only explaining stuff that is not clearly explained elsewhere.

Before constant expressions became part of the standard, you couldn't easily do much calculation at compile time. Constant expressions allow programmers to leverage the power of C++ during compilation, which, most of the time, is only useful when you understand the real underlying power of it.
constexpr is a new keyword introduced in C++11, and it has been implemented in gcc and clang.

What does it not do?

For primitive types initialized with a literal, there is no real difference between const and constexpr in the generated code.
A simple example:

C++03 code:
int main() {
    const int i = 100;
    if(i == 100) {
        return 1;
    }
    return 0;
}

C++11 code:
int main() {
    constexpr int i = 100;
    if(i == 100) {
        return 1;
    }
    return 0;
}

Assembly code (C++03 version):
    movl    %esp, %ebp
    subl    $16, %esp
    movl    $100, -4(%ebp)
    movl    $1, %eax
    leave

Assembly code (C++11 version):
    movl    %esp, %ebp
    subl    $16, %esp
    movl    $100, -4(%ebp)
    movl    $1, %eax
    leave


Yeah, you guessed it right: there is no difference at all. Note the compiler I used was g++, and I did not use any optimization option.

When you try it with a user-defined literal type (a struct with a constexpr constructor) such as the following:
struct compNum {
    int m_real;
    int m_img;
    constexpr compNum(int real, int img)
        : m_real(real), m_img(img) {}
};

And use it in constant expression code, such as the following:

int main() {
    constexpr compNum curNum(1, 2);
    if(curNum.m_real == 100) {
        return 1;
    }
    return 0;
}

constexpr makes some difference in the generated code when you do not turn on the optimization option. When you turn on optimization, however, there is absolutely no difference.


So what does it do?

Bottom line: it enables you to do more at compile time. One example of that is:
int main() {
    constexpr const char* str = "haha";
    // str == "haha" would compare addresses, not contents, so compare characters.
    static_assert(str[0] == 'h' && str[1] == 'a', "string not the same");
    return 0;
}
Notice the static_assert line. static_assert is one of the new features in C++11 that I will get to very soon; it allows compile-time assertions, similar to Boost's static assert. Within static_assert, you cannot use a plain const char* variable. You have to declare the pointer constexpr, because a constexpr variable is guaranteed to be available at compile time in the lines following its declaration.
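
The same applies to user-defined types. As a small sketch, reusing the compNum struct from above (origin, unit, slots and slotCount are hypothetical names for this example), the members of a constexpr object can be used wherever the language requires a compile-time constant:

```cpp
struct compNum {
    int m_real;
    int m_img;
    constexpr compNum(int real, int img)
        : m_real(real), m_img(img) {}
};

constexpr compNum origin(0, 0);
constexpr compNum unit(1, 2);

// Members of a constexpr object are themselves constant expressions.
static_assert(origin.m_real == 0, "origin must have a zero real part");
static_assert(unit.m_real + unit.m_img == 3, "unexpected member values");

// An array bound also requires a compile-time constant.
int slots[unit.m_img];

// Number of elements in slots, computed from its size.
constexpr int slotCount() { return sizeof(slots) / sizeof(slots[0]); }
```

With a plain const compNum object, the static_assert lines and the array bound would be rejected by the compiler.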
Constexpr functions are also very nice to use. The following blog post explains constexpr functions in detail:
http://cpptruths.blogspot.com/2011/07/want-speed-use-constexpr-meta.html
The key restriction on constexpr functions in C++11 is that the body may contain only ONE return statement and may not declare any variables (C++14 later relaxed this). You therefore have to be clever about writing the function recursively to make it useful.
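
A minimal sketch of that style, using a factorial as a stand-in example (factorial and factorialAtRuntime are illustrative names, not taken from the linked post):

```cpp
// C++11-style constexpr function: a single return statement and no local
// variables, so the loop is written as recursion with a conditional.
constexpr unsigned long long factorial(unsigned int n) {
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// Evaluated entirely by the compiler.
static_assert(factorial(0) == 1, "0! should be 1");
static_assert(factorial(5) == 120, "5! should be 120");

// The same function still works with run-time arguments.
unsigned long long factorialAtRuntime(unsigned int n) {
    return factorial(n);
}
```

The conditional operator replaces the if statement, and the recursive call replaces the loop counter that a C++03 version would keep in a local variable.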
If you want some real code to show the power of constant expressions, you can check out a constexpr hash map I wrote last summer.
https://github.com/benjibc/constexpr_hash_map
Anyway, please leave a comment with any corrections. Coming up: static_assert.

Sunday, November 18, 2012

C++11 Standard Explained: 1. Unrestricted Union

C++11 contains a lot of new and exciting features, and the unrestricted union is one of them.

Here is the link to the proposal for the standard:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf

The main difference between the C++11 unrestricted union and the C and C++03 union is that the new union may contain non-POD types. Quoting from the proposal, "Our proposed solution is to remove all of the restrictions on the types of members of unions, with the exception of reference types."

This change has profound significance. I will use the example from the proposal to illustrate this point. If we have a very simple class called point:

struct point {
  point() {}
  point(int x, int y) : x_(x), y_(y) {}
  int x_, y_;
};


Note that this class only contains POD types (two ints), but the struct itself is NOT a POD type because it has a non-trivial constructor. This means that the following union is illegal in C++03:

union {
  point p_;
  int i_;
  const char* s_;
};

As you can see, this restriction creates a lot of inconvenience and effectively reduces the power of unions, which is why the unrestricted union was proposed. With it, the above union is legal in the new C++ standard.

Before we proceed any further, I must note what is still not allowed with the new C++ standard, and that is reference types.

For example, if I have the following union declaration:

union cannotCompile
{
  int _i;
  char _c;
  char & _rc;
};

the compiler rejects it with the following error:

union.cpp:5:12: error: ‘cannotCompile::_rc’ may not have reference type ‘char&’ because it is a member of a union

Other than reference types, all user-defined classes and structs are allowed to be members of a union. This causes a new problem though: you need to construct and destroy the union member explicitly. If we have a union with the following definition:

union str_int
{
  std::string _str;
  int16_t _int;
};
Since the _str member has a non-trivial constructor, we need to be able to initialize that member specifically. Since you cannot directly call the constructor of a specific class, placement new is necessary here. This means that the union needs to have its own constructors, since it is no longer trivial. Luckily, this is also part of the proposal for unrestricted unions. To illustrate the point, I will add a constructor to this union:

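#include <stdint.h>
#include <new>
#include <string>
#include <vector>

union str_int
{
    std::string _str;
    std::vector<char> _raw;
    int16_t _int;
    str_int(std::string str)
    {
        // Assumes a vector currently lives in the union (see the caveat below).
        _raw.~vector<char>();
        // Construct the string in the member's storage, not in the parameter.
        new (&_str) std::string(str);
    }
    ~str_int()
    {
        // ~basic_string names the destructor without needing a using-declaration.
        _str.~basic_string();
    }
};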

You can also declare a copy constructor and a copy assignment operator for the union. From the proposal, "The default constructor (12.1), copy constructor and copy assignment operator (12.8), and destructor (12.4) are special member functions."

You do need to declare the constructor and destructor explicitly, because otherwise they will be implicitly deleted: "if a non-trivial special member function is defined for any member of a union, or a member of an anonymous union inside a class, that special member function will be implicitly deleted (8.4 ¶10) for the union or class. This prevents the compiler from trying to write code that it cannot know how to write, and forces the programmer to write that code if it’s needed."

In the constructor, I called the destructor of the vector to release the memory associated with it, and used placement new to create a std::string within the union. Here, the use of placement new is critical. Placement new takes a pointer as an argument and constructs the object at that address instead of allocating new memory. For more information on placement new, check out this post: http://stackoverflow.com/questions/222557/what-uses-are-there-for-placement-new.
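
To show placement new in isolation, here is a small sketch outside of any union. The function name lengthViaPlacementNew and the local buffer are assumptions made for this example:

```cpp
#include <cstddef>
#include <new>      // placement new
#include <string>

// Constructs a std::string inside caller-provided storage, reads it, then
// destroys it again. Placement new does not allocate the storage itself.
std::size_t lengthViaPlacementNew(const char* text) {
    // Raw, suitably aligned storage; no std::string lives here yet.
    alignas(std::string) unsigned char storage[sizeof(std::string)];

    // Run the std::string constructor at the address of storage.
    std::string* s = new (storage) std::string(text);
    std::size_t length = s->size();

    // There is no matching delete for placement new: call the destructor by hand.
    s->~basic_string();
    return length;
}
```

The pairing is the same one the union needs: placement new to start the lifetime of a member, and an explicit destructor call to end it.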

Of course, there is one major bug in the constructor: how do we know that the union was being used as a vector, and that there was memory allocated for the vector for us to destroy? The simple answer is that we don't. The simple solution to this problem would be to encapsulate the union in a wrapper class which keeps track of the current type of the union.
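
One possible shape for such a wrapper is sketched below. It is not the code from the JSON class mentioned later; StrOrInt, Kind, Data and the member names are all hypothetical, and copying is disabled to keep the sketch short:

```cpp
#include <cstdint>
#include <new>
#include <string>
#include <utility>

// Hypothetical wrapper: a tag records which union member is currently alive.
class StrOrInt {
public:
    explicit StrOrInt(std::int16_t value) : m_kind(Kind::Int) {
        m_data.asInt = value;
    }
    explicit StrOrInt(std::string value) : m_kind(Kind::Str) {
        new (&m_data.asStr) std::string(std::move(value));
    }
    ~StrOrInt() { destroy(); }

    // Copying is disabled to keep the sketch short.
    StrOrInt(const StrOrInt&) = delete;
    StrOrInt& operator=(const StrOrInt&) = delete;

    // Switch the active member to a string, destroying the old one first.
    void set(std::string value) {
        destroy();
        new (&m_data.asStr) std::string(std::move(value));
        m_kind = Kind::Str;
    }
    // Switch the active member to an integer.
    void set(std::int16_t value) {
        destroy();
        m_data.asInt = value;
        m_kind = Kind::Int;
    }

    bool holdsString() const { return m_kind == Kind::Str; }
    // Valid only while holdsString() is true.
    const std::string& str() const { return m_data.asStr; }
    // Valid only while holdsString() is false.
    std::int16_t num() const { return m_data.asInt; }

private:
    enum class Kind { Int, Str };

    union Data {
        std::int16_t asInt;
        std::string asStr;
        Data() : asInt(0) {}  // start with the trivial member active
        ~Data() {}            // the wrapper decides what to destroy
    };

    // Ends the lifetime of the string member if it is the active one.
    void destroy() {
        if (m_kind == Kind::Str) {
            m_data.asStr.~basic_string();
            m_kind = Kind::Int;
        }
    }

    Kind m_kind;
    Data m_data;
};
```

Because the tag is checked before any destructor call, the wrapper never destroys a member that was not constructed, which is exactly the problem with the bare union constructor above.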

For a more complete and comprehensive example for unrestricted union, I implemented a JSON class last summer for the web framework I was working on using C++11 union. You may check it out here: https://github.com/benjibc/json-universal-container

Please point out any errors in the article by leaving a comment below. Have fun coding in C++11!
Coming up: C++11 constant expression.

Purpose of the blog

This is the first post of this blog ever! YAH!

What I try to achieve in this blog is to explain parts of the C++ standard in detail, especially the new C++11 standard. The focus of the blog posts will be on the gotchas of the standard or the boundary conditions where things become unclear. I will try to clarify everything associated with a specific feature.

I am still a student, so many things I say may be inaccurate or even incorrect. Please notify me by leaving a comment on the post.