Wednesday, May 22, 2013

C++: generating assembly code with gcc/g++ for fun and profit

While discussing with a colleague how to use strncpy in a really safe way (see this post), we got stuck on the performance question: is one of the alternatives faster than the other?

More precisely, is this version (the potentially unsafe one) faster because it uses a constant that is certain to be resolved at compile time?

   strncpy( s.name, "test", MAX_NAME_SIZE );

Or is this version just as fast because sizeof is also resolved at compile time?


   strncpy( s.name, "test", sizeof s.name );

This could be settled by consulting the C++ standard, but the official document has to be purchased.

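A free copy of the C standard is available here (the section quoted below mentions _Alignof, which was added in C11, so it appears to be a C11 draft rather than C99), but it has two problems: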


  1. It's not the C++ standard. It should say about the same thing in this area, but there is no guarantee.
  2. The text still leaves room for interpretation (see below). Later examples in the document give more evidence that sizeof is evaluated at compile time, but you really need to dig into it to make the point.

The text from the document:


6.5.3.4 The sizeof and _Alignof operators
... 
2 The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand. The result is an integer. If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

After going through all that, we decided to settle the argument in a simpler way:


  • Get the assembly code for both versions
  • If they are the same, then sizeof is indeed evaluated at compile time (for this case, at least) and there is no performance penalty for the safer code.

The challenge then is to get the assembly code from a C/C++ program.

This is the test program:

 #include <string.h>

 const int MAX_NAME_SIZE = 100;
 struct something {
   char name[MAX_NAME_SIZE+1];  // room for the terminating '\0'
 };

 int main( int argc, const char* argv[] ) {
   something s;

   // Version 1: size given as a named constant
   strncpy( s.name, "test", MAX_NAME_SIZE+1 );
   s.name[MAX_NAME_SIZE] = '\0';

   // Version 2: size taken from the destination with sizeof
   strncpy( s.name, "test", sizeof s.name );
   s.name[sizeof s.name - 1] = '\0';
 }
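As a variation on the test program, the two calls can be moved into separate functions so that each one shows up as its own labeled block in the listing. This is only a sketch: the helper names copy_with_constant and copy_with_sizeof are made up for illustration, and __attribute__((noinline)) is a GCC/clang extension, used here on the assumption that it keeps the optimizer from merging the helpers into their caller.

```cpp
#include <cassert>   // only needed for the assert-based check described below
#include <cstring>

const int MAX_NAME_SIZE = 100;

struct something {
  char name[MAX_NAME_SIZE + 1];  // room for the terminating '\0'
};

// Hypothetical helper: the size is given as the named constant.
__attribute__((noinline))
void copy_with_constant(something& s, const char* src) {
  std::strncpy(s.name, src, MAX_NAME_SIZE + 1);
  s.name[MAX_NAME_SIZE] = '\0';
}

// Hypothetical helper: the size is taken from the destination array.
__attribute__((noinline))
void copy_with_sizeof(something& s, const char* src) {
  std::strncpy(s.name, src, sizeof s.name);
  s.name[sizeof s.name - 1] = '\0';
}
```

Calling both helpers on two separate something objects and comparing the buffers with assert gives a quick sanity check that they behave the same. Since this file has no main of its own, adding -c to the compiler options avoids the link step when it is compiled in isolation.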

Note: unfortunately the recipe below will not work with the clang family (the default on Mac OS and others). Its assembler does not support interleaved listings. The best it can do is a regular assembly listing (more details in this SO question): g++ -g -O2 -S test.cpp -o test.s. With -S the compiler writes the assembly to a file rather than to standard output, so no redirection is needed.

When compiled with these options, we get the assembly code interleaved with the C++ source code in the test.s file:

   g++ -g -O2 -Wa,-aslh test.cpp >test.s

Before going into the results, this is what each option does:


  • -g: Generate debug symbols. Without this option the assembler cannot match the assembly to the source code lines, which makes the output much harder to read.
  • -O2: Optimization level. Strictly speaking, not needed for this exercise, but it is always a good idea to build at the same optimization level used in the released code. It can affect the resulting assembly code significantly.
  • -Wa: Pass options to the assembler. This is the hint that the next options, after the comma, should be passed on to the assembler.
  • -aslh: Assembler options. All -a options are for listing. In this case we asked to include the assembly code (-al), include symbols (-as) and, most importantly, include the source code (-ah). These options are explained in more detail in the assembler documentation. Just run man as to see them.

And now, the results. What follows is a cleaned-up version of the assembly listing.

First, the strncpy( ... MAX_NAME_SIZE+1 ) version, showing that the constant (101) is used, as expected:

  34 000e C7442408              movl    $101, 8(%esp)
  35 0016 C7442404              movl    $.LC0, 4(%esp)
  36 001e 891C24                movl    %ebx, (%esp)
  41 0021 65A11400              movl    %gs:20, %eax
  42 0027 8944247C              movl    %eax, 124(%esp)
  43 002b 31C0                  xorl    %eax, %eax
  49 002d E8FCFFFF              call    strncpy

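Now the strncpy( ... sizeof s.name ) version, showing that it also uses the constant 101, settling the argument (same argument setup, thus same performance). The three extra instructions in the first listing (the %gs:20 load, the store to 124(%esp) and the xorl) appear to be stack-protector setup that the compiler scheduled before the first call; they are not part of the strncpy call itself.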

  55 0032 C7442408              movl    $101, 8(%esp)
  56 003a C7442404              movl    $.LC0, 4(%esp)
  57 0042 891C24                movl    %ebx, (%esp)
  58 0045 E8FCFFFF              call    strncpy
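As an additional cross-check that was not part of the original experiment, the compiler itself can be asked to confirm that sizeof yields a compile-time constant here. The sketch below assumes C++11 or later (older g++ versions need -std=c++11), and name_capacity is a hypothetical helper introduced only for illustration.

```cpp
#include <cstddef>

const int MAX_NAME_SIZE = 100;

struct something {
  char name[MAX_NAME_SIZE + 1];
};

// Hypothetical helper: constexpr requires the returned value to be
// computable at compile time.
constexpr std::size_t name_capacity() { return sizeof(something::name); }

// The build fails unless sizeof yields a compile-time constant equal to
// the named constant.
static_assert(name_capacity() == MAX_NAME_SIZE + 1,
              "sizeof name should match MAX_NAME_SIZE + 1");
```

If this compiles, the sizeof expression was resolved without running any code, which agrees with what the listing shows.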




