C++ Coding Standard: The If Block

Almost two months ago I went to a CocoaHeads meeting during Macworld. They had Mike Lee talk. His presentation was about “Pimping Your App”. There were a bunch of interesting points, but one thing really stuck in my head. Mike was talking about how is a messy person in his life. His car is messy. His room is messy. His desk is messy. Everything is messy, except his code. HIs code is crystal clean, squeaky even. As a programmer you need to make sure your code is consistent and clean. Ever since I have been thinking about standardizing the way I write code. This is the first post in hopefully a stream of posts about quality code.

Yesterday, there was a discussion in the office about code quality. There were many points discussed, topics like line length, white space, and my personal favorite “if” statements. I have a track record of being incredibly inconsistent with my “if” block. The basic if block is the following:

if(foo) {
  //...
}

The “if” statement by itself is really not that big of a deal. It’s when you start adding “else if” and “else” clauses that it becomes complicated. The problem for me is twofold. Firstly, I’m inconsistent. The type of blocks I use on larger sections of code is different from the blocks I use on smaller sections of code. Secondly, my desire for consistency is at odds with my crazy, cooky desire to have code look aesthetically pleasing.

The Condensed If Block

I sometimes use the following style of the if statement:

if( foo ) {
   //...
} else if( bar ) {
   //...
} else {
   //...
}

This block is very condensed. You throw the braces for each clause on it’s own line. I feel that it this type of statement makes it clear and easy to really make the if statements a smaller part of the code. On trivial if blocks, I really like this approach. Where it suffers is in more complex if statements. If the ifs and else ifs fit really just blend into the code, sometimes making it easy to miss them. Okay, so if I’m looking at the code in detail, not a big ideal, but if I’m just giving it a quick glance over I might miss something. Also, if the “if” statement line is long enough, it could easily blend into the line below it.

The Almost Condensed If Block

This one is a take on the Condensed If block. It is actually just really poorly formatted “if” statements, but I often find myself using this one:

if( foo ) 
{
   //...
} else if( bar ) {
   //...
} else {
   //...
}

All we are doing here is moving the opening brace from the end of the if statement to the next line. The rest of the code follows the condensed. So, I really like this approach because the first “if” block is clearly separated from the rest of the code. It’s clear that we are are entering an “if” statement line. The remainder of the statements don’t take up too much space. The closing brace, the else if/else, and the following opening brace are all on the same line.

The downside of this approach is that it looks inconsistent. Why does the initial “if” statement get one extra return, and all the subsequent lines statements get jammed into one line? It’s not functionally different, and it might in general be more appealing to me, but consistency is also important. I have started to shift away from using this style.

The Intermediate If Block

This one has more white space:

if( foo ) {
   //...
}
else if ( bar ) {
   //...
}
else {
   //...
}

This approach has gives you a little bit more separation of the control statements from the code. For some reason, though, I just feel it looks weird to write the close brace on its own line but then incorporate the opening brace on the same line as the control statement. Still feels inconsistent.

White Space Heaven If Block

The following block is the last if style I’m going to talk about:

if( foo ) 
{
   //...
}
else if( bar )
{
   //...
}
else 
{
   //...
}

This block takes up a very large amount of space, everything gets its own line. It is probably the clearest of all the examples above, but the trade off is that your code is now three lines longer for each else statement. This extra space means that you can theoretically fit less code in the same amount of screen space.

Additional Concerns

There are a couple of additional concerns when working with if blocks. For example, if you are chaining “if” blocks, how much space should you provide.

// What I don't like
if(foo)
{
   //...
}
if(bar)
{
   //...
}

// What I prefer
if(foo)
{
   //...
}

if(bar)
{
   //...
}

I have run into some people who prefer the first option above with no space between each block. The reason I dislike this so much is that depending on the method of “if” syntax you use, you second if block could look identical to your else blocks. As your code gets large and complex, it is important to make it as easy to discern different control paths that might be executed, and in this case, white space is really your friend.

Another concern is the ternary operator. This operator is basically a simple “if/else” block which fits on one line.

double number;
if(foo)
    number = 3.1415;
else
    number = 2.71828183;

// As opposed to
double number = foo ? 3.1415 : 2.71828183;

Ternary operators are great for doing assignments as in the one line above, but be careful with trying to do too much in one line. My general rule is the line is getting really long, or I’m doing multiple clauses in my condition I tend to shift away from ternary operator.

Conclusion

The problem with all of this is that my mind changes depending on where I am in, what I’m working on and what I plan to be doing with my code. I’m learning towards using the condensed form of the “if” block in the future for the simple reason that I can fit more lines of code in less space (not to mention that some pointed out it was the K&R way, too). Bottom line, whatever method you choose, you should really stick to it as best you can for each project. If you open up a file in a project that was edited by someone else, you should probably follow their precedence.

Stupid std::vector Class

The Standard Template Library in C++ is nice to provide us with a bunch of different container classes so we don’t have to re-invent the wheel every time we write new code. One of the classes is called “Vector”, if you aren’t familiar with it, you might not get too much out of this post. But basically, it is a dynamically growing array. Meaning it has contiguous memory and can be indexed like a regular array. It’s a great class and I use it all over the place, but for the second time in one week I find it lacking.

I’m working on some code where I have two vectors of the same type. I want to concatenate one on the end of the other.

  std::vector firstVector;
  std::vector secondVector;

  // I want to do the following, but its not legal
  firstVector.append(secondVector);

This doesn’t work. There is no append method for std::vector. It turns out that the correct code for this is:

  std::vector firstVector;
  std::vector secondVector;

  firstVector.insert(firstVector.end(), secondVector.begin(), secondVector.end());

This frustrates me. Append seems like a logical function to include. I’m relatively new C++ developer, having been working in the language intensely for only about a year. My guess is I will reverse my opinion over time, but it just seems like an append() makes sense.

There is also the potential benefit, depending on the implementation of std::vector. Insert requires you to pass in three iterators. Because an append function would have access to the internals of each vector you wouldn’t necessarily need to do the look up of each iterator, which might save a few lines of code. These lines of code might be dwarfed by the amount of code you would need to check the parameter being passed to an append function, I’m not sure.

Regardless of whether or not it is more efficient, I still feel like append() makes sense, conceptually and I would like to see it in future versions of the STL.

(again, I reserve the right to change my opinion in the not too distant future)

Leap Year Spells Trouble for Zune Users

On December 31st, all 30GB Zune users woke up to their music players not working. In a rarity for Microsoft problems, the source for for the problem was found. There is a good explanation of the problem here.

There are two lessons to be learned from this: 1) be careful of your looping conditions. 2) Try and write your code in small snippets that are testable, and write tests! A simple iteration through the total amount of days including a leap year would have caught this bug.

Just FYI, I’m not saying I would have been good enough to catch this, but it is worth writing down so I try to remember for myself.

The Address of Monkey

Have you seen the following C code sample:

	char x = 1;
	char c = x["monkey"];

Do you know what it the value of c is? Don’t read on unless you want to know the answer and why. The value of c is ‘o’. Why? Well, I wrote some code to start playing around with this. The answer seemed simple, but here were some suggestions about why the answer is ‘o’:

  • 1 is the index into the string “monkey”
  • There is some magic with math of memory on the stack for the compiler used
  • Something else is happening

Okay, it seems relatively trivial now, but when I looked at it it wasn’t. Other people were putting up ideas so I tested them out. Here is my silly test code:

#include 

void initial_test()
{
	char x = 1;
	char c = x["monkey"];
	
	printf("What is c:%cn", c);
}

void different_index()
{
	char x = 2;
	char c = x["monkey"];
	
	printf("What is c:%cn", c);
}

void space_allocation()
{
	char x = 1;
	char v = 'd';
	char c = x["monkey"];
	
	printf("What is c:%cn", c);
}

int main(int argc, char** argv)
{
	initial_test();
	different_index();
	space_allocation();
	
	return 0;
}

/* Output:
What is c:o
What is c:n
What is c:o
*/

What is actually going on here is really just the associative property of addition. I was telling a friend that I would understand “monkey”[x], but not the other way around. This is the quote from my friend (who wishes to remain nameless):

I mean, technically it’s base_address + sizeof(datatype)*index. since sizeof(char) == 1, it’s just base_address+index. 1+addressof(“monkey”) or addressof(“monkey”) + 1.. they both work

In the end it was just a fun little exercise.

Why 2 can sometimes equal 1

Ran into a fun situation today where I was writing some code, and I came across an interesting situation in C++. Now, before I get to the end of this post, I’ll give you the punch-line, Developer stupidity.

So I was working on a exercise where I needed to write some sort of state machine.

enum STATES{
  STATE_1 = 0,
  STATE_2,
  STATE_3,
  ...
};

/* ... some other code ... */

switch(state) {
case STATE_1:
    // Do something
    state = STATE_2;
    break;
case STATE_2:
    // Do some other stuff
    state = STATE_3;
    break;
case STATE_3:
    // Do the last state of stuff
    // This code never gets called.
    break;
}

In this code STATE_3 is never reached. The code for the enum was working fine, but the state wasn’t being reached. I went over this for a while, until I found out the problem.

bool state = STATE_1;

The assignment for the state variable had been left over from a previous implementation of the code, and when you set the Boolean value for state = 2;, you get 1, which is true.

Now I don’t expect anyone to be as silly as me and make this mistake, but just in case, learn from my lesson.

Nuances in C: Struct Assignment in C

When I started this post, it was going to be a revolutionary post, talking about something that was really bothering me. As I have spent more time thinking about this, the answer seems so simple and obvious, still there was a time where I did not get this concept, so I here is a brief post on the topic,

The question is, in straight C, will this work:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct point_t {
  int x,y;
} Point;

int main(int argc, char **argv)
{
  Point a,b;
  int c[4] = {1, 2, 3, 4};

  b = c;

  printf("b(%p) is now b[x] = %d, b[y] = %dn", &b, b.x, b.y);

  return 0;
}

In C++ this works no problem. Structs are teated like classes where all members have public scope. Assigning one struct to another simple implements the copy constructor that is created by the C++ language, but what about straight C?

Well, like I said the answer is the obvious one, IT WORKS! The reason this originally confused me was because you CAN’T do this:

   int a[4] = { 1, 2, 3, 4};
   int b[4];

   b = a;

Why not? The compiler knows how big each of these integer arrays are. If you run sizeof(a); and it will give you the same thing as sizeof(b);. Copying an array is as simple as copying over the bytes (bit for bit), using memset() or something similar.

The reason structs work and arrays don’t is simple. Basically we are doing a type check on the object. If the object is a struct of defined type, we know how to do an assignment because we have the size. If the item is an array of integers, we would have to do a size comparison and the compiler doesn’t do that.

Okay, so I hope that explains the obvious, good luck.

Nuances in C: Array of Function pointers

I was in a job interview the other day and someone asked me the following question, which I got wrong. Its not hard to remember, but I figure if I pass it on, and anyone out there who reads this blog for technical content might get a little refresher.The question was something like: write the declaration for an array of function pointers that take an integer and return a double in C.

This problem isn’t overly complicated, but C can be a tricky language. I used the language for a year of solid development, and never had to use anything this complicated.

My attempt at this problem without any resource ended up looking like:
double function_array(int foo)[];

Okay, before you laugh, remember I haven’t had to ever actually defined a complicated structure like this. What this translates to in C is: a function named “function_array” is takes an int “foo” and returns an array of doubles. This is not possible in C as you can’t return arrays. You can return pointers, which can be indexed as an array, but you cannot return an array.

The solution is actually:double (*function_array[])(int foo);which solves the problem as stated.

The resource that I used to find the answer is a book called Expert C Programming: Deep C Secrets. This book is AMAZING and you should definitely invest in it if you do any serious work with C. It does have a bit of a Sun systems bias, as the author was a compiler writer for Sun Microsystems, but the book has useful information for anyone using the language. There is a whole chapter on unscrambling declarations in C.

So there it is, another friendly reminder about declarations in C.

Update:

After rereading the post I realized that I was not content with my understanding of how to use arrays of functions. I came up with a mock project to work on. Lets say you have an input string, and would like to process it to remove some special characters (in my case I used “:”, ” “, and “t”). I wrote a function for each special character and iterated through all the functions on each string. This might not be the most expedient way of solving this problem, but it allowed to me to test the use of arrays of functions.

You can see the source: here. It has some brief inline documentation, but I’m not doing anything else too crazy. One additional note, I rediscovered that if you initialize a string like char *foo = "some string"; it stores it read only memory. Note that is different than char foo[] = "some string" which is stored in read/write memory.