Optimization is a very broad topic when it comes to compiled languages like C or C++. There are many good guides on that topic. However, I see that people nevertheless forget about a few basic principles; thus I’d like to write a few short words myself, explaining how to avoid common pitfalls when optimizing, or at least trying to.
First of all, I’d like to note that I will be treating optimization as making the program’s execution time shorter through the use of faster code (algorithms, methods). I will not cover the area in detail, but just give a few tips and reminders which should be taken into account when optimizing.
Know your assembly!
Measurement is a very important tool when it comes to optimizing. Measure before, measure after, and see whether there was an improvement. Developers like measurements because they give numbers, and numbers simply say “faster”, “slower”, or sometimes “not sure but whatever”. But remember that numbers can easily be misinterpreted, and then they can lead to wrong (or even ridiculous) conclusions.
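To make the "measure before, measure after" idea concrete, here is a minimal timing sketch. The helper name best_time_ns and the choice of reporting the fastest of several runs are my own assumptions for illustration; a real benchmark needs much more care than this.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from any library): runs `work` `runs` times and
// returns the fastest wall-clock time in nanoseconds. Taking the minimum
// filters out some noise (scheduling, cold caches), but it does not make the
// number trustworthy on its own.
template <typename Work>
std::int64_t best_time_ns(Work work, int runs = 10) {
    using clock = std::chrono::steady_clock;  // monotonic, suited for intervals
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < runs; ++i) {
        const auto start = clock::now();
        work();
        const auto stop = clock::now();
        const auto ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
                .count();
        best = std::min<std::int64_t>(best, ns);
    }
    return best;
}
```

You would call it once with the old variant and once with the new one, for example best_time_ns([&] { old_variant(data); }), where old_variant and data stand for whatever you are comparing. Keep in mind that the compiler may remove work whose result is never used, which is one of the easy ways to end up with a ridiculous number.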
To be honest, measurement is a very broad topic in itself, and it is really hard to perform good and trustworthy measurements. I believe that in order to do so, you first need to know deeply what you are measuring. You are measuring code, of course; but more than the code you have just written, you are measuring the assembly generated by the compiler.
Thus, I believe that before starting to measure anything, you should obtain the resulting assembly and analyze it, or at least compare it with the previous result. Otherwise, you can end up trying to measure a difference between two variants of code that result in the same assembly, and then trying to improve your measurement methods to get a consistent, meaningful difference!
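As an illustration of two variants that are likely to end up identical, consider the pair below. The function names are made up for this example. With a mainstream optimizing compiler (for instance g++ -O2 -S, which writes the assembly to a .s file) I would expect both to become the same shift instruction, but that is an assumption you should check on your own toolchain rather than take from me.

```cpp
#include <cstdint>

// Variant A: the "obvious" way to divide an unsigned value by eight.
std::uint32_t div8_plain(std::uint32_t x) {
    return x / 8;
}

// Variant B: the "hand-optimized" way, using a shift.
// For unsigned operands this computes exactly the same value as variant A,
// so an optimizing compiler is free to emit the same code for both.
std::uint32_t div8_shift(std::uint32_t x) {
    return x >> 3;
}
```

If the two assembly listings match, any difference your stopwatch shows between them is noise, and no amount of tuning the measurement will make it meaningful.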
Of course, you shouldn’t trust the assembly by itself either. Sometimes seemingly insignificant code changes can cause noticeable differences in execution time. Sometimes four instructions (with longer execution time) actually execute faster than two instructions because the former could be executed in parallel while the latter couldn’t; this is where measurement comes in handy.
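A common way to see this effect is summing an array with one accumulator versus several. The sketch below is only an illustration with names of my own choosing: the second function executes more instructions per iteration, but its four additions do not depend on each other, so a CPU that can run independent instructions in parallel may finish it sooner. Whether it actually does on your machine and compiler is exactly what you need to measure.

```cpp
#include <cstddef>
#include <vector>

// One accumulator: every addition has to wait for the previous one.
double sum_single(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) {
        s += x;
    }
    return s;
}

// Four accumulators: four independent dependency chains per iteration.
// Note: floating-point addition is not associative, so the result can differ
// from sum_single in the last bits for general input.
double sum_four(const std::vector<double>& v) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; ++i) {  // leftover elements when n is not a multiple of 4
        s0 += v[i];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Looking at the assembly alone, the second version looks like more work. That is why the assembly and the measurement have to be read together.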
Optimize the compiler, do not perform its job
There are people who believe that the only way of getting good code is to write the assembly themselves. In short: don’t do it! Most importantly, you are making your program less portable; and if the assembly is optional, the fallback code that replaces it ends up poorly tested. And anyway, you are writing the assembly for your own CPU. If you are writing for Intel, then AMD may work better with different code; if you are writing for AMD, then VIA may like yet another solution. And I’m only considering x86 here.
If you are writing in a high-level programming language like C or C++, write in it. If the compiler doesn’t generate the desired assembly, try changing the code to give it a hint. Sometimes you just need to reorder the variables a little, or introduce a helper variable for an intermediate result.
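Here is a small sketch of what such a helper variable can look like. The function names are hypothetical. In the first version the compiler has to consider that out might point into the input array, so it may be forced to store to memory on every iteration; in the second, the local accumulator tells it that nothing else can observe the intermediate value. Some compilers work around this on their own, so treat it as something to verify in the assembly.

```cpp
#include <cstddef>

// Accumulates directly through the output pointer. Because `out` could alias
// an element of `in`, the compiler may have to write *out back to memory in
// each iteration.
void sum_through_pointer(const int* in, std::size_t n, int* out) {
    *out = 0;
    for (std::size_t i = 0; i < n; ++i) {
        *out += in[i];
    }
}

// Same computation with a helper variable for the intermediate result.
// `acc` is a local whose address is never taken, so it can stay in a register.
// (The two functions only behave identically when `out` does not point into
// `in`, which is the case the author of such code usually has in mind.)
void sum_with_local(const int* in, std::size_t n, int* out) {
    int acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += in[i];
    }
    *out = acc;
}
```

The change is small, stays in plain C++, and remains portable, which is the whole point of hinting instead of dropping to assembly.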
Often it is actually beneficial to replace your complex, “optimized” construct with a simpler one. The latter will benefit both the users (who will find it easier to understand and fiddle with) and the compiler (which will find it easier to understand, and to replace with much better optimized code).
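The XOR swap is an example of such a construct that I picked for illustration. It is harder to read, it silently zeroes the value if both references name the same object (hence the guard below), and I would not expect it to beat the plain version on a modern compiler, although you should check that yourself.

```cpp
#include <utility>

// The "clever" version: swaps without a temporary.
void swap_xor(unsigned& a, unsigned& b) {
    if (&a == &b) {
        return;  // without this guard, a ^= a would set the value to zero
    }
    a ^= b;
    b ^= a;
    a ^= b;
}

// The simple version: says what it means and leaves the rest to the compiler.
void swap_plain(unsigned& a, unsigned& b) {
    std::swap(a, b);
}
```

The simple version has no special case to remember, and the compiler sees a plain exchange of two values instead of three dependent XOR operations it first has to recognize.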
And if you still can’t get your desired assembly, and you are sure that it would actually be better, report a bug. Get your compiler fixed so that it can do its job better and benefit all the projects using it, not only yours.