Nobody Cares About a Few Million Nanoseconds

Unless the software runs on a computer that is going into space, into a nuclear reactor, or into someone’s chest cavity, these micro-optimizations don’t matter 97% of the time. Even when programming for smartphones (which have limited system resources), you need to focus on changes that yield order-of-magnitude improvements, not micro-optimizations. These usually involve caching data or using entirely different algorithms, not tiny clever tricks that make the code inscrutable.
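
As a minimal sketch of what "caching data" can buy, assume a deliberately naive recursive Fibonacci function (a hypothetical example, not taken from the article). Wrapping the same logic in the standard library's functools.lru_cache replaces an exponential number of calls with one call per distinct input, and the code stays just as readable. The call_count dictionary is illustration-only bookkeeping.

```python
from functools import lru_cache

# Illustration-only bookkeeping: how many times each function body runs.
call_count = {"naive": 0, "cached": 0}


def fib_naive(n):
    """Recomputes the same subproblems repeatedly (exponential number of calls)."""
    call_count["naive"] += 1
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same logic, but the body runs only once for each distinct n."""
    call_count["cached"] += 1
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


if __name__ == "__main__":
    n = 20
    assert fib_naive(n) == fib_cached(n)
    print(call_count)  # the naive count is far larger than the cached one
```

The gap between the two counts widens quickly as n grows, which is the kind of improvement no amount of clever swapping or bit-twiddling will deliver.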

Code that is straightforward to read is easy to understand. Code that is easy to understand is less likely to have bugs. Code that is easy to understand is easy to extend with new features. And it’s bug-free programs with cool features that people want. Nobody cares about a few million nanoseconds.

More info about this topic here:

Also, to head off criticisms of this article, there are times when micro-optimizations are needed. That’s why Knuth says they’re unneeded only 97% of the time.


13 comments on this post.
  1. Stan:

    One thing I often have to point out to new programmers is that there does not have to be any direct 1-to-1 mapping from variables in your code to registers or even memory locations in the compiled machine code. Reusing a variable for a different purpose doesn’t save you any space because the compiler does its register allocation independent of your variable naming.

  2. Anon:

    Great article! You only fall flat on “Unless the software is being run on a computer that is going into space, a nuclear reactor, or someone’s chest cavity”.
    Those are known to be the last places where anyone would trade readability, which brings reliability, for speed.
    Usually speed optimizations are mandatory in OS kernels, system libraries, CPU-heavy interactive applications like VGs, and embedded systems (the real ones, not smartphones).

  3. Adam:

    Actually, heavy manual optimization is frequently still necessary for spaceflight software due to the heavily constrained nature of the processors on board.

  4. tshirtman:

    a = a + b;
    b = a - b;
    a = a - b;

    That hack also has a non-obvious problem for beginners: a and b are 32-bit integers, right? Now imagine a + b is either greater than 2^31 - 1 or less than -2^31… yes, you have an overflow, and the swap can produce wrong values for a and b…

    And yeah, I think using Python is fast enough for most things, and when obvious code is not, cProfile will help you. Also, RunSnakeRun is a great GUI to make sense of the results…

    The link on my name is my Python game project, which shows you can do complex games in pure Python/pygame… it's a clone of the Smash Bros Nintendo games… some work left, but really enough to show to clone those 'python is slow' arguments… it's all about an efficient algorithm (seeing real complexity, not saving "nothings" everywhere).

  5. tshirtman:

    to close*

    gah >_<

  6. Dino:

    I’ve seen the “temp swap” done as:
    a ^= b;
    b ^= a;
    a ^= b;
    See :

    I think it’s fair to say that you should never say never. In the situation where you find yourself doing integer (or float) swaps enough to warrant it, write an inline function to do it. Then you can elect to hide whatever implementation you desire as the core meaning behind the swap is clear from the function name.

    And I disagree about using doubles instead of floats! The accuracy of a float is enough to give an error of ~4m on scales approaching those needed to get a satellite into geostationary orbit. That’s better than you’ll get from a standard GPS device. Or to put it another way, on a metre scale, you can get an accuracy of about 1 nanometre, which is roughly in line with the diameter of atoms. I’d say that was enough for most people. Using doubles doesn’t eliminate rounding errors completely, it just makes them less common. If you want to use floats for representing currency, you probably need your head examining, horses for courses!

    Other than that, it’s a fine article :)

  7. ö:

    if i need to swap, i just use

    a, b = b, a

  8. UncleLaz:

    a, b = b, a

    works literally with any type of variable

  9. alan:

    I completely agree, except for your statement about floats versus doubles. On a lot of hardware doubles are much slower than floats, and it’s extremely difficult to convert code from using doubles to code that uses floats (for the reasons you mentioned). The other problem with doubles is that the profiler doesn’t really show issues with them, since it’s not like the profiler is going to tell you that your multiplies are slow. Instead you’ll just run into more cache misses and all of your math will be slightly slower.

    Other problems include talking with your GPU (since it is all floats or less) since now your GPU and CPU will have a different representation of the number (slower, and also can be the source of bugs).

    And, honestly, if you’re writing a game that requires the precision of a double, there’s probably something else wrong.

  10. ckunte:

    For a moment, I thought your example was meant for a language other than Python. I mean, the declarations are not on the same line, so what’s with your use of semicolons?

  11. Omid B.:

    Excellent article! I wish all programmers could think this way. I preach all day long to keep developments simple, but it is not an easy task. Some developers tend to complexify their developments because it shows that they are “smart” (am I wrong?). But they often forget that “smart” also means developing code that is simple to maintain, human readable, and that complies with the customer’s budget!

  12. Michael Hunter:

    Instead of “don’t do micro-opt”, say “do focus on algorithms and abstractions”. The fact is, depending on the application, micro-opt can be appropriate in parts of your code that are heavily used. You see plenty of this appropriately used in the Python world. Way more than in just nuclear reactors, etc. The problem with speeding up many programs that haven’t been designed well isn’t that you couldn’t micro-opt your way there, but that the places you would need to micro-opt are spread all over the place. If they are behind reasonable abstractions then you can apply micro-opt (after measuring) in a judicious way. And of course if you blow the algorithms up front and/or the use of those algorithms is appropriately abstracted then micro-opt is often not a good enough answer.

  13. James Thiele:

    You write:
    •Doing math in the code for the computer, such as multiplying a value by 525600 (the number of minutes in a year) instead of multiplying a value by 60 * 24 * 365 (which is more obvious).

    And makes it obvious that this is wrong for leap years. :)
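
To make the point in the last comment concrete, here is a small sketch in Python, the language the commenters assume. The names MINUTES_PER_DAY and minutes_in_year are hypothetical, introduced only for illustration. Spelling out the factors keeps the day count visible, and the standard library's calendar.isleap covers the case that a bare 525600 hides.

```python
import calendar

# Hypothetical constant: the factors stay visible, and the product is
# computed once when the module loads, not on every call.
MINUTES_PER_DAY = 60 * 24


def minutes_in_year(year):
    """Return the number of minutes in the given Gregorian calendar year."""
    days = 366 if calendar.isleap(year) else 365
    return MINUTES_PER_DAY * days


if __name__ == "__main__":
    for year in (2023, 2024):
        print(year, minutes_in_year(year))
```

Written this way, the readable version is also the one that makes the leap-year assumption easy to spot and fix.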
