{"id":420,"date":"2015-03-27T12:25:13","date_gmt":"2015-03-27T12:25:13","guid":{"rendered":"http:\/\/blogs.gentoo.org\/lu_zero\/?p=420"},"modified":"2015-04-17T15:52:23","modified_gmt":"2015-04-17T15:52:23","slug":"again-on-assert","status":"publish","type":"post","link":"https:\/\/blogs.gentoo.org\/lu_zero\/2015\/03\/27\/again-on-assert\/","title":{"rendered":"Again on assert()"},"content":{"rendered":"<p>Since apparently there are still people not reading the fine <a href=\"http:\/\/man7.org\/linux\/man-pages\/man3\/assert.3.html\">man<\/a> page.<\/p>\n<blockquote><p>\nIf the macro NDEBUG was defined at the moment &lt;assert.h&gt; was last included, the macro assert() generates no code, and hence does nothing at all.<br \/>\nOtherwise, the macro assert() prints an error message to standard error and terminates the program by calling abort(3) if expression is false (i.e., compares equal to zero).<br \/>\nThe <strong>purpose of this macro is to help the programmer find bugs in his program<\/strong>.  The message &#8220;assertion failed in file foo.c, function do_bar(), line 1287&#8221; is of <strong>no help at all<\/strong> to a user.\n<\/p><\/blockquote>\n<p>I guess it is time to return to <strong>security<\/strong> and expand a bit on which practices are good and which are misguided ideas that should be <strong>eradicated<\/strong> to reduce the number of Denial of Service bugs waiting to happen.<\/p>\n<h2>Security issues<\/h2>\n<p>The term <em>&#8220;Security issue&#8221;<\/em> covers a lot of different kinds of situations. 
Usually unhandled paths in the code lead to <strong>memory corruption<\/strong>, <strong>memory leaks<\/strong>, <strong>crashes<\/strong> and other less evident problems such as <strong>information leaks<\/strong>.<\/p>\n<p>I&#8217;m focusing on <strong>crashes<\/strong> today. It is usually assumed that the other problems are more annoying or dangerous; that might be true or <strong>not<\/strong>, depending on the scenario:<\/p>\n<p>If you are watching a movie and a glitch in the bitstream makes the application <strong>leak some memory<\/strong>, you would not care at all as long as you can enjoy your movie. If the same glitch makes VLC close suddenly a second before you get to see who is the mastermind behind a really twisted plot&#8230; I guess you&#8217;ll scream at whoever thought it was a good idea to <strong>crash<\/strong> there.<\/p>\n<p>If a glitch might let an attacker run <strong>arbitrary code<\/strong> while you are watching your movie, you would probably prefer your player to just <strong>crash<\/strong> instead.<\/p>\n<p>It is a <strong>false dichotomy<\/strong>, since what you want is to have the glitch handled <strong>properly<\/strong> and to keep watching the rest of the movie, instead of having VLC crash without giving you any meaningful information.<\/p>\n<div class=\"alert alert-info\">\nErrors <b>must<\/b> be handled; trading a <b>crash<\/b> for something else you consider worse is just being <b>naive<\/b>.\n<\/div>\n<h2>What is assert exactly?<\/h2>\n<p><a href=\"http:\/\/man7.org\/linux\/man-pages\/man3\/assert.3.html\">assert<\/a> is a debugging facility mandated by POSIX, C89 and C99. It is a macro that more or less looks like this:<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"cp\">#define assert(condition)                              \\<\/span>\n<span class=\"cp\">    if (condition) {                                   \\<\/span>\n<span class=\"cp\">        do_nothing();                                  \\<\/span>\n<span 
class=\"cp\">    } else {                                           \\<\/span>\n<span class=\"cp\">       fprintf(stderr, &quot;%d %s&quot;, __LINE__, __func__);   \\<\/span>\n<span class=\"cp\">       abort();                                        \\<\/span>\n<span class=\"cp\">    }<\/span>\n<\/pre>\n<\/div>\n<p>If the condition does not hold, <a href=\"http:\/\/man7.org\/linux\/man-pages\/man3\/abort.3.html\">crash<\/a>. Here is the real-life version from <a href=\"http:\/\/git.musl-libc.org\/cgit\/musl\/tree\/include\/assert.h\">musl<\/a>:<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"cp\">#define assert(x) ((void)((x) || (__assert_fail(#x, __FILE__, __LINE__, __func__),0)))<\/span>\n<\/pre>\n<\/div>\n<h2>How to use it<\/h2>\n<blockquote><p>\nAssert should be use to verify assumptions. While <strong>developing<\/strong> they help you to verify if your<br \/>\nassumptions meet reality. If not they tell you that should investigate because something is<br \/>\nclearly wrong. They are not intended to be used in release builds.<br \/>\n<cite>&#8211; some wise Federico while talking about another language asserts<\/cite>\n<\/p><\/blockquote>\n<p>When you write some code and want to make sure you aren&#8217;t doing anything wrong, you usually start with something like this:<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"kt\">int<\/span> <span class=\"nf\">my_function_doing_difficult_computations<\/span><span class=\"p\">(<\/span><span class=\"n\">Structure<\/span> <span class=\"o\">*<\/span><span class=\"n\">s<\/span><span class=\"p\">)<\/span>\n<span class=\"p\">{<\/span>\n   <span class=\"n\">a<\/span> <span class=\"o\">=<\/span> <span class=\"n\">some_computation<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">);<\/span>\n   <span class=\"p\">....<\/span>\n   <span class=\"n\">b<\/span> <span class=\"o\">=<\/span> <span class=\"n\">other_operations<\/span><span class=\"p\">(<\/span><span class=\"n\">a<\/span><span 
class=\"p\">,<\/span> <span class=\"n\">s<\/span><span class=\"p\">);<\/span>\n   <span class=\"p\">....<\/span>\n   <span class=\"n\">c<\/span> <span class=\"o\">=<\/span> <span class=\"n\">some_input<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">,<\/span> <span class=\"n\">b<\/span><span class=\"p\">);<\/span>\n   <span class=\"p\">...<\/span>\n   <span class=\"n\">idx<\/span> <span class=\"o\">=<\/span> <span class=\"n\">some_operation<\/span><span class=\"p\">(<\/span><span class=\"n\">a<\/span><span class=\"p\">,<\/span> <span class=\"n\">b<\/span><span class=\"p\">,<\/span> <span class=\"n\">c<\/span><span class=\"p\">);<\/span>\n\n   <span class=\"k\">return<\/span> <span class=\"n\">some_lut<\/span><span class=\"p\">[<\/span><span class=\"n\">idx<\/span><span class=\"p\">];<\/span>\n<span class=\"p\">}<\/span>\n<\/pre>\n<\/div>\n<p>Here <code>idx<\/code> is a signed integer, and so are <code>a<\/code>, <code>b<\/code> and <code>c<\/code>, each with a range that might or might not depend on some external input.<\/p>\n<p>You do not want <code>idx<\/code> to fall outside the range of the lookup table array <code>some_lut<\/code>, and you are not so sure it cannot. How do you check that you aren&#8217;t reading outside the array?<\/p>\n<p>When you write code you usually improve a prototype iteratively. You can add tests to make sure every function returns values within the expected range, and you can use <code>assert()<\/code> as a poor man&#8217;s C version of proper unit testing.<\/p>\n<p>If some function depends on values outside your control (e.g. an input file), you usually validate them and cleanly <strong>error out<\/strong> there. 
Leaving external inputs unaccounted for or, even worse, putting an <code>assert()<\/code> there is really bad.<\/p>\n<h3>Unit testing and assert()<\/h3>\n<p>We want to make sure our function works fine, so let&#8217;s write a really tiny test.<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"kt\">void<\/span> <span class=\"nf\">test_some_computation<\/span><span class=\"p\">(<\/span><span class=\"kt\">void<\/span><span class=\"p\">)<\/span>\n<span class=\"p\">{<\/span>\n    <span class=\"n\">Structure<\/span> <span class=\"o\">*<\/span><span class=\"n\">s<\/span> <span class=\"o\">=<\/span> <span class=\"nb\">NULL<\/span><span class=\"p\">;<\/span>\n    <span class=\"kt\">int<\/span> <span class=\"n\">i<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">0<\/span><span class=\"p\">;<\/span>\n    <span class=\"k\">while<\/span> <span class=\"p\">(<\/span><span class=\"n\">input_generator<\/span><span class=\"p\">(<\/span><span class=\"o\">&amp;<\/span><span class=\"n\">s<\/span><span class=\"p\">,<\/span> <span class=\"n\">i<\/span><span class=\"p\">))<\/span> <span class=\"p\">{<\/span>\n       <span class=\"kt\">int<\/span> <span class=\"n\">a<\/span> <span class=\"o\">=<\/span> <span class=\"n\">some_computation<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">);<\/span>\n       <span class=\"n\">assert<\/span><span class=\"p\">(<\/span><span class=\"n\">a<\/span> <span class=\"o\">&gt;<\/span> <span class=\"mi\">0<\/span> <span class=\"o\">&amp;&amp;<\/span> <span class=\"n\">a<\/span> <span class=\"o\">&lt;<\/span> <span class=\"mi\">10<\/span><span class=\"p\">);<\/span>\n    <span class=\"p\">}<\/span>\n<span class=\"p\">}<\/span>\n<\/pre>\n<\/div>\n<p>It is compact, and you can then run your test under <strong>gdb<\/strong> and inspect things a bit. 
This is quite good if you are refactoring the innards of <code>some_computation()<\/code> and you want to be sure you did not miss some corner case.<\/p>\n<p>Here <code>assert()<\/code> is quite nice, since we can pack the test case into a single line and get a simple report if something goes wrong. We could do better, though, since assert does not tell us the <strong>value<\/strong> or how we ended up there.<\/p>\n<p>You might not be that thorough, and you can just decide to put the same assert in your function and check there, assuming you cover the whole input space properly using regression tests.<\/p>\n<h3>To crash or not to crash<\/h3>\n<p>The people who consider <strong>crashing<\/strong> at runtime OK (remember the sad user that cannot watch his wonderful movie till the end?) suggest leaving the assert enabled at runtime.<\/p>\n<p>If you consider the example above, would it be better to <strong>crash<\/strong> than to read a random value from memory? Again, this is a <strong>false<\/strong> dichotomy!<\/p>\n<p>You can expect failures, e.g. broken bitstreams, and you want to just <strong>check<\/strong> for them and return a proper <strong>failure<\/strong> message.<\/p>\n<p>In our case the return value of <code>some_input()<\/code> should be checked for failures and the error forwarded further up to the library user, who will then decide what to do.<\/p>\n<p>That leaves the access to the lookup table. If you didn&#8217;t check the other functions sufficiently you might get a bogus index, and with a bogus index you will read from random memory (crashing or not depending on whether that memory is at an address mapped into the program). Do you want to have an <code>assert()<\/code> there? 
Or would you rather add another normal check with a normal failure path?<\/p>\n<p>A correct answer is to test your code enough that you do not need to add yet another check. In fact, if the problem arises it is <strong>wrong<\/strong> to add a check there or, even worse, an <code>assert()<\/code>. You should instead go up the execution path and fix the problem where it is: a non-validated input, a wrong <em>&#8220;optimization&#8221;<\/em> or something sillier.<\/p>\n<p>There is an open debate on whether leaving <code>assert()<\/code> enabled is good or bad practice when talking about defensive design. In <strong>C<\/strong>, in my opinion, it is a complete misuse. If you want to litter your release code with tons of <strong>branches<\/strong>, you can also spend the time to implement something better and make sure to clean up correctly. Calling <code>abort()<\/code> possibly leaves your input and output in a severely inconsistent state.<\/p>\n<h2>How to use it the wrong way<\/h2>\n<blockquote><p>\nI want to trade a crash anytime the alternative is memory corruption<br \/>\n<cite>&#8211; some misguided guy<\/cite>\n<\/p><\/blockquote>\n<p>Assume you have something like this:<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"kt\">int<\/span> <span class=\"n\">size<\/span> <span class=\"o\">=<\/span> <span class=\"n\">some_computation<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">);<\/span>\n<span class=\"kt\">uint8_t<\/span> <span class=\"o\">*<\/span><span class=\"n\">p<\/span><span class=\"p\">;<\/span>\n<span class=\"kt\">uint8_t<\/span> <span class=\"o\">*<\/span><span class=\"n\">buf<\/span> <span class=\"o\">=<\/span> <span class=\"n\">p<\/span> <span class=\"o\">=<\/span> <span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span><span class=\"p\">);<\/span>\n\n\n<span class=\"k\">while<\/span> <span class=\"p\">(<\/span><span class=\"n\">some_related_computations<\/span><span class=\"p\">(<\/span><span 
class=\"n\">s<\/span><span class=\"p\">))<\/span> <span class=\"p\">{<\/span>\n   <span class=\"n\">do_stuff_<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">,<\/span> <span class=\"n\">p<\/span><span class=\"p\">);<\/span>\n   <span class=\"n\">p<\/span> <span class=\"o\">+=<\/span> <span class=\"mi\">4<\/span><span class=\"p\">;<\/span>\n<span class=\"p\">}<\/span>\n\n<span class=\"n\">assert<\/span><span class=\"p\">(<\/span><span class=\"n\">p<\/span> <span class=\"o\">-<\/span> <span class=\"n\">buf<\/span> <span class=\"o\">==<\/span> <span class=\"n\">size<\/span><span class=\"p\">);<\/span>\n<\/pre>\n<\/div>\n<p>If <code>some_computation()<\/code> and <code>some_related_computations()<\/code> do not agree, you might write past the end of the allocated buffer! The naive person above starts talking about how the memory is corrupted by <code>do_stuff_()<\/code> and how <strong>horrible things<\/strong> (e.g. foreign code execution) could happen without the <code>assert()<\/code>, and how even calling <code>return<\/code> at that point is terrible and would lead to horrible, horrible things.<\/p>\n<p>Ok, <strong>NO<\/strong>. Stop <strong>NOW<\/strong>. Go up and look at how assert is implemented. If you only notice at that point that something went wrong, the <strong>corruption<\/strong> has already happened. No matter what you do, somebody could exploit it, depending on how naive or unlucky you have been.<\/p>\n<p>Remember: <code>assert()<\/code> does <strong>I\/O<\/strong>, <strong>allocates memory<\/strong>, <strong>raises a signal<\/strong> and <strong>calls functions<\/strong>. 
Everything you would rather not do when your memory is corrupted is done by <code>assert()<\/code>.<\/p>\n<p>You can be less <strong>naive<\/strong>.<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"kt\">int<\/span> <span class=\"n\">size<\/span> <span class=\"o\">=<\/span> <span class=\"n\">some_computation<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">);<\/span>\n<span class=\"kt\">uint8_t<\/span> <span class=\"o\">*<\/span><span class=\"n\">p<\/span><span class=\"p\">;<\/span>\n<span class=\"kt\">uint8_t<\/span> <span class=\"o\">*<\/span><span class=\"n\">buf<\/span> <span class=\"o\">=<\/span> <span class=\"n\">p<\/span> <span class=\"o\">=<\/span> <span class=\"n\">malloc<\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span><span class=\"p\">);<\/span>\n\n<span class=\"k\">while<\/span> <span class=\"p\">(<\/span><span class=\"n\">some_related_computations<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">)<\/span> <span class=\"o\">&amp;&amp;<\/span> <span class=\"n\">size<\/span> <span class=\"o\">&gt;=<\/span> <span class=\"mi\">4<\/span><span class=\"p\">)<\/span> <span class=\"p\">{<\/span>\n   <span class=\"n\">do_stuff_<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">,<\/span> <span class=\"n\">p<\/span><span class=\"p\">);<\/span>\n   <span class=\"n\">p<\/span>    <span class=\"o\">+=<\/span> <span class=\"mi\">4<\/span><span class=\"p\">;<\/span>\n   <span class=\"n\">size<\/span> <span class=\"o\">-=<\/span> <span class=\"mi\">4<\/span><span class=\"p\">;<\/span>\n<span class=\"p\">}<\/span>\n<span class=\"n\">assert<\/span><span class=\"p\">(<\/span><span class=\"n\">size<\/span> <span class=\"o\">==<\/span> <span class=\"mi\">0<\/span><span class=\"p\">);<\/span>\n<\/pre>\n<\/div>\n<p>But then, instead of the assert, you can just add:<\/p>\n<div class=\"codehilite\">\n<pre><span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span 
class=\"n\">size<\/span> <span class=\"o\">!=<\/span> <span class=\"mi\">0<\/span><span class=\"p\">)<\/span> <span class=\"p\">{<\/span>\n    <span class=\"n\">msg<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;Something went really wrong!&quot;<\/span><span class=\"p\">);<\/span>\n    <span class=\"n\">log<\/span><span class=\"p\">(<\/span><span class=\"s\">&quot;The state is %p&quot;<\/span><span class=\"p\">,<\/span> <span class=\"n\">s<\/span><span class=\"o\">-&gt;<\/span><span class=\"n\">some_state<\/span><span class=\"p\">);<\/span>\n    <span class=\"n\">cleanup<\/span><span class=\"p\">(<\/span><span class=\"n\">s<\/span><span class=\"p\">);<\/span>\n    <span class=\"k\">goto<\/span> <span class=\"n\">fail<\/span><span class=\"p\">;<\/span>\n<span class=\"p\">}<\/span>\n<\/pre>\n<\/div>\n<p>This way, when the <em>&#8220;impossible&#8221;<\/em> happens, the user gets a proper notification, you can recover cleanly, and no memory corruption ever happens.<\/p>\n<h2>Better than assert<\/h2>\n<p>Although it is easy to use and portable, <code>assert()<\/code> does not provide that much information. There are plenty of tools that can be leveraged to get better reporting.<\/p>\n<ul>\n<li>GDB has <a href=\"https:\/\/sourceware.org\/gdb\/onlinedocs\/gdb\/Signals.html\">some<\/a> <a href=\"https:\/\/sourceware.org\/gdb\/onlinedocs\/gdb\/Set-Watchpoints.html\">capabilities<\/a> that you might enjoy.<\/li>\n<li><a href=\"http:\/\/valgrind.org\">valgrind<\/a> can be used in non-<a href=\"https:\/\/blog.mozilla.org\/nnethercote\/2011\/01\/11\/using-valgrind-to-get-stack-traces\/\">obvious<\/a> ways.<\/li>\n<li><a href=\"https:\/\/wiki.libav.org\/Security\/Tools#AddressSanitizer_.28gcc.2C_clang.29\">asan<\/a> can get pretty handy with <code>__asan_describe_address<\/code>.<\/li>\n<\/ul>\n<h2>In Closing<\/h2>\n<p><code>assert()<\/code> is a really nice debugging tool, and it helps a lot to make sure some state remains invariant while 
refactoring.<\/p>\n<p>Leaving asserts in release code, on the other hand, is <strong>quite<\/strong> wrong: it does not give you any additional safety. Please do not buy the fairy tale that <code>assert()<\/code> saves you from the scary memory corruption issues; it does <em>NOT<\/em>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Since apparently there are still people not reading the fine man page. If the macro NDEBUG was defined at the moment &lt;assert.h&gt; was last included, the macro assert() generates no code, and hence does nothing at all. Otherwise, the macro assert() prints an error message to standard error and terminates the program by calling abort(3) if &hellip; <a href=\"https:\/\/blogs.gentoo.org\/lu_zero\/2015\/03\/27\/again-on-assert\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Again on assert()<\/span><\/a><\/p>\n","protected":false},"author":10,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[3,14],"tags":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1aGWH-6M","_links":{"self":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/420"}],"collection":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/users\/10"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/comments?post=420"}],"version-history":[{"count":9,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/420\/revisions"}],"predecessor-version":[{"id":446,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/posts\/420\/revisions\/446"}],"wp:attachment":[{"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/media?parent=420"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/categories?post=420"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.gentoo.org\/lu_zero\/wp-json\/wp\/v2\/tags?post=420"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}