codeblog code is freedom — patching my itch

December 16, 2010

gcc-4.5 and -D_FORTIFY_SOURCE=2 with “header” structures

Filed under: Blogging,Debian,Security,Ubuntu,Ubuntu-Server — kees @ 6:11 pm

Recently gcc (4.5) improved its ability to see the size of various structures. As a result, the FORTIFY protections have suddenly gotten a bit stricter. In the past, you used to be able to do things like this:

struct thingy {
    int magic;
    char data[4];
}

void work(char *input) {
    char buffer[1000];
    int length;
    struct thingy *header;

    header = (struct thingy *)buffer;

    length = strlen(input);
    if (length > sizeof(buffer) - sizeof(*header) - 1) abort();

    strcpy(header->data, input);
    header->magic = 42;

    do_something_fun(header);
}

The problem here is that gcc thinks that header->data is only 4 bytes long. But gcc doesn’t know we intentionally overruled this (and even did length checking), so due to -D_FORTIFY_SOURCE=2, the strcpy() checks kick in when input is more than 4 bytes.

The fix, in this case, is to use memcpy() instead, since we actually know how long our destination is, we can replace the strcpy(...) line with:

    memcpy(header->data, input, length + 1); /* take 0-term too */

This kind of header and then data stuff is common for protocol handlers. So far, things like Wine, TFTP, and others have been experiencing problems with the change. Please keep an eye out for it when doing testing.

© 2010, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.
CC BY-SA 4.0

7 Comments

  1. Thanks for explaining why strncpy doesn’t work in this case either. Is this post because of FTBFS fix I have submitted to htmldoc?

    Comment by Dmitrijs Ledkovs — December 17, 2010 @ 12:42 am

  2. If you see something like that, please do not just replace the strcpy with a memcpy, but fix some of the real problems here:

    * casting things illegally to a struct:
    char buffer[1000];
    struct thingy *header;
    header = (struct thingy *)buffer;

    That’s a big NO, NEVER, NO. You can case a char * pointer to a struct only if you know it was a struct originally (or a struct with this struct as first member, or a union with this struct as first member). Casting a char buffer to a struct causes undefined behaviour (and will likely break your stuff on many architectures).

    * accessing stuff after the end of the struct
    You cannot access stuff after the end of the struct. This is not allowed in C. That you see a breakage with strcpy here shows that you should have never done it. Using a memcpy here means that it again will work some time until gcc gets better again an realized that your code is not valid. You either need c99 stuff: (char data[]) or you need to case to a compatible struct (i.e. some with the same initial things, or being two parts of an union) and access there.

    And yes, protocol handlers are often bad code. Please don’t code them in C if you don’t know the basics about C.

    Comment by Bernhard R. Link — December 17, 2010 @ 4:11 am

  3. @Dmitrijs: htmldoc was another one, but yeah, the continuing problems we’ve been seeing after moving to gcc-4.5 is what caused me to write this up.

    @Bernhard Oh, I agree. The casting the core of the problem. Unfortunately, this still shows up in software in the Debian and Ubuntu archives. :(

    Comment by kees — December 17, 2010 @ 10:35 am

  4. Bernhard wrote:

    : You cannot access stuff after the end of the struct.

    I disagree. The original code is perfectly valid C and this technique is quite common, especially in networking code. It is such a common idiom, that gcc added support for vector of size 0 (http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html) and C99 added flexible arrays. Prior to that, the most portable way to implement the idiom was to create a vector of size >= 1.

    In fact, -DFORTIFY_SOURCE=2 should not be used by default to build an entire distribution in my opinion since it is known to break valid C programs. -DFORTIFY_SOURCE=1 should be safe as far as I know. Of course making your program also work with -DFORTIFY_SOURCE=2 is desirable though.

    Comment by dominiko — December 18, 2010 @ 6:47 am

  5. dominoko wrote:
    : I disagree. The original code is perfectly valid C and this technique is quite common, especially in networking code. It is such a common idiom, that gcc added support for vector of size 0 and C99 added flexible arrays.

    I know that this is common code. But it was never valid code. One of the reasons why gcc introduced this and why c99 introduced a way to do this is that there simply is no valid way to do so without that.

    : Prior to that, the most portable way to implement the idiom was to create a vector of size >= 1.
    While it looks portable, it is not as it uses undefined behaviour. The portable way is to do some pointer arithmetic (with possible some struct used to get some padding information), using this technique is the lazy way.
    Sometimes using undefined behaviour might be usefull. Sometimes it will later get authorized (like the polymorphism by casting structs to those with a same initial member that c99 made valid), but with this history went another way: There is now another and valid way to get it done lazily and the old lazy way is still undefined and if you use it, compilers are still allowed to bite you and do so (some claim in this case even without fortify-source).

    Comment by Bernhard R. Link — December 18, 2010 @ 11:35 am

  6. Switching strcpy() for memcpy() is an annoying solution. Later static checkers will check that memcpy() doesn’t overflow the buffer and we’ll have to re-write the code again.

    Why is it specified as a 4 char array? If the size isn’t known, it would be more normally to declare it as a 0 or 1 character array. Fortify source doesn’t check really small arrays, and neither does Smatch.

    I’ve seen this before where gcc warned about uninitialized variables and everyone declared their pointers as NULL or their array offsets as -1. Then Smatch comes along and it complains about NULL dereferences and bogus array offsets and unreachable returns. It masks real bugs.

    Comment by Dan Carpenter — January 6, 2011 @ 3:00 am

  7. I agree with Bernhard’s argument that you shouldn’t access memory beyond the end of struct.

    But I’m guilty of casting buffers to structs on multiple accounts – using gcc’s __attribute__((packed)) whenever I need those buffers to be portable across architectures. If this is invalid code and undefined behavior, what is then the correct way or serializing and deserializing a struct? Reading buffer byte by byte and using arithmetic for multibyte integers? That would mean replacing 2 lines of code with 200. I doubt that’s a good idea.

    Comment by Tomaž — September 5, 2011 @ 2:41 am

Powered by WordPress