Discovery: Integer overflow in scanf() family functions in MinGW, Cygwin, Embarcadero C and other environments when reading a number into a char variable


Sat, 11 March 2017

Introduction

While digging deep into the details of an error that I describe in my series of articles "C Language - Time-consuming errors that make programmers jump off bridges", I noticed a very interesting thing: at the time of writing this article, all programs compiled on Windows with the MinGW GCC, Cygwin GCC or Embarcadero/Borland C compilers that read numeric data into a variable of type char using the scanf() family of functions are vulnerable to an integer overflow bug (!) :) However, this applies only to programs that use the particular scanf format specifier described below.



Once the moment of panic has passed, you can get back to the article :) in which I will try to shed some light on what the problem is.

What is this error?

The problem lies not so much in the compiler (well, maybe a little bit; more about that in a moment) as in the Windows MSVCRT library, which contains the implementation of the C standard library and therefore of functions such as scanf. This library does not conform to the C99 standard (ISO 9899:1999), only to C89 at most, and does not implement all of the scanf format specifiers (specifically, the ones listed in paragraph 11 on page 358 [displayed as p. 370] of the C99 specification).






The essence of this error is that when a function of the scanf family receives format specifiers it does not support, it does not skip the affected element of the format but tries to load it anyway. Whenever the unsupported specifiers would reduce the size of the type being loaded (e.g. from int to short int), the result is an integer overflow.



MSVCRT does not implement, for example, the "hh" length modifier. We can of course write "%hhu" (which means reading a 1-byte value of type unsigned char), but in these environments scanf still stores a 4-byte integer (it drops every "h" from the format and treats it as "%u"), which ends up overwriting the 3 bytes of memory that follow our single-byte unsigned char variable. Moreover, in the example given below (which keeps the scanf() format in a separate variable; this is important) the compiler does not even warn us that the "h" specifier is unknown (even with the -Wall -Wextra parameters turned on).
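To make the mechanism easier to picture, here is a minimal sketch that models it without relying on a vulnerable runtime. The struct frame_model and the function store_as_four_bytes are hypothetical names invented for this illustration; the model assumes a little-endian machine (such as x86) and that the neighbouring bytes lie directly behind the 1-byte variable.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the memory around the variable (not real stack layout). */
struct frame_model {
    unsigned char user_number;   /* the 1-byte target of "%hhu" */
    unsigned char neighbours[3]; /* whatever happens to follow it in memory */
};

/* Mimics a scanf that ignores "hh" and stores a 4-byte little-endian value
   starting at the address of the 1-byte variable. */
void store_as_four_bytes(struct frame_model *frame, uint32_t value)
{
    unsigned char bytes[4];
    for (int i = 0; i < 4; i++) {
        bytes[i] = (unsigned char)(value >> (8 * i)); /* lowest byte first */
    }
    memcpy(frame, bytes, sizeof bytes);
}
```

Storing 256 this way leaves user_number equal to 0 and sets the first neighbouring byte to 1, while storing 255 leaves the neighbours at 0. Any input above 255 therefore changes memory that does not belong to the variable.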

The MinGW compiler can warn you, but only if you pass the format string as a literal and compile with -Wall (the Embarcadero/Borland compiler never warns, not even when compiling with the -w parameter). Interestingly, from my conversations with Embarcadero it appears that their compiler does not use MSVCRT.





When does the compiler warn about the unknown "h" specifier?

| Compiler name | When does it warn? |
|---|---|
| MinGW | Only when the format argument of scanf is a literal and "all warnings" are turned on with the -Wall parameter |
| Cygwin | Only when the format argument of scanf is a literal and "all warnings" are turned on with the -Wall parameter |
| Borland/Embarcadero C/C++ | Never, not even when compiling with the -w parameter |

gcc c:\main.c -std=c99 -o c:\main.exe

In this case scanf should not insist on processing the specifier.

An example exploit

    #include <stdio.h>
    #include <stdbool.h>

    typedef volatile unsigned char uint8_t;

    int main()
    {
        printf("Enter the number from range 0-255: ");

        bool allowAccess = false;
        uint8_t userNumber;
        char format[] = "%hhu";   /* format kept in a variable, not a literal */

        scanf(format, &userNumber);

        if (allowAccess)
        {
            printf("Access granted: very secret thing\n");
        }
        printf("Entered number is: %d\n", userNumber);

        return 0;
    }

As we can see, the allowAccess variable is set to false, and its value is never explicitly changed to true anywhere. The secret data should therefore never be displayed (and it is not, as long as we politely enter a value from the specified range). But what happens when we "overflow" the userNumber variable so that allowAccess is overwritten with a non-zero number (a variable evaluates as true when it holds any value other than zero)? Let's try to break into the program and obtain the secret data. To do this, run the program and, when asked, enter any number that does not fit in a single byte, e.g. 256. The program produces the following result:

    Enter the number from range 0-255: 256
    Access granted: very secret thing
    Entered number is: 0

By suitably manipulating the input data, we gained access to secret data that we should never have been able to reach.

Other functions of the scanf() family: sscanf()

sscanf differs from scanf only in that it reads data from a text buffer rather than from standard input (the keyboard). Both functions go through the vsscanf function, so sscanf is vulnerable too. Let's check:

    #include <stdio.h>
    #include <stdbool.h>

    typedef volatile unsigned char uint8_t;

    int main()
    {
        bool allowAccess = false;
        uint8_t userNumber;
        char format[] = "%hhu";
        char buffer[] = "257\n";   /* input that does not fit in one byte */

        sscanf(buffer, format, &userNumber);

        if (allowAccess)
        {
            printf("Access granted: very secret thing\n");
        }
        printf("Entered number is: %d\n", userNumber);

        return 0;
    }

Another example of exploit: Null character overwriting

UPDATED 11.02.2017

Another interesting application of our error is the ability to overwrite the zero byte that marks the end of a character string. Assume that we have an application that asks for a PIN (in the range 0-255; relax, I know nobody uses PINs that short, but it is just an example, right?) and then asks for a password (which we do not know). By overflowing the single-byte variable that holds the PIN, we can corrupt the string stored in memory right next to it (the one declared on the line above our variable) so that the program prints the master password against which it compares the password entered by the user. Let's look:

    #include <stdio.h>
    #include <stdbool.h>
    #include <string.h>

    #define PASSWORD_MAX_LENGTH 64

    typedef volatile unsigned char uint8_t;

    /* Dumps 10 bytes before and 10 bytes after the given address. */
    void debugPrintMemoryNearVariable(uint8_t* pointer)
    {
        for (long long int i=((long long int)pointer)-10; i<((long long int)pointer)+10; i++)
        {
            printf("%x (%c)\t", *((uint8_t*)i),*((uint8_t*)i) );
        }
        printf("\n");
    }

    int main()
    {
        char strProvidePin[] = "Please provide PIN number [0-255]: ";
        char strProvidePass[] = "Please provide password: ";
        char passPhraseToCompare[] = "very secret password to compare which can't leak\n";
        char lang[1] = "E";
        uint8_t userPIN;
        char userPass[PASSWORD_MAX_LENGTH+1];
        char formatPIN[] = "%hhu";

        /*
        printf("Memory before:\n");
        debugPrintMemoryNearVariable(&userPIN);
        */

        printf("%s", strProvidePin);
        scanf(formatPIN, &userPIN);

        /*
        printf("Memory after:\n");
        debugPrintMemoryNearVariable(&userPIN);
        */

        printf("%s", strProvidePass);
        getchar();
        fgets (userPass, PASSWORD_MAX_LENGTH, stdin);

        printf("\n\nSelected language: %s\n", lang);

        // Password comparing
        if (strncmp(userPass, passPhraseToCompare, PASSWORD_MAX_LENGTH)==0)
        {
            printf("\nYou're logged in.\n");
            // authorized operations
        }else{
            printf("\nWrong password.\n");
        }

        return 0;
    }

When you run the program, enter the PIN 1094795520. It turns out that in the place where the selected language should be displayed we get the whole secret password instead (minus its first two letters), the one the program was supposed to compare with the password entered by the user. This happens because the terminating null character of the lang variable (a byte with the value 0) was overwritten: the function that prints the text did not see the end of the string (because it could not) and kept printing characters until the next null character (that of the next string in memory), which in this case belongs to the secret master password.

There is one interesting fact related to this code. While MinGW really does overwrite all four bytes, Embarcadero/Borland C (version 7.20, the latest at the time of writing) overwrites only two bytes (fortunately for the bug presented here, that still includes the terminating null character of the lang string, so the example above works).
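Where does a number like 1094795520 come from? A minimal sketch of the arithmetic, assuming a little-endian machine (such as x86) and a 4-byte store: the attacker picks the four bytes to be written, lowest address first, and packs them into one number. The helper name value_for_bytes is hypothetical and exists only for this illustration.

```c
#include <stdint.h>

/* Packs four bytes, lowest address first, into the decimal value that would
   have to be typed in (little-endian layout assumed). */
uint32_t value_for_bytes(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0
         | ((uint32_t)b1 << 8)
         | ((uint32_t)b2 << 16)
         | ((uint32_t)b3 << 24);
}
```

value_for_bytes(0x00, 'A', 'A', 'A') gives 1094795520 (0x41414100): the PIN byte itself becomes 0 and the three bytes behind it become the letter A (0x41), so none of them is a null character any more.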

Vulnerabilities in various environments

Vulnerable environments

| Platform | Environment name | Vulnerable |
|---|---|---|
| Windows, 32-bit | MinGW GCC 4.4.1 (32-bit) | Yes |
| Windows, 32-bit | MinGW GCC 5.3.0-3 (32-bit) [the newest on 19.02.2017] | Yes |
| Windows, 32-bit | Cygwin GCC 4.4.1 (32-bit) [the newest on 19.02.2017] | Yes |
| Windows, 32-bit | Embarcadero C++ 7.20 for Win32 / bcc32c version 3.3.1 | Yes |
| Windows, 32-bit | Visual Studio 2015 (14.0) 32-bit | No |
| Windows, 32-bit | Visual Studio 2005 (8.0) 32-bit | No |
| Windows, 64-bit | MinGW 6.3.0 for i686 [the newest on 19.02.2017] | Yes |
| Windows, 64-bit | MinGW 6.3.0 for x86_64 (posix-seh) [the newest on 19.02.2017] | Yes |
| Windows, 64-bit | Cygwin64 GCC 5.3.0 (64-bit) | Yes |
| Windows, 64-bit | Visual Studio 2015 (14.0) 64-bit | No |
| Unix | RedHat GCC 4.8.5 (on Linux) | No |
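If your environment is not in the list, a small probe along the following lines may help. It is only a sketch under stated assumptions: the function name hhu_store_is_one_byte and the canary layout are my own invention for this illustration, and the probe relies on the too-wide store landing in the guard bytes placed directly behind the target.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Returns true when sscanf stored exactly one byte for "%hhu". */
bool hhu_store_is_one_byte(void)
{
    /* The canary bytes sit directly behind the target inside one object, so a
       too-wide store lands in them rather than in unrelated memory. */
    struct {
        unsigned char value;
        unsigned char canary[3];
    } probe;
    memset(&probe, 0xAA, sizeof probe);

    char format[] = "%hhu";   /* non-literal format, as in the examples above */
    if (sscanf("255", format, &probe.value) != 1) {
        return false;
    }
    /* 255 fits in one byte, so a conforming runtime leaves the canary alone. */
    return probe.value == 255
        && probe.canary[0] == 0xAA
        && probe.canary[1] == 0xAA
        && probe.canary[2] == 0xAA;
}
```

A runtime that honors "hh" should return true here. One that writes two or four bytes overwrites at least the first canary byte, so the function returns false.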

Why is Visual Studio not vulnerable?

At first I thought they had found this problem and fixed it (when you try to use the scanf functions you get a warning telling you to use scanf_s, which, however, is not part of the C99 standard). The reality is different, though. Visual Studio already uses newer run-time libraries (derived from MSVCRT) that implement a much larger part of the C99 standard, including the "h" specifier. That is why the exploit does not work there.


    wchar_t buffer[10]; // buffer size is 10, width specification is 9
    swscanf_s(input_string, L"%9s", buffer, (unsigned)_countof(buffer));

Versions of Visual Studio before 4.0 and from 7.0 to 13.0 use a differently named DLL for each version (MSVCR20.DLL, MSVCR70.DLL, MSVCR71.DLL, MSVCR80.DLL [VS 2005], MSVCR90.DLL [VS 2008], MSVCR100.DLL [VS 2010], MSVCP110.DLL, etc.). With version 14.0 of Visual Studio (2015), the library was moved to a new DLL file named UCRTBASE.DLL (but programs are required to link to the desired version of the library named "VCRUNTIME140.DLL", with the number changing in step with future versions; a veritable hell on Earth).

It is the software installers that should make sure the appropriate version of MSVCRT is present in the system (the packages named "Visual C++ Redistributable Package" that are installed together with some programs serve exactly that purpose). Windows itself comes with one version of this library installed by default.



Gynvael Coldwind accurately noted:

"hh" does not appear anywhere in the specification [of Microsoft Visual Studio C/C++ - note added by Lukas], so you cannot expect it to work. (...) Something else is interesting: it works correctly on newer versions of Microsoft's C library. And that is interesting because, according to the documentation, "hh" is not supported: https://msdn.microsoft.com/en-us/library/tcxf1dw6(v=vs.140).aspx

"The hh, j, z, and t length prefixes are not supported."

And yet, whatever one might say, %hhu works properly ^_-

What to do, how to live?

The solution of the problem for MinGW

The best solution would of course be to link against newer libraries. However, after a few hours of working on the problem I have not found a way to force MinGW to use a different MSVCRT library than the one it links to by default (maybe one of you will succeed). UPDATE 07.03.2017: it turns out that GCC/G++ in the MinGW environment (unfortunately this does not work on Cygwin or with Embarcadero C) has a special build parameter, -D__USE_MINGW_ANSI_STDIO, which replaces the default, old MSVCRT implementation with a different, newer one.

So, to compile a program without the vulnerability, you can use the following command:

gcc main.c -o main.exe -D__USE_MINGW_ANSI_STDIO

gcc main.c -o main.exe -Wall -Wextra -Wformat=2 -D__USE_MINGW_ANSI_STDIO

Additional warnings for safety

UPDATE 05.03.2017

(Thanks to Andrew Pinski from the GNU GCC Project): You can force the compiler to warn that it cannot check the format parameter (which is a pointer to a character buffer) in the code quoted above. There is a build parameter for this, -Wformat-nonliteral, but it is best to use the parameter that enables a whole group of safety warnings, namely -Wformat=2. So we can compile using this syntax:

gcc main.c -o main.exe -Wall -Wextra -Wformat=2

gcc main.c -o main.exe -Wall -Wextra -Wformat=2 -D__USE_MINGW_ANSI_STDIO

A possible workaround

There is also an obvious workaround: even if we want to read a value from the range 0-255, which would fit in our 1-byte unsigned char variable, we should read it into a variable of type int (and not of type char). Otherwise, as you can see, our application will be vulnerable to the integer overflow error.
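The workaround could be sketched as follows. This is only one possible shape, not a definitive implementation: the function name parse_byte is hypothetical, and the text is assumed to be a null-terminated string with any trailing newline already stripped. The number is read into a full-width unsigned int with plain "%u" (which every runtime discussed here supports), checked against the range, and only then narrowed to one byte.

```c
#include <stdbool.h>
#include <stdio.h>

/* Parses text as a number in the range 0-255.
   Returns true and writes *out on success, false otherwise. */
bool parse_byte(const char *text, unsigned char *out)
{
    unsigned int wide = 0;   /* full-width target, safe for plain "%u" */
    int consumed = 0;

    /* "%3u" reads at most three characters, so the value always fits in
       an unsigned int; "%n" records how much of the text was consumed. */
    if (sscanf(text, "%3u%n", &wide, &consumed) != 1) {
        return false;
    }
    if (text[consumed] != '\0' || wide > 255u) {
        return false;        /* trailing characters or out of range */
    }
    *out = (unsigned char)wide;
    return true;
}
```

With this approach "255" is accepted, while "256" and "1094795520" are rejected before anything is written to the 1-byte variable.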

Possible errors in applications

I ran a grep for %hhu over the directory with programs on my disk and it showed some 400 files, most of which were either linked with the newer msvc*.dll libraries or did not use scanf at all (only printf). However, a few that use msvcrt.dll did turn up, so it is probably worth taking a look (I encourage you to do so).

Conclusion

This is certainly a risk, and something worth paying attention to when reviewing application code. Whether something counts as an "error" or a "security error" in such cases (bugs in library APIs) always comes down to the specific case of a particular application in which the API was misused.

An expert is a person who has made all possible mistakes in their own field.

