By "actual code" I meant the assembly that the application logic compiles down to, not the entire executable. As for the entire package, compiling it with clang and a few flags gets it down to 19.5k without any effort. If I wanted to waste time on this, ripping out the CRT entirely and getting it to 16k would probably take less than an hour.