Sunday, July 12, 2015

Damn You \xC2 ...

I was working on putting some code down in hex in a place it didn't belong. You can see in the image below a bit of the shellcode and the code used to place it there. Not tremendously exciting stuff. But in particular you can see I have my initial buffer overflow filler on line 150, followed by a reliable address on line 151, a bit of a sled on line 152, and lastly the shellcode payload on line 153. Notice that nowhere is there a \xC2 or \xC3.



There are plenty of \x41s to be sure, and several \x90s, but no \xC2 or \xC3. You might think that some got in there with the shellcode, but that isn't the case, and as you can clearly see there aren't any between the bytes that make up the NOP sled...


So where did they all come from? At first I thought maybe it was some DEP thing that I didn't know about (because there really is a LOT that I don't know about). But after experimenting a bit I realized that it happened with any byte higher than \x7F, the limit of the ASCII range.

And that made it an encoding problem. Specifically, line 163 encodes regular ASCII as UTF-8, which is fine, but at line 165, when we hit values outside the ASCII range, we start to have problems in Python 3. Python 2 never showed this because its str type is already a byte string, so "\x90" went out as the single byte \x90 with no encoding step at all. In Python 3 a str is Unicode text and has to be encoded before it is sent, and UTF-8 turns every code point from U+0080 to U+00FF into two bytes, the first of which is \xC2 or \xC3. Latin-1, by contrast, maps \x00-\xFF one-to-one onto single bytes. In any event, I had called out UTF-8 on line 165, and it was not up to the task.
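Here is a small sketch of the effect in Python 3, for anyone who wants to reproduce it without an exploit in hand. The variable names and the four-byte sled are stand-ins I made up for illustration, not the values from the screenshots.

```python
# Hypothetical stand-in for the sled: four NOP characters held in a str.
sled = "\x90" * 4

utf8_bytes = sled.encode("utf-8")
latin1_bytes = sled.encode("latin-1")

print(utf8_bytes)    # b'\xc2\x90\xc2\x90\xc2\x90\xc2\x90'
print(latin1_bytes)  # b'\x90\x90\x90\x90'

# Characters inside the ASCII range come out the same under either codec,
# which is why the \x41 filler never showed the problem.
filler = "A" * 4
assert filler.encode("utf-8") == filler.encode("latin-1") == b"AAAA"

# \xC2 is the UTF-8 lead byte for U+0080..U+00BF,
# \xC3 is the lead byte for U+00C0..U+00FF.
low_edge = "\xbf".encode("utf-8")
high_edge = "\xc0".encode("utf-8")
print(low_edge)   # b'\xc2\xbf'
print(high_edge)  # b'\xc3\x80'
```

Every character above \x7F doubles in size under UTF-8, so the payload both grows and picks up bytes that were never in the source.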

The story ends happily though. Changing line 165 to 'latin-1' results in the following...
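In code form the change looks roughly like the sketch below. The filler length, the address and the sled are placeholders I invented for the example; the real ones are in the screenshots. The second half shows an alternative I would assume works just as well, which is to build the payload as bytes from the start so no codec is involved at all.

```python
# Placeholder pieces, not the values from the post.
filler = "A" * 8
address = "\xef\xbe\xad\xde"   # made-up little-endian address
sled = "\x90" * 4
payload = filler + address + sled

# Latin-1 maps each character in \x00-\xFF to exactly one byte.
wire = payload.encode("latin-1")

assert len(wire) == len(payload)
assert wire == b"AAAAAAAA\xef\xbe\xad\xde\x90\x90\x90\x90"

# Alternative: bytes literals all the way through, no encode() call needed.
wire_direct = b"A" * 8 + b"\xef\xbe\xad\xde" + b"\x90" * 4
assert wire_direct == wire
```

Note that latin-1 only covers \x00-\xFF; a str containing anything above that would raise UnicodeEncodeError, which for raw shellcode is a useful failure rather than a silent corruption.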