Description
The size check in `_Py_DecodeUTF8Ex` can be improved so that it always compares the input size against a constant value, with no arithmetic performed on the size itself. This is already done elsewhere in the same file.
I was curious whether this could actually be triggered with a proof of concept that overflows the check and eventually performs an out-of-bounds heap access. In fact, with a very artificial setup, it is possible on a 32-bit system that tries to convert a 2 GB string:
```c
#include "Python.h"
#include <sys/mman.h>
#include <err.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char *argv[])
{
    char *str;
    size_t wlen;
    wchar_t *program;

    // force UTF-8 mode
    Py_UTF8Mode = 1;

    if ((program = Py_DecodeLocale(argv[0], NULL)) == NULL)
        errx(1, "Py_DecodeLocale");
    Py_SetProgramName(program);
    Py_Initialize();

    // try to convert a 2 GB long string
    if ((str = mmap(NULL, (size_t)INT_MAX + 1, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)) == MAP_FAILED)
        err(1, "mmap");
    memset(str, 'a', INT_MAX);
    str[INT_MAX] = '\0';

    Py_DecodeLocale(str, &wlen);

    PyMem_RawFree(program);
    return 0;
}
```
I doubt that this is reachable from real-world code. Still, it is a good demonstration that arithmetic on the input size is left over in the if-check. Let's remove it and save ourselves this possible headache.
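
The difference between the two styles of check can be sketched as follows. This is a minimal illustration and not the actual CPython code: `SSIZE32_MAX`, `WCHAR_SIZE`, `wrapping_add_one`, `fits_with_arithmetic` and `fits_constant` are hypothetical names, a 32-bit signed size and a 4-byte `wchar_t` are assumed, and the wraparound of `size + 1` is simulated explicitly because signed overflow is undefined behavior in C.

```c
#include <stdint.h>

/* Hypothetical stand-ins: maximum of a 32-bit signed size type and an
 * assumed 4-byte wchar_t. */
#define SSIZE32_MAX INT32_MAX
#define WCHAR_SIZE  4

/* Simulates what "size + 1" typically produces on a 32-bit target when
 * size is already at its maximum (real signed overflow is undefined). */
static int32_t
wrapping_add_one(int32_t v)
{
    return v == INT32_MAX ? INT32_MIN : v + 1;
}

/* Check that performs arithmetic on the untrusted size.
 * Returns 1 if the size is accepted, 0 if it is rejected. */
static int
fits_with_arithmetic(int32_t size)
{
    int32_t needed = wrapping_add_one(size);   /* may wrap to a negative */
    return !(SSIZE32_MAX / WCHAR_SIZE < needed);
}

/* Check that compares the untrusted size against a constant only.
 * The "+ 1" for the terminator is folded into the constant side. */
static int
fits_constant(int32_t size)
{
    return !(SSIZE32_MAX / WCHAR_SIZE - 1 < size);
}
```

Under these assumptions both functions agree for ordinary sizes, but for `size == INT32_MAX` the arithmetic variant accepts the input (the incremented value wraps to a negative number and slips past the comparison), while the constant variant rejects it.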
PS: Not sure if this is the correct way to create python issues with GitHub now. Let me know if something's missing or wrong!