Description
Bug report
When encoding a null-terminated string in `shift_jisx0213`, the null-terminator sometimes gets truncated. To add a null-terminator when encoding, I usually use `(string + "\0").encode(encoding)`, which works with most encodings. However, this doesn't seem to be the case here.
Instead, I'm using `string.encode(encoding) + "\0".encode(encoding)` as a workaround to create the correct result. However, this won't produce the correct result for `utf-16`, because the BOM would be included twice.
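One possible way to sidestep both problems is to encode the text and the terminator in two separate steps on a single incremental encoder. The following is only a sketch: `encode_with_null` is a hypothetical helper name, and it assumes that the codec's incremental encoder writes a BOM only on its first call, which is how CPython's `utf-16` encoder behaves as far as I can tell.

```python
import codecs


def encode_with_null(string: str, encoding: str) -> bytes:
    """Hypothetical helper: encode `string`, then append one encoded NUL."""
    encoder = codecs.getincrementalencoder(encoding)()
    # final=True flushes anything the codec is still holding back for the
    # text, so the text is fully encoded before the NUL is seen.
    body = encoder.encode(string, final=True)
    # Reusing the same encoder object: a BOM-emitting codec such as utf-16
    # has already written its BOM above (assumption, see lead-in).
    terminator = encoder.encode("\0", final=True)
    return body + terminator


if __name__ == "__main__":
    for enc in ("utf-8", "utf-16", "shift_jisx0213"):
        print(enc, encode_with_null("バルーンフルーツ", enc).hex())
```

Because the text and the terminator never pass through the codec in the same call, the result should not depend on how the codec treats a character that is followed by `"\0"`.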
The following sample script reproduces the problem:
```python
strings: list[str] = ["hello world", "バルーンフルーツ", "バルーンフィッシュ", "ライフアップキノコ"]
encoding = "shift_jisx0213"

for string in strings:
    encoded_direct_null = (string + "\0").encode(encoding)
    encoded_append_null = string.encode(encoding) + "\0".encode(encoding)

    print(repr(string))
    print(" - encoded_append_null (EXPECTED!):", encoded_append_null.hex())
    print(" - encoded_direct_null: ", encoded_direct_null.hex())
    print()
```
This generates the following results. As you can see, the two results are not the same: in the second and fourth examples, the null-terminator has been removed for some reason. I've tried this with `utf-8` and `shift_jis` as well, but these yield the correct results.
```
'hello world'
 - encoded_append_null (EXPECTED!): 68656c6c6f20776f726c6400
 - encoded_direct_null: 68656c6c6f20776f726c6400

'バルーンフルーツ'
 - encoded_append_null (EXPECTED!): 836f838b815b83938374838b815b836300
 - encoded_direct_null: 836f838b815b83938374838b815b8363

'バルーンフィッシュ'
 - encoded_append_null (EXPECTED!): 836f838b815b83938374834283628356838500
 - encoded_direct_null: 836f838b815b83938374834283628356838500

'ライフアップキノコ'
 - encoded_append_null (EXPECTED!): 838983438374834183628376834c836d835200
 - encoded_direct_null: 838983438374834183628376834c836d8352
```
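To make the comparison across encodings easier to repeat, the same check can be wrapped in a small function. This is a minimal sketch: `null_terminator_mismatch` is a name made up for illustration, and the check is only meaningful for encodings without a BOM, since a BOM would make the two byte strings differ for an unrelated reason.

```python
def null_terminator_mismatch(string: str, encoding: str) -> bool:
    """Return True if the two ways of appending a NUL give different bytes."""
    direct = (string + "\0").encode(encoding)
    appended = string.encode(encoding) + "\0".encode(encoding)
    return direct != appended


samples = ["hello world", "バルーンフルーツ", "バルーンフィッシュ", "ライフアップキノコ"]

if __name__ == "__main__":
    for encoding in ("utf-8", "shift_jis", "shift_jisx0213"):
        # Collect the sample strings whose encodings disagree.
        affected = [s for s in samples if null_terminator_mismatch(s, encoding)]
        print(f"{encoding}: {len(affected)} of {len(samples)} samples differ: {affected}")
```

The result for `shift_jisx0213` depends on the Python build being tested, so no expected output is given here.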
Your environment
- Python: Python 3.10.7
- OS: Windows 10 Home