I have searched many times online and I have not been able to find a way to convert my binary string variable, X

    X = "1000100100010110001101000001101010110011001010100"

into a UTF-8 string value.
I have found that some people are using methods such as

    b'message'.decode('utf-8')

however, this method has not worked for me, as 'b' is said to be nonexistent, and I am not sure how to replace the 'message' with a variable. Moreover, I have not been able to understand how this method works. Is there a better alternative?
So how could I convert a binary string into a text string?
EDIT: I also do not mind ASCII decoding
CLARIFICATION: Here is specifically what I would like to happen.
    def binaryToText(z):
        # Some code to convert binary to text
        return (something here);

    X="0110100001101001"
    print binaryToText(X)

This would then yield the string...

    hi

- Since ASCII is effectively a subset of UTF-8 you'll find that your string X is already a UTF8 string. What is your expected output? (mhawke, Nov 11, 2016 at 22:43)
- +mhawke I am looking for a returned value of a UTF-8 string. The binary is initially a string, and I want to be able to convert that binary into a UTF-8 string. Please ask me if you need more clarification! (Dan, Nov 11, 2016 at 22:46)
- Are you using Python 2 or 3? Why did you tag BOTH? In Python 3, strings are utf by default. (juanpa.arrivillaga, Nov 11, 2016 at 22:48)
- +juanpa.arrivillaga I have the flexibility to use both, dependent upon which option is best for me to use. I can accept solutions for both versions. (Dan, Nov 11, 2016 at 22:50)
- Well, if you use Python 3, all strings are unicode, so that seems to be the most straightforward solution... (juanpa.arrivillaga, Nov 11, 2016 at 22:57)
6 Answers
It looks like you are trying to decode ASCII characters from a binary string representation (bit string) of each character.
You can take each block of eight characters (a byte), convert that to an integer, and then convert that to a character with chr():

    >>> X = "0110100001101001"
    >>> print(chr(int(X[:8], 2)))
    h
    >>> print(chr(int(X[8:], 2)))
    i

Assuming that the values encoded in the string are ASCII this will give you the characters. You can generalise it like this:

    def decode_binary_string(s):
        return ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))

    >>> decode_binary_string(X)
    'hi'

If you want to keep it in the original encoding you don't need to decode any further. Usually you would convert the incoming string into a Python unicode string and that can be done like this (Python 2):

    def decode_binary_string(s, encoding='UTF-8'):
        byte_string = ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))
        return byte_string.decode(encoding)
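In Python 3, chr() returns a str, which has no .decode() method, so the Python 2 function above would need to build a bytes object first. A minimal sketch of such a port, assuming the input length is a multiple of 8 (the name `decode_binary_string_py3` is made up for illustration and is not part of the original answer):

```python
def decode_binary_string_py3(s, encoding="utf-8"):
    """Decode a string of '0'/'1' digits, 8 digits per byte, into text."""
    if len(s) % 8 != 0:
        raise ValueError("expected a multiple of 8 binary digits")
    # Convert each 8-digit slice to an integer in 0..255 and pack into bytes.
    raw = bytes(int(s[i:i + 8], 2) for i in range(0, len(s), 8))
    # Decoding the bytes (not the individual characters) handles multi-byte UTF-8.
    return raw.decode(encoding)


print(decode_binary_string_py3("0110100001101001"))  # hi
```

Because the bytes are decoded as a whole, multi-byte UTF-8 sequences should come out as single characters rather than one character per byte.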
- ''.join([bin(ord(c))[2:].rjust(8,'0') for c in 'hi'])
- 'str' object has no attribute 'decode'. I bring this up because this solution appears perfect for what I need but the encoding (or rather decoding) part doesn't seem to work.

To convert bits given as a "01"-string (binary digits) into the corresponding text in Python 3:

    >>> bits = "0110100001101001"
    >>> n = int(bits, 2)
    >>> n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()
    'hi'

For a Python 2/3 solution, see Convert binary to ASCII and vice versa.
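One caveat when sizing the byte string from n.bit_length(): leading all-zero bytes (for example a NUL character at the start of the text) are dropped. A hedged sketch that sizes the result from the length of the digit string instead, paired with the reverse conversion. `text_to_bits` and `bits_to_text` are hypothetical names, and the input is assumed to be a non-empty multiple of 8 digits:

```python
def text_to_bits(text, encoding="utf-8"):
    """Encode text and render each byte as 8 binary digits."""
    return "".join(format(byte, "08b") for byte in text.encode(encoding))


def bits_to_text(bits, encoding="utf-8"):
    """Inverse of text_to_bits; assumes len(bits) is a non-empty multiple of 8."""
    n = int(bits, 2)
    # Size the result from the digit count so leading zero bytes survive.
    return n.to_bytes(len(bits) // 8, "big").decode(encoding)


print(bits_to_text("0110100001101001"))  # hi
print(text_to_bits("hi"))                # 0110100001101001
```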
In Python 2, an ascii-encoded (byte) string is also a utf8-encoded (byte) string. In Python 3, a (unicode) string must be encoded to utf8-encoded bytes. The decoding example was going the wrong way.

    >>> X = "1000100100010110001101000001101010110011001010100"
    >>> X.encode()
    b'1000100100010110001101000001101010110011001010100'

Strings containing only the digits '0' and '1' are a special case and the same rules apply.
Provide the optional base argument to int to convert:

    >>> x = "1000100100010110001101000001101010110011001010100"
    >>> int(x, 2)
    301456912901716
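The string x in this example has 49 digits, which is not a multiple of 8, so it cannot be split into whole bytes. The question does not say how it was produced; one guess is that it packs 7-bit ASCII codes (49 = 7 × 7). The sketch below, with a hypothetical `decode_fixed_width` helper, splits the digits into groups of a chosen width under that assumption:

```python
def decode_fixed_width(bits, width=8):
    """Split a '0'/'1' string into width-digit groups and map each to a character."""
    if len(bits) % width != 0:
        raise ValueError("length is not a multiple of the group width")
    # Each group is parsed as a base-2 integer and mapped to its code point.
    return "".join(chr(int(bits[i:i + width], 2)) for i in range(0, len(bits), width))


x = "1000100100010110001101000001101010110011001010100"
print(int(x, 2))                       # 301456912901716
print(decode_fixed_width(x, width=7))  # DEFAULT
```

With the default width of 8 the same helper handles the "0110100001101001" example from the question.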
    # Simple not elegant, used for a CTF challenge, did the trick
    # Input of Binary, Separated in Bytes
    binary = "01000011 01010100 01000110 01111011 01000010 01101001 01110100 01011111 01000110 01101100 01101001 01110000 01110000 01101001 01101110 01111101"
    # Add each item to a list at spaces
    binlist = binary.split(" ")
    # List to Hold Characters
    chrlist = []
    # Loop to convert
    for i in binlist:
        chrlist.append(chr(int(i,2)))
    # Print the list as a joined string
    print("".join(chrlist))
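The loop above can also be written more compactly. A minimal sketch, assuming whitespace-separated groups of exactly 8 binary digits and ASCII content; `decode_spaced_bytes` is a hypothetical name, not something from the answer:

```python
def decode_spaced_bytes(text, encoding="ascii"):
    """Decode whitespace-separated 8-digit binary groups into a string."""
    groups = text.split()  # no argument: tolerates repeated spaces and newlines
    for group in groups:
        # Reject anything that is not exactly eight '0'/'1' characters.
        if len(group) != 8 or set(group) - {"0", "1"}:
            raise ValueError("not an 8-digit binary group: %r" % group)
    return bytes(int(group, 2) for group in groups).decode(encoding)


print(decode_spaced_bytes("01101000 01101001"))  # hi
```

Validating each group up front gives a clearer error than letting int() or the decoder fail partway through.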
Working code for Python 3:

    Binstr = '00011001 00001000'
    s = []
    # Split on spaces, then convert each 8-bit group to its character
    for i in Binstr.split(' '):
        s.append(chr(int(i, 2)))
    print(''.join(s))