19

I have searched online many times and have not been able to find a way to convert my binary string variable, X

X = "1000100100010110001101000001101010110011001010100"

into a UTF-8 string value.

I have found that some people are using methods such as

b'message'.decode('utf-8')

however, this method has not worked for me: 'b' is reported as nonexistent, and I am not sure how to replace 'message' with a variable. Moreover, I have not been able to understand how this method works. Is there a better alternative?

So how could I convert a binary string into a text string?

EDIT: I also do not mind ASCII decoding

CLARIFICATION: Here is specifically what I would like to happen.

def binaryToText(z):
    # Some code to convert binary to text
    return (something here);

X="0110100001101001"
print binaryToText(X)

This would then yield the string...

hi
asked Nov 11, 2016 at 22:41 by Dan
  • Since ASCII is effectively a subset of UTF-8, you'll find that your string X is already a UTF-8 string. What is your expected output? Commented Nov 11, 2016 at 22:43
  • +mhawke I am looking for a returned value of a UTF-8 string. The binary is initially a string, and I want to be able to convert that binary into a UTF-8 string. Please ask me if you need more clarification! Commented Nov 11, 2016 at 22:46
  • Are you using Python 2 or 3? Why did you tag BOTH? In Python 3, strings are Unicode by default. Commented Nov 11, 2016 at 22:48
  • +juanpa.arrivillaga I have the flexibility to use both, dependent upon which option is best for me to use. I can accept solutions for both versions. Commented Nov 11, 2016 at 22:50
  • Well, if you use Python 3, all strings are unicode, so that seems to be the most straightforward solution... Commented Nov 11, 2016 at 22:57

6 Answers

17

It looks like you are trying to decode ASCII characters from a binary string representation (bit string) of each character.

You can take each block of eight characters (a byte), convert that to an integer, and then convert that to a character with chr():

>>> X = "0110100001101001"
>>> print(chr(int(X[:8], 2)))
h
>>> print(chr(int(X[8:], 2)))
i

Assuming that the values encoded in the string are ASCII this will give you the characters. You can generalise it like this:

def decode_binary_string(s):
    return ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))

>>> decode_binary_string(X)
'hi'

If you want to keep it in the original encoding you don't need to decode any further. Usually you would convert the incoming string into a Python unicode string and that can be done like this (Python 2):

def decode_binary_string(s, encoding='UTF-8'):
    byte_string = ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))
    return byte_string.decode(encoding)
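In Python 3, `str` has no `.decode()` method, so the Python 2 function above raises `AttributeError` there. A minimal Python 3 sketch of the same idea builds a `bytes` object first and decodes that. The name `decode_binary_string_py3` is hypothetical, and the sketch assumes the input length is a multiple of 8:

```python
def decode_binary_string_py3(s, encoding="utf-8"):
    """Decode a string of '0'/'1' characters, 8 bits per byte (Python 3)."""
    if len(s) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    # Parse each 8-character block as a base-2 integer in the range 0..255.
    byte_values = [int(s[i:i + 8], 2) for i in range(0, len(s), 8)]
    # bytes() accepts an iterable of ints; decode() then yields a str.
    return bytes(byte_values).decode(encoding)


print(decode_binary_string_py3("0110100001101001"))  # hi
```

Because the bytes are decoded as a whole, multi-byte UTF-8 sequences are handled as well, which the per-character `chr()` approach does not do on its own.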
answered Nov 12, 2016 at 2:33 by mhawke

3 Comments

Could you also add the reverse code? For converting string to binary. That would be great :)
@Dan: ''.join([bin(ord(c))[2:].rjust(8,'0') for c in 'hi'])
I'm way, way late to this solution but I'm curious. When I run the last of the code snippets above I get 'str' object has no attribute 'decode'. I bring this up because this solution appears perfect for what I need but the encoding (or rather decoding) part doesn't seem to work.
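The `ord()`-based one-liner in the comment above works for ASCII text, but for other characters `ord()` returns code points that do not fit in 8 bits. A possible Python 3 sketch of the reverse direction encodes the text to bytes first and formats each byte; the name `text_to_bits` is hypothetical:

```python
def text_to_bits(text, encoding="utf-8"):
    """Return the encoded bytes of `text` as a string of '0'/'1' characters."""
    # format(b, "08b") zero-pads each byte value to exactly 8 binary digits.
    return "".join(format(b, "08b") for b in text.encode(encoding))


print(text_to_bits("hi"))  # 0110100001101001
```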
6

To convert bits given as a "01"-string (binary digits) into the corresponding text in Python 3:

>>> bits = "0110100001101001"
>>> n = int(bits, 2)
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()
'hi'

For a Python 2/3 solution, see Convert binary to ASCII and vice versa.
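One caveat with the integer approach: leading zero bytes do not contribute to `n.bit_length()`, so a bit string that starts with `00000000` would lose its first byte. A sketch that avoids this by taking the byte count from the length of the string instead (the name `bits_to_text` is hypothetical, and the input length is assumed to be a non-zero multiple of 8):

```python
def bits_to_text(bits, encoding="utf-8"):
    """Decode a '0'/'1' string via one big integer, keeping leading zero bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    # Derive the byte count from the string length, not from bit_length().
    n_bytes = len(bits) // 8
    return int(bits, 2).to_bytes(n_bytes, "big").decode(encoding)


print(repr(bits_to_text("0000000001101001")))  # '\x00i'
```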

answered Nov 12, 2016 at 18:20 by jfs


1

In Python 2, an ascii-encoded (byte) string is also a utf8-encoded (byte) string. In Python 3, a (unicode) string must be encoded to utf8-encoded bytes. The decoding example was going the wrong way.

>>> X = "1000100100010110001101000001101010110011001010100"
>>> X.encode()
b'1000100100010110001101000001101010110011001010100'

Strings containing only the digits '0' and '1' are a special case and the same rules apply.

answered Nov 11, 2016 at 22:57 by Terry Jan Reedy

1 Comment

So how could I decode X? X.decode() does not seem to work.
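As a short illustration of the direction of each call in Python 3: `encode()` goes from `str` to `bytes`, and `decode()` goes from `bytes` back to `str`, which is why `X.decode()` fails on a `str`. Note that this only changes the representation of the digit characters themselves; it does not interpret them as bits:

```python
X = "0110100001101001"

# str -> bytes: each '0'/'1' character becomes one ASCII byte.
encoded = X.encode("utf-8")
print(type(encoded).__name__, len(encoded))  # bytes 16

# bytes -> str: decode() exists on bytes, not on str (Python 3).
print(encoded.decode("utf-8") == X)  # True
print(hasattr(X, "decode"))  # False
```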
0

Provide the optional base argument to int to convert:

>>> x = "1000100100010110001101000001101010110011001010100"
>>> int(x, 2)
301456912901716
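Converting the whole string to one integer does not yet give text. The string from the question has 49 digits, which is not a multiple of 8, so byte-wise decoding does not apply to it directly. Assuming (this is not stated in the question) that it holds 7-bit ASCII codes packed without padding, splitting it into 7-character groups produces readable text. The helper name `decode_fixed_width` is hypothetical:

```python
def decode_fixed_width(bits, width=7):
    """Decode a '0'/'1' string made of fixed-width character codes."""
    if len(bits) % width != 0:
        raise ValueError("bit string length must be a multiple of width")
    # Parse each group of `width` digits as a base-2 code point.
    return "".join(
        chr(int(bits[i:i + width], 2)) for i in range(0, len(bits), width)
    )


X = "1000100100010110001101000001101010110011001010100"
print(decode_fixed_width(X))  # DEFAULT
```

With `width=8` the same function handles the `"0110100001101001"` example and returns `hi`.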
answered Nov 11, 2016 at 22:46 by souldeux


0
# Simple, not elegant; used for a CTF challenge, did the trick
# Input of binary, separated into bytes
binary = "01000011 01010100 01000110 01111011 01000010 01101001 01110100 01011111 01000110 01101100 01101001 01110000 01110000 01101001 01101110 01111101"
# Add each item to a list, splitting at spaces
binlist = binary.split(" ")
# List to hold characters
chrlist = []
# Loop to convert
for i in binlist:
    chrlist.append(chr(int(i,2)))
# Print the list as a joined string
print("".join(chrlist))
answered Sep 15, 2023 at 0:42 by ctmedic09

1 Comment

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
-1

Working code for Python 3:

Binstr = '00011001 00001000'
s = []
# Split at spaces, then parse each group as a base-2 integer
for i in Binstr.split(' '):
    s.append(chr(int(i, 2)))
print(''.join(s))
answered Mar 19, 2022 at 16:34 by user18513997

1 Comment

Code syntax is invalid
