I want to create binary values for words based on their content of vowels and consonants, where vowels receive a value of '0' and consonants get a value of '1'.
For example, 'haha' would be represented as 1010, hahaha as 101010.
    common_words = ['haha', 'hahaha', 'aardvark', etc...]
    dictify = {}
    binary_value = []

    #doesn't work
    for word in common_words:
        for x in word:
            if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
                binary_value.append(0)
                dictify[word]=binary_value
            else:
                binary_value.append(1)
                dictify[word]=binary_value

With this I am getting too many binary digits in the resulting dictionary:

    >>> dictify
    {'aardvark': [0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1,...}

Desired output:

    >>> dictify
    {'haha': 1010, 'hahaha': 101010, 'aardvark': 00111011}

I am thinking of a solution that doesn't involve a loop within a loop...
- Where does each or number_value come from? (user2357112, Feb 17, 2014 at 2:33)
- There is no solution that doesn't use two loops. (placeybordeaux, Feb 17, 2014 at 2:36)
- dictify = {w:"".join('0' if c in 'aeiouAEIOU' else '1' for c in w) for w in common_words} (mshsayem, Feb 17, 2014 at 2:40)
- Your desired output isn't really possible: 00111011 won't work as an integer because there's no way to preserve the initial zeroes. You could use a string or a list. (DSM, Feb 17, 2014 at 2:41)
- Please post your actual code. The code you posted can't work: each and binary_value are never set. (Chris Johnson, Feb 17, 2014 at 2:46)
3 Answers
The code you've posted doesn't work because all words share the same binary_value list. (It also doesn't work because number_value and each are never defined, but we'll pretend those variables said binary_value and word instead.) Define a new list for each word:
    for word in common_words:
        binary_value = []
        for x in word:
            if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
                binary_value.append(0)
                dictify[word]=binary_value
            else:
                binary_value.append(1)
                dictify[word]=binary_value

If you want the output to look like 00111011 rather than a list, you'll need to make a string. (You could make an int, but then it would look like 59 instead of 00111011. Python doesn't distinguish "this int is base 2" or "this int has 2 leading zeros".)

    for word in common_words:
        binary_value = []
        for x in word:
            if x.lower() in 'aeiou':
                binary_value.append('0')
            else:
                binary_value.append('1')
        dictify[word] = ''.join(binary_value)
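Both points in this answer can be checked with a short, self-contained sketch. The names `shared`, `aliased`, `bits` and `as_int` are made up for illustration and do not come from the question; the snippet assumes Python 3.

```python
# Sketch only: shared, aliased, bits and as_int are hypothetical names.

# 1. One list stored under several keys is still a single object,
#    so every append shows up under every key.
shared = []
aliased = {}
for word in ['ha', 'ah']:
    for ch in word:
        shared.append(0 if ch in 'aeiou' else 1)
    aliased[word] = shared  # each key refers to the same list

print(aliased['ha'] is aliased['ah'])  # True
print(aliased['ha'])                   # [1, 0, 0, 1]

# 2. An int cannot remember leading zeros; a string can.
bits = '00111011'
as_int = int(bits, 2)
print(as_int)                 # 59
print(format(as_int, '08b'))  # 00111011
```

If an int is stored, the width has to be supplied again when formatting (8 here), which is one reason a string is the simpler choice for this dictionary.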
user2357112 explains your code. Here is just another way:
    >>> common_words = ['haha', 'hahaha', 'aardvark']
    >>> def binfy(w):
    ...     return "".join('0' if c in 'aeiouAEIOU' else '1' for c in w)
    ...
    >>> dictify = {w:binfy(w) for w in common_words}
    >>> dictify
    {'aardvark': '00111011', 'haha': '1010', 'hahaha': '101010'}
This seems like a job for translation tables. Assuming your input strings are all ASCII (which seems likely; otherwise the definition of exactly what is a vowel gets fuzzy), you can define a translation table this way*:
    # For simplicity's sake, I'm only using lowercase letters
    from string import lowercase, maketrans
    tt = maketrans(lowercase, '01110111011111011111011111')

With the above table, the problem becomes trivial:

    >>> 'haha'.translate(tt)
    '1010'
    >>> 'hahaha'.translate(tt)
    '101010'
    >>> 'aardvark'.translate(tt)
    '00111011'

Given this solution, you can build dictify very simply with a comprehension:
    dictify = {word:word.translate(tt) for word in common_words}  # python 2.7
    dictify = dict((word, word.translate(tt)) for word in common_words)  # python 2.6 and earlier

*This can also be done with Python 3; one option is to use bytes instead of strings:
    from string import ascii_lowercase
    tt = b''.maketrans(bytes(ascii_lowercase, 'ascii'), b'01110111011111011111011111')
    b'haha'.translate(tt)
    ...
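For text rather than bytes, Python 3 also provides `str.maketrans`, so the same table can be applied to ordinary strings. The following is a minimal sketch under the same lowercase-ASCII assumption as the answer; the name `flags` is hypothetical, and the table is generated rather than typed by hand, which is a small departure from the original.

```python
from string import ascii_lowercase

# Assumption: input is lowercase ASCII only, as in the answer above.
# Build the 26-character table: '0' for vowels, '1' for everything else.
flags = ''.join('0' if ch in 'aeiou' else '1' for ch in ascii_lowercase)
tt = str.maketrans(ascii_lowercase, flags)

common_words = ['haha', 'hahaha', 'aardvark']
dictify = {word: word.translate(tt) for word in common_words}
print(dictify)
# {'haha': '1010', 'hahaha': '101010', 'aardvark': '00111011'}
```

Characters that are not in the table (uppercase letters, digits, punctuation) pass through `translate` unchanged, so mixed input would need `word.lower()` or a larger table.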
