Writing a categorical Series to HDF5 with `format='table'` and reading it back seems to fail when the values contain both a string with certain special (non-ASCII) characters AND the empty string. Either one on its own works:

    # -*- coding: latin-1 -*-
    import pandas
    import os

    examples = [
        pandas.Series(['EÉ, 17', '', 'a', 'b', 'c'], dtype='category'),
        pandas.Series(['EÉ, 17', 'a', 'b', 'c'], dtype='category'),
        pandas.Series(['', 'a', 'b', 'c'], dtype='category'),
        pandas.Series(['EE, 17', '', 'a', 'b', 'c'], dtype='category'),
        pandas.Series(['øü', 'a', 'b', 'c'], dtype='category'),
        pandas.Series(['Aøü', '', 'a', 'b', 'c'], dtype='category'),
        pandas.Series(['EÉ, 17', 'øü', 'a', 'b', 'c'], dtype='category')
    ]

    def test_hdf(s):
        f = 'testhdf.h5'
        if os.path.exists(f):
            os.remove(f)
        s.to_hdf(f, 'data', format='table')
        return pandas.read_hdf(f, 'data')

    for i, s in enumerate(examples):
        flag = True
        e = ''
        try:
            test_hdf(s)
        except Exception as ex:
            e = ex
            flag = False
        print('%d: %s\t%s\t%s' % (i, 'pass' if flag else 'fail', s.tolist(), e))

Results in:

    0: fail ['EÉ, 17', '', 'a', 'b', 'c'] Categorical categories must be unique
    1: pass ['EÉ, 17', 'a', 'b', 'c']
    2: pass ['', 'a', 'b', 'c']
    3: pass ['EE, 17', '', 'a', 'b', 'c']
    4: pass ['øü', 'a', 'b', 'c']
    5: fail ['Aøü', '', 'a', 'b', 'c'] Categorical categories must be unique
    6: pass ['EÉ, 17', 'øü', 'a', 'b', 'c']

Not sure if I am using this incorrectly or if this is actually a corner case.
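
For reference, the error text matches the message pandas raises when a `Categorical` is constructed with duplicate categories. The sketch below reproduces that message without any HDF5 involved. It is only an illustration: the helper name `duplicate_categories_error` is hypothetical, and it is an assumption (not confirmed by the output above) that the categories read back from the file end up containing a duplicate.

```python
import pandas


def duplicate_categories_error(categories):
    """Return the error message pandas raises for these categories, or None.

    Hypothetical helper for illustration only; it does not touch HDF5.
    """
    try:
        # 'a' is present in every category list used below, so the only
        # thing that can fail here is the validation of the categories.
        pandas.Categorical(['a'], categories=categories)
    except ValueError as ex:
        return str(ex)
    return None


# Unique categories are accepted.
print(duplicate_categories_error(['', 'a', 'b', 'c']))

# A repeated category (here the empty string twice) is rejected.
print(duplicate_categories_error(['', '', 'a', 'b', 'c']))
```

If the assumption holds, comparing the categories stored in the file with `s.cat.categories` from the original Series should show where the duplicate comes from.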