Wednesday, April 7, 2010

What happens if you put both a str key and a unicode key in a dict in Python?


In [1]: {u"a": 1, "a": 2}
Out[1]: {u'a': 2}


Yes, only one entry remains. It is because:


In [2]: u"a" == "a"
Out[2]: True
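
The same rule can still be observed on Python 3 with any two keys of different types that compare equal and hash alike. A minimal sketch, using 1 and 1.0 as stand-in keys (they are chosen for illustration and are not part of the session above):

```python
# A dict treats two keys as the same key when their hashes match
# and they compare equal with ==; identity (is) is not what decides it.
int_key, float_key = 1, 1.0   # stand-in keys, chosen for illustration

assert int_key == float_key
assert hash(int_key) == hash(float_key)

d = {int_key: "first", float_key: "second"}

# Only one entry survives: the first key object is kept,
# and its value is overwritten by the later one.
print(d)                  # {1: 'second'}
print(type(list(d)[0]))   # <class 'int'>
```

This mirrors Out[1] above, where the key stays u'a' but the value becomes 2.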


Python 2.* considers u"a" and "a" EQUAL. Of course:


In [3]: u"a" is "a"
Out[3]: False


they are not IDENTICAL. If Guido had designed dicts to use IDENTITY to decide whether two keys are the SAME key, it would cause:


In [4]: (1, 2) is (1, 2)
Out[4]: False


that uncomfortable behavior. We don't want to distinguish EQUAL tuples. It is a difficult question when designing a new language. This behavior changed in Python 3.0: there, bytes (a byte sequence) and text (a sequence of Unicode characters) are not compatible, and there is no automatic conversion between them.


>>> b"a" == "a"
False
>>> {b"a": 1, "a": 2}
{b'a': 1, 'a': 2}
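
Because Python 3 does no implicit conversion, code that receives a mix of bytes and text keys has to convert explicitly if it wants them to land on the same entry. A minimal sketch, assuming the bytes keys are UTF-8 encoded; `to_text` is a hypothetical helper name, not a standard function:

```python
def to_text(key, encoding="utf-8"):
    """Return key as str, decoding it first if it is bytes (hypothetical helper)."""
    if isinstance(key, bytes):
        return key.decode(encoding)
    return key

pairs = [(b"a", 1), ("a", 2)]

# Without conversion the two keys stay distinct on Python 3.
raw = dict(pairs)

# With explicit decoding they become the same key, so the later value wins.
merged = {to_text(k): v for k, v in pairs}

print(raw)     # {b'a': 1, 'a': 2}
print(merged)  # {'a': 2}
```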