Wednesday, April 7, 2010

What happens if you put both a str key and a unicode key in a dict in Python?

In [1]: {u"a": 1, "a": 2}
Out[1]: {u'a': 2}

Yes, that is because:

In [2]: u"a" == "a"
Out[2]: True

Python 2.* thinks u"a" and "a" are EQUAL (they also hash to the same value), so the dict treats them as one key. Of course:

In [3]: u"a" is "a"
Out[3]: False

they are not IDENTICAL. If Guido had designed dicts to use IDENTITY to check whether two keys are the SAME, it would cause:

In [4]: (1, 2) is (1, 2)
Out[4]: False

that uncomfortable behavior. We don't want to distinguish EQUAL tuples. This is a difficult question when designing a new language. The str/unicode behavior changed in Python 3.0. There, bytes (a byte sequence) and text (a unicode character sequence) are not compatible, and there is no automatic conversion between them.

>>> b"a" == "a"
False
>>> {b"a": 1, "a": 2}
{b'a': 1, 'a': 2}
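
Here is a small Python 3 sketch of the same rule, in case it helps: a dict matches keys by hash and ==, not by identity. The variable names (merged, separate, collided) are only for illustration, and the int/float pair is my stand-in for the old str/unicode pair, since Python 3 no longer has two text types that compare equal.

```python
# Python 3 sketch: dict keys are matched by hash and ==, not by identity.

# 1 and 1.0 are equal and hash alike, so they collapse into one key.
# The first key object (the int) is kept; only the value is overwritten.
merged = {1: "int", 1.0: "float"}

# bytes and str never compare equal, so they stay separate keys.
separate = {b"a": 1, "a": 2}

# An explicit decode turns the bytes key into text, and the keys collide again.
collided = {b"a".decode("ascii"): 1, "a": 2}

print(merged)    # {1: 'float'}
print(separate)  # {b'a': 1, 'a': 2}
print(collided)  # {'a': 2}
```

So the "two EQUAL keys become one" behavior still exists in Python 3. It just no longer applies to bytes and text unless you convert one of them yourself.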