fune/third_party/python/attrs/docs/hashing.rst
Tom Prince 5658bb444e Bug 1458700: [python] Vendor attrs; r=dustin,ted
Differential Revision: https://phabricator.services.mozilla.com/D1124

--HG--
extra : source : 75ef3b64c1698ec53d1d289201ec9a3d51fea7b1
extra : histedit_source : 53d76ae733322098ca6b8f01fe4c2911c348ac3a
2018-05-02 19:39:38 -06:00


Hashing
=======

.. warning::

   The overarching theme is to never set the ``@attr.s(hash=X)`` parameter yourself.
   Leave it at ``None``, which means that ``attrs`` will do the right thing for you, depending on the other parameters:

   - If you want to make objects hashable by value: use ``@attr.s(frozen=True)``.
   - If you want hashing and comparison by object identity: use ``@attr.s(cmp=False)``.

   Setting ``hash`` yourself can have unexpected consequences, so we recommend tinkering with it only if you know exactly what you're doing.

Under certain circumstances, it's necessary for objects to be *hashable*.
For example, they have to be hashable if you want to put them into a :class:`set` or use them as keys in a :class:`dict`.

The *hash* of an object is an integer that represents the contents of an object.
It can be obtained by calling :func:`hash` on an object and is implemented by writing a ``__hash__`` method for your class.

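
As a minimal sketch of what this means in plain Python (no ``attrs`` involved), the hypothetical ``Point`` class below implements ``__eq__`` and ``__hash__`` by hand and is then used in a set and as a dict key.
The class name and its attributes are made up for illustration, and the example assumes that instances are never mutated after creation.

```python
class Point(object):
    """Hand-written value object; x and y are treated as read-only."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points build equal tuples, and equal tuples hash equally.
        return hash((self.x, self.y))


visited = {Point(1, 2)}
labels = {Point(1, 2): "start"}

print(isinstance(hash(Point(1, 2)), int))  # True
print(Point(1, 2) in visited)              # True
print(labels[Point(1, 2)])                 # start
```

The lookups succeed with freshly created ``Point(1, 2)`` objects because both equality and the hash are derived from the same attribute values.
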
``attrs`` will happily write a ``__hash__`` method for you [#fn1]_, but it will *not* do so by default.
The reason is that, according to the definition_ from the official Python docs, the returned hash has to fulfill certain constraints:

#. Two objects that are equal **must** have the same hash.
   This means that if ``x == y``, it *must* follow that ``hash(x) == hash(y)``.

   By default, Python classes are compared *and* hashed by their :func:`id`.
   That means that every instance of a class has a different hash, no matter what attributes it carries.

   It follows that the moment you (or ``attrs``) change the way equality is handled by implementing an ``__eq__`` that is based on attribute values, this constraint is broken.
   For that reason Python 3 will make a class that has customized equality unhashable.
   Python 2 on the other hand will happily let you shoot your foot off.
   Unfortunately ``attrs`` currently mimics Python 2's behavior for backward compatibility reasons if you set ``hash=False``.

   The *correct way* to achieve hashing by id is to set ``@attr.s(cmp=False)``.
   Setting ``@attr.s(hash=False)`` (that implies ``cmp=True``) is almost certainly a *bug*.

#. If two objects are not equal, their hash **should** be different.

   While this isn't a requirement from a standpoint of correctness, sets and dicts become less effective if there are a lot of identical hashes.
   The worst case is when all objects have the same hash, which effectively turns a set into a list because every lookup has to compare against all elements.

#. The hash of an object **must not** change.

   If you create a class with ``@attr.s(frozen=True)`` this is fulfilled by definition, therefore ``attrs`` will write a ``__hash__`` function for you automatically.
   You can also force it to write one with ``hash=True`` but then it's *your* responsibility to make sure that the object is not mutated.

   This point is the reason why mutable structures like lists, dictionaries, or sets aren't hashable while immutable ones like tuples or frozensets are:
   points 1 and 2 require that the hash changes with the contents but point 3 forbids it.

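
The three constraints can be observed with plain Python 3 classes.
The following is only a sketch: the class names are hypothetical, ``attrs`` is not used, and the point is the interpreter's behavior, not how ``attrs`` generates its methods.

```python
class Plain(object):
    """No custom equality: compared and hashed by identity."""

    def __init__(self, value):
        self.value = value


class EqOnly(Plain):
    """Value-based equality, but no matching __hash__."""

    def __eq__(self, other):
        if not isinstance(other, EqOnly):
            return NotImplemented
        return self.value == other.value


class EqAndHash(EqOnly):
    """Value-based equality plus a hash derived from the same value."""

    def __hash__(self):
        return hash(self.value)


class SameHash(EqOnly):
    """Legal but wasteful: every instance reports the same hash."""

    eq_calls = 0

    def __eq__(self, other):
        SameHash.eq_calls += 1
        return EqOnly.__eq__(self, other)

    def __hash__(self):
        return 42


# Constraint 1: Python 3 sets __hash__ to None once __eq__ is customized.
try:
    hash(EqOnly(1))
    eq_only_is_hashable = True
except TypeError:
    eq_only_is_hashable = False

# Identity semantics stay consistent: two instances are two set members.
identity_members = len({Plain(1), Plain(1)})

# Constraint 2: 50 distinct but colliding objects force pairwise comparisons.
crowded = {SameHash(i) for i in range(50)}

# Constraint 3: mutating a key after insertion strands the dict entry.
key = EqAndHash(1)
table = {key: "payload"}
key.value = 2
found_by_same_object = key in table
found_by_old_value = EqAndHash(1) in table

print(eq_only_is_hashable)                             # False
print(identity_members)                                # 2
print(len(crowded), SameHash.eq_calls >= 50 * 49 // 2) # 50 True
print(found_by_same_object, found_by_old_value)        # False False
```

After the mutation, the entry is still stored in ``table`` but can no longer be found, neither through the mutated object nor through a fresh object carrying the old value.
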
For a more thorough explanation of this topic, please refer to this blog post: `Python Hashes and Equality`_.

.. [#fn1] The hash is computed by hashing a tuple that consists of a unique id for the class plus all attribute values.
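
The idea described in the footnote can be sketched in a few lines of plain Python.
This is not the code that ``attrs`` generates: the helper ``value_hash`` is hypothetical, and using the module and qualified class name as the "unique id" is an assumption, since the text does not say which id is used.

```python
def value_hash(instance, attribute_names):
    """Hash a tuple made of a per-class id followed by the attribute values."""
    # Assumption: module plus qualified name stands in for the "unique id".
    class_id = (type(instance).__module__, type(instance).__qualname__)
    values = tuple(getattr(instance, name) for name in attribute_names)
    return hash((class_id,) + values)


class Celsius(object):
    def __init__(self, degrees):
        self.degrees = degrees


class Fahrenheit(object):
    def __init__(self, degrees):
        self.degrees = degrees


same_class_same_values = (
    value_hash(Celsius(20), ["degrees"]) == value_hash(Celsius(20), ["degrees"])
)
print(same_class_same_values)  # True
```

Because the class id is part of the hashed tuple, ``Celsius(20)`` and ``Fahrenheit(20)`` will very likely (though not provably) end up with different hashes even though their attribute values are identical.
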
.. _definition: https://docs.python.org/3/glossary.html#term-hashable
.. _`Python Hashes and Equality`: https://hynek.me/articles/hashes-and-equality/