Basics

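Every Python object has three basic attributes: an ID, a type, and a value:

  • ID: the object identity, a unique identifier for the object that stays constant for its lifetime. In CPython it is the object's memory address. It is read-only and obtained with the id() function.

  • Type: the type of the object, obtained with the type() function. It is normally fixed once the object is created.

  • Value: the actual data the object stores. Mutable objects allow it to be changed; immutable objects do not.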
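As a small illustration of the three attributes, the sketch below mutates a list and checks which attributes change. The variable names are made up for this example.

```python
x = [1, 2, 3]
ident_before = id(x)      # identity of the object
type_before = type(x)     # type of the object

x.append(4)               # change the value in place

print(id(x) == ident_before)    # True: same object, identity unchanged
print(type(x) is type_before)   # True: type unchanged
print(x)                        # [1, 2, 3, 4]: only the value changed
```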

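Python is an object-oriented language, and it defines many special methods to support that model:
__new__ : the first method called when an object is instantiated; it creates and returns the new instance, after which __init__ is invoked on that instance
__init__ : the initializer (often loosely called the constructor), called to set up a newly created instance
__del__ : the finalizer (destructor), which defines what happens when the object is garbage-collected
__cmp__ : the general comparison method behind ==, <=, >= and the other comparison operators in Python 2; it was removed in Python 3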
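The order in which instantiation runs these methods can be observed directly. The following is a minimal sketch; the class name Tracked and the calls list are hypothetical and exist only to record what happens.

```python
calls = []  # records the order in which the special methods run

class Tracked:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # Create and return the bare instance; __init__ has not run yet.
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("__init__")
        self.value = value

t = Tracked(42)
print(calls)    # ['__new__', '__init__']
print(t.value)  # 42
```

Calling Tracked(42) runs __new__ first to create the instance and then passes that instance to __init__ for initialization.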

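Python also lets you define a separate method for each comparison operator (in Python 3 this is the only way):
__eq__ : defines the behavior of ==
__gt__ : defines the behavior of >
__lt__ : defines the behavior of <
…and so on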
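As a rough sketch of how these per-operator methods fit together, the hypothetical Version class below defines only __eq__ and __lt__ and lets functools.total_ordering from the standard library derive the remaining operators. This is one possible approach, not the only one.

```python
from functools import total_ordering

@total_ordering
class Version:
    """Hypothetical two-part version number, used only for illustration."""

    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

print(Version(1, 2) == Version(1, 2))   # True, via __eq__
print(Version(1, 2) < Version(1, 10))   # True, via __lt__
print(Version(2, 0) >= Version(1, 9))   # True, derived by total_ordering
```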

The is keyword

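is is also known as the identity operator: A is B returns True only when A and B are the same object. The comparison is based on the object's unique ID, so for two objects being compared, is returns True only when id(A) equals id(B).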
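A short sketch of this equivalence, using objects that are all alive at the same time (the names a, b and c are arbitrary):

```python
a = [1, 2]
b = a        # a second name for the same object
c = [1, 2]   # a different object with an equal value

print(a is b, id(a) == id(b))   # True True
print(a is c, id(a) == id(c))   # False False
print(a is not c)               # True
```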

The == operator

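== is the equality operator: A == B returns True when A and B have the same value. The result is determined by the __eq__ comparison method. If we define __eq__ to always return True, then == always returns True, even when the two objects being compared clearly hold different values.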

  >>> class Foo(object):
  ...     def __eq__(self, other):
  ...         return True
  ...
  >>> f = Foo()
  >>> f == None
  True
  >>> f is None
  False

The relationship between is and ==

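Once the fundamental difference between is and == is understood, their relationship is easy to follow. is tests whether two names refer to the same object. When is returns True, == almost always returns True as well, since an object normally compares equal to itself (exceptions are objects such as float('nan'), or classes whose __eq__ says otherwise). When is returns False, == may be either True or False, depending on the values. == tests whether two objects are equal in value, and its result says nothing about what is will return.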
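The relationship can be checked with a few lines. This sketch also shows NaN, a well-known case where an object is identical to itself yet does not compare equal to itself.

```python
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b, a is b)          # True False: equal value, different objects

nan = float("nan")
print(nan is nan, nan == nan)  # True False: same object, yet not equal

# Built-in containers check identity before equality when comparing
# elements, so a list holding that same NaN object equals itself.
print([nan] == [nan])          # True
```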

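Using is can sometimes give surprising results. For example, we might expect () is () to return False, since it looks as if two separate tuples are created, yet in CPython it actually returns True. The cause is Python's caching: for efficiency, CPython caches and reuses some objects that do not support in-place modification (such as small integers, some strings, and some tuples, including the empty tuple). Instances of user-defined classes and other mutable objects are never cached and reused this way; they are always created anew at runtime.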
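The sketch below separates what is reliable from what is an implementation detail. Results that depend on the interpreter are printed without an expected value. sys.intern is the standard-library function for explicitly interning strings.

```python
import sys

# Mutable objects are never shared: each display creates a new list.
a, b = [], []
print(a is b)    # False

# Build equal strings at runtime so the compiler cannot merge the literals.
s1 = "".join(["py", "thon"])
s2 = "".join(["py", "thon"])
print(s1 == s2)                           # True
print(sys.intern(s1) is sys.intern(s2))   # True: interning yields one shared object

# Implementation-dependent: whether these are the same object depends on
# the interpreter's caching, so no expected output is given.
e1, e2 = (), ()
print(e1 is e2)
print(s1 is s2)
```

Because such caching is an implementation detail, == is the safer choice whenever the question is about values and not about object identity.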

To close, here is Thomas's comment:

When you call id({}), Python creates a dict and passes it to the id function. The id function takes its id (its memory location), and throws away the dict. The dict is destroyed. When you do it twice in quick succession (without any other dicts being created in the mean time), the dict Python creates the second time happens to use the same block of memory as the first time. (CPython’s memory allocator makes that a lot more likely than it sounds.) Since (in CPython) id uses the memory location as the object id, the id of the two objects is the same. This obviously doesn’t happen if you assign the dict to a variable and then get its id(), because the dicts are alive at the same time, so their id has to be different.

Mutability does not directly come into play, but code objects caching tuples and strings do. In the same code object (function or class body or module body) the same literals (integers, strings and certain tuples) will be re-used. Mutable objects can never be re-used, they’re always created at runtime.

In short, an object’s id is only unique for the lifetime of the object. After the object is destroyed, or before it is created, something else can have the same id.
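The lifetime point in the quote can be sketched as follows. Only the first comparison is guaranteed. The second depends on CPython's memory allocator, so no expected output is given for it.

```python
# Both dicts are alive at the same time, so their ids must differ.
d1, d2 = {}, {}
print(id(d1) != id(d2))   # True

# Each temporary dict is destroyed right after id() returns, so the second
# one may reuse the same memory. The result is implementation-dependent.
print(id({}) == id({}))
```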

References:

  1. The difference between Python's is identity operator and == equality operator

  2. A Guide to Python's Magic Methods

  3. object identity and equivalence

  4. Is there a difference between == and is in Python?

  5. Is there any difference between “foo is None” and “foo == None”?

  6. Why does id({}) == id({}) and id([]) == id([]) in CPython?