A Deep Dive into Descriptors

mgkt4375, 8 years ago
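<p>A descriptor is a way to reuse the same access logic across multiple attributes; it can "hijack" operations that would otherwise go to self.__dict__. A descriptor is usually a class that defines at least one of the three methods __get__, __set__ and __delete__, and the effect feels like "delegating the operations of one class to another class". staticmethod, classmethod and property are all classes that implement the descriptor protocol.</p>
<p>Let's start with a simple descriptor example (adapted from my earlier talk "Python高级编程" (Advanced Python Programming); I recommend taking a look at those slides):</p>
<pre><code class="language-python">class MyDescriptor(object):
    # Stored on the descriptor itself, so the value is shared by every Swap instance
    _value = ''

    def __get__(self, instance, klass):
        return self._value

    def __set__(self, instance, value):
        self._value = value.swapcase()


class Swap(object):
    swap = MyDescriptor()
</code></pre>
<p>Note that MyDescriptor must be a new-style class (in Python 3 every class is one). Let's try it:</p>
<pre><code class="language-python">In [1]: from descriptor_example import Swap
In [2]: instance = Swap()
In [3]: instance.swap  # No AttributeError: access to swap is handled by the descriptor class
Out[3]: ''
In [4]: instance.swap = 'make it swap'  # __set__ updates _value
In [5]: instance.swap
Out[5]: 'MAKE IT SWAP'
In [6]: instance.__dict__  # __dict__ is not used: the access was hijacked
Out[6]: {}
</code></pre>
<p>That is the power of descriptors. If the familiar staticmethod and classmethod still feel opaque, seeing how they could be implemented in pure Python may make things clearer:</p>
<pre><code class="language-python">>>> class myStaticMethod(object):
...     def __init__(self, method):
...         self.staticmethod = method
...     def __get__(self, object, type=None):
...         return self.staticmethod
...
>>> class myClassMethod(object):
...     def __init__(self, method):
...         self.classmethod = method
...     def __get__(self, object, klass=None):
...         if klass is None:
...             klass = type(object)
...         def newfunc(*args):
...             return self.classmethod(klass, *args)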
...         return newfunc
</code></pre>
<p>So what are descriptors good for in real production projects? First, look at how Field is used in MongoEngine:</p>
<pre><code class="language-python">from mongoengine import *


class Metadata(EmbeddedDocument):
    tags = ListField(StringField())
    revisions = ListField(IntField())


class WikiPage(Document):
    title = StringField(required=True)
    text = StringField()
    metadata = EmbeddedDocumentField(Metadata)
</code></pre>
<p>There are many Field types, and their base class is in fact a descriptor. Here is a simplified version that shows the principle:</p>
<pre><code class="language-python">class BaseField(object):
    name = None

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        ...

    def __get__(self, instance, owner):
        return instance._data.get(self.name)

    def __set__(self, instance, value):
        ...
        instance._data[self.name] = value
</code></pre>
<p>The source code of many projects looks complicated, but once you peel away the layers the principle is quite simple; what is complicated is the business logic.</p>
<p>Next, let's look at cached_property in Werkzeug, a dependency of Flask:</p>
<pre><code class="language-python">class _Missing(object):
    def __repr__(self):
        return 'no value'

    def __reduce__(self):
        return '_missing'


_missing = _Missing()


class cached_property(property):
    def __init__(self, func, name=None, doc=None):
        self.__name__ = name or func.__name__
        self.__module__ = func.__module__
        self.__doc__ = doc or func.__doc__
        self.func = func

    def __set__(self, obj, value):
        obj.__dict__[self.__name__] = value

    def __get__(self, obj, type=None):
        if obj is None:
            return self
        # Look in the instance __dict__ first; compute and store only on a miss
        value = obj.__dict__.get(self.__name__, _missing)
        if value is _missing:
            value = self.func(obj)
            obj.__dict__[self.__name__] = value
        return value
</code></pre>
<p>The class name already tells you that it caches a property. Don't worry if the code is not clear yet; let's use it:</p>
<pre><code class="language-python">class Foo(object):
    @cached_property
    def bar(self):
        print('Call me!')
        return 42
</code></pre>
<p>Try it out:</p>
<pre><code class="language-python">In [1]: from cached_property import Foo
   ...: foo = Foo()
   ...:

In [2]: foo.bar
Call me!
Out[2]: 42

In [3]: foo.bar
Out[3]: 42
</code></pre>
<p>As you can see, from the second access to bar onward the cached result is used and the method body is not executed again.</p>
<p>Having covered so many uses of descriptors, let's write one that validates fields:</p>
<pre><code class="language-python">class Quantity(object):
    def __init__(self, name):
        self.name = name

    def __set__(self, instance, value):
        if value > 0:
            instance.__dict__[self.name] = value
        else:
            raise ValueError('value must be > 0')


class Rectangle(object):
    height = Quantity('height')
    width = Quantity('width')

    def __init__(self, height, width):
        self.height = height
        self.width = width

    @property
    def area(self):
        return self.height * self.width
</code></pre>
<p>Let's try it:</p>
<pre><code class="language-python">In [1]: from rectangle import Rectangle
In [2]: r = Rectangle(10, 20)
In [3]: r.area
Out[3]: 200

In [4]: r = Rectangle(-1, 20)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-5a7fc56e8a> in <module>()
----> 1 r = Rectangle(-1, 20)

/Users/dongweiming/mp/2017-03-23/rectangle.py in __init__(self, height, width)
     15
     16     def __init__(self, height, width):
---> 17         self.height = height
     18         self.width = width
     19

/Users/dongweiming/mp/2017-03-23/rectangle.py in __set__(self, instance, value)
      7             instance.__dict__[self.name] = value
      8         else:
----> 9             raise ValueError('value must be > 0')
     10
     11

ValueError: value must be > 0
</code></pre>
<p>See? We validated the incoming value inside the descriptor class. This is exactly how ORMs do it!</p>
<p>The implementation above has one drawback, though: it is not very automatic. Look at height = Quantity('height'): the attribute and the name passed to Quantity both have to be spelled height. Can we avoid specifying the name? Of course, but the implementation becomes considerably more complex:</p>
<pre><code class="language-python">class Quantity(object):
    __counter = 0

    def __init__(self):
        cls = self.__class__
        prefix = cls.__name__
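        index = cls.__counter
        self.name = '_{}#{}'.format(prefix, index)
        cls.__counter += 1

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return getattr(instance, self.name)
    ...


class Rectangle(object):
    height = Quantity()
    width = Quantity()
    ...
</code></pre>
<p>The name of each Quantity is effectively the class name plus a counter. The counter is incremented by 1 every time a Quantity is instantiated, which keeps the names distinct. One thing worth mentioning is this snippet in __get__:</p>
<pre><code class="language-python">if instance is None:
    return self
</code></pre>
<p>It shows up in many places, for example in MongoEngine's BaseField mentioned earlier. Without it, accessing the attribute directly on the class, as in Rectangle.height, would raise AttributeError: __get__ is then called with instance set to None, and the managed value is stored on the instance, not on the class.</p>
<p>PS: The inspiration for this comes from "Fluent Python", which has another example that I think is very well designed. It deals with how to extend cleanly when there are many kinds of validation. Suppose that besides checking that the value is greater than 0, we also need to check that it is not empty and that it is a number (doing all three checks in one method would be acceptable too; this is just a demonstration). We first write an abc base class:</p>
<pre><code class="language-python">import abc


class Validated(abc.ABC):
    __counter = 0

    def __init__(self):
        cls = self.__class__
        prefix = cls.__name__
        index = cls.__counter
        self.name = '_{}#{}'.format(prefix, index)
        cls.__counter += 1

    def __get__(self, instance, owner):
        if instance is None:
            return self
        else:
            return getattr(instance, self.name)

    def __set__(self, instance, value):
        # Subclasses decide what "valid" means; storage is handled here
        value = self.validate(instance, value)
        setattr(instance, self.name, value)

    @abc.abstractmethod
    def validate(self, instance, value):
        """return validated value or raise ValueError"""
</code></pre>
<p>Now, to add a new kind of check, just add a class that inherits from Validated and implements the check in its validate method:</p>
<pre><code class="language-python">class Quantity(Validated):
    def validate(self, instance, value):
        if value <= 0:
            raise ValueError('value must be > 0')
        return value


class NonBlank(Validated):
    def validate(self, instance, value):
        value = value.strip()
        if len(value) == 0:
            raise ValueError('value cannot be empty or blank')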
        return value
</code></pre>
<p>All the descriptors shown so far are classes. Can the same thing be done with a function? Yes, it can:</p>
<pre><code class="language-python">def quantity():
    try:
        quantity.counter += 1
    except AttributeError:
        quantity.counter = 0

    # Each call gets its own storage name, captured by the closures below
    storage_name = '_{}:{}'.format('quantity', quantity.counter)

    def qty_getter(instance):
        return getattr(instance, storage_name)

    def qty_setter(instance, value):
        if value > 0:
            setattr(instance, storage_name, value)
        else:
            raise ValueError('value must be > 0')
    return property(qty_getter, qty_setter)
</code></pre>
<p>Source: http://www.dongwm.com/archives/深入属性描述符/</p>
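<p>The function-based version is shown without a usage example, so here is a minimal sketch of how it might be used. The quantity() factory is repeated only to keep the snippet self-contained; the LineItem class and its weight and price fields are hypothetical names made up for this illustration and do not come from the original post.</p>
<pre><code class="language-python">def quantity():
    """Property factory: each call returns a validating property."""
    try:
        quantity.counter += 1
    except AttributeError:
        quantity.counter = 0

    # Unique per call: '_quantity:0', '_quantity:1', ...
    storage_name = '_{}:{}'.format('quantity', quantity.counter)

    def qty_getter(instance):
        return getattr(instance, storage_name)

    def qty_setter(instance, value):
        if value > 0:
            setattr(instance, storage_name, value)
        else:
            raise ValueError('value must be > 0')

    return property(qty_getter, qty_setter)


class LineItem(object):  # hypothetical class, for illustration only
    weight = quantity()
    price = quantity()

    def __init__(self, weight, price):
        self.weight = weight  # routed through qty_setter
        self.price = price

    def subtotal(self):
        return self.weight * self.price


item = LineItem(8, 13.5)
print(item.subtotal())     # 108.0
print(sorted(vars(item)))  # ['_quantity:0', '_quantity:1']

try:
    item.price = 0         # rejected by qty_setter
except ValueError as exc:
    print(exc)             # value must be > 0
</code></pre>
<p>In this sketch the storage name lives in a closure instead of on a descriptor instance. Because names such as _quantity:0 are not valid identifiers, the stored values can only be reached through getattr and setattr, which makes an accidental clash with ordinary attributes unlikely.</p>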