重构关键路径一个python库:Laboratory
Laboratory!
Laboratory是重构关键路径一个python库(是Github的Scientist一个移植).
Why?
See Github's blog post - http://githubengineering.com/scientist/
But how?
Imagine you've implemented a complex caching strategy for some objects in your database and a stale cache is simply not acceptable. How could you test this and ensure parity with your previous implementation, under load, with production data? Run it in production!
Laboratory will:
- Run both the new and the old code
- Compare their results
- Record timing information about all code
- Swallow and record exceptions in the new code
- Publish all of this information
Of course, you're still unsure your candidate code works correctly, so laboratory will always return the result from the candidate block.
import laboratory experiment = laboratory.Experiment() with experiment.control() as c: c.record(get_objects_from_database()) with experiment.candidate() as c: c.record(get_objects_from_cache()) objects = experiment.run()
Publishing results
This data is useless unless we can do something with it. Laboratory makes no assumptions about how to do this - it's entirely for you to implement to suit your needs. For example, timing data can be sent to graphite, and mismatches can be placed in a capped collection in redis for debugging later.
The publish method is passed a Result instance, with control and candidate data is available in Result.control and Result.observations respectively.
class MyExperiment(laboratory.Experiment): def publish(self, result): statsd.timing('MyExperiment.control', result.control.duration) for o in result.observations: statsd.timing('MyExperiment.%s' % o.name, o.duration)
Controlling comparison
Not all data is created equal. By default laboratory compares using == , but sometimes you may need to tweak this to suit your needs. It's easy enough - just subclass Experiment and implement the compare(control, observation) method.
class MyExperiment(Experiment): def compare(self, control, observation): return control.value['id'] == observation.value['id']
Adding context
A lot of the time there's going to be extra context around an experiment that's useful to use in publishing or comparisons. You can set this data in a few ways.
# The first is experiment-wide context, which will be set on every observation laboratory makes. experiment = laboratory.Experiment(name='Object Cache Experiment', context={'user': user}) # Observation-specific context can be updated before or as the experiment is running. with experiment.control(name='Object DB Strategy', context={'using': 'db'}) as e: e.update_context({'uuid': uuid}) e.get_context() # == # { # 'user': <User>, # 'uuid': 'c08d46f1-92a6-46e5-9185-82d90dcb5af1', # 'using': 'db', # } with experiment.candidate(name='Object Cache Strategy', context={'using': 'cache'}) as e: e.update_context({'uuid': uuid}) e.get_context() # == # { # 'user': <User>, # 'using': 'cache', # }
Installation
Installing from pypi is recommended
$ pip install laboratory
官方网站:http://www.open-open.com/lib/view/home/1455582479058