Python

Python is call-by-assignment

2016-07-102019-10-22VintaPython

If you pass a mutable object into a method, the method gets a reference to that same object and you can mutate it to your heart's delight, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object.

If you pass an immutable object to a method, you still can't rebind the outer reference, and you can't even mutate the object.

例如 def foo(bar): 這個 function 的參數傳遞，其實是在 foo 的 local namespace 裡幫傳進去的物件綁定了一個叫做 bar 的名字（所謂的 assignment）；如果你在 foo 裡 re-assign 了一個新的物件給 bar，實際上是把 bar 這個名字綁定到那個新的物件。

mutable objects: list, dict, set 的行為同 call-by-reference
immutable objects: boolean, int, float, long, str, unicode, tuple 的行為同 call-by-value

不過如果你把一個 mutable 物件放進 immutable 物件裡，例如把 list 放進 tuple 裡，則修改了 list 之後，那個 tuple 裡的 list 也是會被修改。

a_list = [1, 2, 3]
a_tuple = (a_list, 'a', 'b', 'c')
a_list.append(4)

Assignment is the binding of a name to an object: name = 'Molly'.
Assignment between names doesn't create a new object, both names are simply bound to the same object: nickname = name.

name = 'Mollie'
name = 'Vinta'

所謂的 assign 這個動作，其實是幫 'Mollie' 這個字串取一個名字叫做 name，所以如果你又加上一句 name = 'Vinta'，實際上是建立了一個新的物件（字串 'Vinta'），再把這個新字串綁定到 name 這個名字。

# if bar refers to a mutable object
def foo(bar):
    bar.append('new value')
    print(bar)
    # output: ['old value', 'new value']

answer_list = ['old value', ]
foo(answer_list)
print(answer_list)
# output: ['old value', 'new value']

# if bar refers to an immutable object
def foo(bar):
    bar = 'new value'
    print(bar)
    # output: 'new value'

answer_list = 'old value'
foo(answer_list)
print(answer_list)
# output: 'old value'

# if bar refers to a mutable object and re-assign it in foo
def foo(bar):
    bar = ['new value', ]
    print(bar)
    # output: ['new value', ]

answer_list = ['old value', ]
foo(answer_list)
print(answer_list)
# output: ['old value', ]

ref:
https://stackoverflow.com/questions/986006/how-do-i-pass-a-variable-by-reference
https://jeffknupp.com/blog/2012/11/13/is-python-callbyvalue-or-callbyreference-neither/
https://docs.python.org/3/faq/programming.html#how-do-i-write-a-function-with-output-parameters-call-by-reference

scope

只有 module、class、function 才有建立新的 scope，if、while、for 並不會建立新的 scope。

some_list = [
    {
        'id': 1,
        'is_change': False,
    },
    {
        'id': 2,
        'is_change': False,
    },
    {
        'id': 3,
        'is_change': False,
    },
    {
        'id': 4,
        'is_change': False,
    },
]

for item in some_list:
    item['is_change'] = True

some_list 的每個 item 都會被更新，因為是 mutable objects 的行為是 call-by-reference。所以你不需要這樣：

new_list = []
for item in some_list:
    item['is_change'] = True
    new_list.append(item)

Upload files to Amazon S3 when Travis CI builds pass

2016-06-282019-10-22VintaDevOps, Python, Web Development

Assume that you want to upload a xxx.whl file generated by pip wheel to Amazon S3 so that you will be able to run pip install https://url/to/s3/bucket/xxx.whl.

CAUTION! By default, only master branch's builds could trigger deployments in Travis CI.

Configuration

before_install:
  - pip install -U pip
  - pip install wheel

script:
  - python setup.py test

before_deploy:
  - pip wheel --wheel-dir=wheelhouse .

deploy:
  provider: s3
  access_key_id: "YOUR_KEY"
  secret_access_key: "YOUR_SECRET"
  bucket: YOUR_BUCKET
  acl: public_read
  local_dir: wheelhouse
  upload_dir: wheels
  skip_cleanup: true

# install from an URL directly
$ pip install https://url/to/s3/bucket/wheels/xxx.whl

ref:
https://docs.travis-ci.com/user/deployment/s3

argparse: Create a Command-line App with Python

2015-12-092019-10-22VintaPython

argparse is a Python standard library makes it easy to write a CLI application. You should use this module instead of optparse which is deprecated since Python 2.7.

ref:
https://docs.python.org/3/library/argparse.html

Basic example

#!/usr/bin/env python

from __future__ import print_function

import argparse
import sys

import jokekappa

class JokeKappaCLI(object):

    def __init__(self):
        parser = argparse.ArgumentParser(
            prog='jokekappa',
            description='humor is a serious thing, you should take it seriously',
        )
        self.parser = parser
        self.parser.add_argument('-v', '--version', action='version', version=jokekappa.__version__)

        self.subparsers = parser.add_subparsers(title='sub commands')
        self.subparsers \
            .add_parser('one', help='print one joke randomly') \
            .set_defaults(func=self.tell_joke)
        self.subparsers \
            .add_parser('all', help='print all jokes') \
            .set_defaults(func=self.tell_jokes)
        self.subparsers \
            .add_parser('update', help='update jokes from sources') \
            .set_defaults(func=self.update_jokes)

    def parse_args(self):
        if len(sys.argv) == 1:
            namespace = self.parser.parse_args(['one', ])
        else:
            namespace = self.parser.parse_args()
        namespace.func()

    def tell_joke(self):
        joke = jokekappa.get_joke()
        print(joke['content'])

    def tell_jokes(self):
        for joke in jokekappa.get_jokes():
            print(joke['content'])

    def update_jokes(self):
        jokekappa.update_jokes()
        print('Done')

def main():
    JokeKappaCLI().parse_args()

if __name__ == '__main__':
    main()

ref:
https://github.com/CodeTengu/JokeKappa
https://stackoverflow.com/questions/5176691/argparse-how-to-specify-a-default-subcommand

Advanced example

In following code, you're able to create a Python module called pangu and a command-line tool also called pangu, both share the same codebase.

in pangu.py

from __future__ import print_function

import argparse
import sys

__version__ = '3.3.0'
__all__ = ['spacing_text', 'PanguCLI']

def spacing_text(text):
    """
    You could find real code in https://github.com/vinta/pangu.py
    """
    return text.upper().strip()

class PanguCLI(object):

    def __init__(self):
        parser = argparse.ArgumentParser(
            prog='pangu',
            description='paranoid text spacing',
        )
        self.parser = parser
        self.parser.add_argument('-v', '--version', action='version', version=__version__)
        self.parser.add_argument('text', action='store', type=str)

    def parse(self):
        if not sys.stdin.isatty():
            print(spacing_text(sys.stdin.read()))
        elif len(sys.argv) > 1:
            namespace = self.parser.parse_args()
            print(spacing_text(namespace.text))
        else:
            self.parser.print_help()
        sys.exit(0)

if __name__ == '__main__':
    PanguCLI().parse()

in bin/pangu

#!/usr/bin/env python

from pangu import PanguCLI

if __name__ == '__main__':
    PanguCLI().parse()

in setup.py

from setuptools import setup

setup(
    name='pangu',
    py_modules=['pangu', ],
    scripts=['bin/pangu', ],
    ...
)

As a result, there're multiple usages:

$ pangu "abc"
$ python -m pangu "abc"
$ echo "abc" | pangu
$ echo "abc" | python -m pangu
ABC

ref:
https://github.com/vinta/pangu.py

Accept a list as option

parser.add_argument('-u', '--usernames', type=lambda x: x.split(','), dest='usernames', required=True)
# your_command -u vinta
# your_command -u vinta,saiday

Accept conditional argument

ref:
https://stackoverflow.com/questions/15459997/passing-integer-lists-to-python

Tools for Profiling your Python Projects

2015-08-312019-10-22VintaPython, Web Development

The first aim of profiling is to test a representative system to identify what's slow, using too much RAM, causing too much disk I/O or network I/O. You should keep in mind that profiling typically adds an overhead to your code.

In this post, I will introduce tools you could use to profile your Python or Django projects, including: timer, pycallgraph, cProfile, line-profiler, memory-profiler.

ref:
https://stackoverflow.com/questions/582336/how-can-you-profile-a-script
https://www.airpair.com/python/posts/optimizing-python-code

timer

The simplest way to profile a piece of code.

ref:
https://docs.python.org/3/library/timeit.html

pycallgraph

pycallgraph is a Python module that creates call graph visualizations for Python applications.

ref:
https://pycallgraph.readthedocs.org/en/latest/

$ sudo apt-get install graphviz
$ pip install pycallgraph

# in your_app/middlewares.py
from pycallgraph import Config
from pycallgraph import PyCallGraph
from pycallgraph.globbing_filter import GlobbingFilter
from pycallgraph.output import GraphvizOutput
import time

class PyCallGraphMiddleware(object):

    def process_view(self, request, callback, callback_args, callback_kwargs):
        if 'graph' in request.GET:
            config = Config()
            config.trace_filter = GlobbingFilter(include=['rest_framework.*', 'api.*', 'music.*'])
            graphviz = GraphvizOutput(output_file='pycallgraph-{}.png'.format(time.time()))
            pycallgraph = PyCallGraph(output=graphviz, config=config)
            pycallgraph.start()

            self.pycallgraph = pycallgraph

    def process_response(self, request, response):
        if 'graph' in request.GET:
            self.pycallgraph.done()

        return response

# in settings.py
MIDDLEWARE_CLASSES = (
    'your_app.middlewares.PyCallGraphMiddleware',
    ...
)

$ python manage.py runserver 0.0.0.0:8000
$ open http://127.0.0.1:8000/your_endpoint/?graph=true

cProfile

cProfile is a tool in Python's standard library to understand which functions in your code take the longest to run. It will give you a high-level view of the performance problem so you can direct your attention to the critical functions.

ref:
http://igor.kupczynski.info/2015/01/16/profiling-python-scripts.html
https://ymichael.com/2014/03/08/profiling-python-with-cprofile.html

$ python -m cProfile manage.py test member
$ python -m cProfile -o my-profile-data.out manage.py test --failtest
$ python -m cProfile -o my-profile-data.out manage.py runserver 0.0.0.0:8000

$ pip install cprofilev
$ cprofilev -f my-profile-data.out -a 0.0.0.0 -p 4000
$ open http://127.0.0.1:4000

cProfile with django-cprofile-middleware

$ pip install django-cprofile-middleware

# in settings.py
MIDDLEWARE_CLASSES = (
    ...
    'django_cprofile_middleware.middleware.ProfilerMiddleware',
)

Open any url with a ?prof suffix to do the profiling, for instance, http://localhost:8000/foo/?prof

ref:
https://github.com/omarish/django-cprofile-middleware

cProfile with django-extension and kcachegrind

kcachegrind is a profiling data visualization tool, used to determine the most time consuming execution parts of a program.

ref:
http://django-extensions.readthedocs.org/en/latest/runprofileserver.html

$ pip install django-extensions

# in settings.py
INSTALLED_APPS += (
    'django_extensions',
)

$ mkdir -p my-profile-data

$ python manage.py runprofileserver \
--noreload \
--nomedia \
--nostatic \
--kcachegrind \
--prof-path=my-profile-data \
0.0.0.0:8000

$ brew install qcachegrind --with-graphviz
$ qcachegrind my-profile-data/root.003563ms.1441992439.prof
# or
$ sudo apt-get install kcachegrind
$ kcachegrind my-profile-data/root.003563ms.1441992439.prof

cProfile with django-debug-toolbar

You're only able to use django-debug-toolbar if your view returns HTML, it needs a place to inject the debug panels into your DOM on the webpage.

ref:
https://github.com/django-debug-toolbar/django-debug-toolbar

$ pip install django-debug-toolbar

# in settiangs.py
INSTALLED_APPS += (
    'debug_toolbar',
)

DEBUG_TOOLBAR_PANELS = [
    ...
    'debug_toolbar.panels.profiling.ProfilingPanel',
    ...
]

line-profiler

line-profiler is a module for doing line-by-line profiling of functions. One of my favorite tools.

ref:
https://github.com/rkern/line_profiler

$ pip install line-profiler

# in your_app/views.py
def do_line_profiler(view=None, extra_view=None):
    import line_profiler

    def wrapper(view):
        def wrapped(*args, **kwargs):
            prof = line_profiler.LineProfiler()
            prof.add_function(view)
            if extra_view:
                [prof.add_function(v) for v in extra_view]
            with prof:
                resp = view(*args, **kwargs)
            prof.print_stats()
            return resp

        return wrapped

    if view:
        return wrapper(view)

    return wrapper

@do_line_profiler
def your_view(request):
    pass

ref:
https://djangosnippets.org/snippets/10483/

There is a pure Python alternative: pprofile.
https://github.com/vpelletier/pprofile

line-profiler with django-devserver

ref:
https://github.com/dcramer/django-devserver

$ pip install git+git://github.com/dcramer/django-devserver#egg=django-devserver

in settings.py

INSTALLED_APPS += (
    'devserver',
)

DEVSERVER_MODULES = (
    ...
    'devserver.modules.profile.LineProfilerModule',
    ...
)

DEVSERVER_AUTO_PROFILE = False

in your_app/views.py

from devserver.modules.profile import devserver_profile

@devserver_profile()
def your_view(request):
    pass

line-profiler with django-debug-toolbar-line-profiler

ref:
http://django-debug-toolbar.readthedocs.org/en/latest/
https://github.com/dmclain/django-debug-toolbar-line-profiler

$ pip install django-debug-toolbar django-debug-toolbar-line-profiler

# in settings.py
INSTALLED_APPS += (
    'debug_toolbar',
    'debug_toolbar_line_profiler',
)

DEBUG_TOOLBAR_PANELS = [
    ...
    'debug_toolbar_line_profiler.panel.ProfilingPanel',
    ...
]

memory-profiler

This is a Python module for monitoring memory consumption of a process as well as line-by-line analysis of memory consumption for Python programs.

ref:
https://pypi.python.org/pypi/memory_profiler

$ pip install memory-profiler psutil

# in your_app/views.py
from memory_profiler import profile

@profile(precision=4)
def your_view(request):
    pass

There are other options:
http://stackoverflow.com/questions/110259/which-python-memory-profiler-is-recommended

dogslow

ref:
https://bitbucket.org/evzijst/dogslow

django-slow-tests

ref:
https://github.com/realpython/django-slow-tests

django-debug-toolbar: The Debugging Toolkit for Django

2015-08-302019-10-22VintaPython, Web Development

django-debug-toolbar is a tool sets to display various debug information about the current request and response in Django.

ref:
https://github.com/django-debug-toolbar/django-debug-toolbar

Install

$ pip install \
  django-debug-toolbar \
  django-debug-toolbar-line-profiler \
  django-debug-toolbar-template-profiler \
  django-debug-toolbar-template-timings \
  django-debug-panel \
  memcache-toolbar \
  pympler \
  git+https://github.com/scuml/debug-toolbar-mail

ref:
https://github.com/dmclain/django-debug-toolbar-line-profiler
https://github.com/node13h/django-debug-toolbar-template-profiler
https://github.com/orf/django-debug-toolbar-template-timings
https://github.com/recamshak/django-debug-panel
https://github.com/ross/memcache-debug-panel
https://pythonhosted.org/Pympler/django.html
https://github.com/scuml/debug-toolbar-mail

Python 3
https://github.com/lerela/django-debug-toolbar-line-profile

Configuration

in urls.py

from django.conf import settings
from django.conf.urls import include, url

if settings.DEBUG:
    import debug_toolbar
    urlpatterns = [
        url(r'^__debug__/', include(debug_toolbar.urls)),
    ] + urlpatterns

in settings.py

INSTALLED_APPS += (
    'debug_toolbar',
    # 'debug_toolbar_line_profiler',
    # 'memcache_toolbar',
    # 'pympler',
    # 'template_profiler_panel',
    # 'template_timings_panel',
)
DEBUG_TOOLBAR_PANELS = [
    # 'debug_toolbar.panels.versions.VersionsPanel',
    # 'debug_toolbar.panels.timer.TimerPanel',
    # 'debug_toolbar.panels.settings.SettingsPanel',
    # 'debug_toolbar.panels.headers.HeadersPanel',
    # 'debug_toolbar.panels.request.RequestPanel',
    'debug_toolbar.panels.sql.SQLPanel',
    # 'debug_toolbar.panels.staticfiles.StaticFilesPanel',
    # 'debug_toolbar.panels.templates.TemplatesPanel',
    # 'template_timings_panel.panels.TemplateTimings.TemplateTimings',
    # 'template_profiler_panel.panels.template.TemplateProfilerPanel'
    # 'debug_toolbar.panels.cache.CachePanel',
    # 'memcache_toolbar.panels.memcache.MemcachePanel',
    # 'debug_toolbar.panels.profiling.ProfilingPanel',
    # 'debug_toolbar_line_profiler.panel.ProfilingPanel',
    # 'pympler.panels.MemoryPanel',
    # 'debug_toolbar.panels.signals.SignalsPanel',
    # 'debug_toolbar.panels.logging.LoggingPanel',
    # 'debug_toolbar.panels.redirects.RedirectsPanel',
]

if 'debug_toolbar' in INSTALLED_APPS:
    MIDDLEWARE_CLASSES = list(MIDDLEWARE_CLASSES)
    MIDDLEWARE_CLASSES += [
        'debug_toolbar.middleware.DebugToolbarMiddleware',
    ]

def show_toolbar(request):
    return True

DEBUG_TOOLBAR_CONFIG = {
    'SHOW_TOOLBAR_CALLBACK': show_toolbar,
}

INTERNAL_IPS = (
    '127.0.0.1',
)

ref:
http://django-debug-toolbar.readthedocs.org/en/latest/configuration.html
http://django-debug-toolbar.readthedocs.org/en/latest/panels.html

要確保沒有在 MIDDLEWARE_CLASSES 裡啟用以下的 middlewares：

'django.middleware.gzip.GZipMiddleware'
'django.middleware.http.ConditionalGetMiddleware'

ref:
http://django-debug-toolbar.readthedocs.io/en/stable/installation.html#automatic-setup