Archive for May 2015

Read and write files in Go

Reading file line by line

如果要一行一行地讀
建議用 bufio.Scanner
但是 Scanner 有個缺點
就是一行太長(超過 64K)的時候會出現 bufio.Scanner: token too long 的錯誤
這時候還是得用 bufio.Reader

fin, err := os.Open(path)
if err != nil {
    fmt.Println(err)
}
defer fin.Close()

scanner := bufio.NewScanner(fin)
for scanner.Scan() {
    line := scanner.Text()
    fmt.Fprintln(os.Stdin, line)
}

if err := scanner.Err(); err != nil {
    fmt.Fprintln(os.Stderr, err)
}

If you know the maximum length of the tokens you will be reading, copy the bufio.Scanner code into your project and change the const MaxScanTokenSize value.

ref:
http://stackoverflow.com/questions/6141604/go-readline-string
http://stackoverflow.com/questions/1821811/how-to-read-write-from-to-file
http://stackoverflow.com/questions/8757389/reading-file-line-by-line-in-go
http://stackoverflow.com/questions/5884154/golang-read-text-file-into-string-array-and-write
https://github.com/polaris1119/The-Golang-Standard-Library-by-Example/blob/master/chapter01/01.4.md

bufio
https://golang.org/pkg/bufio/
https://golang.org/pkg/bufio/#Scanner

Reading and writing file line by line

fmt.Fprintln(writer, line)bw.WriteString(line) 還要快

func FileSpacing(filename string, w io.Writer) (err error) {
    fr, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer fr.Close()

    br := bufio.NewReader(fr)
    bw := bufio.NewWriter(w)

    for {
        line, err := br.ReadString('\n')
        if err == nil {
            fmt.Fprint(bw, TextSpacing(line))
        } else {
            if err == io.EOF {
                fmt.Fprint(bw, TextSpacing(line))
                break
            }
            return err
        }
    }
    defer bw.Flush()

    return nil
}

Copy a file

fin, _ := os.Open("source.txt")
fout, _ := os.Create("destination.txt")

io.Copy(fout, fin)

defer fout.Close()
defer fin.Close()

ref:
http://stackoverflow.com/questions/23272663/transfer-a-big-file-in-golang
http://golang.org/pkg/io/#Copy

Compute MD5 of a file

func md5Of(filename string) string {
    var result []byte

    file, err := os.Open(filename)
    checkError(err)
    defer file.Close()

    hash := md5.New()
    _, err = io.Copy(hash, file)
    checkError(err)

    checksum := hex.EncodeToString(hash.Sum(result))

    return checksum
}

ref:
http://stackoverflow.com/questions/29505089/how-can-i-compare-two-files-in-golang

Decorator in Python

Python 裡的所有東西都是 object
function 也是
所以你可以對 function 做任何跟 object 一樣的事
例如把一個 function 當成參數丟給另一個 function
當然也可以 decorate class

ref:
http://blog.dscpl.com.au/2014/01/how-you-implemented-your-python.html
http://www.jianshu.com/p/98ca249b1326

不帶參數的 decorator

第一層 def 接收 func
第二層 def 接收 func 的 *args, **kwargs

通常意義下的 decorator 是把 func(就是 something_1、something_2)丟給 decorator function
做一些額外的動作
然後回傳該 func 原本的 return
並不操作 func 本身

如果要操作 func 本身
例如幫 func 增加一個 attribute
請參考下下面的例子

def func_wrapper(func):
    def arg_wrapper(*args, **kwargs):
        print func.__name__ + ' was called'
        print args
        print kwargs

        return func(*args, **kwargs)

    return arg_wrapper

@func_wrapper
def something_1(name='[name]'):
    print 'something_1: ' + name

@func_wrapper
def something_2(name='[name]'):
    print 'something_2: ' + name

something_1('Tom')
something_2('Cathy')
something_1()
something_2()

ref:
http://www.dotblogs.com.tw/rickyteng/archive/2013/11/06/126852.aspx

帶參數的 decorator

需要傳參數給 decorator 時
要多 wrap 一次
其實就是在原本的 decorator function 的外面再多加一個 def 來接收參數

那個 model_class 就是傳給 decorator 的參數

# 跟不帶參數的 decorator 相比多了一層,用來接收 decorator 的參數
def can_access_item_required(model_class):
    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            request = args[0]
            item = model_class.objects.get(pk=kwargs['pk'])

            if not model_class.objects.can_access_item_by(item, request.user):
                raise PermissionDenied()

            return func(*args, **kwargs)

        return arg_wrapper

    return func_wrapper

@can_access_item_required(Label)
def label_update(request, pk):
    pass

# equals to

label_update = can_access_item_required(Artist)(label_update)

ref:
https://www.python.org/dev/peps/pep-0318/#current-syntax

decorator 修改 func 本身

不需要封裝什麼
單純地把 func 丟給另一個 function
然後再 return 那個 func 即可

in views.py

def intro_middleware_decorator(func):
    func.enable_intro_middleware = False

    return func

@intro_middleware_decorator
def intro_1(request):
    return render(request, 'dps/intro/1.html')

in middleware.py

class IntroMiddleware(object):
    """
    如果 user 沒有完成 intro 步驟,則轉址到 /intro/1/
    """

    def __init__(self):
        self.enable = True

    def process_view(self, request, view_func, view_args, view_kwargs):
        """
        要在 intro 系列的 view 加上 disable_intro_middleware_decorator
        避免無限迴圈
        """

        self.enable = getattr(view_func, 'enable_intro_middleware', True)

    def process_response(self, request, response):
        if self.enable:
            resolver = request.resolver_match
            namespace = getattr(resolver, 'namespace', None)
            # 避免 admin, login 之類的 view 也都被 redirect
            if namespace == 'dps':
                if request.user.is_authenticated():
                    profile = request.user.profile
                    if (profile.identity == ProfileIdentity.PHANTOM) and (not profile.is_ready):
                        return redirect('dps:intro-one')
                else:
                    return redirect('auth:login')

        return response

decorator in a Class

class SVAPIListView(object):

    @staticmethod
    def filter_last_modified_decorator(func):
        """
        用來讓 client 同步資料
        ?last_modified=2015-02-04T15:16:22&is_deleted=true
        可以拿到某個時間點之後,新增、修改、刪除的項目
        必須明確地指定 is_deleted=true 才會包含被刪除的項目
        """

        def _add_filters_for_syncing(self, *args, **kwargs):
            queryset = func(self, *args, **kwargs)

            last_modified = self.get_last_modified()
            if last_modified:
                queryset = queryset.filter(last_modified__gte=last_modified)

                is_deleted = self.get_is_deleted()
                target_user = self.get_target_user()
                if (self.request.user == target_user) and (is_deleted):
                    # 只有用戶本人才能拿到被刪除的 items
                    pass
                else:
                    queryset = queryset.filter(enable=True)
            else:
                queryset = queryset.filter(enable=True)

            return queryset

        return _add_filters_for_syncing

class UserAlbumList(SVAPIListView):
    serializer_class = api_serializers.AlbumListSerializer

    @UserIDAsUsernameMixin.filter_last_modified_decorator
    def get_queryset(self, profile_user):
        queryset = MusicAlbum.objects.filter(user=profile_user)
        queryset = queryset.select_related('user', 'user__profile')

        ordering = self.request.GET.get('ordering')
        if ordering == 'like_count':
            queryset = queryset.order_by('-like_count')
        else:
            queryset = queryset.order_by('-id')

        return queryset

    def get(self, request, user_id, *args, **kwargs):
        profile_user = self.get_target_user()
        queryset = self.get_queryset(profile_user=profile_user)
        data = self.get_serializer_data(queryset)

        return Response(data)

ref:
http://stackoverflow.com/questions/1263451/python-decorators-in-classes

多個 decorator 時的執行順序

如果有多個 decorator 裝飾同一個 function / class
執行的順序是由下往上的
會先執行 @decorator_1

@decorator_3
@decorator_2
@decorator_1
def function():
    pass

Class-based decorator

class RecorderDecorator(object):

    __slots__ = ['recorder', 'msg_template', 'subject', 'action']

    recorder = Recorder()

    def __init__(self, msg_template, subject=None, action=None):
        self.msg_template = msg_template
        self.subject = subject
        self.action = action

class model_recorder(RecorderDecorator):
    """
    用於 model 的 method
    接受一個 string template,會自動使用 method parameters 作為 template context
    會使用 method name 作為 action,model name 作為 subject
    會在裝飾的 method 執行完之後執行這個 decorator

    用法可以參考 Song.play()

    @recorder_decorators.model_recorder('user: {user.id}, ip: {from_ip}, full: {is_full}, embed: {is_embed}')
    def play(self, user, from_ip, is_full=False, is_embed=False, from_playlist=None):
        pass

    則 log message 會是 [song:play] user: 123, ip: 59.120.12.57, full: True, embed: False
    """

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            func_result = func(*args, **kwargs)
            func_kwargs = inspect.getcallargs(func, *args, **kwargs)

            model_instance = func_kwargs.get('self')

            if getattr(model_instance, 'id'):
                new_msg_template = ', '.join(('item: {self.id}', self.msg_template))
            else:
                new_msg_template = self.msg_template

            if not self.subject:
                model_name = type_utils.item_type(model_instance)
                new_subject = model_name
            else:
                new_subject = self.subject

            if not self.action:
                new_action = func.__name__
            else:
                new_action = self.action

            msg = new_msg_template.format(**func_kwargs)
            self.recorder.write(new_subject, new_action, msg)

            return func_result

        return wrapper

class Song(models.Model):

    @model_recorder('user: {user.id}, ip: {from_ip}, full: {is_full}, embed: {is_embed}')
    def play(self, user, from_ip, is_full=False, is_embed=False, from_playlist=None):
        pass

ref:
http://stackoverflow.com/questions/9416947/python-class-based-decorator-with-parameters-that-can-decorate-a-method-or-a-fun

Database Routers in Django

把部分 models / tables 獨立到一台資料庫

Database Router

沒辦法在 Model class 裡指定這個 model 只能用某個 database
而是要用 database router
就是判斷 model._meta.app_label == 'xxx' 的時候指定使用某一個 database
database 是指定義在 settings.DATABASES 的那些

不過 django 不支援跨 database 的 model relation
你不能用 foreign key 或 m2m 指向另一個 database 裡的 model
但是其實你直接用 user_id, song_id 之類的 int 欄位來記錄就好了

in settings.py

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'main',
        'USER': 'whatever',
        'PASSWORD': 'whatever',
        'HOST': '',
        'PORT': '',
    },
    'warehouse': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'warehouse',
        'USER': 'whatever',
        'PASSWORD': 'whatever',
        'HOST': '',
        'PORT': '',
    },
}

DATABASE_ROUTERS = [
    'app.db_routers.SVDatabaseRouter',
]

in app/db_routers.py

class SVWarehouseDBMixin(object):
    """
    CAUTION! 有這個 minxin 的 class 會被放到 warehouse 這台資料庫!

    原本是想繼承 models.Model 用 abstract = True
    但是載入 db router 的時候會衝突(待研究)
    只好改成 mixin
    """

    pass


class SVDatabaseRouter(object):

    def db_for_read(self, model, **hints):
        if issubclass(model, SVWarehouseDBMixin):
            return 'warehouse'

        return 'default'

    def db_for_write(self, model, **hints):
        if issubclass(model, SVWarehouseDBMixin):
            return 'warehouse'

        return 'default'

    def allow_relation(self, obj1, obj2, **hints):
        if isinstance(obj1, SVWarehouseDBMixin) and isinstance(obj2, SVWarehouseDBMixin):
            return True
        elif (not isinstance(obj1, SVWarehouseDBMixin)) and (not isinstance(obj2, SVWarehouseDBMixin)):
            return True

        return False

    def allow_syncdb(self, db, model):
        if db == 'default':
            if issubclass(model, SVWarehouseDBMixin):
                return False
            else:
                return True
        elif db == 'warehouse':
            if issubclass(model, SVWarehouseDBMixin):
                return True
            else:
                return False

        return False

ref:
https://docs.djangoproject.com/en/dev/topics/db/multi-db/

Models

繼承 SVWarehouseDBMixin 這個 minxin 的 class 會被放到 warehouse 這台資料庫!
所以 migrate 的時候要注意,記得在 migration 檔案裡加上:

from south.db import dbs
warehouse_db = dbs['warehouse']

這樣 south 才會在 warehouse 這台資料庫上建立 table
不然就是你自己手動去 CREATE TABLE

from app.db_routers import SVWarehouseDBMixin

class PlayRecord(SVWarehouseDBMixin, models.Model):
    song_id = models.IntegerField()
    user_id = models.IntegerField(null=True, blank=True)
    is_full = models.BooleanField(default=False)
    ip_address = models.IPAddressField()
    location = models.CharField(max_length=2)
    created_at = models.DateTimeField()

Migration

如果你要把舊有的 app 的 models 搬到另一台資料庫
但是 models 不動(還是放在本來的 app 底下)
你可能會需要 reset 整個 migration 紀錄
從頭開始建立一個新的 migration
因為 schema 會錯亂
所以還是建議新開一個 app 來放那些要搬到另一台資料庫的 models
這樣 database router 和 migration 都會比較單純

in app/migrations/0001_initial.py

$ pip install south==1.0.2

# syncdb 默認只會作用到 default 資料庫,你要明確指定要用哪個 database 才行
$ ./manage.py syncdb --noinput
$ ./manage.py syncdb --noinput --database=warehouse

# migrate 卻可以作用到其他資料庫
# 因為 migrate 哪個資料庫是 migration file 裡的 `db` 參數在決定的
$ ./manage.py migrate

ref:
http://stackoverflow.com/questions/7029228/is-using-multiple-databases-and-south-together-possible

Unit Tests

Django only flushes the default database at the start of each test run. If your setup contains multiple databases, and you have a test that requires every database to be clean, you can use the multi_db attribute on the test suite to request a full flush.

from django.test import TestCase

class YourBTestCase(TestCase):
    multi_db = True

    def setUp(self):
        do_shit()

ref:
https://docs.djangoproject.com/en/dev/topics/testing/tools/
https://docs.djangoproject.com/en/dev/topics/testing/advanced/#topics-testing-advanced-multidb