Django ORM

defer / only / values / values_list

# retrieve everything but the `body` field
posts = Post.objects.all().defer('body')

# retrieve only the `title` field
posts = Post.objects.all().only('title')

# retrieve a list of {'id': id} dictionaries
posts = Post.objects.all().values('id')

# retrieve a list of (id,) tuples
posts = Post.objects.all().values_list('id')

# retrieve a list of ids
posts = Post.objects.all().values_list('id', flat=True)

使用 defer()only()
你得到的還是一個 model object

使用 values()values_list()
你得到的會是 list 或 dictionaries

如果你只會存取特定的幾個欄位而不會用到 methods
建議你用 values()values_list()
這樣可以省下把資料初始化成 Django model instance 的時間

distinct()

因為 MySQL 不支援 distinct('some_field')
會報錯 NotImplementedError: DISTINCT ON fields is not supported by this database backend
但是你其實可以結合 values()distinct() 來達成差不多的結果

AlbumDistribute.objects.filter(status=AlbumPublishingStatus.PUBLISHED).distinct('album').count()
# NotImplementedError: DISTINCT ON fields is not supported by this database backend

AlbumDistribute.objects.filter(status=AlbumPublishingStatus.PUBLISHED).values('album').count()
# 107

AlbumDistribute.objects.filter(status=AlbumPublishingStatus.PUBLISHED).values('album').distinct().count()
# 30

cast a queryset to a list

queryset = User.objects.filter(is_staff=True)
for user in queryset:
    setattr(user, 'extra_field', 'whatever')

object_list = list(queryset)

當你 assign 了值之後,最好把 queryset 轉成 list
這樣可以避免在其他地方不小心執行了 object_list.all() 之類的操作
因為 callable attributes cause DB lookups every time
每次 object_list.all() 又會產生一個新的 queryset

ref:
https://docs.djangoproject.com/en/dev/topics/db/optimization/

order_by()

.order_by() 不加參數表示不要任何排序

SQL 的 GROUP BY

如果你只是希望 query 的結果能夠隱式地 group 在一起
其實用 .order_by() 就可以了
例如 .order_by('album', '-created_at')
http://stackoverflow.com/questions/629551/how-to-query-as-group-by-in-django

如果你需要用到 aggregation
就要用 .values()

from django.db.models import Sum

"""
每一首歌在每一週的播放次數之和
group by song
[
    {'sum_count': 123553, 'song': 261009L},
    {'sum_count': 81889, 'song': 295336L},
    {'sum_count': 78596, 'song': 206349L},
    ...
]
"""
PlayRecordWeekly.objects \
.filter(song__created_at__range=(start_date, end_date)) \
.values('song') \
.annotate(sum_count=Sum('count')) \
.order_by('-sum_count')[:10]


"""
每一首歌的總播放次數
group by song
[
    {'count_song': 26054, 'song': 194512L},
    {'count_song': 18281, 'song': 201436L},
    {'count_song': 17802, 'song': 295336L},
    ...
]
"""
PlayRecord.objects \
.filter(created_at__year=2014, created_at__month=11) \
.values('song') \
.annotate(count_song=Count('song')) \
.order_by('-count_song')[:10]


"""
每一個用戶的播放次數之和
group by user
"""
PlayRecordArchive.objects \
.filter(last_modified__year=2014) \
.values('user_id') \
.annotate(sum_count=Sum('count')) \
.filter(sum_count__gte=10000) \
.order_by('-sum_count')


"""
每一種曲風的播放次數之和
group by song__genre
"""
PlayRecord.objects \
.filter(user=user) \
.exclude(song__genre=0) \
.values('song__genre') \
.annotate(count_song_genre=Count('song__genre')) \
.order_by('-count_song_genre')

ref:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#values
http://fcamel-life.blogspot.tw/2010/04/django-group-by.html
http://blog.darkchoco.com/2011/12/20/group-by-in-django/
http://stackoverflow.com/questions/19101665/django-how-to-do-select-count-group-by-and-order-by

Q

所有的 .filter() 查詢都是 AND 操作
如果想要 OR 就得用 Q()

from django.db.models import Q
Q(question__startswith='What')

ref:
https://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects

F

你可以在 .filter() 裡用 F() 來表示同一個 model 的其他欄位
就是可以在資料庫層就直接運算
而不會把資料拿到 Python 層處理

不過跟 transaction.atomic() 一起用的時候有點問題
因為有可能資料還沒有真的被存進資料庫裡

from django.db.models import F

Entry.objects.filter(n_comments__gt=F('n_pingbacks'))

song.play_count = F('play_count') + 1
song.save(update_fields=['play_count', ])

ref:
https://docs.djangoproject.com/en/dev/topics/db/queries/#using-f-expressions-in-filters

aggregate, annotate

https://docs.djangoproject.com/en/dev/topics/db/aggregation/
aggregate 是「聚合」
a sum total of many heterogenous things taken together

annotate 是「註釋」
add explanatory notes to or supply with critical comments
https://docs.djangoproject.com/en/dev/ref/models/querysets/#annotate

annotate 作用在 queryset 裡的每一個 object
所以返回的還是一個一般的 queryset
只是每個 model instance 會有額外的 attribute 儲存 annotate 的結果

aggregate 作用在整個 queryset
通常會返回一個 dict
只包含你 aggregate 的結果
例如 model 裡某個欄位(例如 price)的總和

要訣就是 Avg, Sum, Count, Max, Min
能夠在 SQL 做的計算就在 SQL 做
不要拿到 Python 來做
https://docs.djangoproject.com/en/dev/ref/models/querysets/#aggregation-functions

ref:
https://docs.djangoproject.com/en/dev/topics/db/aggregation/
http://stackoverflow.com/questions/327807/django-equivalent-for-count-and-group-by/1317837
http://www.shellbye.com/blog/%E6%8A%80%E6%9C%AF%E4%B8%96%E7%95%8C/django-aggregate-annotate-%E8%AF%A6%E8%A7%A3/