Django 的 SomeModel.objects.get_or_create()
實際上的步驟是:
- get
- if problem: save + some_trickery
- if still problem: get again
- if still problem: surrender and raise
get_or_create()
拋出 IntegrityError
但是接著 get()
一次卻會是 DoesNotExist
,這個錯誤常常會發生在 MySQL 的 AUTOCOMMIT=OFF
且 isolation level 是 REPEATABLE READ 的情況下。MySQL 默認採用 REPEATABLE READ(而 PostgreSQL 預設是 READ COMMITTED),而 REPEATABLE READ 的定義是在同一個 transaction 之內 執行相同的 SELECT 一定會拿到相同的結果。
with transaction.commit_on_success():
try:
user, created = User.objects.get_or_create(id=self.id, username=self.username)
except IntegrityError:
# 這個 get() 可能會發生 DoesNotExist
user = User.objects.get(id=self.id)
會發生那個 DoesNotExist
通常是因為:
- thread a 執行
get()
,但是DoesNotExist
- 所以 thread a 接著
create()
,但是在資料建立之前 - thread b 幾乎在同一個時間點也執行
get()
,也是DoesNotExist
- thread a 成功地
create()
了 - 因為在 3 拿不到資料,所以 thread b 也執行
create()
,但是因為 UNIQUE CONSTRAINT,導致失敗了 - 所以 thread b 又會
get()
一次,但是因為 REPEATABLE READ,thread b 不會拿到 thread a 建立的資料,所以DoesNotExist
- 所以那個
User.DoesNotExist
就是 thread b 拋出來的錯誤 - thread a 是成功的,你事後去資料庫裡查詢,會看到那個 User 的資料好端端地在那裡
ref:
http://stackoverflow.com/questions/2235318/how-do-i-deal-with-this-race-condition-in-django
https://blog.ionelmc.ro/2014/12/28/terrible-choices-mysql/
解決的方法,除了就乾脆不要在同一個 transaction 裡之外,基本上就只能把 isolation level 改成 READ COMMITTED 了。似乎沒有其他辦法了,因為這就是 isolation level 的定義。
As of MariaDB/MySQL 5.1, if you use READ COMMITTED or enable innodb_locks_unsafe_for_binlog, you must use row-based binary logging.
SHOW VARIABLES LIKE 'binlog_format';
SET SESSION binlog_format = 'ROW';
SET GLOBAL binlog_format = 'ROW';
ref:
http://dba.stackexchange.com/questions/2678/how-do-i-show-the-binlog-format-on-a-mysql-server
Django 和 Celery 都建議使用 READ COMMITTED
https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create
https://code.djangoproject.com/ticket/13906
https://code.djangoproject.com/ticket/6641
http://docs.celeryproject.org/en/latest/faq.html#mysql-is-throwing-deadlock-errors-what-can-i-do
方法一
SELECT @@GLOBAL.tx_isolation, @@tx_isolation;
# MySQL
SET SESSION tx_isolation='READ-COMMITTED';
SET GLOBAL tx_isolation='READ-COMMITTED';
# MariaDB
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
ref:
http://dev.mysql.com/doc/refman/5.5/en/set-transaction.html
方法二
/etc/mysql/my.cnf
[mysqld]
transaction-isolation = READ-COMMITTED
方法三
in settings.py
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
...
'OPTIONS': {
'init_command': 'SET SESSION tx_isolation="READ-COMMITTED"',
},
}
}
方法四
自己寫一個 get_or_create()
from django.db import transaction
@transaction.atomic
# or
@transaction.commit_on_success
def my_get_or_create(...):
try:
obj = MyObj.objects.create(...)
except IntegrityError:
transaction.commit()
obj = MyObj.objects.get(...)
return obj
ref:
http://stackoverflow.com/questions/2235318/how-do-i-deal-with-this-race-condition-in-django
MySQL 默認是 AUTOCOMMIT
http://dev.mysql.com/doc/refman/5.5/en/commit.html
Django 默認也是 AUTOCOMMIT,除非你使用了 TransactionMiddleware
(Django 1.5 以前)或是設置 ATOMIC_REQUESTS = False
(Django 1.6 以後)。TransactionMiddleware
的效果類似 @transaction.commit_on_success
或 @transaction.atomic
,都是把 AUTOCOMMIT 關掉,差別在於範圍不同。
ref:
http://django.readthedocs.io/en/1.5.x/topics/db/transactions.html#django-s-default-transaction-behavior
http://django.readthedocs.io/en/1.6.x/topics/db/transactions.html#managing-autocommit
http://django.readthedocs.io/en/1.6.x/ref/settings.html#std:setting-DATABASE-ATOMIC_REQUESTS