Django 的 SomeModel.objects.get_or_create()
實際上的步驟是:
get, if-problem: save+some_trickery, if-still-problem: get again, if-still-problem: surrender and raise.
get_or_create()
拋出 IntegrityError
但是接著 get()
一次卻會是 DoesNotExist
這個錯誤常常會發生在 MySQL 的 AUTOCOMMIT=OFF 且 isolation level 是 REPEATABLE READ 的情況下
MySQL 默認採用 REPEATABLE READ(而 PostgreSQL 預設是 READ COMMITTED)
而 REPEATABLE READ 的定義是在同一個 transaction 之內 執行相同的 SELECT 一定會拿到相同的結果
with transaction.commit_on_success(): try: user, created = User.objects.get_or_create(id=self.id, username=self.username) except IntegrityError: # 這個 get() 可能會發生 DoesNotExist user = User.objects.get(id=self.id)
會發生那個 DoesNotExist
通常是因為:
- thread a 執行
get()
,但是DoesNotExist
- 所以 thread a 接著
create()
,但是在資料建立之前 - thread b 幾乎在同一個時間點也執行
get()
,也是DoesNotExist
- thread a 成功地
create()
了 - 因為在 3 拿不到資料,所以 thread b 也執行
create()
,但是因為 UNIQUE CONSTRAINT,導致失敗了 - 所以 thread b 又會
get()
一次,但是因為 REPEATABLE READ,thread b 不會拿到 thread a 建立的資料,所以DoesNotExist
- 所以那個
User.DoesNotExist
就是 thread b 拋出來的錯誤 - thread a 是成功的,你事後去資料庫裡查詢,會看到那個 User 的資料好端端地在那裡
ref:
http://stackoverflow.com/questions/2235318/how-do-i-deal-with-this-race-condition-in-django
https://blog.ionelmc.ro/2014/12/28/terrible-choices-mysql/
解決的方法 除了就乾脆不要在同一個 transaction 裡之外
基本上就只能把 isolation level 改成 READ COMMITTED 了
似乎沒有其他辦法了
因為這就是 isolation level 的定義
As of MariaDB/MySQL 5.1, if you use READ COMMITTED or enable innodb_locks_unsafe_for_binlog, you must use row-based binary logging.
SHOW VARIABLES LIKE 'binlog_format'; SET SESSION binlog_format = 'ROW'; SET GLOBAL binlog_format = 'ROW';
ref:
http://dba.stackexchange.com/questions/2678/how-do-i-show-the-binlog-format-on-a-mysql-server
Django 和 Celery 都建議使用 READ COMMITTED
https://docs.djangoproject.com/en/dev/ref/models/querysets/#get-or-create
https://code.djangoproject.com/ticket/13906
https://code.djangoproject.com/ticket/6641
http://docs.celeryproject.org/en/latest/faq.html#mysql-is-throwing-deadlock-errors-what-can-i-do
方法一
SHOW VARIABLES LIKE 'tx_isolation'; SET SESSION tx_isolation='READ-COMMITTED'; SET GLOBAL tx_isolation='READ-COMMITTED'; # or SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED; SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
ref:
http://dev.mysql.com/doc/refman/5.5/en/set-transaction.html
方法二
/etc/mysql/my.cnf
[mysqld] transaction-isolation = READ-COMMITTED
方法三
in settings.py
DATABASES = { 'default': { 'ENGINE': 'django.db.backends.mysql', ... 'OPTIONS': { 'init_command': 'SET SESSION tx_isolation="READ-COMMITTED"', }, } }
方法四
自己寫一個 get_or_create()
from django.db import transaction @transaction.atomic # or @transaction.commit_on_success def my_get_or_create(...): try: obj = MyObj.objects.create(...) except IntegrityError: transaction.commit() obj = MyObj.objects.get(...) return obj
ref:
http://stackoverflow.com/questions/2235318/how-do-i-deal-with-this-race-condition-in-django
--
MySQL 默認是 AUTOCOMMIT
http://dev.mysql.com/doc/refman/5.5/en/commit.html
Django 默認也是 AUTOCOMMIT
除非你使用了 TransactionMiddleware
(Django 1.5 以前)或是設置 ATOMIC_REQUESTS = False
(Django 1.6 以後)
TransactionMiddleware
的效果類似 @transaction.commit_on_success
或 @transaction.atomic
都是把 AUTOCOMMIT 關掉
差別在於範圍不同
http://django.readthedocs.io/en/1.5.x/topics/db/transactions.html#django-s-default-transaction-behavior
http://django.readthedocs.io/en/1.6.x/topics/db/transactions.html#managing-autocommit
http://django.readthedocs.io/en/1.6.x/ref/settings.html#std:setting-DATABASE-ATOMIC_REQUESTS