Build a news feed in Redis (fan-out-on-write aka Push model)

Though maybe Cassandra would be a better fit for this?

--

A sorted set can be used to store each user's news feed,
with each work's created_at timestamp as the score,
keeping every set capped at something like 800 items.

When a user publishes a work

push the work's id into every follower's news feed through a message queue
Perhaps only into the news feeds of so-called "active users" (e.g. users who logged in within the last seven days),
since a celebrity-level user may have a huge number of followers (a sketch follows the example below)

news_feed:user_183 = [
user_8:song:5232, 8783421
user_2:song:432, 8509823
user_4:photo:23, 8323490
user_7:album:1232, 5323453
...
]
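
A minimal sketch of that fan-out task, assuming a recent redis-py (the 3.x zadd signature); get_follower_ids is a hypothetical helper that reads the follower list (optionally only the active ones) from MySQL:

import redis

r = redis.Redis(decode_responses=True)  # return str instead of bytes
FEED_CAP = 800

def fan_out_work(author_id, work_member, created_at_ts):
    # work_member looks like "user_8:song:5232", created_at_ts is a unix timestamp
    for follower_id in get_follower_ids(author_id, active_within_days=7):
        key = 'news_feed:user_%d' % follower_id
        pipe = r.pipeline()
        pipe.zadd(key, {work_member: created_at_ts})
        # trim so only the newest FEED_CAP items survive
        pipe.zremrangebyrank(key, 0, -(FEED_CAP + 1))
        pipe.execute()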

When pushing work ids into each user's news feed through the message queue,
it might also be worth distinguishing active users from inactive ones
and giving the tasks for active users a higher priority
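
A hypothetical sketch of that split, assuming Celery is the message queue here (the queue names and the is_recently_active flag are made up for illustration):

from celery import shared_task

@shared_task
def fan_out_to_follower(follower_id, work_member, created_at_ts):
    # the ZADD + ZREMRANGEBYRANK logic from the sketch above, for one follower
    ...

def enqueue_fan_out(follower, work_member, created_at_ts):
    # active followers go to a higher-priority queue, the rest can wait
    queue = 'feed_high' if follower.is_recently_active else 'feed_low'
    fan_out_to_follower.apply_async(
        args=[follower.id, work_member, created_at_ts],
        queue=queue)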

When user A follows user B

first take the last (oldest) item in A's news feed
and check its created_at score,
then fetch B's works that are newer than that created_at,
write them into A's news feed,
and finally truncate the feed (with ZREMRANGEBYRANK) to keep only 800 items
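
A rough sketch of that backfill, reusing r and FEED_CAP from the first sketch; get_works_newer_than is a hypothetical helper that queries B's works from MySQL:

def backfill_after_follow(a_id, b_id):
    key = 'news_feed:user_%d' % a_id
    # index 0 of ZRANGE is the member with the lowest score, i.e. the oldest item
    oldest = r.zrange(key, 0, 0, withscores=True)
    since_ts = oldest[0][1] if oldest else 0
    mapping = {work.member: work.created_at_ts
               for work in get_works_newer_than(b_id, since_ts)}
    if mapping:
        pipe = r.pipeline()
        pipe.zadd(key, mapping)
        pipe.zremrangebyrank(key, 0, -(FEED_CAP + 1))  # keep only 800 items
        pipe.execute()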

When a user deletes a work

There are a few possible approaches:

Option 1: after the deletion, walk through every follower's news feed via the message queue
and remove that work from each of them (with ZREM),
though some users may still read their news feed before the cleanup finishes
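
The cleanup task itself would just be a ZREM per follower; a sketch, reusing r and the hypothetical get_follower_ids from above:

def fan_out_delete(author_id, work_member):
    # remove the deleted work from every follower's feed
    for follower_id in get_follower_ids(author_id):
        r.zrem('news_feed:user_%d' % follower_id, work_member)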

Option 2: check lazily, only when a user reads their own news feed
Assuming a Redis list or sorted set is used,
read out the list of ids first (paginated, of course),
then fetch the real models (songs, albums, photos, etc.) from MySQL by those ids,
check whether any of them are flagged as deleted (assuming soft deletion),
remove those from the Redis list,
then pull the next page of ids to fill the gap,
and repeat the whole process if there are still items that should have been removed
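
A sketch of that read-time cleanup for the sorted set case, reusing r from the first sketch; load_models_by_ids is a hypothetical helper that fetches the models from MySQL and leaves out the soft-deleted ones:

def read_feed_page(user_id, start, page_size):
    key = 'news_feed:user_%d' % user_id
    results, offset = [], start
    while len(results) < page_size:
        # newest first: ZREVRANGE walks from the highest score (newest) down
        members = r.zrevrange(key, offset, offset + (page_size - len(results)) - 1)
        if not members:
            break
        models = load_models_by_ids(members)  # {member: model}, deleted ones missing
        dead = [m for m in members if m not in models]
        if dead:
            r.zrem(key, *dead)  # clean up lazily
        results.extend(models[m] for m in members if m in models)
        # the dead members are gone from the set now, so only skip past the kept ones
        offset += len(members) - len(dead)
    return results[:page_size]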

Option 3: maintain, for each work, a list of which users' news feeds contain that work
http://stackoverflow.com/questions/12357770/how-to-implement-user-feed-like-in-twitter-or-facebook-on-redis

Of course, combining these approaches might work even better

Deletion here is not limited to the work itself being removed;
it could also be user A unfollowing user B,
in which case B's works have to be removed from A's news feed as well.
In that case,
it seems the only way is to iterate over A's news feed and delete B's works?
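
Since the members are prefixed with the author's user id (e.g. "user_8:song:5232"), one way is to scan the feed with a MATCH pattern; a sketch with redis-py's zscan_iter, reusing r from above:

def remove_author_from_feed(a_id, b_id):
    key = 'news_feed:user_%d' % a_id
    # members look like "user_<b_id>:song:<id>", so a prefix match catches them all
    doomed = [member for member, score
              in r.zscan_iter(key, match='user_%d:*' % b_id)]
    if doomed:
        r.zrem(key, *doomed)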

ref:
http://highscalability.com/blog/2013/10/28/design-decisions-for-scaling-your-high-traffic-feeds.html

pt-online-schema-change

pt-online-schema-change can alter a table's structure without blocking reads or writes.

pt-online-schema-change works by creating an empty copy of the table to alter, modifying it as desired, and then copying rows from the original table into the new table. When the copy is complete, it moves away the original table and replaces it with the new one. Any modifications to data in the original tables during the copy will be reflected in the new table, because the tool creates triggers on the original table to update the corresponding rows in the new table.

Install

$ sudo apt-key adv --keyserver keys.gnupg.net --recv-keys 1C4CBDCDCD2EFD2A

# add this to `/etc/apt/sources.list`
# remember to replace `precise` with the name of your distribution
deb http://repo.percona.com/apt precise main
deb-src http://repo.percona.com/apt precise main

$ sudo apt-get update
$ sudo aptitude -V install libio-socket-ssl-perl libnet-ssleay-perl percona-toolkit

ref:
http://www.percona.com/doc/percona-server/5.6/installation/apt_repo.html

Usage

$ time pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--dry-run \
--alter "ADD COLUMN extra_data TEXT NULL" \
D=streetvoice,t=member_userprofile

# you must specify the `--execute` option explicitly to actually alter tables
$ time pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--execute \
--alter "ADD COLUMN extra_data TEXT NULL" \
D=streetvoice,t=member_userprofile

$ pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--execute \
--alter " \
DROP COLUMN get_sms, \
DROP COLUMN job_title, \
DROP COLUMN exp, \
DROP COLUMN non_member_message, \
DROP COLUMN fans_amount, \
DROP COLUMN new_fans, \
DROP COLUMN last_time_in_giftbox \
" \
D=streetvoice,t=member_userprofile

$ pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--execute \
--alter " \
ADD COLUMN playlist_id integer NULL AFTER user_id, \
ADD CONSTRAINT playlist_id_refs_id_1d572331 FOREIGN KEY (playlist_id) REFERENCES music_playlist (id), \
ADD INDEX music_playrecord_97bd5154 (playlist_id) \
" \
D=streetvoice,t=music_playrecord

# 2 cores, 2GB RAM
real 105m27.484s
user 0m8.541s
sys 0m1.414s

$ pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--execute \
--alter-foreign-keys-method=rebuild_constraints \
--alter " \
ADD COLUMN play_count integer UNSIGNED NOT NULL DEFAULT 0 AFTER buy_link \
" \
D=streetvoice,t=music_musicalbum

$ pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--execute \
--alter " \
DROP COLUMN starsigns \
" \
D=streetvoice,t=member_usertemp

$ pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--dry-run \
--alter " \
DROP COLUMN is_cc, \
DROP COLUMN cc_license \
" \
D=streetvoice,t=music_song

ref:
http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html
http://www.percona.com/blog/2014/11/18/avoiding-mysql-alter-table-downtime/
http://www.cnblogs.com/zhoujinyi/p/3491059.html
http://huanghualiang.blog.51cto.com/6782683/1360873

alter-foreign-keys-method

If the table you want to alter is referenced by foreign keys from other tables,
you have to specify the alter-foreign-keys-method option explicitly,
otherwise pt-online-schema-change will refuse to run
because of the constraint issues it would cause.

The official documentation says rebuild_constraints is the recommended approach.

$ time pt-online-schema-change \
--user=root --ask-pass --host=127.0.0.1 \
--print \
--execute \
--alter-foreign-keys-method=rebuild_constraints \
--alter "ADD COLUMN length int(10) UNSIGNED NOT NULL DEFAULT 0 AFTER file_size" \
D=streetvoice,t=music_song

# 2 cores, 2G RAM
real 2m57.298s
user 0m0.338s
sys 0m0.043s

ref:
http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html#cmdoption-pt-online-schema-change--alter-foreign-keys-method

MariaDB Galera Cluster

This tool works fine with MariaDB Galera Cluster; just make sure the wsrep_OSU_method value is TOI.

ref:
http://www.severalnines.com/blog/online-schema-upgrade-mysql-galera-cluster-using-toi-method
http://galeracluster.com/documentation-webpages/schemaupgrades.html

Percona XtraBackup and innobackupex

Install

$ sudo apt-key adv --keyserver keys.gnupg.net --recv-keys 1C4CBDCDCD2EFD2A

# add this to `/etc/apt/sources.list`
# remember to replace `trusty` with the name of your distribution
deb http://repo.percona.com/apt trusty main
deb-src http://repo.percona.com/apt trusty main

$ sudo apt-get update
$ sudo apt-get install -V percona-xtrabackup

ref:
http://www.percona.com/doc/percona-xtrabackup/2.2/installation/apt_repo.html
http://www.severalnines.com/blog/mysqldump-or-percona-xtrabackup-backup-strategies-mysql-galera-cluster

Usage

innobackupex is a wrapper script for xtrabackup: it calls the xtrabackup binary to back up all the data in InnoDB tables (see Creating a Backup for details on this process) and copies all the table definitions in the database (the .frm files).

ref:
http://www.percona.com/doc/percona-xtrabackup/2.2/innobackupex/innobackupex_script.html
http://www.percona.com/doc/percona-xtrabackup/2.2/innobackupex/innobackupex_option_reference.html

create a (hot) full backup

$ sudo mkdir -p /data/backups/mysql

$ time sudo innobackupex \
--defaults-file=/etc/mysql/my.cnf \
--host=127.0.0.1 --user=root --password=123 \
--parallel=4 \
--compress --compress-threads=4 \
--galera-info \
/data/backups/mysql

# 2 cores, 2G RAM
real 8m43.021s
user 1m47.838s
sys 0m16.291s

ref:
http://www.percona.com/doc/percona-xtrabackup/2.2/innobackupex/creating_a_backup_ibk.html
http://www.percona.com/doc/percona-xtrabackup/2.2/howtos/recipes_ibkx_compressed.html

prepare a full backup

$ time sudo innobackupex \
--apply-log \
--use-memory=1G \
/data/backups/mysql/2015-03-03_16-11-57/

ref:
http://www.percona.com/doc/percona-xtrabackup/2.2/innobackupex/preparing_a_backup_ibk.html

restore a full backup

$ sudo service mysql stop
$ sudo mv /var/lib/mysql /var/lib/mysql_backup
$ sudo mkdir -p /var/lib/mysql

$ time sudo innobackupex \
--copy-back \
/data/backups/mysql/2015-03-03_16-11-57/

# 2 cores, 2G RAM
real 8m44.813s
user 0m0.541s
sys 0m46.829s

$ sudo chown -R mysql:mysql /var/lib/mysql

# CAUTION: when using MariaDB Galera Cluster, it should be started in a particular way
# sudo service mysql start --wsrep-new-cluster
$ sudo service mysql start

ref:
http://www.percona.com/doc/percona-xtrabackup/2.2/innobackupex/restoring_a_backup_ibk.html

create an incremental backup

$ time sudo innobackupex \
--defaults-file=/etc/mysql/my.cnf \
--host=127.0.0.1 --user=root --password=123 \
--incremental-basedir=/data/backups/mysql/2015-03-03_16-11-57/ \
--incremental \
/data/backups/mysql/

# 2 cores, 2G RAM
real 3m54.367s
user 0m39.399s
sys 0m10.585s

ref:
http://www.percona.com/doc/percona-xtrabackup/2.2/innobackupex/incremental_backups_innobackupex.html

prepare an incremental backup

$ sudo innobackupex \
--apply-log --redo-only \
/data/backups/mysql/2015-03-03_16-11-57/ # the directory of the first full backup

$ sudo innobackupex \
--apply-log --redo-only \
/data/backups/mysql/2015-03-03_16-11-57/ \
--incremental-dir=/data/backups/mysql/2015-03-03_17-56-28/ # the directory of the first incremental backup

# note: the docs say to use --redo-only on all but the last incremental
$ sudo innobackupex \
--apply-log \
/data/backups/mysql/2015-03-03_16-11-57/ \
--incremental-dir=/data/backups/mysql/2015-03-03_18-03-42/ # the directory of the second (last) incremental backup

The incremental restore is not done yet at this point; the most important step is still missing: a full restore.
Stop the database, remove /var/lib/mysql, then restore (the same steps as in "restore a full backup" above).
The idea behind incremental backups is to merge the data in the incremental directories into the full backup directory, and then do a normal full restore from it.

ref:
http://blog.51yip.com/mysql/1650.html
http://blog.csdn.net/dbanote/article/details/13295727

Issues

InnoDB: Error: log file ./ib_logfile0 is of different size 5242880 bytes
InnoDB: than specified in the .cnf file 50331648 bytes!

Fix: make innodb_log_file_size in /etc/mysql/my.cnf match the restored log files:

innodb_log_file_size = 5242880

ref:
http://blog.51yip.com/mysql/1650.html