MySQL error codes

Show all MySQL error codes using the perror command.

$ for i in {1..190..1}; do perror "$i"; done
OS error code   1:  Operation not permitted
OS error code   2:  No such file or directory
OS error code   3:  No such process
OS error code   4:  Interrupted system call
OS error code   5:  Input/output error
OS error code   6:  No such device or address
OS error code   7:  Argument list too long
OS error code   8:  Exec format error
OS error code   9:  Bad file descriptor
OS error code  10:  No child processes
OS error code  11:  Resource temporarily unavailable
OS error code  12:  Cannot allocate memory
OS error code  13:  Permission denied
OS error code  14:  Bad address
OS error code  15:  Block device required
OS error code  16:  Device or resource busy
OS error code  17:  File exists
OS error code  18:  Invalid cross-device link
OS error code  19:  No such device
OS error code  20:  Not a directory
OS error code  21:  Is a directory
OS error code  22:  Invalid argument
OS error code  23:  Too many open files in system
OS error code  24:  Too many open files
OS error code  25:  Inappropriate ioctl for device
OS error code  26:  Text file busy
OS error code  27:  File too large
OS error code  28:  No space left on device
OS error code  30:  Read-only file system
OS error code  31:  Too many links
OS error code  32:  Broken pipe
OS error code  33:  Numerical argument out of domain
OS error code  34:  Numerical result out of range
OS error code  35:  Resource deadlock avoided
OS error code  36:  File name too long
OS error code  37:  No locks available
OS error code  38:  Function not implemented
OS error code  39:  Directory not empty
OS error code  40:  Too many levels of symbolic links
OS error code  42:  No message of desired type
OS error code  43:  Identifier removed
OS error code  44:  Channel number out of range
OS error code  45:  Level 2 not synchronized
OS error code  46:  Level 3 halted
OS error code  47:  Level 3 reset
OS error code  48:  Link number out of range
OS error code  49:  Protocol driver not attached
OS error code  50:  No CSI structure available
OS error code  51:  Level 2 halted
OS error code  52:  Invalid exchange
OS error code  53:  Invalid request descriptor
OS error code  54:  Exchange full
OS error code  55:  No anode
OS error code  56:  Invalid request code
OS error code  57:  Invalid slot
OS error code  59:  Bad font file format
OS error code  60:  Device not a stream
OS error code  61:  No data available
OS error code  62:  Timer expired
OS error code  63:  Out of streams resources
OS error code  64:  Machine is not on the network
OS error code  65:  Package not installed
OS error code  66:  Object is remote
OS error code  67:  Link has been severed
OS error code  68:  Advertise error
OS error code  69:  Srmount error
OS error code  70:  Communication error on send
OS error code  71:  Protocol error
OS error code  72:  Multihop attempted
OS error code  73:  RFS specific error
OS error code  74:  Bad message
OS error code  75:  Value too large for defined data type
OS error code  76:  Name not unique on network
OS error code  77:  File descriptor in bad state
OS error code  78:  Remote address changed
OS error code  79:  Can not access a needed shared library
OS error code  80:  Accessing a corrupted shared library
OS error code  81:  .lib section in a.out corrupted
OS error code  82:  Attempting to link in too many shared libraries
OS error code  83:  Cannot exec a shared library directly
OS error code  84:  Invalid or incomplete multibyte or wide character
OS error code  85:  Interrupted system call should be restarted
OS error code  86:  Streams pipe error
OS error code  87:  Too many users
OS error code  88:  Socket operation on non-socket
OS error code  89:  Destination address required
OS error code  90:  Message too long
OS error code  91:  Protocol wrong type for socket
OS error code  92:  Protocol not available
OS error code  93:  Protocol not supported
OS error code  94:  Socket type not supported
OS error code  95:  Operation not supported
OS error code  96:  Protocol family not supported
OS error code  97:  Address family not supported by protocol
OS error code  98:  Address already in use
OS error code  99:  Cannot assign requested address
OS error code 100:  Network is down
OS error code 101:  Network is unreachable
OS error code 102:  Network dropped connection on reset
OS error code 103:  Software caused connection abort
OS error code 104:  Connection reset by peer
OS error code 105:  No buffer space available
OS error code 106:  Transport endpoint is already connected
OS error code 107:  Transport endpoint is not connected
OS error code 108:  Cannot send after transport endpoint shutdown
OS error code 109:  Too many references: cannot splice
OS error code 110:  Connection timed out
OS error code 111:  Connection refused
OS error code 112:  Host is down
OS error code 113:  No route to host
OS error code 114:  Operation already in progress
OS error code 115:  Operation now in progress
OS error code 116:  Stale NFS file handle
OS error code 117:  Structure needs cleaning
OS error code 118:  Not a XENIX named type file
OS error code 119:  No XENIX semaphores available
OS error code 120:  Is a named type file
OS error code 121:  Remote I/O error
OS error code 122:  Disk quota exceeded
OS error code 123:  No medium found
OS error code 124:  Wrong medium type
OS error code 125:  Operation canceled
MySQL error code 126: Index file is crashed
MySQL error code 127: Record-file is crashed
MySQL error code 128: Out of memory
MySQL error code 130: Incorrect file format
MySQL error code 131: Command not supported by database
MySQL error code 132: Old database file
MySQL error code 133: No record read before update
MySQL error code 134: Record was already deleted (or record file crashed)
MySQL error code 135: No more room in record file
MySQL error code 136: No more room in index file
MySQL error code 137: No more records (read after end of file)
MySQL error code 138: Unsupported extension used for table
MySQL error code 139: Too big row
MySQL error code 140: Wrong create options
MySQL error code 141: Duplicate unique key or constraint on write or update
MySQL error code 142: Unknown character set used
MySQL error code 143: Conflicting table definitions in sub-tables of MERGE table
MySQL error code 144: Table is crashed and last repair failed
MySQL error code 145: Table was marked as crashed and should be repaired
MySQL error code 146: Lock timed out; Retry transaction
MySQL error code 147: Lock table is full;  Restart program with a larger locktable
MySQL error code 148: Updates are not allowed under a read only transactions
MySQL error code 149: Lock deadlock; Retry transaction
MySQL error code 150: Foreign key constraint is incorrectly formed
MySQL error code 151: Cannot add a child row
MySQL error code 152: Cannot delete a parent row
MySQL error code 153: No savepoint with that name
MySQL error code 154: Non unique key block size
MySQL error code 155: The table does not exist in engine
MySQL error code 156: The table already existed in storage engine
MySQL error code 157: Could not connect to storage engine
MySQL error code 158: Unexpected null pointer found when using spatial index
MySQL error code 159: The table changed in storage engine
MySQL error code 160: There's no partition in table for the given value
MySQL error code 161: Row-based binary logging of row failed
MySQL error code 162: Index needed in foreign key constraint
MySQL error code 163: Upholding foreign key constraints would lead to a duplicate key error in some other table
MySQL error code 164: Table needs to be upgraded before it can be used
MySQL error code 165: Table is read only
MySQL error code 166: Failed to get next auto increment value
MySQL error code 167: Failed to set row auto increment value
MySQL error code 168: Unknown (generic) error from engine
MySQL error code 169: Record was not update. Original values was same as new values
MySQL error code 170: It is not possible to log this statement
MySQL error code 171: The event was corrupt, leading to illegal data being read
MySQL error code 172: The table is of a new format not supported by this version
MySQL error code 173: The event could not be processed. No other handler error happened
MySQL error code 174: Got a fatal error during initialization of handler
MySQL error code 175: File too short; Expected more data in file
MySQL error code 176: Read page with wrong checksum
MySQL error code 177: Too many active concurrent transactions
MySQL error code 178: Record not matching the given partition set
MySQL error code 179: Index column length exceeds limit
MySQL error code 180: Index corrupted
MySQL error code 181: Undo record too big
MySQL error code 182: Invalid InnoDB FTS Doc ID
MySQL error code 183: Table is being used in foreign key check
MySQL error code 184: Tablespace already exists
MySQL error code 185: Too many columns
MySQL error code 186: Row in wrong partition
MySQL error code 187: Row is not visible by the current transaction
MySQL error code 188: Operation was interrupted by end user (probably kill command?)
MySQL error code 189: Disk full
MySQL error code 190: Incompatible key or row definition between the MariaDB .frm file and the information in the storage engine. You have to dump and restore the table to fix this

Calculate the similarity of two vectors

scipy.spatial.distance
https://docs.scipy.org/doc/scipy/reference/spatial.distance.html

sklearn.metrics
http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics

Distance

Euclidean distance 歐幾里德距離

from sklearn.metrics.pairwise import euclidean_distances

euclidean_distances([0, 0, 0, 0], [0, 0, 0, 0])
# array([[ 0.]])

euclidean_distances([1, 0, 1, 0], [1, 0, 1, 0])
# array([[ 0.]])

euclidean_distances([0, 1, 0, 1], [1, 0, 1, 0])
# array([[ 2.]])

ref:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.euclidean_distances.html

Manhattan Distance 曼哈頓距離

from sklearn.metrics.pairwise import manhattan_distances

manhattan_distances([0, 0, 0, 0], [0, 0 , 0, 0])
# array([[ 0.]])

manhattan_distances([1, 1, 1, 0], [1, 0, 0, 0])
# array([[ 2.]])

manhattan_distances([0, 1, 0, 1], [1, 0, 1, 0])
# array([[ 4.]])

ref:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.manhattan_distances.html

Similarity

Cosine similarity 餘弦相似度

from sklearn.metrics.pairwise import cosine_similarity
from sklearn.metrics.pairwise import cosine_distances
from sklearn.metrics.pairwise import pairwise_distances
from scipy.spatial.distance import pdist, squareform

cosine_similarity(matrix) == \
1 - cosine_distances(matrix) == \
1 - pairwise_distances(matrix, metric='cosine') == \
1 - squareform(pdist(matrix, 'cosine'))

cosine_similarity([0, 0, 0, 0], [0, 0, 0, 0])
# array([[ 0.]])

cosine_similarity([1, 0, 0, 0], [1, 0, 0, 0])
# array([[ 1.]])

cosine_similarity([1, 0, 1, 0], [0, 1, 0, 1])
# array([[ 0.]])

cosine_similarity([1, 0, 0, 1], [1, 0, 0, 0])
# array([[ 0.70710678]])

cosine_similarity([1, 0, 0, 1], [1, 0, 1, 0])
# array([[ 0.5]])

ref:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_similarity.html
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.cosine_distances.html
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.pairwise_distances.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html

Jaccard similarity coefficient score

from sklearn.metrics import jaccard_similarity_score

jaccard_similarity_score([0, 0, 0, 0], [0, 0, 0, 0])
# 1.0

jaccard_similarity_score([0, 0, 0, 0], [1, 0, 0, 0])
# 0.75

jaccard_similarity_score([1, 0, 0, 0], [1, 0, 0, 0])
# 1.0

jaccard_similarity_score([1, 0, 1, 0], [0, 1, 0, 1])
# 0.0

ref:
http://scikit-learn.org/stable/modules/generated/sklearn.metrics.jaccard_similarity_score.html

http://datascience.stackexchange.com/questions/5121/applications-and-differences-for-jaccard-similarity-and-cosine-similarity

Log-Likelihood similarity

TODO

Pearson correlation coefficient 皮爾森相關係數

It has a value between +1 and −1 inclusive, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation. You should only calculate Pearson Correlations when the number of items in common between two users is > 1, preferably greater than 5/10. Only calculate the Pearson Correlation for two users where they have commonly rated items.

For hign-dimensional binary attributes, the performances of Pearson correlation coefficient and Cosine similarity
are better than Jaccard similarity coefficient score.

from scipy.stats import pearsonr

pearsonr([1, 0, 1, 1], [0, 0, 0, 0])
# (nan, 1.0)

pearsonr([1, 0, 1, 1], [1, 0, 0, 0])
# (0.33333333333333331, 0.66666666666666607)

pearsonr([1, 0, 1, 0], [0, 1, 0, 1])
# (-1.0, 0.0)

ref:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html
http://stackoverflow.com/questions/11429604/how-is-nan-handled-in-pearson-correlation-user-user-similarity-matrix-in-a-recom

Dissimilarity

Dice dissimilarity

from scipy.spatial.distance import dice
import numpy as np

v1 = np.array([0, 0, 0, 0])
v2 = np.array([0, 0, 0, 0])

try:
    sim = 1.0 - dice(v1.astype(bool), v2.astype(bool))
except ZeroDivisionError:
    sim = 0

ref:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.dice.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.kulsinski.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.sokalsneath.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.yule.html

Recommender System: Collaborative Filtering 協同過濾推薦演算法

dataset 會是 m 個用戶對 n 個物品的評分 utility matrix
因為通常只有部分用戶和部份物品會有評分資料
所以是一個 sparse matrix(稀疏矩陣)
目標是利用這些稀疏的資料去預測出用戶對他還沒評分過的物品的評分
除了評分之外,也可能是喜歡(和不喜歡)、購買、瀏覽之類的數據
又分成主動評分和被動評分

CF 的缺點:

  • 如果沒有用戶的歷史數據就沒辦法做任何推薦
  • 以及無論 user-based 或 item-based 都需要消耗大量的運算資源
  • 大部分用戶有評分紀錄的資料都只佔所有資料中的很小一部分,matrix 相當稀疏,很難找到相似的資料
  • 會有馬太效應,越熱門的物品越容易被推薦,所以通常都會降低熱門物品的權重

CF 主要分為 memory-based 和 model-based 兩大類
user-based 和 item-based collaborative filtering 屬於 memory-based
memory-based 基本上就是純粹的計算,沒有什麼 Machine Learning 的成分
model-based 才是 Machine Learning 的範疇

User-based Collaborative Filtering

        item_a  item_b  item_c
user_1  2       -       3
user_2  5       2       -
user_3  3       3       1
user_4  -       2       2
# the algorithm from "Mahout in Action"
for every other user w
  compute a similarity s between u and w
  retain the top users, ranked by similarity, as a neighborhood n

for every item i that some user in n has a preference for,
      but that u  has no preference for yet
  for every other user v in n that has a preference for i
    compute a similarity s between u and  v
    incorporate v's preference for i, weighted by s, into a running average

user-based 考慮的是 user 和 user 之間的相似程度

給定一個用戶 A
計算用戶 A 跟其他所有用戶的相似度
找出最相似的 m 個用戶
再找出這些用戶有評分但是用戶 A 沒有評分的物品(也可以額外限制至少要幾個用戶有評分過)
以「相似用戶的相似度」和「該用戶對該物品的評分」來加權算出用戶 A 對這些未評分物品的評分
最後推薦給 A 評分最高的 n 個物品

預測 user_4 對 item_a 的評分 =
(user_4_user_1_sim x user_1_item_a_rating + user_4_user_3_sim x user_3_item_a_rating) / (user_4_user_1_sim + user_4_user_3_sim)

user-based 的特點:

  • 適合 user 遠少於 item 的系統,相似度的計算量會較少
  • item 的時效性強、更多樣的系統,例如新聞、社交網站,適合用 user-based CF
  • 不容易給出推薦理由
  • 驚喜度較高

常用的相似度演算法:

  • Pearson Correlation Coefficient
  • Cosine Similarity
  • Adjusted Cosine Similarity(有些用戶傾向於對所有物品評高分或低分,這個計算方式可以消除這樣的影響)

ref:
https://www.safaribooksonline.com/library/view/mahout-in-action/9781935182689/kindle_split_013.html

Item-based Collaborative Filtering

        user_1  user_2  user_3  user_4
item_a  2       5       3       -
item_b  -       2       3       2
item_c  3       -       1       2
# the algorithm from "Mahout in Action"
for every item i that u has no preference for yet
  for every item j that u has a preference for
    compute a similarity s between i and j
    add u's preference for j, weighted by s, to a running average
return the top items, ranked by weighted average

item-based 考慮的是 item 和 item 之間的相似程度
item-based 用的還是跟 user-based CF 一模一樣的資料
而不是使用 item 本身的特徵(那個叫 content-based)

如果物品數比用戶數還少得多的話
可以事先計算好所有物品之間的相似度
給定一個用戶 A
找出用戶 A 的所有未評分物品
以「用戶 A 的已評分物品對該未評分物品的相似度」和「用戶 A 對已評分物品的評分」來加權算出用戶 A 對這些未評分物品的評分
最後推薦給用戶 A 評分最高的 n 個物品

預測 user_4 對 item_a 的評分 =
(item_b_item_a_sim x user_4_item_b_rating + item_c_item_a_sim x user_4_item_c_rating) / (item_b_item_a_sim + item_c_item_a_sim)

也可以無視用戶 A 的歷史評分資料(或是根本沒有用戶 A 的歷史資料)
直接推薦跟某個物品最相似的 n 個物品

item-based 的特點:

  • 適合 item 遠少於 user 的系統,相似度的計算量會較少
  • 購物、電影、音樂、書籍等系統,用戶的興趣相對固定,適合用 item-based CF
  • 只會推薦類似的東西,驚喜度和多樣性較低
  • 通常只有在用戶量比較小的時候才需要頻繁地重新計算物品之間的相似度,隨著用戶量越大,物品的相似度會趨於穩定

ref:
https://ashokharnal.wordpress.com/2014/12/18/worked-out-example-item-based-collaborative-filtering-for-recommenmder-engine/
http://blog.csdn.net/huagong_adu/article/details/7362908

Slope One Recommender

        item_a  item_b  item_c
user_1  5       3       2
user_2  3       4       -
user_3  -       2       5
# the algorithm from "Mahout in Action"
for every item i the user u expresses no preference for
  for every item j that user u expresses a preference for
    find the average preference difference between j and i
    add this diff to u's preference value for j
    add this to a running average
return the top items, ranked by these averages

因為 memory-based collaborative filtering 的其中一個問題是數據量很大時計算量也會很可觀
所有就有人提出 Slope One 這種簡單粗暴的演算法來
雖然 Slope One 還是得計算所有物品兩兩之間的平均差異

Slope One 假設任兩個物品之間的評分都是一個 y = mx + b 而且 m = 1(斜率為 1)的線性關係
item_a 平均比 item_b 多 (2 + (-1)) / 2 = 0.5
item_a 平均比 item_c 多 (5 - 2) / 1 = 3
如果用 user_3 對 item_b 的評分來預測他對 item_a 的評分會是 2 + 0.5 = 2.5
如果用 user_3 對 item_c 的評分來預測他對 item_a 的評分會是 5 + 3 = 8
通常會用有多少人同時評分來加權多個評分

預測 user_3 對 item_a 的評分 =
((同時對 item_a 和 item_b 評分的人數 x user_3 用 item_b 對 item_a 的預測評分) + (同時對 item_a 和 item_c 評分的人數 x user_3 用 item_c 對 item_a 的預測評分)) / (同時對 item_a 和 item_b 評分的人數 + 同時對 item_a 和 item_c 評分的人數)
((2 x 2.5) + (1 x 8)) / (2 + 1) = 4.33

ref:
https://en.wikipedia.org/wiki/Slope_One