Linux commands that every DevOps engineer should know

Linux commands that every DevOps engineer should know

Linux commands that DevOps engineers (or SysAdmin) should know.

ref:
https://peteris.rocks/blog/htop/
http://techblog.netflix.com/2015/11/linux-performance-analysis-in-60s.html
http://techblog.netflix.com/2015/08/netflix-at-velocity-2015-linux.html

總覽

$ top

$ sudo apt-get install htop
$ htop

# 每 1 秒輸出一次資訊
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 1580104 171620 4287208    0    0     0    11    2    2  9  0 90  0  0
 0  0      0 1579832 171620 4287340    0    0     0     0 2871 2414 13  2 85  0  0
 0  0      0 1578688 171620 4287344    0    0     0    40 2311 1700 18  1 82  0  0
 1  0      0 1578640 171620 4287348    0    0     0    48 1302 1020  5  0 95  0  0
...

查 CPU

$ uptime

Load average: 0.03 0.11 0.19
Load average: 一分鐘 五分鐘 十五分鐘內的平均負載
單核心,如果 Load average 是 1 表示負載 100%
多核心的話,因為 Load average 是所有 CPU 數加起來,所以數值可能會大於 1

$ sudo apt-get install sysstat

# 每個 CPU 的使用率
$ mpstat -P ALL 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015  _x86_64_ (32 CPU)
07:38:49 PM  CPU   %usr  %nice   %sys %iowait   %irq  %soft  %steal  %guest  %gnice  %idle
07:38:50 PM  all  98.47   0.00   0.75    0.00   0.00   0.00    0.00    0.00    0.00   0.78
07:38:50 PM    0  96.04   0.00   2.97    0.00   0.00   0.00    0.00    0.00    0.00   0.99
07:38:50 PM    1  97.00   0.00   1.00    0.00   0.00   0.00    0.00    0.00    0.00   2.00
07:38:50 PM    2  98.00   0.00   1.00    0.00   0.00   0.00    0.00    0.00    0.00   1.00
...

# 每個 process 的 CPU 使用率
$ pidstat 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015    _x86_64_    (32 CPU)
07:41:02 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
07:41:03 PM     0         9    0.00    0.94    0.00    0.94     1  rcuos/0
07:41:03 PM     0      4214    5.66    5.66    0.00   11.32    15  mesos-slave
07:41:03 PM     0      4354    0.94    0.94    0.00    1.89     8  java
07:41:03 PM     0      6521 1596.23    1.89    0.00 1598.11    27  java
...

查 Memory

$ free –m
             total       used       free     shared    buffers     cached
Mem:          7983       6443       1540          0        167       4192
-/+ buffers/cache:       2083       5900
Swap:            0          0          0

查 Disk

$ iostat -xz 1
Linux 3.13.0-49-generic (titanclusters-xxxxx)  07/14/2015  _x86_64_ (32 CPU)
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          73.96    0.00    3.73    0.03    0.06   22.21
Device:   rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvda        0.00     0.23    0.21    0.18     4.52     2.08    34.37     0.00    9.98   13.80    5.42   2.44   0.09
xvdb        0.01     0.00    1.02    8.94   127.97   598.53   145.79     0.00    0.43    1.78    0.28   0.25   0.25
xvdc        0.01     0.00    1.02    8.86   127.79   595.94   146.50     0.00    0.45    1.82    0.30   0.27   0.26

查 IO

$ sudo apt-get install dstat iotop

# 可以顯示哪些 process 在進行 io 操作
$ dstat --top-io --top-bio

# with –only option to see only processes or threads actually doing I/O
$ sudo iotop --only

ref:
https://www.cyberciti.biz/hardware/linux-iotop-simple-top-like-io-monitor/

查 Network

$ sar -n TCP,ETCP 1

查 Process

$ ps aux
$ pstree -a

# attach 到某個 process,查看 system call
# -t -- absolute timestamp
# -T -- print time spent in each syscall
# -s strsize -- limit length of print strings to STRSIZE chars (default 32)
# -f -- follow forks
# -u username -- run command as username handling setuid and/or setgid
$ strace -t -T -f -p 1234

# 可以看到啟動 nginx 的過程中存取了哪些檔案
$ strace -f -e trace=file service nginx start

# 顯示 PID 3001 的 process 是用什麼指令和參數啟動的
$ tr '\0' '\n' < /proc/3001/cmdline

ref:
http://man7.org/linux/man-pages/man1/strace.1.html
https://blogs.oracle.com/ksplice/strace-the-sysadmins-microscope

查 Logs

# 顯示最近的 15 筆 system messages
$ dmesg | tail -fn 15

查 Nginx

# 顯示各個 status code 的數量
$ cat access.log | cut -d '"' -f3 | cut -d ' ' -f2 | sort | uniq -c | sort -rn

# 顯示哪些 URL 的 404 數量最多
$ awk '($9 ~ /404/)' access.log | awk '{print $7}' | sort | uniq -c | sort -rn

# 顯示 2016/10/01 的 16:00 ~ 18:00 的 log
$ grep "01/Oct/2016:1[6-8]" access.log

# 顯示 2016/10/01 的 09:00 ~ 12:00 的 log
$ egrep "01/Oct/2016:(0[8-9]|1[0-2])" access.log

ref:
http://stackoverflow.com/questions/7575267/extract-data-from-log-file-in-specified-range-of-time
http://superuser.com/questions/848971/unix-command-to-grep-a-time-range

Linux commands cookbook

Linux commands cookbook

switch shell to another user

# the latter with "-" gets an environment as if another user just logged in
$ sudo su - ubuntu

change file's modify time

$ touch -m -d '1 Jan 2006 12:34' tmp
$ touch -m -d '1 Jan 2006 12:34' tmp/old_file.txt

ref:
https://www.unixtutorial.org/2008/11/how-to-update-atime-and-mtime-for-a-file-in-unix/

delete old files under a directory

$ find /data/storage/tmp/* -mtime +2 | xargs rm -Rf
$ find /data/storage/tmp/* -mtime +2 -exec rm {} \;

ref:
http://stackoverflow.com/questions/14731133/how-to-delete-all-files-older-than-3-days-when-argument-list-too-long

append string to file in command line

# append
$ echo "the last line" >> README.md

# replace
$ echo "replace all" > README.md

rename sub-folders

$ for f in */migrations/; do mv -v "$f" "${f%???????????}south_migrations"; done

ref:
http://unix.stackexchange.com/questions/220176/rename-specific-level-of-sub-folders

list history commands

$ export HISTTIMEFORMAT="%Y%m%d %T  "
$ history

find public IP

$ wget -qO- http://ipecho.net/plain ; echo

ref:
http://askubuntu.com/questions/95910/command-for-determining-my-public-ip

count file lines

$ wc -l filename.txt

$ wc -l *.py

find files by name or content

$ find / -name virtualenvwrapper.sh

# 在現在的資料夾裡的全部檔案中搜尋字串,會自動搜尋子目錄
$ find . | xargs grep 'string'

$ find . -iname '*something*'

$ find *.html | xargs grep 'share_server.html'

# 搜尋當前目錄及子目錄下的含有 print() 字串的檔案
$ grep -rnw "." -e "print()"

$ grep -rnw "." -e "print()" --include=\*.py

ref:
https://stackoverflow.com/questions/16956810/how-do-i-find-all-files-containing-specific-text-on-linux

list files by date

$ ls -lrt

extract info from a file

$ cat uwsgi.log | grep error

display contents of all files in the current directory

$ grep . *
$ grep . *.html

list used ports

# list open files for a process
$ lsof | grep uwsgi

$ lsof -i | grep LISTEN
$ lsof -i -n -P | grep LISTEN

# TCP
$ sudo netstat -ntlp | grep uwsgi

# UCP
$ sudo netstat -nulp

$ sudo netstat -nxlp

ping port

$ curl -I "10.148.70.84:9200"
$ curl -I "192.168.100.10:80"

$ sudo apt-get install nmap
$ nmap -p 4505 54.250.5.176
$ nmap -p 8000 10.10.100.70
$ nmap -p 5672 10.10.100.82

$ telnet 54.250.5.176 4505

ref:
http://stackoverflow.com/questions/12792856/what-ports-does-rabbitmq-use

show network traffic and bandwidth

$ tcpdump -i eth0

$ sudo apt-get install tcptrack
$ tcptrack -i eth0

ref:
http://askubuntu.com/questions/257263/how-to-display-network-traffic-in-terminal

list running processes

# show all processes
$ pstree -a

# also show pid
$ pstree -ap

# 列出前 10 個最佔記憶體的 processes
$ ps aux | sort -nk +4 | tail

# 列出 mysql 相關的 processes
$ ps aux | grep 'worker process'
$ ps aux | grep uwsgi

# 樹狀顯示
$ ps auxf

# 搜尋 process 並以樹狀結果顯示 parent process
$ ps f -opid,cmd -C python

kill processes

# 列出目前所有的正在記憶體當中的程序
$ ps aux

# 匹配字串
$ ps aux | grep "mongo"

# 幹掉它
$ kill PID

# kill all processes matching a name
$ sudo killall -9 httpd
$ sudo killall salt
$ sudo pkill -f runserver
Redirect stderr to stdout or a file

Redirect stderr to stdout or a file

0: stdin
1: stdout
2: stderr

# redirect stdout to a file
$ strace uptime > test.log
# equals to
$ strace uptime 1> test.log

# redirect stderr to a file
$ strace uptime 2> test.log

# redirect stderr to stdout
# if you write `strace uptime 2> 1`, it means that you redirect stderr to a file with the name 1
$ strace uptime 2>&1

# redirect stderr and stdout to a file
$ strace uptime &> test.log

# run a command in backgroud
$ long-running-command &

ref:
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html
https://www.cyberciti.biz/faq/redirecting-stderr-to-stdout/