sysctl: Linux System Tweaking

sysctl is a command-line tool for modifying kernel parameters at runtime on Linux.

ref:
http://man7.org/linux/man-pages/man8/sysctl.8.html

Usage

List All Parameters

$ sudo sysctl -a
$ sudo sysctl -a | grep tcp

The parameters available are those listed under /proc/sys/.

$ cat /proc/sys/net/core/somaxconn
1024

Show the Entry of a Specified Parameter

$ sudo sysctl net.core.somaxconn
net.core.somaxconn = 1024

Show the Value of a Specified Parameter

$ sysctl -n net.core.somaxconn
1024

Change a Specified Parameter

# Elasticsearch
$ sudo sysctl -w vm.max_map_count=262144

# Redis
$ sudo sysctl -w vm.overcommit_memory=1

ref:
https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
https://redis.io/topics/admin

Persistence

`sysctl -w` only modifies parameters at runtime; they revert to their default values after the system is restarted. You must write those settings to `/etc/sysctl.conf` to persist them.
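
Alternatively, on distributions with systemd, you can drop settings into `/etc/sysctl.d/` and reload everything with `sysctl --system` (the file name `99-custom.conf` below is arbitrary):

$ echo "vm.max_map_count = 262144" | sudo tee /etc/sysctl.d/99-custom.conf
$ sudo sysctl --system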

# Do less swapping
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2

# Prevent SYN flood DoS attacks. Applies to IPv6 as well, despite the name.
net.ipv4.tcp_syncookies = 1

# Prevents ip spoofing.
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1

# Only groups within this id range can use ping.
net.ipv4.ping_group_range=999 59999

# Redirects can potentially be used to maliciously alter hosts' routing tables.
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 1
net.ipv6.conf.all.accept_redirects = 0

# The source routing feature includes some known vulnerabilities.
net.ipv4.conf.all.accept_source_route = 0
net.ipv6.conf.all.accept_source_route = 0

# See RFC 1337
net.ipv4.tcp_rfc1337 = 1

# Enable IPv6 Privacy Extensions (see RFC4941 and RFC3041)
net.ipv6.conf.default.use_tempaddr = 2
net.ipv6.conf.all.use_tempaddr = 2

# Restart the computer 120 seconds after a kernel panic
kernel.panic = 120

# Users should not be able to create soft or hard links to files which they do not own. This mitigates several privilege escalation vulnerabilities.
fs.protected_hardlinks = 1
fs.protected_symlinks = 1

ref:
https://blog.runcloud.io/how-to-secure-your-linux-server/
https://www.percona.com/blog/2019/02/25/mysql-challenge-100k-connections/
https://www.nginx.com/blog/tuning-nginx/

Apply parameters from the configuration file:

$ sudo sysctl -p

Troubleshooting

OS error code 24: Too many open files

$ sudo vim /etc/sysctl.conf
fs.file-max = 601017

$ sudo sysctl -p

$ sudo vim /etc/security/limits.d/nofile.conf
* soft nofile 65535
* hard nofile 65535
root soft nofile 65535
root hard nofile 65535

$ ulimit -n 65535
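
To check that the new limit actually applies to a running process (the PID here is a placeholder):

$ cat /proc/12345/limits | grep "Max open files"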

OS error code 99: Cannot assign requested address

This happens to MySQL clients when there are no available local network ports left. You might want to set `net.ipv4.tcp_tw_reuse = 1` instead of `net.ipv4.tcp_tw_recycle = 1` (tcp_tw_recycle is dangerous for servers behind NAT and was removed in Linux 4.12).

$ sudo vim /etc/sysctl.conf
net.ipv4.tcp_tw_reuse = 1

$ sudo sysctl -p

ref:
https://www.percona.com/blog/2014/12/08/what-happens-when-your-application-cannot-open-yet-another-connection-to-mysql/
https://stackoverflow.com/questions/6426253/tcp-tw-reuse-vs-tcp-tw-recycle-which-to-use-or-both

Parameters are missing from `sysctl -a` or `/proc/sys`

Sometimes you might find that some parameters appear in neither `sysctl -a` nor `/proc/sys`.

You can find them in `/sys`:

$ echo "never" > /sys/kernel/mm/transparent_hugepage/enabled
$ echo "never" > /sys/kernel/mm/transparent_hugepage/defrag

$ cat /sys/kernel/mm/transparent_hugepage/enabled
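
The currently active value is shown in brackets:

always madvise [never]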

To persist them:

$ vim /etc/rc.local
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
   echo "never" > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
   echo "never" > /sys/kernel/mm/transparent_hugepage/defrag
fi

$ systemctl enable rc-local

If /etc/rc.local doesn't exist, create one and run chmod 755 /etc/rc.local (the script must be executable for systemd's rc-local service to pick it up).

ref:
https://redis.io/topics/admin
https://unix.stackexchange.com/questions/99154/disable-transparent-hugepages

Setup Scalable WordPress Sites on Kubernetes

This article is about how to deploy a scalable WordPress site on Google Kubernetes Engine.

Using the container version of the popular LEMP stack:

  • Linux (Docker containers)
  • NGINX
  • MySQL (Google Cloud SQL)
  • PHP (PHP-FPM)

Google Cloud Platform Pricing

Deploying a personal blog on Kubernetes sounds like overkill (I must admit, it does). Still, it is fun and excellent practice to containerize a traditional application like WordPress, which is harder than you might think. More importantly, the financial cost of running a Kubernetes cluster on GKE can be pretty low if you use preemptible VMs, which also means native Chaos Engineering!

ref:
https://cloud.google.com/pricing/list
https://cloud.google.com/sql/pricing
https://cloud.google.com/compute/all-pricing

Google Cloud SQL

Cloud SQL is the fully managed relational database service on Google Cloud, though it currently only supports MySQL 5.6 and 5.7.

You can simply create a MySQL instance with a few clicks on the Google Cloud Platform Console or with the CLI. It is recommended to enable Private IP, which allows connectivity over VPC networking and is never exposed to the public Internet. Nevertheless, you have to turn on Public IP if you would like to connect to it from your local machine; otherwise, you might see something like couldn't connect to "xxx": dial tcp 10.x.x.x:3307: connect: network is unreachable. Remember to set IP whitelists for Public IP.
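
For example, whitelisting a single client IP on an existing instance might look like this (the instance name and address below are placeholders):

$ gcloud sql instances patch my-instance --authorized-networks=203.0.113.5/32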

Connect to a Cloud SQL instance from your local machine:

$ gcloud components install cloud_sql_proxy
$ cloud_sql_proxy -instances=YOUR_INSTANCE_CONNECTION_NAME=tcp:0.0.0.0:3306

$ mysql --host 127.0.0.1 --port 3306 -u root -p

ref:
https://cloud.google.com/sql/docs/mysql
https://cloud.google.com/sql/docs/mysql/sql-proxy

Google Kubernetes Engine

The master of your Google Kubernetes Engine cluster is managed by GKE itself; as a result, you only need to provision and pay for worker nodes. There are no cluster management fees.

You can create a Kubernetes cluster on Google Cloud Platform Console or CLI, and there are some useful settings you might like to turn on:

Node Pools

Over-provisioning is human nature, so don't spend too much time on choosing the right machine type for your Kubernetes cluster at the beginning since you are very likely to overprovision without real usage data at hand. Instead, after deploying your workloads, you can find out the actual resource usage from Stackdriver Monitoring or GKE usage metering, then adjust your node pools.

Some useful node pool configurations (see the example command after this list):

  • Enable preemptible nodes
  • Access scopes > Set access for each API:
    • Enable Cloud SQL
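
A sketch of creating such a node pool with gcloud (the pool and cluster names are placeholders; the sql-admin scope alias grants Cloud SQL access):

$ gcloud container node-pools create my-preemptible-pool \
    --cluster YOUR_CLUSTER_NAME \
    --machine-type n1-standard-1 \
    --preemptible \
    --scopes gke-default,sql-admin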

After the cluster is created, you can now configure your kubectl:

$ gcloud container clusters get-credentials YOUR_CLUSTER_NAME --zone YOUR_SELECTED_ZONE --project YOUR_PROJECT_ID
$ kubectl get nodes

If you are not familiar with Kubernetes, check out The Incomplete Guide to Google Kubernetes Engine.

WordPress

Here comes the tricky part: containerizing a WordPress site is not as simple as pulling a Docker image and setting replicas: 10, since WordPress is a thoroughly stateful application. In particular:

  • MySQL Database
  • The wp-content folder

The dependency on MySQL is relatively easy to solve since it is an external service. Your MySQL database could be managed, self-hosted, single-machine, master-slave, or multi-master. However, horizontally scaling a database is another story, so we only focus on WordPress here.

The next one is the notorious wp-content folder, which includes plugins, themes, and uploads.

ref:
https://engineering.bitnami.com/articles/scaling-wordpress-in-kubernetes.html
https://dev.to/mfahlandt/scaling-properly-a-stateful-app-like-wordpress-with-kubernetes-engine-and-cloud-sql-in-google-cloud-27jh
https://thecode.co/blog/moving-wordpress-to-multiserver/

User-uploaded Media

Users (site owners, editors, or any logged-in users) can upload images or even videos on a WordPress site if you allow them to do so. It is best to copy those uploaded contents to Amazon S3 or Google Cloud Storage automatically after a user uploads a file. Also, don't forget to configure a CDN to point at your bucket. Luckily, there are already plugins for such tasks.

Both storage services support direct uploads: the file being uploaded goes to S3 or GCS directly without touching your servers, though you might need to write some code to achieve that.
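
For GCS, one way to implement direct uploads is to hand the client a signed URL it can PUT to; a minimal sketch with gsutil (the key file, bucket, and object names are placeholders):

$ gsutil signurl -m PUT -d 10m /path/to/service-account.json gs://my-bucket/uploads/image.jpg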

Pre-installed Plugins and Themes

You would usually deploy multiple WordPress Pods in Kubernetes, and each Pod has its own resources: CPU, memory, and storage. Anything written to a local volume is ephemeral and only exists within the Pod's lifecycle. When you install a new plugin through the WordPress admin dashboard, the plugin is only installed on the local disk of one of the Pods, the one serving your request at the time. Because of the nature of Service load balancing, your subsequent requests inevitably go to other Pods which do not have those plugin files, even though the plugin is marked as activated in the database, which causes an inconsistency issue.

There are two solutions for plugins and themes:

  1. A shared writable network filesystem mounted by each Pod
  2. An immutable Docker image which pre-installs every needed plugin and theme

For the first solution, you can either set up an NFS server, a Ceph cluster, or any other network-attached filesystem. An NFS server might be the simplest way, although it can also easily become a single point of failure in your architecture. Fortunately, managed network filesystem services are available from the major cloud providers, like Amazon EFS and Google Cloud Filestore. In fact, Kubernetes is able to provide the ReadWriteMany access mode for a PersistentVolume (the volume can be mounted as read-write by many nodes). Still, only a few types of Volume support it, and they don't include gcePersistentDisk and awsElasticBlockStore.

However, I personally adopt the second solution: creating Docker images that contain pre-installed plugins and themes through CI, since it is more immutable and avoids the network latency of NFS. Besides, I don't frequently install new plugins. Regrettably, some plugins might still write data to the local disk directly, and most of the time we cannot prevent it.

ref:
https://serverfault.com/questions/905795/dynamically-added-wordpress-plugins-on-kubernetes

Dockerfile

Here is a dead-simple script to download pre-defined plugins and themes, and you can use it in Dockerfile later:

#!/bin/bash
set -ex

mkdir -p plugins
for download_url in $(cat plugins.txt)
do
    curl -Ls $download_url -o plugin.zip
    unzip -oq plugin.zip -d plugins/
    rm -f plugin.zip
done

mkdir -p themes
for download_url in $(cat themes.txt)
do
    curl -Ls $download_url -o theme.zip
    unzip -oq theme.zip -d themes/
    rm -f theme.zip
done

plugins.txt and themes.txt look like this:

https://downloads.wordpress.org/plugin/prismatic.2.2.zip
https://downloads.wordpress.org/plugin/wp-githuber-md.1.11.8.zip
https://downloads.wordpress.org/plugin/wp-stateless.2.2.7.zip

Then you need to create a custom Dockerfile based on the official wordpress Docker image along with your customizations.

FROM wordpress:5.2.4-fpm as builder

WORKDIR /usr/src/wp-cli/
RUN curl -Os https://raw.githubusercontent.com/wp-cli/builds/gh-pages/phar/wp-cli.phar && \
    chmod +x wp-cli.phar && \
    mv wp-cli.phar wp

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    unzip && \
    apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /usr/src/app/
COPY wordpress/ /usr/src/app/
RUN chmod +x install.sh && \
    sh install.sh && \
    rm -rf \
    install.sh \
    plugins.txt \
    themes.txt

###

FROM wordpress:5.2.4-fpm

RUN mv "$PHP_INI_DIR/php.ini-production" "$PHP_INI_DIR/php.ini"
COPY php/custom.ini /usr/local/etc/php/conf.d/
COPY php-fpm/zz-docker.conf /usr/local/etc/php-fpm.d/

COPY --from=builder /usr/src/wp-cli/wp /usr/local/bin/
COPY --from=builder /usr/src/app/ /usr/src/wordpress/wp-content/
RUN cd /usr/src/wordpress/wp-content/ && \
    rm -rf \
    plugins/akismet/ \
    plugins/hello.php \
    themes/twentysixteen/ \
    themes/twentyseventeen/

# HACK: `101` is the user id of `nginx` user in `nginx:x.x.x-alpine` Docker image
# https://stackoverflow.com/questions/36824222/how-to-change-the-nginx-process-user-of-the-official-docker-image-nginx
RUN usermod -u 101 www-data && \
    groupmod -g 101 www-data

ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["php-fpm"]

The multiple FROM statements are for multi-stage builds.

See more details on the GitHub repository:
https://github.com/vinta/vinta.ws/tree/master/docker/code-blog

Google Cloud Build

Next, a small cloudbuild.yaml file to build Docker images in Google Cloud Build triggered by GitHub commits automatically.

substitutions:
  _BLOG_IMAGE_NAME: my-blog
steps:
- id: my-blog-cache-image
  name: gcr.io/cloud-builders/docker
  entrypoint: "/bin/bash"
  args:
   - "-c"
   - |
     docker pull asia.gcr.io/$PROJECT_ID/$_BLOG_IMAGE_NAME:$BRANCH_NAME || exit 0
  waitFor: ["-"]
- id: my-blog-build-image
  name: gcr.io/cloud-builders/docker
  args: [
    "build",
    "--cache-from", "asia.gcr.io/$PROJECT_ID/$_BLOG_IMAGE_NAME:$BRANCH_NAME",
    "-t", "asia.gcr.io/$PROJECT_ID/$_BLOG_IMAGE_NAME:$BRANCH_NAME",
    "-t", "asia.gcr.io/$PROJECT_ID/$_BLOG_IMAGE_NAME:$SHORT_SHA",
    "docker/my-blog/",
  ]
  waitFor: ["my-blog-cache-image"]
images:
- asia.gcr.io/$PROJECT_ID/$_BLOG_IMAGE_NAME:$SHORT_SHA

Just put it into the root directory of your GitHub repository. Don't forget to store Docker images near your server's location, in my case, asia.gcr.io.

Moreover, the official documentation recommends using --cache-from to speed up Docker builds.

ref:
https://cloud.google.com/container-registry/docs/pushing-and-pulling#tag_the_local_image_with_the_registry_name
https://cloud.google.com/cloud-build/docs/speeding-up-builds

Deployments

Finally, here come the Kubernetes manifests. The era of YAML developers.

WordPress, PHP-FPM, and NGINX

You can configure the WordPress site as a Deployment with an NGINX sidecar container which proxies to PHP-FPM via a UNIX socket.

ConfigMaps for both WordPress and NGINX:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-blog-wp-config
data:
  wp-config.php: |
    <?php
    define('DB_NAME', 'xxx');
    define('DB_USER', 'xxx');
    define('DB_PASSWORD', 'xxx');
    define('DB_HOST', 'xxx');
    define('DB_CHARSET', 'utf8mb4');
    define('DB_COLLATE', '');

    define('AUTH_KEY',         'xxx');
    define('SECURE_AUTH_KEY',  'xxx');
    define('LOGGED_IN_KEY',    'xxx');
    define('NONCE_KEY',        'xxx');
    define('AUTH_SALT',        'xxx');
    define('SECURE_AUTH_SALT', 'xxx');
    define('LOGGED_IN_SALT',   'xxx');
    define('NONCE_SALT',       'xxx');

    $table_prefix = 'wp_';

    define('WP_DEBUG', false);

    if (isset($_SERVER['HTTP_X_FORWARDED_PROTO']) && $_SERVER['HTTP_X_FORWARDED_PROTO'] === 'https') {
      $_SERVER['HTTPS'] = 'on';
    }

    // WORDPRESS_CONFIG_EXTRA
    define('AUTOSAVE_INTERVAL', 86400);
    define('WP_POST_REVISIONS', false);

    if (!defined('ABSPATH')) {
      define('ABSPATH', dirname( __FILE__ ) . '/');
    }

    require_once(ABSPATH . 'wp-settings.php');
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-blog-nginx-site
data:
  default.conf: |
    server {
      listen 80;
      root /var/www/html;
      index index.php;

      if ($http_user_agent ~* (GoogleHC)) { # https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#health_checks
        return 200;
      }

      location /blog/ { # WordPress is installed in a subfolder
        try_files $uri $uri/ /blog/index.php?q=$uri&$args;
      }

      location ~ [^/]\.php(/|$) {
        try_files $uri =404;
        fastcgi_split_path_info ^(.+?\.php)(/.*)$;
        include fastcgi_params;
        fastcgi_param HTTP_PROXY "";
        fastcgi_pass unix:/var/run/php-fpm.sock;
        fastcgi_index index.php;
        fastcgi_buffers 8 16k;
        fastcgi_buffer_size 32k;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_param PATH_INFO $fastcgi_path_info;
      }
    }

The wordpress image supports setting configurations through environment variables, though I prefer to store the whole wp-config.php in a ConfigMap, which is more convenient. It is also worth noting that you need to use the same set of WordPress secret keys (AUTH_KEY, LOGGED_IN_KEY, etc.) for all of your WordPress replicas. Otherwise, you might encounter login failures due to mismatched login cookies.
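
You can generate a fresh set of secret keys with the official WordPress secret-key service:

$ curl https://api.wordpress.org/secret-key/1.1/salt/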

Of course, you can use a base64 encoded (NOT ENCRYPTED!) Secret to store sensitive data.
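
For example, a sketch of storing the same wp-config.php as a Secret instead (the Secret name and file path are placeholders):

$ kubectl create secret generic my-blog-wp-config --from-file=wp-config.php=./wp-config.php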

ref:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/
https://kubernetes.io/docs/concepts/configuration/secret/

Service:

apiVersion: v1
kind: Service
metadata:
  name: my-blog
spec:
  selector:
    app: my-blog
  type: NodePort
  ports:
  - name: http
    port: 80
    targetPort: http

ref:
https://kubernetes.io/docs/concepts/services-networking/service/

Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-blog
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-blog
  template:
    metadata:
      labels:
        app: my-blog
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100 # prevent the scheduler from locating two pods on the same node
            podAffinityTerm:
              topologyKey: kubernetes.io/hostname
              labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                    - my-blog
      volumes:
      - name: php-fpm-unix-socket
        emptyDir:
          medium: Memory
      - name: wordpress-root
        emptyDir:
          medium: Memory
      - name: my-blog-wp-config
        configMap:
          name: my-blog-wp-config
      - name: my-blog-nginx-site
        configMap:
          name: my-blog-nginx-site
      containers:
      - name: wordpress
        image: asia.gcr.io/YOUR_PROJECT_ID/YOUR_IMAGE_NAME:YOUR_IMAGE_TAG
        workingDir: /var/www/html/blog # HACK: specify the WordPress installation path: subfolder
        volumeMounts:
        - name: php-fpm-unix-socket
          mountPath: /var/run
        - name: wordpress-root
          mountPath: /var/www/html/blog
        - name: my-blog-wp-config
          mountPath: /var/www/html/blog/wp-config.php
          subPath: wp-config.php
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi
      - name: nginx
        image: nginx:1.17.5-alpine
        volumeMounts:
        - name: php-fpm-unix-socket
          mountPath: /var/run
        - name: wordpress-root
          mountPath: /var/www/html/blog
          readOnly: true
        - name: my-blog-nginx-site
          mountPath: /etc/nginx/conf.d/
          readOnly: true
        ports:
        - name: http
          containerPort: 80
        resources:
          requests:
            cpu: 50m
            memory: 100Mi
          limits:
            cpu: 100m
            memory: 100Mi

Setting podAntiAffinity is important for running apps on Preemptible nodes.

Pro tip: you can set emptyDir.medium: Memory to mount a tmpfs (RAM-backed filesystem) for Volumes.

ref:
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

CronJob

WP-Cron is the way WordPress handles scheduling time-based tasks. The problem is how WP-Cron works: on every page load, a list of scheduled tasks is checked to see what needs to be run. Therefore, you might consider replacing WP-Cron with a regular Kubernetes CronJob.

// in wp-config.php
define('DISABLE_WP_CRON', true);

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-blog-wp-cron
spec:
  schedule: "0 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
          - name: my-blog-wp-config
            configMap:
              name: my-blog-wp-config
          containers:
          - name: wp-cron
            image: asia.gcr.io/YOUR_PROJECT_ID/YOUR_IMAGE_NAME:YOUR_IMAGE_TAG
            command: ["/usr/local/bin/php"]
            args:
            - /usr/src/wordpress/wp-cron.php
            volumeMounts:
            - name: my-blog-wp-config
              mountPath: /usr/src/wordpress/wp-config.php
              subPath: wp-config.php
              readOnly: true
          restartPolicy: OnFailure

ref:
https://developer.wordpress.org/plugins/cron/

Ingress

Lastly, you would need external access to Services in your Kubernetes cluster:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: load-balancer
  annotations:
    kubernetes.io/ingress.class: "gce" # https://github.com/kubernetes/ingress-gce
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /blog/*
        backend:
          serviceName: my-blog
          servicePort: http
      - backend:
          serviceName: frontend
          servicePort: http

There is a default NGINX Deployment to serve requests other than WordPress.

See more details on the GitHub repository:
https://github.com/vinta/vinta.ws/tree/master/kubernetes

ref:
https://kubernetes.io/docs/concepts/services-networking/ingress/
https://cloud.google.com/kubernetes-engine/docs/concepts/ingress

SSL Certificates

HTTPS is absolutely required nowadays. There are solutions that automatically provision and manage TLS certificates for you.

Conclusions

If a picture is worth a thousand words, then a video is worth a million. This video accurately describes how we ultimately deploy a WordPress site on Kubernetes.

mitmproxy: proxy any network traffic through your local machine

mitmproxy is your Swiss Army knife of an interactive HTTP/HTTPS proxy: it can be used to intercept, inspect, modify, and replay web traffic such as HTTP/1, HTTP/2, WebSockets, or any other SSL/TLS-protected protocol.

Moreover, mitmproxy has a powerful Python API that offers full control over any intercepted request and response.

ref:
https://mitmproxy.org/
https://docs.mitmproxy.org/stable/

Concept

ref:
https://docs.mitmproxy.org/stable/concepts-howmitmproxyworks/

Installation

$ brew install mitmproxy

$ mitmproxy --version
Mitmproxy: 4.0.4
Python:    3.7.0
OpenSSL:   OpenSSL 1.0.2p  14 Aug 2018
Platform:  Darwin-18.0.0-x86_64-i386-64bit

ref:
https://docs.mitmproxy.org/stable/overview-installation/

Configuration

Make your computer the man in the middle of the attack.

macOS

$ ipconfig getifaddr en0
192.168.0.128

$ mitmproxy -p 8888
# or
$ mitmweb -p 8888
$ open http://127.0.0.1:8081/

Flow List keys:

  • ?: Show help
  • q: Exit the current view
  • f: Set view filter
  • r: Replay this flow
  • i: Set intercept filter
  • hjkl or arrow: Move left/down/up/right
  • enter: Select

Flow Details keys:

  • tab: Select next
  • m: Set flow view mode
  • e: Edit this flow (request or response)
  • a: Accept this intercepted flow

ref:
https://docs.mitmproxy.org/stable/tools-mitmproxy/
https://github.com/mitmproxy/mitmproxy/blob/master/mitmproxy/tools/console/defaultkeys.py

iOS

  • Go to Settings > Wi-Fi > Your Wi-Fi > Configure Proxy
    • Select Manual, enter the following values:
      • Server: 192.168.0.128
      • Port: 8888
      • Authentication: unchecked
  • Open http://mitm.it/ on Safari
    • Install the corresponding certificate for your device
  • Go to Settings > General > About > Certificate Trust Settings
    • Turn on the mitmproxy certificate
  • Open any app you want to watch

ref:
https://docs.mitmproxy.org/stable/concepts-certificates/

Usage

The most exciting feature is that you can alter any request and response with a Python script, via mitmdump -s.
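
A minimal sketch of such a script, which just logs every request URL (the file name here is arbitrary):

# log_urls.py
from mitmproxy import ctx
from mitmproxy import http

def request(flow: http.HTTPFlow) -> None:
    ctx.log.info(flow.request.pretty_url)

Run it with:

$ mitmdump -p 8888 -s log_urls.py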

ref:
https://docs.mitmproxy.org/stable/tools-mitmdump/
https://github.com/mitmproxy/mitmproxy/tree/master/examples

Deal With Certificate Pinning

You can use your own certificate by passing the --certs example.com=/path/to/example.com.pem option to mitmproxy. Mitmproxy then uses the provided certificate for interception of the specified domain.

The certificate file is expected to be in PEM format, which roughly looks like this:

-----BEGIN PRIVATE KEY-----
<private key>
-----END PRIVATE KEY-----

-----BEGIN CERTIFICATE-----
<cert>
-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----
<intermediary cert (optional)>
-----END CERTIFICATE-----

$ mitmproxy -p 8888 --certs example.com=example.com.pem

ref:
https://docs.mitmproxy.org/stable/concepts-certificates/#using-a-custom-server-certificate

Redirect Requests To Your Local Development Server

# redirect_to_localhost.py
from mitmproxy import ctx
from mitmproxy import http

REMOTE_HOST = 'api.example.com'
DEV_HOST = '192.168.0.128'
DEV_PORT = 8000

def request(flow: http.HTTPFlow) -> None:
    if flow.request.pretty_host in [REMOTE_HOST, DEV_HOST]:
        ctx.log.info('=== request')
        ctx.log.info(str(flow.request.headers))
        ctx.log.info(f'content: {str(flow.request.content)}')

        flow.request.scheme = 'http'
        flow.request.host = DEV_HOST
        flow.request.port = DEV_PORT

def response(flow: http.HTTPFlow) -> None:
    if flow.request.pretty_host == DEV_HOST:
        ctx.log.info('=== response')
        ctx.log.info(str(flow.response.headers))
        if flow.response.headers.get('Content-Type', '').startswith('image/'):
            return
        ctx.log.info(f'body: {str(flow.response.get_content())}')

ref:
https://discourse.mitmproxy.org/t/reverse-mode-change-request-host-according-to-the-sni-https/466

You can use a negative-lookahead regex with --ignore-hosts to watch only specific domains. Of course, you are still able to blacklist any domains you don't want: --ignore-hosts 'apple.com|icloud.com|itunes.com|facebook.com|googleapis.com|crashlytics.com'.

Currently, changing the host for HTTP/2 connections is not allowed, but you can simply disable HTTP/2 proxying to work around the issue if you don't need HTTP/2 for local development.

$ mitmdump -p 8888 \
--certs example.com=example.com.pem \
-v --flow-detail 3 \
--ignore-hosts '^(?!.*example\.com)' \
--no-http2 \
-s redirect_to_localhost.py

ref:
https://stackoverflow.com/questions/29414158/regex-negative-lookahead-with-wildcard

MongoDB operations: Replica Set

A replica set is a group of servers (mongod actually) that maintain the same data set, with one primary which takes client requests, and multiple secondaries that keep copies of the primary's data. If the primary crashes, secondaries can elect a new primary from amongst themselves.

Replication from primary to secondaries is asynchronous.

ref:
https://docs.mongodb.com/v3.6/replication/
https://www.safaribooksonline.com/library/view/mongodb-the-definitive/9781491954454/ch08.html
https://www.percona.com/blog/2018/10/10/mongodb-replica-set-scenarios-and-internals/

Concepts

  • Primary: A node that accepts writes and is the leader for voting. There can be only one primary.
  • Secondary: A node that replicates from the primary or another secondary and can be used for reads. A replica set can have up to 50 members in total.
  • Arbiter: A node that does not hold data and only participates in voting. It cannot be elected as the primary.
    • In the event your data-bearing node count is an even number, add one of these to break ties. Never add one where it would make the voting count even.
  • Priority 0 node: A node that cannot be elected as the primary. You might want to lower the priority of some slow nodes.
    • Priority allows you to prefer that specific nodes become primary.
  • Vote 0 node: A node that does not participate in voting.
    • Since at most 7 members can vote, additional members in large replica sets must be non-voting.
  • Hidden node: A hidden node must be a priority 0 node and is invisible to client drivers, so it cannot take queries from clients.
  • Delayed node: A delayed node must be a hidden node, and its data lags behind the primary for some time (see the reconfiguration example after this list).
  • Tags: Grant the ability to route queries directly to specific nodes. Useful for BI, geo-locality, and other advanced use cases.
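
For example, a sketch of turning member 2 into a hidden, delayed member from the mongo shell (the member index is illustrative; slaveDelay is the field name in the MongoDB 3.6 era):

> cfg = rs.conf()
> cfg.members[2].priority = 0
> cfg.members[2].hidden = true
> cfg.members[2].slaveDelay = 3600
> rs.reconfig(cfg)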

ref:
https://docs.mongodb.com/manual/core/replica-set-elections/
https://docs.mongodb.com/manual/core/replica-set-priority-0-member/
https://docs.mongodb.com/manual/core/replica-set-hidden-member/
https://docs.mongodb.com/manual/core/replica-set-delayed-member/

Common Architectures

ref:
https://docs.mongodb.com/v3.6/core/replica-set-architectures/
https://www.percona.com/blog/2018/03/22/the-anatomy-of-a-mongodb-replica-set/

Three-Node Replica Set: Primary, Secondary, Secondary

ref:
https://docs.mongodb.com/v3.6/tutorial/deploy-replica-set/
https://docs.mongodb.com/v3.6/tutorial/expand-replica-set/

If you are running a MongoDB cluster on Kubernetes, PLEASE USE THE FULL DNS NAME (FQDN). DO NOT use something like pod-name.service-name.

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
> rs.initiate({
   _id : "rs0",
   members: [
      {_id: 0, host: "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017"},
      {_id: 1, host: "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017"},
      {_id: 2, host: "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017"}
   ]
})
{
    "ok" : 1,
    "operationTime" : Timestamp(1531223087, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531223087, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
rs0:PRIMARY> db.isMaster()

ref:
https://docs.mongodb.com/v3.6/reference/method/rs.initiate/

$ mongo mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local
rs0:SECONDARY> rs.slaveOk()
rs0:SECONDARY> show dbs
rs0:SECONDARY> rs.conf()
{
    "_id" : "rs0",
    "version" : 1,
    "protocolVersion" : NumberLong(1),
    "members" : [
        {
            "_id" : 0,
            "host" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5b449c2f9269bb1a807a8cdf")
    }
}
rs0:SECONDARY> rs.status()
{
    "set" : "rs0",
    "date" : ISODate("2018-07-10T11:47:48.474Z"),
    "myState" : 1,
    "term" : NumberLong(1),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        },
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1531223260, 1),
            "t" : NumberLong(1)
        }
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 381,
            "optime" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-07-10T11:47:40Z"),
            "electionTime" : Timestamp(1531223098, 1),
            "electionDate" : ISODate("2018-07-10T11:44:58Z"),
            "configVersion" : 1,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 181,
            "optime" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-07-10T11:47:40Z"),
            "optimeDurableDate" : ISODate("2018-07-10T11:47:40Z"),
            "lastHeartbeat" : ISODate("2018-07-10T11:47:46.599Z"),
            "lastHeartbeatRecv" : ISODate("2018-07-10T11:47:47.332Z"),
            "pingMs" : NumberLong(0),
            "syncingTo" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "configVersion" : 1
        },
        {
            "_id" : 2,
            "name" : "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 181,
            "optime" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1531223260, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2018-07-10T11:47:40Z"),
            "optimeDurableDate" : ISODate("2018-07-10T11:47:40Z"),
            "lastHeartbeat" : ISODate("2018-07-10T11:47:46.599Z"),
            "lastHeartbeatRecv" : ISODate("2018-07-10T11:47:47.283Z"),
            "pingMs" : NumberLong(0),
            "syncingTo" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "configVersion" : 1
        }
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1531223260, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531223260, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

Three-Node Replica Set: Primary, Secondary, Arbiter

If your replica set has an even number of members, add an arbiter to obtain a majority of votes in an election for primary. Arbiters do not require dedicated hardware.

ref:
https://docs.mongodb.com/v3.6/tutorial/add-replica-set-arbiter/

Issues

Change Replica Set Name

  1. Start mongod without --replSet
  2. Run use local and then db.system.replset.remove({_id: 'oldReplicaSetName'}) in the MongoDB shell
  3. Start mongod with --replSet "newReplicaSetName"

ref:
https://stackoverflow.com/questions/33400607/how-do-i-rename-a-mongodb-replica-set

InvalidReplicaSetConfig: Our replica set configuration is invalid or does not include us

$ kubectl logs -f mongodb-rs0-0
REPL_HB [replexec-10] Error in heartbeat (requestId: 20048) to mongodb-rs0-2.mongodb-rs0:27017, response status: InvalidReplicaSetConfig: Our replica set configuration is invalid or does not include us
$ mongo mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local
rs0:OTHER> rs.status()
{
    "state" : 10,
    "stateStr" : "REMOVED",
    "uptime" : 631,
    "optime" : {
        "ts" : Timestamp(1531224140, 1),
        "t" : NumberLong(1)
    },
    "optimeDate" : ISODate("2018-07-10T12:02:20Z"),
    "ok" : 0,
    "errmsg" : "Our replica set config is invalid or we are not a member of it",
    "code" : 93,
    "codeName" : "InvalidReplicaSetConfig",
    "operationTime" : Timestamp(1531224140, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531224790, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
rs0:PRIMARY> rs.conf() 
{
    "_id" : "rs0",
    "version" : 9,
    "protocolVersion" : NumberLong(1),
    "members" : [
        {
            "_id" : 0,
            "host" : "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5b449c2f9269bb1a807a8cdf")
    }
}

The faulty member's state is REMOVED (it was once in a replica set but was subsequently removed) and it shows Our replica set config is invalid or we are not a member of it. In fact, the real issue is that the removed node is still in the list of replica set members.

You could just manually remove the broken node from the replica set on the primary, restart the node, and re-add the node.

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
rs0:PRIMARY> rs.remove("mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017")

# restart the Pod
$ kubectl delete pod mongodb-rs0-2

$ mongo mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local
rs0:PRIMARY> rs.add("mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017")

ref:
https://stackoverflow.com/questions/47439781/mongodb-replica-set-member-state-is-other
https://docs.mongodb.com/v3.6/tutorial/remove-replica-set-member/
https://docs.mongodb.com/manual/reference/replica-states/

db.isMaster(): Does not have a valid replica set config

rs0:OTHER> db.isMaster()
{
    "hosts" : [
        "mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017",
        "mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017",
        "mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local27017"
    ],
    "setName" : "rs0",
    "ismaster" : false,
    "secondary" : false,
    "info" : "Does not have a valid replica set config",
    "isreplicaset" : true,
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 100000,
    "localTime" : ISODate("2018-07-10T14:34:48.640Z"),
    "logicalSessionTimeoutMinutes" : 30,
    "minWireVersion" : 0,
    "maxWireVersion" : 6,
    "readOnly" : false,
    "ok" : 1,
    "operationTime" : Timestamp(1531232610, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1531232610, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

You could just re-configure the replica set and only keep reachable members.

rs0:OTHER> oldConf = rs.conf()
rs0:OTHER> oldConf.members = [oldConf.members[0]]
rs0:OTHER> rs.reconfig(oldConf, {force: true})
rs0:PRIMARY> rs.add("mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017")
rs0:PRIMARY> rs.add("mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017")

ref:
https://docs.mongodb.com/v3.6/tutorial/reconfigure-replica-set-with-unavailable-members/

Change Replica Set Name

  1. Stop mongod
  2. Start mongod --bind_ip_all --port 27017 --dbpath /data/db without --replSet
  3. Remove the old replica set name in the MongoDB shell:

use admin
db.getCollection('system.version').remove({_id: 'shardIdentity'})

use local
db.getCollection('system.replset').remove({_id: 'rs0'})

  4. Start mongod --bind_ip_all --port 27017 --dbpath /data/db --shardsvr --replSet sh0

ref:
https://stackoverflow.com/questions/33400607/how-do-i-rename-a-mongodb-replica-set

Connect To A Replica Set Cluster
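
A minimal pymongo sketch, assuming the replica set deployed above (the hosts and read preference are illustrative):

from pymongo import MongoClient

client = MongoClient(
    'mongodb://mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017,'
    'mongodb-rs0-1.mongodb-rs0.default.svc.cluster.local:27017,'
    'mongodb-rs0-2.mongodb-rs0.default.svc.cluster.local:27017/'
    '?replicaSet=rs0',
    readPreference='secondaryPreferred',  # allow reads from secondaries
)
print(client.admin.command('ismaster'))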

ref:
https://api.mongodb.com/python/current/examples/high_availability.html

Use Connection Pools
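
PyMongo maintains a connection pool per server; a sketch of tuning it (the values are illustrative):

from pymongo import MongoClient

client = MongoClient(
    'mongodb://mongodb-rs0-0.mongodb-rs0.default.svc.cluster.local:27017/?replicaSet=rs0',
    maxPoolSize=50,           # maximum concurrent connections per server
    waitQueueTimeoutMS=2000,  # fail fast when the pool is exhausted
)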

ref:
https://api.mongodb.com/python/current/faq.html#how-does-connection-pooling-work-in-pymongo

Apex and Terraform: The easiest way to manage AWS Lambda functions

AWS Lambda lets you run code without provisioning or managing servers, the model known as Serverless or Function as a Service (FaaS).

Apex is a Go command-line tool to manage and deploy your serverless functions on AWS Lambda. Apex is also integrated with Terraform to provide cloud infrastructure management, for instance, configuring your AWS Lambda functions with Amazon API Gateway.

ref:
https://aws.amazon.com/lambda/
https://aws.amazon.com/api-gateway/
https://github.com/apex/apex

You could browse projects created in this post on GitHub:
https://github.com/vinta/pangu.space
https://github.com/CodeTengu/LambdaBaku

Install

$ curl https://raw.githubusercontent.com/apex/apex/master/install.sh | sh

ref:
https://apex.run/#installation

Initialize

It is recommended to configure your AWS credentials with awscli.

$ pip install awscli
$ aws configure

ref:
https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html

To use Apex to manage Lambda functions, you have to make sure your AWS credentials have the following minimum IAM permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "iam:CreateRole",
        "iam:CreatePolicy",
        "iam:AttachRolePolicy",
        "iam:PassRole",
        "lambda:GetFunction",
        "lambda:ListFunctions",
        "lambda:CreateFunction",
        "lambda:DeleteFunction",
        "lambda:InvokeFunction",
        "lambda:GetFunctionConfiguration",
        "lambda:UpdateFunctionConfiguration",
        "lambda:UpdateFunctionCode",
        "lambda:CreateAlias",
        "lambda:UpdateAlias",
        "lambda:GetAlias",
        "lambda:ListAliases",
        "lambda:ListVersionsByFunction",
        "logs:FilterLogEvents",
        "cloudwatch:GetMetricStatistics"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

$ apex init

ref:
https://apex.run/#getting-started

After running apex init, Apex creates a Role and a Policy. You should be able to find them in the AWS IAM Management Console. If you want to access other AWS resources, for instance, S3 buckets, DynamoDB tables, or SNS, in your Lambda functions, you must create a new Policy that grants the appropriate permissions and attach it to the Role that Apex created.
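
A sketch of attaching a custom policy to that role with awscli (the role name and policy ARN are placeholders):

$ aws iam attach-role-policy \
    --role-name your-project_lambda_function \
    --policy-arn arn:aws:iam::123456789:policy/your-custom-policy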

Here is a Policy example of operating certain DynamoDB tables:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt123456789",
            "Effect": "Allow",
            "Action": [
                "dynamodb:*"
            ],
            "Resource": [
                "arn:aws:dynamodb:ap-northeast-1:123456789:table/CodeTengu_Preference",
                "arn:aws:dynamodb:ap-northeast-1:123456789:table/CodeTengu_Preference/*",
                "arn:aws:dynamodb:ap-northeast-1:123456789:table/CodeTengu_WeeklyIssue",
                "arn:aws:dynamodb:ap-northeast-1:123456789:table/CodeTengu_WeeklyIssue/*",
                "arn:aws:dynamodb:ap-northeast-1:123456789:table/CodeTengu_WeeklyPost",
                "arn:aws:dynamodb:ap-northeast-1:123456789:table/CodeTengu_WeeklyPost/*"
            ]
        }
    ]
}

Write Lambda Functions

ref:
https://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html
https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html

Node.js

The simplest handler:

const aws = require('aws-sdk');

exports.handle = (event, context, callback) => {
  doYourShit();
  callback(null, 'DONE');
};

ref:
https://docs.aws.amazon.com/lambda/latest/dg/programming-model.html

Call another Lambda function in a Lambda function:

You must make sure your Lambda role has permission to invoke other Lambda functions.

const util = require('util');

const aws = require('aws-sdk');

const lambda = new aws.Lambda();

const params = {
  FunctionName: 'LambdaBaku_syncIssue',
  InvocationType: 'Event', // means asynchronous execution
  Payload: JSON.stringify({ issue_number: curatedIssue.number }),
};

lambda.invoke(params, (err, data) => {
  if (err) {
    console.log('FAIL', params);
    console.log(util.inspect(err));
  } else {
    console.log(data);
  }
});

ref:
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Lambda.html
https://stackoverflow.com/questions/31714788/can-an-aws-lambda-function-call-another

Go

Write a Lambda function triggered by Amazon API Gateway:

package main

import (
    "encoding/json"
    "errors"
    "log"

    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/vinta/pangu"
)

var (
    // ErrTextNotProvided is thrown when text is not provided in HTTP query string
    ErrTextNotProvided = errors.New("No text was provided in HTTP query string")
)

// Handler is the AWS Lambda function handler
func Handler(request events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
    log.Printf("request id: %s\n", request.RequestContext.RequestID)

    text, ok := request.QueryStringParameters["t"]
    if !ok {
        errMap := map[string]string{
            "message": ErrTextNotProvided.Error(),
        }
        errMapJSON, _ := json.MarshalIndent(errMap, "", " ")

        return events.APIGatewayProxyResponse{
            Body: string(errMapJSON),
            StatusCode: 400,
        }, nil
    }

    log.Printf("text: %s\n", text)

    textPlainHeaders := map[string]string{
        "content-type": "text/plain; charset=utf-8",
    }

    return events.APIGatewayProxyResponse{
        Body: pangu.SpacingText(text),
        Headers: textPlainHeaders,
        StatusCode: 200,
    }, nil
}

func main() {
    lambda.Start(Handler)
}

ref:
https://aws.amazon.com/blogs/compute/announcing-go-support-for-aws-lambda/
https://docs.aws.amazon.com/lambda/latest/dg/go-programming-model-handler-types.html
https://docs.aws.amazon.com/lambda/latest/dg/go-programming-model-errors.html

Your "Integration Request" configurations in API Gateway should be like:

  • Integration type: Lambda Function
  • Use Lambda Proxy integration: Yes
  • Lambda Region: ap-northeast-1
  • Lambda Function: panguspace_spacing_text
  • Invoke with caller credentials: No
  • Credentials cache: Do not add caller credentials to cache key
  • Use Default Timeout: Yes

It's also worth noting that the API response is mainly defined by APIGatewayProxyResponse in Lambda function code. Configurations in API Gateway, i.e., "Integration Response" and "Method Response" do not matter.

ref:
https://docs.aws.amazon.com/apigateway/latest/developerguide/getting-started-with-lambda-integration.html

Usage

Deploy all functions:

$ apex deploy

ref:
https://apex.run/#deploying-functions

Invoke a function:

# invoke a function directly
$ apex invoke spacing_text --logs
{
    "statusCode": 400,
    "headers": null,
    "body":"{\"message\": \"No text was provided in the HTTP query string\"}"
}

# invoke a function with an API Gateway event
$ cat fixtures/spacing_text_event.json
{
    "queryStringParameters": {"t": "與PM戰鬥的人,應當小心自己不要成為PM"}
}
$ apex invoke spacing_text --logs < fixtures/spacing_text_event.json
{
    "statusCode": 200,
    "headers": {"content-type": "text/plain; charset=utf-8"},
    "body": "與 PM 戰鬥的人,應當小心自己不要成為 PM"
}

ref:
https://apex.run/#invoking-functions

View logs (they might be delayed by several seconds):

$ apex logs -f

Pack a function:

$ apex build spacing_text > spacing_text.zip

Configure API Gateway

Create API Keys

To set up API keys, do the following:

  1. Configure your API methods to require an API key
  2. Deploy your API
  3. Create an API key for the API in a region
  4. Create a Usage Plan and associate an API key with a certain Stage

In step 1, your "Method Request" configurations in API Gateway should be like:

  • Authorization: NONE
  • Request Validator: NONE
  • API Key Required: true

Now you are able to call the API with an x-api-key header:

$ curl -H "x-api-key: YOUR-API-KEY" https://xxx.execute-api.ap-northeast-1.amazonaws.com/v1/your-endpoint/

ref:
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-create-usage-plans-with-rest-api.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/how-to-use-postman-to-call-api.html

Actually, you could release your APIs without API keys if you like.

Setup a Custom Domain

To set up a custom domain managed by Cloudflare, see the following link:
https://stackoverflow.com/a/46061708/885524

It is worth noting that although the Stack Overflow answer says to use the Full (Strict) SSL mode, plain Full also works.

Moreover, it might take a long time to generate "Target Domain Name" (xxx.cloudfront.net).

Don't forget to add "Base Path Mappings" in API Gateway Custom Domain Names:

  • api.pangu.space
    • Target Domain Name: xxx.cloudfront.net
    • ACM Certificate: *.pangu.space
    • Base Path Mappings:
      • Path: /v1
      • Destination: Pangu:v1

Manage Infrastructures with Terraform

Terraform is a tool to manage your cloud infrastructures as code.

$ brew install terraform

$ tree .
.
├── functions
│   ├── introduce
│   │   └── main.go
│   └── spacing_text
│       └── main.go
└── infrastructure
    ├── main.tf
    └── variables.tf

Define variables and data sources:

# infrastructure/variables.tf
data "aws_caller_identity" "current" {}

variable "aws_region" {}
variable "apex_environment" {}
variable "apex_function_role" {}

variable "apex_function_arns" {
  type = "map"
}

variable "apex_function_names" {
  type = "map"
}

variable "apex_function_introduce" {}
variable "apex_function_spacing_text" {}

ref:
https://www.terraform.io/docs/providers/aws/d/caller_identity.html

Define AWS resources:

# infrastructure/main.tf
resource "aws_api_gateway_rest_api" "pangu" {
  name = "Pangu"
}

resource "aws_api_gateway_method" "pangu_root" {
  rest_api_id   = "${aws_api_gateway_rest_api.pangu.id}"
  resource_id   = "${aws_api_gateway_rest_api.pangu.root_resource_id}"
  http_method   = "GET"
  authorization = "NONE"
}

resource "aws_api_gateway_integration" "pangu_root_get" {
  rest_api_id             = "${aws_api_gateway_rest_api.pangu.id}"
  resource_id             = "${aws_api_gateway_rest_api.pangu.root_resource_id}"
  http_method             = "${aws_api_gateway_method.pangu_root.http_method}"
  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = "arn:aws:apigateway:${var.aws_region}:lambda:path/2015-03-31/functions/${var.apex_function_introduce}/invocations"
}

resource "aws_api_gateway_method_response" "pangu_root_get_200" {
  rest_api_id = "${aws_api_gateway_rest_api.pangu.id}"
  resource_id = "${aws_api_gateway_rest_api.pangu.root_resource_id}"
  http_method = "${aws_api_gateway_method.pangu_root.http_method}"
  status_code = "200"

  response_models = {
    "application/json" = "Empty"
  }

  response_parameters = {
    "method.response.header.Access-Control-Allow-Origin" = true
  }
}

resource "aws_api_gateway_resource" "pangu_spacing_text" {
  rest_api_id = "${aws_api_gateway_rest_api.pangu.id}"
  parent_id   = "${aws_api_gateway_rest_api.pangu.root_resource_id}"
  path_part   = "spacing-text"
}

resource "aws_api_gateway_method" "pangu_spacing_text_get" {
  rest_api_id      = "${aws_api_gateway_rest_api.pangu.id}"
  resource_id      = "${aws_api_gateway_resource.pangu_spacing_text.id}"
  http_method      = "GET"
  authorization    = "NONE"
  api_key_required = true
}

resource "aws_api_gateway_integration" "pangu_spacing_text_get" {
  rest_api_id             = "${aws_api_gateway_rest_api.pangu.id}"
  resource_id             = "${aws_api_gateway_resource.pangu_spacing_text.id}"
  http_method             = "${aws_api_gateway_method.pangu_spacing_text_get.http_method}"
  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = "arn:aws:apigateway:${var.aws_region}:lambda:path/2015-03-31/functions/${var.apex_function_spacing_text}/invocations"
}

resource "aws_api_gateway_method_response" "pangu_spacing_text_get_200" {
  rest_api_id = "${aws_api_gateway_rest_api.pangu.id}"
  resource_id = "${aws_api_gateway_resource.pangu_spacing_text.id}"
  http_method = "${aws_api_gateway_method.pangu_spacing_text_get.http_method}"
  status_code = "200"

  response_models = {
    "application/json" = "Empty"
  }

  response_parameters = {
    "method.response.header.Access-Control-Allow-Origin" = true
  }
}

resource "aws_api_gateway_deployment" "pangu" {
  depends_on = [
    "aws_api_gateway_method.pangu_root",
    "aws_api_gateway_integration.pangu_root_get",
    "aws_api_gateway_method_response.pangu_root_get_200",
    "aws_api_gateway_resource.pangu_spacing_text",
    "aws_api_gateway_method.pangu_spacing_text_get",
    "aws_api_gateway_integration.pangu_spacing_text_get",
    "aws_api_gateway_method_response.pangu_spacing_text_get_200",
  ]

  rest_api_id = "${aws_api_gateway_rest_api.pangu.id}"
  stage_name  = "v1"
}

resource "aws_lambda_permission" "pangu_root_get" {
  statement_id  = "AllowInvokeFromAPIGateway"
  action        = "lambda:InvokeFunction"
  function_name = "${var.apex_function_introduce}"
  principal     = "apigateway.amazonaws.com"

  source_arn = "arn:aws:execute-api:${var.aws_region}:${data.aws_caller_identity.current.account_id}:${aws_api_gateway_rest_api.pangu.id}/*/${aws_api_gateway_integration.pangu_root_get.http_method}/"
}

resource "aws_lambda_permission" "pangu_spacing_text" {
  statement_id  = "AllowInvokeFromAPIGateway"
  action        = "lambda:InvokeFunction"
  function_name = "${var.apex_function_spacing_text}"
  principal     = "apigateway.amazonaws.com"

  source_arn = "arn:aws:execute-api:${var.aws_region}:${data.aws_caller_identity.current.account_id}:${aws_api_gateway_rest_api.pangu.id}/*/${aws_api_gateway_integration.pangu_spacing_text_get.http_method}${aws_api_gateway_resource.pangu_spacing_text.path}"
}

ref:
https://www.terraform.io/docs/providers/aws/guides/serverless-with-aws-lambda-and-api-gateway.html

# download provider plugins
$ apex infra init

# view the generated execution plan
$ apex infra plan

# deploy your infrastructures
$ apex infra apply
$ apex infra apply -auto-approve

ref:
https://apex.run/#managing-infrastructure