`sysctl -w` only modifies parameters at runtime; they revert to their default values after the system restarts. To persist the settings, write them to `/etc/sysctl.conf`.
# Do less swapping
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
# Prevents SYN DOS attacks. Applies to ipv6 as well, despite name.
net.ipv4.tcp_syncookies = 1
# Prevents ip spoofing.
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
# Only groups within this id range can use ping.
net.ipv4.ping_group_range=999 59999
# Redirects can potentially be used to maliciously alter hosts routing tables.
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 1
net.ipv6.conf.all.accept_redirects = 0
# The source routing feature includes some known vulnerabilities.
net.ipv4.conf.all.accept_source_route = 0
net.ipv6.conf.all.accept_source_route = 0
# See RFC 1337
net.ipv4.tcp_rfc1337 = 1
# Enable IPv6 Privacy Extensions (see RFC4941 and RFC3041)
net.ipv6.conf.default.use_tempaddr = 2
net.ipv6.conf.all.use_tempaddr = 2
# Restart the computer 120 seconds after a kernel panic
kernel.panic = 120
# Users should not be able to create soft or hard links to files which they do not own. This mitigates several privilege escalation vulnerabilities.
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
$ sudo vim /etc/sysctl.conf
fs.file-max = 601017
$ sudo sysctl -p
$ sudo vim /etc/security/limits.d/nofile.conf
* soft nofile 65535
* hard nofile 65535
root soft nofile 65535
root hard nofile 65535
$ ulimit -n 65535
OS error code 99: Cannot assign requested address
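MySQL reports this error when there are no available local network ports left for new connections. You might need to set `net.ipv4.tcp_tw_reuse = 1` instead of `net.ipv4.tcp_tw_recycle = 1` (the latter breaks connections from clients behind NAT and was removed in Linux 4.12).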
$ vim /etc/rc.local
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo "never" > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo "never" > /sys/kernel/mm/transparent_hugepage/defrag
fi
$ systemctl enable rc-local
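If `/etc/rc.local` doesn't exist, create it, start it with a `#!/bin/sh` line, and run `chmod 755 /etc/rc.local`. The file must be executable, otherwise the rc-local service will not run it.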
This article is about how to deploy a scalable WordPress site on Google Kubernetes Engine.
Using the container version of the popular LEMP stack:
Linux (Docker containers)
NGINX
MySQL (Google Cloud SQL)
PHP (PHP-FPM)
Google Cloud Platform Pricing
Deploying a personal blog on Kubernetes sounds like overkill (I must admit, it does). Still, it is fun and excellent practice to containerize a traditional application like WordPress, which is harder than you might think. More importantly, the financial cost of running a Kubernetes cluster on GKE can be pretty low if you use preemptible VMs, which also means native Chaos Engineering!
Cloud SQL is the fully managed relational database service on Google Cloud, though it currently only supports MySQL 5.6 and 5.7.
You can create a MySQL instance with a few clicks in the Google Cloud Platform Console or with the CLI. It is recommended to enable Private IP, which keeps the instance inside your VPC network and never exposes it to the public Internet. Nevertheless, you have to turn on Public IP if you would like to connect to it from your local machine. Otherwise, you might see something like couldn't connect to "xxx": dial tcp 10.x.x.x:3307: connect: network is unreachable. Remember to restrict Public IP access to whitelisted addresses.
Connect to a Cloud SQL instance from your local machine:
The master of your Google Kubernetes Engine cluster is managed by GKE itself. As a result, you only need to provision and pay for worker nodes. No cluster management fees.
You can create a Kubernetes cluster on Google Cloud Platform Console or CLI, and there are some useful settings you might like to turn on:
Over-provisioning is human nature, so don't spend too much time on choosing the right machine type for your Kubernetes cluster at the beginning since you are very likely to overprovision without real usage data at hand. Instead, after deploying your workloads, you can find out the actual resource usage from Stackdriver Monitoring or GKE usage metering, then adjust your node pools.
Some useful node pool configurations:
Enable preemptible nodes
Access scopes > Set access for each API:
Enable Cloud SQL
After the cluster is created, you can now configure your kubectl:
Here comes the tricky part: containerizing a WordPress site is not as simple as pulling a Docker image and setting replicas: 10, since WordPress is a thoroughly stateful application. Two pieces of state matter in particular:
MySQL Database
The wp-content folder
The dependency on MySQL is relatively easy to solve since it is an external service. Your MySQL database could be managed, self-hosted, single machine, master-slave, or multi-master. However, horizontally scaling a database would be another story, so we only focus on WordPress now.
The next one is our notorious wp-content folder, which includes plugins, themes, and uploads.
Users (site owners, editors, or any logged-in users) can upload images or even videos to a WordPress site if you allow them to do so. It is best to copy uploaded files to Amazon S3 or Google Cloud Storage automatically right after a user uploads them. Also, don't forget to configure a CDN to point at your bucket. Luckily, there are already plugins for such tasks:
Both storage services support direct uploads: the uploading file goes to S3 or GCS directly without touching your servers, but you might need to write some code to achieve that.
Pre-installed Plugins and Themes
You would usually deploy multiple WordPress Pods in Kubernetes, and each Pod has its own resources: CPU, memory, and storage. Anything written to the local volume is ephemeral and only exists within the Pod's lifecycle. When you install a new plugin through the WordPress admin dashboard, the plugin is only installed on the local disk of one Pod: the one that served your request at the time. Because a Service load-balances across Pods, your subsequent requests will sooner or later reach other Pods that do not have those plugin files, even though the plugin is marked as activated in the database, which leaves the site in an inconsistent state.
There are two solutions for plugins and themes:
A shared writable network filesystem mounted by each Pod
An immutable Docker image which pre-installs every needed plugin and theme
For the first solution, you can set up an NFS server, a Ceph cluster, or any other network-attached filesystem. An NFS server might be the simplest way, although it could easily become a single point of failure in your architecture. Fortunately, managed network filesystem services are available from major cloud providers, like Amazon EFS and Google Cloud Filestore. In fact, Kubernetes provides the ReadWriteMany access mode for PersistentVolume (the volume can be mounted as read-write by many nodes). Still, only a few volume types support it, and gcePersistentDisk and awsElasticBlockStore are not among them.
However, I personally adopt the second solution: building Docker images that contain pre-installed plugins and themes through CI, since it keeps the deployment immutable and avoids the network latency of NFS. Besides, I don't install new plugins frequently. Regrettably, some plugins might still write data to the local disk directly, and most of the time we cannot prevent it.
Just put it into the root directory of your GitHub repository. Don't forget to store Docker images near your server's location, in my case, asia.gcr.io.
Moreover, it is recommended by the official documentation to use --cache-from for speeding up Docker builds.
The wordpress image supports setting configurations through environment variables, though I prefer to store the whole wp-config.php in ConfigMap, which is more convenient. It is also worth noting that you need to use the same set of WordPress secret keys (AUTH_KEY, LOGGED_IN_KEY, etc.) for all of your WordPress replicas. Otherwise, you might encounter login failures due to mismatched login cookies.
Of course, you can use a base64 encoded (NOT ENCRYPTED!) Secret to store sensitive data.
WP-Cron is the way WordPress handles scheduling time-based tasks. The problem is how WP-Cron works: on every page load, a list of scheduled tasks is checked to see what needs to be run. Therefore, you might consider replacing WP-Cron with a regular Kubernetes CronJob.
// in wp-config.php
define('DISABLE_WP_CRON', true);
If a picture is worth a thousand words, then a video is worth a million. This video accurately describes how we ultimately deploy a WordPress site on Kubernetes.
You write next-generation JavaScript code (ES6 or ES2018!) and use Babel to convert it to ES5. Even better, the new @babel/preset-env module can intelligently convert your next-generation ECMAScript code to compatible syntax based on browser compatibility statistics. So you don't have to target specific browser versions anymore!
@babel/preset-env transpiles your files to CommonJS by default, which requires the transpiled files to be included by require or import. To make the output compatible with your Chrome extension, you need to transpile the files as UMD modules.
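A minimal sketch of such a configuration, assuming Babel 7 and a `babel.config.js` file at the project root. The `targets` query below is only a placeholder, not a recommendation:

```javascript
// babel.config.js (assumed file name and location)
const babelConfig = {
  presets: [
    [
      '@babel/preset-env',
      {
        // emit UMD wrappers instead of CommonJS modules
        modules: 'umd',
        // hypothetical browserslist query; adjust to the browsers you support
        targets: 'last 2 Chrome versions',
      },
    ],
  ],
};

module.exports = babelConfig;
```

With `modules: 'umd'`, each transpiled file can be loaded with a plain script tag or by a module loader, so an extension page does not need require at runtime.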
Change Stream is a Change Data Capture (CDC) feature provided by MongoDB since v3.6. In layman's terms, it's a high-level API that allows you to subscribe to real-time notifications whenever there is a change in your MongoDB collections, databases, or the entire cluster, in an event-driven fashion.
Change Stream uses information stored in the oplog (operations log) to produce the change event. The oplog.rs is a special capped collection that keeps a rolling record of all insert, update, and remove operations that come into your MongoDB so other members of the Replica Set can copy them. Since Change Stream is built on top of the oplog, it is only available for Replica Sets and Sharded clusters.
The problem with most databases' replication logs is that they have long been considered to be an internal implementation detail of the database, not a public API (Martin Kleppmann, 2017).
Change Stream comes to the rescue!
Change Stream in a Sharded cluster
MongoDB has a global logical clock that enables the server to order all changes across a Sharded cluster.
To guarantee total ordering of changes, for each change notification the mongos checks with each shard to see if the shard has seen more recent changes. Sharded clusters with one or more shards that have little or no activity for the collection, or are "cold", can negatively affect the response time of the change stream as the mongos must still check with those cold shards to guarantee total ordering of changes.
There are some typical use cases of Change Stream:
Syncing fields between the source and denormalized collections to mitigate the data consistency issue.
Invalidating the cache.
Updating the search index.
Replicating data to a data warehouse.
Hooking up Change Stream to a generic streaming processing pipeline, e.g., Kafka or Spark Streaming.
How to open a Change Stream?
First of all, you must have a Replica Set or a Sharded cluster for your MongoDB deployment and make sure you are using the WiredTiger storage engine. If you don't, you might be using MongoDB all wrong.
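As a rough sketch, opening a stream with the Node.js driver comes down to calling `watch()` on a collection and listening for `change` events. The helper name `watchCollection` and the database and collection names in the usage comment are assumptions made for illustration:

```javascript
// Sketch only: `collection` is any object exposing the driver's
// Collection#watch(pipeline, options) method.
function watchCollection(collection, onChange, options = {}) {
  // an empty pipeline means "deliver every change event"
  const changeStream = collection.watch([], options);
  changeStream.on('change', onChange);
  changeStream.on('error', (err) => {
    console.error('change stream failed:', err);
  });
  return changeStream;
}

// Usage (hypothetical names, inside an async function):
// const client = await MongoClient.connect('mongodb://127.0.0.1:27017/?replicaSet=rs0');
// const stream = watchCollection(client.db('test').collection('user'), console.log);
```

Returning the stream lets the caller close it later with `stream.close()`.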
You could also enable 'fullDocument': 'updateLookup' which includes the entire document in each update event, but as the name says, it does a lookup which has an overhead and might exceed the 16MB limitation on BSON documents.
Also, the content of fullDocument may differ from the updateDescription if other majority-committed operations modified the document between the original update operation and the full document lookup. Be cautious when you use it.
Besides the regular insert, update, and delete events, there is also a replace event, which is triggered when an update operation replaces the whole document.
How to aggregate Change Stream events?
One of the advantages of Change Stream is that you are able to leverage MongoDB's powerful aggregation framework, which allows you to filter and modify the output of Change Stream.
However, there is a tricky part in update events: the field names and their contents in updateDescription.updatedFields might vary if the updated field is an array field. Assume that we have a tags field, which is a list of strings, in the user collection. You could try running the following code in the mongo shell:
Fortunately, to mitigate the tags and tags.2 problem, we could do some aggregation to $project and $match change events if we only want to listen to the change of the tags field:
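One possible shape for such a pipeline is sketched below. It is an assumption about a workable approach rather than a definitive implementation: it uses `$addFields` with `$objectToArray` (instead of `$project`) so the rest of the event stays intact, and the names `TAGS_KEY`, `tagsPipeline`, and `updatedFieldList` are made up for this example:

```javascript
// matches `tags` itself and positional keys such as `tags.2`
const TAGS_KEY = /^tags(\.\d+)?$/;

const tagsPipeline = [
  {
    // turn {"tags.2": "x"} into [{k: "tags.2", v: "x"}] so the keys can be matched
    $addFields: {
      updatedFieldList: { $objectToArray: '$updateDescription.updatedFields' },
    },
  },
  {
    // keep only update events that touched the tags field
    $match: {
      operationType: 'update',
      'updatedFieldList.k': TAGS_KEY,
    },
  },
];

// Usage: db.collection('user').watch(tagsPipeline)
```

Note that this sketch only looks at updatedFields; an update that removes the field entirely shows up in updateDescription.removedFields and would need an extra condition.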
Another critical feature of Change Stream is resumability. Since any service will inevitably restart or crash at some point, it is essential that we can resume from the point at which the Change Stream was interrupted.
A resumeAfter token is carried by every Change Stream event: the _id field whose value looks like {'_data': '825C4607870000000129295A1004AF1EE5355B7344D6B25478700E75259D46645F696400645C42176528578222B13ADEAA0004'}. In other words, the {'_data': 'a hex string'} is your resumeAfter token.
In practice, you should store each resumeAfter token somewhere, for instance, Redis, so that you can resume after a blackout or a restart. It is also a good idea to debounce the store function so that a burst of events does not trigger one write per event.
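A minimal sketch of a debounced token store follows. `createTokenSaver` and `persist` are hypothetical names; `persist` stands for whatever actually writes the token, for example a Redis SET call:

```javascript
// Sketch: `persist` is a hypothetical function that writes the token somewhere durable.
function createTokenSaver(persist, waitMs = 1000) {
  let pendingToken = null;
  let timer = null;

  // write the newest pending token immediately, if there is one
  const flush = () => {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (pendingToken !== null) {
      const token = pendingToken;
      pendingToken = null;
      persist(token);
    }
  };

  // remember only the newest token and restart the quiet-period timer
  const save = (token) => {
    pendingToken = token;
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(flush, waitMs);
  };

  return { save, flush };
}

// Usage:
// const saver = createTokenSaver((token) => console.log('persist', token));
// changeStream.on('change', (event) => saver.save(event._id));
// Call saver.flush() on shutdown so the last token is not lost.
```

A plain debounce never fires while events keep arriving faster than `waitMs`, so on a busy collection you may prefer a variant with an upper bound on the delay, such as the `maxWait` option of lodash's `debounce`.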
Another unusual (and not so reliable) way to get a resumeAfter token is composing one from the oplog.rs collection:
const _ = require('lodash');
const { MongoClient, ReadPreference } = require('mongodb');

const MONGO_URL = 'mongodb://127.0.0.1:27017/';

(async () => {
  const mongoClient = await MongoClient.connect(MONGO_URL, {
    appname: 'test',
    replicaSet: 'rs0',
    readPreference: ReadPreference.PRIMARY,
    useNewUrlParser: true,
  });

  // cannot use 'local' database through mongos
  const localDb = await mongoClient.db('local');

  // querying oplog.rs might take seconds
  const doc = await localDb.collection('oplog.rs')
    .findOne(
      {'ns': 'test.user'}, // dbName.collectionName
      {'sort': {'$natural': -1}},
    );

  // https://stackoverflow.com/questions/48665409/how-do-i-resume-a-mongodb-changestream-at-the-first-document-and-not-just-change
  // https://github.com/mongodb/mongo/blob/master/src/mongo/db/storage/key_string.cpp
  // https://github.com/mongodb/mongo/blob/master/src/mongo/bson/bsontypes.h
  const resumeAfterData = [
    '82', // unknown
    doc.ts.toString(16), // timestamp
    '29', // unknown
    '29', // unknown
    '5A', // CType::BinData
    '10', // length (16)
    '04', // BinDataType of newUUID
    doc.ui.toString('hex'), // the collection uuid (see `db.getCollectionInfos({name: 'user'})`)
    '46', // CType::Object
    '64', // CType::OID (vary from the type of the collection primary key)
    '5F', // _ (vary from the field name of the collection primary key)
    '69', // i
    '64', // d
    '00', // null
    '64', // CType::OID (vary from the type of document primary key)
    _.get(doc, 'o2._id', _.get(doc, 'o._id')).toString('hex'), // ObjectID, update operations have `o2` field and others have `o` field
    '00', // null
    '04', // unknown
  ].join('').toUpperCase();

  const options = {
    'resumeAfter': {
      '_data': resumeAfterData,
    },
  };
  console.log(options);

  const db = await mongoClient.db('test');
  const changeStream = db.collection('user').watch([], options);
  changeStream.on('change', (event) => {
    console.log(event);
  });
})();
startAtOperationTime
The startAtOperationTime option is only available in MongoDB 4.0+. It specifies the point in time from which the Change Stream starts. If that point is in the past, you must make sure it still falls within the time range covered by the oplog.
mitmproxy is your Swiss Army knife for interactive HTTP/HTTPS proxying. It can be used to intercept, inspect, modify, and replay web traffic such as HTTP/1, HTTP/2, WebSockets, or any other SSL/TLS-protected protocols.
Moreover, mitmproxy has a powerful Python API that offers full control over any intercepted request and response.
You can use your own certificate by passing the --certs example.com=/path/to/example.com.pem option to mitmproxy. Mitmproxy then uses the provided certificate for interception of the specified domain.
The certificate file is expected to be in the PEM format, which roughly looks like this:
You could use negative regex with --ignore-hosts to only watch specific domains. Of course, you are still able to blacklist any domain you don't want: --ignore-hosts 'apple.com|icloud.com|itunes.com|facebook.com|googleapis.com|crashlytics.com'.
Currently, changing the Host server for HTTP/2 connections is not allowed, but you could just disable HTTP/2 proxy to solve the issue if you don't need HTTP/2 for local development.