Amazon EKS: Setup Cluster Autoscaler

Amazon EKS: Setup Cluster Autoscaler

The Kubernetes's Cluster Autoscaler automatically adjusts the number of nodes in your cluster when pods fail or are rescheduled onto other nodes. Here we're going to deploy the AWS implementation that implements the decisions of Cluster Autoscaler by communicating with AWS products and services such as Amazon EC2.

ref:
https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html

Configure IAM Permissions

If your existing node groups were created with eksctl create nodegroup --asg-access, then this policy already exists and you can skip this step.

in cluster-autoscaler-policy-staging.json

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/k8s.io/cluster-autoscaler/perp-staging": "owned"
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "ec2:DescribeLaunchTemplateVersions"
            ],
            "Resource": "*"
        }
    ]
}
aws --profile perp iam create-policy \
  --policy-name AmazonEKSClusterAutoscalerPolicyStaging \
  --policy-document file://cluster-autoscaler-policy-staging.json

eksctl --profile=perp create iamserviceaccount \
  --cluster=perp-staging \
  --namespace=kube-system \
  --name=cluster-autoscaler \
  --attach-policy-arn=arn:aws:iam::xxx:policy/AmazonEKSClusterAutoscalerPolicy \
  --override-existing-serviceaccounts \
  --approve

ref:
https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html

Deploy Cluster Autoscaler

Download the deployment yaml of cluster-autoscaler:

curl -o cluster-autoscaler-autodiscover.yaml \
https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

Before you apply the file, it's recommended to check whether the version of cluster-autoscaler matches the Kubernetes major and minor version of your cluster. Find the version number on GitHub releases.

kubectl apply -f cluster-autoscaler-autodiscover.yaml
# or
kubectl set image deployment cluster-autoscaler \
  -n kube-system \
  cluster-autoscaler=k8s.gcr.io/autoscaling/cluster-autoscaler:v1.21.3

Do some tweaks:

  • --balance-similar-node-groups ensures that there is enough available compute across all availability zones.
  • --skip-nodes-with-system-pods=false ensures that there are no problems with scaling to zero.
kubectl patch deployment cluster-autoscaler \
  -n kube-system \
  -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"}}}}}'

kubectl -n kube-system edit deployment.apps/cluster-autoscaler
# change the command to the following:
# spec:
#   containers:
#   - command
#     - ./cluster-autoscaler
#     - --v=4
#     - --stderrthreshold=info
#     - --cloud-provider=aws
#     - --skip-nodes-with-local-storage=false
#     - --expander=least-waste
#     - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/perp-staging
#     - --balance-similar-node-groups
#     - --skip-nodes-with-system-pods=false

ref:
https://aws.github.io/aws-eks-best-practices/cluster-autoscaling/
https://github.com/kubernetes/autoscaler/tree/master/charts/cluster-autoscaler#additional-configuration

Amazon EKS: Setup aws-load-balancer-controller for Kubernetes Ingress

Amazon EKS: Setup aws-load-balancer-controller for Kubernetes Ingress

AWS Load Balancer Controller replaces the functionality of the AWS ALB Ingress Controller.

Install aws-load-balancer-controller

Create an IAM OIDC provider for your cluster

eksctl utils associate-iam-oidc-provider --profile=perp \
  --region ap-northeast-1 \
  --cluster perp-staging \
  --approve

ref:
https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html

Create a Kubernetes ServiceAccount for the Controller

curl -o iam_policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.3.0/docs/install/iam_policy.json

aws iam create-policy --profile=perp \
  --policy-name AWSLoadBalancerControllerIAMPolicy \
  --policy-document file://iam_policy.json

eksctl create iamserviceaccount --profile=perp \
  --cluster=perp-staging \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --attach-policy-arn=arn:aws:iam::XXX:policy/AWSLoadBalancerControllerIAMPolicy \
  --override-existing-serviceaccounts \
  --approve

re:
https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html
https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.3/deploy/installation/

Deploy aws-load-balancer-controller

helm repo add eks https://aws.github.io/eks-charts

helm ls -A

helm upgrade -i \
  aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=perp-staging \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller \
  --set region=ap-northeast-1 \
  --set vpcId=vpc-XXX

ref:
https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html#helm-v3-or-later
https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.3/deploy/installation/
https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.3/deploy/configurations/
https://github.com/aws/eks-charts/tree/master/stable/aws-load-balancer-controller

Create an Ingress using ALB

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: graph-node-ingress
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-FS-1-2-Res-2020-10
    alb.ingress.kubernetes.io/certificate-arn: YOUR_ACM_ARN
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/target-group-attributes: stickiness.enabled=true,stickiness.lb_cookie.duration_seconds=60
    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=3600,deletion_protection.enabled=true
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
    alb.ingress.kubernetes.io/actions.graph-node-jsonrpc: >
      {"type": "forward", "forwardConfig": {"targetGroups": [
        {"serviceName": "graph-node", "servicePort": 8000, "weight": 100}
      ]}}
    alb.ingress.kubernetes.io/actions.graph-node-websocket: >
      {"type": "forward", "forwardConfig": {"targetGroups": [
        {"serviceName": "graph-node", "servicePort": 8001, "weight": 100}
      ]}}
spec:
  rules:
  - host: "subgraph-api.example.com"
    http:
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: graph-node-jsonrpc
              port:
                name: use-annotation
  - host: "subgraph-ws.example.com"
    http:
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: graph-node-websocket
              port:
                name: use-annotation

ref:
https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.3/guide/ingress/annotations/

kubectl apply -f graph-node-ingress.yaml -R

Then create Route 53 records for the above domains, and point them to the newly created ALB.

Amazon EKS: Setup kubernetes-external-secrets with AWS Secret Manager

Amazon EKS: Setup kubernetes-external-secrets with AWS Secret Manager

kubernetes-external-secrets allows you to use external secret management systems, like AWS Secrets Manager, to securely add Secrets in Kubernetes, so Pods can access Secrets normally.

ref:
https://github.com/external-secrets/kubernetes-external-secrets

AWS Secret Manager

For instance, we create a secret named YOUR_SECRET on AWS Secret Manager in the same region as our EKS cluster, using DefaultEncryptionKey as the encryption key. The content of the secret entity look like:

{
  "KEY_1": "VALUE_1",
  "KEY_2": "VALUE_2",
}

We can retrieve the secret value:

aws secretsmanager get-secret-value --profile=perp \
--region ap-northeast-1 \
--secret-id YOUR_SECRET

kubernetes-external-secrets

For kubernetes-external-secrets to work properly, it must be granted access to AWS Secrets Manager. To achieve that, we need to create an IAM role for kubernetes-external-secrets' service account.

ref:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

Configure Secrets Backends

Create an IAM OIDC provider for the cluster:

eksctl utils associate-iam-oidc-provider --profile=perp \
--region ap-northeast-1 \
--cluster perp-staging \
--approve

aws iam list-open-id-connect-providers --profile=perp

ref:
https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html

Create an IAM policy that allows the role to access all secrets we created on AWS Secret Manager:

AWS_ACCOUNT_ID=$(aws sts get-caller-identity --profile=perp --query "Account" --output text)

cat <<EOF > policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetResourcePolicy",
        "secretsmanager:GetSecretValue",
        "secretsmanager:DescribeSecret",
        "secretsmanager:ListSecretVersionIds"
      ],
      "Resource": [
        "arn:aws:secretsmanager:ap-northeast-1:${AWS_ACCOUNT_ID}:secret:*"
      ]
    }
  ]
}
EOF

aws iam create-policy --profile=perp \
--policy-name perp-staging-secrets-policy --policy-document file://policy.json

Attach the above IAM policy to an IAM role, and define AssumeRole for the service account external-secrets-kubernetes-external-secrets which will be created later:

AWS_ACCOUNT_ID=$(aws sts get-caller-identity --profile=perp --query "Account" --output text)
OIDC_PROVIDER=$(aws eks describe-cluster --profile=perp --name perp-staging --region ap-northeast-1 --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///") 

cat <<EOF > trust.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com",
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:default:external-secrets-kubernetes-external-secrets"
        }
      }
    }
  ]
}
EOF

aws iam create-role --profile=perp \
--role-name perp-staging-secrets-role \
--assume-role-policy-document file://trust.json

aws iam attach-role-policy --profile=perp \
--role-name perp-staging-secrets-role \
--policy-arn YOUR_POLICY_ARN

ref:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
https://gist.github.com/lukaszbudnik/f1f42bd5a57430e3c25034200ba44c2e

Deploy kubernetes-external-secrets Controller

helm repo add external-secrets https://external-secrets.github.io/kubernetes-external-secrets/

helm install external-secrets \
external-secrets/kubernetes-external-secrets \
--skip-crds \
--set env.AWS_REGION=ap-northeast-1 \
--set securityContext.fsGroup=65534 \
--set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"='YOUR_ROLE_ARN'

helm list

It would automatically create a service account named external-secrets-kubernetes-external-secrets in Kubernetes.

ref:
https://github.com/external-secrets/kubernetes-external-secrets/tree/master/charts/kubernetes-external-secrets

Deploy ExternalSecret

ExternalSecret app-secrets will generate a Secret object with the same name, and the content would look like:

apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: example-secret
spec:
  backendType: secretsManager
  region: ap-northeast-1
  dataFrom:
    - YOUR_SECRET
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: busybox:latest
        envFrom:
        - secretRef:
            name: example-secret
kubectl get secret example-secret -o jsonpath="{.data.KEY_1}" | base64 --decode

ref:
https://gist.github.com/lukaszbudnik/f1f42bd5a57430e3c25034200ba44c2e