{"id":892,"date":"2025-11-25T01:55:16","date_gmt":"2025-11-24T17:55:16","guid":{"rendered":"https:\/\/vinta.ws\/code\/?p=892"},"modified":"2026-02-18T01:20:34","modified_gmt":"2026-02-17T17:20:34","slug":"gke-autopilot-cluster-pay-for-pods-not-nodes","status":"publish","type":"post","link":"https:\/\/vinta.ws\/code\/gke-autopilot-cluster-pay-for-pods-not-nodes.html","title":{"rendered":"GKE Autopilot Cluster: Pay for Pods, Not Nodes"},"content":{"rendered":"<p>If you're already on Google Cloud, it's highly recommended to use GKE Autopilot Cluster: you only pay for resources requested by your pods (<strong>system pods and unused resources are free<\/strong> in Autopilot Cluster). No need to pay for surplus node pools anymore! Plus the entire cluster applies Google's best practices by default.<\/p>\n<p>ref:<br \/>\n<a href=\"https:\/\/cloud.google.com\/kubernetes-engine\/pricing#compute\">https:\/\/cloud.google.com\/kubernetes-engine\/pricing#compute<\/a><\/p>\n<h2>Create an Autopilot Cluster<\/h2>\n<p>DO NOT enable Private Nodes, otherwise you\u00a0<strong>MUST<\/strong>\u00a0pay for a Cloud NAT Gateway (~$32\/month) for them to access the internet (to pull images, etc.).<\/p>\n<pre class=\"line-numbers\"><code class=\"language-bash\"># create\ngcloud container clusters create-auto my-auto-cluster \n--project YOUR_PROJECT_ID \n--region us-west1\n\n# connect\ngcloud container clusters get-credentials my-auto-cluster \n--project YOUR_PROJECT_ID \n--region us-west1<\/code><\/pre>\n<p>You can update some configurations later on Google Cloud Console.<\/p>\n<p>ref:<br \/>\n<a href=\"https:\/\/docs.cloud.google.com\/sdk\/gcloud\/reference\/container\/clusters\/create-auto\">https:\/\/docs.cloud.google.com\/sdk\/gcloud\/reference\/container\/clusters\/create-auto<\/a><\/p>\n<p>Autopilot mode works in both Autopilot and Standard clusters. 
You don't necessarily need to create a new Autopilot cluster; you can simply deploy your pods in Autopilot mode, as long as your Standard cluster meets the requirements:<\/p>\n<pre class=\"line-numbers\"><code class=\"language-bash\">gcloud container clusters check-autopilot-compatibility my-cluster \\\n    --project YOUR_PROJECT_ID \\\n    --region us-west1<\/code><\/pre>\n<p>ref:<br \/>\n<a href=\"https:\/\/cloud.google.com\/kubernetes-engine\/docs\/concepts\/about-autopilot-mode-standard-clusters\">https:\/\/cloud.google.com\/kubernetes-engine\/docs\/concepts\/about-autopilot-mode-standard-clusters<\/a><br \/>\n<a href=\"https:\/\/cloud.google.com\/kubernetes-engine\/docs\/how-to\/autopilot-classes-standard-clusters\">https:\/\/cloud.google.com\/kubernetes-engine\/docs\/how-to\/autopilot-classes-standard-clusters<\/a><\/p>\n<h2>Deploy Workloads in Autopilot Mode<\/h2>\n<p>The only thing you need to do is add <strong>one magical config<\/strong>: <code>nodeSelector: cloud.google.com\/compute-class: &quot;autopilot&quot;<\/code>. That's it. You don't need to create or manage any node pools beforehand; just write some YAMLs and <code>kubectl apply<\/code>. All workloads with <code>cloud.google.com\/compute-class: &quot;autopilot&quot;<\/code> will run in Autopilot mode.<\/p>\n<p>More importantly, you are only billed for the CPU\/memory resources <strong>your pods<\/strong> request, not for nodes that may have unused capacity or for system pods (those running under the <code>kube-system<\/code> namespace). 
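<\/p>\n<p>Since Autopilot bills by pod resource requests, a quick sanity check is to list exactly what your pods request. A minimal sketch, assuming <code>kubectl<\/code> is already pointed at your cluster:<\/p>\n<pre class=\"line-numbers\"><code class=\"language-bash\"># show each pod's CPU\/memory requests -- the numbers Autopilot actually bills for\nkubectl get pods -o custom-columns='NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'<\/code><\/pre>\n<p>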
Autopilot mode is both cost-efficient and developer-friendly.<\/p>\n<pre class=\"line-numbers\"><code class=\"language-yaml\">apiVersion: apps\/v1\nkind: Deployment\nmetadata:\n  name: nginx\nspec:\n  replicas: 3\n  selector:\n    matchLabels:\n      app: nginx\n  template:\n    metadata:\n      labels:\n        app: nginx\n    spec:\n      nodeSelector:\n        cloud.google.com\/compute-class: \"autopilot\"\n      containers:\n        - name: nginx\n          image: nginx:1.29.3\n          ports:\n            - name: http\n              containerPort: 80\n          resources:\n            requests:\n              cpu: 100m\n              memory: 128Mi\n            limits:\n              cpu: 500m\n              memory: 256Mi<\/code><\/pre>\n<p>If your workloads are fault-tolerant (stateless), you can use Spot VMs to save a significant amount of money. Just change the <code>nodeSelector<\/code> to <code>autopilot-spot<\/code>; the rest of the Deployment stays the same:<\/p>\n<pre class=\"line-numbers\"><code class=\"language-yaml\">apiVersion: apps\/v1\nkind: Deployment\nmetadata:\n  name: nginx\nspec:\n  # replicas, selector, labels, containers, etc. are unchanged\n  template:\n    spec:\n      nodeSelector:\n        cloud.google.com\/compute-class: \"autopilot-spot\"\n      terminationGracePeriodSeconds: 25 # pods on Spot VMs get at most 25s to terminate gracefully after preemption<\/code><\/pre>\n<p>ref:<br \/>\n<a href=\"https:\/\/docs.cloud.google.com\/kubernetes-engine\/docs\/how-to\/autopilot-classes-standard-clusters\">https:\/\/docs.cloud.google.com\/kubernetes-engine\/docs\/how-to\/autopilot-classes-standard-clusters<\/a><\/p>\n<p>You will see something like this in your Autopilot cluster:<\/p>\n<pre class=\"line-numbers\"><code class=\"language-bash\">kubectl get nodes\nNAME                             STATUS   ROLES    AGE     VERSION\ngk3-my-auto-cluster-nap-xxx      Ready    &lt;none&gt;   2d18h   v1.33.5-gke.1201000\ngk3-my-auto-cluster-nap-xxx      Ready    &lt;none&gt;   1d13h   v1.33.5-gke.1201000\ngk3-my-auto-cluster-pool-1-xxx   Ready    &lt;none&gt;   86m     
v1.33.5-gke.1201000<\/code><\/pre>\n<p>The <code>nap<\/code> nodes are auto-provisioned by Autopilot for your workloads, while <code>pool-1<\/code> is a default node pool created during cluster creation. System pods may run on either, but in an Autopilot cluster, you are <strong>never billed for the nodes themselves<\/strong> (neither <code>nap<\/code> nor <code>pool-1<\/code>), nor for the system pods. You only pay for the resources requested by your application pods.<\/p>\n<p>FYI, the minimum resource requests for Autopilot workloads are:<\/p>\n<ul>\n<li>CPU: <code>50m<\/code><\/li>\n<li>Memory: <code>52Mi<\/code><\/li>\n<\/ul>\n<p>Additionally, Autopilot applies the following default resource requests if you don't specify any:<\/p>\n<ul>\n<li>Containers in DaemonSets\n<ul>\n<li>CPU: <code>50m<\/code><\/li>\n<li>Memory: <code>100Mi<\/code><\/li>\n<li>Ephemeral storage: <code>100Mi<\/code><\/li>\n<\/ul>\n<\/li>\n<li>All other containers\n<ul>\n<li>Ephemeral storage: <code>1Gi<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>ref:<br \/>\n<a href=\"https:\/\/docs.cloud.google.com\/kubernetes-engine\/docs\/concepts\/autopilot-resource-requests\">https:\/\/docs.cloud.google.com\/kubernetes-engine\/docs\/concepts\/autopilot-resource-requests<\/a><\/p>\n<h2>Exclude Prometheus Metrics<\/h2>\n<p>You may see <code>Prometheus Samples Ingested<\/code> in your billing. 
If you don't need (or don't care about) Prometheus metrics for observability, you can exclude them:<\/p>\n<ul>\n<li>Go to Google Cloud Console -&gt; Monitoring -&gt; Metrics Management -&gt; Excluded Metrics -&gt; Metrics Exclusion<\/li>\n<li>If you want to exclude all of them:\n<ul>\n<li><code>prometheus.googleapis.com\/.*<\/code><\/li>\n<\/ul>\n<\/li>\n<li>If you only want to exclude some of them:\n<ul>\n<li><code>prometheus.googleapis.com\/container_.*<\/code><\/li>\n<li><code>prometheus.googleapis.com\/kubelet_.*<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>It's worth noting that excluding Prometheus metrics won't affect your HorizontalPodAutoscaler (HPA), which uses the Metrics Server instead.<\/p>\n<p>ref:<br \/>\n<a href=\"https:\/\/console.cloud.google.com\/monitoring\/metrics-management\/excluded-metrics\">https:\/\/console.cloud.google.com\/monitoring\/metrics-management\/excluded-metrics<\/a><br \/>\n<a href=\"https:\/\/docs.cloud.google.com\/stackdriver\/docs\/managed-prometheus\/cost-controls\">https:\/\/docs.cloud.google.com\/stackdriver\/docs\/managed-prometheus\/cost-controls<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you're already on Google Cloud, it's highly recommended to use GKE Autopilot Cluster: you only pay for resources requested by your pods. 
No need to pay for surplus node pools anymore!<\/p>\n","protected":false},"author":1,"featured_media":893,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38],"tags":[51,114,123],"class_list":["post-892","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-about-devops","tag-env","tag-google-cloud-platform","tag-kubernetes"],"_links":{"self":[{"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/posts\/892","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/comments?post=892"}],"version-history":[{"count":0,"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/posts\/892\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/media\/893"}],"wp:attachment":[{"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/media?parent=892"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/categories?post=892"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/vinta.ws\/code\/wp-json\/wp\/v2\/tags?post=892"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}