September 23, 2019
This is an article on how to NOT wake up to a massive GCP bill because you forgot you’d left some expensive resources running (like a multi-node GKE cluster). We’ll be using Google Cloud Scheduler to scale our GKE cluster to 0 on an interval. This can be a really useful pattern if you’re spinning up a GKE cluster to do some platform development and/or prototype tools and applications within the Kubernetes ecosystem.
To start, let’s enable the required APIs we’ll be using here.
First, enable Google Kubernetes Engine,
gcloud services enable container.googleapis.com
and then enable Google Cloud Scheduler,
gcloud services enable cloudscheduler.googleapis.com
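If you want to sanity-check that both APIs are now enabled (an optional step, assuming a standard gcloud setup), you can grep the enabled services list,
gcloud services list --enabled | grep -E 'container|cloudscheduler'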
As an example, we’ll quickly create a GKE cluster for demonstration purposes. You might instead do this using Deployment Manager or the Google Terraform provider,
gcloud beta container clusters create example \
--zone="australia-southeast1-a" \
--machine-type="n1-standard-1" \
--num-nodes="3" \
--preemptible \
--no-user-output-enabled
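As an optional sanity check, you can confirm the cluster came up with its three preemptible nodes, using the same zone as above,
gcloud container clusters list --zone="australia-southeast1-a"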
In order to authorize our Cloud Scheduler job to successfully interact with the GKE API, we need to create a Service Account with an IAM role granting specifically the container.clusters.update permission. We can achieve this by running the following commands.
First, we create a custom role carrying just that permission,
gcloud iam roles create gke.scheduler \
--project ${PROJECT_ID} \
--title "Role GKE Scheduler" \
--description "Managing the scaling of GKE nodes" \
--permissions container.clusters.update \
--stage GA
Where ${PROJECT_ID} is the project you’re operating within.
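If you want to double-check the role, describing it should list container.clusters.update as its only permission (optional),
gcloud iam roles describe gke.scheduler --project ${PROJECT_ID}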
To provide the required authorization to our Cloud Scheduler job, we need to attach a Service Account, which is created by running the following,
gcloud beta iam service-accounts create gke-scheduler \
--description "managing scheduling of worker nodes on gke" \
--display-name "gke-scheduler"
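You can verify the Service Account exists before binding anything to it (optional),
gcloud iam service-accounts describe gke-scheduler@${PROJECT_ID}.iam.gserviceaccount.com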
Now we bind the custom role we created to the Service Account by running the following,
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
--member serviceAccount:gke-scheduler@${PROJECT_ID}.iam.gserviceaccount.com \
--role projects/${PROJECT_ID}/roles/gke.scheduler
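To confirm the binding took effect, one approach (a sketch using gcloud’s generic --flatten/--filter flags) is to filter the project IAM policy down to our Service Account,
gcloud projects get-iam-policy ${PROJECT_ID} \
--flatten="bindings[].members" \
--filter="bindings.members:gke-scheduler@${PROJECT_ID}.iam.gserviceaccount.com" \
--format="table(bindings.role)"
This should print our custom gke.scheduler role.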
Finally, the job can be created using the Service Account we created. Cloud Scheduler uses the standard cron syntax; I’ve used a 6-hour interval to be safe, ensuring the cluster gets scaled down regularly. The beauty of Kubernetes is that it’s so easy to stand everything back up that it’s fairly trivial to just scale the cluster back up when required (see the resize sketch below).
gcloud beta scheduler jobs create http gke-cluster-auto-scale-down \
--schedule "0 */6 * * *" \
--uri=https://container.googleapis.com/v1beta1/projects/${PROJECT_ID}/zones/australia-southeast1-a/clusters/${CLUSTER_NAME}/nodePools/default-pool/setSize \
--message-body '{"nodeCount":0}' \
--time-zone=Australia/Melbourne \
--oauth-service-account-email gke-scheduler@${PROJECT_ID}.iam.gserviceaccount.com
Where ${PROJECT_ID} is the project you’re operating within and ${CLUSTER_NAME} is the name of your GKE cluster.
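When you’re ready to work again, scaling back up is a one-liner. Here’s a sketch assuming the example cluster, zone and default-pool node pool from earlier,
gcloud container clusters resize example \
--zone="australia-southeast1-a" \
--node-pool="default-pool" \
--num-nodes="3"
You can also trigger the scale-down immediately rather than waiting for the schedule with gcloud beta scheduler jobs run gke-cluster-auto-scale-down, which is handy for testing the job end-to-end.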
Now you should be cooking with gas 🔥, experimenting with the confidence that your resources will never accidentally be left running for too long.
We’ve created a least-privilege way of automatically scaling our GKE clusters down, ensuring we don’t exceed our budgets unnecessarily. This is a basic prototype of what Cloud Scheduler can do; I’d advise checking out Terraform for a more robust and templated structure for your GCP resources.
Additionally, we’ve enabled preemptible VMs for our GKE cluster, which further reduces costs for a cluster used for prototyping and development.
Written by Ben Ebsworth. Thoughts are his own, and any material represents an attempt to self-educate and explore ideas and technology. You should follow him on Twitter.