How to Set Up a Kubernetes Cluster with Terraform on AWS
Estimated Time: 60–90 minutes (including provisioning wait times)
Overview
Kubernetes has become the de facto standard for container orchestration, and Terraform is a widely used Infrastructure as Code (IaC) tool. In this tutorial, you’ll provision an Amazon EKS (Elastic Kubernetes Service) cluster on AWS with Terraform, starting from an empty directory. By the end, you’ll have a working cluster with managed worker nodes, a dedicated VPC, and `kubectl` access configured on your local machine.
We’ll cover credential setup, the Terraform project layout, a custom VPC, the EKS control plane and managed node groups, `kubectl` configuration, a smoke test, cleanup, and troubleshooting.
Prerequisites
Make sure you have the following before you begin:
| Tool / Account | Purpose | Get It |
|---|---|---|
| **AWS Account** | Cloud provider for provisioning EC2, VPC, and EKS | [aws.amazon.com](https://aws.amazon.com) |
| **AWS CLI** (v2+) | Authenticate Terraform with AWS | `brew install awscli` on macOS; AWS’s official v2 installer on Linux (distribution `awscli` packages are often still v1) |
| **Terraform** (v1.5+) | Infrastructure as Code engine | [terraform.io/downloads](https://developer.hashicorp.com/terraform/downloads) |
| **kubectl** (v1.29+) | Kubernetes command-line tool; keep it within one minor version of the 1.30 cluster | `brew install kubectl` / `apt install kubectl` (after adding the Kubernetes apt repository) |
| **aws-iam-authenticator** or **awscli v2 helper** | Authenticate `kubectl` with EKS | Bundled with `aws eks update-kubeconfig` (AWS CLI v2) |
| **SSH key pair (optional)** | Debug worker nodes via SSH | Created in AWS EC2 console |
IAM Permissions Required (attach to your user or role):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:*",
"eks:*",
"iam:*",
"autoscaling:*",
"cloudformation:*",
"kms:*"
],
"Resource": "*"
}
]
}
**Production note:** Restrict these permissions to specific resources and use conditions in a real environment. The above is for learning purposes.
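Before moving on, it can help to confirm that the tools from the table are actually on your `PATH`. The sketch below is one way to do that; the function name `require_tools` is hypothetical, and it only checks that each command exists, not that its version is new enough.

```shell
# Report which of the given command-line tools are missing from PATH.
# Prints one line per missing tool and returns non-zero if any is absent.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_tools aws terraform kubectl` prints nothing and returns zero when all three are installed.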
Step 1 — Configure AWS Credentials
Authenticate your CLI session so Terraform can make API calls on your behalf.
Option A — Profile (recommended):
aws configure --profile nova-tech-lab
You’ll be prompted for:
AWS Access Key ID: AKIA...
AWS Secret Access Key: wJalrX...
Default region name: us-east-1
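Default output format: json
Then make the profile active for this shell session, since neither Terraform nor the AWS CLI uses a named profile unless told to:
export AWS_PROFILE=nova-tech-lab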
Option B — Environment variables (CI/CD friendly):
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="wJalrX..."
export AWS_DEFAULT_REGION="us-east-1"
Verify your setup:
aws sts get-caller-identity
# -> { "Account": "123456789012", "UserId": "AIDA...", "Arn": "arn:aws:iam::123456789012:user/your-user" }
Step 2 — Create the Terraform Project Structure
Create a clean directory and initialize the project:
mkdir -p ~/projects/eks-terraform && cd ~/projects/eks-terraform
touch main.tf variables.tf outputs.tf terraform.tfvars
Your layout will look like this:
eks-terraform/
├── main.tf # Core infrastructure (VPC, EKS, node groups)
├── variables.tf # Input variables
├── terraform.tfvars # Variable values (keep out of version control)
└── outputs.tf # Useful output values (kubeconfig, cluster name)
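Since `terraform.tfvars` and the local state file should stay out of version control, a small helper can add them to `.gitignore`. This is a minimal sketch assuming you use Git; the function name `add_terraform_gitignore` and the exact pattern list are illustrative choices, not part of the original project.

```shell
# Append Terraform-related ignore patterns to .gitignore in the given
# directory (default: current directory), skipping patterns already listed.
add_terraform_gitignore() {
  dir="${1:-.}"
  file="$dir/.gitignore"
  touch "$file"
  for pattern in ".terraform/" "*.tfstate" "*.tfstate.*" "terraform.tfvars"; do
    # -x matches whole lines, -F treats the pattern as a literal string
    grep -qxF -- "$pattern" "$file" || echo "$pattern" >> "$file"
  done
}
```

Run `add_terraform_gitignore ~/projects/eks-terraform` once after creating the files; running it again leaves the file unchanged.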
Step 3 — Declare Providers and Backend
Open `main.tf` in your editor and add the provider configuration:
terraform {
required_version = ">= 1.5"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.23"
}
}
# Optional: state stored locally. Replace with S3 backend for teams.
backend "local" {
path = "terraform.tfstate"
}
}
provider "aws" {
region = var.aws_region
}
In `variables.tf`:
variable "aws_region" {
description = "AWS region to deploy resources"
type = string
default = "us-east-1"
}
variable "cluster_name" {
description = "Name of the EKS cluster"
type = string
default = "nova-tech-eks"
}
variable "cluster_version" {
description = "Kubernetes version for the cluster"
type = string
default = "1.30"
}
variable "instance_types" {
description = "EC2 instance types for node group"
type = list(string)
default = ["t3.medium"]
}
variable "desired_node_count" {
description = "Desired number of worker nodes"
type = number
default = 2
}
variable "min_node_count" {
type = number
default = 1
}
variable "max_node_count" {
type = number
default = 4
}
In `terraform.tfvars`:
aws_region = "us-east-1"
cluster_name = "nova-tech-eks"
cluster_version = "1.30"
instance_types = ["t3.medium"]
Step 4 — Create a Custom VPC for EKS
EKS requires a well-configured VPC. We’ll use the official AWS VPC Terraform module. Add this to `main.tf`:
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.8.1"
name = "${var.cluster_name}-vpc"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
enable_nat_gateway = true
enable_vpn_gateway = false
enable_dns_hostnames = true
enable_dns_support = true
single_nat_gateway = false
one_nat_gateway_per_az = false
public_subnet_tags = {
"kubernetes.io/role/elb" = "1"
}
private_subnet_tags = {
"kubernetes.io/role/internal-elb" = "1"
}
tags = {
Environment = "dev"
Project = var.cluster_name
}
}
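**Why this matters:** Kubernetes creates AWS load balancers for `LoadBalancer` Services, and it discovers candidate subnets by tag: `kubernetes.io/role/elb` marks public subnets for internet-facing load balancers, and `kubernetes.io/role/internal-elb` marks private subnets for internal ones.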
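Subnet size is worth a quick sanity check too. With the VPC CNI plugin that EKS uses by default, every pod takes an IP address from its node’s subnet, and AWS reserves five addresses in each subnet. The sketch below works out the usable count for a given prefix length; `usable_ips` is a hypothetical helper name.

```shell
# Usable IPv4 addresses in an AWS subnet with the given prefix length.
# AWS reserves 5 addresses in every subnet (the first four and the last).
usable_ips() {
  prefix="$1"
  echo $(( (1 << (32 - $prefix)) - 5 ))
}
```

For the `/24` subnets above, `usable_ips 24` prints `251`, which is the ceiling on nodes plus pods per private subnet.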
Step 5 — Provision the EKS Cluster (Control Plane)
Still in `main.tf`, add the EKS module:
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "20.8.5"
cluster_name = var.cluster_name
cluster_version = var.cluster_version
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
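cluster_endpoint_public_access = true
cluster_endpoint_private_access = false
# Grant the identity that runs Terraform admin access to the cluster.
# Version 20 of this module does not do so by default; without it,
# the kubectl commands in Step 9 are rejected as unauthorized.
enable_cluster_creator_admin_permissions = true
# Control plane security group: allow all traffic from the VPC CIDR
cluster_security_group_additional_rules = {
ingress_vpc_all = {
description = "Allow all ingress from the VPC CIDR"
protocol = "-1"
from_port = 0
to_port = 0
type = "ingress"
cidr_blocks = ["10.0.0.0/16"]
}
}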
# Enable EKS-managed add-ons
cluster_addons = {
coredns = {
most_recent = true
}
kube-proxy = {
most_recent = true
}
vpc-cni = {
most_recent = true
}
}
}
Step 6 — Add Managed Node Groups
Worker nodes run your container workloads. Add the following inside the `module "eks" { }` block:
eks_managed_node_groups = {
main = {
desired_size = var.desired_node_count
min_size = var.min_node_count
max_size = var.max_node_count
instance_types = var.instance_types
# Size and encrypt the root volume (the AMI stays the default EKS-optimized image)
block_device_mappings = {
xvda = {
device_name = "/dev/xvda"
ebs = {
volume_size = 40
volume_type = "gp3"
encrypted = true
delete_on_termination = true
}
}
}
tags = {
"kubernetes.io/cluster/${var.cluster_name}" = "owned"
}
}
}
The full EKS module now looks like this (skeleton):
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "20.8.5"
cluster_name = var.cluster_name
cluster_version = var.cluster_version
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
cluster_endpoint_public_access = true
cluster_addons = { ... }
eks_managed_node_groups = { main = { ... } }
}
Step 7 — Configure Data Sources and Outputs
In `outputs.tf`, add values you’ll need after provisioning:
output "cluster_endpoint" {
description = "Endpoint for your EKS Kubernetes API"
value = module.eks.cluster_endpoint
}
output "cluster_name" {
description = "EKS cluster name"
value = module.eks.cluster_name
}
output "cluster_certificate_authority_data" {
description = "Base64-encoded certificate data required to communicate with the cluster"
value = module.eks.cluster_certificate_authority_data
}
output "region" {
description = "AWS region"
value = var.aws_region
}
Optionally, add a data source in `main.tf` to fetch the caller identity. Nothing in this tutorial consumes it, but it is handy when later resources, such as IAM policies, need your account ID:
data "aws_caller_identity" "current" {}
Step 8 — Deploy the Infrastructure
Now the exciting part — apply the Terraform plan:
cd ~/projects/eks-terraform
terraform init
# -> Initializing modules...
# -> Terraform has been successfully initialized!
terraform plan
# Review the output: expect roughly 80 resources to be created
If everything looks good:
terraform apply -auto-approve
This will take 15–25 minutes (EKS control plane provisioning is the bottleneck). Grab a coffee. Terraform will show progress as resources are created:
module.vpc.aws_vpc.this: Creating...
module.vpc.aws_subnet.private[0]: Creation complete...
...
module.eks.module.eks_cluster.aws_eks_cluster.this: Still creating... [10m elapsed]
...
Apply complete! Resources: 84 added, 0 changed, 0 destroyed.
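Because `-auto-approve` skips the confirmation prompt, you may prefer to save the plan you reviewed and apply exactly that file. The sketch below wraps the two standard Terraform commands; the function names `make_plan` and `apply_plan` and the default file name `tfplan` are illustrative.

```shell
# Save the plan to a file so that apply executes exactly what you reviewed.
make_plan() {
  terraform plan -out="${1:-tfplan}"
}

# Apply a previously saved plan file. Terraform does not ask for
# confirmation when it is given a saved plan.
apply_plan() {
  terraform apply "${1:-tfplan}"
}
```

Run `make_plan`, read the output, then run `apply_plan`. If the configuration or state changes in between, Terraform refuses the stale plan and you plan again.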
Step 9 — Configure kubectl
Once the apply is complete, configure `kubectl` to talk to your new cluster:
aws eks update-kubeconfig \
--region "$(terraform output -raw region)" \
--name "$(terraform output -raw cluster_name)"
Expected output:
Added new context arn:aws:eks:us-east-1:123456789012:cluster/nova-tech-eks to /home/user/.kube/config
Test connectivity:
kubectl cluster-info
# -> Kubernetes control plane is running at https://...
# -> CoreDNS is running at https://...
kubectl get nodes
# -> NAME STATUS ROLES AGE VERSION
# -> ip-10-0-1-xx.ec2.internal Ready <none> 5m v1.30.0-eks-...
# -> ip-10-0-2-xx.ec2.internal Ready <none> 5m v1.30.0-eks-...
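If the nodes still show `NotReady` right after the apply, you can wait for them instead of polling by hand. This is a minimal sketch built on `kubectl wait`; the function name `wait_for_nodes` and the 300-second default are assumptions.

```shell
# Block until every node reports the Ready condition, or fail after the timeout.
# Note: kubectl wait errors out immediately if no nodes are registered yet.
wait_for_nodes() {
  timeout="${1:-300s}"
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}
```

Call `wait_for_nodes` with no arguments for the default, or pass a duration such as `wait_for_nodes 60s`.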
Step 10 — Deploy a Smoke-Test Application
Verify that the cluster can schedule pods and expose services:
kubectl create deployment nginx-test \
--image=nginx:alpine \
--replicas=3
kubectl expose deployment nginx-test \
--type=LoadBalancer \
--port=80 \
--target-port=80
kubectl get pods -w
# -> nginx-test-xxxxx-xxxxx 1/1 Running
# -> nginx-test-xxxxx-xxxxx 1/1 Running
# -> nginx-test-xxxxx-xxxxx 1/1 Running
Once the LoadBalancer is provisioned (1–2 minutes):
kubectl get svc nginx-test
# -> NAME TYPE EXTERNAL-IP PORT(S)
# -> nginx-test LoadBalancer a1234-....elb.amazonaws.com 80:31234/TCP
curl http://a1234-....elb.amazonaws.com
# -> Welcome to nginx!
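Clean up the test deployment when done. Deleting the Service also removes the AWS load balancer it created, which has to be gone before Terraform can delete the VPC: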
kubectl delete deployment nginx-test
kubectl delete svc nginx-test
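A `LoadBalancer` Service that is left behind keeps its AWS load balancer alive inside the VPC, and that can stall a later teardown. The sketch below lists any such Services across all namespaces; the function names `list_lb_services` and `check_no_lb_services` are hypothetical, and the filter relies on kubectl’s JSONPath support.

```shell
# List every Service of type LoadBalancer as "namespace/name", one per line.
list_lb_services() {
  kubectl get svc --all-namespaces \
    -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}/{.metadata.name}{"\n"}{end}'
}

# Print any remaining LoadBalancer Services and return non-zero if one exists.
check_no_lb_services() {
  remaining=$(list_lb_services)
  if [ -n "$remaining" ]; then
    echo "LoadBalancer services still present:"
    echo "$remaining"
    return 1
  fi
}
```

Running `check_no_lb_services` before tearing anything down gives you a quick yes-or-no answer.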
Step 11 — Clean Up (Avoid Surprise Bills)
To destroy everything and avoid ongoing AWS charges:
terraform destroy -auto-approve
This tears down the node groups, the EKS control plane, the VPC, and all associated resources. Confirm you see:
Destroy complete! Resources: 84 destroyed.
⚠️ **Important:** If you skip `terraform destroy`, an EKS cluster running continuously in `us-east-1` with `t3.medium` nodes costs roughly **$0.25–$0.40/hour** (~$200–$300/month).
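To see where a figure in that range comes from, you can add up the main hourly charges. The sketch below does the arithmetic with `awk`. All three prices are assumptions based on us-east-1 on-demand rates at the time of writing, so check the current AWS pricing pages. It also ignores data transfer, EBS volumes, and load balancers.

```shell
# Estimated hourly cost in USD for a cluster with the given number of
# t3.medium nodes and NAT gateways. Prices are ASSUMED, not authoritative.
hourly_cost() {
  awk -v n="$1" -v g="$2" 'BEGIN {
    control_plane = 0.10    # assumed EKS control plane fee per hour
    node          = 0.0416  # assumed t3.medium on-demand price per hour
    nat           = 0.045   # assumed NAT gateway price per hour
    printf "%.4f\n", control_plane + n * node + g * nat
  }'
}

# Estimated monthly cost, using 730 hours per month.
monthly_cost() {
  awk -v h="$(hourly_cost "$1" "$2")" 'BEGIN { printf "%.2f\n", h * 730 }'
}
```

With the two nodes and three NAT gateways this tutorial creates (one per private subnet), `hourly_cost 2 3` prints `0.3182` and `monthly_cost 2 3` prints `232.29`.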
Troubleshooting
Here are the most common issues you’ll encounter and how to fix them.
1. “Error creating EKS cluster: UnauthorizedException”
Cause: Your IAM user/role doesn’t have sufficient permissions.
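Fix: Make sure the identity running Terraform has the permissions from the policy in the Prerequisites section. The similarly named `AmazonEKSClusterPolicy` is intended for the cluster’s own service role, and `AmazonEKSAdminPolicy` is an EKS access policy rather than an IAM managed policy, so neither one lets a user create clusters. Check what is attached with: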
aws iam list-attached-user-policies --user-name YOUR_USER
2. “Timeout waiting for EKS cluster to become ready”
Cause: EKS control plane provisioning is slow, or the VPC configuration is wrong (missing DNS hostnames, no NAT gateway for private subnets).
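Fix: Confirm that the VPC module sets `enable_dns_hostnames = true`, `enable_dns_support = true`, and `enable_nat_gateway = true`. If the configuration is correct and the control plane was simply slow, re-running `terraform apply` usually picks up where it left off.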
3. “Node group creation failed: Unhealthy nodes”
Cause: The worker nodes can’t register with the EKS control plane.
Common checks:
# Check node status
kubectl describe node <node-name>
# Check node group events
aws eks describe-nodegroup \
--cluster-name nova-tech-eks \
--nodegroup-name main
4. “kubectl: connect: connection refused”
Cause: The EKS endpoint is not publicly accessible, or your `kubeconfig` is stale.
Fix:
# Re-generate kubeconfig
aws eks update-kubeconfig --region us-east-1 --name nova-tech-eks
# Verify public endpoint
aws eks describe-cluster --name nova-tech-eks \
--query "cluster.resourcesVpcConfig.endpointPublicAccess"
# -> true
5. “NoCredentialProviders: no valid providers in chain”
Cause: AWS credentials are not configured in your environment.
Fix:
aws configure
# OR, if you created the named profile in Step 1
export AWS_PROFILE=nova-tech-lab
# OR
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
6. “Failed to create EBS volume: the availability zone does not exist”
Cause: You specified an AZ that isn’t enabled in your AWS account.
Fix: Check enabled AZs:
aws ec2 describe-availability-zones --region us-east-1
Then update your `main.tf` to use only enabled zones in the VPC module.
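To get just the zone names rather than the full JSON, you can filter the same command with a JMESPath query. This is a small sketch; the function name `list_available_azs` is hypothetical, and it defaults to the region used throughout this tutorial.

```shell
# Print the names of the availability zones currently available to your
# account in the given region (default: us-east-1), tab-separated.
list_available_azs() {
  region="${1:-us-east-1}"
  aws ec2 describe-availability-zones \
    --region "$region" \
    --query 'AvailabilityZones[?State==`available`].ZoneName' \
    --output text
}
```

Copy three of the printed names into the `azs` list of the VPC module.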
Next Steps
Now that your EKS cluster is up and running, here’s what you can do next:
| Task | Suggested Tool / Approach |
|---|---|
| Deploy a real application | `kubectl apply -f deployment.yaml` |
| Install Ingress Controller | `helm install ingress-nginx ingress-nginx/ingress-nginx` |
| Add monitoring | `helm install prometheus prometheus-community/kube-prometheus-stack` |
| Enable cluster autoscaling | Deploy the **Cluster Autoscaler** or **Karpenter** |
| Store Terraform state remotely | Use an S3 backend with DynamoDB locking |
| Set up CI/CD | Use GitHub Actions or GitLab CI to run `terraform apply` on merge |
Conclusion
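You’ve just provisioned a fully functional Kubernetes cluster on AWS EKS using Terraform, with a dedicated VPC, a managed control plane, managed worker nodes, and `kubectl` access from your local machine.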
This pattern is battle-tested and used by teams shipping to production daily. The same Terraform code can be adapted for staging, QA, and production environments by parameterizing variables and swapping in an S3 backend.
Next time you need an EKS cluster, you can have one running in under 30 minutes — fully automated, version-controlled, and repeatable.
*Tutorial by the Nova Tech Cloud Team. We help companies build, secure, and scale cloud infrastructure. [Contact us](https://nova-tech.cloud/contact) for consulting, training, or managed DevOps services.*