In this post we are going to create a kubernetes cluster on AWS.
It will be a “private” cluster (master(s) and node(s) will be in a private subnet, not directly addressable from the internet) in an existing VPC.
We will mainly use two tools, kops and Terraform, so install them if you don’t have them yet.
We will not use EKS because it’s not supported in most regions (yet).

For this tutorial you will need:

  • Terraform
  • kops
  • jq

(For people on macOS: brew install terraform kops jq)
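
A quick way to check that everything is installed and on your PATH:

terraform version
kops version
jq --version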

Technical setup of our cluster

AWS

We are going to create a kubernetes cluster inside a VPC (we will create it using terraform) in the Paris region (eu-west-3).
This VPC will have 3 private and 3 public subnets (one per Availability zone).
For our private subnets we will have only 1 NAT gateway (to keep costs down).

Kubernetes

Our kubernetes cluster will run in a private topology (i.e. in private subnets).
The kubernetes API (running on the master nodes) will only be accessible through a load balancer (created by kops).
The nodes won’t be reachable from the internet by default, but we will be able to SSH to them through a bastion host.

The following setup is “prod” ready: we will have 3 masters (one per availability zone) and 2 nodes.
Kubernetes imposes the following fundamental requirements (shamefully stolen from here):

  • All containers can communicate with all other containers without NAT
  • All nodes can communicate with all containers (and vice-versa) without NAT
  • The IP address that a container sees itself as is the same IP address that others see it as

To satisfy these requirements on AWS we need to choose a network plugin. Here we will use the amazon-vpc-cni-k8s plugin: it is the recommended plugin and it’s maintained by AWS.

What our directory tree will look like at the end:

$ tree
.
├── kops
│   ├── cluster.yaml
│   ├── data
│   │   ├── aws_iam_role_bastions.k8s.zerotoprod.com_policy
│   │   ├── aws_iam_role_masters.k8s.zerotoprod.com_policy
│   │   ├── aws_iam_role_nodes.k8s.zerotoprod.com_policy
│   │   ├── aws_iam_role_policy_bastions.k8s.zerotoprod.com_policy
│   │   ├── aws_iam_role_policy_masters.k8s.zerotoprod.com_policy
│   │   ├── aws_iam_role_policy_nodes.k8s.zerotoprod.com_policy
│   │   ├── aws_key_pair_kubernetes.k8s.zerotoprod.com-bc9f40063a41c875285d3e42abb848f5_public_key
│   │   ├── aws_launch_configuration_master-eu-west-3a.masters.k8s.zerotoprod.com_user_data
│   │   ├── aws_launch_configuration_master-eu-west-3b.masters.k8s.zerotoprod.com_user_data
│   │   ├── aws_launch_configuration_master-eu-west-3c.masters.k8s.zerotoprod.com_user_data
│   │   └── aws_launch_configuration_nodes.k8s.zerotoprod.com_user_data
│   ├── kubernetes.tf
│   └── template.yaml
└── terraform
    ├── backend.tf
    ├── kops.tf
    ├── locals.tf
    ├── output.tf
    ├── provider.tf
    ├── route53.tf
    └── vpc.tf

3 directories, 23 files

As you can see, our terraform and kops configurations are separated. This is because the kops configuration files are fully managed by kops, and modifying them by hand is not “persisted” between kops runs.

What is out-of-scope

This is not a tutorial on terraform; even without knowing it you should still be able to understand most of it. You can learn the basics here.
We will also not dive deep into kubernetes and will limit ourselves to creating the cluster.

(Optional) Setup for terraform

We need to create 2 resources before using terraform:

  • An S3 bucket (in our tutorial it will be named ztp-terraform; I recommend enabling versioning)
  • A DynamoDB table (in our tutorial it will be named ztp-terraform)

You can find more about backends here and the s3 backend.
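
If you prefer to create these two resources from the command line, here is a minimal sketch using the AWS CLI (the names and region match the rest of this tutorial, adjust them to your setup):

# Versioned S3 bucket for the terraform state
aws s3api create-bucket --bucket ztp-terraform --region eu-west-3 \
  --create-bucket-configuration LocationConstraint=eu-west-3
aws s3api put-bucket-versioning --bucket ztp-terraform \
  --versioning-configuration Status=Enabled

# DynamoDB table used by terraform for state locking (LockID is the key it expects)
aws dynamodb create-table --table-name ztp-terraform \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=1,WriteCapacityUnits=1 \
  --region eu-west-3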

terraform/provider.tf:

provider "aws" {
  region = "eu-west-3"
}

terraform/backend.tf:

terraform {
  backend "s3" {
    encrypt        = true
    bucket         = "ztp-terraform"
    key            = "common/beta_build_s3.tfstate"
    region         = "eu-west-3"
    dynamodb_table = "ztp-terraform"
  }
}

Terraform: Shared resources

We need to set up some resources that will be used by kops to create our k8s cluster, but that could also be used by other things.

We will use the very good terraform-aws-vpc module to avoid having to set up each resource individually.

But first, let’s define some local values that will be used throughout the whole tutorial.

terraform/locals.tf:

locals {
  cluster_name           = "k8s.zerotoprod.com"
  cidr                   = "10.0.0.0/16"
  azs                    = ["eu-west-3a", "eu-west-3b", "eu-west-3c"]
  private_subnets        = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets         = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
  environment            = "tutorial"
  kops_state_bucket_name = "ztp-kops"
  ingress_ips            = ["10.0.0.100/32", "10.0.0.101/32"]

  tags = {
    environment = "${local.environment}"
    terraform   = true
  }
}

VPC

Our VPC will use the 10.0.0.0/16 range, with separate private and public subnets.

terraform/vpc.tf:

module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "kops-tutorial"
  cidr = "${local.cidr}"

  azs             = "${local.azs}"
  private_subnets = "${local.private_subnets}"
  public_subnets  = "${local.public_subnets}"

  enable_nat_gateway = true
  single_nat_gateway = true
  one_nat_gateway_per_az = false

  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = 1
  }

  public_subnet_tags = {
    "kubernetes.io/role/elb" = 1
  }

  tags = {
    Environment = "${local.environment}"
    Application = "network"
    "kubernetes.io/cluster/${local.cluster_name}" = "shared"
  }
}

As you can see we are applying some specific tags to our AWS subnets so that kops can recognize them.
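
Once the configuration has been applied (next step), you can double-check that the subnets carry the cluster tag; a quick sketch with the AWS CLI:

aws ec2 describe-subnets --region eu-west-3 \
  --filters "Name=tag:kubernetes.io/cluster/k8s.zerotoprod.com,Values=shared" \
  --query "Subnets[].{id:SubnetId,cidr:CidrBlock}" --output table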

DNS zone (route53)

We will also need a route53 zone for our cluster k8s.zerotoprod.com.

terraform/route53.tf:

resource "aws_route53_zone" "cluster" {
  name = "${local.cluster_name}"
}

Now let’s actually apply this configuration to our AWS account:

$ terraform init
$ terraform apply -auto-approve

Now we have a VPC and the necessary route53 zone for our k8s cluster.
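
One thing to keep in mind: if k8s.zerotoprod.com is a subdomain of a domain you manage elsewhere, you have to delegate it (add the zone’s NS records to the parent zone), otherwise the api and bastion records created by kops won’t resolve. A sketch to list the name servers of the new zone with the AWS CLI:

ZONE_ID=$(aws route53 list-hosted-zones-by-name \
  --dns-name k8s.zerotoprod.com \
  --query "HostedZones[0].Id" --output text)
aws route53 get-hosted-zone --id ${ZONE_ID} --query "DelegationSet.NameServers"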

Kops resources

Let’s also create an S3 bucket (with versioning enabled) where kops will store the configuration of our cluster,
and a security group to whitelist the IPs allowed to reach the kubernetes API (see the locals definition above).

terraform/kops.tf:

resource "aws_s3_bucket" "kops_state" {
  bucket = "${local.kops_state_bucket_name}"
  acl    = "private"

  versioning {
    enabled = true
  }

  tags = {
    Environment = "tutorial"
    Application = "kops"
  }
}

resource "aws_security_group" "k8s_api_http" {
  name   = "${local.environment}_k8s_api_http"
  vpc_id = "${module.vpc.vpc_id}"
  tags   = "${local.tags}"

  ingress {
    protocol    = "tcp"
    from_port   = 80
    to_port     = 80
    cidr_blocks = ["${local.ingress_ips}"]
  }

  ingress {
    protocol    = "tcp"
    from_port   = 443
    to_port     = 443
    cidr_blocks = ["${local.ingress_ips}"]
  }
}

Output

The outputs we define below will be used by kops to configure and create our cluster.

terraform/output.tf:

output "region" {
  value = "eu-west-3"
}

output "vpc_id" {
  value = "${module.vpc.vpc_id}"
}

output "vpc_cidr_block" {
  value = "${module.vpc.vpc_cidr_block}"
}

output "public_subnet_ids" {
  value = ["${module.vpc.public_subnets}"]
}

output "public_route_table_ids" {
  value = ["${module.vpc.public_route_table_ids}"]
}

output "private_subnet_ids" {
  value = ["${module.vpc.private_subnets}"]
}

output "private_route_table_ids" {
  value = ["${module.vpc.private_route_table_ids}"]
}

output "default_security_group_id" {
  value = "${module.vpc.default_security_group_id}"
}

output "nat_gateway_ids" {
  value = "${module.vpc.natgw_ids}"
}

output "availability_zones" {
  value = ["${local.azs}"]
}

output "kops_s3_bucket_name" {
  value = "${aws_s3_bucket.kops_state.bucket}"
}

output "cluster_name" {
  value = "${local.cluster_name}"
}

output "k8s_api_http_security_group_id" {
  value = "${aws_security_group.k8s_api_http.id}"
}
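
These new resources and outputs have to be applied too, so that kops can read them in the next section (the jq line is just a sanity check, it should print the bucket name from our locals):

$ terraform apply -auto-approve
$ terraform output -json | jq -r .kops_s3_bucket_name.value
ztp-kops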

Kops: Cluster creation

kops is a tool to easily create a kubernetes cluster on AWS or GCE.

Our cluster creation will be a multi-step process because we are using the kops cluster templating tool.

Cluster template

kops/template.yaml

apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: {{.cluster_name.value}}
spec:
  api:
    loadBalancer:
      type: Public
      additionalSecurityGroups: ["{{.k8s_api_http_security_group_id.value}}"]
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://{{.kops_s3_bucket_name.value}}/{{.cluster_name.value}}
  # Create one etcd member per AZ
  etcdClusters:
  - etcdMembers:
  {{range $i, $az := .availability_zones.value}}
    - instanceGroup: master-{{.}}
      name: {{. | replace $.region.value "" }}
  {{end}}
    name: main
  - etcdMembers:
  {{range $i, $az := .availability_zones.value}}
    - instanceGroup: master-{{.}}
      name: {{. | replace $.region.value "" }}
  {{end}}
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubernetesVersion: 1.11.6
  masterPublicName: api.{{.cluster_name.value}}
  networkCIDR: {{.vpc_cidr_block.value}}
  kubeControllerManager:
    clusterCIDR: {{.vpc_cidr_block.value}}
  kubeProxy:
    clusterCIDR: {{.vpc_cidr_block.value}}
  networkID: {{.vpc_id.value}}
  kubelet:
    anonymousAuth: false
  networking:
    amazonvpc: {}
  nonMasqueradeCIDR: {{.vpc_cidr_block.value}}
  sshAccess:
  - 0.0.0.0/0
  subnets:
  # Public (utility) subnets, one per AZ
  {{range $i, $id := .public_subnet_ids.value}}
  - id: {{.}}
    name: utility-{{index $.availability_zones.value $i}}
    type: Utility
    zone: {{index $.availability_zones.value $i}}
  {{end}}
  # Private subnets, one per AZ
  {{range $i, $id := .private_subnet_ids.value}}
  - id: {{.}}
    name: {{index $.availability_zones.value $i}}
    type: Private
    zone: {{index $.availability_zones.value $i}}
    egress: {{index $.nat_gateway_ids.value 0}}
  {{end}}
  topology:
    bastion:
      bastionPublicName: bastion.{{.cluster_name.value}}
    dns:
      type: Public
    masters: private
    nodes: private
---

# Create one master per AZ
{{range .availability_zones.value}}
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: {{$.cluster_name.value}}
  name: master-{{.}}
spec:
  image: kope.io/k8s-1.11-debian-stretch-amd64-hvm-ebs-2018-08-17
  machineType: t2.medium
  maxSize: 1
  minSize: 1
  role: Master
  nodeLabels:
    kops.k8s.io/instancegroup: master-{{.}}
  subnets:
  - {{.}}
---
  {{end}}

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: {{.cluster_name.value}}
  name: nodes
spec:
  image: kope.io/k8s-1.11-debian-stretch-amd64-hvm-ebs-2018-08-17
  machineType: t2.small
  maxSize: 2
  minSize: 2
  role: Node
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  subnets:
  {{range .availability_zones.value}}
  - {{.}}
  {{end}}
---

apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: {{.cluster_name.value}}
  name: bastions
spec:
  image: kope.io/k8s-1.11-debian-stretch-amd64-hvm-ebs-2018-08-17
  machineType: t2.micro
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: bastions
  role: Bastion
  subnets:
  {{range .availability_zones.value}}
  - utility-{{.}}
  {{end}}

The above template will be used by the kops templating tool to create a cluster with:

  • 3 masters, each in a different availability zone
  • 2 nodes
  • 1 bastion to have SSH access to any node of our cluster (masters and nodes)

Using it

We are going to use our previous terraform output as values for the template (run this in the kops/ directory).

TF_OUTPUT=$(cd ../terraform && terraform output -json)
CLUSTER_NAME="$(echo ${TF_OUTPUT} | jq -r .cluster_name.value)"
kops toolbox template --name ${CLUSTER_NAME} --values <( echo ${TF_OUTPUT}) --template template.yaml --format-yaml > cluster.yaml

Now cluster.yaml contains the real cluster definition. We are going to push it to the kops state s3 bucket.

STATE="s3://$(echo ${TF_OUTPUT} | jq -r .kops_s3_bucket_name.value)"
kops replace -f cluster.yaml --state ${STATE} --name ${CLUSTER_NAME} --force
kops create secret --name ${CLUSTER_NAME} --state ${STATE} sshpublickey admin -i ~/.ssh/id_rsa.pub

The last command uses your public key in ~/.ssh/id_rsa.pub to allow you to SSH to the bastion host.
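
If you don’t have a key pair at ~/.ssh/id_rsa yet, you can generate one first (a sketch, any RSA key pair will do):

ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa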

Now that the kops state has been updated, we can use it to generate the terraform files that will represent our cluster.

kops update cluster \
  --out=. \
  --target=terraform \
  --state ${STATE} \
  --name ${CLUSTER_NAME}

And let’s deploy it to AWS (still in the kops/ directory):

terraform init
terraform plan
terraform apply
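
It can take a few minutes for the masters and nodes to boot and register. A sketch to point kubectl at the new cluster and check its health with kops (this assumes the api.k8s.zerotoprod.com record resolves, hence the delegation note earlier):

kops export kubecfg --name ${CLUSTER_NAME} --state ${STATE}
kops validate cluster --name ${CLUSTER_NAME} --state ${STATE}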

Wrapping up

You should now have a cluster with multiple nodes and multiple masters, running in a VPC you control outside of kops.
This cluster uses the AWS VPC CNI plugin (amazon-vpc-cni-k8s), so pods get their IP addresses directly from the native AWS VPC network.

You should be able to see all your nodes by running:

$ kubectl get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-0-1-206.eu-west-3.compute.internal   Ready    master   12m   v1.11.6
ip-10-0-1-211.eu-west-3.compute.internal   Ready    node     11m   v1.11.6
ip-10-0-2-12.eu-west-3.compute.internal    Ready    master   12m   v1.11.6
ip-10-0-3-135.eu-west-3.compute.internal   Ready    node     10m   v1.11.6
ip-10-0-3-17.eu-west-3.compute.internal    Ready    master   12m   v1.11.6

You also have a bastion host to connect to your cluster VMs.
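
To reach an instance, SSH to the bastion and hop from there; a sketch assuming agent forwarding and the default admin user of the kope.io Debian images (bastion.k8s.zerotoprod.com is the bastionPublicName we set in the template):

# Forward your SSH agent so the private key never leaves your machine
ssh -A admin@bastion.k8s.zerotoprod.com

# Then, from the bastion, hop to any node by its private IP
ssh admin@10.0.1.206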

Acknowledgement

The kops part of this tutorial is heavily inspired by this blogpost; I discovered it while finishing this post and found the template solution a lot cleaner than what I was using here. The main differences are that we are using a single NAT gateway and creating a bastion host to SSH into our cluster nodes.