Getting started with Kubernetes using Ansible and Terraform

Brad Downey · Published in ITNEXT · Mar 8, 2018 · 12 min read


So, you want to start playing around with Kubernetes because, hey, why not? You could always run Minikube locally on your desktop, but that gives you only a single-node cluster. I wanted to run a multi-node cluster, and for me it was about seeing it and feeling it; I learn better that way. Building and running Kubernetes clusters is not my day job, so I needed a way to understand what all the hype is about.

Warning: This is NOT intended to be a production deployment. There are many shortcomings to this installation as compared to a full production deployment.

Tip: Most of this is based on running from my Mac. Everything used below is platform-independent, but you may need to tweak the scripts that run locally.

Here is the basic flow (I’m going to go through all of these steps in more detail below):

  1. Sign up for an AWS account, and create an access key.
  2. Download Terraform and clone my git repo. Run this to provision 3 hosts.
  3. For a very basic install use this Ansible script. I have a more complex one that installs some sample applications and uses Contiv for container segmentation. My Ansible scripts are here, and I am going to walk through them.

The TL;DR for this post is here.

git clone https://github.com/magic7s/terraform_aws_spot_instance.git
cd terraform_aws_spot_instance
terraform init
terraform apply
cd ..
git clone https://github.com/magic7s/ansible-kubeadm-contiv.git
cd ansible-kubeadm-contiv
# Edit inventory file with public ip addresses from terraform output
ansible-playbook -i inventory site.yml
[Diagram: basic layout of the AWS environment to be provisioned]

Infrastructure Provisioning using Terraform

There are probably a dozen ways to do the same thing. I chose Terraform to provision my virtual machines in AWS. Why? I’m lazy and didn’t want to keep clicking through the AWS console to spin up three clean VMs. I had to do it about 100 times because I always wanted to start fresh when deploying my cluster. Also, I wanted to destroy (delete) my VMs when I was not using them, so I didn’t have to pay.

I’m using spot instances because they are up to 90% off the normal price. I can get an m3.medium (3.75 GB RAM) for $0.0067/hr. There are some drawbacks: if Amazon raises the price above your bid price, your instance will be terminated. Also, you cannot power it off; you can only reboot it.
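
If you want to sanity-check current spot pricing before you bid, the AWS CLI can show recent price history. A small sketch, using the same instance type and region that appear later in this guide (adjust both to taste):

# Show the last few spot prices for m3.medium Linux instances in us-west-2
aws ec2 describe-spot-price-history \
    --instance-types m3.medium \
    --product-descriptions "Linux/UNIX" \
    --region us-west-2 \
    --max-items 5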

Installing Terraform is pretty simple. It comes as a compiled binary for your platform. Download it from https://www.terraform.io/downloads.html and follow the instructions. I copied the file to /usr/local/bin/terraform so I can run it from any directory on the command line.
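
As a rough sketch of what that looks like on a Mac (the version below is just an example from around the time of writing; grab whatever the downloads page lists for your platform):

# Download the zip, extract the binary, and move it onto your PATH
curl -LO https://releases.hashicorp.com/terraform/0.11.3/terraform_0.11.3_darwin_amd64.zip
unzip terraform_0.11.3_darwin_amd64.zip
sudo mv terraform /usr/local/bin/terraform
terraform version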

Download my git repo with the Terraform script to provision three hosts in a security group: https://github.com/magic7s/terraform_aws_spot_instance

Review the README.MD file for additional information on how to customize the script for your environment.

You will need to have your AWS access_key and secret_key available. I recommend you put them in ~/.aws/credentials; it should look like this:

[default]
aws_access_key_id = AAABBBBCCCDDDEEEFFF
aws_secret_access_key = ABC123456%$^&*WWWMMMCCC33658

If you installed the AWS CLI, you may have already done this.
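
If you have the AWS CLI, it can also create that file for you interactively instead of editing it by hand:

# Prompts for access key, secret key, default region, and output format,
# then writes ~/.aws/credentials and ~/.aws/config
aws configure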

Once this is done, you can run terraform init.

macbook:terraform_aws_spot_instance brad$ terraform init
Initializing provider plugins...
.... [lots of output omitted]
Terraform has been successfully initialized!
.... [lots of output omitted]
macbook:terraform_aws_spot_instance brad$

Check that the AWS region is correct in k8s.tf:

variable "aws_region_name" { default = "us-west-2" }

You will need to pre-upload or create your SSH public key in the AWS region you are working in. Here are the instructions for doing so. You will need the name of the key pair as listed on the AWS console.

variable "ssh_key_name" {default = "braddown-csicolaptop"}

Review the rest of the k8s.tf file for any changes you may want.

Run terraform plan

macbook:terraform_aws_spot_instance brad$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
data.external.myipaddr: Refreshing state...
data.aws_ami.ubuntu: Refreshing state...
------------------------------------------------------------------------
.... [lots of output omitted]
Plan: 7 to add, 0 to change, 0 to destroy.
------------------------------------------------------------------------
Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.
macbook:terraform_aws_spot_instance brad$

If you get any errors about data.external.myipaddr, check that ./myipaddr.sh runs in your local directory. It should produce something like {"ip": "68.231.198.92"} without error. If that doesn't work, try commenting it out and uncommenting the other line.

data "external" "myipaddr" {# Pick one or the other. The second one requires an external script but uses DNS instead of https.#program = ["bash", "-c", "curl -s 'https://api.ipify.org?format=json'"]program = ["bash", "${path.module}/myipaddr.sh"]}

If all is successful, run terraform apply and answer yes when prompted.

macbook:terraform_aws_spot_instance brad$ terraform apply
data.external.myipaddr: Refreshing state...
aws_security_group.k8s_sg: Refreshing state... (ID: sg-00dfd8ca76fe4e4f3)
data.aws_ami.ubuntu: Refreshing state...
aws_security_group_rule.allow_all_myip: Refreshing state... (ID: sgrule-2156153638)
aws_security_group_rule.allow_SG_any: Refreshing state... (ID: sgrule-104774467)
aws_security_group_rule.allow_all_egress: Refreshing state... (ID: sgrule-1753581355)
.... [lots of output omitted]
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
Outputs:
master_ip = 54.218.25.205
worker_ips = [
54.202.109.45,
34.211.115.248
]
macbook:terraform_aws_spot_instance brad$

Congratulations, you now have three spot instances running the latest Ubuntu 16.04 AMI. The public IP addresses are printed at the bottom of the output. We will need these as we move on to Ansible and installing Kubernetes.

[Diagram: configuration deployed via Ansible]

Installing Kubernetes with Ansible

Assuming you have already installed Python, you can install Ansible with pip. You should be running at least Ansible 2.4. Check which version you are on with ansible --version.
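
For example, on a machine that already has Python and pip:

# Install (or upgrade to) a 2.4+ release of Ansible and confirm the version
pip install --upgrade "ansible>=2.4"
ansible --version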

If you haven’t already cloned my git repo with all the Ansible scripts, do that now from https://github.com/magic7s/ansible-kubeadm-contiv

You should now have the following files locally:

.
├── deploy_contiv_network_config.yml
├── deploy_istio.yml
├── deploy_sample_apps.yml
├── group_vars
│ └── all
├── inventory
├── LICENSE
├── README.md
├── roles
│ ├── common
│ │ └── tasks
│ │ └── main.yml
│ ├── contiv
│ │ └── tasks
│ │ └── main.yml
│ ├── contiv_network_cfg
│ │ └── tasks
│ │ └── main.yml
│ ├── docker
│ │ └── tasks
│ │ └── main.yml
│ ├── istio
│ │ └── tasks
│ │ └── main.yml
│ ├── kubeadm
│ │ └── tasks
│ │ └── main.yml
│ ├── master
│ │ └── tasks
│ │ └── main.yml
│ ├── sample_apps
│ │ ├── tasks
│ │ │ ├── main.yml
│ │ │ ├── PHPGuestbook.yml
│ │ │ └── WordPressSQL.yml
│ │ ├── templates
│ │ │ ├── guestbook.yaml
│ │ │ └── wordpresssql.yaml
│ │ └── vars
│ │ └── main.yml
│ └── worker
│ └── tasks
│ └── main.yml
└── site.yml
22 directories, 22 files

site.yml is the main playbook that we will run to install everything. However, after the main installation (Docker, Kubernetes, Contiv, etc.) you can run deploy_contiv_network_config.yml, deploy_sample_apps.yml, or deploy_istio.yml as separate playbooks. By default everything will be installed in a single run.
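
For example, to re-run just the sample apps later against an existing cluster, the invocation should look something like this (same inventory file as the main run):

# Re-deploy only the sample applications
ansible-playbook -i inventory deploy_sample_apps.yml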

You will need to review and update two files: group_vars/all and inventory.

  1. kubeadm_token is a randomly generated token used to authenticate worker nodes into the cluster. (See the note after the variables file below for one way to generate a fresh one.)
  2. If for some reason your network interface is not eth0, you will need to update the master_ip variable.
  3. If you do not want to install the sample apps or the Istio service mesh (which is just another sample app), change the values to false in the vars file.
  4. contiv_nets are the subnets used within the cluster; they do not need to be routable to the outside. Pods (groups of containers) will be accessed via port translation on the host's IP address (inbound) or NAT (outbound).
## group_vars/all
kubeadm_token: db85f7.cff657b31b20eed5
master_ip: "{{ hostvars['master1']['ansible_eth0']['ipv4']['address'] }}"
ansible_remote_user: ubuntu
kubeadm_reset_before_init: true
delete_kube_dns: false
deploy_sample_apps: true
deploy_istio: true
contiv_nets:
  # This network name is required for host to access pods.
  contivh1:
    net_type: -n infra
    net_sub: -s 192.0.2.0/24
    net_gw: -g 192.0.2.1
  # This is used for pods that do not have a io.contiv.network label
  default-net:
    net_type: -n data
    net_sub: -s 172.16.10.10-172.16.10.250/24
    net_gw: -g 172.16.10.1
  blue:
    net_type: -n data
    net_sub: -s 172.17.0.5-172.17.15.250/20
    net_gw: -g 172.17.0.1
  green:
    net_type: -n data
    net_sub: -s 172.18.0.5-172.18.15.250/20
    net_gw: -g 172.18.0.1
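
If you want to replace kubeadm_token with your own, it just needs to match kubeadm's <6 chars>.<16 chars> format. One quick way to generate one (kubeadm token generate does the same thing if you happen to have kubeadm installed somewhere):

# Generate a token in the 6.16 hex-character format kubeadm expects
echo "$(openssl rand -hex 3).$(openssl rand -hex 8)"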

If you still have the output from the terraform apply command, the public IP addresses were printed at the bottom; otherwise run terraform output.

macbook:terraform_aws_spot_instance brad$ terraform output
master_ip = 54.218.113.71
worker_ips = [
54.190.7.158,
34.211.12.22
]
  1. Change the IP addresses to the public IP addresses of your hosts. Be careful not to copy and paste the comma between the worker IP addresses above.
# inventory
[master]
master1 ansible_ssh_host=54.218.113.71
[worker]
54.190.7.158
34.211.12.22
[master:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
[worker:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no'

After this, you should be able to run the ansible playbook.

macbook:ansible-kubeadm-contiv brad$ ansible-playbook -i inventory site.yml
.... [lots of output omitted]
PLAY RECAP *********************************************************************************************************************************************************
34.217.135.160 : ok=16 changed=13 unreachable=0 failed=0
54.213.221.194 : ok=16 changed=13 unreachable=0 failed=0
master1 : ok=51 changed=41 unreachable=0 failed=0

Some tasks are expected to fail and are configured to be ignored; those can safely be disregarded. If the play stops and any host shows failed=1, that needs to be investigated.

For the most part you can run the playbook a second time and it will only update what is needed or delete old configuration before re-applying. This is not 100% foolproof, and in some cases it is easier to run terraform destroy and terraform apply to delete and spin up new hosts. (Remember to update the IP addresses in the inventory file.)
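
In practice the reset cycle looks like this (both commands prompt for confirmation; the relative path assumes the two repos were cloned side by side as in the TL;DR):

# Tear everything down, build fresh hosts, then grab the new public IPs for the inventory file
cd ../terraform_aws_spot_instance
terraform destroy
terraform apply
terraform output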

Accessing and testing the kubernetes cluster

If all went well above, you should be able to access the newly built cluster.

SSH into the master node. If your laptop's IP address has changed, run terraform apply again; it will re-check your IP address and update the AWS security group.
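
Something like this should get you onto the master (Ubuntu AMIs use the ubuntu user; substitute your own key and the master_ip from the Terraform output):

# SSH to the master using the public IP from terraform output
ssh -i ~/.ssh/id_rsa ubuntu@54.218.113.71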

Check if the cluster is in a Ready state:

ubuntu@ip-172-31-9-150:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-172-31-13-102 Ready <none> 9m v1.9.3
ip-172-31-8-222 Ready <none> 9m v1.9.3
ip-172-31-9-150 Ready master 17m v1.9.3

Check if all the pods are running:

ubuntu@ip-172-31-9-150:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default frontend-5b697d64fb-bvl6l 1/1 Running 0 9m
default frontend-5b697d64fb-jck77 1/1 Running 0 9m
default frontend-5b697d64fb-zfr56 1/1 Running 0 9m
default redis-master-54c674b96-kn76j 1/1 Running 0 9m
default redis-slave-58d5cbd65d-9msk6 1/1 Running 0 9m
default redis-slave-58d5cbd65d-dj2tn 1/1 Running 0 9m
default wordpress-77d578745-7w8lv 1/1 Running 0 8m
default wordpress-mysql-5fbdd6545b-qqb8t 1/1 Running 0 8m
kube-system contiv-etcd-wzckp 1/1 Running 0 14m
kube-system contiv-netmaster-cwv2k 1/1 Running 0 14m
kube-system contiv-netplugin-6wk6j 1/1 Running 0 10m
kube-system contiv-netplugin-9s7kj 1/1 Running 0 9m
kube-system contiv-netplugin-pjcc2 1/1 Running 0 14m
kube-system contiv-ovs-cq29l 2/2 Running 0 10m
kube-system contiv-ovs-hmgw8 2/2 Running 0 9m
kube-system contiv-ovs-knl6h 2/2 Running 0 14m
kube-system etcd-ip-172-31-9-150 1/1 Running 0 17m
kube-system kube-apiserver-ip-172-31-9-150 1/1 Running 0 17m
kube-system kube-controller-manager-ip-172-31-9-150 1/1 Running 0 17m
kube-system kube-proxy-fhr4z 1/1 Running 0 10m
kube-system kube-proxy-kdk6g 1/1 Running 0 17m
kube-system kube-proxy-kz6rj 1/1 Running 0 9m
kube-system kube-scheduler-ip-172-31-9-150 1/1 Running 0 17m

Check the service IP address and ports the sample apps are running:

ubuntu@ip-172-31-9-150:~$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
frontend NodePort 10.96.175.217 <none> 80:31887/TCP 12m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 21m
redis-master ClusterIP 10.107.116.88 <none> 6379/TCP 12m
redis-slave ClusterIP 10.105.113.208 <none> 6379/TCP 12m
wordpress LoadBalancer 10.104.186.47 <pending> 80:30606/TCP 12m
wordpress-mysql ClusterIP 10.108.198.177 <none> 3306/TCP 12m

You can now access the sample apps via the NodePort. Every host in the cluster exposes a random high port that is mapped to the application; see the lookup sketch after the examples below.

Example:

  • http://54.213.139.136:31887 would access the Guestbook app
  • http://54.213.139.136:30606 would access the Wordpress app
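
If you would rather not read the ports out of the kubectl get svc table, a jsonpath query can pull them directly. A sketch, using the service names deployed by the sample apps:

# Look up the NodePorts assigned to the Guestbook frontend and the WordPress service
kubectl get svc frontend -o jsonpath='{.spec.ports[0].nodePort}'; echo
kubectl get svc wordpress -o jsonpath='{.spec.ports[0].nodePort}'; echo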

Contiv Network Configuration

Contiv is the container networking (CNI) plugin, and we have deployed three data networks plus an infra network.

ubuntu@ip-172-31-9-150:~$ netctl net ls
Tenant Network Nw Type Encap type Packet tag Subnet Gateway IPv6Subnet IPv6Gateway Cfgd Tag
------ ------- ------- ---------- ---------- ------- ------ ---------- ----------- ---------
default blue data vxlan 0 172.17.0.5-172.17.15.250/20 172.17.0.1
default contivh1 infra vxlan 0 192.0.2.0/24 192.0.2.1
default default-net data vxlan 0 172.16.10.10-172.16.10.250/24 172.16.10.1
default green data vxlan 0 172.18.0.5-172.18.15.250/20 172.18.0.1
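
The -n / -s / -g flags in the group_vars file map straight onto netctl, so adding another network by hand should look roughly like this (a sketch; flags are assumed from the variables above):

# Create an additional "purple" data network using the same flag style as the playbook
netctl net create -n data -s 172.19.0.5-172.19.15.250/20 -g 172.19.0.1 purple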

All of our pods are deployed into the “blue” network. We can see that by the IP addresses of the pods.

ubuntu@ip-172-31-9-150:~$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
frontend-5b697d64fb-c9bzn 1/1 Running 0 2m 172.17.0.6 ip-172-31-8-222
frontend-5b697d64fb-ddlc5 1/1 Running 0 2m 172.17.0.5 ip-172-31-13-102
frontend-5b697d64fb-xmwfd 1/1 Running 0 2m 172.17.0.7 ip-172-31-9-150
redis-master-54c674b96-zz6nn 1/1 Running 0 2m 172.17.0.8 ip-172-31-13-102
redis-slave-58d5cbd65d-46gq2 1/1 Running 0 2m 172.17.0.9 ip-172-31-8-222
redis-slave-58d5cbd65d-qpd8h 1/1 Running 0 2m 172.17.0.10 ip-172-31-13-102
wordpress-75d9d5f86b-f7d8r 1/1 Running 0 1m 172.17.0.12 ip-172-31-13-102
wordpress-mysql-5fbdd6545b-xfcgq 1/1 Running 0 1m 172.17.0.11 ip-172-31-8-222
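
To see a pod land in a different network, you can label it with io.contiv.network (the label the default-net comment in group_vars refers to). A quick throwaway test might look like this (a sketch; pod name and image are arbitrary):

# Launch a test pod into the "green" network and check that its IP comes from the 172.18.x.x range
kubectl run nettest --image=busybox --restart=Never --labels="io.contiv.network=green" -- sleep 3600
kubectl get pod nettest -o wide
kubectl delete pod nettest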

Let’s look at some of the group policy commands in Contiv. Here is a single endpoint group (the Guestbook web tier). We can see it contains three endpoints that have been labeled with the tag io.contiv.net-group = epg-blue-guestbook-web

ubuntu@ip-172-31-9-150:~$ netctl group inspect epg-blue-guestbook-web
{
"Config": {
"key": "default:epg-blue-guestbook-web",
"groupName": "epg-blue-guestbook-web",
"networkName": "blue",
"policies": [
"policy-blue-guestbook-web"
],
"tenantName": "default",
"link-sets": {
"MatchRules": {
"default:policy-blue-guestbook-db:10": {
"type": "rule",
"key": "default:policy-blue-guestbook-db:10"
}
},
"Policies": {
"default:policy-blue-guestbook-web": {
"type": "policy",
"key": "default:policy-blue-guestbook-web"
}
}
},
"links": {
"AppProfile": {},
"NetProfile": {},
"Network": {
"type": "network",
"key": "default:blue"
},
"Tenant": {
"type": "tenant",
"key": "default"
}
}
},
"Oper": {
"endpoints": [
{
"containerName": "frontend-5b697d64fb-ddlc5",
"endpointGroupId": 6,
"endpointGroupKey": "epg-blue-guestbook-web:default",
"endpointID": "0268f0a30adbf76b30c6b0d5910f1f78d5719afd9761507f57a455813a827604",
"homingHost": "ip-172-31-13-102",
"ipAddress": [
"172.17.0.5",
""
],
"labels": "map[]",
"macAddress": "02:02:ac:11:00:05",
"network": "blue.default",
"serviceName": "epg-blue-guestbook-web"
},
{
"containerName": "frontend-5b697d64fb-c9bzn",
"endpointGroupId": 6,
"endpointGroupKey": "epg-blue-guestbook-web:default",
"endpointID": "24764b44e343c9352d96da863d2f9a788060487707c13a6a7093e5fb609a5261",
"homingHost": "ip-172-31-8-222",
"ipAddress": [
"172.17.0.6",
""
],
"labels": "map[]",
"macAddress": "02:02:ac:11:00:06",
"network": "blue.default",
"serviceName": "epg-blue-guestbook-web"
},
{
"containerName": "frontend-5b697d64fb-xmwfd",
"endpointGroupId": 6,
"endpointGroupKey": "epg-blue-guestbook-web:default",
"endpointID": "2ec34a609b8819553f714c3cf348d96910053a9deee51b6b577a95a691a2cf6c",
"homingHost": "ip-172-31-9-150",
"ipAddress": [
"172.17.0.7",
""
],
"labels": "map[]",
"macAddress": "02:02:ac:11:00:07",
"network": "blue.default",
"serviceName": "epg-blue-guestbook-web"
}
],
"externalPktTag": 1,
"groupTag": "epg-blue-guestbook-web.default",
"numEndpoints": 3,
"pktTag": 1
}
}
ubuntu@ip-172-31-9-150:~$

We can see all of the configured groups and the policies associated with each group, then all of the policies, and then the rule set associated with each policy.

ubuntu@ip-172-31-9-150:~$ netctl group ls
Tenant Group Network IP Pool CfgdTag Policies Network profile
------ ----- ------- ------- ------- -------- ---------------
default epg-blue-guestbook-web blue policy-blue-guestbook-web
default epg-blue-guestbook-db blue policy-blue-guestbook-db
default epg-blue-wordpress-web blue policy-blue-wordpress-web
default epg-blue-wordpress-db blue policy-blue-wordpress-db
ubuntu@ip-172-31-9-150:~$ netctl policy ls
Tenant Policy
------ ------
default policy-blue-guestbook-web
default policy-blue-guestbook-db
default policy-blue-wordpress-web
default policy-blue-wordpress-db
ubuntu@ip-172-31-9-150:~$ netctl policy rule-ls policy-blue-guestbook-web
Incoming Rules:
Rule Priority From EndpointGroup From Network From IpAddress To IpAddress Protocol Port Action
---- -------- ------------------ ------------ --------- ------------ -------- ---- ------
10 1 tcp 80 allow
Outgoing Rules:
Rule Priority To EndpointGroup To Network To IpAddress Protocol Port Action
---- -------- ---------------- ---------- --------- -------- ---- ------
ubuntu@ip-172-31-9-150:~$ netctl policy rule-ls policy-blue-guestbook-db
Incoming Rules:
Rule Priority From EndpointGroup From Network From IpAddress To IpAddress Protocol Port Action
---- -------- ------------------ ------------ --------- ------------ -------- ---- ------
20 1 icmp 0 allow
10 1 epg-blue-guestbook-web 0 allow
Outgoing Rules:
Rule Priority To EndpointGroup To Network To IpAddress Protocol Port Action
---- -------- ---------------- ---------- --------- -------- ---- ------
ubuntu@ip-172-31-9-150:~$

These groups and policies are all configured via Ansible in roles/sample_apps/vars/main.yml

Istio Service Mesh

If you have deploy_istio: true in the group_vars/all file, you should have Istio and the Bookinfo sample application installed.

You should be able to access the Bookinfo app via the istio-ingress service.

ubuntu@ip-172-31-9-150:~$ kubectl get service -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingress LoadBalancer 10.110.99.208 <pending> 80:32398/TCP,443:31125/TCP 26m
istio-mixer ClusterIP 10.105.165.158 <none> 9091/TCP,15004/TCP,9093/TCP,9094/TCP,9102/TCP,9125/UDP,42422/TCP 26m
istio-pilot ClusterIP 10.100.147.149 <none> 15003/TCP,8080/TCP,9093/TCP,443/TCP 26m

Because the port assignment is random, your port numbers will be different. Mine would be: http://54.213.139.136:32398/productpage
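
A jsonpath query can also pull the HTTP NodePort out of the istio-ingress service so you don't have to read it from the table (a sketch; port order is assumed):

# Grab the NodePort mapped to port 80 on istio-ingress, then browse to /productpage on any node's public IP
kubectl get svc istio-ingress -n istio-system -o jsonpath='{.spec.ports[0].nodePort}'; echo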

Follow the sample guide at https://istio.io/docs/tasks/traffic-management/request-routing.html for more sample configuration tasks. All the Istio files are located in /tmp/istio-0.6.0/

Enjoy!

Please provide feedback and/or suggestions. I used this process to learn (and troubleshoot) various new technologies. I am happy to accept pull requests on any of the code via GitHub. If you notice any errors or things that are not explained in this guide, please let me know and I will update it.


A lover of technology. I am a Strategic Account Leader at GitLab; the opinions expressed in this blog are my own views and not those of GitLab.