VMware Tanzu Community Edition Kubernetes Platform in a Homelab

Back in October, VMware announced Tanzu Community Edition as a way to provide "a full-featured, easy-to-manage Kubernetes platform that’s perfect for users and learners alike." TCE bundles a bunch of open-source components together in a modular, "batteries included but swappable" way:

Tanzu Community Edition components

I've been meaning to brush up on my Kubernetes skills so I thought deploying and using TCE in my self-contained homelab would be a fun and rewarding learning exercise - and it was!

Here's how I did it.

Planning

TCE supports several different deployment scenarios and targets. It can be configured as separate Management and Workload Clusters or as a single integrated Standalone Cluster, and deployed to cloud providers like AWS and Azure, on-premise vSphere, or even a local Docker environment1. I'll be using the standard Management + Workload Cluster setup in my on-prem vSphere, so I start by reviewing the Prepare to Deploy a Cluster to vSphere documentation to get an idea of what I'll need.

Looking ahead, part of the installation process creates a local KIND cluster for bootstrapping the Management and Workload clusters. I do most of my home computing (and homelab work) by using the Linux environment available on my Chromebook. Unfortunately I know from past experience that KIND will not work within this environment so I'll be using a Debian 10 VM to do the deployment.

Networking

The Kubernetes node VMs will need to be attached to a network with a DHCP server to assign their addresses, and that network will need to be able to talk to vSphere. My router handles DHCP for the range 192.168.1.101-250 so I'll plan on using that.

I'll also need to set aside a few static IPs for this project. These will need to be routable and within the same subnet as the DHCP range, but excluded from that DHCP range.

IP Address                     Purpose
192.168.1.60                   Control plane for Management cluster
192.168.1.61                   Control plane for Workload cluster
192.168.1.64 - 192.168.1.80    IP range for Workload load balancer
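Before committing to those addresses, it's worth a quick sanity check that none of them fall inside the router's DHCP pool. This is only a minimal sketch: ip_to_int and ip_in_range are helper names I made up for illustration, and the addresses are the ones planned above.

```shell
# Convert a dotted-quad IPv4 address to an integer so ranges can be compared.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Succeeds when IP ($1) lies inside the inclusive range START ($2) to END ($3).
ip_in_range() {
  ip=$(ip_to_int "$1")
  lo=$(ip_to_int "$2")
  hi=$(ip_to_int "$3")
  [ "$ip" -ge "$lo" ] && [ "$ip" -le "$hi" ]
}

# Report whether each planned static IP collides with the DHCP range.
for addr in 192.168.1.60 192.168.1.61 192.168.1.64 192.168.1.80; do
  if ip_in_range "$addr" 192.168.1.101 192.168.1.250; then
    echo "$addr overlaps the DHCP range"
  else
    echo "$addr is outside the DHCP range"
  fi
done
```

All four addresses should be reported as outside the DHCP range, which is what the plan calls for.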

Prerequisites

Moving on to the Getting Started, I'll need to grab some software before I can actually Get Started.

Kubernetes control plane image

I need to download a VMware OVA which can be used for deploying my Kubernetes nodes from the VMware Customer Connect portal here2. There are a few different options available. I'll get the Photon release with the highest Kubernetes version currently available, photon-3-kube-v1.21.2+vmware.1-tkg.2-12816990095845873721.ova.

Once the file is downloaded, I'll log into my vCenter and use the Deploy OVF Template action to deploy a new VM using the OVA. I won't bother booting the machine once deployed but will rename it to k8s-node to make it easier to identify later on and then convert it to a template.

New k8s-node template

Docker

I've already got Docker installed on this machine, but if I didn't I would follow the instructions here to get it installed and then follow these instructions to enable management of Docker without root.

I also verify that my install is using cgroup version 1 as version 2 is not currently supported:

❯ docker info | grep -i cgroup
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
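If I wanted to fold that check into a script, something like the following could work. It is a rough sketch that only parses text in the format shown above; cgroup_version and require_cgroup_v1 are hypothetical helper names, not part of Docker or Tanzu.

```shell
# Print the cgroup version found in `docker info`-style text read from stdin.
cgroup_version() {
  awk -F': *' '/Cgroup Version/ { print $2; exit }'
}

# Succeeds only when the reported cgroup version is 1.
require_cgroup_v1() {
  version=$(cgroup_version)
  if [ "$version" = "1" ]; then
    echo "cgroup v1 detected"
  else
    echo "unsupported cgroup version: ${version:-unknown}" >&2
    return 1
  fi
}

# Usage (requires Docker on the machine):
#   docker info 2>/dev/null | require_cgroup_v1
```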

kubectl binary

Next up, I'll install kubectl as described here - though the latest version is currently 1.23 and that won't work with the 1.21 control plane node image I downloaded from VMware (kubectl needs to be within one minor version of the control plane). Instead I need to find the latest 1.22 release.

I can look at the releases page on GitHub to see that the latest release for me is 1.22.5. With this newfound knowledge I can follow the Install kubectl binary with curl on Linux instructions to grab that specific version:

❯ curl -LO https://dl.k8s.io/release/v1.22.5/bin/linux/amd64/kubectl

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   154  100   154    0     0   2298      0 --:--:-- --:--:-- --:--:--  2298
100 44.7M  100 44.7M    0     0  56.9M      0 --:--:-- --:--:-- --:--:-- 56.9M

❯ sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

[sudo] password for john:

❯ kubectl version --client
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:38:33Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
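The "within one minor version" rule is easy to express as a quick check. Here's a minimal sketch, assuming both versions share the same major number; minor_of and within_skew are names I invented for this example, and v1.23.0 just stands in for "some 1.23 release".

```shell
# Extract the minor number from a version such as "v1.22.5" or "v1.21.2+vmware.1".
minor_of() {
  echo "$1" | sed -e 's/^v//' | cut -d. -f2
}

# Succeeds when the client and server minor versions differ by at most one.
within_skew() {
  client=$(minor_of "$1")
  server=$(minor_of "$2")
  diff=$(( client - server ))
  [ "$diff" -ge -1 ] && [ "$diff" -le 1 ]
}

# Compare two candidate kubectl versions against the node image's version.
for candidate in v1.23.0 v1.22.5; do
  if within_skew "$candidate" "v1.21.2+vmware.1"; then
    echo "$candidate is within one minor version of the control plane"
  else
    echo "$candidate is more than one minor version away from the control plane"
  fi
done
```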

kind binary

It's not strictly a requirement, but having the kind executable available will be handy for troubleshooting during the bootstrap process in case anything goes sideways. It can be installed in basically the same way as kubectl:

❯ curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.11.1/kind-linux-amd64

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    98  100    98    0     0    513      0 --:--:-- --:--:-- --:--:--   513
100   655  100   655    0     0   2212      0 --:--:-- --:--:-- --:--:-- 10076
100 6660k  100 6660k    0     0  11.8M      0 --:--:-- --:--:-- --:--:-- 11.8M

❯ sudo install -o root -g root -m 0755 kind /usr/local/bin/kind

❯ kind version
kind v0.11.1 go1.16.5 linux/amd64

Tanzu CLI

The final bit of required software is the Tanzu CLI, which can be downloaded from the project on GitHub.

curl -H "Accept: application/vnd.github.v3.raw" \
    -L https://api.github.com/repos/vmware-tanzu/community-edition/contents/hack/get-tce-release.sh | \
    bash -s v0.9.1 linux

And then unpack it and run the installer:

tar xf tce-linux-amd64-v0.9.1.tar.gz
cd tce-linux-amd64-v0.9.1
./install.sh

I can then verify the installation is working correctly:

❯ tanzu version
version: v0.2.1
buildDate: 2021-09-29
sha: ceaa474

Cluster creation

Okay, now it's time for the good stuff - creating some shiny new Tanzu clusters! The Tanzu CLI really does make this very easy to accomplish.

Management cluster

I need to create a Management cluster first and I'd like to do that with the UI, so that's as simple as:

tanzu management-cluster create --ui

I should then be able to access the UI by pointing a web browser at http://127.0.0.1:8080... but I'm running this on a VM without a GUI, so I'll need to back up and tell it to bind on 0.0.0.0:8080 so the web installer will be accessible across the network. I can also include --browser none so that the installer doesn't bother with trying to launch a browser locally.

❯ tanzu management-cluster create --ui --bind 0.0.0.0:8080 --browser none

Validating the pre-requisites...
Serving kickstart UI at http://[::]:8080

Now I can point my local browser to my VM and see the UI:

The Tanzu Installer UI

And then I can click the button at the bottom left to save my eyes3 before selecting the option to deploy on vSphere.

Configuring the IaaS Provider

I'll plug in the FQDN of my vCenter and provide a username and password to use to connect to it, then hit the Connect button. That will prompt me to accept the vCenter's certificate thumbprint, and then I'll be able to select the virtual datacenter that I want to use. Finally, I'll paste in the SSH public key4 I'll use for interacting with the cluster.

I click Next and move on to the Management Cluster Settings.

Configuring the Management Cluster

This is for a lab environment that's fairly memory-constrained, so I'll pick the single-node Development setup with a small instance type. I'll name the cluster tce-mgmt and stick with the default kube-vip control plane endpoint provider. I plug in the control plane endpoint IP that I'll use for connecting to the cluster and select the small instance type for the worker node type.

I don't have an NSX Advanced Load Balancer or any Metadata to configure so I'll skip past those steps and move on to configuring the Resources.

Configuring Resources

Here I choose to place the Tanzu-related resources in a VM folder named Tanzu, to store their data on my single host's single datastore, and to deploy to the one-host physical-cluster cluster.

Now for the Kubernetes Networking Settings:

Configuring Kubernetes Networking

This bit is actually pretty easy. For Network Name, I select the vSphere network where the 192.168.1.0/24 network I identified earlier lives, d-Home-Mgmt. I leave the service and pod CIDR ranges as default.

I disable the Identity Management option and then pick the k8s-node template I had imported to vSphere earlier.

Configuring the OS Image

I skip the Tanzu Mission Control piece (since I'm still waiting on access to TMC Starter) and click the Review Configuration button at the bottom of the screen to review my selections.

Reviewing the configuration

See the option at the bottom to copy the CLI command? I'll need to use that since clicking the friendly Deploy button doesn't seem to work while connected to the web server remotely.

tanzu management-cluster create --file /home/john/.config/tanzu/tkg/clusterconfigs/dr94t3m2on.yaml -v 6

In fact, I'm going to copy that file into my working directory and give it a more descriptive name so that I can re-use it in the future.

cp ~/.config/tanzu/tkg/clusterconfigs/dr94t3m2on.yaml ~/projects/tanzu-homelab/tce-mgmt.yaml

Now I can run the install command:

tanzu management-cluster create --file ./tce-mgmt.yaml -v 6

After a moment or two of verifying prerequisites, I'm met with a polite offer to enable Tanzu Kubernetes Grid Service in vSphere:

vSphere 7.0 Environment Detected.

You have connected to a vSphere 7.0 environment which does not have vSphere with Tanzu enabled. vSphere with Tanzu includes
an integrated Tanzu Kubernetes Grid Service which turns a vSphere cluster into a platform for running Kubernetes workloads in dedicated
resource pools. Configuring Tanzu Kubernetes Grid Service is done through vSphere HTML5 client.

Tanzu Kubernetes Grid Service is the preferred way to consume Tanzu Kubernetes Grid in vSphere 7.0 environments. Alternatively you may
deploy a non-integrated Tanzu Kubernetes Grid instance on vSphere 7.0.
Note: To skip the prompts and directly deploy a non-integrated Tanzu Kubernetes Grid instance on vSphere 7.0, you can set the 'DEPLOY_TKG_ON_VSPHERE7' configuration variable to 'true'

Do you want to configure vSphere with Tanzu? [y/N]: n
Would you like to deploy a non-integrated Tanzu Kubernetes Grid management cluster on vSphere 7.0? [y/N]: y

That's not what I'm after in this case, though, so I'll answer with an n and a y to confirm that I want the non-integrated TKG deployment.
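The note in that prompt mentions a DEPLOY_TKG_ON_VSPHERE7 variable for skipping the questions on future runs. Here's a minimal sketch of adding it, assuming the installer honors the variable when it appears as a KEY: value line in the cluster configuration file (I haven't verified that here); set_config_var is a made-up helper.

```shell
# Append "KEY: VALUE" to a flat cluster config file unless KEY is already present.
set_config_var() {
  file=$1
  key=$2
  value=$3
  if grep -q "^${key}:" "$file"; then
    echo "$key already set in $file"
  else
    printf '%s: %s\n' "$key" "$value" >> "$file"
    echo "added $key to $file"
  fi
}

# Usage:
#   set_config_var ./tce-mgmt.yaml DEPLOY_TKG_ON_VSPHERE7 true
```

Checking for the key first keeps the file from collecting duplicate entries if the helper is run more than once.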

And now I go get coffee as it'll take 10-15 minutes for the deployment to complete.

Coffee break!

Okay, I'm back - and so is my shell prompt! The deployment completed successfully:

Waiting for additional components to be up and running...
Waiting for packages to be up and running...
Context set for management cluster tce-mgmt as 'tce-mgmt-admin@tce-mgmt'.

Management cluster created!


You can now create your first workload cluster by running the following:

  tanzu cluster create [name] -f [file]


Some addons might be getting installed! Check their status by running the following:

  kubectl get apps -A

I can run that last command to go ahead and verify that the addon installation has completed:

❯ kubectl get apps -A
NAMESPACE    NAME                   DESCRIPTION           SINCE-DEPLOY   AGE
tkg-system   antrea                 Reconcile succeeded   26s            6m49s
tkg-system   metrics-server         Reconcile succeeded   36s            6m49s
tkg-system   tanzu-addons-manager   Reconcile succeeded   22s            8m54s
tkg-system   vsphere-cpi            Reconcile succeeded   19s            6m50s
tkg-system   vsphere-csi            Reconcile succeeded   36s            6m50s

And I can use the Tanzu CLI to get some other details about the new management cluster:

❯ tanzu management-cluster get tce-mgmt
  NAME      NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
  tce-mgmt  tkg-system  running  1/1           1/1      v1.21.2+vmware.1  management


Details:

NAME                                                         READY  SEVERITY  REASON  SINCE  MESSAGE
/tce-mgmt                                                    True                     40m
├─ClusterInfrastructure - VSphereCluster/tce-mgmt            True                     41m
├─ControlPlane - KubeadmControlPlane/tce-mgmt-control-plane  True                     40m
│ └─Machine/tce-mgmt-control-plane-xtdnx                     True                     40m
└─Workers
  └─MachineDeployment/tce-mgmt-md-0
    └─Machine/tce-mgmt-md-0-745b858d44-4c9vv                 True                     40m


Providers:

  NAMESPACE                          NAME                    TYPE                    PROVIDERNAME  VERSION  WATCHNAMESPACE
  capi-kubeadm-bootstrap-system      bootstrap-kubeadm       BootstrapProvider       kubeadm       v0.3.23
  capi-kubeadm-control-plane-system  control-plane-kubeadm   ControlPlaneProvider    kubeadm       v0.3.23
  capi-system                        cluster-api             CoreProvider            cluster-api   v0.3.23
  capv-system                        infrastructure-vsphere  InfrastructureProvider  vsphere       v0.7.10

Excellent! Things are looking good so I can move on to create the cluster which will actually run my workloads.

Workload cluster

I won't use the UI for this but will instead take a copy of my tce-mgmt.yaml file and adapt it to suit the workload needs (as described here).

cp tce-mgmt.yaml tce-work.yaml
vi tce-work.yaml

I only need to change 2 of the parameters in this file:

  • CLUSTER_NAME: from tce-mgmt to tce-work
  • VSPHERE_CONTROL_PLANE_ENDPOINT: from 192.168.1.60 to 192.168.1.61

I could change a few others if I wanted to5:

  • (Optional) CLUSTER_PLAN to change between dev/prod plans independently
  • (Optional) CONTROL_PLANE_MACHINE_COUNT to deploy an increased number of control plane nodes (must be an odd integer)
  • (Optional) WORKER_MACHINE_COUNT to add worker nodes
  • (Optional) NAMESPACE to deploy the cluster in a specific Kubernetes namespace
  • (Optional) OS_NAME and OS_VERSION to use a different machine image for the workload cluster
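The two required changes could also be scripted instead of made by hand in vi. This is just a sketch, assuming the config file keeps each setting on its own KEY: value line as the installer-generated file does; make_workload_config is a hypothetical helper name.

```shell
# Write a workload cluster config derived from a management cluster config,
# changing only the cluster name and the control plane endpoint.
make_workload_config() {
  src=$1
  dest=$2
  name=$3
  endpoint=$4
  sed -e "s/^CLUSTER_NAME:.*/CLUSTER_NAME: ${name}/" \
      -e "s/^VSPHERE_CONTROL_PLANE_ENDPOINT:.*/VSPHERE_CONTROL_PLANE_ENDPOINT: ${endpoint}/" \
      "$src" > "$dest"
}

# Usage:
#   make_workload_config tce-mgmt.yaml tce-work.yaml tce-work 192.168.1.61
```

Writing to a new file rather than editing in place leaves the management config untouched for later reuse.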

After saving my changes to the tce-work.yaml file, I'm ready to deploy the cluster:

❯ tanzu cluster create --file tce-work.yaml
Validating configuration...
Warning: Pinniped configuration not found. Skipping pinniped configuration in workload cluster. Please refer to the documentation to check if you can configure pinniped on workload cluster manually
Creating workload cluster 'tce-work'...
Waiting for cluster to be initialized...
Waiting for cluster nodes to be available...
Waiting for addons installation...
Waiting for packages to be up and running...

Workload cluster 'tce-work' created

Right on! I'll use tanzu cluster get to check out the workload cluster:

❯ tanzu cluster get tce-work
  NAME      NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
  tce-work  default    running  1/1           1/1      v1.21.2+vmware.1  <none>


Details:

NAME                                                         READY  SEVERITY  REASON  SINCE  MESSAGE
/tce-work                                                    True                     9m31s
├─ClusterInfrastructure - VSphereCluster/tce-work            True                     10m
├─ControlPlane - KubeadmControlPlane/tce-work-control-plane  True                     9m31s
│ └─Machine/tce-work-control-plane-8km9m                     True                     9m31s
└─Workers
  └─MachineDeployment/tce-work-md-0
    └─Machine/tce-work-md-0-687444b744-cck4x                 True                     8m31s

I can also go into vCenter and take a look at the VMs which constitute the two clusters:

Cluster VMs

I've highlighted the two Control Plane nodes. They got their IP addresses assigned by DHCP, but VMware says that I need to create reservations for them to make sure they don't change. So I'll do just that.

DHCP reservations on Google Wifi

Excellent, I've got a Tanzu management cluster and a Tanzu workload cluster. What now?

Working with Tanzu

If I run kubectl get nodes right now, I'll only get information about the management cluster:

❯ kubectl get nodes
NAME                             STATUS   ROLES                  AGE   VERSION
tce-mgmt-control-plane-xtdnx     Ready    control-plane,master   18h   v1.21.2+vmware.1
tce-mgmt-md-0-745b858d44-4c9vv   Ready    <none>                 17h   v1.21.2+vmware.1

Setting the right context

To be able to deploy stuff to the workload cluster, I need to tell kubectl how to talk to it. And to do that, I'll first need to use tanzu to capture the cluster's kubeconfig:

❯ tanzu cluster kubeconfig get tce-work --admin
Credentials of cluster 'tce-work' have been saved
You can now access the cluster by running 'kubectl config use-context tce-work-admin@tce-work'

I can now run kubectl config get-contexts and see that I have access to contexts on both management and workload clusters:

❯ kubectl config get-contexts
CURRENT   NAME                      CLUSTER    AUTHINFO         NAMESPACE
*         tce-mgmt-admin@tce-mgmt   tce-mgmt   tce-mgmt-admin
          tce-work-admin@tce-work   tce-work   tce-work-admin

And I can switch to the tce-work cluster like so:

❯ kubectl config use-context tce-work-admin@tce-work
Switched to context "tce-work-admin@tce-work".
❯ kubectl get nodes
NAME                             STATUS   ROLES                  AGE   VERSION
tce-work-control-plane-8km9m     Ready    control-plane,master   17h   v1.21.2+vmware.1
tce-work-md-0-687444b744-cck4x   Ready    <none>                 17h   v1.21.2+vmware.1

There they are!
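If I'd rather not flip the current context back and forth, kubectl's --context flag targets a cluster for a single command. A small sketch, assuming the admin contexts keep following the name-admin@name pattern seen in the output above; admin_context is a made-up helper.

```shell
# Build the admin context name for a cluster, following the pattern that
# appeared in the kubeconfig output (an assumption, not a documented rule).
admin_context() {
  echo "${1}-admin@${1}"
}

# Print the one-off command instead of running it, since it needs a live cluster.
echo "kubectl --context $(admin_context tce-work) get nodes"
```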

Deploying the yelb demo app

Before I move on to deploying actually useful workloads, I'll start with deploying a quick demo application as described in William Lam's post on Interesting Kubernetes application demos. yelb is a web app which consists of a UI front end, application server, database server, and Redis caching service so it's a great little demo to make sure Kubernetes is working correctly.

I can check out the sample deployment that William put together here, and then deploy it with:

❯ kubectl create ns yelb
namespace/yelb created

❯ kubectl apply -f https://raw.githubusercontent.com/lamw/vmware-k8s-app-demo/master/yelb.yaml
service/redis-server created
service/yelb-db created
service/yelb-appserver created
service/yelb-ui created
deployment.apps/yelb-ui created
deployment.apps/redis-server created
deployment.apps/yelb-db created
deployment.apps/yelb-appserver created

❯ kubectl -n yelb get pods
NAME                             READY   STATUS    RESTARTS   AGE
redis-server-74556bbcb7-r9jqc    1/1     Running   0          10s
yelb-appserver-d584bb889-2jspg   1/1     Running   0          10s
yelb-db-694586cd78-wb8tt         1/1     Running   0          10s
yelb-ui-8f54fd88c-k2dw9          1/1     Running   0          10s

Once the app is running, I can point my web browser at it to see it in action. But what IP do I use?

❯ kubectl -n yelb get svc/yelb-ui
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
yelb-ui          NodePort    100.71.228.116   <none>        80:30001/TCP   84s

This demo is using a NodePort type service to expose the front end, which means it will be accessible on port 30001 of every node in the cluster, including the one the pod is running on. I can find that node's IP by:

❯ kubectl -n yelb describe pod $(kubectl -n yelb get pods | grep yelb-ui | awk '{print $1}') | grep "Node:"
Node:         tce-work-md-0-687444b744-cck4x/192.168.1.145

So I can point my browser at http://192.168.1.145:30001 and see the demo:

yelb demo page
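To pull just the address out of that Node: line (for building the URL in a script, say), a little awk does the trick. This is a sketch that only parses the describe format shown above; node_ip is a helper name I made up.

```shell
# Read `kubectl describe pod` text on stdin and print only the node's IP address.
node_ip() {
  awk '$1 == "Node:" { split($2, parts, "/"); print parts[2]; exit }'
}

# Demo using the line from the output above.
printf 'Node:         tce-work-md-0-687444b744-cck4x/192.168.1.145\n' | node_ip

# Usage against a live cluster (POD holds the pod name):
#   kubectl -n yelb describe pod "$POD" | node_ip
```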

After marveling at my own magnificence6 for a few minutes, I'm ready to move on to something more interesting - but first, I'll just delete the yelb namespace to clean up the work I just did:

❯ kubectl delete ns yelb
namespace "yelb" deleted

Now let's move on and try to deploy yelb behind a LoadBalancer service so it will get its own IP. William has a deployment spec for that too.

❯ kubectl create ns yelb
namespace/yelb created

❯ kubectl apply -f https://raw.githubusercontent.com/lamw/vmware-k8s-app-demo/master/yelb-lb.yaml
service/redis-server created
service/yelb-db created
service/yelb-appserver created
service/yelb-ui created
deployment.apps/yelb-ui created
deployment.apps/redis-server created
deployment.apps/yelb-db created
deployment.apps/yelb-appserver created

❯ kubectl -n yelb get pods
NAME                             READY   STATUS    RESTARTS   AGE
redis-server-74556bbcb7-q6l62    1/1     Running   0          7s
yelb-appserver-d584bb889-p5qgd   1/1     Running   0          7s
yelb-db-694586cd78-hjtn4         1/1     Running   0          7s
yelb-ui-8f54fd88c-pm9qw          1/1     Running   0          7s

And I can take a look at that service...

❯ kubectl -n yelb get svc/yelb-ui
NAME      TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
yelb-ui   LoadBalancer   100.67.177.185   <pending>     80:32339/TCP   15s

Wait a minute. That external IP is still <pending>. What gives? Oh yeah, I need to actually deploy and configure a load balancer before I can balance anything. That's up next.

Deploying kube-vip as a load balancer

Fortunately, William Lam wrote up some tips for handling that too. It's based on work by Scott Rosenberg. The quick-and-dirty steps needed to make this work are:

git clone https://github.com/vrabbi/tkgm-customizations.git
cd tkgm-customizations/carvel-packages/kube-vip-package
kubectl apply -n tanzu-package-repo-global -f metadata.yml
kubectl apply -n tanzu-package-repo-global -f package.yaml
cat << EOF > values.yaml
vip_range: 192.168.1.64-192.168.1.80
EOF
tanzu package install kubevip -p kubevip.terasky.com -v 0.3.9 -f values.yaml

Now I can check out the yelb-ui service again:

❯ kubectl -n yelb get svc/yelb-ui
NAME      TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)        AGE
yelb-ui   LoadBalancer   100.67.177.185   192.168.1.65   80:32339/TCP   4h35m

And it's got an IP! I can point my browser to http://192.168.1.65 now and see:

Successful LoadBalancer test!
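For scripting that pending-versus-assigned check, the EXTERNAL-IP column can be read straight from the table output. A rough sketch, assuming the default column layout shown above; external_ip and has_external_ip are hypothetical helper names.

```shell
# Read `kubectl get svc` table output on stdin and print the EXTERNAL-IP column
# of the first service row (the header row is skipped).
external_ip() {
  awk 'NR > 1 { print $4; exit }'
}

# Print the address and succeed only when the service has a real external IP
# rather than <pending> or <none>.
has_external_ip() {
  ip=$(external_ip)
  case "$ip" in
    ""|"<pending>"|"<none>") return 1 ;;
    *) echo "$ip" ;;
  esac
}

# Usage against a live cluster:
#   kubectl -n yelb get svc/yelb-ui | has_external_ip
```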

I'll keep the kube-vip load balancer since it'll come in handy, but I have no further use for yelb:

❯ kubectl delete ns yelb
namespace "yelb" deleted

Persistent Volume Claims, Storage Classes, and Storage Policies

At some point, I'm going to want to make sure that data from my Tanzu workloads sticks around persistently - and for that, I'll need to define some storage stuff.

First up, I'll add a new tag called tkg-storage-local to the nuchost-local vSphere datastore that I want to use for storing Tanzu volumes:

Tag (and corresponding category) applied

Then I create a new vSphere Storage Policy called tkg-storage-policy which states that data covered by the policy should be placed on the datastore(s) tagged with tkg-storage-local:

My Tanzu storage policy

So that's the vSphere side of things sorted; now to map that back to the Kubernetes side. For that, I'll need to define a Storage Class tied to the vSphere Storage Policy, so I drop these details into a new file called vsphere-sc.yaml:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsphere
provisioner: csi.vsphere.vmware.com
parameters:
  storagePolicyName: tkg-storage-policy

And then apply it with:

❯ kubectl apply -f vsphere-sc.yaml
storageclass.storage.k8s.io/vsphere created

I can test that I can create a Persistent Volume Claim against the new vsphere Storage Class by putting this in a new file called demo-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    name: vsphere-demo-1
  name: vsphere-demo-1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: vsphere
  resources:
    requests:
      storage: 5Gi

And applying it:

❯ kubectl apply -f demo-pvc.yaml
persistentvolumeclaim/vsphere-demo-1 created

I can see the new claim, and confirm that its status is Bound:

❯ kubectl get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
vsphere-demo-1   Bound    pvc-36cc7c01-a1b3-4c1c-ba0d-dff3fd47f93b   5Gi        RWO            vsphere        4m25s

And for bonus points, I can see that the container volume was created on the vSphere side:

Container Volume in vSphere

So that's storage sorted. I'll clean up my test volume before moving on:

❯ kubectl delete -f demo-pvc.yaml
persistentvolumeclaim "vsphere-demo-1" deleted

A real workload - phpIPAM

Demos are all well and good, but how about a real-world deployment to tie it all together? I've been using a phpIPAM instance for assigning static IP addresses for my vRealize Automation deployments, but have only been using it to monitor IP usage within the network ranges to which vRA will provision machines. I recently decided that I'd like to expand phpIPAM's scope so it can keep an eye on all the network ranges within the environment. That's not a big ask in my little self-contained homelab, but having a single system scanning all the ranges of a large production network probably wouldn't scale too well.

Fortunately the phpIPAM project provides a remote scanning agent which can be used for keeping an eye on networks and reporting back to the main phpIPAM server. With this, I could deploy an agent to each region (or multiple agents to a region!) and divide up the network into chunks that each agent would be responsible for scanning. But that's a pretty lightweight task for a single server to manage, and who wants to deal with configuring multiple instances of the same thing? Not this guy.

So I set to work exploring some containerization options, and I found phpipam-docker. That would easily replicate my existing setup in a trio of containers (one for the web front-end, one for the database back-end, and one with cron jobs to run scans at regular intervals)... but doesn't provide a remote scan capability. I also found a dockerized phpipam-agent, but this one didn't quite meet my needs. It did provide me a base to work off of though so a few days of tinkering resulted in me publishing my first Docker image. I've still some work to do before this application stack is fully ready for production but it's at a point where I think it's worth doing a test deploy.

To start, I'll create a new namespace to keep things tidy:

❯ kubectl create ns ipam
namespace/ipam created

I'm going to wind up with four pods:

  • phpipam-db for the database back-end
  • phpipam-www for the web front-end
  • phpipam-cron for the local cron jobs, which will be largely but not completely7 replaced by:
  • phpipam-agent for my remote scan agent

I'll use each container's original docker-compose configuration and adapt that into something I can deploy on Kubernetes.

phpipam-db

The phpIPAM database will live inside a MariaDB container. Here's the relevant bit from docker-compose:

services:
  phpipam-db:
    image: mariadb:latest
    ports:
      - "3306:3306"
    environment:
      - MYSQL_ROOT_PASSWORD=VMware1!VMware1!
    volumes:
      - phpipam-db-data:/var/lib/mysql

So it will need a Service exposing the container's port 3306 so that other pods can connect to the database. For my immediate demo, using type: ClusterIP will be sufficient since all the connections will be coming from within the cluster. When I do this for real, it will need to be type: LoadBalancer so that the agent running on a different cluster can connect. And it will need a PersistentVolumeClaim so it can store the database data at /var/lib/mysql. It will also get passed an environment variable to set the initial root password on the database instance (which will be used later during the phpIPAM install to create the initial phpipam database).

It might look like this on the Kubernetes side:

# phpipam-db.yaml
apiVersion: v1
kind: Service
metadata:
  name: phpipam-db
  labels:
    app: phpipam-db
  namespace: ipam
spec:
  type: ClusterIP
  ports:
  - name: mysql
    port: 3306
    protocol: TCP
    targetPort: 3306
  selector:
    app: phpipam-db
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    name: phpipam-db
  name: phpipam-db-pvc
  namespace: ipam
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: vsphere
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: phpipam-db
  namespace: ipam
spec:
  selector:
    matchLabels:
      app: phpipam-db
  replicas: 1
  template:
    metadata:
      labels:
        app: phpipam-db
    spec:
      containers:
      - name: phpipam-db
        image: mariadb:latest
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "VMware1!VMware1!"
        ports:
        - name: mysql
          containerPort: 3306
        volumeMounts:
        - name: phpipam-db-vol
          mountPath: /var/lib/mysql
      volumes:
      - name: phpipam-db-vol
        persistentVolumeClaim:
          claimName: phpipam-db-pvc

Moving on:

phpipam-www

This is the docker-compose excerpt for the web component:

services:
  phpipam-web:
    image: phpipam/phpipam-www:1.5x
    ports:
      - "80:80"
    environment:
      - TZ=UTC
      - IPAM_DATABASE_HOST=phpipam-db
      - IPAM_DATABASE_PASS=VMware1!
      - IPAM_DATABASE_WEBHOST=%
    volumes:
      - phpipam-logo:/phpipam/css/images/logo

Based on that, I can see that my phpipam-www pod will need a container running the phpipam/phpipam-www:1.5x image, a Service of type LoadBalancer to expose the web interface on port 80, a PersistentVolumeClaim mounted to /phpipam/css/images/logo, and some environment variables passed in to configure the thing. Note that the IPAM_DATABASE_PASS variable defines the password used for the phpipam user on the database (not the root user referenced earlier), and the IPAM_DATABASE_WEBHOST=% variable will define which hosts that phpipam database user will be able to connect from; setting it to % will make sure that my remote agent can connect to the database even if I don't know where the agent will be running.

Here's how I'd adapt that into a structure that Kubernetes will understand:

# phpipam-www.yaml
apiVersion: v1
kind: Service
metadata:
  name: phpipam-www
  labels:
    app: phpipam-www
  namespace: ipam
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: phpipam-www
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    name: phpipam-www
  name: phpipam-www-pvc
  namespace: ipam
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: vsphere
  resources:
    requests:
      storage: 100Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: phpipam-www
  namespace: ipam
spec:
  selector:
    matchLabels:
      app: phpipam-www
  replicas: 1
  template:
    metadata:
      labels:
        app: phpipam-www
    spec:
      containers:
      - name: phpipam-www
        image: phpipam/phpipam-www:1.5x
        env:
        - name: TZ
          value: "UTC"
        - name: IPAM_DATABASE_HOST
          value: "phpipam-db"
        - name: IPAM_DATABASE_PASS
          value: "VMware1!"
        - name: IPAM_DATABASE_WEBHOST
          value: "%"
        ports:
        - containerPort: 80
        volumeMounts:
        - name: phpipam-www-vol
          mountPath: /phpipam/css/images/logo
      volumes:
      - name: phpipam-www-vol
        persistentVolumeClaim:
          claimName: phpipam-www-pvc
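
The same database password string ends up in each of these manifests, so one option worth considering is to keep it in a Kubernetes Secret and reference it from the containers. This is only a sketch and not part of the deployment described here: the file name phpipam-secret.yaml, the Secret name phpipam-db-credentials, and the key database-pass are all made-up names for illustration.

```yaml
# phpipam-secret.yaml (hypothetical file, not part of the original deployment)
apiVersion: v1
kind: Secret
metadata:
  name: phpipam-db-credentials   # assumed name
  namespace: ipam
type: Opaque
stringData:
  # stringData accepts the plain-text value; Kubernetes stores it base64-encoded
  database-pass: "VMware1!"
---
# Fragment only: this would replace the literal IPAM_DATABASE_PASS entry
# under the container's "env:" list in the Deployment.
env:
- name: IPAM_DATABASE_PASS
  valueFrom:
    secretKeyRef:
      name: phpipam-db-credentials
      key: database-pass
```

The Secret would need to be applied before any Deployment that references it, and keeping it in its own file means the password is typed exactly once rather than copied into every manifest.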

phpipam-cron

This container has a pretty simple configuration in docker-compose:

services:
  phpipam-cron:
    image: phpipam/phpipam-cron:1.5x
    environment:
      - TZ=UTC
      - IPAM_DATABASE_HOST=phpipam-db
      - IPAM_DATABASE_PASS=VMware1!
      - SCAN_INTERVAL=1h

No exposed ports, no need for persistence - just a base image and a few variables to tell it how to connect to the database and how often to run the scans:

# phpipam-cron.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: phpipam-cron
  namespace: ipam
spec:
  selector:
    matchLabels:
      app: phpipam-cron
  replicas: 1
  template:
    metadata:
      labels:
        app: phpipam-cron
    spec:
      containers:
      - name: phpipam-cron
        image: phpipam/phpipam-cron:1.5x
        env:
        - name: IPAM_DATABASE_HOST
          value: "phpipam-db"
        - name: IPAM_DATABASE_PASS
          value: "VMware1!"
        - name: SCAN_INTERVAL
          value: "1h"
        - name: TZ
          value: "UTC"

phpipam-agent

And finally, my remote scan agent. Here's the docker-compose:

services:
  phpipam-agent:
    container_name: phpipam-agent
    restart: unless-stopped
    image: ghcr.io/jbowdre/phpipam-agent:latest
    environment:
      - IPAM_DATABASE_HOST=phpipam-db
      - IPAM_DATABASE_NAME=phpipam
      - IPAM_DATABASE_USER=phpipam
      - IPAM_DATABASE_PASS=VMware1!
      - IPAM_DATABASE_PORT=3306
      - IPAM_AGENT_KEY=
      - IPAM_SCAN_INTERVAL=5m
      - IPAM_RESET_AUTODISCOVER=true
      - IPAM_REMOVE_DHCP=true
      - TZ=UTC

It's got a few additional variables to make it extra-configurable, but still no need for persistence or network exposure. That IPAM_AGENT_KEY variable will need to be populated with the appropriate key generated within the new phpIPAM deployment, but we can deal with that later.

For now, here's how I'd tell Kubernetes about it:

# phpipam-agent.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: phpipam-agent
  namespace: ipam
spec:
  selector:
    matchLabels:
      app: phpipam-agent
  replicas: 1
  template:
    metadata:
      labels:
        app: phpipam-agent
    spec:
      containers:
      - name: phpipam-agent
        image: ghcr.io/jbowdre/phpipam-agent:latest
        env:
        - name: IPAM_DATABASE_HOST
          value: "phpipam-db"
        - name: IPAM_DATABASE_NAME
          value: "phpipam"
        - name: IPAM_DATABASE_USER
          value: "phpipam"
        - name: IPAM_DATABASE_PASS
          value: "VMware1!"
        - name: IPAM_DATABASE_PORT
          value: "3306"
        - name: IPAM_AGENT_KEY
          value: ""
        - name: IPAM_SCAN_INTERVAL
          value: "5m"
        - name: IPAM_RESET_AUTODISCOVER
          value: "true"
        - name: IPAM_REMOVE_DHCP
          value: "true"
        - name: TZ
          value: "UTC"
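
That's a long list of plain settings, and most of them aren't sensitive. As a rough sketch of an alternative (again, not what was deployed here), the non-secret values could move into a ConfigMap that the container loads wholesale with envFrom. The ConfigMap name phpipam-agent-config is an assumption made up for this example.

```yaml
# Hypothetical ConfigMap holding the agent's non-secret settings
apiVersion: v1
kind: ConfigMap
metadata:
  name: phpipam-agent-config   # assumed name
  namespace: ipam
data:
  # ConfigMap values must be strings, hence the quotes
  IPAM_DATABASE_HOST: "phpipam-db"
  IPAM_DATABASE_NAME: "phpipam"
  IPAM_DATABASE_USER: "phpipam"
  IPAM_DATABASE_PORT: "3306"
  IPAM_SCAN_INTERVAL: "5m"
  IPAM_RESET_AUTODISCOVER: "true"
  IPAM_REMOVE_DHCP: "true"
  TZ: "UTC"
---
# Fragment only: these keys would sit under the phpipam-agent container
# in the Deployment, replacing most of the individual "env:" entries.
envFrom:
- configMapRef:
    name: phpipam-agent-config
env:
# The password and agent key stay as explicit entries
- name: IPAM_DATABASE_PASS
  value: "VMware1!"
- name: IPAM_AGENT_KEY
  value: ""
```

One thing to keep in mind with this approach: editing the ConfigMap doesn't restart running pods on its own, so the Deployment would need to be restarted to pick up changed values.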

Deployment and configuration of phpIPAM

I can now go ahead and start deploying these containers, starting with the database one (upon which all the others rely):

❯ kubectl apply -f phpipam-db.yaml
service/phpipam-db created
persistentvolumeclaim/phpipam-db-pvc created
deployment.apps/phpipam-db created

And the web server:

❯ kubectl apply -f phpipam-www.yaml
service/phpipam-www created
persistentvolumeclaim/phpipam-www-pvc created
deployment.apps/phpipam-www created

And the cron runner:

❯ kubectl apply -f phpipam-cron.yaml
deployment.apps/phpipam-cron created

I'll hold off on the agent container for now since I'll need to adjust the configuration slightly after getting phpIPAM set up, but I will go ahead and check out my work so far:

❯ kubectl -n ipam get all
NAME                                READY   STATUS    RESTARTS   AGE
pod/phpipam-cron-6c994897c4-6rsnp   1/1     Running   0          4m30s
pod/phpipam-db-5f4c47d4b9-sb5bd     1/1     Running   0          16m
pod/phpipam-www-769c95c68d-94klg    1/1     Running   0          5m59s

NAME                  TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)        AGE
service/phpipam-db    ClusterIP      100.66.194.69    <none>         3306/TCP       16m
service/phpipam-www   LoadBalancer   100.65.232.238   192.168.1.64   80:31400/TCP   5m59s

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/phpipam-cron   1/1     1            1           4m30s
deployment.apps/phpipam-db     1/1     1            1           16m
deployment.apps/phpipam-www    1/1     1            1           5m59s

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/phpipam-cron-6c994897c4   1         1         1       4m30s
replicaset.apps/phpipam-db-5f4c47d4b9     1         1         1       16m
replicaset.apps/phpipam-www-769c95c68d    1         1         1       5m59s

And I can point my browser to the EXTERNAL-IP associated with the phpipam-www service to see the initial setup page:

phpIPAM installation page

I'll click the New phpipam installation option to proceed to the next step:

Database initialization options

I'm all for easy so I'll opt for Automatic database installation, which will prompt me for the credentials of an account with rights to create a new database within the MariaDB instance. I'll enter root and the password I used for the MYSQL_ROOT_PASSWORD variable above:

Automatic database install

I click the Install database button and I'm then met with a happy success message saying that the phpipam database was successfully created.

And that eventually gets me to the post-install screen, where I set an admin password and proceed to log in:

We made it to the post-install!

To create a new scan agent, I go to Menu > Administration > Server management > Scan agents.

Scan agents screen

And click the button to create a new one:

Creating a new agent

I'll copy the agent code and plug it into my phpipam-agent.yaml file:

        - name: IPAM_AGENT_KEY
          value: "4DC5GLo-F_35cy7BEPnGn7HivtjP_o-v"

And then deploy that:

❯ kubectl apply -f phpipam-agent.yaml
deployment.apps/phpipam-agent created

The scan agent isn't going to do anything until it's assigned to a subnet though, so now I head to Administration > IP related management > Sections. phpIPAM comes with a few default sections and ranges and such defined so I'll delete those and create a new one that I'll call Lab.

Section management

Now I can create a new subnet within the Lab section by clicking the Subnets menu, selecting the Lab section, and clicking + Add subnet.

Empty subnets menu

I'll define the new subnet as 192.168.1.0/24. Once I enable the option to Check hosts status, I'll then be able to specify my new remote-agent as the scanner for this subnet.

Creating a new subnet
A new (but empty) subnet

It shows the scanner associated with the subnet, but no data yet. I'll need to wait a few minutes for the first scan to kick off (at the five-minute interval I defined in the configuration).

Newly discovered IPs!

Woah, it actually works!

Conclusion

I still need to do more work to get the containerized phpIPAM stack ready for production, but I'm feeling pretty good for having deployed a functional demo of it at this point! And working on this was a nice excuse to get a bit more familiar with Tanzu Community Edition specifically, Kubernetes in general, and Docker (I learned a ton while assembling the phpipam-agent image!). I find I always learn more about a new-to-me technology when I have an actual project to do rather than just going through the motions of a lab exercise. Maybe my notes will be useful to you, too.


  1. Yo dawg, I heard you like containers... ↩︎

  2. Register here if you don't yet have an account. ↩︎

  3. Enabling dark mode is probably the most important part of this process. ↩︎

  4. If I didn't already have a key pair to use I would generate one with ssh-keygen -t rsa -b 4096 -C "email@example.com" and add it to my client with ssh-add ~/.ssh/id_rsa. ↩︎

  5. I'm not going to, but I totally could. ↩︎

  6. Mr. Anderson. ↩︎

  7. My phpipam-agent image won't (yet?) do the DNS lookups that phpipam-cron can. ↩︎
