Proxying Amazon PostgreSQL on OpenShift with PGPool
So you’ve been building a production application that uses a PostgreSQL database. During your development phase you have likely been using a local database on your computer or using a Docker instance or perhaps port-forwarding the database from your minishift or development OpenShift cluster. However, as you move into production deployment you aren’t deploying a PostgreSQL pod anymore but using a remote database, like Amazon Remote Database Service (RDS) for PostgreSQL. In a production environment best practices are that you have limited network access to the database to be only from within your virtual private cloud (VPC). Additionally, you may also incorporate an SSL certificate in order to further secure your connection to the database to avoid “rogue” connections.
Every now and again you can only debug customer issues with the specific data set. However with this closed off environment there are limited options:
- Increase the logging hoping to capture relevant data.
- Attempt to generate a valid data set via trial and error.
- Create a replica database or pg_dump the data for recreate.
The first two options take time and may not render much success. The third option can also take time, be a security risk, and be costly especially in a multi-tenant database environment. In a development debug scenario a developer would simply connect directly to the PostgreSQL pod using OpenShift’s port-forwarding capability and step through the application logic with either a debugger or examining the SQL statements with psql. Think of what a remarkable step forward it would be if you could interact with the remote database similarly to the development scenario but without giving up all the security restrictions you have put in place.
Proxying PostgreSQL
Pgpool-II can help make debugging production as described above possible. “Pgpool-II is a middleware that works between PostgreSQL servers and a PostgreSQL database client”. Pgpool-II provides several capabilities including: connection pooling, replication, load balancing, limiting exceeding connections, watchdog, and in-memory query cache. However, we are going to focus on its capability to act as a proxy to the production database. Next let’s see how we can deploy a Pgpool-II container into our OpenShift environment to afford this interaction flow.
Running Pgpool-II on OpenShift
Crunchy Data, which is an Enterprise PostgreSQL solutions company, provides several container based deployments of which Pgpool-II is included. The crunchy-containers repository examples provide deployments for Docker, Helm, and Kubernetes/OpenShift. We are going to walk-through the necessary steps to deploy to OpenShift assuming the PostgreSQL database hostname, username, password, and SSL certificate are stored in an OpenShift secret(s).
The repository above provides deployment steps and scripts that build on top of the Crunchy Data example deployment to OpenShift.
git clone git@github.com:chambridge/openshift-pgpool.gitFrom here you need to setup the environment that will be used when deploying crunchy-containers with the supplied example.env.sh that you can copy to env.sh and source, where you will at minimum need to update the following variables.
CCPROOT=/home/user/code/crunchy-containers # The base of the crunchy-containers clone GitHub repositoryCCP_NAMESPACE=myproject # Change this to the OpenShift project name
Next step is to login to your OpenShift cluster and project that contains the PostgreSQL hostname, username, password, and SSL certificate secret(s).
oc login
oc project myprojectNow you can create several configuration files needed for the deployment of pgpool using shell scripts contained in the repository.
# Create pgpool.conf file with create_pgpool_conf.sh script:./create_pool_passwd.sh SECRET_NAME HOSTNAME_KEY USERNAME_KEY PASSWORD_KEY
# Create pool_passwd file with create_pool_passwd.sh script:./create_pool_passwd.sh SECRET_NAME USERNAME_KEY PASSWORD_KEY
# Create server.pem file with create_server_pemfile.sh script:./create_server_pemfile.sh SECRET_NAME CACERT_KEY
All of these files will be stored within kube/pgpool/configs/ so they can be copied over into a cloned crunchy-containers repository.
cd ..
git clone https://github.com/CrunchyData/crunchy-containers.git
cd crunchy-containers/examples/Now you copy over the setup files created above and deploy pgpool:
cp -rf ../../openshift-pgpool/kube/ kube/
cd kube/pgpool/
./run_sslcacert.sh
Working with deployed pgpool
You can now connect to the deployed pgpool pod which will essentially proxy the PostgreSQL database. Use oc port-forward to connect to the database.
Follow these steps to interact with deployed proxy:
- Find the pod name for the deployed
pgpool.
oc get pods -l name=pgpool -o name 2. Use port-forwarding to connect to the remote database.
oc port-forward [pod] 5432:5432 >/dev/null 2>&1 &Now you can connect to remote database with either your local application or psql.
psql --password --host=127.0.0.1 --port=5432 --username=dbuser --dbname=mydbselect * from mytable;
Access to port-forwarding is limited to those users with edit or greater permissions to the OpenShift project and the pod can be scaled to zero and/or deployed to any project to after extracting the secret data to limit access.
Synopsis
In this story we highlighted a mechanism to access our network isolated remote production database. We learned about Pgpool-II and its capabilities including acting as a proxy for a PostgreSQL database. Next we walked through setup steps to create the appropriate Pgpool-II configuration to connect to the remote database additionally using an SSL certificate. Lastly we saw how to deploy the pgpool pod into OpenShift and use port-forwarding to obtain the same experienced for a developer against production as they have in local development.

