Step 7 - Configure proxies v5

Configure proxies

PGD can use proxies to direct traffic to one of the clusters nodes, selected automatically by the cluster. There are performance and availabilty reasons for using a proxy:

  • Performance: By directing all traffic and in particular write traffic, to one node, the node can resolve write conflicts locally and more efficiently.
  • Availability: When a node is taken down for maintenance or goes offline for other reasons, the proxy can automatically direct new traffic to a new, automatically selected, write leader.

It is best practice to configure PGD Proxy for clusters to enable this behavior.

Configure the cluster for proxies

To set up a proxy, you will need to first prepare the cluster and sub-group the proxies will be working with by:

  • Logging in and setting the enable_raft and enable_proxy_routing node group options to true for the sub-group. Use bdr.alter_node_group_option, passing the sub-group name, option name and new value as parameters.
  • Create as many uniquely named proxies as you plan to deploy using bdr.create_proxy and passing the new proxy name and the sub-group it should be attached to.
  • Create a pgdproxy user on the cluster with a password (or other authentication)

Configure each host as a proxy

Once the cluster is ready, you will need to configure each host to run pgd-proxy by:

  • Creating a pgdproxy local user
  • Creating a .pgpass file for that user which will allow it to log into the cluster as pgdproxy.
  • Modify the systemd service file for pgdproxy to use the pgdproxy user.
  • Create a proxy config file for the host which lists the connection strings for all the nodes in the sub-group, specifies the name that the proxy should use when connected and gives the endpoint connection string the proxy will accept connections on.
  • Install that file as /etc/edb/pgd-proxy/pgd-proxy-config.yml
  • Restart the systemd service and check its status.
  • Log into the proxy and verify its operation.

Further detail on all these steps is included in the worked example.

Worked example

Preparing for proxies

For proxies to function, the dc1 subgroup must enable Raft and routing.

Log into any node in the cluster, using psql to connect to the bdrdb database as the enterprisedb user, and execute:

SELECT bdr.alter_node_group_option('dc1', 'enable_raft', 'true');
SELECT bdr.alter_node_group_option('dc1', 'enable_proxy_routing', 'true');

The bdr.node_group_summary view can be used to check the status of options previously set with bdr.alter_node_group_option():

SELECT node_group_name, enable_proxy_routing, enable_raft
            FROM bdr.node_group_summary
           WHERE parent_group_name IS NOT NULL;
Output
 node_group_name | enable_proxy_routing | enable_raft
-----------------+----------------------+-------------
 dc1             | t                    | t
(1 row)

bdrdb=#

Next, create a PGD proxy within the cluster using the bdr.create_proxy function. This function takes two parameters, the proxy's unique name and the group it should be a proxy for.

In our example, we want a proxy on each host in the dc1 sub-group:

SELECT bdr.create_proxy('pgd-proxy-one','dc1');
SELECT bdr.create_proxy('pgd-proxy-two','dc1');
SELECT bdr.create_proxy('pgd-proxy-three','dc1');

The bdr.proxy_config_summary view can be used to check that the proxies were created:

SELECT proxy_name, node_group_name
             FROM bdr.proxy_config_summary;
Output
   proxy_name    | node_group_name
-----------------+-----------------
 pgd-proxy-one   | dc1
 pgd-proxy-two   | dc1
 pgd-proxy-three | dc1

 bdrdb=#

Create a pgdproxy user on the database

Create a user named pgdproxy and give it a password. In this example we will use proxysecret

On any node, log into the bdrdb database as enterprisedb/postgres.

CREATE USER pgdproxy PASSWORD 'proxysecret';
GRANT bdr_superuser TO pgdproxy;

Create a pgdproxy user on each host

sudo adduser pgdproxy

This user will need credentials to connect to the server. We will create a .pgpass file with the proxysecret password in it. Then we will lock down the .pgpass file so it is only accessible by its owner.

echo -e "*:*:*:pgdproxy:proxysecret" | sudo tee /home/pgdproxy/.pgpass
sudo chown pgdproxy /home/pgdproxy/.pgpass
sudo chmod 0600 /home/pgdproxy/.pgpass

Configure the systemd service on each host

Switch the service file from using root to using the pgdproxy user

sudo sed -i s/root/pgdproxy/ /usr/lib/systemd/system/pgd-proxy.service

Reload the systemd daemon.

sudo systemctl daemon-reload

Create a proxy config file for each host

The proxy configuration file will be slightly different for each host. It is a YAML file which contains a cluster object. This in turn has three properties:

The name of the PGD cluster's top-level group (as name). An array of endpoints of databases (as endpoints). The proxy definition object with a name and endpoint (as proxy).

The first two properties will be the same for all hosts:

cluster:
  name: pgd
  endpoints:
    - host=host-one dbname=bdrdb port=5444
    - host=host-two dbname=bdrdb port=5444
    - host=host-three dbname=bdrdb port=5444

Remember that host-one, host-two and host-three are the systems on which the cluster nodes (node-one, node-two, node-three) are running. We use the name of the host, not the node, for the endpoint connection.

Also note that the endpoints in this example specify port=5444. This is necessary for EDB Postgres Advanced Server instances. For EDB Postgres Extended and Community PostgreSQL, this can be omitted.

The third property, proxy, has a name property and an endpoint property. The name property should be a name created with bdr.create_proxy earlier, and it will be different on each host. The endpoint property is a string which defines how the proxy presents itself as a connection string. A proxy cannot be on the same port as the Postgres server and, ideally, should be on a commonly used port different from direct connections, even when no Postgres server is running on the host. We typically use port 6432 for PGD proxies.

  proxy:
    name: pgd-proxy-one
    endpoint: "host=localhost dbname=bdrdb port=6432"

In this case, by using 'localhost' in the endpoint, we specify that this proxy will listen on the host where the proxy is running.

Install a PGD proxy configuration on each host

For each host, create the /etc/edb/pgd-proxy directory:

sudo mkdir -p /etc/edb/pgd-proxy

Then on each host, write the appropriate configuration to the pgd-proxy-config.yml file in the /etc/edb/pgd-proxy directory.

For our example, this could be run on host-one to create the file.

cat <<EOF | sudo tee /etc/edb/pgd-proxy/pgd-proxy-config.yml
cluster:
  name: pgd
  endpoints:
    - host=host-one dbname=bdrdb port=5444
    - host=host-two dbname=bdrdb port=5444
    - host=host-three dbname=bdrdb port=5444
  proxy:
    name: pgd-proxy-one
    endpoint: "host=localhost dbname=bdrdb port=6432"
EOF

Restart the service

On each host where the proxy is being installed, restart the pgd-proxy service.

sudo systemctl restart pgd-proxy

Confirm it is running correctly with:

sudo systemctl status pgd-proxy

It should show Active: (running) in the opening details.

Test the proxy

At this point, connecting to the PGD Proxy port on any host in the cluster will result in the connection being routed to the current write lead node.

For example, and assuming you've installed the proxy on all three hosts, then connecting to the proxy on host-three should result in the connection being routed to node-one:

Here we pass connection details, using the -d flag of psql with the hostname for the proxy we just configured and the proxy port number:

sudo -iu enterprisedb psql -d "host=host-three dbname=bdrdb port=6432"
Output
psql (16.1.0, server 16.1.0)
Type "help" for help.

bdrdb=#

Once connected, we can query the server to find out which node the proxy has connected us to.

SELECT node_name FROM bdr.local_node_summary;
Output
 node_name
-----------
 node-one
(1 row)
bdrdb#

We should have connected to the current write leader of the sub-group. You can confirm that by we can querying which node is the write leader for the sub-group we are connected to:

SELECT node_group_name, write_lead FROM bdr.node_group_routing_summary;
Output
 node_group_name | write_lead
-----------------+------------
 dc1             | node-one
(1 row)

bdrdb=#

And the write_lead is node-one too, so we confirm we are being proxy connected to the write leader.