Commit scopes v5

Commit scopes give applications granular control about durability and consistency of EDB Postgres Distributed.

A commit scope is a set of rules that describes the behavior of the system as transactions are committed. The actual behavior depends on which a kind of commit scope a commit scope's rule uses Group Commit, Commit At Most Once, Lag Control, Synchronous Commit or combination of these.

While most commit scope kinds control the processing of the transaction, Lag Control is the exception as it dynamically regulates the performance of the system in response to replication operations being slow or queued up. It is typically used, though, in combination with other commit scope kinds

Commit scope structure

Every commit scope has a name (a commit_scope_name).

Each commit scope has one or more rules.

Each rule within the commit scope has an origin_node_group which together uniquely identify the commit scope rule.

The origin_node_group is a PGD group and it defines the nodes which will apply this rule when they are the originators of a transaction.

Finally there is the rule which defines what kind of commit scope or combination of commit scope kinds should be applied to those transactions.

So if a commit scope has a rule that reads:

origin_node_group := 'example_bdr_group',
rule := 'MAJORITY (example_bdr_group) GROUP COMMIT',

Then, the rule is applied when any node in the example_bdr_group issues a transaction.

The rule itself specifies how many nodes of a specified group will need to confirm the change - MAJORITY (example_bdr_group) - followed by the commit scope kind itself - GROUP COMMIT. This translates to requiring that any two nodes in example_bdr_group must confirm the change before the change can be considered as comitted.

How a commit scope is selected

When any change takes place, PGD looks up which commit scope is should be used for the transaction or node.

If a transaction specifies a commit scope, that scope will be used.

If not specified, the system will search for a default commit scope. Default commit scopes are a group level setting. The system consults the group tree. Starting at the bottom of the group tree with the node's group and working up, it searches for any group which has a default_commit_scope setting defined. This commit scope will then be used.

If no default_commit_scope is found then the node's GUC, bdr.commit_scope is used. And if that isn't set or is set to local then no commit scope applies and PGD's async replication is used.

A commit scope will not be used if it is not local and the node where the commit is being run on is not directly or indirectly related to the origin_node_group.

Creating a Commit Scope

Use bdr.add_commit_scope to add our example rule to a commit scope. For example:

SELECT bdr.add_commit_scope(
    commit_scope_name := 'example_scope',
    origin_node_group := 'example_bdr_group',
    rule := 'MAJORITY (example_bdr_group) GROUP COMMIT',
    wait_for_ready := true
);

This will add the rule MAJORITY (example_bdr_group) GROUP COMMIT for any transaction originating from the example_bdr_group to a scope called example_scope.

If no rules previously existed in example_scope, then adding this rule would make the scope exist.

When a rule is added, the origin_node_group must already exist. If it does not, the whole add operation will be discarded with an error.

The rule will then be evaluated. If the rule mentions groups that don't exist or the settings on the group are incompatible with other configuration setting on the group's nodes, a warning will be emitted, but the rule will be added.

Once the rule is added, the commit scope will be available for use.

The wait_for_ready controls whether the bdr.add_commit_scope() call blocks until the rule has been added to the relevant nodes. The setting defaults to true and can be omitted.

Using a commit scope

To use our example scope, we can set bdr.commit_scope within a transaction

BEGIN;
SET LOCAL bdr.commit_scope = 'example_scope';
...
COMMIT;

You must set the commit scope before the transaction writes any data.

A commit scope may be set as a default for a group or sub_group using bdr.alter_node_group_option

SELECT bdr.alter_node_group_option(
  node_group_name := 'example_bdr_group',
  config_key := 'default_commit_scope',
  config_value := 'example_scope'
);

To completely clear the default for a group of sub-group, set the default_commit_scope value to local.

SELECT bdr.alter_node_group_option(
  node_group_name := 'example_bdr_group',
  config_key := 'default_commit_scope',
  config_value := 'local'
);

You can also do make this change using PGD cli:

pgd set-group-options example-bdr-group --option default_commit_scope=example_scope

And you can clear the default using PGD cli by setting the value to local:

pgd set-group-options example-bdr-group --option default_commit_scope=local

Finally, you can set the default commit_scope for a node using:

SET bdr.commit_scope = 'example_scope';

Set bdr.commit_scope to local to use the PGD default async replication.

Origin groups

Rules for commit scopes can depend on the node the transaction is committed on, that is, the node that acts as the origin for the transaction. To make this transparent for the application, PGD allows a commit scope to define different rules depending on where the transaction originates from.

For example, consider an EDB Postgres Distributed cluster with nodes spread across two data centers: a left and a right one. Assume the top-level PGD node group is called top_group. You can use the following commands to set up subgroups and create a commit scope requiring all nodes in the local data center to confirm the transaction but only one node from the remote one:

-- create sub-groups
SELECT bdr.create_node_group(
    node_group_name := 'left_dc',
    parent_group_name := 'top_group',
    join_node_group := false
);
SELECT bdr.create_node_group(
    node_group_name := 'right_dc',
    parent_group_name := 'top_group',
    join_node_group := false
);

-- create a commit scope with individual rules
-- for each sub-group
SELECT bdr.add_commit_scope(
    commit_scope_name := 'example_scope',
    origin_node_group := 'left_dc',
    rule := 'ALL (left_dc) GROUP COMMIT (commit_decision=raft) AND ANY 1 (right_dc) GROUP COMMIT',
    wait_for_ready := true
);
SELECT bdr.add_commit_scope(
    commit_scope_name := 'example_scope',
    origin_node_group := 'right_dc',
    rule := 'ANY 1 (left_dc) GROUP COMMIT AND ALL (right_dc) GROUP COMMIT (commit_decision=raft)',
    wait_for_ready := true
);

Now, using the example_scope on any node that's part of left_dc uses the first scope. Using the same scope on a node that's part of right_dc uses the second scope. By combining the left_dc and right_dc origin rules under one commit scope name, an application can simply use example_scope on either data center and get the appropriate behavior for that data center.

Each group can also have a default commit scope specified using the bdr.alter_node_group_option admin interface.

Making the above scopes the default ones for all transactions originating on nodes in those groups looks like this:

SELECT bdr.alter_node_group_option(
  node_group_name := 'left_dc',
  config_key := 'default_commit_scope',
  config_value := 'example_scope'
);
SELECT bdr.alter_node_group_option(
  node_group_name := 'right_dc',
  config_key := 'default_commit_scope',
  config_value := 'example_scope'
);