erp5: Introduce mariadb replication at SlapOS level (!1679) · Merge requests · nexedi / slapos

Merged Xavier Thompson requested to merge xavier_thompson/slapos:feat/mariadb-replication into master Nov 05, 2024

EDIT: I rewrote the description to focus on the key points because the previous description had gotten way too long and technical. Everything is described in the commit messages. I invite you to read them in order for a detailed understanding.

Motivation

Despite its name and the strong focus on mariadb replication, the overall concern of this MR is the wider question of ERP5 resiliency: not resiliency of ERP5 inside Theia, but "native" resiliency of ERP5. This involves replicating ERP5's object database (ZODB) and SQL database (index catalog, activities, ...). ZODB replication is already well implemented using Neo, so this MR focuses mostly on mariadb replication, though it does bring some improvements to Neo.

Before this MR, some cutting-edge ERP5 projects already used Neo + mariadb replication to ensure ERP5 resiliency, but this was mostly set up and maintained manually outside of SlapOS. The goal is to mainstream this technique by automating it inside SlapOS. Ultimately, ERP5 replication inside Theia should be replaced by "native" ERP5 replication everywhere.

This MR does not aim to complete this transition in a single step. Instead, it makes a significant step in this direction and sketches an outline of the next steps.

Overview of tasks and future todos

Some of these will not be implemented in this current MR

Footnotes

£: https://lab.nexedi.com/nexedi/slapos/-/merge_requests/1792 proposes a much more advanced way to generate and store mariabackups, using frequent incremental mariabackups combined with infrequent full mariabackups, and storing them with restic. This makes for faster and smaller backups. Restic stores the backups as content-defined chunks, so the backups are not available as a single file without asking restic to reconstitute it. Using restic therefore implies serving the bootstrap backups with something like rest server, which will reconstitute and serve the backup files on demand. UPDATE: The full + incremental mariabackups feature has now been included here without restic.

££: Replication works by fetching mariadb binlogs. Binlogs are retained on the primary only for a few days (by default). So if, when a replica is created, the primary is older than the binlog retention time, the replica must first restore a recent backup of the primary to bootstrap replication.
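
The bootstrap rule in this footnote can be written as a small decision function. This is only a minimal sketch, not code from this MR: `needs_bootstrap` and its arguments are hypothetical names, and the retention period has to be read from the primary's actual configuration.

```python
from datetime import datetime, timedelta


def needs_bootstrap(primary_created, binlog_retention, now=None):
    """Return True if a new replica must first restore a backup.

    Hypothetical helper: when the primary has existed for longer than its
    binlog retention period, its earliest binlogs may already be purged,
    so replaying binlogs alone cannot rebuild the full dataset.
    """
    now = now or datetime.now()
    return now - primary_created > binlog_retention


# Example: a primary created 9 days ago with 3 days of binlog retention.
if __name__ == '__main__':
    created = datetime.now() - timedelta(days=9)
    print(needs_bootstrap(created, timedelta(days=3)))
```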

£££: To request a mariadb replica — either standalone or as a sub-instance of ERP5 (§):

   'replication': {
     'upstream-mariadb-url': 'mysql://<user>:<password>@<ip>:<port>',
     'upstream-mariabackup-url': 'http(s)://<recent-mariabackup-of-primary>',
   }

or, to bootstrap from an SQL dump instead of a mariabackup:

   'replication': {
     'upstream-mariadb-url': 'mysql://<user>:<password>@<ip>:<port>',
     'upstream-bootstrap-url': 'http(s)://<recent-sqldump-backup-of-primary>',
   }

This takes effect on mariadb database creation - when no data exists yet. That way existing data cannot be deleted by setting or changing the replication parameters after the fact.

A promise checks that the state of the running mariadb matches the requested state (replica/primary, replication source). If they differ, the mariadb database will not converge automatically once the ~/srv/mariadb directory exists; human intervention is required.

The bootstrap-url or mariabackup-url may be omitted: this skips replication bootstrap and requires that all binlogs still be available on the primary. This is useful when the primary is recent and may not have a backup ready for bootstrap yet.

The primary mariadb publishes the needed parameters under replication-primary-url, replication-bootstrap-url, and replication-mariabackup-url. They can then be plugged directly into the replica request.
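
As an illustration of plugging the published parameters into the replica request, here is a minimal sketch. The helper `build_replica_request` is hypothetical, and the correspondence between published names and request names is inferred from the parameter names above rather than taken from the implementation.

```python
# Assumed mapping: bootstrap kind -> (key published by the primary,
# key expected in the replica request). Inferred from the parameter names.
BOOTSTRAP_KEYS = {
    'mariabackup': ('replication-mariabackup-url', 'upstream-mariabackup-url'),
    'sqldump': ('replication-bootstrap-url', 'upstream-bootstrap-url'),
}


def build_replica_request(published, bootstrap='mariabackup'):
    """Build the replica request parameters from the primary's published ones.

    bootstrap: 'mariabackup', 'sqldump', or None to skip replication
    bootstrap (all binlogs must then still be available on the primary).
    """
    replication = {'upstream-mariadb-url': published['replication-primary-url']}
    if bootstrap is not None:
        published_key, request_key = BOOTSTRAP_KEYS[bootstrap]
        replication[request_key] = published[published_key]
    return {'replication': replication}
```

Passing `bootstrap=None` corresponds to the case described above where the bootstrap URL is omitted because the primary is recent.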

££££: If the replica is accessed over TLS IPv6, the caucased-url of the primary, from which the replica will request a certificate, must be passed as well:

   'replication': {
     'upstream-mariadb-url': 'mysql://<user>:<password>@<ipv6>:<port>',
     'upstream-mariabackup-url': 'http(s)://<recent-mariabackup-of-primary>',
     'upstream-caucased-url': 'http://[<ipv6>]:<port>',
   }

The replica will then publish a CSR under caucased-csr-to-sign; the ERP5 root instance (if there is one) will republish it (§§). To have the primary's caucased sign it, the CSR can be passed back to the primary:

   'caucased': {
     'csr-to-sign': '<PEM-content>',
   }
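
The CSR round trip can be sketched as follows. `build_csr_signing_request` is a hypothetical helper; it only assumes that the replica's published parameters are available as a dict keyed by the names given above.

```python
def build_csr_signing_request(replica_published):
    """Build the primary's 'caucased' parameter from the replica's published CSR.

    Hypothetical helper: takes the PEM content published by the replica
    under 'caucased-csr-to-sign' and places it where the primary expects it.
    """
    csr_pem = replica_published['caucased-csr-to-sign']
    return {'caucased': {'csr-to-sign': csr_pem}}
```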

£££££: For many ERP5 use cases to work correctly (accurate stock evaluation, activities, ...), the ZODB (neo) and the index catalog (mariadb) must be coherent with each other. This coherence is maintained by the zope processes and the activity queue. At the time a takeover is needed, the replica mariadb and replica neo will most likely not be coherent with each other. One way to restore coherence is to regenerate the mariadb catalog from scratch by re-indexing the whole ZODB; this is a very lengthy process that can take days or weeks, which makes it unsuitable in practice. Our practical "state-of-the-art" solution is to truncate the neo to its state a few minutes back in time: enough minutes to be certain that all the ZODB objects created or modified before that truncation point are correctly indexed in the non-truncated mariadb. Then it is only a matter of examining the indexations in mariadb that occurred in the interval between the truncation time and the most recent state of mariadb to determine which remain valid. This is done by ERP5Site_resynchroniseCatalogSince. Given that only a few minutes need to be examined, this process is very fast. Thus this technique trades the loss of the last few minutes of data for the ability to be up and running again shortly after the takeover.
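
The interval arithmetic behind this technique can be sketched as below. This is an illustration only: `takeover_window` is a hypothetical name, the 10 minute margin is an arbitrary example value, and taking mariadb's most recent state as the reference point is an assumption. The actual resynchronisation is done by ERP5Site_resynchroniseCatalogSince, which is not called here.

```python
from datetime import datetime, timedelta


def takeover_window(mariadb_latest, margin=timedelta(minutes=10)):
    """Return (truncation_point, mariadb_latest).

    truncation_point is the time to which neo would be truncated; the
    returned pair is the interval of mariadb indexations to re-examine.
    """
    if margin <= timedelta(0):
        raise ValueError('margin must be positive')
    truncation_point = mariadb_latest - margin
    return truncation_point, mariadb_latest


if __name__ == '__main__':
    start, end = takeover_window(datetime(2025, 1, 1, 12, 0))
    print('truncate neo to', start, 'and re-examine the catalog up to', end)
```

A larger margin makes it more certain that everything before the truncation point is indexed, at the cost of discarding more recent data.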

§: To request an ERP5 with a mariadb replica sub-instance, the same parameters can be forwarded from the ERP5 root instance to mariadb by wrapping them in a 'mariadb' dict:

   'mariadb': {
      'replication': { '...' },
      'caucased': { '...' }
   }

§§: The ERP5 root instance (when mariadb is not standalone) will republish the needed parameters by prefixing them with 'mariadb-', e.g. mariadb-replication-primary-url, mariadb-caucased-url, mariadb-caucased-csr-to-sign.
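
The wrapping (§) and prefixing (§§) conventions can be handled with two small helpers. Both function names are hypothetical; the sketch only relies on the 'mariadb' wrapper key and the 'mariadb-' prefix described above.

```python
MARIADB_PREFIX = 'mariadb-'


def wrap_for_erp5(mariadb_parameters):
    """Wrap standalone mariadb request parameters for an ERP5 root instance (§)."""
    return {'mariadb': dict(mariadb_parameters)}


def unprefix_published(erp5_published):
    """Recover mariadb's published parameters from the ERP5 root instance (§§).

    Keeps only the keys starting with 'mariadb-' and strips that prefix.
    """
    return {
        key[len(MARIADB_PREFIX):]: value
        for key, value in erp5_published.items()
        if key.startswith(MARIADB_PREFIX)
    }
```

For example, `unprefix_published` turns 'mariadb-caucased-csr-to-sign' back into 'caucased-csr-to-sign', so the same code can consume parameters from either a standalone mariadb or an ERP5 root instance.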

Edited Sep 03, 2025 by Xavier Thompson