nexedi / slapos / Merge requests / !1605

WIP: Fix/rapid cdn promise relax

Closed Łukasz Nowak requested to merge luke/slapos:fix/rapid-cdn-promise-relax into master Jul 02, 2024

Blocker: It is a real problem that the promises for slave instance preparation are failing, as they indicate that partitions need to be reprocessed until everything is correctly set up. Tests on this branch are failing, which simply exposes the real problem.

Attention: Do not simply silence the promises, as that will lead to problems. One has to rethink how to react to the promise state, and when a failing promise shall be reported as a problem. Working on silencing tickets on master is a NOGO.

Outcome: The promises promise-key-download-url-ready.py and publish-failsafe-error.py shall have a grace period, so that on a real cluster they do not react too fast. Distributing the information about a slave requires a lot of processing on each partition, and with a high number of slaves this can take quite some time (up to 2 hours). The idea is that such a promise shall be allowed to fail up to 5 times before an anomaly is detected, lowering the number of tickets generated on live clusters after adding a slave.
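The grace period idea can be illustrated with a minimal, standalone sketch. It does not use the slapos promise API: the names `consecutive_failures`, `is_anomaly` and `allowed_failures` are hypothetical, and the sketch assumes that "allowed to fail up to 5 times" means five consecutive failures are tolerated and the next one is reported. How this maps onto the real `failure_amount` setting is not confirmed here.

```python
from typing import Sequence


def consecutive_failures(history: Sequence[bool]) -> int:
    """Count the failed runs at the end of the history.

    Each entry is True when that promise run passed, oldest run first.
    """
    count = 0
    for passed in reversed(history):
        if passed:
            break
        count += 1
    return count


def is_anomaly(history: Sequence[bool], allowed_failures: int = 5) -> bool:
    """Report an anomaly only once the grace period is exhausted.

    Assumption: `allowed_failures` trailing failures are tolerated; one more
    failure in a row is treated as an anomaly.
    """
    return consecutive_failures(history) > allowed_failures
```

With the default of five, five failed runs in a row are tolerated and the sixth is reported as an anomaly; a single successful run resets the count.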

Tasks:

  • https://lab.nexedi.com/nexedi/slapos.toolbox/-/merge_requests/133
  • https://lab.nexedi.com/nexedi/slapos.toolbox/-/merge_requests/134
  • configure check_file_state with proper TestLess, AnomalyResult and TestResult
  • configure proper grace period (failure_amount)
    • assert that the grace period really works depending on the promise configuration; if needed, improve the promise code
Edited Feb 13, 2025 by Łukasz Nowak
Source branch: fix/rapid-cdn-promise-relax