nexedi / wendelin.core / Merge requests / !33

Fix false alarm about faulty client whereas client just restarted at pin time

Open Levin Zimmermann requested to merge levin.zimmermann/wendelin.core:fix-kill-dead-client into master Nov 04, 2024

Good day, Kirill,

This MR addresses a minor issue that we observed on our recently deployed production instance. According to the WCFS statistics and the WCFS log, zopes were sometimes killed by WCFS. However, we could not find any traces of SIGKILL in the zope logs. What we did see is that whenever WCFS reported killing zopes, it was shortly after those zopes had restarted.

I therefore assume that WCFS attempted to kill zopes that had restarted just after receiving a pin request. In other words, these clients no longer responded because they were already dead.

To check whether this assumption is true, I added the tests included in this MR, which simulate clients that exit at pin time. I also added a proposed fix for the issue.
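To illustrate the distinction at stake, here is a minimal sketch of how one might tell an already-exited client apart from a live but unresponsive one. It is not the fix proposed in this MR and not WCFS code (WCFS itself is written in Go); the helper names `client_is_alive` and `classify_unresponsive_client` are hypothetical, and the sketch assumes a POSIX system where the client's pid is known.

```python
import os


def client_is_alive(pid: int) -> bool:
    """Report whether a process with the given pid currently exists (POSIX only).

    Note: a process that has exited but has not yet been reaped by its
    parent (a zombie) still counts as existing for this check.
    """
    try:
        # Signal 0 delivers nothing; it only performs the existence
        # and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process: the client has already exited.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def classify_unresponsive_client(pid: int) -> str:
    """Hypothetical helper: classify a client that did not answer a pin request."""
    # Only a client that is still running can be considered faulty;
    # one that is already gone simply exited at pin time.
    return "faulty" if client_is_alive(pid) else "exited"
```

With such a check, a client that is already gone when the pin request times out would be reported as exited rather than counted as a faulty client that had to be killed.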

From my current understanding, this issue looks relatively harmless (and is therefore less important than the deadlock issue). However, if it is not fixed, it could mask real problems with clients, so it is still worth finding a solution sooner or later.

Best, Levin
