vip-manager RAM usage steadily increases till OOM killed by system #285

@thelinuxracoon

I recently upgraded vip-manager from 2.3.0 (from the Ubuntu repos) to the newest version, 2.8.0, from this GitHub repo. The issue began after installing the new version.

I wanted the newest version so that I can use Patroni as the vip-manager endpoint. This ensures that vip-manager keeps functioning when Patroni is in failsafe_mode and/or etcd is unreachable.

OS: Ubuntu 24.04 LTS
vip-manager: version: 2.8.0, commit: 8aef662
patroni: 3.2.2

Screenshot of the system RAM usage:
[Image: vip-manager_ram_usage]

Logs when vip-manager gets OOM killed:

Dec 07 21:33:13 dbserver vip-manager[325325]: 2024/12/07 21:33:13 IP address 192.168.110.30/24 state is false, desired false
Dec 07 21:33:24 dbserver vip-manager[325325]: 2024/12/07 21:33:24 IP address 192.168.110.30/24 state is false, desired false
Dec 07 21:33:31 dbserver systemd[1]: vip-manager.service: A process of this unit has been killed by the OOM killer.
Dec 07 21:33:33 dbserver systemd[1]: vip-manager.service: Main process exited, code=killed, status=9/KILL
Dec 07 21:33:33 dbserver systemd[1]: vip-manager.service: Failed with result 'oom-kill'.
Dec 07 21:33:33 dbserver systemd[1]: vip-manager.service: Consumed 1h 34min 30.819s CPU time, 9.8G memory peak, 0B memory swap peak.
Dec 07 21:33:33 dbserver systemd[1]: vip-manager.service: Scheduled restart job, restart counter is at 1.
Dec 07 21:33:33 dbserver systemd[1]: Started vip-manager.service - Manages Virtual IP for Patroni.
Dec 07 21:33:33 dbserver vip-manager[3723453]: 2024/12/07 21:33:33 Using config from file: /etc/vip-manager/vip-manager.yml
Dec 07 21:33:33 dbserver vip-manager[3723453]: 2024/12/07 21:33:33 No dcs-endpoints specified, trying to use localhost with standard ports!
Dec 07 21:33:33 dbserver vip-manager[3723453]: 2024/12/07 21:33:33 This is the config that will be used:
Dec 07 21:33:33 dbserver vip-manager[3723453]:         config : /etc/vip-manager/vip-manager.yml
Dec 07 21:33:33 dbserver vip-manager[3723453]:         dcs-endpoints : [http://127.0.0.1:8008/]
Dec 07 21:33:33 dbserver vip-manager[3723453]:         dcs-type : patroni
Dec 07 21:33:33 dbserver vip-manager[3723453]:         hostingtype : basic
Dec 07 21:33:33 dbserver vip-manager[3723453]:         interface : ens192
Dec 07 21:33:33 dbserver vip-manager[3723453]:         interval : 1000
Dec 07 21:33:33 dbserver vip-manager[3723453]:         ip : 192.168.110.30
Dec 07 21:33:33 dbserver vip-manager[3723453]:         manager-type : basic
Dec 07 21:33:33 dbserver vip-manager[3723453]:         netmask : 24
Dec 07 21:33:33 dbserver vip-manager[3723453]:         retry-after : 250
Dec 07 21:33:33 dbserver vip-manager[3723453]:         retry-num : 3
Dec 07 21:33:33 dbserver vip-manager[3723453]:         trigger-key : /leader
Dec 07 21:33:33 dbserver vip-manager[3723453]:         trigger-value : 200
Dec 07 21:33:33 dbserver vip-manager[3723453]:         verbose : false
Dec 07 21:33:33 dbserver vip-manager[3723453]:         version : false
Dec 07 21:33:33 dbserver vip-manager[3723453]: 2024/12/07 21:33:33 IP address 192.168.110.30/24 state is false, desired false
Dec 07 21:33:43 dbserver vip-manager[3723453]: 2024/12/07 21:33:43 IP address 192.168.110.30/24 state is false, desired false

These logs are from the standby server, so the "state is false, desired false" message from vip-manager is correct. The issue also occurs on the leader server.

Enabling debug and verbose options does not give additional logs.

My vip-manager Config:

ip: 192.168.110.30
netmask: 24
interface: ens192
trigger-key: "/leader"
trigger-value: "200"
dcs-type: patroni
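
For readers unfamiliar with this mode, here is a minimal Go sketch of what a check like this amounts to: poll the Patroni REST API at the trigger-key path and compare the HTTP status code with trigger-value. This is not vip-manager's actual code; the function names `desiredState` and `checkLeader` are hypothetical, and the endpoint is taken from the startup log above. The sketch also closes and drains the response body on every poll, since skipping that is a common cause of steadily growing memory in Go HTTP clients (whether that is what happens here is unconfirmed).

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// desiredState reports whether the VIP should be held, given the HTTP status
// returned for the trigger-key path and the configured trigger-value.
// Hypothetical helper, not part of vip-manager.
func desiredState(statusCode int, triggerValue string) bool {
	return strconv.Itoa(statusCode) == triggerValue
}

// checkLeader performs one poll against endpoint+triggerKey.
// Hypothetical helper, not part of vip-manager.
func checkLeader(client *http.Client, endpoint, triggerKey, triggerValue string) (bool, error) {
	resp, err := client.Get(strings.TrimRight(endpoint, "/") + triggerKey)
	if err != nil {
		return false, err
	}
	// Always close the body; an unclosed body keeps the connection and its
	// buffers alive, which adds up when polling once per second.
	defer resp.Body.Close()
	// Drain the body so the underlying connection can be reused.
	_, _ = io.Copy(io.Discard, resp.Body)
	return desiredState(resp.StatusCode, triggerValue), nil
}

func main() {
	// A single shared client with a timeout, reused across polls.
	client := &http.Client{Timeout: time.Second}
	ok, err := checkLeader(client, "http://127.0.0.1:8008/", "/leader", "200")
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Println("desired state:", ok)
}
```

On a standby, Patroni answers /leader with a non-200 status, so the sketch would report a desired state of false, matching the log lines above.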

My Patroni API Config:

restapi:
  listen: 0.0.0.0:8008
  connect_address: 192.168.110.32:8008

I start vip-manager via systemd. The unit file looks like this:

# /usr/lib/systemd/system/vip-manager.service
# This is an example of a systemD config file for vip-manager.
# You can copy it to "/etc/systemd/system/vip-manager.service", adjust as necessary and then call
# systemctl daemon-reload && systemctl start vip-manager && systemctl enable vip-manager
# to start and also enable auto-start after reboot.

[Unit]
Description=Manages Virtual IP for Patroni
After=network-online.target
Before=patroni.service

[Service]
Type=simple

ExecStart=/usr/bin/vip-manager --config=/etc/default/vip-manager.yml

Restart=on-failure

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/vip-manager.service.d/override.conf
[Service]
EnvironmentFile=
ExecStart=
ExecStart=/usr/bin/vip-manager --config="/etc/vip-manager/vip-manager.yml"

I have overridden ExecStart so that I can use a custom config file. That way, if the package from the Ubuntu repos is ever installed again, it won't overwrite the custom parameters in the unit file.

The overall behavior of vip-manager is correct, and it works without issues in failover situations. The only problem is that vip-manager fills up the system RAM, gets OOM-killed, and then repeats the cycle. This causes downtime for the whole server roughly every 2 days (depending on how much RAM you have).
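
To put numbers on the growth between restarts, one option is to sample the resident set size of the vip-manager process from /proc. The following is a small Go sketch under the assumption of a Linux host; `parseVmRSS` is a hypothetical helper, and the PID is passed on the command line (it defaults to the sampler's own process).

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseVmRSS extracts the resident set size in kB from the content of
// /proc/<pid>/status. Hypothetical helper for illustration.
func parseVmRSS(status string) (int64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		// The relevant line looks like: "VmRSS:     10240 kB"
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			return strconv.ParseInt(fields[1], 10, 64)
		}
	}
	return 0, fmt.Errorf("VmRSS not found")
}

func main() {
	pid := "self" // default: inspect this process
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}
	data, err := os.ReadFile("/proc/" + pid + "/status")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	kb, err := parseVmRSS(string(data))
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("VmRSS of %s: %d kB\n", pid, kb)
}
```

Running this periodically against the vip-manager PID (for example from a timer) and logging the result would show whether memory grows linearly with the number of polls.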
