pve-ha-manager (3.6.1) bullseye; urgency=medium * cli: assert that node exist when changing CRS request state to avoid creating a phantom node by mistake * manager: ensure node-request state gets transferred to new active CRM, so that the request for (manual) maintenance mode is upheld, even if the node that is in maintenace mode is also the current active CRM and gets rebooted. * lrm: ignore shutdown policy if (manual) maintenance mode is requested to avoid exiting from maintenance mode to early. -- Proxmox Support Team Thu, 20 Apr 2023 14:16:14 +0200 pve-ha-manager (3.6.0) bullseye; urgency=medium * fix #4371: add CRM command to switch an online node manually into maintenance (without reboot), moving away all active services, but automatically migrate them back once the maintenance mode is disabled again. * manager: service start: make EWRONG_NODE a non-fatal error, but try to find its the actual node the service is residing on * manager: add new intermediate 'request_started' state for stop->start transitions * request start: optionally enable automatic selection of the best rated node by the CRS on service start up, bypassing the very high priority of the current node on which a service is located. -- Proxmox Support Team Mon, 20 Mar 2023 13:38:26 +0100 pve-ha-manager (3.5.1) bullseye; urgency=medium * manager: update crs scheduling mode once per round to avoid the need for a restart of the currently active manager. * api: status: add CRS info to manager if not set to default -- Proxmox Support Team Sat, 19 Nov 2022 15:51:11 +0100 pve-ha-manager (3.5.0) bullseye; urgency=medium * env: datacenter config: include crs (cluster-resource-scheduling) setting * manager: use static resource scheduler when configured * manager: avoid scoring nodes if maintenance fallback node is valid * manager: avoid scoring nodes when not trying next and current node is valid * usage: static: use service count on nodes as a fallback -- Proxmox Support Team Fri, 18 Nov 2022 15:02:55 +0100 pve-ha-manager (3.4.0) bullseye; urgency=medium * switch to native version formatting * fix accounting of online services when moving services due to their source node going gracefully nonoperational (maintenance mode). This ensures a better balance of services on the cluster after such an operation. -- Proxmox Support Team Fri, 22 Jul 2022 09:21:20 +0200 pve-ha-manager (3.3-4) bullseye; urgency=medium * lrm: fix getting stuck on restart due to finished worker state not being collected -- Proxmox Support Team Wed, 27 Apr 2022 14:01:55 +0200 pve-ha-manager (3.3-3) bullseye; urgency=medium * lrm: avoid possible job starvation on huge workloads * lrm: increase run_worker loop-time for doing actually work to 80% duty-cycle -- Proxmox Support Team Thu, 20 Jan 2022 18:05:33 +0100 pve-ha-manager (3.3-2) bullseye; urgency=medium * fix #3826: fix restarting LRM/CRM when triggered by package management system due to other updates * lrm: also check CRM node-status for determining if there's a fence-request and avoid starting up in that case to ensure that the current manager can get our lock and do a clean fence -> unknown -> online FSM transition. This avoids a problematic edge case where an admin manually removed all services of a to-be-fenced node, and re-added them again before the manager could actually get that nodes LRM lock. * manage: handle edge case where a node gets seemingly stuck in 'fence' state if all its services got manually removed by an admin before the fence transition could be finished. While the LRM could come up again in previous versions (it won't now, see above point) and start/stop services got executed, the node was seen as unavailable for all recovery, relocation and migrate actions. -- Proxmox Support Team Wed, 19 Jan 2022 14:30:15 +0100 pve-ha-manager (3.3-1) bullseye; urgency=medium * LRM: release lock and close watchdog if no service configured for >10min * manager: make recovery actual state in finite state machine, showing a clear transition from fence -> reocvery. * fix #3415: never switch in error state on recovery, try to find a new node harder. This improves using the HA manager for services with local resources (e.g., local storage) to ensure it always gets started, which is an OK use-case as long as the service is restricted to a group with only that node. Previously failure of that node would have a high possibility of the service going into the errors state, as no new node can be found. Now it will retry finding a new node, and if one of the restricted set, e.g., the node it was previous on, comes back up, it will start again there. * recovery: allow disabling a in-recovery service manually -- Proxmox Support Team Fri, 02 Jul 2021 20:03:29 +0200 pve-ha-manager (3.2-2) bullseye; urgency=medium * fix systemd service restart behavior on package upgrade with Debian Bullseye -- Proxmox Support Team Mon, 24 May 2021 11:38:42 +0200 pve-ha-manager (3.2-1) bullseye; urgency=medium * Re-build for Debian Bullseye / PVE 7 -- Proxmox Support Team Wed, 12 May 2021 20:55:53 +0200 pve-ha-manager (3.1-1) pve; urgency=medium * allow 'with-local-disks' migration for replicated guests -- Proxmox Support Team Mon, 31 Aug 2020 10:52:23 +0200 pve-ha-manager (3.0-9) pve; urgency=medium * factor out service configured/delete helpers * typo and grammar fixes -- Proxmox Support Team Thu, 12 Mar 2020 13:17:36 +0100 pve-ha-manager (3.0-8) pve; urgency=medium * bump LRM stop wait time to an hour * do not mark maintenaned nodes as unkown * api/status: extra handling of maintenance mode -- Proxmox Support Team Mon, 02 Dec 2019 10:33:03 +0100 pve-ha-manager (3.0-6) pve; urgency=medium * add 'migrate' node shutdown policy * do simple fallback if node comes back online from maintenance * account service to both, source and target during migration * add 'After' ordering for SSH and pveproxy to LRM service, ensuring the node stays accessible until HA services got moved or shutdown, depending on policy. -- Proxmox Support Team Tue, 26 Nov 2019 18:03:26 +0100 pve-ha-manager (3.0-5) pve; urgency=medium * fix #1339: remove more locks from services IF the node got fenced * adapt to qemu-server code refactoring -- Proxmox Support Team Wed, 20 Nov 2019 20:12:49 +0100 pve-ha-manager (3.0-4) pve; urgency=medium * use PVE::DataCenterConfig from new split-out cluster library package -- Proxmox Support Team Mon, 18 Nov 2019 12:16:29 +0100 pve-ha-manager (3.0-3) pve; urgency=medium * fix #1919, #1920: improve handling zombie (without node) services * fix # 2241: VM resource: allow migration with local device, when not running * HA status: render removal transition of service as 'deleting' * fix #1140: add crm command 'stop', which allows to request immediate service hard-stops if a timeout of zero (0) is passed -- Proxmox Support Team Mon, 11 Nov 2019 17:04:35 +0100 pve-ha-manager (3.0-2) pve; urgency=medium * services: update PIDFile to point directly to /run * fix #2234: fix typo in service description * Add missing Dependencies to pve-ha-simulator -- Proxmox Support Team Thu, 11 Jul 2019 19:26:03 +0200 pve-ha-manager (3.0-1) pve; urgency=medium * handle the case where a node gets fully purged * Re-build for Debian Buster / PVE 6 -- Proxmox Support Team Wed, 22 May 2019 19:11:59 +0200 pve-ha-manager (2.0-9) unstable; urgency=medium * get_ha_settings: cope with (temporarily) unavailable pmxcfs * lrm: exit on restart and agent lock lost for > 90s * service data: only set failed_nodes key if needed -- Proxmox Support Team Thu, 04 Apr 2019 16:27:32 +0200 pve-ha-manager (2.0-8) unstable; urgency=medium * address an issue in dpkg 1.18 with wrong trigger cycle detections if cyclic dependencies are involed -- Proxmox Support Team Wed, 06 Mar 2019 07:49:58 +0100 pve-ha-manager (2.0-7) unstable; urgency=medium * fix #1842: do not pass forceStop to CT shutdown * fix #1602: allow one to delete ignored services over API * fix #1891: Add zsh command completion for ha-manager CLI tools * fix #1794: VM resource: catch qmp command exceptions * show sent emails in regression tests -- Proxmox Support Team Mon, 04 Mar 2019 10:37:25 +0100 pve-ha-manager (2.0-6) unstable; urgency=medium * fix #1378: allow one to specify a service shutdown policy * remove some unused external dependencies from the standalone simulator package * document api result for ha resources -- Proxmox Support Team Mon, 07 Jan 2019 12:59:27 +0100 pve-ha-manager (2.0-5) unstable; urgency=medium * skip CRM and LRM work if last cfs update failed * regression test system: allow to simulate cluster fs failures * postinst: drop transitional cleanup for systemd watchdog mux socket -- Proxmox Support Team Wed, 07 Feb 2018 11:00:12 +0100 pve-ha-manager (2.0-4) unstable; urgency=medium * address timing issues happening when pve-cluster.service is being restarted -- Proxmox Support Team Thu, 09 Nov 2017 11:46:50 +0100 pve-ha-manager (2.0-3) unstable; urgency=medium * add ignore state for resources * lrm/crm service: restart on API changes * lrm.service: do not timeout on stop * fix #1347: let postfix fill in FQDN in fence mails * fix #1073: do not count backup-suspended VMs as running -- Proxmox Support Team Fri, 13 Oct 2017 11:10:51 +0200 pve-ha-manager (2.0-2) unstable; urgency=medium * explicitly sync journal when disabling watchdog updates * always queue service stop if node shuts down * Fix shutdown order of HA and storage services * Resource/API: abort early if resource in error state -- Proxmox Support Team Wed, 14 Jun 2017 07:49:59 +0200 pve-ha-manager (2.0-1) unstable; urgency=medium * rebuild for PVE 5.0 / Debian Stretch -- Proxmox Support Team Mon, 13 Mar 2017 11:31:53 +0100 pve-ha-manager (1.0-40) unstable; urgency=medium * ha-simulator: allow adding service on runtime * ha-simulator: allow deleting service via GUI * ha-simulator: allow new service request states over gui * ha-simulator: use JSON instead of Dumper for manager status view -- Proxmox Support Team Tue, 24 Jan 2017 10:03:07 +0100 pve-ha-manager (1.0-39) unstable; urgency=medium * add setup_environment hook to CLIHandler class * ha-simulator: fix typo s/Mode/Node/ * is_node_shutdown: check for correct systemd targets * Simulator: fix scrolling to end of cluster log view * Simulator: do not use cursor position to insert log -- Proxmox Support Team Thu, 12 Jan 2017 13:15:08 +0100 pve-ha-manager (1.0-38) unstable; urgency=medium * update manual page -- Proxmox Support Team Wed, 23 Nov 2016 11:46:21 +0100 pve-ha-manager (1.0-37) unstable; urgency=medium * HA::Status: provide better/faster feedback * Manager.pm: store flag to indicate successful start * ha status: include common service attributes * Groups.pm: add verbose_description for 'restricted' * Resources.pm: use verbose_description for state * pve-ha-group-node-list: add verbose_description * ha-manager: remove 'enabled' and 'disabled' commands * rename request state 'enabled' to 'started' * get_pve_lock: correctly send a lock update request -- Proxmox Support Team Tue, 22 Nov 2016 17:04:57 +0100 pve-ha-manager (1.0-36) unstable; urgency=medium * Resources: implement 'stopped' state * ha-manager: remove obsolet pod content * Fix #1189: correct spelling in fence mail * API/Status: avoid using HA Environment * factor out resource config check and default set code -- Proxmox Support Team Tue, 15 Nov 2016 16:42:07 +0100 pve-ha-manager (1.0-35) unstable; urgency=medium * change service state to error if no recovery node is available * cleanup backup & mounted locks after recovery (fixes #1100) * add possibility to simulate locks from services * don't run regression test when building the simulator package -- Proxmox Support Team Thu, 15 Sep 2016 13:23:00 +0200 pve-ha-manager (1.0-34) unstable; urgency=medium * fix race condition on slow resource commands in started state -- Proxmox Support Team Mon, 12 Sep 2016 13:07:05 +0200 pve-ha-manager (1.0-33) unstable; urgency=medium * relocate policy: try to avoid already failed nodes * allow empty json status files * more regression tests -- Proxmox Support Team Fri, 22 Jul 2016 12:16:48 +0200 pve-ha-manager (1.0-32) unstable; urgency=medium * use correct verify function for ha-group-node-list * send email on fence failure and success -- Proxmox Support Team Wed, 15 Jun 2016 17:01:12 +0200 pve-ha-manager (1.0-31) unstable; urgency=medium * selcet_service_node: include all online nodes in default group * LRM: do not count erroneous service as active * fix relocate/restart trial count leak on service deletion -- Proxmox Support Team Fri, 06 May 2016 08:26:10 +0200 pve-ha-manager (1.0-30) unstable; urgency=medium * Env: allow debug logging -- Proxmox Support Team Fri, 29 Apr 2016 16:50:34 +0200 pve-ha-manager (1.0-29) unstable; urgency=medium * Resources: deny setting nonexistent group -- Proxmox Support Team Wed, 20 Apr 2016 18:22:28 +0200 pve-ha-manager (1.0-28) unstable; urgency=medium * Config: add get_service_status method -- Proxmox Support Team Tue, 19 Apr 2016 08:41:22 +0200 pve-ha-manager (1.0-27) unstable; urgency=medium * use pve-doc-generator to generate man pages -- Proxmox Support Team Fri, 08 Apr 2016 08:25:07 +0200 pve-ha-manager (1.0-26) unstable; urgency=medium * status: show added but not yet active services * status: mark CRM as idle if no service is configured -- Proxmox Support Team Tue, 15 Mar 2016 12:49:18 +0100 pve-ha-manager (1.0-25) unstable; urgency=medium * Use config_file from PVE::QemuConfig -- Proxmox Support Team Tue, 08 Mar 2016 11:50:49 +0100 pve-ha-manager (1.0-24) unstable; urgency=medium * simulator: install all virtual resources -- Proxmox Support Team Wed, 02 Mar 2016 10:30:40 +0100 pve-ha-manager (1.0-23) unstable; urgency=medium * fix infinite started <=> migrate cycle * exec_resource_agent: process error state early * avoid out of sync command execution in LRM * do not pass ETRY_AGAIN back to the CRM -- Proxmox Support Team Wed, 24 Feb 2016 12:15:21 +0100 pve-ha-manager (1.0-22) unstable; urgency=medium * fix 'change_service_location' misuse and recovery from fencing * add VirtFail resource and use it in new regression tests * improve relocation policy code in manager and LRM * improve verbosity of API status call -- Proxmox Support Team Mon, 15 Feb 2016 10:57:44 +0100 pve-ha-manager (1.0-21) unstable; urgency=medium * Fix postinstall script not removing watchdog-mux.socket -- Proxmox Support Team Thu, 04 Feb 2016 18:23:47 +0100 pve-ha-manager (1.0-20) unstable; urgency=medium * LRM: do not release lock on shutdown errors * Split up resources and move them to own sub folder * Add virtual resources for tests and simulation * add after_fork method to HA environment and use it in LRM -- Proxmox Support Team Wed, 27 Jan 2016 17:05:23 +0100 pve-ha-manager (1.0-19) unstable; urgency=medium * remove 'running' from migrate/relocate log message * LRM: release agent lock on graceful shutdown * LRM: release agent lock also on restart * CRM: release lock on shutdown request * TestHardware: correct shutdown/reboot behaviour of CRM and LRM * resource agents: fix relocate -- Proxmox Support Team Mon, 18 Jan 2016 12:41:08 +0100 pve-ha-manager (1.0-18) unstable; urgency=medium * pve-ha-lrm.service: depend on lxc.service * output watchdog module name if it gets loaded * remove watchdog-mux.socket -- Proxmox Support Team Tue, 12 Jan 2016 12:27:49 +0100 pve-ha-manager (1.0-17) unstable; urgency=medium * Resources.pm: use PVE::API2::LXC -- Proxmox Support Team Mon, 11 Jan 2016 12:25:38 +0100 pve-ha-manager (1.0-16) unstable; urgency=medium * check_active_workers: fix typo /uuid/uid/ -- Proxmox Support Team Mon, 21 Dec 2015 10:21:30 +0100 pve-ha-manager (1.0-15) unstable; urgency=medium * stop all resources on node shutdown (instead of freeze) -- Proxmox Support Team Wed, 16 Dec 2015 10:33:30 +0100 pve-ha-manager (1.0-14) unstable; urgency=medium * allow to configure watchdog module in /etc/default/pve-ha-manager -- Proxmox Support Team Thu, 03 Dec 2015 11:09:47 +0100 pve-ha-manager (1.0-13) unstable; urgency=medium * HA API: Fix permissions -- Proxmox Support Team Fri, 30 Oct 2015 11:16:50 +0100 pve-ha-manager (1.0-12) unstable; urgency=medium * Adding constants to gain more readability * exec_resource_agent: return valid exit code instead of die's * code cleanups -- Proxmox Support Team Thu, 29 Oct 2015 10:21:49 +0100 pve-ha-manager (1.0-11) unstable; urgency=medium * add workaround for bug #775 -- Proxmox Support Team Wed, 21 Oct 2015 08:58:41 +0200 pve-ha-manager (1.0-10) unstable; urgency=medium * better resource status check on addition and update -- Proxmox Support Team Mon, 12 Oct 2015 18:26:24 +0200 pve-ha-manager (1.0-9) unstable; urgency=medium * delete node from CRM status when deleted from cluster -- Proxmox Support Team Tue, 29 Sep 2015 07:35:30 +0200 pve-ha-manager (1.0-8) unstable; urgency=medium * Use new lock domain sub instead of storage lock -- Proxmox Support Team Sat, 26 Sep 2015 10:36:09 +0200 pve-ha-manager (1.0-7) unstable; urgency=medium * enhance ha-managers group commands * vm_is_ha_managed: allow check on service state * improve Makefile -- Proxmox Support Team Mon, 21 Sep 2015 12:17:41 +0200 pve-ha-manager (1.0-6) unstable; urgency=medium * implement bash completion for ha-manager * implement recovery policy for services * simulator: fix random output of manager status -- Proxmox Support Team Wed, 16 Sep 2015 12:06:12 +0200 pve-ha-manager (1.0-5) unstable; urgency=medium * Replacing hardcoded qemu commands with plugin calls * improve error state behaviour -- Proxmox Support Team Tue, 08 Sep 2015 08:45:36 +0200 pve-ha-manager (1.0-4) unstable; urgency=medium * groups: encode nodes as hash (internally) * add trigger for pve-api-updates -- Proxmox Support Team Tue, 16 Jun 2015 09:59:03 +0200 pve-ha-manager (1.0-3) unstable; urgency=medium * CRM: do not start if there is no resource.cfg file to avoid warnings -- Proxmox Support Team Tue, 09 Jun 2015 14:35:09 +0200 pve-ha-manager (1.0-2) unstable; urgency=medium * use Wants instead of Requires inside systemd service definitions -- Proxmox Support Team Tue, 09 Jun 2015 09:33:24 +0200 pve-ha-manager (1.0-1) unstable; urgency=medium * enable/start crm and lrm services by default -- Proxmox Support Team Fri, 05 Jun 2015 10:03:53 +0200 pve-ha-manager (0.9-3) unstable; urgency=medium * regression test improvements -- Proxmox Support Team Fri, 10 Apr 2015 06:53:23 +0200 pve-ha-manager (0.9-2) unstable; urgency=medium * issue warning if ha group does not exist -- Proxmox Support Team Tue, 07 Apr 2015 09:52:07 +0200 pve-ha-manager (0.9-1) unstable; urgency=medium * rename vm resource prefix: pvevm: => vm: * add API to query ha status * allow to use simply VMIDs as resource id * finalize ha api -- Proxmox Support Team Fri, 03 Apr 2015 06:18:05 +0200 pve-ha-manager (0.8-2) unstable; urgency=medium * lrm: reduce TimeoutStopSec to 95 * lrm: set systemd killmode to 'process' -- Proxmox Support Team Thu, 02 Apr 2015 08:48:24 +0200 pve-ha-manager (0.8-1) unstable; urgency=medium * currecrtly send cfs lock update request -- Proxmox Support Team Thu, 02 Apr 2015 08:18:00 +0200 pve-ha-manager (0.7-1) unstable; urgency=medium * create /etc/pve/ha automatically * use correct package for lock_ha_config * fix ha-manager status when ha is unconfigured * do not unlink watchdog socket when started via systemd * depend on systemd -- Proxmox Support Team Wed, 01 Apr 2015 11:05:08 +0200 pve-ha-manager (0.6-1) unstable; urgency=medium * move configuration handling into PVE::HA::Config * ha-manager status: add --verbose flag * depend on qemu-server -- Proxmox Support Team Fri, 27 Mar 2015 12:28:50 +0100 pve-ha-manager (0.5-1) unstable; urgency=medium * implement service migration * fix service dependencies (allow restart, reboot) * freeze services during reboot/restart -- Proxmox Support Team Thu, 26 Mar 2015 13:22:58 +0100 pve-ha-manager (0.4-1) unstable; urgency=medium * increase fence_delay to 60 seconds * fix regression test environment * fix failover after master crash with pending fence action -- Proxmox Support Team Wed, 25 Mar 2015 13:59:28 +0100 pve-ha-manager (0.3-1) unstable; urgency=medium * really activate softdog * correctly count active services * implement fence_delay to avoid immediate fencing * pve-ha-simulator: reset watchdog with poweroff * pve-ha-simulator: use option nofailback for default groups -- Proxmox Support Team Mon, 16 Mar 2015 13:03:23 +0100 pve-ha-manager (0.2-1) unstable; urgency=medium * add ha-manager command line tool * start implementing resources and groups API -- Proxmox Support Team Fri, 13 Mar 2015 09:26:12 +0100 pve-ha-manager (0.1-1) unstable; urgency=low * first package -- Proxmox Support Team Wed, 18 Feb 2015 11:30:21 +0100