Wallaby Series (6.5.0 - 7.0.x) Release Notes

7.1.0-7

Security Issues

  • Ironic-Python-Agent versions prior to the 2023.1 release are vulnerable to CVE-2024-44082, tracked in bug 2071740 <https://bugs.launchpad.net/bugs/2071740>. Deployers of Ironic versions Zed or older must apply the CVE-2024-44082 fixes to their Ironic environment and leave [conductor]/conductor_always_validates_images set to True (the default for all releases Zed and older). This ensures the conductor will security check the image because Ironic-Python-Agent will not.
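
A deployer could sanity-check this setting with something like the sketch below. It assumes the conductor configuration can be parsed as a standard INI file; the function name is hypothetical and not part of Ironic. Following the note above, an absent option is treated as True, the default on Zed and older.

```python
import configparser


def conductor_validates_images(conf_text: str) -> bool:
    """Return True unless the option is explicitly disabled.

    Hypothetical helper for illustration; not an Ironic API.
    """
    # strict=False tolerates repeated options, interpolation=None avoids
    # tripping over literal '%' characters in unrelated values.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getboolean(
        "conductor", "conductor_always_validates_images", fallback=True)
```

Passing the text of the conductor's configuration file to this function returns False only when the option has been explicitly turned off.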

Bug Fixes

  • Fixes UEFI NVRAM record handling with efibootmgr so we can accept and handle UTF-16 encoded data, which is to be expected in UEFI NVRAM as the records are UTF-16 encoded.

  • Fixes handling of UEFI NVRAM records to allow for unexpected characters in the response, so it is non-fatal to Ironic.

  • Fixes, or at least lessens, the case where a running Ironic agent can stack up numerous lookup requests against an Ironic deployment when a node is locked. In particular, this is because the lookup also drives generation of the agent token, which requires the conductor to allocate a worker, generate the token, and return the result to the API client. Ironic’s retry logic will now wait up to 60 seconds, and if an HTTP Conflict (409) message is received, the agent will automatically pause lookup operations for thirty seconds as opposed to continuing to attempt lookups, which could needlessly create more work for the Ironic deployment.
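
The lookup back-off described above can be sketched roughly as follows. This is one possible reading, not the agent's actual code: do_lookup is a hypothetical callable returning a (status_code, body) pair, and treating the 60 seconds as an overall deadline is an assumption.

```python
import time

LOOKUP_TIMEOUT = 60   # seconds to keep retrying (assumed overall deadline)
CONFLICT_PAUSE = 30   # seconds to pause after an HTTP 409 Conflict


def lookup_with_backoff(do_lookup, sleep=time.sleep, clock=time.monotonic,
                        retry_interval=1):
    """Call do_lookup() until it succeeds or the deadline passes.

    do_lookup is a hypothetical callable returning (status_code, body).
    Returns the body on HTTP 200, or None if the deadline is reached.
    """
    deadline = clock() + LOOKUP_TIMEOUT
    while True:
        status, body = do_lookup()
        if status == 200:
            return body
        # A locked node answers 409: pause instead of piling on requests.
        pause = CONFLICT_PAUSE if status == 409 else retry_interval
        if clock() + pause > deadline:
            return None
        sleep(pause)
```

Injecting sleep and clock keeps the sketch testable without real delays.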

7.1.0

Bug Fixes

  • Fixes a minor issue with the regular expression used for UEFI duplicate entry cleanup, which was introduced in a prior change to refactor the cleanup operation to avoid UEFI firmware which treats deletion of entries after addition as an invalid operation.

  • Fixes cases where duplicates may not be found in the UEFI firmware NVRAM boot entry table by explicitly looking for, and deleting, entries with matching labels in advance of creating the EFI boot loader entry.

  • If the CSV file used for the bootloader hint does not have a BOM, reading its content fails because the utf-16 codec is too generic. The agent now fails over to utf-16-le, as Little Endian is mostly used.

  • Fixes configuring UEFI boot when the EFI partition is located on a devicemapper device.

  • Fixes GenericHardwareManager to find network information for bonded interfaces if they exist.

  • Fixes a race on software RAID creation: since the creation of partitions is asynchronous, we need to wait for all udev events to be processed before we can use the partitions to create an md device.

  • Fixes an issue where partitions are not visible due to an incorrect call to have the partition table re-read.

  • Fixes an issue where partitions are not visible due to an incorrect call to have the partition table re-read during RAID configuration creation.

  • Fixes handling of Software RAID device discovery so RAID device Names and Events field values do not inadvertently cause the command to return unexpected output. Previously this could cause a deployment to fail when handling UEFI partitions.

  • Fixes an issue when the EFI partition UUID is not set and an attempt to edit /etc/fstab is made.

  • Fixes handling of a Partition UUID being returned instead of a Partition’s UUID when the OS may not return the Partition’s UUID in time. These two fields are typically referred to as PARTUUID and UUID, respectively. Often these sorts of issues arise under heavy IO load. We now scan to identify which “UUID” was returned, and update the Linux fstab entry appropriately. For more information, please see story #2009881.

  • Recent releases of Red Hat grub2 will always fail when installing to EFI paths, to encourage a transition to the signed shim bootloader. Partition image deploys avoid calling grub2-install with the preserve-efi-assets functions. Deploying whole disk images doesn’t require grub2-install. This leaves whole disk images installed onto softraid devices, which still call grub2-install. Running grub2-install is still attempted in this one remaining case, but any failures are now ignored.

  • Fixes failures with handling of Multipath IO devices where Active/Passive storage arrays are in use. Previously, “standby” paths could result in IO errors causing cleaning to terminate. The agent now explicitly attempts to handle and account for multipaths based upon the MPIO data available. This requires the multipath and multipathd utilities to be present in the ramdisk. These are supplied by the device-mapper-multipath or multipath-tools packages, and are not required for the agent’s use.

  • Fixes non-ideal behavior when performing cleaning where Active/Active MPIO devices would ultimately be cleaned once per IO path, instead of once per backend device.

  • Fixes discovering WWN/serial numbers for devicemapper devices.
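
The UTF-16 fallback for bootloader CSV files mentioned in the list above can be illustrated with a short sketch. Note that this version checks for a BOM explicitly rather than catching a decoding error, which is a substitution made for clarity; the function name is hypothetical and this is not the agent's code.

```python
import codecs


def decode_boot_csv(raw: bytes) -> str:
    """Decode the bytes of a bootloader CSV file as UTF-16.

    Hypothetical helper: with a BOM, the generic utf-16 codec can pick
    the byte order itself; without one, assume Little Endian.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    return raw.decode("utf-16-le")
```

The design choice mirrors the note: Little Endian is the common case for these files, so it is the assumed byte order when nothing in the file says otherwise.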

Other Notes

  • The agent will now attempt to collect any multipath path information and upload it to the agent ramdisk, if the tooling is present.

7.0.2

New Features

  • Heartbeats to the conductor are grouped when they are scheduled or requested within a time interval of five seconds to avoid sending them in quick succession.

  • Adds the capability into the agent to read and act upon bootloader CSV files which serve as authoritative indicators of what bootloader to load instead of leaning towards utilizing the default.
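
The heartbeat grouping described in the first item above might look something like the following sketch. It treats "grouping" as suppressing any heartbeat requested within five seconds of the last one sent, which is an assumption; the class and method names are hypothetical.

```python
import time


class HeartbeatGrouper:
    """Collapse heartbeat requests that arrive within a short window."""

    def __init__(self, window=5.0, clock=time.monotonic):
        self.window = window      # seconds; five per the note above
        self.clock = clock        # injectable for testing
        self._last_sent = None

    def should_send(self) -> bool:
        """Return True if a heartbeat should be sent now."""
        now = self.clock()
        if self._last_sent is not None and now - self._last_sent < self.window:
            return False  # grouped with the heartbeat already sent
        self._last_sent = now
        return True
```

A caller would check should_send() before each heartbeat and skip the request when it returns False.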

Known Issues

  • If multiple bootloader CSV files are present on the EFI filesystem, the first CSV file discovered will be utilized. The Ironic team considers multiple files to be a defect in the image being deployed. This may be changed in the future.

Bug Fixes

  • Fixes an issue with bootloader installation on a software RAID by checking if the ESP is already mounted.

  • Fixes an issue where a quick succession of heartbeats exposes a race condition in the conductor’s RPC handling.

  • Fixes fall-back to sysrq when powering off or rebooting the node from inside a container.

  • Fixes an error with UEFI based deployments where using a partition image with an NVMe device was previously failing due to the different device name pattern.

  • Fixes an issue where the NTP time sync at the IPA startup via chronyd is not immediate (which can break time sensitive components such as the generation of a TLS certificate).

  • Fixes failures with disk image conversions which result in memory allocation or input/output errors due to memory limitations, by limiting the number of available memory allocation pools to a non-dynamic, reasonable number which should not exceed the available system memory.

  • The lshw package version B.02.19.2-5 on CentOS 8.4 and 8.5 contains a bug that prevents the size of individual memory banks from being reported, with the result that the total memory size would be reported as 0 in some places. The total memory size is now taken from lshw’s total memory size output (which does not suffer from the same problem) when available.

  • Mirrors the previously disconnected EFI system partitions (ESPs) in UEFI software RAID setups. Disconnected ESPs can lead to nodes booting with outdated kernel parameters or the UEFI firmware not finding bootable kernels at all.

  • Fixes nodes failing after deployment completes due to issues in the Grub2 EFI loader entry addition where a BOOT.CSV file provides the authoritative pointer to the bootloader to be used for booting the OS. The base issue with Grub2 is that it would update the UEFI bootloader NVRAM entries with whatever is present in a vendor specific BOOT.CSV or BOOTX64.CSV file. In some cases, a baremetal machine can crash when this occurs. More information can be found at story 2008962.
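
The lshw memory fix in the list above amounts to preferring the reported total over the sum of per-bank sizes. A minimal sketch follows, assuming a memory node shaped like a dictionary with a "size" key and a "children" list of banks; that layout is an illustrative assumption rather than a guaranteed lshw schema, and the function name is hypothetical.

```python
def total_memory_bytes(memory_node: dict) -> int:
    """Return total memory, preferring the node-level total.

    memory_node is assumed to look like
    {"size": <bytes>, "children": [{"size": <bytes>}, ...]}.
    """
    total = memory_node.get("size")
    if total:
        return total
    # Fall back to summing the banks when no total is reported.
    banks = memory_node.get("children", [])
    return sum(bank.get("size", 0) for bank in banks)
```

With the affected lshw version the banks carry no size, so the sum alone would be 0; reading the node-level total first avoids that.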

7.0.1

Bug Fixes

  • Fixes initial logging before configuration is loaded to re-log anything recorded for the purposes of troubleshooting. This is necessary as systemd does not report stdout from a process launch as part of the process’s logging. Now messages will be re-logged once the configuration has been loaded.

  • No longer crashes if the MAC address cannot be determined for one of the network interfaces.

  • Adds a call to “udevadm settle” in write_image.sh. After the GPT and MBR are destroyed, systemd-udevd gets triggered, which may hold /dev/sda open, preventing qemu-img from writing its image.

7.0.0

New Features

  • Adds support for NVMe-specific storage cleaning to IPA. Currently this is implemented by using nvme-cli format functionality. Crypto Erase is used if supported by the device, otherwise the code falls back to User Data Erase. Operators can control NVMe cleaning by using the deploy.enable_nvme_erase config option, which controls the agent_enable_nvme_erase internal setting in driver_internal_info.
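
The choice between the two erase modes can be sketched as a small command builder. The --ses values (1 for User Data Erase, 2 for Cryptographic Erase) are nvme-cli's Secure Erase Settings; the function name is hypothetical, and how the agent detects Crypto Erase support is outside this sketch.

```python
USER_DATA_ERASE = 1   # nvme format --ses value: user data erase
CRYPTO_ERASE = 2      # nvme format --ses value: cryptographic erase


def nvme_format_command(device: str, crypto_supported: bool) -> list:
    """Build an nvme-cli format command line for cleaning a device.

    Hypothetical helper: prefers Crypto Erase when the caller reports
    support, otherwise falls back to User Data Erase.
    """
    ses = CRYPTO_ERASE if crypto_supported else USER_DATA_ERASE
    return ["nvme", "format", device, "--ses=%d" % ses]
```

The function only builds the argument list; running it is destructive and is left to the caller.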

Known Issues

  • Logic around virtual media device validation is now much more strict, and may not work in all cases. Should you discover a case, please provide the output from lsblk -P -O with a virtual media device attached to the Ironic development community via Storyboard.

  • Internal logic to copy configuration data from virtual media now requires the boot_method=vmedia flag to be set on the kernel command line of the bootloader for the virtual media. Operators crafting custom boot ISOs should ensure that the appropriate command line is being added in any custom build processes.
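
Operators building custom boot ISOs could verify the flag with a check along these lines. This is a minimal sketch; the function name is hypothetical, and reading the command line from /proc/cmdline is the usual Linux convention rather than something the note specifies.

```python
def booted_from_vmedia(cmdline: str) -> bool:
    """Return True if the kernel command line carries boot_method=vmedia."""
    # Split on whitespace so only a whole argument matches.
    return "boot_method=vmedia" in cmdline.split()
```

On a running ramdisk, the argument would typically be the contents of /proc/cmdline.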

Upgrade Notes

  • It is no longer possible to enable the so-called standalone mode, in which the agent does not communicate with ironic. This mode is only useful for local testing; enabling it in production is always wrong. The ironic team does not support using ironic-python-agent as a standalone application outside of the normal workflow.

Security Issues

  • Addresses a potential vector in which a system-authenticated malicious actor could leverage data left on disk in some limited cases to make the API of the ironic-python-agent attackable, or possibly break cleaning processes to prevent the machine from being returned to the available pool. Please see story 2008749 for more information.

Bug Fixes

  • Adds validation of Virtual Media devices in order to prevent existing partitions on the system from being considered as potential sources of IPA configuration data.

  • Adds a check into the configuration load from virtual media, to ensure it only occurs when the machine booted from virtual media.

  • IPA will now successfully clean configuration when it encounters a software RAID array that was previously created using entire devices instead of partitions.

  • IPA now properly checks if the root partition is already mounted. See Story 2008631 for details.

  • Fixes an issue where metadata erasure cleaning fails for partitions because the read-only file isn’t found, while it is available at the base device. Adds a check for the base device file on failure. See story 2008696.

  • Fixes incorrect root partition UUID after streaming a raw partition image.

  • Increases the memory usage limit for the qemu-img convert command to 2 GiB. See Story 2008667 for details.