2023.2 Series (9.5.0 - 9.7.x) Release Notes¶
2023.2-eol¶
Bug Fixes¶
Using md_device as the default volume name when the volume name of a RAID array had not been specified caused a ‘Not POSIX compatible’ error. This has been fixed by using just the last part of the md_device. Fixes https://bugs.launchpad.net/ironic-python-agent/+bug/2073406
Prevents an UnboundLocalError in erase_devices_express, for example, on a disk failure.
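As an illustration of "using just the last part of the md_device", the following minimal sketch derives a volume name from a device path. The function name default_volume_name is hypothetical and this is not the agent's actual code; it only shows the idea, assuming the md_device is a path such as /dev/md0.

```python
import os.path


def default_volume_name(md_device: str) -> str:
    """Return the last path component of an md device path.

    Hypothetical helper: the full path (for example '/dev/md0') contains
    slashes, which is what triggered the 'Not POSIX compatible' error,
    while the last component alone does not.
    """
    return os.path.basename(md_device.rstrip("/"))


print(default_volume_name("/dev/md0"))  # md0
```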
9.7.2¶
Upgrade Notes¶
Deployers implementing their own HardwareManagers must audit their code for unsafe uses of qemu-img and related methods.
Security Issues¶
Ironic-Python-Agent now checks any supplied image format value against the detected format of the image file and will prevent deployments should the values mismatch.
Images previously misconfigured as raw despite being in another format may, in some non-default configurations, have been mistakenly converted if needed. Ironic-Python-Agent will no longer perform conversion in any case for images with metadata indicating the raw format.
Ironic-Python-Agent always inspects any non-raw user image content for safety before running any qemu-based utilities on the image. This inspection is used to identify the format of the image and to verify the overall safety of the image. Any images with unknown or unsafe feature uses are explicitly rejected. This can be disabled in both IPA and Ironic by setting [conductor]disable_deep_image_inspection to True for the Ironic deployment. Image inspection is the primary mitigation for CVE-2024-44082, which is tracked in bug 2071740. Operators may wish to set [conductor]conductor_always_validates_images on Ironic conductors to mitigate the issue before they have upgraded their Ironic-Python-Agent.
Ironic-Python-Agent now explicitly enforces a list of permitted image types for deployment, defaulting to “raw” and “qcow2”. Other image types may work, but are not explicitly supported and must be enabled. This can be modified by setting [conductor]permitted_image_formats for all Ironic services.
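The two format rules described in this section (a permitted-format list, and a match between the supplied and detected format) can be sketched as below. This is a simplified illustration, not the agent's implementation: the names check_image_format and InvalidImage are hypothetical, the detected format is assumed to come from the image inspection step, and the special handling of raw images is left out.

```python
# Defaults named in the release note; other formats must be enabled explicitly.
PERMITTED_IMAGE_FORMATS = ("raw", "qcow2")


class InvalidImage(Exception):
    """Raised when an image must not be deployed (hypothetical name)."""


def check_image_format(supplied, detected, permitted=PERMITTED_IMAGE_FORMATS):
    """Return the format to deploy, or raise InvalidImage.

    supplied: format declared in the image metadata.
    detected: format found by inspecting the image content (assumed input).
    """
    if detected not in permitted:
        raise InvalidImage(f"image format {detected!r} is not permitted")
    if supplied != detected:
        raise InvalidImage(
            f"supplied format {supplied!r} does not match "
            f"detected format {detected!r}"
        )
    return detected
```

Passing a longer tuple as permitted mirrors what an operator does by changing the permitted formats option.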
Bug Fixes¶
Fixes an issue where configuration drive volumes which are mounted by the operating system could remain mounted and cause a lock to be held, which may conflict with actions such as rebuild. The agent now always makes sure the folder used by Glean and Cloud-init is not mounted.
Fixes multiple issues in the handling of images as it relates to execution of the qemu-img utility. When using this utility to convert an unsafe image, a malicious user can extract information from a node while Ironic-Python-Agent is deploying or converting an image. Ironic-Python-Agent now inspects all non-raw images for safety, and never runs qemu-based utilities on raw images. This fix is tracked as CVE-2024-44082 and bug 2071740.
Images with metadata indicating a “raw” disk format may have been transparently converted from another format. Now, these images will have their exact contents imaged to disk without modification.
Fixes bug 2066308, an issue where Ironic Python Agent would call evaluate_hardware_support multiple times on hardware manager plugins. Scanning for hardware and disks is time consuming, and the repeated calls caused timeouts on badly-performing nodes.
9.7.1¶
Bug Fixes¶
Fixes a failure case where downloads would not be retried when the checksum fails verification. The agent now includes the checksum activity as part of the file download operation, and will automatically retry downloads when the checksum fails, in accordance with the existing download retry logic. This is largely in response to what appear to be intermittent transport failures at lower levels which we cannot otherwise detect.
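The idea of treating checksum verification as part of the download, so that a mismatch is retried like any other download failure, can be sketched as follows. This is an assumption-laden illustration rather than the agent's code: fetch is a hypothetical callable returning the image bytes, SHA-256 is chosen only for the example, and max_attempts stands in for the existing retry settings.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when downloaded content does not match the expected digest."""


def download_with_verification(fetch, expected_sha256, max_attempts=3):
    """Download and verify in one retried operation (illustrative sketch)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _attempt in range(max_attempts):
        try:
            data = fetch()
            actual = hashlib.sha256(data).hexdigest()
            if actual != expected_sha256:
                # A bad checksum is handled like a failed transfer.
                raise ChecksumMismatch(f"expected {expected_sha256}, got {actual}")
            return data
        except (ChecksumMismatch, OSError) as exc:
            last_error = exc
    raise last_error
```

Because the verification sits inside the loop, a corrupted transfer that completes without a transport error still triggers another attempt.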
Fixes missing Content-Type header when sending inspection data back to ironic-inspector or ironic. While ironic-inspector tolerates the missing header, it may cause issues with the new inspection implementation.
The default timeout value for the agent to look itself up in an Ironic deployment has been extended to 600 seconds from 300 seconds. This is to provide better stability for Ironic deployments under heavy load which may be unable to service new requests. This is particularly true when the backing database for Ironic is SQLite, due to the limited write concurrency of the database.
Fixes a reference to the raid_device variable before assignment; it is replaced by the blk variable.
Inspection is now retried on HTTP 409 (conflict), which can be returned by the new implementation in Ironic.
Fixes posting data to inspector so that it is retried on 50X errors.
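The two retry notes above (HTTP 409 and 50X responses when posting inspection data) amount to a rule about which status codes are worth another attempt. The sketch below is a hedged illustration only: it reads "50X" as any 5xx status, which is an assumption, and post is a hypothetical callable that returns an HTTP status code.

```python
def should_retry(status_code: int) -> bool:
    """Return True for responses treated as transient in this sketch."""
    return status_code == 409 or 500 <= status_code <= 599


def post_with_retry(post, max_attempts=5):
    """Call post() until it returns a non-retryable status or attempts run out."""
    status = None
    for _attempt in range(max_attempts):
        status = post()
        if not should_retry(status):
            break
    return status
```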
Fixes the error handling of the multipathd service startup/discovery process. IPA handles both the scenario where the multipathd service is already started and the scenario where it has not been started; in the second scenario IPA will try to start the service. IPA does not check beforehand whether multipathd is already running: it starts the multipathd service even if it is already running and expects a return code of 0. It has been noticed that with certain combinations of Linux distributions and multipathd versions the return code is not 0 when IPA tries to start multipathd while an instance of multipathd is already running. When the return code is not 0, an exception is thrown, which causes the multipath device discovery to terminate prematurely; if the selected root device is a multipath device, IPA won’t be able to provision. This fix discards the exception caused by the non-zero return code from the multipathd startup process. If there is a genuine issue with the multipath service, it will be caught when the actual multipath device listing command (multipath -ll) is executed.
Fixes an issue with rebuilding instances on Software RAID with RAIDed ESP partitions.
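The behaviour described for multipathd can be sketched as: tolerate a non-zero exit code from the startup command, then let the listing command surface any real problem. This is a simplified illustration, not the agent's code; the function name and the injectable run parameter are hypothetical and exist so the flow can be exercised without multipathd installed.

```python
import subprocess


def list_multipath_devices(run=subprocess.run):
    """Start multipathd if possible, then return `multipath -ll` output."""
    try:
        run(["multipathd"], check=True)
    except subprocess.CalledProcessError:
        # Some distro/multipathd combinations exit non-zero when an
        # instance is already running, so the failure is discarded here.
        pass
    # A genuinely broken multipath setup fails at this step instead.
    result = run(["multipath", "-ll"], check=True,
                 capture_output=True, text=True)
    return result.stdout
```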
9.7.0¶
New Features¶
Adds a new service extension which facilitates command handling for Ironic to retrieve a list of service steps.
Adds a new method to the base HardwareManager, get_service_steps, which works the same as get_clean_steps and get_deploy_steps. These methods can be extended by hardware managers to permit them to signal what steps are permitted.
Extends reasonable deploy/clean steps to also be service steps which are embedded in the Ironic agent. For example, CPU, Network, and Memory burnin steps are available as service steps, but not the disk burnin step as that would likely result in the existing disk contents being damaged.
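A hardware manager advertising a service step might look roughly like the sketch below. To stay self-contained it uses a stand-in base class instead of importing ironic_python_agent.hardware.HardwareManager. The step name example_health_check is hypothetical, and the dictionary keys are assumed to follow the format used by get_clean_steps, since the note says the methods work the same way.

```python
class HardwareManager:
    """Stand-in for the agent's base HardwareManager (assumption)."""

    def get_service_steps(self, node, ports):
        return []


class ExampleHardwareManager(HardwareManager):
    """Hypothetical manager that signals one permitted service step."""

    def get_service_steps(self, node, ports):
        return [{
            "step": "example_health_check",  # name of the method to call
            "priority": 0,
            "interface": "deploy",
            "reboot_requested": False,
            "abortable": True,
        }]

    def example_health_check(self, node, ports):
        # A real step would inspect hardware; this only returns a marker.
        return {"healthy": True}
```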
Bug Fixes¶
Fixes a failure case where a deployed instance may be unable to access the configuration drive post-deployment. This can occur when block devices only support 4KB IO interactions. When 4KB block IO sizes are in use, the ISO9660 filesystem driver in Linux cannot be used as it is modeled around a 2KB block. We now attempt to verify, and rebuild the configuration drive on a FAT filesystem when we cannot mount the supplied configuration drive. Operators can force the agent to write configuration drives using the FAT filesystem using the [DEFAULT]config_drive_rebuild option.
Fixes, or at least lessens, the case where a running Ironic agent can stack up numerous lookup requests against an Ironic deployment when a node is locked. In particular, this is because the lookup also drives generation of the agent token, which requires the conductor to allocate a worker, generate the token, and return the result to the API client. Ironic’s retry logic will now wait up to 60 seconds, and if an HTTP Conflict (409) message is received, the agent will automatically pause lookup operations for thirty seconds as opposed to continuing to attempt lookups, which could needlessly create more work for the Ironic deployment.
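The pause-on-conflict behaviour described above can be sketched as a lookup loop that sleeps for thirty seconds whenever a 409 is received. This is only an illustration under stated assumptions: lookup is a hypothetical callable returning a (status, body) pair, max_attempts is invented for the example, and sleep is injectable so the logic can be exercised without waiting.

```python
import time

CONFLICT_PAUSE_SECONDS = 30  # pause length named in the release note


def lookup_with_pause(lookup, sleep=time.sleep, max_attempts=5):
    """Retry lookup(), pausing after each HTTP 409 instead of hammering the API."""
    for _attempt in range(max_attempts):
        status, body = lookup()
        if status == 409:
            # The node is locked; back off rather than queue more work.
            sleep(CONFLICT_PAUSE_SECONDS)
            continue
        return status, body
    raise RuntimeError("lookup kept returning HTTP 409")
```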
