Xena Series Release Notes

13.9.0-24

New Features

  • Added two new flags to alter behaviour in RabbitMQ:

    • rabbitmq_message_ttl_ms, which lets you set a TTL on messages.

    • rabbitmq_queue_expiry_ms, which lets you set an expiry time on queues.

    See https://www.rabbitmq.com/ttl.html for more information on both.
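
As a minimal sketch of the two RabbitMQ flags, assuming they are set in globals.yml like other Kolla Ansible variables. The values are purely illustrative and are not recommendations from these notes:

```yaml
# globals.yml (assumed location; values are illustrative only)
rabbitmq_message_ttl_ms: 600000    # message TTL in milliseconds (10 minutes)
rabbitmq_queue_expiry_ms: 3600000  # queue expiry in milliseconds (1 hour)
```

Both names end in _ms, so the values are presumably interpreted as milliseconds; check the linked RabbitMQ documentation before choosing real values.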

Upgrade Notes

  • The ironic_tftp service no longer binds to 0.0.0.0; by default it uses the IP address of the api_interface. To revert to the old behaviour, set ironic_tftp_interface_address: 0.0.0.0 in globals.yml.

  • The InfluxDB variable infuxdb_internal_endpoint has been fixed to influxdb_internal_endpoint. Operators might need to review the relevant variable.
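
The revert described for ironic_tftp can be sketched as the following globals.yml fragment. The quoting is a stylistic choice here, not something the note prescribes:

```yaml
# globals.yml: restore the previous behaviour of binding TFTP on all addresses
ironic_tftp_interface_address: "0.0.0.0"
```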

Security Issues

  • The kolla-genpwd, kolla-mergepwd, kolla-readpwd and kolla-writepwd commands now create or update passwords.yml with correct permissions. They also display a warning message about incorrect permissions.

  • Restricts access to the /server-status endpoint exposed by default by the HTTP OpenStack services through HAProxy on the public endpoint. Fixes the issue for Ubuntu/Debian installations; RockyLinux/CentOS are not affected. LP#1996913

Bug Fixes

  • The precheck for RabbitMQ failed incorrectly when kolla_externally_managed_cert was set to true. LP#1999081

  • Fixes creation of the SASL account before the config file is ready. LP#2015589

  • Fixes an issue where Kolla set the producer tasks to None, which disabled all Designate producer tasks. LP#1879557

  • Configuration of service user tokens for all Nova and Cinder services is now done automatically, to ensure security of block-storage volume data.

    See LP#2004555 for more details.

  • Fixes ironic_tftp which binds to all IP addresses on the system. Added ironic_tftp_interface, ironic_tftp_address_family and ironic_tftp_interface_address parameters to set the address for the ironic_tftp service. LP#2024664

  • Adds configuration necessary for application credential access rules to properly function. LP#1965111

  • Fixes deployment when using Ansible check mode. LP#2002661

  • Fixes the incorrect endpoint URLs and service type information for the Cyborg service in Keystone. LP#2020080

  • When upgrading or deploying RabbitMQ, the policy ha-all is cleared if om_enable_rabbitmq_high_availability is set to false.

13.9.0

New Features

  • Since CVE-2022-29404 was fixed, the default value for the LimitRequestBody directive in the Apache HTTP Server has been changed from 0 (unlimited) to 1073741824 (1 GiB). This limits the size of images (for example) uploaded in Horizon. This limit can now be configured via horizon_httpd_limitrequestbody. LP#2012588

  • etcd is now exposed internally via HAProxy on etcd_client_port.

  • The config option rabbitmq_ha_replica_count is added, to allow for changing the replication factor of mirrored queues in RabbitMQ. While the flag is unset, the queues are mirrored across all nodes using “ha-mode”:“all”. Note that this only has an effect if the flag om_enable_rabbitmq_high_availability is set to True, as otherwise queues are not mirrored.

  • The config option rabbitmq_ha_promote_on_shutdown has been added, which allows changing the RabbitMQ definition ha-promote-on-shutdown. By default ha-promote-on-shutdown is “when-synced”. We recommend changing this to “always”. This means accepting the loss of some messages in order to give priority to RabbitMQ availability. This is most relevant when restarting RabbitMQ, such as when upgrading. Note that setting the value of this flag, even to the default value of “when-synced”, will cause RabbitMQ to be restarted on the next deploy. For more details please see: https://www.rabbitmq.com/ha.html#cluster-shutdown

  • Services using etcd3gw via tooz now use etcd via HAProxy. This removes a single point of failure, where we hardcoded the first etcd host for backend_url.
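
The RabbitMQ and Horizon options above might be combined as in the following sketch. It assumes the variables live in globals.yml; the replica count and body limit are illustrative values, not recommendations from these notes:

```yaml
# globals.yml (assumed location)
om_enable_rabbitmq_high_availability: true   # required for mirroring to apply
rabbitmq_ha_replica_count: 2                 # illustrative; unset mirrors to all nodes
rabbitmq_ha_promote_on_shutdown: "always"    # the value recommended in the note above
horizon_httpd_limitrequestbody: 2147483648   # illustrative: 2 GiB instead of the 1 GiB default
```

Keep in mind that, per the note above, setting rabbitmq_ha_promote_on_shutdown to any value causes RabbitMQ to be restarted on the next deploy.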

Upgrade Notes

  • ironic.conf now sets [pxe]\kernel_append_params instead of [pxe]\pxe_append_params which has been deprecated. Please override the new config option if you are overriding the old one.

  • Default tags of neutron_tls_proxy and glance_tls_proxy have been changed to haproxy_tag, as both services are using the haproxy container image. Any custom tag overrides for those services should be altered before upgrade.

Bug Fixes

  • Set the etcd internal hostname and cacert for TLS internal enabled deployments. This allows services to work with etcd when coordination is enabled for TLS internal deployments. Without this fix, the coordination backend fails to connect to etcd and the service itself crashes.

  • Fixes a missing [taskflow] section in masakari.conf.j2. LP#1966536

  • When upgrading RabbitMQ, the policy ha-all was cleared only if rabbitmq_remove_ha_all_policy is set to true. Now, om_enable_rabbitmq_high_availability must also be set to false.

13.8.0

New Features

  • Adds the flag om_enable_rabbitmq_high_availability. Setting this to true will enable both durable queues and classic mirrored queues in RabbitMQ. Note that classic queue mirroring and transient (aka non-durable) queues are deprecated and subject to removal in RabbitMQ version 4.0 (date of release unknown). Changes the pattern used in classic mirroring to exclude some queue types. This pattern is ^(?!(amq\\.)|(.*_fanout_)|(reply_)).*.

Bug Fixes

  • Fixes the kolla_docker module, which did not take into account the common_options parameter, so the module’s default values were always used. LP#2003079

  • Fixes the baremetal role to avoid an error “apparmor_parser apparmor_parser --version failed” by installing the apparmor package on Debian-like systems. LP#2004583

  • The value of [oslo_messaging_rabbit] heartbeat_in_pthread is explicitly set to either true for wsgi applications, or false otherwise.

  • Fixes an issue with Octavia config generation when using octavia_auto_configure and the genconfig command. Note that access to the OpenStack API is necessary for Octavia auto configuration to work, even when generating config. See LP#1987299 for more details.

  • Fixes an issue where some prechecks would fail or not run when running in check mode. LP#2002657

13.7.0

Bug Fixes

  • Fixes an issue with ironic-inspector using the wrong option to configure the interface used to communicate with the Ironic API. LP#1995246

13.6.0

Upgrade Notes

  • image_upload_use_cinder_backend=True is no longer set on Cinder’s default Ceph RBD backend; the common upstream default is now used (False currently). See also LP#1991516

Bug Fixes

  • image_upload_use_cinder_backend=True is no longer set on Cinder’s default Ceph RBD backend. Related ERRORs and WARNINGs in Cinder and Glance logs are prevented. LP#1991516

  • Fixes Keystone OIDC failing to validate JWT because of a missing key on the Azure auth-oidc endpoint. Adds a new variable containing the JWKS URI that delivers missing keys. LP#1990375

  • Removes the dhcp-sequential-ip configuration option from ironic_dnsmasq to avoid a race condition offering the same IP address to multiple hosts being inspected at the same time.

13.5.0

Bug Fixes

  • Fixes an issue with AlertManager external Web URL being unconfigurable. A new variable prometheus_alertmanager_external_url has been introduced that users can use to set web.external-url to a public URL.

  • Under extended disruption to the Fluentd-ElasticSearch central logging pipeline, it is possible to accumulate a buffer of unsent log data large enough that transferring it takes longer than the default Fluentd request timeout (5 seconds). The default request timeout value is raised to 60s, and made configurable using the new parameter fluentd_elasticsearch_request_timeout. LP#1983031

  • Fixes Ironic API healthchecks when backend TLS encryption is enabled. LP#1990819

  • Fixes an issue with ironic-neutron-agent using the wrong option to configure the interface used to communicate with the Ironic API. LP#1990675
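
A rough sketch of the two new variables from this release, assuming they are set in globals.yml. The URL is a hypothetical placeholder and the timeout is an illustrative value above the new 60s default:

```yaml
# globals.yml (assumed location)
prometheus_alertmanager_external_url: "https://alertmanager.example.com"  # hypothetical URL
fluentd_elasticsearch_request_timeout: "120s"                             # illustrative; default is now 60s
```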

13.4.0

Security Issues

  • Kolla Ansible used to run Ironic’s tftpd as an (unprivileged) root user. Now, it will explicitly use the nobody user.

Bug Fixes

  • Sets multipathd user_friendly_names to “no” so that os-brick can resize volumes online. Adds the ability to override multipathd config. LP#1982777

  • Fixed bug #1987982. This bug caused the database log_bin_trust_function_creators variable not to be set back to “OFF” after a Keystone upgrade.

  • Fixes an issue where ping might not be installed on some systems, causing HAProxy prechecks to fail.

  • If ironic_enabled_notification_topics is set to true, ironic_notification_level is set to info in order to ensure that Ironic actually sends out notifications.

    See bug 1969826 for details.

13.3.0

New Features

  • Adds variables to configure whether monitoring services should be exposed externally:

    • enable_grafana_external

    • enable_kibana_external

    • enable_prometheus_alertmanager_external
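
As an illustration, the three variables could be set as follows. This assumes they are boolean flags in globals.yml; the notes do not state their defaults, so the values below are examples only:

```yaml
# globals.yml (assumed location; values are examples, not defaults)
enable_grafana_external: true
enable_kibana_external: false
enable_prometheus_alertmanager_external: false
```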

Bug Fixes

  • Fixes an issue where Ironic Inspector could be configured without authentication in a multi-region environment in a region without a local Keystone service.

13.2.0

New Features

  • Adds support for configuring the OpenStack Compute API microversion used by the OpenStack exporter for Prometheus using the prometheus_openstack_exporter_compute_api_version variable. The default value is 2.1 to keep metrics unchanged when using recent exporter releases.

Bug Fixes

  • Fixes the issue of exponential growth of /run/openvswitch mounts when the kolla-toolbox container is restarted. LP#1979295

  • Fixes an issue with recovering a multi-node MariaDB Galera cluster.

  • Increases prometheus_openstack_exporter_timeout to 45 seconds to reduce the odds of scrape failures on deployments with a large number of OpenStack resources. LP#1976629

13.1.0

New Features

  • Deploys and configures a prometheus-libvirt-exporter image as part of the Prometheus monitoring stack.

  • Adds a tls_connect module to the Prometheus blackbox exporter. This can be used to test connectivity of TLS servers.

  • New switches added to control deployment of the Masakari monitors. The deployment of each type of monitor can be controlled individually via enable_masakari_instancemonitor and enable_masakari_hostmonitor. By default, both are set to true when the deployment of Masakari is enabled via enable_masakari.

  • Implements container healthchecks for ironic-neutron-agent service. See blueprint

  • Adds support for libvirt SASL authentication. It is enabled by default. LP#1964013

  • Adds support for Rocky Linux 8 as Host OS.
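
The Masakari monitor switches described above could be used as in this sketch, which assumes the variables are set in globals.yml and disables only the host monitor as an example:

```yaml
# globals.yml (assumed location)
enable_masakari: true
enable_masakari_instancemonitor: true   # default when enable_masakari is true
enable_masakari_hostmonitor: false      # example: opt out of the host monitor only
```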

Known Issues

  • Existing fluentd log rotation failed to delete old haproxy, swift, glance-tls-proxy and neutron-tls-proxy logs. These will not be deleted by the new logrotate config and will have to be removed manually.

Upgrade Notes

  • RabbitMQ’s Prometheus plugin is no longer enabled by default if Prometheus is not deployed. If external Prometheus is used, you need to turn on rabbitmq_enable_prometheus_plugin to get the old behaviour.

  • The addition of libvirt SASL authentication requires a new password in passwords.yml, libvirt_sasl_password. This may be generated using the existing kolla-genpwd and kolla-mergepwd tooling.

  • The addition of libvirt SASL authentication requires both the nova_libvirt and nova_compute containers to be updated simultaneously, using new images with the necessary Cyrus SASL dependencies, as well as configuration containing the SASL credentials.

  • It is no longer possible to override the removal of the Monasca Log Metrics service and it will be removed automatically if it hasn’t already been removed in the Wallaby release. It is up to the operator to remove any associated docker volumes.

  • Updates the default value of node_custom_config to {{ node_config }}/config, when specified using --configdir.
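
For deployments that scrape RabbitMQ from an external Prometheus, the plugin note above might translate into the following sketch. It assumes the variable is a boolean set in globals.yml:

```yaml
# globals.yml (assumed location): keep the RabbitMQ Prometheus plugin enabled
# even though Prometheus itself is not deployed by Kolla Ansible
rabbitmq_enable_prometheus_plugin: true
```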

Security Issues

  • Explicitly removes the net.ipv4.ip_forward sysctl from /etc/sysctl.conf on hosts with Neutron L3 Agent. In the absence of another source for this sysctl, it should revert to the default of 0 after the next reboot. This is a follow up to a previous change which stopped setting the sysctl, but leaves existing systems with the original value of 1 set.

    A deployer looking to more aggressively change the value may set neutron_l3_agent_host_ipv4_ip_forward to 0 using a Yoga release of Kolla Ansible. This option will be removed in future. Any deployments still relying on the previous value may set neutron_l3_agent_host_ipv4_ip_forward to 1. LP#1945453

  • Fixes an issue where the default configuration of libvirt did not use authentication for the API exposed over TCP on the internal API network. This allowed anyone with access to the internal API network read-write access to libvirt. While the internal API network is typically trusted, other services on this network generally at least require authentication.

    SASL authentication is now enabled for libvirt by default. Kolla Ansible supports libvirt TLS since the Train release, and this is recommended to provide a higher level of security. LP#1964013
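
For the net.ipv4.ip_forward note above, a deployment that still relies on the previous value could be sketched as follows. This assumes the variable is set in globals.yml; the note itself warns that the option will be removed in future:

```yaml
# globals.yml (assumed location): keep forwarding enabled on L3 agent hosts
neutron_l3_agent_host_ipv4_ip_forward: 1
```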

Bug Fixes

  • Fixes an issue with an OIDC authentication flow requiring unnecessary action from the user. Redirecting to the target IdP page now happens automatically. LP#930055

  • Removes custom value of max_allowed_secret_in_bytes in barbican.conf. The default maximum size in Barbican was doubled to avoid issues with some certificates. LP#1957795

  • Fixes deploying Zun with Cinder Ceph support. Adds support for Zun to access Cinder volumes when external Ceph is configured for Cinder. LP#1848934

  • Fixed the deployment failure of outward_rabbitmq by resolving port conflicts by customizing RabbitMQ’s prometheus.tcp.port. LP#1885106

  • Use Volume V3 API in OpenStack exporter. Volume V2 API has been removed since OpenStack Wallaby. LP#1938194

  • Fixes the copy job for the Grafana custom home dashboard file. The copy job needs to run privileged, otherwise a permission denied error occurs. LP#1947710

  • Fixes Octavia’s “Connection refused” errors by adding ovn_sb_connection to octavia.conf. LP#195011

  • Ironic API and Ironic Inspector API use separate policy files. Ironic role was updated to be able to handle both policies separately. LP#1952948

  • Continue to run all actions if one action failed in Elasticsearch curator. LP#1954720

  • Fixes missing logrotate configuration for Placement. LP#1954723

  • Fixes Nova resize failing when migration_interface is customised. LP#1956976

  • Fixes access to the console of any Zun container failing when kolla_enable_tls_external is true. This fix sets the protocol for wsproxy base_url in zun.conf according to the value of kolla_enable_tls_external. LP#1957117

  • Fixes the Register Identity Providers in OpenStack task, which was missing an = in the openstack command, causing the task to fail to register an IDP with Keystone. LP#1959022

  • Fixes Glance with Cinder iSCSI backend failing due to lack of lock_path setting. LP#1959663

  • Fixes logrotate config missing for openvswitch and prometheus services. LP#1961795

  • Fixes an issue with Ironic’s PXE components not getting updated on upgrade. LP#1963752

  • Fixes configuration of the Prometheus HTTP API URL when using the Prometheus collector in CloudKitty. LP#1961615

  • Fixes an issue with Prometheus scraping when targets’ Ansible inventory hostnames (inventory_hostname) do not resolve to reachable IP addresses. Reverts to the previous behaviour of using IP addresses to communicate with targets. The side effect of this is that target instances will again be labelled using IP addresses rather than hostnames. LP#1955563

  • Fixes the Apache WSGI configuration for the aodh service in Debuntu binary flavours. LP#1953059

  • Fixes the baremetal role to avoid an error “Unable to remove "libvirtd"”. Now the symlink /etc/apparmor.d/disable/usr.sbin.libvirtd is created by the role. LP#1960302

  • Existing fluentd log rotation failed to delete old haproxy, swift, glance-tls-proxy and neutron-tls-proxy logs. Rotation and deletion of logs are now standardised using logrotate.

  • Fixes an issue with setting up OIDC based Keystone federation against IDP that has a different response type than id_token. This can now be set using a new variable keystone_federation_oidc_response_type. LP#1959781

  • Adds back the option to configure the RabbitMQ clustering interface via Kolla. LP#1900160 <https://bugs.launchpad.net/kolla-ansible/+bug/1900160>

  • Fixes an issue seen when using Jinja2 3.1.0.

  • Fixes an issue with Masakari instance monitor when libvirt SASL is enabled. libvirt SASL was enabled by default in a recent change to Kolla Ansible. LP#1965754

  • Fixes the configuration option setting the type of endpoint used by Neutron to send requests to Placement. LP#1960503

  • Fixes a configuration issue with Node Exporter causing all file system metrics of a host to be identical. LP#1961438

  • Fixes an issue where a failure of any Nova compute service to register itself would cause only the host querying the nova API to fail. Now, only hosts that fail to register will fail the Kolla Ansible run. Alternatively, to fail all hosts in a cell when any compute service fails to register, set nova_compute_registration_fatal to true. LP#1940119

  • The Prometheus OpenStack exporters are now behind HAProxy, providing a unique time series in the Prometheus database. This also ensures that only one exporter queries the OpenStack APIs at any given time interval. With the previous behaviour each OpenStack exporter was scraped at the same time. This caused each exporter to query the OpenStack APIs simultaneously, introducing unnecessary load and duplicate time series in the Prometheus database due to the instance label being unique for each exporter. LP#1972818

  • Fixes an issue where RabbitMQ was configured to mirror classic transient queues for all services. According to the RabbitMQ documentation this is not a supported configuration, and contributed to numerous bug reports. In order to avoid making unexpected changes to the RabbitMQ cluster, it is necessary to set rabbitmq_remove_ha_all_policy to yes in order to apply this fix. This variable will be removed in the Yoga release. LP#1954925

  • Fixes an issue with Cinder upgrade where Cinder services would remain pinned to the previous release’s RPC & object versions. LP#1954932
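
Two of the fixes above are opt-in. A minimal sketch of enabling them, assuming both variables are set in globals.yml:

```yaml
# globals.yml (assumed location)
rabbitmq_remove_ha_all_policy: yes      # apply the classic transient queue mirroring fix
nova_compute_registration_fatal: true   # fail all hosts in a cell if any compute service fails to register
```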

13.0.1

Security Issues

  • Adds mitigation for the Apache Log4j2 Remote Code Execution (RCE) Vulnerability in Elasticsearch - CVE-2021-44228.

Bug Fixes

  • Only run the configure ovn in ovsdb task on ovn-controller hosts. The task will fail on hosts (like controller nodes) without a tunnel interface. LP#1953367

  • Fixes an issue where the Nova API logs were written to files ending with -wsgi.log which affected the processing of these logs in the Fluentd pipeline. LP#1950185

  • On slower nodes, the initial grafana startup could experience a timeout failure when the migrations for setting up the database took longer than expected. This has been fixed by increasing the default timeout. The timeout settings can be changed via new parameters grafana_start_first_node_delay and grafana_start_first_node_retries for the grafana role. LP#1769962

Other Notes

  • The container ironic-dnsmasq now creates the dnsmasq.log just as the container neutron-dhcp-agent. For both log files verbosity can be increased globally via openstack_logging_debug or per service via ironic_logging_debug or neutron_logging_debug variables.

13.0.0

New Features

  • Add support for Alertmanager metrics scraping in Prometheus.

  • Adds support for integrating Fluentd metrics into Prometheus. By default this is now enabled when Prometheus is enabled. This behaviour can be overridden via the enable_prometheus_fluentd_integration flag. By default the integration provides metrics relating to the processing of logs by Fluentd. These metrics can be useful for monitoring the status of the Fluentd service. Additional metrics can also be extracted from logs via custom Fluentd config.

  • Adds config parameter haproxy_nova_spicehtml5_proxy_tunnel_timeout to configure the TunnelTimeOut directive for spicehtml5proxy haproxy service.

  • Adds support for CentOS Stream 8 as a host Operating System and base container image. This is the only distribution of CentOS supported from the Wallaby release. The Victoria release will support both CentOS Linux 8 and CentOS Stream 8 hosts and images, and provides a route for migration.

  • Adds support for integration with Ceph RadosGW.

  • Supports Debian Bullseye (11) as host distribution.

  • Adds a new variable, disable_firewall, which defaults to true. If set to false, then the host firewall will not be disabled during kolla-ansible bootstrap-servers.

  • Disables usage collection (telemetry) in Kibana by default. Users still have the option to enable it via the GUI.

  • Adds support in kolla_docker module to set CgroupnsMode for Docker containers (via cgroupns_mode module param). Requires Docker 20.10. Note that pre-20.10 all containers behave as if they were run with mode host.

  • Added a new haproxy configuration variable, haproxy_host_ipv4_tcp_retries2, which allows users to modify this kernel option. This option sets the maximum number of times a TCP packet is retransmitted in established state before giving up. The default kernel value is 15, which corresponds to a duration of approximately 13 to 30 minutes, depending on the retransmission timeout. This variable can be used to mitigate an issue with stuck connections in case of VIP failover, see bug 1917068 for details.

  • Adds a kolla-ansible gather-facts command that may be used to gather Ansible host facts.

  • The haproxy-config role now allows user to set weight per haproxy’s backend. This can be achieved by setting a hostvar haproxy_{{service}}_weight in inventory file to any integer value in range from 1 to 256, so the higher the weight, the higher the load. This can be set per {{service}}. If hostvar is not specified, backend’s weight is not rendered in final haproxy configuration.

  • Adds two new variables service_images_pull_retries and service_images_pull_delay which control the behaviour of image pulling tasks. These are useful if your registry is not 100% reliable (usually due to load). The defaults have been set to 3 retries and 5 seconds delay to ensure a better default experience (these are actually Ansible defaults when task retries are enabled).

  • Implements container healthchecks for memcached services. See blueprint

  • Implements container healthchecks for rabbitmq services. See blueprint

  • Implemented container healthchecks for the following services: ceilometer, kafka, keystone-fernet, kuryr, mistral, nova-spicehtml5proxy, qdrouterd, zun. See blueprint

  • Adds two new arguments to the kolla-ansible command, --check and --diff. They are passed through directly to ansible-playbook.

  • Transitions to using system-scoped tokens when authenticating as the Keystone admin user. This is a necessary step towards being able to enable the updated oslo policies in services that allow finer grained access to system-level resources and APIs. Since Queens, the admin role is assigned to the admin user with system scope as well as in the admin project.

  • Add ability to use and enable the neutron packet logging framework.

  • Adds the ability to override the automatic detection of fluentd_version and fluentd_binary. These can now be defined as extra variables. This removes the dependency of having docker configured for config generation.

  • Added support to override rabbitmq config (erl_inetrc and rabbitmq-env.conf) in the kolla-toolbox container.

  • OVN deployment will now configure external_ids:ovn-chassis-mac-mappings to make DVR work on VLAN tenant networks.

  • Changes target names in Prometheus to user-friendly, Ansible inventory based values.

  • Adds support for passing extra runtime options to cAdvisor via prometheus_cadvisor_cmdline_extras new variable. By default system cgroups’ metrics are disabled, plus container labels don’t get exposed to Prometheus. Expensive metrics that usually should not be exported are also enforced to be disabled - consult https://github.com/google/cadvisor/blob/master/docs/runtime_options.md#metrics for a list. These defaults create savings in resources usage by both cAdvisor and Prometheus.

  • Due to the removal of the Monasca Grafana fork, the Monasca datasource is now configured in vanilla Grafana.

  • Adds support for configuring the filter and gather_subset arguments for the setup module via kolla_ansible_setup_filter and kolla_ansible_setup_gather_subset respectively. These can be used to reduce the number of facts, which can have a significant effect on performance of Ansible.

  • Adds functionality to allow passwords that are generated for Kolla Ansible to be stored in Hashicorp Vault. Use new CLI commands kolla-readpwd and kolla-writepwd to read and write Kolla Ansible passwords to a configured Hashicorp Vault kv secrets engine.

  • Adds “manila_cephfs_filesystem_name” variable to support multi-fs Ceph Pacific+ deployments.

  • It is now possible to pass multiple inventories to kolla-ansible. To do so you should specify --inventory multiple times.

  • New variable ironic_enable_keystone_integration was added. It helps to add keystone connection information into ironic.conf if we want to connect to existing keystone (not installing it at the same time).
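
A few of the new variables in this list can be sketched together as a globals.yml fragment (assumed location). The retry and delay values are the defaults stated above, written out explicitly; keeping the host firewall is shown only as an example:

```yaml
# globals.yml (assumed location)
disable_firewall: false          # keep the host firewall during bootstrap-servers
service_images_pull_retries: 3   # stated default
service_images_pull_delay: 5     # stated default, in seconds
```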

Upgrade Notes

  • Minimum supported Ansible version is now 2.10 and maximum supported is 4 (ansible-core 2.11).

  • Updates all references to Ansible facts within Kolla Ansible from using individual fact variables to using the items in the ansible_facts dictionary. This allows users to disable fact variable injection in their Ansible configuration, which may provide some performance improvement. Check for facts referenced in local configuration files, and update to use ansible_facts before disabling fact variable injection.

  • rp_filter is no longer set by Kolla Ansible by default. Users may wish to remove the related setting from kolla_sysctl_conf_path (/etc/sysctl.conf by default).

  • Kolla Ansible now defaults docker_registry_insecure to false. If you relied on the previous behaviour, please switch it back on but bear in mind the consequences as discussed in the related security note as well as the linked bug report. LP#1940547

  • To fix LP#1941940, nova_libvirt_dimensions now by default combines with nova_libvirt_default_dimensions. Please consider this when customising that variable.

  • Bumps minimum required Docker version to 18.09 and minimum required Docker Python SDK version to 3.4.1. These two are checked in prechecks.

  • CentOS Linux 8 is no longer supported as a host Operating System or base container image. CentOS users should migrate to CentOS Stream 8. The Victoria release will support both CentOS Linux 8 and CentOS Stream 8 hosts and images, and provides a route for migration.

  • Updates the default image type to source. Users wishing to deploy binary type images should set kolla_install_type to binary in globals.yml. This change is to reflect the reality that source images are tested more thoroughly and we (as OpenStack community) have better control over them.

  • Adds a new flag, docker_disable_ip_forward, which defaults to docker_disable_default_iptables_rules and is used to disable docker’s ip-forward option which makes docker set net.ipv4.ip_forward sysctl to 1. By default, docker_disable_default_iptables_rules is true, in which case docker’s ip-forward option is disabled.

    For existing hosts, this configuration change is applied when configuring docker via kolla-ansible bootstrap-servers. Docker changes the sysctl in a non-persistent manner, so it will revert to the default of 0 after a reboot, if not configured elsewhere. This should not cause a problem, since Kolla Ansible applies the sysctl where necessary. Operators may wish to perform a proactive reboot, or apply the default through other means.

  • enable_host_ntp variable is dropped per the deprecation process.

  • A new group loadbalancer is required in inventory file prior to upgrade. The loadbalancer group is a replacement for the haproxy group.

  • An HTTP server is now always deployed for Ironic conductor, while previously it was only deployed when iPXE is enabled.

    In the Xena release, Ironic removed the iSCSI driver. The recommended deploy driver is direct, which uses HTTP to transfer the disk image. This requires an HTTP server, and the simplest option is to use the one previously deployed when enable_ironic_ipxe is set to true.

  • The haproxy_single_service_listen.cfg.j2 template is not supported in haproxy roles and has been deleted.

  • Changes the default of mariadb_clustercheck_tag and mariabackup_tag from openstack_tag to mariadb_tag. This allows one variable to set the tag for all MariaDB images.

  • Updates the default value of monasca_ntp_server from external_ntp_servers[0] to 0.pool.ntp.org. This is due to the removal of the external_ntp_servers variable as part of the removal of Chrony deployment.

  • Modifies the default value of ceph_nova_user from nova to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. Users who have overridden ceph_nova_keyring to use separate keyrings for Nova and Cinder should also override ceph_nova_user to match the Nova keyring. LP#1934145

  • Changes Prometheus targets naming. This makes their names more user friendly but also creates a completely new set of time series data. New target names are taken from Ansible inventory and have the exporter port number stripped off. Any Grafana dashboard that relies on a specific, hard-coded naming pattern for the targets will stop showing metrics after the upgrade.

  • cAdvisor has now reduced number of Prometheus metrics and labels exported by default. This means that corresponding timeseries will no longer be created. If existing setup relies on these, e.g. for visualisation or alerting, they could be explicitly enabled prior to upgrading with the prometheus_cadvisor_cmdline_extras new variable. Reference for the possible options: https://github.com/google/cadvisor/blob/master/docs/runtime_options.md#metrics.

  • Modifies the default value of rabbitmq_server_additional_erl_args from an empty string to +S 2:2 +sbwt none +sbwtdcpu none +sbwtdio none.

  • Support for deployment of chrony has been removed.

  • Service containers and configuration for the Monasca Grafana service will be removed automatically. It is up to the operator to remove the related HAProxy configuration, the Monasca Grafana database, and associated Docker volumes.

  • Support for panko has been removed due to upstream retirement.

  • Removes support for Prometheus v1 deployment. Any previously deployed Prometheus v1 instances will create a conflict during an upgrade. They should be either manually stopped/removed or Prometheus v2 deployment should be disabled by setting enable_prometheus to no.

  • The Rally and tempest projects are not OpenStack services, but clients. Their images and support are removed since Xena cycle.

  • The wsrep-notify.sh script has been removed (following deprecation in Wallaby).

  • Switches default images source (docker_registry) to quay.io. The docker_namespace is also changed to openstack.kolla to match. This is to make the default experience better, especially for users in China, those deploying more than once and/or beyond the all-in-one (AIO) environment used for development, testing and evaluation. Do note for multinode and production deployments it is still recommended to use a local registry as docs suggest. LP#1942134
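
Operators who want to keep pre-upgrade behaviour for some of the changed defaults above might use a fragment along these lines. The registry address is a hypothetical placeholder for a local registry:

```yaml
# globals.yml
kolla_install_type: "binary"                  # keep deploying binary images instead of the new source default
docker_registry: "registry.example.com:5000"  # hypothetical local registry, overriding the quay.io default
```

If docker_registry is overridden, it is worth checking that docker_namespace still matches the image layout of the chosen registry, since its default changed as well.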

Deprecation Notes

  • Setting rp_filter via Kolla Ansible is deprecated.

  • Support for configuration of NTP daemon (via enable_host_ntp) is deprecated and will be removed in the next Kolla Ansible release (Xena). Please use other means of configuring NTP.

  • The Monasca Fork of Grafana is deprecated due to lack of maintenance and will be removed in the Xena release. Instead, support will be provided to allow Monasca users to migrate to the vanilla Grafana service with the Monasca datasource.

  • Support for deploying tempest and rally is deprecated and will be removed in the Xena cycle. The reason is that these are not services of an OpenStack cloud but its clients.

Critical Issues

  • Fixes a critical bug which caused Nova instances (VMs) using libvirtd (the default/usual choice) to get killed on libvirtd (nova_libvirt) container stop (and thus any restart - either manual or done by running Kolla Ansible). It was affecting Wallaby+ on CentOS, Ubuntu and Debian Buster (not Bullseye). If your deployment is also affected, please read the referenced Launchpad bug report, comment #22, for how to fix it without risking data loss. In short: fixing requires redeploying and this will trigger the bug so one has to first migrate important VMs away and only then redeploy empty compute nodes. LP#1941706

Security Issues

  • Previously, Kolla Ansible, by default (as documented in several places), configured Docker to insecure mode for the configured registry (i.e., if not using the default one). This is controlled by the docker_registry_insecure variable. If operators did not notice this quirk, they could have opened their deployments up for potential MITM attacks. See the bug report for more discussion. LP#1940547

  • Fixes net.ipv4.ip_forward not to be enabled by Kolla Ansible on the default network namespace. It was enabled on hosts with Neutron L3 Agent (thus in most common setups with OVS and/or Linux Bridge, but not OVN) and allowed, unless users had extra iptables rules to avoid that, any traffic to be accepted for forwarding (as long as it was routable and passed other checks). Users of existing setups are advised to re-evaluate whether they need this sysctl enabled and disable if not necessary. Kolla Ansible will simply no longer try to set this sysctl at all. Neutron L3 Agent handles forwarding enablement per managed namespace. LP#1945453
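
For the registry note above, operators using their own registry may prefer to state the intended mode explicitly rather than rely on a default. A minimal sketch, assuming a hypothetical registry address that serves images over TLS with a trusted certificate:

```yaml
# globals.yml (illustrative fragment, not a complete configuration)

# Hypothetical custom registry; replace with the real address.
docker_registry: "registry.example.com:4000"

# Keep Docker out of insecure mode for the configured registry.
# Set this to "yes" only for a registry that genuinely lacks TLS.
docker_registry_insecure: "no"
```

Setting the variable explicitly makes the choice visible in review and avoids depending on whichever default a given release applies.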

Bug Fixes

  • Fixes monasca-thresh to correctly submit the topology to Storm. The previous container ran the topology in local mode (within the container), and didn’t use the Storm cloud. The new container handles submitting the topology to Storm and also handles killing and replacing the topology when its configuration has changed. As a result, the monasca-thresh container is only used for submission, and exits after that’s completed. The logs for the topology will now be available in the storm worker-artifact logs. LP#1808805

  • Works around rp_filter setting issues by skipping it by default. LP#1837551

  • Fixes an issue where configuration in containers could become stale. This prevented containers with updated configuration from being restarted, e.g., if the kolla-ansible genconfig and kolla-ansible deploy-containers commands were used together. LP#1848775

  • Fixes a chronyd crash loop when the server is rebooted (Debian). LP#1915528

  • Fixed an issue when Docker was configured after startup on Debian/Ubuntu, which resulted in iptables rules being created - before they were disabled. LP#1923203

  • Fixes an issue with Octavia SSH key copying if user disabled Octavia auto configuration. LP#1927727

  • Fixes elasticsearch fluentd output being enabled when elasticsearch is not enabled. LP#1927880

  • Fixed an issue where docker python SDK 5.0.0 was failing due to missing six. A constraint was introduced to install a version lower than 5.x. LP#1928915

  • Fixes more-than-2-node RabbitMQ upgrade failing randomly. LP#1930293

  • Fixes Swift deploy when TLS enabled. Added the missing handler and corrected the container name. LP#1931097

  • Fixes missing region_name in keystone_auth sections. See bug 1933025 for details.

  • Fixes iscsid failing in current CentOS 8 based images due to pid file being needlessly set. LP#1933033

  • Fixes host bootstrap on Debian not removing the conflicting packages. It now behaves in accordance with the docs. LP#1933122

  • Fixes default Masakari host monitor config to work with other config that Kolla Ansible sets. This sets disable_ipmi_check due to restrict_to_remotes being set. It prevents the TypeError that happened when host monitor had to take action. This does not affect any functionality so far as Kolla Ansible does not manage IPMI credentials in Pacemaker. LP#1933209

  • Fixes an issue with timesync checks on deployment host. See bug 1933347 for details.

  • Fixes horizon’s healthcheck when SSL is turned on. LP#1933846

  • Fixes an issue seen when customising the Docker Yum repository URL on CentOS, where the docker_yum_gpgkey variable is not used consistently. LP#1934913

  • Fixes an issue where the SPICE console froze after a while, see LP#1938549.

  • Fixes Masakari in multi-region deployments to query Nova API in its own region. LP#1939291

  • Fixes nova’s healthchecks when upgrading from previous version. LP#1939679

  • Fixed broken kolla-toolbox container when RabbitMQ is disabled and IPv6 is used. LP#1939883

  • Fixes inability to attach devices (e.g., volumes via iSCSI/FC) to instances on Debian Bullseye. LP#1941940

  • Fixes kolla-toolbox ansible.log logging for users other than ansible. LP#1942846

  • Fixes mariadb-clustercheck not to run when there is no HAProxy. LP#1944114

  • No longer creates directories for haproxy and swift logs where they are not needed. LP#1945070

  • Fixes an issue with multinode MariaDB deployments which could fail the playbook execution on WSREP check due to the new behaviour of Galera 4. LP#1947485

  • Fixes an issue with single node MariaDB deployments with HAProxy disabled. See bug 1947534 for details.

  • Fixes the generation of wsrep_cluster_address in galera.cnf when --limit is used while deploying MariaDB nodes. LP#1947589

  • Fixes an error in the placement role which prevented deployment of the placement service when a custom policy file is used. LP#1948835

  • Fixes missing current Ansible version in the error message. LP#1948979

  • Fixes the octavia role not setting the amphora network’s gateway_ip. LP#1949260

  • Fixes an issue where kolla-ansible exits with a zero exit code when executed with a bogus command name. LP#1929397

  • Removes deprecated export_synchronous option from Designate config.

  • Fixes potential issue with Alertmanager in non-HA deployments. In this scenario, peer gossip protocol is now disabled and Alertmanager won’t try to form a cluster with non-existing other instances. LP#1926463

  • Adds a new flag, docker_disable_ip_forward, which defaults to docker_disable_default_iptables_rules and is used to disable docker’s ip-forward option which makes docker set net.ipv4.ip_forward sysctl to 1. This is to protect from creating all-forwarding hosts. LP#1931615

  • Fixes an issue when generating /etc/hosts during kolla-ansible bootstrap-servers when one or more hosts has an api_interface with dashes (-) in its name. LP#1927357

  • Fixes the container health check for the ironic_ipxe container on Debian and Ubuntu systems. LP#1937037

  • Fixes an issue with Gnocchi when gnocchi-statsd is disabled. LP#1926914

  • Fixes HAProxy prechecks when kolla_externally_managed_cert is used.

  • Fixes an issue with Magnum when TLS is enabled. LP#781062

  • Fixes an issue with config.json for neutron-server when a VMware plugin agent is used.

  • Stops Fluentd warning message when posting to Elasticsearch 7 bulk API.

  • Fixes an issue with Neutron linuxbridge ML2 agent when neutron_external_interface includes multiple interfaces. LP#1863935

  • Fixes an issue with Manila configuration which was missing a [glance] section, preventing some drivers from operating.

  • Fixes the container image used by mariabackup. It was using the mariadb image, which was deprecated in Victoria and removed in Wallaby. LP#1928129

  • Fixes an issue with default Nova configuration for Ceph where the RBD user is set to nova, but only a cinder keyring is copied. The default value of ceph_nova_user is changed to the value of ceph_cinder_user, in line with the default for ceph_nova_keyring. LP#1934145

  • Fixes an issue with Octavia deployment when using a custom service auth project. If octavia_service_auth_project is set to a project that does not exist, Octavia deployment would fail. The project is now created. LP#1922100

  • Fixes an issue where Libvirt secrets were not persisted. There are no known negative side-effects to this, however it was fixed as a precaution. LP#1821696

  • Removes “fix_cephfs_owner.yaml” which related to pre-Wallaby Manila’s use of subfolders. Post-Wallaby Manila uses cephfs volumes instead, so this file is no longer required. LP#1938285 LP#1935784

  • Removes use of “cephfs_enable_snapshots” in Manila config as this option was removed from Manila in the Wallaby release.
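
Two of the fixes in the list above introduce or change globals.yml defaults: docker_disable_ip_forward and ceph_nova_user. The fragment below is a hedged sketch of how an operator might set them explicitly. The "yes" value and the commented-out override are illustrative assumptions about a particular deployment, not recommendations from these notes.

```yaml
# globals.yml (illustrative fragment, not a complete configuration)

# Stop Docker from setting the net.ipv4.ip_forward sysctl to 1.
# When unset, this follows docker_disable_default_iptables_rules.
docker_disable_ip_forward: "yes"

# ceph_nova_user now defaults to the value of ceph_cinder_user.
# Assumption: a deployment that manages a separate nova keyring itself
# could keep the previous RBD user by overriding it:
# ceph_nova_user: "nova"
```

Hosts that forward traffic on purpose outside Neutron-managed namespaces would need net.ipv4.ip_forward enabled by other means.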

Other Notes

  • Following Cinder upstream, support for using ZFSSA with Cinder has been removed. ZFSSA was unsupported in Train and later removed in Ussuri.

  • Optimised image pulling to avoid looping over disabled services.