openstack-discuss@lists.openstack.org

November 2024

99 participants
134 discussions

[all][dev][ops][tc] Bridging the gap between community and contributing orgs
by Jeremy Stanley 08 Dec '25

The tl;dr on this is that I'm opening a broad community discussion on how we can, collectively through improved communication, better bring together our established contributors with casual and prospective contributors who struggle to find successful patterns of contribution, so that we might all support each other in order to benefit OpenStack as a whole. Read on for details...

OpenStack's TC has requested that foundation staff, when in conversation with representatives of member organizations, encourage participation in the project and collect feedback on any related challenges those organizations encounter in their attempts to do so. In order to better understand the feedback and brainstorm achievable recommendations for the broader community[*], Community Managers on the foundation staff started to work with a small focus group of established contributors with a solid understanding of what successful patterns of contribution look like. It also came to light that established contributors experience many of the same sorts of challenges, even more so if they don't have the luxury of focusing full-time upstream, and so success often comes down to knowing how to effectively navigate those challenges. The community's goal with this exercise is to improve everyone's experience, and your help is needed to do that!

The most common themes reported by those struggling to contribute are not related to tooling or workflow confusion, but instead seem to come down to basic communication challenges. Ideas so far to address these gaps include:

* Review bandwidth
  - Ensure contributors and companies understand the importance and benefits of code review
  - Incentivize and credit meaningful "+1" reviews more
  - Understand that the volume of changes and limited reviewer bandwidth often means proposals aren't reviewed quickly (or at all), and their proponents may need some additional tenacity and engagement with the team to get their work noticed
  - Utilize "review dashboards" as a way to find stale changes which have fallen through the cracks

* Review strategy and etiquette
  - Be clear about the meaning of votes on changes, especially negative ones, and set explicit timelines for things like "procedural -2" or WIP blocks when their approval needs to be delayed
  - Focus on comments and requests that affect whether or not the change can be merged after the fixes are submitted, rather than requesting trivial adjustments to an unsuitable change which has more fundamental problems
  - Reciprocate with reviews of changes from new reviewers you see leaving insubstantial reviews, demonstrating to them what deeper and more meaningful reviewing looks like

* Mentoring newcomers
  - Established contributors with sufficient bandwidth can help mentoring newcomers, potential core reviewers and new leaders
  - Mentoring can happen as part of internship programs as well as by helping newcomers determined to become active contributors; both ways should be embraced and utilized

What challenges are you facing? And, how would you improve them?

Two weeks from now, we'll have a forum session at the OpenInfra Summit Asia in Suwon[**] where our community can refine these possible approaches and integrate others, as well as bring attention to them and promote their application. I've also proposed a similar forum session for OpenInfra Days North America in Indianapolis a month after that, where we can hopefully continue this conversation.

The goal for this multi-stage effort is to improve the contribution experience for established participants, casual contributors and newcomers alike. While it's easy to jump to blaming in these situations, it will be far more productive if we can focus on the challenges and experiences that we would each like to remove or improve for ourselves and others in our community. As we find ways together to improve our efficiency, we can reduce the load on all contributors and lower the barriers for people to join and participate.

[*] https://etherpad.opendev.org/p/r.2205024e55689bccb82c20c960853cb5
[**] https://2024.openinfraasia.org/a/schedule#view=calendar&title=Bridging%20th…

-- 
Jeremy Stanley

2 14

[openstack helm] [Glance] Glance showing OSError: unable to receive chunked part when uploading image
by daniel890723＠gmail.com 08 Nov '25

Hello,

I'm testing OpenStack-Helm to install all components, but after the installation completes I can't upload an OS image through either the openstack CLI or Horizon; the image stays stuck at Queueing. When I check the Glance API log, it shows:

2024-11-07 07:31:19.362 7 INFO glance.api.v2.image_data [None req-2bd28444-196c-42e9-94ef-18443b58310e b39b42e27f8646acad1303711f1dc8b8 e4ab1a6192b9486da698efa311086c53 - - default default] Unable to create trust: no such option collect_timing in group [keystone_authtoken] Use the existing user token.
2024-11-07 07:31:19.400 7 ERROR glance.api.v2.image_data [None req-2bd28444-196c-42e9-94ef-18443b58310e b39b42e27f8646acad1303711f1dc8b8 e4ab1a6192b9486da698efa311086c53 - - default default] Failed to upload image data due to internal error: OSError: unable to receive chunked part
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi [None req-2bd28444-196c-42e9-94ef-18443b58310e b39b42e27f8646acad1303711f1dc8b8 e4ab1a6192b9486da698efa311086c53 - - default default] Caught error: unable to receive chunked part: OSError: unable to receive chunked part
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi Traceback (most recent call last):
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/common/wsgi.py", line 1302, in __call__
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     action_result = self.dispatch(self.controller, action,
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/common/wsgi.py", line 1345, in dispatch
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     return method(*args, **kwargs)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/common/utils.py", line 415, in wrapped
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     return func(self, req, *args, **kwargs)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/api/v2/image_data.py", line 299, in upload
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     with excutils.save_and_reraise_exception():
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     self.force_reraise()
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     raise self.value
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/api/v2/image_data.py", line 162, in upload
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     image.set_data(data, size, backend=backend)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/notifier.py", line 492, in set_data
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     with excutils.save_and_reraise_exception():
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     self.force_reraise()
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     raise self.value
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/notifier.py", line 443, in set_data
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     self.repo.set_data(data, size, backend=backend,
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/quota/__init__.py", line 322, in set_data
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     self.image.set_data(data, size=size, backend=backend,
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/location.py", line 596, in set_data
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     self._upload_to_store(data, verifier, backend, size)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/location.py", line 487, in _upload_to_store
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     multihash, loc_meta) = self.store_api.add_with_multihash(
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance_store/multi_backend.py", line 397, in add_with_multihash
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     return store_add_to_backend_with_multihash(
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance_store/multi_backend.py", line 479, in store_add_to_backend_with_multihash
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     (location, size, checksum, multihash, metadata) = store.add(
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance_store/driver.py", line 294, in add_adapter
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     metadata_dict) = store_add_fun(*args, **kwargs)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance_store/capabilities.py", line 176, in op_checker
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     return store_op_fun(store, *args, **kwargs)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance_store/_drivers/filesystem.py", line 764, in add
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     raise errors.get(e.errno, e)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance_store/_drivers/filesystem.py", line 746, in add
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     for buf in utils.chunkreadable(image_file,
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance_store/common/utils.py", line 73, in chunkiter
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     chunk = fp.read(chunk_size)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/common/utils.py", line 294, in read
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     result = self.data.read(i)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/common/utils.py", line 121, in readfn
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     result = fd.read(*args)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/common/format_inspector.py", line 956, in read
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     chunk = self._source.read(size)
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi   File "/var/lib/openstack/lib/python3.10/site-packages/glance/common/wsgi.py", line 1045, in read
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi     data = uwsgi.chunked_read()
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi OSError: unable to receive chunked part
2024-11-07 07:31:19.426 7 ERROR glance.common.wsgi

Ceph Cluster Status:

NAME   DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL   FSID
ceph   /var/lib/rook     3          82m   Ready   Cluster created successfully   HEALTH_OK              5cf8b9bb-13e1-4c31-b47d-9e7d92c3c05d

Ceph osd status:

ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP  META     AVAIL    %USE  VAR   PGS  STATUS  TYPE NAME
-1         0.35156         -  360 GiB  3.3 GiB  493 MiB   0 B  2.8 GiB  357 GiB  0.91  1.00    -          root default
-5         0.11719         -  120 GiB  1.1 GiB  164 MiB   0 B  964 MiB  119 GiB  0.92  1.01    -              host instance-lu-1
 2    hdd  0.11719   1.00000  120 GiB  1.1 GiB  164 MiB   0 B  964 MiB  119 GiB  0.92  1.01  195      up          osd.2
-3         0.11719         -  120 GiB  1.1 GiB  164 MiB   0 B  951 MiB  119 GiB  0.91  1.00    -              host instance-lu-2
 0    hdd  0.11719   1.00000  120 GiB  1.1 GiB  164 MiB   0 B  951 MiB  119 GiB  0.91  1.00  195      up          osd.0
-7         0.11719         -  120 GiB  1.1 GiB  164 MiB   0 B  951 MiB  119 GiB  0.91  1.00    -              host instance-lu-3
 1    hdd  0.11719   1.00000  120 GiB  1.1 GiB  164 MiB   0 B  951 MiB  119 GiB  0.91  1.00  197      up          osd.1
                       TOTAL  360 GiB  3.3 GiB  493 MiB   0 B  2.8 GiB  357 GiB  0.91
MIN/MAX VAR: 1.00/1.01  STDDEV: 0.01

Right now I'm able to create an empty volume using Horizon (I can't create a volume from the default cirros image), and I can start an instance without a volume and attach the empty volume to it (Ceph also detects when I use that volume).

Is there something I need to set when starting Glance? I'm following the OpenStack-Helm installation document to install OpenStack with Helm.
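The traceback above ends in a read of the incoming request body (`uwsgi.chunked_read()`), reached through a chunk-by-chunk loop in the `filesystem.py` store driver. The following is a minimal sketch of that general pattern, not glance_store's actual code: `chunk_iter`, `FailingSource` and `store` are hypothetical names invented for illustration. It only shows how an `OSError` raised by the source of the data surfaces inside the storing loop, which is one reason a healthy storage backend does not by itself rule out this error.

```python
import io


def chunk_iter(fp, chunk_size=65536):
    """Yield successive chunks from a file-like object until it is exhausted."""
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break
        yield chunk


class FailingSource:
    """Hypothetical stand-in for a request body whose transfer breaks midway."""

    def __init__(self, data, fail_after):
        self._buf = io.BytesIO(data)
        self._fail_after = fail_after

    def read(self, size=-1):
        # Simulate the receiving side failing once enough bytes were consumed.
        if self._buf.tell() >= self._fail_after:
            raise OSError("unable to receive chunked part")
        return self._buf.read(size)


def store(fp, chunk_size=4):
    """Consume fp in chunks; return (bytes_consumed, error_message_or_None)."""
    written = 0
    try:
        for chunk in chunk_iter(fp, chunk_size):
            written += len(chunk)  # a real driver would write the chunk here
    except OSError as exc:
        return written, str(exc)
    return written, None


if __name__ == "__main__":
    print(store(io.BytesIO(b"abcdefghij")))
    print(store(FailingSource(b"abcdefghij", fail_after=8)))
```

In this sketch the failure comes from the reader, not from the code that stores the chunks, so the place to look would be whatever sits between the client and the Glance API process. That reading of the log is an inference from the traceback, not something the post confirms.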

5 8

[openstack][kolla-ansible]about group node config overwrite
by Nguyễn Hữu Khôi 29 Oct '25

Hello Kollers,

I want to ask whether we can override the configuration for a group of nodes. For example:

compute01 >> compute10 will have the same nova configuration
compute11 >> compute20 will have the same nova configuration

It would be very nice if this were supported.

Thank you. Regards
Nguyen Huu Khoi

4 7

[i18n][infra][horizon] Zanata translation job failures after Python 3.8 deprecation
by Ian Y. Choi 09 Jul '25

Hi all,

During last week's PTG, the I18n SIG identified Zanata translation job failures for Horizon (and likely related plugins). The root cause appears to be the Python 3.8 deprecation in Horizon [1].

Failed jobs have been observed on Zuul [2], and translation proposals were made on Gerrit [3].

Since translation sync on Horizon stable branches is functioning correctly, the error message seems to be related to the Python 3.8 deprecation:

  ERROR: Cannot install -r doc/requirements.txt (line 3) due to conflicting package version dependencies.

To address this, I'd like to raise two points:

1/ To resolve the issue, would you support updating the translation sync jobs, which currently run on Bionic-based Zuul jobs, by either upgrading to Jammy or modifying the Bionic image to support Python >3.8?

2/ Speaking at least for myself in the I18n SIG, I acknowledge that we were slow to notice these translation job errors. Moving forward, would it be beneficial to implement periodic (e.g., monthly) reporting on translation job successes and failures?

Thank you,

/Ian

[1] https://docs.openstack.org/releasenotes/horizon/2024.2.html#upgrade-notes
[2] https://zuul.opendev.org/t/openstack/builds?job_name=propose-translation-up…
[3] https://review.opendev.org/q/owner:proposal-bot+project:openstack/horizon+b…

2 3

Problems with the Placement service
by Ron Gage 24 Jun '25

Hi all:

I must have done something wrong in the installation of OpenStack Placement. I have been through the steps to install Placement multiple times now. I am using the Caracal release. I am on CentOS Stream 9.

Everything seems to be installed correctly. Indeed, the upgrade check returns no errors:

[root@cloud ~]# placement-status upgrade check
+-------------------------------------------+
| Upgrade Check Results                     |
+-------------------------------------------+
| Check: Missing Root Provider IDs          |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+
| Check: Incomplete Consumers               |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+
| Check: Policy File JSON to YAML Migration |
| Result: Success                           |
| Details: None                             |
+-------------------------------------------+
[root@cloud ~]#

Verifying operation, however, returns nothing:

[root@cloud ~]# openstack --os-placement-api-version 1.2 resource class list
Could not load 'volume_backup_unset': module 'openstackclient.volume.v2.volume_backup' has no attribute 'UnsetVolumeBackup'
Expecting value: line 1 column 1 (char 0)

The placement service has been defined (the host name is "cloud", not "controller").

[root@cloud ~]# openstack endpoint list
Could not load 'volume_backup_unset': module 'openstackclient.volume.v2.volume_backup' has no attribute 'UnsetVolumeBackup'
+----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------+
| ID                               | Region    | Service Name | Service Type | Enabled | Interface | URL                    |
+----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------+
| 0e81501847db4aa38c864a2329a43052 | RegionOne | glance       | image        | True    | admin     | http://cloud:9292      |
| 171c34b267674e798f625a811e87fcf8 | RegionOne | keystone     | identity     | True    | admin     | http://cloud:5000/v3/  |
| 209386e7cddd44bbbbd5d5b3b9aa4ff5 | RegionOne | nova         | compute      | True    | internal  | http://cloud:8774/v2.1 |
| 2f9d477115924b24a12204f54b1d6ddb | RegionOne | placement    | placement    | True    | internal  | http://cloud:8778      |
| 58eb375281724e7998c0b6c8015302c4 | RegionOne | keystone     | identity     | True    | public    | http://cloud:5000/v3/  |
| 618b400dc67c4038a6f62540f97d4246 | RegionOne | glance       | image        | True    | internal  | http://cloud:9292      |
| 77843d0ddbac4cacb805cefd1c029544 | RegionOne | glance       | image        | True    | public    | http://cloud:9292      |
| 8a8cca4fee32461fbe7836e756751df5 | RegionOne | keystone     | identity     | True    | internal  | http://cloud:5000/v3/  |
| a01d89f6ea624e09af9d08d85e9bef40 | RegionOne | nova         | compute      | True    | admin     | http://cloud:8774/v2.1 |
| d06d6b75e9bf4014b4de07df0bce8dd1 | RegionOne | placement    | placement    | True    | public    | http://cloud:8778      |
| dfd618e52cc54af6bcf7afbf1a3a32cd | RegionOne | nova         | compute      | True    | public    | http://cloud:8774/v2.1 |
| f6c41b2d7ba946908da5f0ed4d591c2a | RegionOne | placement    | placement    | True    | admin     | http://cloud:8778      |
+----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------+

What on earth could I have messed up to prevent the Placement API from working? Yes, I have restarted httpd.

How do I know that Placement isn't working? Nova Conductor and Nova Scheduler can't connect to it (the firewall is off and the iptables rules are flushed) and therefore won't start properly. I can telnet to port 8778, yet the Nova services claim they can't connect.

Thank you for your help!

Ron Gage
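One detail in the output above may help narrow things down: "Expecting value: line 1 column 1 (char 0)" is the message Python's standard `json` module produces when it is asked to decode an empty string or text that is not JSON at all (an HTML page, for instance). The short sketch below reproduces that message; `parse_body` is a hypothetical helper written for illustration and is not part of any OpenStack client.

```python
import json


def parse_body(body):
    """Return (parsed, error) for an HTTP response body expected to be JSON."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        return None, str(exc)


if __name__ == "__main__":
    # An empty body and an HTML page fail in the same way.
    print(parse_body(""))
    print(parse_body("<html><body>It works!</body></html>"))
    # A JSON body parses normally.
    print(parse_body('{"resource_classes": []}'))
```

If that is what happened here, then something answered on port 8778 (consistent with telnet succeeding) but did not return a JSON document. Looking at the raw response from that URL would show what was actually served; this is a guess based on the error text alone.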

5 7

Fwd: DPDK+OVS with OpenStack
by Mark Wittling 23 Apr '25

Looking for someone who knows OpenStack with OpenVSwitch, and in addition to that, DPDK with OpenStack and OVS.

I am using OpenStack Queens, with OpenVSwitch. The architecture I am using is documented here:
https://docs.openstack.org/neutron/queens/admin/deploy-ovs-provider.html

The OVS I am using on the Compute Node is compiled with DPDK, and I have set the datapath to netdev (DPDK) on br-prv (provider network bridge) and br-tun (tunneling bridge). But these two bridges, br-tun and br-prv, are patched into another OpenStack bridge, called br-int. I wasn't actually sure about whether to tinker with this bridge, and wondered what datapath it was using.

Then, I realized there is a parameter in the openvswitch_agent.ini file, which I will list here:

# OVS datapath to use. 'system' is the default value and corresponds to the
# kernel datapath. To enable the userspace datapath set this value to 'netdev'.
# (string value)
# Possible values:
# system - <No description provided>
# netdev - <No description provided>
#datapath_type = system
datapath_type = netdev

So in tinkering with this, what I realized is that when you set this datapath_type to system or netdev, it will adjust the br-int bridge to that datapath type.

So here is my question. How can I launch a non-DPDK VM, if all of the bridges are using the netdev datapath type?

Here is another question. What if one of the flavors doesn't have the largepages property set on it? I assumed OpenStack would revert to a system datapath and not use DPDK for those VM interfaces. Well, I found out in testing that this is not the case. If you set all your bridges up for netdev, and you don't set the property on the Flavor of the VM (largepages), the VM will launch, but it simply won't work.

Is there no way, on a specific Compute Host, to support both DPDK (netdev datapaths) and non-DPDK (system datapaths)?

Either on a VM interface level (VM has one interface that is netdev DPDK and another that is system datapath non-DPDK)?

Or on a VM by VM basis (VM 1 has 1 or more netdev datapath interfaces and VM 2 has 1 or more system datapath interfaces)?

Am I right here? Once you set up a Compute Host for DPDK, it's DPDK or nothing on that Compute Host?

3 2

How to config DPDK compute in OVN
by dangerzone ar 22 Apr '25

Hi,

As far as I understand, OVS does support deploying DPDK compute nodes. I'm referring to the guide at https://docs.openstack.org/networking-ovn/queens/admin/dpdk.html, but it's too brief.

May I know how to configure a DPDK compute node with OVN networking in OpenStack? I hope someone can help with what configuration needs to be done.

Currently the compute nodes run without DPDK on OpenStack Train with OVN. I would appreciate any information on how I can proceed.

Thank you

2 1

[oslo] python-consul2 is unmaintained
by Takashi Kajinami 21 Apr '25

Hello,

I recently noticed that the python-consul2 library[1], which is used by the consul coordination driver in tooz, hasn't been updated for 3+ years. Looking at issues in the project, there is an indication that the library is not compatible with the ACL API in consul >= 1.11[2]. Another fork has been created[3], and it includes fixes to support the new API endpoints.

Although the driver in tooz does not directly use the feature, I think it's better to discuss our plan to mitigate the risks caused by the unmaintained library.

As far as I can think of, there are three options now:

 1. Deprecate the consul driver and remove it
 2. Keep using python-consul2 for now. At least basic functionality is proven to work by our functional tests (note that the tests do not cover all configuration patterns)
 3. Replace python-consul2 with py-consul

Before we start the overall discussion, I'd like to ask the wider audience a few questions, as input:

 - Is anyone using the consul coordination driver now?
 - Does anyone have experience with python-consul2 and/or py-consul?

Note that this discussion affects the http messaging driver[4] currently proposed, because it also requires a client library to communicate with Consul. Actually, I became aware of the situation when I noticed that the requirements check job is failing because the library is not part of the global requirements list...

Thank you,
Takashi Kajinami

[1] https://github.com/poppyred/python-consul2
[2] https://github.com/poppyred/python-consul2/issues/42
[3] https://github.com/criteo-forks/py-consul
[4] https://review.opendev.org/c/openstack/oslo.messaging/+/912499

-- 
Takashi Kajinami
irc: tkajinam
github: https://github.com/kajinamit
launchpad: https://launchpad.net/~kajinamit

2 3

[octavia] Purging ERROR'ed amphora and load balancers stuck in transitional states
by Paul Browne 11 Apr '25

Hello Octavia users,

For a while I've been trying to find a systematic way to remove Octavia LBs+Amphora that have become stuck in ERROR or transitional PENDING_* states:

$ openstack loadbalancer amphora list -f value | grep ERROR
a2323f14-3aaa-418a-8249-111aaa9c21fe 1fa7bd54-f60c-420c-94f3-d4c02f03d4fe ERROR MASTER 10.8.0.242 192.168.3.247
e5b236ba-e7ee-4ed7-9f58-57ce7a408489 1fa7bd54-f60c-420c-94f3-d4c02f03d4fe ERROR BACKUP 10.8.0.190 192.168.3.247
6b556f28-93c9-49dd-b6ee-4379288e7957 d5e402fe-2c4b-49af-a700-532cb408cee5 ERROR MASTER 10.8.0.39 192.168.3.126
c669db5d-8686-4d5c-9e95-e02030b34301 d5e402fe-2c4b-49af-a700-532cb408cee5 ERROR BACKUP 10.8.0.174 192.168.3.126

$ openstack loadbalancer list -f value | grep -vi active
1fa7bd54-f60c-420c-94f3-d4c02f03d4fe k8s-clusterapi-cluster-default-ci-6386871107-kube-upgrade-kubeapi 3a06571936a0424bb40bc5c672c4ccb1 192.168.3.247 PENDING_UPDATE amphora
d5e402fe-2c4b-49af-a700-532cb408cee5 k8s-clusterapi-cluster-default-ci-6386871107-latest-kubeapi 3a06571936a0424bb40bc5c672c4ccb1 192.168.3.126 PENDING_DELETE amphora

These resources are marked immutable and so cannot be failed over or deleted:

$ openstack loadbalancer amphora failover a2323f14-3aaa-418a-8249-111aaa9c21fe
Load Balancer 1fa7bd54-f60c-420c-94f3-d4c02f03d4fe is immutable and cannot be updated. (HTTP 409) (Request-ID: req-6e66c4e8-c3d3-4549-a03c-367017c8c8b3)

$ openstack loadbalancer failover 1fa7bd54-f60c-420c-94f3-d4c02f03d4fe
Invalid state PENDING_UPDATE of loadbalancer resource 1fa7bd54-f60c-420c-94f3-d4c02f03d4fe (HTTP 409) (Request-ID: req-84b44212-e7a8-4101-a16f-18c774c0577e)

The backing Nova instances for these Amphora do seem to exist and be in good working order.

Is there any API way to purge these out of Octavia's service state, or would (very careful) DB hackery be required here?

Many thanks,
Paul Browne

*******************
Paul Browne
Research Computing Platforms
University Information Services
Roger Needham Building
JJ Thompson Avenue
University of Cambridge
Cambridge
United Kingdom
E-Mail: pfb29(a)cam.ac.uk<mailto:pfb29@cam.ac.uk>
Tel: 0044-1223-746548
*******************
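For anyone wanting to take stock of such resources before deciding how to clean them up, here is a small read-only sketch that groups the ERROR amphorae from the listing above by their parent load balancer. It only parses text in the shape shown in the post; the column order (amphora ID, load balancer ID, status, role) is taken from that output, and `error_amphorae_by_lb` is a hypothetical helper name. It does not call the Octavia API and changes nothing.

```python
from collections import defaultdict

# Sample copied from the `openstack loadbalancer amphora list -f value` output in the post.
AMPHORA_LIST = """\
a2323f14-3aaa-418a-8249-111aaa9c21fe 1fa7bd54-f60c-420c-94f3-d4c02f03d4fe ERROR MASTER 10.8.0.242 192.168.3.247
e5b236ba-e7ee-4ed7-9f58-57ce7a408489 1fa7bd54-f60c-420c-94f3-d4c02f03d4fe ERROR BACKUP 10.8.0.190 192.168.3.247
6b556f28-93c9-49dd-b6ee-4379288e7957 d5e402fe-2c4b-49af-a700-532cb408cee5 ERROR MASTER 10.8.0.39 192.168.3.126
c669db5d-8686-4d5c-9e95-e02030b34301 d5e402fe-2c4b-49af-a700-532cb408cee5 ERROR BACKUP 10.8.0.174 192.168.3.126
"""


def error_amphorae_by_lb(text):
    """Map load balancer ID -> list of (amphora ID, role) for amphorae in ERROR."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        amphora_id, lb_id, status, role = fields[:4]
        if status == "ERROR":
            grouped[lb_id].append((amphora_id, role))
    return dict(grouped)


if __name__ == "__main__":
    for lb_id, amphorae in error_amphorae_by_lb(AMPHORA_LIST).items():
        print(lb_id)
        for amphora_id, role in amphorae:
            print(f"  {role:<6} {amphora_id}")
```

With the sample data, both load balancers from the second listing (the PENDING_UPDATE and PENDING_DELETE ones) turn out to own one MASTER and one BACKUP amphora in ERROR.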

4 6

GPU PCI passthrough woes.
by Andy Speagle 09 Apr '25

Hey Folks,

I could use a little assistance getting GPU passthrough working. I had this working already for one flavor of nvidia gpu... and I've added some hosts with a much newer gpu...

I've updated the pci_alias and pci_passthrough variables and those seem to be getting set properly in nova.conf:

passthrough_whitelist = [{"vendor_id":"10de","product_id":"1b06"},{"vendor_id":"10de", "product_id":"26b9"}]
alias = {"name": "gpu", "product_id": "1b06", "vendor_id": "10de"}
alias = {"name": "gpu-l40s", "product_id": "26b9", "vendor_id": "10de"}

I believe I have all of the iommu stuff configured and have the pci-stub module entries... dmesg output shows that the GPUs are being claimed by the stub module.

$ openstack flavor show t1.small_gpu_l40s
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| access_project_ids         | None                                 |
| description                | None                                 |
| disk                       | 0                                    |
| id                         | af70c94e-0026-4a39-bc1e-dfb93b286a54 |
| name                       | t1.small_gpu_l40s                    |
| os-flavor-access:is_public | True                                 |
| properties                 | pci_passthrough:alias='gpu-l40s:1'   |
| ram                        | 2048                                 |
| rxtx_factor                | 1.0                                  |
| swap                       | 0                                    |
| vcpus                      | 1                                    |
+----------------------------+--------------------------------------+

Yet... I can't seem to get an instance to run using that new flavor... it keeps complaining about there not being enough hosts available.

Fault:
  code: 500
  created: 2024-10-30T02:04:33Z
  message: "No valid host was found. There are not enough hosts available."
  details: |
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 1580, in schedule_and_build_instances
        host_lists = self._schedule_instances(context, request_specs[0],
      File "/usr/lib/python3/dist-packages/nova/conductor/manager.py", line 940, in _schedule_instances
        host_lists = self.query_client.select_destinations(
      File "/usr/lib/python3/dist-packages/nova/scheduler/client/query.py", line 41, in select_destinations
        return self.scheduler_rpcapi.select_destinations(context, spec_obj,
      File "/usr/lib/python3/dist-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations
        return cctxt.call(ctxt, 'select_destinations', **msg_args)
      File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/client.py", line 189, in call
        result = self.transport._send(
      File "/usr/lib/python3/dist-packages/oslo_messaging/transport.py", line 123, in _send
        return self._driver.send(target, ctxt, message,
      File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 689, in send
        return self._send(target, ctxt, message, wait_for_reply, timeout,
      File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 681, in _send
        raise result
    nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available.
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/server.py", line 241, in inner
        return func(*args, **kwargs)
      File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 223, in select_destinations
        selections = self._select_destinations(
      File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 250, in _select_destinations
        selections = self._schedule(
      File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 416, in _schedule
        self._ensure_sufficient_hosts(
      File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 455, in _ensure_sufficient_hosts
        raise exception.NoValidHost(reason=reason)
    nova.exception.NoValidHost: No valid host was found. There are not enough hosts available.

Any clues on how/where to dig into this more to see what might be missing? Thanks.

-- 
Andy Speagle
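As a first sanity check, the three pieces of configuration quoted above can be compared against each other: the flavor's alias request, the alias definitions, and the whitelist. The sketch below does only that, using the values copied from the post. `check_alias_request` is a hypothetical helper, not a Nova function, and it says nothing about how Nova itself evaluates these settings.

```python
import json

# Values copied from the nova.conf excerpt and flavor properties in the post.
PASSTHROUGH_WHITELIST = json.loads(
    '[{"vendor_id":"10de","product_id":"1b06"},{"vendor_id":"10de", "product_id":"26b9"}]'
)
ALIASES = [
    json.loads('{"name": "gpu", "product_id": "1b06", "vendor_id": "10de"}'),
    json.loads('{"name": "gpu-l40s", "product_id": "26b9", "vendor_id": "10de"}'),
]
FLAVOR_ALIAS_REQUEST = "gpu-l40s:1"


def check_alias_request(request, aliases, whitelist):
    """Check that 'name:count' refers to a defined alias whose device is whitelisted."""
    name, _, _count = request.partition(":")
    alias = next((a for a in aliases if a["name"] == name), None)
    if alias is None:
        return f"alias {name!r} is not defined"
    device = (alias["vendor_id"], alias["product_id"])
    allowed = {(w["vendor_id"], w["product_id"]) for w in whitelist}
    if device not in allowed:
        return f"device {device} is not whitelisted"
    return "ok"


if __name__ == "__main__":
    print(check_alias_request(FLAVOR_ALIAS_REQUEST, ALIASES, PASSTHROUGH_WHITELIST))
```

With the quoted values the names and IDs line up, so a typo in these three settings looks unlikely. The sketch cannot see anything outside the quoted text, such as which services have the alias configured or what device type the compute host reports for the new GPU, so those remain open questions.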

3 2
