- What I did
While running a CSI volume plugin that supports staging, I created a new swarm service that triggered an attempt to unpublish a cluster volume. The node agent calls NodeUnpublishVolume, which returns no errors, then NodeUnstageVolume, which also returns no errors. However, NodeUnstageVolume was never called on the underlying plugin driver.

- How I did it
Fixed the unpublish check so that it returns early only in the failing case, where the volume has not been unpublished.

- How to test it
Create a cluster volume using a CSI volume plugin, then trigger a publish and unpublish by creating a swarm service and restarting the service.

- Description for the changelog

CSI node plugin fix for unstaging volumes

13f03ac

Signed-off-by: Beorn Facchini <beornf@gmail.com>

beornf commented Jun 12, 2024

Similarly, this early return in NodeUnpublishVolume skips the log, which would be helpful in debugging:

swarmkit/agent/csi/plugin/plugin.go

Lines 366 to 373 in ea1a7ce

	if v, ok := np.volumeMap[req.ID]; ok {
		v.publishedPath = ""
		v.isPublished = false
		return nil
	}

	log.G(ctx).Info("volume unpublished")
	return nil
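As a rough illustration of the point above, the following is a minimal, self-contained sketch of an unpublish path that clears the tracked state and still reaches the log line. The names `volumeStatus`, `unpublish`, and the `logf` callback are hypothetical stand-ins for this example, not swarmkit's actual types or logging API.

```go
package main

import "fmt"

// volumeStatus is a hypothetical stand-in for swarmkit's volumePublishStatus.
type volumeStatus struct {
	publishedPath string
	isPublished   bool
}

// unpublish clears the tracked publish state when the volume is known,
// then always emits the log line instead of returning from inside the if block.
func unpublish(volumes map[string]*volumeStatus, id string, logf func(string)) error {
	if v, ok := volumes[id]; ok {
		v.publishedPath = ""
		v.isPublished = false
	}

	logf("volume unpublished")
	return nil
}

func main() {
	volumes := map[string]*volumeStatus{
		"vol1": {publishedPath: "/mnt/vol1", isPublished: true},
	}
	var logs []string
	if err := unpublish(volumes, "vol1", func(msg string) { logs = append(logs, msg) }); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(volumes["vol1"].isPublished, logs)
}
```

With the early return removed, the log line is recorded whether or not the volume is present in the map.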

beornf mentioned this pull request

Jun 13, 2024

CSI volume bugs in Docker Swarm moby/moby#47974

Open

thaJeztah reviewed

Oct 18, 2024


thaJeztah left a comment


@dperny you're more familiar with this code; PTAL 🤗

agent/csi/plugin/plugin.go

Comment on lines +266 to 269

	if v, ok := np.volumeMap[req.ID]; ok && v.isPublished {
		return status.Errorf(codes.FailedPrecondition, "Volume %s is not unpublished", req.ID)
	}

thaJeztah commented Oct 18, 2024

I'm not super-familiar with this code, but should this actually proceed if the volume was not found in np.volumeMap?

i.e., the volume should be present if it was successfully staged. If a failure happened during staging, it wouldn't be added. Looking at NodeStageVolume further up:

swarmkit/agent/csi/plugin/plugin.go

Lines 231 to 239 in e8ecf83

	if err != nil {
		return err
	}

	v := &volumePublishStatus{
		stagingPath: stagingTarget,
	}

	np.volumeMap[req.ID] = v

Perhaps this should be something like:

	if v, ok := np.volumeMap[req.ID]; !ok || v.isPublished {
		// volume not found or is still published
		return status.Errorf(codes.FailedPrecondition, "Volume %s is not unpublished", req.ID)
	}

Or, if it would be useful to have distinct errors for each situation:

	v, ok := np.volumeMap[req.ID]
	if !ok {
		return status.Errorf(codes.FailedPrecondition, "Volume %s not found", req.ID)
	}
	if v.isPublished {
		return status.Errorf(codes.FailedPrecondition, "Volume %s is not unpublished", req.ID)
	}

beornf commented Oct 18, 2024

Indeed, the assumptions around state for unstaging would always hold true if np.volumeMap was made persistent:

swarmkit/agent/csi/plugin/plugin.go

Lines 60 to 63 in ea1a7ce

	// volumeMap is the map from volume ID to Volume. Will place a volume once it is staged,
	// remove it from the map for unstage.
	// TODO: Make this map persistent if the swarm node goes down
	volumeMap map[string]*volumePublishStatus

I recall that during testing it was possible for a volume to have been staged before the node daemon restarted, leaving np.volumeMap empty. The NodeUnpublishVolume method always unpublishes the volume irrespective of np.volumeMap.
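The restart scenario described above can be modeled in a few lines. This is only a sketch under stated assumptions: `node`, `fakeDriver`, `stage`, `restart`, and `unstage` are hypothetical names, the driver is a counter rather than a real CSI plugin, and the check inside `unstage` mirrors the one shown in the diff.

```go
package main

import "fmt"

// volumeStatus is a hypothetical stand-in for swarmkit's volumePublishStatus.
type volumeStatus struct{ isPublished bool }

// fakeDriver counts unstage calls so it is visible whether the driver was reached.
type fakeDriver struct{ unstageCalls int }

func (d *fakeDriver) NodeUnstageVolume(id string) { d.unstageCalls++ }

// node holds the in-memory volume map alongside the (fake) plugin driver.
type node struct {
	volumes map[string]*volumeStatus
	driver  *fakeDriver
}

// stage records the volume in the in-memory map, as happens after a successful stage.
func (n *node) stage(id string) { n.volumes[id] = &volumeStatus{} }

// restart simulates a daemon restart: the in-memory map is lost,
// while whatever the driver staged is still there.
func (n *node) restart() { n.volumes = map[string]*volumeStatus{} }

// unstage refuses only when the volume is tracked and still published;
// otherwise it calls the driver and forgets the volume.
func (n *node) unstage(id string) error {
	if v, ok := n.volumes[id]; ok && v.isPublished {
		return fmt.Errorf("volume %s is not unpublished", id)
	}
	n.driver.NodeUnstageVolume(id)
	delete(n.volumes, id)
	return nil
}

func main() {
	n := &node{volumes: map[string]*volumeStatus{}, driver: &fakeDriver{}}
	n.stage("vol1")
	n.restart() // the map is now empty, but vol1 is still staged by the driver
	err := n.unstage("vol1")
	fmt.Println("error:", err, "driver unstage calls:", n.driver.unstageCalls)
}
```

In this model the driver is still asked to unstage a volume that was staged before the restart, whereas a check that rejected unknown IDs would leave it staged until the map is made persistent.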

olljanat commented Nov 23, 2024

@beornf A new test case for this would be nice, as the CSI logic is still quite new and bugs like this exist. You can find examples in my PRs #3116 and #3123.

CSI node plugin fix for unstaging volumes #3178