
Short subjects: Realtime, Futexes, and ntfs3


By Jonathan Corbet
August 16, 2021
Even in the dog days of (northern-hemisphere) summer, the kernel community is a busy place. There are many developments that show up on your editor's radar, but which, for whatever reason, do not find their way into a full-length feature article. The time has come to catch up with a few of those topics; read on for updates on the realtime patch set, the effort to reinvent futexes, and the ntfs3 filesystem.

Realtime

The realtime preemption story is a long one; it first showed up on LWN in 2004. Over the years, this work has had a significant impact on kernel development as a whole; much of what is just seen as part of the core kernel now had its origins in the realtime tree. The code around which the realtime work was initially built (the preemptible locking infrastructure) remains out of the mainline, though. Without the locking changes, the mainline is not able to offer the sort of response-time guarantees that realtime users need.

The locking infrastructure makes almost all locks, spinlocks included, into sleeping locks; that ensures that a higher-priority task can always take over the processor quickly. It is the sort of change that makes kernel developers nervous, since mistakes in this area can lead to all sorts of subtle problems. For that reason, predicting when the locking code will be merged into the mainline is a fool's game. Your editor knows this well, having confidently predicted that it would be merged within a year (in 2007).

Still, one might be tempted to think that the end might be getting closer. Realtime developer Thomas Gleixner has brought the locking infrastructure back to the mailing lists for consideration; the fifth revision of the 72-part patch set was posted on August 15. Normally configured kernels should behave about the same with these patches applied, but those configured for realtime operation will have realtime-specific versions of mutexes, wait/wound mutexes, reader/writer semaphores, spinlocks, and reader/writer locks.

Commentary on this work has slowed; there does not appear to be much in the way of objections at this point, though it must be noted that Linus Torvalds has not yet made his feelings known on the subject. Unless something surprising comes up, it might just be that the core realtime code will finally find its way into the mainline. Your editor, however, is too old, wise, and cowardly to venture a guess as to when that will happen.

A smaller step for futex2

Perhaps the number of comments on the realtime changes is low because most developers fear the prospect of digging into code of that complexity. There are, however, places in the kernel that are even more frightening; the futex subsystem is surely one of them. Futexes provide fast mutexes for user space; they started out as a simple subsystem but failed to remain that way. Over time, it has become clear that futexes could do with a number of improvements to make them better suited for current workloads and, at the same time, to move beyond the multiplexer futex() system call.
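To make the "multiplexer" description concrete: every futex operation goes through one futex() entry point, with an operation code selecting the behavior. What follows is a minimal sketch under stated assumptions, not code from any of the patches discussed here. The wrapper names futex_op(), futex_wait(), and futex_wake() are hypothetical, and the sketch assumes a Linux system where <linux/futex.h> is installed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/*
 * Hypothetical helper: glibc exports no futex() wrapper, so the call is
 * made through syscall(). The "op" argument is what makes futex() a
 * multiplexer: the same entry point waits, wakes, requeues, and more.
 */
static long futex_op(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/*
 * Sleep only if *uaddr still holds "expected"; if the value has already
 * changed, the kernel refuses to sleep and the call fails with EAGAIN.
 */
static long futex_wait(uint32_t *uaddr, uint32_t expected)
{
    return futex_op(uaddr, FUTEX_WAIT_PRIVATE, expected);
}

/* Wake at most "count" waiters on uaddr; returns the number woken. */
static long futex_wake(uint32_t *uaddr, uint32_t count)
{
    return futex_op(uaddr, FUTEX_WAKE_PRIVATE, count);
}
```

With no other thread involved, futex_wait(&word, 0) on a word holding 1 fails immediately with EAGAIN, because the value does not match, and futex_wake(&word, 1) returns 0, because nobody is waiting.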

For some time now, André Almeida has been pushing in that direction with the futex2 proposal. This work would split the futex functionality into several single-purpose system calls, support multiple lock sizes, and more. While there has been interest in this work, progress has been slow (to put it charitably); it seems as if the kernel is no closer to a new futex subsystem than it was a year or two ago.

In an attempt to push this project forward, Almeida has posted a new patch set with significantly reduced ambitions. Rather than introduce a whole new subsystem with its own system calls, this series adds exactly one system call that works with existing futexes:

    struct futex_waitv {
        uint64_t val;
        void *uaddr;
        unsigned int flags;
    };

    int futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes,
                    unsigned int flags, struct timespec *timo);

This function will cause the calling process to wait on several futexes simultaneously, returning when one or more of them can be acquired (or the timeout expires). That functionality is not supported by the current futex API, but it turns out to be especially useful for game engines, which perform significantly better when using the new system call. This documentation patch describes the new API in more detail.
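A call into such an interface might look like the sketch below. It is illustrative only and rests on several assumptions. It follows the structure layout and five-argument form found in the uapi headers of later mainline kernels (Linux 5.16 and newer), which is not identical to the prototype quoted above. The names sketch_waitv, SKETCH_FUTEX_32, and wait_on_two() are hypothetical. The fallback syscall number 449 is used only when the C library headers do not define SYS_futex_waitv. A 64-bit platform is assumed so that struct timespec matches the kernel's layout.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SYS_futex_waitv
#define SYS_futex_waitv 449      /* assumption: number used by later mainline kernels */
#endif

#define SKETCH_FUTEX_32 2        /* assumption: mirrors FUTEX_32 (32-bit futex word) */

/* Assumed layout, mirroring struct futex_waitv in later uapi headers. */
struct sketch_waitv {
    uint64_t val;                /* value the futex word is expected to hold */
    uint64_t uaddr;              /* user-space address of the futex word */
    uint32_t flags;              /* size (and optionally sharing) flags */
    uint32_t reserved;           /* must be zero */
};

/*
 * Hypothetical helper: wait on two 32-bit futex words at once, for at
 * most timeout_ms milliseconds. On success the return value is the index
 * of a futex that was woken; on failure it is -1 with errno set (EAGAIN
 * if a word did not hold its expected value, ETIMEDOUT on timeout,
 * ENOSYS if the kernel lacks the system call).
 */
static long wait_on_two(uint32_t *a, uint32_t a_expected,
                        uint32_t *b, uint32_t b_expected, long timeout_ms)
{
    struct sketch_waitv w[2] = {
        { .val = a_expected, .uaddr = (uint64_t)(uintptr_t)a,
          .flags = SKETCH_FUTEX_32 },
        { .val = b_expected, .uaddr = (uint64_t)(uintptr_t)b,
          .flags = SKETCH_FUTEX_32 },
    };
    struct timespec deadline;

    /* The timeout is an absolute time on the chosen clock. */
    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    return syscall(SYS_futex_waitv, w, 2, 0, &deadline, CLOCK_MONOTONIC);
}
```

As with a single futex wait, the kernel checks each word against its expected value before sleeping. If a holds 0 and b holds 1, wait_on_two(&a, 0, &b, 0, 50) should fail right away with EAGAIN on a kernel that supports the call, and with ENOSYS on one that does not.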

This patch set has drawn no comments in the week since it was posted. Assuming that silence implies a lack of objections rather than a lack of interest, this piece of the futex2 work might make it into a mainline release before too long. Whether the rest of the futex2 work will follow depends on how strong the use cases driving it are; if futex_waitv() solves the worst problems, there might not be much motivation to push the other changes.

Waiting for ntfs3

The kernel has long had an implementation of the NTFS filesystem, but it has always suffered from performance and functionality problems; the user community would gladly trade it for something better. By all accounts, the ntfs3 implementation posted by Konstantin Komarov is indeed something better, but it is still not clear when it will be merged; this work was first posted one year ago, and version 27 of the patch set was posted on July 29.

The delay in accepting this work is proving frustrating to users; this complaint from Neal Gompa is typical:

I know that compared to all you awesome folks, I'm just a lowly user, but it's been frustrating to see nothing happen for months with something that has a seriously high impact for a lot of people.

It's a shame, because the ntfs3 driver is miles better than the current ntfs one, and is a solid replacement for the unmaintained ntfs-3g FUSE implementation.

Torvalds has said that maybe it is time to merge this code, but that still may not happen right away.

The biggest holdup for ntfs3 at the moment would appear to be concerns about the level of development effort behind it. From the public evidence, it seems that ntfs3 is a one-person project, and that makes other filesystem developers nervous. Those developers have been reporting test failures for ntfs3 that have gone unfixed. Meanwhile, Komarov is sometimes unresponsive to questions; various comments on the version 26 posting (from early April) got no answers, for example. This sort of silence gives the impression that ntfs3 does not have a lot of effort behind it. (It's worth noting that some other developers have been happy with the level of response from Komarov.)

Unsurprisingly, the filesystem developers are unenthusiastic about the prospect of taking on a new NTFS implementation that may turn out to have serious problems and which does not come with a promise of reliable support. For ntfs3 to be merged, those fears will need to be addressed somehow. One way for that to happen, as suggested by Ted Ts'o, would be for other developers, perhaps representing one or more distributors that would like to see a better NTFS implementation in the kernel, to start contributing patches to ntfs3 and commit to helping with its maintenance going forward.


Posted Aug 16, 2021 17:26 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)


WaitForMultipleObjects, yay!

Posted Aug 16, 2021 23:23 UTC (Mon) by itsmycpu (guest, #139639)

This supports only a subset of WaitForMultipleObjects.

After reading comments on previous patch versions, I find it difficult to imagine that kernel engineers plan on accepting this one, without comment.

Posted Aug 17, 2021 17:45 UTC (Tue) by NYKevin (subscriber, #129325)

Can't you already do most of the other WaitForMultipleObjects things using some combination of select/poll/epoll, signalfd, eventfd, etc.? Or is there some weird use case where you want to mix (very lightweight) futexes with (much heavier) other synchronization/IPC/IO primitives?

Posted Aug 18, 2021 5:41 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)

You can do that (and that's what Wine does), but for simple mutexes it's about an order of magnitude slower. It's _usually_ not a big deal because WFMO is typically used in top-level event loops that run at most hundreds of times per second.

Posted Aug 18, 2021 20:26 UTC (Wed) by itsmycpu (guest, #139639)

WaitForMultipleObjects is not a good API though.

I think this comment by Thomas Gleixner still applies, even with the attempt to separate the code:





I think any such step should be conceived on a much larger scale, in a much larger context.
In the meantime, the existing futex API plus appropriate userspace code should do fine.
(Perhaps aided by a much simpler WAKE-multiple syscall that would have a much lower maintenance footprint.)

Posted Aug 18, 2021 20:30 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)


I never understood why. It's perfect for what it was designed: waiting on a few objects. It's not a replacement for highly scalable epoll or other APIs.


How would this work for the WFMO case?

Posted Aug 18, 2021 21:07 UTC (Wed) by itsmycpu (guest, #139639)



Somewhat unfortunately, I've spent a lot of time on a different website/forum to answer such questions and usually this results in exhausting discussions.
So please forgive me for not going into this once more, I understand you'd deserve a better answer. Also your question indicates I'd perhaps basically have to start at the beginning of a longer thing. Allow me to simply state my opinion here without going into details.

Posted Aug 19, 2021 9:37 UTC (Thu) by farnz (subscriber, #17727)

Got a link to a discussion of this that you've had in the past? Would be nice to understand it all.

Posted Aug 19, 2021 16:52 UTC (Thu) by itsmycpu (guest, #139639)


Well, are you asking as someone who
a) already knows about WFMO and problems with it (perceived or real), and
b) already would know how to implement wait-for-any with the existing futex API?
Or are these questions new to you?

Regarding b), you might start with the comments on the article linked above as "futex2 proposal". On a quick (re-)glance, I notice @ras, @ncm, and @pbonzini as knowing what they are talking about.

Posted Aug 19, 2021 7:53 UTC (Thu) by NYKevin (subscriber, #129325)

The problem with "a few" is that nobody knows how long their code is going to live. If it's only doing "a few" objects now, it's very tempting to just add one more object to the end of the list. I mean, that's only a O(1) slowdown to initialize the array, right? "A few" plus one is still "a few," right?

And then, once you added a new object today, that sets the precedent that it's OK to do so again tomorrow, and the next day, and then... before you know it, you're bumping up against MAXIMUM_WAIT_OBJECTS (64) and have to* start sharding it out into threads.

*Seriously, the MSDN docs explicitly recommend that solution. As a non-Windows developer, I'm appalled that that's apparently the best suggestion they could come up with.

Posted Aug 19, 2021 18:11 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

WFMO are typically used kinda like "select" statement in Go. E.g. one common usage is to support cancellation:

    object = WaitForMultipleObjects(someLock, cancelSignal);
    if (object == cancelSignal) {
        return -ERRCANCELED;
    }

Posted Aug 20, 2021 1:25 UTC (Fri) by NYKevin (subscriber, #129325)

OK, but what do you use for the main event loop?

(Assume, for the sake of argument, that this is a non-GUI application such as a server, and so you're not just pumping window messages with GetMessage().)

Posted Aug 20, 2021 3:45 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)


WFMO for the GUI apps :)


For server applications you should use either a good old thread-per-connection method or overlapped IO if you want asynchronous processing. WFMO was used in some of Ye Olde Servere Software to wait on large arrays of sockets, but that is roughly from the era when Linux only had select().

Posted Aug 20, 2021 8:07 UTC (Fri) by njs (subscriber, #40338)

The problem is that there are objects that you can *only* wait on using WFMO -- so IOCP isn't enough, you need IOCP *and* WFMO, which is a terrific hassle.

Posted Aug 20, 2021 17:09 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)

Technically, you can use WFSO or WFMO _with_ IOCP to get notified about the signaled state.

What's wrong with ntfs-3g?

Posted Aug 31, 2021 18:29 UTC (Tue) by rfjakob (guest, #95595)

Looking at https://github.com/tuxera/ntfs-3g, it does not seem unmaintained at all. Last commit yesterday!


Copyright © 2021, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds

