At the lowest level, pages are a concept implemented by the hardware; the tracking of memory and whether it is present in RAM or not is done at page granularity. Any given CPU architecture may offer a limited selection of page sizes, but one "base" page size must be chosen, and the most common choice remains 4,096 bytes, the same as it was when the first Linux kernels were released 30 years ago.
The kernel, though, often has reason to work with memory in larger chunks. One example is the management of "huge pages" which, once again, are implemented by the hardware. The x86 architecture, for example, can work with 2MB huge pages, and there are performance advantages to using them where they are applicable. The kernel will also allocate groups of pages in other sizes, though, typically for DMA buffers or other uses where a set of physically contiguous pages is needed. This sort of grouping of pages is known as a "compound page" in the kernel.
Every base page of memory managed by the kernel is represented by a page structure in the system memory map. When a compound page is created out of a set of base pages, the page structure for the first page in the set (the "head page") is specially marked to make its compound nature explicit. The other information in that structure refers to the compound page as a whole. All of the other pages (the "tail pages") are marked as such, with a pointer to the page structure for the associated head page. See this article for details on how compound pages are organized.
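As a rough illustration of the marking scheme just described, the following userspace sketch models a compound page with a toy structure. The names toy_page, toy_make_compound(), and toy_is_tail(), along with every field, are assumptions made for this example; the kernel's real page structure encodes the same information differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's page structure. The field names are
 * illustrative assumptions, not the real layout. */
struct toy_page {
    bool is_head;           /* set only on the first page of a compound page */
    unsigned int order;     /* on a head page: the group spans 1 << order base pages */
    struct toy_page *head;  /* on a tail page: the associated head page; NULL otherwise */
};

/* Turn pages[0 .. (1 << order) - 1] into a single compound page. */
static void toy_make_compound(struct toy_page *pages, unsigned int order)
{
    size_t nr = (size_t)1 << order;

    assert(order >= 1);     /* a compound page groups at least two base pages */

    /* The head page is marked and describes the compound page as a whole. */
    pages[0].is_head = true;
    pages[0].order = order;
    pages[0].head = NULL;

    /* Every tail page is marked by pointing back at the head page. */
    for (size_t i = 1; i < nr; i++) {
        pages[i].is_head = false;
        pages[i].order = 0;
        pages[i].head = &pages[0];
    }
}

/* In this model, a tail page is any page that points at a head page. */
static bool toy_is_tail(const struct toy_page *page)
{
    return page->head != NULL;
}
```

Calling toy_make_compound() on an array of four toy pages with an order of 2 leaves the first entry marked as the head and the remaining three pointing back at it.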
This mechanism makes it easy to go from the page structure of a tail page to the head page for the compound page. Many interfaces within the kernel make use of that feature, but it creates a fundamental ambiguity: if a function is passed a pointer to a page structure for a tail page, is it expected to act on that tail page or on the compound page as a whole? Or, as Wilcox put it in the first posting of the folio series in December:
A function which has a struct page argument might be expecting a head or base page and will BUG if given a tail page. It might work with any kind of page and operate on PAGE_SIZE bytes. It might work with any kind of page and operate on page_size() bytes if given a head page but PAGE_SIZE bytes if given a base or tail page. It might operate on page_size() bytes if passed a head or tail page. We have examples of all of these today.
(PAGE_SIZE is the size of a base page, while page_size() returns the full size of a page, which may be compound.) There does not seem to be an extensive history of bugs resulting from this particular API, but an interface that is this poorly defined seems likely to encourage problems sooner or later.
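The ambiguity can be made concrete with a small userspace sketch. Everything here is a toy model: TOY_PAGE_SIZE, toy_page, toy_page_size(), and the two toy_bytes_*() helpers are hypothetical names that only mimic two of the behaviors listed in the quote, and they share nothing with the kernel's implementation beyond the idea.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL    /* base page size used in this sketch */

/* Toy stand-in for the kernel's page structure (assumed fields). */
struct toy_page {
    unsigned int order;         /* meaningful on a head or base page only */
    struct toy_page *head;      /* non-NULL on a tail page */
};

/* Full size of a head or base page; refuses tail pages. */
static unsigned long toy_page_size(const struct toy_page *page)
{
    assert(page->head == NULL);
    return TOY_PAGE_SIZE << page->order;
}

/* One reading of a page argument: always act on a single base page. */
static unsigned long toy_bytes_base_only(const struct toy_page *page)
{
    (void)page;
    return TOY_PAGE_SIZE;
}

/* Another reading: act on the whole compound page, even when handed a tail. */
static unsigned long toy_bytes_whole(const struct toy_page *page)
{
    const struct toy_page *head = page->head ? page->head : page;

    return toy_page_size(head);
}
```

Both helpers take the same argument type, so nothing in the signature tells a caller which one it is dealing with. For a tail page of a 2MB compound page (order 9 with 4,096-byte base pages), the first reports 4,096 bytes and the second 2,097,152.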
In an attempt to clarify the situation, Wilcox has come up with the concept of a "page folio", which is really just a page structure that is guaranteed not to be a tail page. Any function accepting a folio will operate on the full compound page (if, indeed, it is a compound page) with no ambiguity. The result is greater clarity in the kernel's memory-management subsystem; as functions are converted to take folios as arguments, it will become clear that they are not meant to operate on tail pages.
When Wilcox first posted this patch series, though, he emphasized a different benefit from the change. Any function that might be passed a tail page, but which must operate on the full compound page containing that tail page, must exchange any pointers to tail-page page structures for pointers to the head page instead. That is typically done with a call to:
struct page *compound_head(struct page *page);
This function is relatively cheap, but it may be called many times over the course of a single operation on a page. That makes the kernel bigger (since it's an inline function) and slows things down. A function that accepts a folio, instead, knows that it is not dealing with a tail page; thus it need not call compound_head(). That saves both time and memory.
The folio type itself is defined as a simple wrapper structure:
struct folio { struct page page; };
From there, a new set of infrastructure is built up. For example, get_folio() and put_folio() will manage references to the folio much like get_page() and put_page(), but without the unneeded calls to compound_head(). A whole set of higher-level functions follows from there. Much of the real work, though, will be in converting various kernel subsystems to use the new type; Wilcox didn't sugarcoat the nature of that task:
This is going to be a ton of work, and massively disruptive. It'll touch every filesystem, and a good few device drivers! But I think it's worth it.
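To show why the get_folio() and put_folio() helpers mentioned above can skip the head-page lookup, here is a minimal userspace sketch. It is a toy model under stated assumptions: the toy_ prefixed names, the refcount field, and the function bodies are all invented for illustration and are not taken from the patch series.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's page structure (assumed fields). */
struct toy_page {
    int refcount;               /* kept on the head page for the whole compound page */
    struct toy_page *head;      /* non-NULL on a tail page */
};

/* Same shape as the wrapper structure shown in the article. */
struct toy_folio {
    struct toy_page page;
};

/* Page-based helpers must resolve the head page on every call. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
    return page->head ? page->head : page;
}

static void toy_get_page(struct toy_page *page)
{
    toy_compound_head(page)->refcount++;
}

/* Folio-based helpers rely on the type: a folio is never a tail page,
 * so no lookup is needed. The asserts only document that invariant. */
static void toy_get_folio(struct toy_folio *folio)
{
    assert(folio->page.head == NULL);
    folio->page.refcount++;
}

static void toy_put_folio(struct toy_folio *folio)
{
    assert(folio->page.head == NULL);
    assert(folio->page.refcount > 0);
    folio->page.refcount--;
}
```

In this model, toy_get_page() called on a tail page and toy_get_folio() called on the folio both end up adjusting the same counter; the difference is that only the page-based path pays for the lookup each time.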
By the time the fourth version of this patch set was posted on March 5, the core patches and the conversions (which Wilcox didn't post) added up to about 100 commits, which is a fair amount to review.
Perhaps as a result of the size of the patch series, the previous postings did not elicit that much discussion. In response to the latest one, though, Andrew Morton took a look and was worried by what he saw:
Geeze it's a lot of noise. More things to remember and we'll forever have a mismash of `page' and `folio' and code everywhere converting from one to the other. Ongoing addition of folio accessors/manipulators to overlay the existing page accessors/manipulators, etc. It's unclear to me that it's all really worth it.
Hugh Dickins, too, expressed a lack of enthusiasm for this work. On the other hand, Kirill Shutemov and Michal Hocko both expressed support for it, in concept at least. Dave Chinner said that "this abstraction is absolutely necessary" for filesystem developers, especially if and when the page cache gains the ability to manage compound pages of multiple sizes.
So, in other words, there is currently no consensus among the core developers regarding whether this work improves the kernel or not. That may change over time as more people look at it and its advantages (or the lack thereof) become more clear. But change tends to happen slowly in the memory-management subsystem in general, even when the patch set in question is not so large and messy. One should also bear in mind that there is an inevitable discussion on naming to be had; it is already clear that "folio" is not popular, though alternatives are currently thin on the ground. One conclusion is thus clear: the kernel may well get folios or something like them, but it seems unlikely to happen soon.
Clarifying memory management with page folios
Posted Mar 18, 2021 17:24 UTC (Thu) by logang (subscriber, #127618)
In any case, if it gets in as named, it's only a matter of time before we can start describing compound pages as foliolate (having compound leaves) and someone is sure to come up with a case for a 'struct portfolio'. ;-)
Posted Mar 18, 2021 20:14 UTC (Thu) by mathstuf (subscriber, #69389)
Posted Mar 19, 2021 3:17 UTC (Fri) by willy (subscriber, #9762)
https://lore.kernel.org/linux-fsdevel/20201113174409.GH17...
Criteria: Must be easily greppable (book is bad), must be short, shouldn't be too cutesy (banqyet by analogy with byte was not under consideration). Online thesauri are your friends, but at the end of the day it's always a matter of taste.
Posted Mar 19, 2021 4:38 UTC (Fri) by jonas.bonn (subscriber, #47561)
Normally, pages are created by folding a 'sheet'... so there you go!
https://en.wikipedia.org/wiki/Paper_size#/media/File:A_si...
Posted Mar 19, 2021 9:32 UTC (Fri) by geert (subscriber, #98403)
BTW, "aigle" is not known by "dict", nor by my paper dictionary.
Posted Mar 19, 2021 11:10 UTC (Fri) by willy (subscriber, #9762)
https://en.wikipedia.org/wiki/Units_of_paper_quantity is also a good source of names. Honestly, I'm 120 patches in at this point. Someone's going to have to be really convincing to have a better name than folio.
Posted Mar 19, 2021 11:17 UTC (Fri) by geert (subscriber, #98403)
Posted Apr 2, 2021 11:13 UTC (Fri) by Hi-Angel (guest, #110915)
A little trick: doing a rename over all of the 120 patches might be done in just under a minute ;) What I'd do here is:
```
git format-patch -120 --stdout > 1.patch
sp folio my_better_name
git am -3 1.patch
```
Read `sp` as `sed`. For the sake of completeness: sp is my alias to sed_perl, which in turn is a wrapper over perl to replace text in files https://github.com/Hi-Angel/dotfiles/blob/140c78951502754...
I was at some point annoyed by discrepancies in behavior between grep, sed, awk, and what not, and migrated to using perl + ack (a perl version of grep). Never looked back. So… hopefully this will help.
Posted Mar 18, 2021 21:47 UTC (Thu) by unixbhaskar (guest, #44758)
"One should also bear in mind that there is an inevitable discussion on naming to be had; it is already clear that "folio" is not popular, though alternatives are currently thin on the ground. One conclusion is thus clear: the kernel may well get folios or something like them, but it seems unlikely to happen soon."
Matthew and Jon, how about a simple name (well, kernel is a bloody complex thing, it doesn't mean, it has to have a complex or artistic name, does it?) like "page_access" ? (I am sure that I missed certain things, a plethora of kernel API/ABI should have checked before preaching ...:) , which I haven't done so. But ...... Stop fretting at my naivety .... :) ..please...
Posted Mar 19, 2021 3:21 UTC (Fri) by guillemj (subscriber, #49706)
Posted Mar 19, 2021 4:10 UTC (Fri) by willy (subscriber, #9762)
https://git.infradead.org/users/willy/pagecache.git/short...
I'll do the changelog / cover letter / ... in the morning. BTW, I do want to emphasize that real workloads see a performance improvement. With the previous work, based on using Transparent Huge Pages, we saw a 7% performance improvement on kernel compiles, and that was with a very naive untuned algorithm for scaling up the THP size.
Posted Mar 19, 2021 9:12 UTC (Fri) by wahern (subscriber, #37304)
Posted Mar 19, 2021 11:29 UTC (Fri) by willy (subscriber, #9762)
If you have a folio and want the n'th page, that's nth_page(&folio->page, n). Nobody's needed that one yet (and only people with really weird physical memory layouts need to do that ... alloc_folio() won't return a folio that you need to do that to). Others are working on maybe disallowing those from existing entirely, in which case (&folio->page + n) will do fine.
The performance improvements do not come from a small subset of the changes. You have to make the entire filesystem safe to handle memory in folios (no more references to, eg, PAGE_SIZE, unless you can prove they're safe, calls to kmap() have to be scrutinised. copy_(to|from)_iter() calls need care and attention, etc, etc). Once the filesystem declares itself safe by setting a bit in the fs_flags then the page cache can start handing it folios instead of pages.
I think what you're suggesting is essentially what I did here:
https://git.infradead.org/users/willy/pagecache.git/short...
I've given up on that approach because it's hard to find all the bugs. "Oh this interface takes a struct page. Does it take any struct page, or do I need to call it once for each tail page in the compound page?" I invite you to consider the various implementations of flush_dcache_page() ... and if you can figure out the answer, please let me know.
Posted Mar 25, 2021 23:00 UTC (Thu) by flussence (guest, #85566)
I vaguely remember getting excited over the original THP patchset because I'd measured a consistent 3-4% improvement in memory-heavy workloads…
Posted Mar 19, 2021 14:08 UTC (Fri) by clugstj (subscriber, #4020)
Posted Mar 19, 2021 16:57 UTC (Fri) by willy (subscriber, #9762)
Posted Mar 19, 2021 17:01 UTC (Fri) by clugstj (subscriber, #4020)
Posted Mar 19, 2021 17:02 UTC (Fri) by willy (subscriber, #9762)
Posted Mar 19, 2021 17:16 UTC (Fri) by clugstj (subscriber, #4020)
Posted Mar 22, 2021 23:48 UTC (Mon) by milesrout (subscriber, #126894)
Posted Sep 16, 2021 8:10 UTC (Thu) by ncm (guest, #165)
How long that name needs to be depends on the scope of the names. In C, lacking any mechanism for namespacing, a practical name *often* must be unpleasantly long. That is a fault of the language, not (usually) of the person choosing the name; although some people confuse names with specifications, and so invent stupidly long names. "Compound_page" is not, in any universe, stupidly long for a C struct tag.
Posted Mar 21, 2021 20:38 UTC (Sun) by kiryl (subscriber, #41516)
Posted Jun 15, 2021 12:23 UTC (Tue) by DavideRepetto (guest, #152795)
Posted Aug 3, 2021 8:40 UTC (Tue) by taladar (subscriber, #68407)
Posted Sep 12, 2021 9:37 UTC (Sun) by deepfire (guest, #26138)
..and it's surprising how much resistance does this encounter, given the improvements.
Posted Mar 26, 2025 15:42 UTC (Wed) by fest3er (guest, #60379)
'Bucket-o-bytes', which shortens to 'bob'. And Bob's yer uncle. OK. Maybe not.
Posted Oct 16, 2021 9:05 UTC (Sat) by hidave (subscriber, #18406)
https://en.wikipedia.org/wiki/Juan
Posted Oct 16, 2021 10:24 UTC (Sat) by mpr22 (subscriber, #60784)
If anything comes to mind at all, it's most likely to be the Spanish form of the Hebrew name יוֹחָנָן (Yôḥānān), equivalent to English John, German Johann, Russian Иван, French Jean, etc.
Copyright © 2021, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds