Buffer Heads

Linux uses buffer heads to maintain state about individual filesystem blocks. Buffer heads are deprecated and new filesystems should use iomap instead.

Functions

void brelse(struct buffer_head *bh)

Release a buffer.

Parameters

struct buffer_head *bh

The buffer to release.

Description

Decrement a buffer_head’s reference count. If bh is NULL, this function is a no-op.

If all buffers on a folio have zero reference count, are clean and unlocked, and if the folio is unlocked and not under writeback, then try_to_free_buffers() may strip the buffers from the folio in preparation for freeing it (sometimes, rarely, buffers are removed from a folio but it ends up not being freed, and buffers may later be reattached).

Context

Any context.

void bforget(struct buffer_head *bh)

Discard any dirty data in a buffer.

Parameters

struct buffer_head *bh

The buffer to forget.

Description

Call this function instead of brelse() if the data written to a buffer no longer needs to be written back. It will clear the buffer’s dirty flag so writeback of this buffer will be skipped.

Context

Any context.
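A sketch of when bforget() is preferred over brelse(). The roll-back scenario and the sb and blocknr variables are hypothetical:

```c
/* Hypothetical sketch: a metadata block was modified, but the
 * operation is being abandoned, so the change must not reach disk. */
struct buffer_head *bh = sb_bread(sb, blocknr);
if (!bh)
        return -EIO;
/* ... modify bh->b_data, then hit an error and roll back ... */
bforget(bh);    /* drop our reference and cancel pending writeback */
```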

struct buffer_head *__bread(struct block_device *bdev, sector_t block, unsigned size)

Read a block.

Parameters

struct block_device *bdev

The block device to read from.

sector_t block

Block number in units of block size.

unsigned size

The block size of this device in bytes.

Description

Read a specified block, and return the buffer head that refers to it. The memory is allocated from the movable area so that it can be migrated. The returned buffer head has its refcount increased. The caller should call brelse() when it has finished with the buffer.

Context

May sleep waiting for I/O.

Return

NULL if the block was unreadable.
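As an illustration of the read-and-release pattern described above (bdev, blocksize and buf are assumed to be in scope):

```c
/* Read the device's first block and copy out its contents. */
struct buffer_head *bh = __bread(bdev, 0, blocksize);
if (!bh)
        return -EIO;            /* the block was unreadable */
memcpy(buf, bh->b_data, blocksize);
brelse(bh);                     /* drop the reference __bread() took */
```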

struct buffer_head *get_nth_bh(struct buffer_head *bh, unsigned int count)

Get a reference on the n’th buffer after this one.

Parameters

struct buffer_head *bh

The buffer to start counting from.

unsigned int count

How many buffers to skip.

Description

This is primarily useful for finding the nth buffer in a folio; in that case you pass the head buffer and the byte offset in the folio divided by the block size. It can be used for other purposes, but it will wrap at the end of the folio rather than returning NULL or proceeding to the next folio for you.

Return

The requested buffer with an elevated refcount.
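The folio use case described above might be sketched as follows (folio, offset and blocksize are assumed to be in scope):

```c
/* Find the buffer covering byte 'offset' within a folio whose
 * buffers are 'blocksize' bytes each. */
struct buffer_head *head = folio_buffers(folio);
struct buffer_head *bh = get_nth_bh(head, offset / blocksize);
/* ... use bh ... */
brelse(bh);     /* get_nth_bh() returned an elevated refcount */
```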

int sync_mapping_buffers(struct address_space *mapping)

write out & wait upon a mapping’s “associated” buffers

Parameters

struct address_space *mapping

the mapping which wants those buffers written

Description

Starts I/O against the buffers at mapping->i_private_list, and waits upon that I/O.

Basically, this is a convenience function for fsync(). mapping is a file or directory which needs those buffers to be written for a successful fsync().

int generic_buffers_fsync_noflush(struct file *file, loff_t start, loff_t end, bool datasync)

generic buffer fsync implementation for simple filesystems with no inode lock

Parameters

struct file *file

file to synchronize

loff_t start

start offset in bytes

loff_t end

end offset in bytes (inclusive)

bool datasync

only synchronize essential metadata if true

Description

This is a generic implementation of the fsync method for simple filesystems which track all non-inode metadata in the buffers list hanging off the address_space structure.

int generic_buffers_fsync(struct file *file, loff_t start, loff_t end, bool datasync)

generic buffer fsync implementation for simple filesystems with no inode lock

Parameters

struct file *file

file to synchronize

loff_t start

start offset in bytes

loff_t end

end offset in bytes (inclusive)

bool datasync

only synchronize essential metadata if true

Description

This is a generic implementation of the fsync method for simple filesystems which track all non-inode metadata in the buffers list hanging off the address_space structure. This also makes sure that a device cache flush operation is called at the end.
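A hypothetical simple filesystem might wire this in via a thin wrapper that adapts the ->fsync prototype (the myfs_* names are illustrative):

```c
static int myfs_fsync(struct file *file, loff_t start, loff_t end,
                      int datasync)
{
        return generic_buffers_fsync(file, start, end, datasync);
}

static const struct file_operations myfs_file_operations = {
        .llseek         = generic_file_llseek,
        .read_iter      = generic_file_read_iter,
        .write_iter     = generic_file_write_iter,
        .fsync          = myfs_fsync,
};
```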

bool block_dirty_folio(struct address_space *mapping, struct folio *folio)

Mark a folio as dirty.

Parameters

struct address_space *mapping

The address space containing this folio.

struct folio *folio

The folio to mark dirty.

Description

Filesystems which use buffer_heads can use this function as their ->dirty_folio implementation. Some filesystems need to do a little work before calling this function. Filesystems which do not use buffer_heads should call filemap_dirty_folio() instead.

If the folio has buffers, the uptodate buffers are set dirty, to preserve dirty-state coherency between the folio and the buffers. Buffers added to a dirty folio are created dirty.

The buffers are dirtied before the folio is dirtied. There’s a small race window in which writeback may see the folio cleanness but not the buffer dirtiness. That’s fine. If this code were to set the folio dirty before the buffers, writeback could clear the folio dirty flag, see a bunch of clean buffers and we’d end up with dirty buffers/clean folio on the dirty folio list.

We use i_private_lock to lock against try_to_free_buffers() while using the folio’s buffer list. This also prevents clean buffers being added to the folio after it was set dirty.

Context

May only be called from process context. Does not sleep. Caller must ensure that folio cannot be truncated during this call, typically by holding the folio lock or having a page in the folio mapped and holding the page table lock.

Return

True if the folio was dirtied; false if it was already dirtied.
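For a buffer_head-based filesystem, block_dirty_folio() can be used directly as the callback (the myfs_aops name is illustrative):

```c
static const struct address_space_operations myfs_aops = {
        .dirty_folio            = block_dirty_folio,
        .invalidate_folio       = block_invalidate_folio,
        /* ... read_folio, writepages, etc. ... */
};
```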

void mark_buffer_dirty(struct buffer_head *bh)

mark a buffer_head as needing writeout

Parameters

struct buffer_head *bh

the buffer_head to mark dirty

Description

mark_buffer_dirty() will set the dirty bit against the buffer, then set its backing page dirty, then tag the page as dirty in the page cache and then attach the address_space’s inode to its superblock’s dirty inode list.

mark_buffer_dirty() is atomic. It takes bh->b_folio->mapping->i_private_lock, the i_pages lock and mapping->host->i_lock.
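A typical modify-then-dirty sequence might look like this (sb and blocknr are assumed to be in scope):

```c
struct buffer_head *bh = sb_bread(sb, blocknr);
if (!bh)
        return -EIO;
lock_buffer(bh);
memset(bh->b_data, 0, bh->b_size);      /* example modification */
unlock_buffer(bh);
mark_buffer_dirty(bh);  /* queue the block for writeback */
brelse(bh);
```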

void __brelse(struct buffer_head *bh)

Release a buffer.

Parameters

struct buffer_head *bh

The buffer to release.

Description

This variant of brelse() can be called if bh is guaranteed to not be NULL.

void __bforget(struct buffer_head *bh)

Discard any dirty data in a buffer.

Parameters

struct buffer_head *bh

The buffer to forget.

Description

This variant of bforget() can be called if bh is guaranteed to not be NULL.

struct buffer_head *bdev_getblk(struct block_device *bdev, sector_t block, unsigned size, gfp_t gfp)

Get a buffer_head in a block device’s buffer cache.

Parameters

struct block_device *bdev

The block device.

sector_t block

The block number.

unsigned size

The size of buffer_heads for this bdev.

gfp_t gfp

The memory allocation flags to use.

Description

The returned buffer head has its reference count incremented, but is not locked. The caller should call brelse() when it has finished with the buffer. The buffer may not be uptodate. If needed, the caller can bring it uptodate either by reading it or overwriting it.

Return

The buffer head, or NULL if memory could not be allocated.
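Because the buffer need not be uptodate, bdev_getblk() suits blocks that will be completely overwritten. A sketch, assuming bdev, block and size are in scope and GFP_NOFS is an appropriate allocation context:

```c
struct buffer_head *bh = bdev_getblk(bdev, block, size, GFP_NOFS);
if (!bh)
        return -ENOMEM;                 /* allocation failed */
lock_buffer(bh);
memset(bh->b_data, 0, bh->b_size);      /* overwrite the whole block */
set_buffer_uptodate(bh);                /* contents are now valid */
unlock_buffer(bh);
mark_buffer_dirty(bh);
brelse(bh);
```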

struct buffer_head *__bread_gfp(struct block_device *bdev, sector_t block, unsigned size, gfp_t gfp)

Read a block.

Parameters

struct block_device *bdev

The block device to read from.

sector_t block

Block number in units of block size.

unsigned size

The block size of this device in bytes.

gfp_t gfp

Not page allocation flags; see below.

Description

You are not expected to call this function. You should use one of sb_bread(), sb_bread_unmovable() or __bread().

Read a specified block, and return the buffer head that refers to it. If gfp is 0, the memory will be allocated using the block device’s default GFP flags. If gfp is __GFP_MOVABLE, the memory may be allocated from a movable area. Do not pass in a complete set of GFP flags.

The returned buffer head has its refcount increased. The caller should call brelse() when it has finished with the buffer.

Context

May sleep waiting for I/O.

Return

NULL if the block was unreadable.

void block_invalidate_folio(struct folio *folio, size_t offset, size_t length)

Invalidate part or all of a buffer-backed folio.

Parameters

struct folio *folio

The folio which is affected.

size_t offset

start of the range to invalidate

size_t length

length of the range to invalidate

Description

block_invalidate_folio() is called when all or part of the folio has been invalidated by a truncate operation.

block_invalidate_folio() does not have to release all buffers, but it must ensure that no dirty buffer is left outside offset and that no I/O is underway against any of the blocks which are outside the truncation point, because the caller is about to free (and possibly reuse) those blocks on-disk.

void clean_bdev_aliases(struct block_device *bdev, sector_t block, sector_t len)

clean a range of buffers in block device

Parameters

struct block_device *bdev

Block device to clean buffers in

sector_t block

Start of a range of blocks to clean

sector_t len

Number of blocks to clean

Description

We are taking a range of blocks for data, and we don’t want writeback of any buffer-cache aliases from the moment this function returns until something explicitly marks the buffer dirty (hopefully that will not happen until we free that block ;-) We don’t even need to mark it not-uptodate - nobody can expect anything from a newly allocated buffer anyway. We used to use unmap_buffer() for such invalidation, but that was wrong. We definitely don’t want to mark the alias unmapped, for example - it would confuse anyone who might pick it up with bread() afterwards...

Also note that bforget() doesn’t lock the buffer, so there can be writeout I/O going on against recently-freed buffers. We don’t wait on that I/O in bforget() - it’s more efficient to wait on the I/O only if we really need to. That happens here.

bool try_to_free_buffers(struct folio *folio)

Release buffers attached to this folio.

Parameters

struct folio *folio

The folio.

Description

If any buffers are in use (dirty, under writeback, or with an elevated refcount), no buffers will be freed.

If the folio is dirty but all the buffers are clean then we need to be sure to mark the folio clean as well. This is because the folio may be against a block device, and a later reattachment of buffers to a dirty folio will set all buffers dirty, which would corrupt filesystem data on the same device.

The same applies to regular filesystem folios: if all the buffers are clean then we set the folio clean and proceed. To do that, we require total exclusion from block_dirty_folio(). That is obtained with i_private_lock.

Exclusion against try_to_free_buffers may be obtained by either locking the folio or by holding its mapping’s i_private_lock.

Context

Process context. folio must be locked. Will not sleep.

Return

true if all buffers attached to this folio were freed.

int bh_uptodate_or_lock(struct buffer_head *bh)

Test whether the buffer is uptodate

Parameters

struct buffer_head *bh

The buffer to test.

Description

Return true if the buffer is up-to-date and false, with the buffer locked, if not.

int __bh_read(struct buffer_head *bh, blk_opf_t op_flags, bool wait)

Submit read for a locked buffer

Parameters

struct buffer_head *bh

The buffer to read.

blk_opf_t op_flags

additional REQ_* flags to combine with REQ_OP_READ

bool wait

wait until the read finishes

Description

Returns zero on success (or when not waiting), and -EIO on error.
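__bh_read() pairs naturally with bh_uptodate_or_lock(). A common sketch for bringing a buffer uptodate on demand (bh is assumed to be referenced):

```c
if (!bh_uptodate_or_lock(bh)) {
        /* Not uptodate; bh_uptodate_or_lock() left it locked for us. */
        if (__bh_read(bh, 0, true))
                return -EIO;    /* synchronous read failed */
}
/* bh->b_data now holds valid data */
```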

void __bh_read_batch(int nr, struct buffer_head *bhs[], blk_opf_t op_flags, bool force_lock)

Submit read for a batch of unlocked buffers

Parameters

int nr

number of entries in the buffer batch

struct buffer_head *bhs[]

a batch of struct buffer_head

blk_opf_t op_flags

additional REQ_* flags to combine with REQ_OP_READ

bool force_lock

block to acquire each buffer’s lock if set; otherwise skip any buffer that cannot be locked immediately

Description

Submits the reads and returns without waiting for them to complete.
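A readahead-style use might be sketched as follows (nr and bhs are assumed to be a filled batch of referenced buffers):

```c
/* Start reads on the whole batch without waiting; with force_lock
 * false, buffers that are already locked are simply skipped. */
__bh_read_batch(nr, bhs, REQ_RAHEAD, false);
```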