sysfs - _The_ filesystem for exporting kernel objects
Patrick Mochel <mochel@osdl.org>
Mike Murphy <mamurph@cs.clemson.edu>
- Revised: 16 August 2011
- Original: 10 January 2003
What it is
sysfs is a RAM-based filesystem initially based on ramfs. It provides a means to export kernel data structures, their attributes, and the linkages between them to userspace.
sysfs is tied inherently to the kobject infrastructure. Please read "Everything you never wanted to know about kobjects, ksets, and ktypes" for more information concerning the kobject interface.
Using sysfs
sysfs is always compiled in if CONFIG_SYSFS is defined. You can access it by doing:
mount -t sysfs sysfs /sys
Directory Creation
For every kobject that is registered with the system, a directory is created for it in sysfs. That directory is created as a subdirectory of the kobject’s parent, expressing internal object hierarchies to userspace. Top-level directories in sysfs represent the common ancestors of object hierarchies; i.e. the subsystems the objects belong to.
sysfs internally stores a pointer to the kobject that implements a directory in the kernfs_node object associated with the directory. In the past this kobject pointer has been used by sysfs to do reference counting directly on the kobject whenever the file is opened or closed. With the current sysfs implementation the kobject reference count is only modified directly by the function sysfs_schedule_callback().
Attributes
Attributes can be exported for kobjects in the form of regular files in the filesystem. sysfs forwards file I/O operations to methods defined for the attributes, providing a means to read and write kernel attributes.
Attributes should be ASCII text files, preferably with only one value per file. It is noted that it may not be efficient to contain only one value per file, so it is socially acceptable to express an array of values of the same type.
Mixing types, expressing multiple lines of data, and doing fancy formatting of data is heavily frowned upon. Doing these things may get you publicly humiliated and your code rewritten without notice.
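One practical consequence of the one-value-per-file convention is that userspace parsing stays trivial. The sketch below is an illustration only; parse_attr_long() is a hypothetical helper that accepts a single decimal value with an optional trailing newline and rejects anything that looks like mixed or multi-field content.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the text of a one-value attribute such as
 * "42\n" into a long. Returns 0 on success, -EINVAL for malformed text,
 * -ERANGE if the value does not fit in a long.
 */
static int parse_attr_long(const char *text, long *out)
{
	char *end;
	long val;

	assert(text != NULL && out != NULL);
	errno = 0;
	val = strtol(text, &end, 10);
	if (end == text)
		return -EINVAL;          /* no digits at all */
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++;                   /* tolerate the usual trailing newline */
	if (*end != '\0')
		return -EINVAL;          /* extra fields or fancy formatting */
	*out = val;
	return 0;
}
```

An attribute that packed several unrelated fields into one file would need a bespoke parser instead, which is part of why the convention exists.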
An attribute definition is simply:

    struct attribute {
            char                    *name;
            struct module           *owner;
            umode_t                 mode;
    };

    int sysfs_create_file(struct kobject * kobj, const struct attribute * attr);
    void sysfs_remove_file(struct kobject * kobj, const struct attribute * attr);

A bare attribute contains no means to read or write the value of the attribute. Subsystems are encouraged to define their own attribute structure and wrapper functions for adding and removing attributes for a specific object type.
For example, the driver model defines struct device_attribute like:

    struct device_attribute {
            struct attribute        attr;
            ssize_t (*show)(struct device *dev, struct device_attribute *attr,
                            char *buf);
            ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                             const char *buf, size_t count);
    };

    int device_create_file(struct device *, const struct device_attribute *);
    void device_remove_file(struct device *, const struct device_attribute *);

It also defines this helper for defining device attributes:

    #define DEVICE_ATTR(_name, _mode, _show, _store) \
    struct device_attribute dev_attr_##_name = __ATTR(_name, _mode, _show, _store)
For example, declaring:
static DEVICE_ATTR(foo, S_IWUSR | S_IRUGO, show_foo, store_foo);
is equivalent to doing:

    static struct device_attribute dev_attr_foo = {
            .attr = {
                    .name = "foo",
                    .mode = S_IWUSR | S_IRUGO,
            },
            .show = show_foo,
            .store = store_foo,
    };

Note that include/linux/kernel.h states “OTHER_WRITABLE? Generally considered a bad idea.”, so trying to make a sysfs file writable for everyone will fail, reverting to RO mode for “Others”.
For the common cases sysfs.h provides convenience macros to make defining attributes easier as well as making code more concise and readable. The above case could be shortened to:

    static struct device_attribute dev_attr_foo = __ATTR_RW(foo);

The list of helpers available to define your wrapper function is:
- __ATTR_RO(name): assumes default name_show and mode 0444.
- __ATTR_WO(name): assumes a name_store only and is restricted to mode 0200, that is, root write access only.
- __ATTR_RO_MODE(name, mode): for more restrictive RO access; currently the only use case is the EFI System Resource Table (see drivers/firmware/efi/esrt.c).
- __ATTR_RW(name): assumes default name_show, name_store and sets mode to 0644.
- __ATTR_NULL: sets the name to NULL and is used as an end-of-list indicator (see kernel/workqueue.c).
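To make the naming conventions behind these helpers concrete, here is a userspace sketch of how such macros can be built with token pasting and stringification. Everything prefixed mock_ or MOCK_ is a stand-in written for this illustration, not the kernel's definition; the real macros live in sysfs.h, and their show/store signatures take more arguments than the simplified ones used here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Userspace stand-ins; the kernel's real types carry more fields. */
struct mock_attribute {
	const char *name;
	unsigned short mode;
};

struct mock_device_attribute {
	struct mock_attribute attr;
	ssize_t (*show)(char *buf);
	ssize_t (*store)(const char *buf, size_t count);
};

/* Mock helpers modelled on the descriptions in the list above. */
#define MOCK_ATTR(_name, _mode, _show, _store) \
	{ .attr = { .name = #_name, .mode = _mode }, \
	  .show = _show, .store = _store }
#define MOCK_ATTR_RO(_name) MOCK_ATTR(_name, 0444, _name##_show, NULL)
#define MOCK_ATTR_WO(_name) MOCK_ATTR(_name, 0200, NULL, _name##_store)
#define MOCK_ATTR_RW(_name) MOCK_ATTR(_name, 0644, _name##_show, _name##_store)
#define MOCK_ATTR_NULL      { .attr = { .name = NULL } }

/* Trivial callbacks so that the pasted names resolve. */
static ssize_t foo_show(char *buf) { buf[0] = '\0'; return 0; }
static ssize_t foo_store(const char *buf, size_t count) { (void)buf; return (ssize_t)count; }
static ssize_t bar_show(char *buf) { buf[0] = '\0'; return 0; }

/* A NULL-name entry terminates the table, in the spirit of __ATTR_NULL. */
static struct mock_device_attribute mock_attrs[] = {
	MOCK_ATTR_RW(foo),
	MOCK_ATTR_RO(bar),
	MOCK_ATTR_NULL,
};

/* Walk the table until the sentinel entry is reached. */
static size_t count_attrs(const struct mock_device_attribute *table)
{
	size_t n = 0;

	while (table[n].attr.name != NULL)
		n++;
	return n;
}
```

Note that mode 0644 is the same value as S_IWUSR | S_IRUGO from the earlier DEVICE_ATTR example, which is why the RW helper can replace it.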
Subsystem-Specific Callbacks
When a subsystem defines a new attribute type, it must implement a set of sysfs operations for forwarding read and write calls to the show and store methods of the attribute owners:

    struct sysfs_ops {
            ssize_t (*show)(struct kobject *, struct attribute *, char *);
            ssize_t (*store)(struct kobject *, struct attribute *, const char *, size_t);
    };

[ Subsystems should have already defined a struct kobj_type as a descriptor for this type, which is where the sysfs_ops pointer is stored. See the kobject documentation for more information. ]
When a file is read or written, sysfs calls the appropriate method for the type. The method then translates the generic struct kobject and struct attribute pointers to the appropriate pointer types, and calls the associated methods.
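The pointer translation described above relies on the generic structures being embedded inside the type-specific ones. The following userspace sketch mimics that pattern with simplified stand-in types; struct widget, struct widget_attribute and the simplified container_of() are assumptions made for this illustration and are not kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>      /* offsetof */
#include <stdio.h>
#include <sys/types.h>   /* ssize_t */

#define BUF_SIZE 64      /* arbitrary buffer size for this sketch */

/* Simplified userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct attribute { const char *name; };     /* generic attribute stand-in */
struct kobject   { const char *name; };     /* generic object stand-in */

struct widget {                              /* hypothetical object type */
	int level;
	struct kobject kobj;                     /* embedded generic object */
};

struct widget_attribute {                    /* hypothetical attribute type */
	struct attribute attr;
	ssize_t (*show)(struct widget *w, struct widget_attribute *a, char *buf);
};

/* Type-specific method: it sees a widget, never a bare kobject. */
static ssize_t level_show(struct widget *w, struct widget_attribute *a, char *buf)
{
	(void)a;
	return snprintf(buf, BUF_SIZE, "%d\n", w->level);
}

/* Generic entry point, playing the role of a sysfs_ops show method. */
static ssize_t widget_attr_show(struct kobject *kobj, struct attribute *attr,
				char *buf)
{
	struct widget_attribute *wattr =
		container_of(attr, struct widget_attribute, attr);
	struct widget *w = container_of(kobj, struct widget, kobj);

	return wattr->show ? wattr->show(w, wattr, buf) : -EIO;
}
```

Because the generic members are embedded rather than pointed to, recovering the outer structure is plain pointer arithmetic and needs no lookup table.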
To illustrate:

    #define to_dev_attr(_attr) container_of(_attr, struct device_attribute, attr)

    static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
                                 char *buf)
    {
            struct device_attribute *dev_attr = to_dev_attr(attr);
            struct device *dev = kobj_to_dev(kobj);
            ssize_t ret = -EIO;

            if (dev_attr->show)
                    ret = dev_attr->show(dev, dev_attr, buf);
            if (ret >= (ssize_t)PAGE_SIZE) {
                    printk("dev_attr_show: %pS returned bad count\n",
                           dev_attr->show);
            }
            return ret;
    }

Reading/Writing Attribute Data
To read or write attributes, show() or store() methods must be specified when declaring the attribute. The method types should be as simple as those defined for device attributes:

    ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf);
    ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                     const char *buf, size_t count);

In other words, they should take only an object, an attribute, and a buffer as parameters.
sysfs allocates a buffer of size (PAGE_SIZE) and passes it to the method. sysfs will call the method exactly once for each read or write. This forces the following behavior on the method implementations:
- On read(2), the show() method should fill the entire buffer. Recall that an attribute should only be exporting one value, or an array of similar values, so this shouldn’t be that expensive.

  This allows userspace to do partial reads and forward seeks arbitrarily over the entire file at will. If userspace seeks back to zero or does a pread(2) with an offset of ‘0’ the show() method will be called again, rearmed, to fill the buffer.

- On write(2), sysfs expects the entire buffer to be passed during the first write. sysfs then passes the entire buffer to the store() method. A terminating null is added after the data on stores. This makes functions like sysfs_streq() safe to use.

  When writing sysfs files, userspace processes should first read the entire file, modify the values it wishes to change, then write the entire buffer back.

  Attribute method implementations should operate on an identical buffer when reading and writing values.
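From the userspace side, the rules above amount to reading the whole attribute in one call starting at offset zero, and writing the complete new value in one call. The helpers below are a minimal sketch of that pattern; attr_read() and attr_write() are hypothetical names, and the code works on any file path, so it can be tried on a scratch file before pointing it at a real attribute.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a whole attribute in one call. A freshly opened file starts at
 * offset 0, so show() is invoked to fill the buffer.
 * Returns the byte count (buf is NUL-terminated) or -1 on error.
 */
static ssize_t attr_read(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd;

	assert(buf != NULL && len > 0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, buf, len - 1);
	close(fd);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return n;
}

/* Write the complete new value in a single write(2). */
static ssize_t attr_write(const char *path, const char *buf, size_t len)
{
	ssize_t n;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	n = write(fd, buf, len);
	close(fd);
	return n;
}
```

A read-modify-write cycle would call attr_read(), edit the text in memory, then pass the full buffer to attr_write() rather than issuing several partial writes.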
Other notes:
- Writing causes the show() method to be rearmed regardless of current file position.
- The buffer will always be PAGE_SIZE bytes in length. On x86, this is 4096.
- show() methods should return the number of bytes printed into the buffer.
- New implementations of show() methods should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space.
- store() should return the number of bytes used from the buffer. If the entire buffer has been used, just return the count argument.
- show() or store() can always return errors. If a bad value comes through, be sure to return an error.
- The object passed to the methods will be pinned in memory via sysfs reference counting its embedded object. However, the physical entity (e.g. device) the object represents may not be present. Be sure to have a way to check this, if necessary.
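The return-value rules in these notes can be exercised outside the kernel with a small mock. In the sketch below, mock_emit() is a userspace stand-in loosely modelled on the bounded formatting that sysfs_emit() provides, MOCK_PAGE_SIZE assumes the x86 page size mentioned above, and the 0..100 range check is an arbitrary policy chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

#define MOCK_PAGE_SIZE 4096   /* assumed page size; the x86 value from the text */

static int level;             /* hypothetical value behind the attribute */

/* Userspace stand-in for sysfs_emit(): bounded formatting into one page. */
static ssize_t mock_emit(char *buf, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, MOCK_PAGE_SIZE, fmt, ap);
	va_end(ap);
	if (n < 0)
		return -EIO;
	/* Report what was actually stored, not what would have fit. */
	return n >= MOCK_PAGE_SIZE ? MOCK_PAGE_SIZE - 1 : n;
}

/* show(): return the number of bytes placed in the buffer. */
static ssize_t level_show(char *buf)
{
	return mock_emit(buf, "%d\n", level);
}

/* store(): consume the whole NUL-terminated buffer or reject it. */
static ssize_t level_store(const char *buf, size_t count)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (end == buf || errno == ERANGE || val < 0 || val > 100)
		return -EINVAL;       /* bad value: return an error */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;       /* trailing garbage */
	level = (int)val;
	return (ssize_t)count;    /* entire buffer used */
}
```

Rejected writes leave the stored value untouched, so a failed store() has no side effects that userspace would need to undo.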
A very simple (and naive) implementation of a device attribute is:

    static ssize_t show_name(struct device *dev, struct device_attribute *attr,
                             char *buf)
    {
            return sysfs_emit(buf, "%s\n", dev->name);
    }

    static ssize_t store_name(struct device *dev, struct device_attribute *attr,
                              const char *buf, size_t count)
    {
            snprintf(dev->name, sizeof(dev->name), "%.*s",
                     (int)min(count, sizeof(dev->name) - 1), buf);
            return count;
    }

    static DEVICE_ATTR(name, S_IRUGO, show_name, store_name);

(Note that the real implementation doesn’t allow userspace to set the name for a device.)
Top Level Directory Layout
The sysfs directory arrangement exposes the relationship of kernel data structures.
The top level sysfs directory looks like:

    block/
    bus/
    class/
    dev/
    devices/
    firmware/
    fs/
    hypervisor/
    kernel/
    module/
    power/

devices/ contains a filesystem representation of the device tree. It maps directly to the internal kernel device tree, which is a hierarchy of struct device.
bus/ contains a flat directory layout of the various bus types in the kernel. Each bus’s directory contains two subdirectories:

    devices/
    drivers/

devices/ contains symlinks for each device discovered in the system that point to the device’s directory under /sys/devices.
drivers/ contains a directory for each device driver that is loaded for devices on that particular bus (this assumes that drivers do not span multiple bus types).
fs/ contains a directory for some filesystems. Currently each filesystem wanting to export attributes must create its own hierarchy below fs/ (see FUSE Overview for an example).
module/ contains parameter values and state information for all loaded system modules, for both builtin and loadable modules.
dev/ contains two directories: char/ and block/. Inside these two directories there are symlinks named <major>:<minor>. These symlinks point to the directories under /sys/devices for each device. /sys/dev provides a quick way to look up the sysfs interface for a device from the result of a stat(2) operation.
More information on driver-model specific features can be found in Documentation/driver-api/driver-model/.
block/ contains symlinks to all the block devices discovered on the system. These symlinks point to directories under /sys/devices.
class/ contains a directory for each device class, grouped by functional type. Each directory in class/ contains symlinks to devices in the /sys/devices directory.
firmware/ contains system firmware data and configuration such as firmware tables, ACPI information, and device tree data.
hypervisor/ contains virtualization platform information and provides an interface to the underlying hypervisor. It is only present when running on a virtual machine.
kernel/ contains runtime kernel parameters, configuration settings, and status.
power/ contains power management subsystem information including sleep states, suspend/resume capabilities, and policies.
Current Interfaces
The following interface layers currently exist in sysfs.
devices (include/linux/device.h)
Structure:

    struct device_attribute {
            struct attribute        attr;
            ssize_t (*show)(struct device *dev, struct device_attribute *attr,
                            char *buf);
            ssize_t (*store)(struct device *dev, struct device_attribute *attr,
                             const char *buf, size_t count);
    };

Declaring:

    DEVICE_ATTR(_name, _mode, _show, _store);

Creation/Removal:

    int device_create_file(struct device *dev, const struct device_attribute * attr);
    void device_remove_file(struct device *dev, const struct device_attribute * attr);

bus drivers (include/linux/device.h)
Structure:

    struct bus_attribute {
            struct attribute        attr;
            ssize_t (*show)(const struct bus_type *, char * buf);
            ssize_t (*store)(const struct bus_type *, const char * buf, size_t count);
    };

Declaring:

    static BUS_ATTR_RW(name);
    static BUS_ATTR_RO(name);
    static BUS_ATTR_WO(name);

Creation/Removal:

    int bus_create_file(struct bus_type *, struct bus_attribute *);
    void bus_remove_file(struct bus_type *, struct bus_attribute *);

device drivers (include/linux/device.h)
Structure:

    struct driver_attribute {
            struct attribute        attr;
            ssize_t (*show)(struct device_driver *, char * buf);
            ssize_t (*store)(struct device_driver *, const char * buf,
                             size_t count);
    };

Declaring:

    DRIVER_ATTR_RO(_name)
    DRIVER_ATTR_RW(_name)

Creation/Removal:

    int driver_create_file(struct device_driver *, const struct driver_attribute *);
    void driver_remove_file(struct device_driver *, const struct driver_attribute *);

Documentation
The sysfs directory structure and the attributes in each directory define an ABI between the kernel and user space. As for any ABI, it is important that this ABI is stable and properly documented. All new sysfs attributes must be documented in Documentation/ABI. See also Documentation/ABI/README for more information.