Semantic file systems are file systems used for information persistence that structure data according to its semantics and intent, rather than its location as in conventional file systems. They allow data to be addressed by content (associative access). Traditional hierarchical file systems tend to impose a burden, for example when a subdirectory layout contradicts a user's perception of where files should be stored. A tag-based interface alleviates this hierarchy problem and lets users query for data intuitively.
Semantic file systems raise technical design challenges: indexes of words, tags or other elementary signs have to be created, continually updated and cached for performance in order to offer the desired random, multivariate access to files, on top of the underlying and mostly traditional block-based file system.
A semantic file system can be envisioned as part of a semantic desktop.
The notion of a semantic file system was proposed in 1991 by researchers at MIT and the École des Mines de Paris.[1] They proposed an integrated system whose main query interface looked like a traditional file system interface, using a virtual directory system that interpreted a path as a conjunctive query. Their implementation automatically extracted the relevant metadata via what they called file type specific transducers.
Starting around 2004, a new wave of implementations centered on manual tagging of files and folders.
In 2008, researchers proposed integrating semantic file systems with Semantic Web technologies.[2]
Tags can be used instead of folders to circumvent the limits of a hierarchical model.
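The following minimal sketch illustrates how a tag-based interface can treat a path as a conjunctive query: every path component is read as a tag, and only files carrying all of them match. The `TagIndex` class, its method names and the sample file names are hypothetical and are not taken from any of the systems described here.

```python
from collections import defaultdict


class TagIndex:
    """In-memory inverted index from a tag to the set of files carrying it."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def tag(self, filename, *tags):
        """Attach one or more tags to a file."""
        for t in tags:
            self._files_by_tag[t].add(filename)

    def query(self, virtual_path):
        """Treat each path component as a tag and intersect the matches."""
        tags = [part for part in virtual_path.split("/") if part]
        if not tags:
            return set()
        result = set(self._files_by_tag.get(tags[0], set()))
        for t in tags[1:]:
            result &= self._files_by_tag.get(t, set())
        return result


index = TagIndex()
index.tag("budget.ods", "finance", "2023")
index.tag("invoice.pdf", "finance", "2023", "clients")
index.tag("holiday.jpg", "2023", "photos")

print(sorted(index.query("/finance/2023")))  # ['budget.ods', 'invoice.pdf']
```

Because the query is a set intersection, `/finance/2023` and `/2023/finance` return the same files, which is what frees the user from a single fixed hierarchy.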
Gifford et al.[1] suggested the idea of file type-specific metadata automatically extracted by a file type-specific transducer.
For instance, for a source code text file, metadata could include the names of the procedures that the program exports or imports, procedure types, and the files included by the program. For a document, metadata could include its date, author, title and structure (sections and subsections). For an e-mail, it could include the sender, recipient and subject.
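As an illustration of the transducer idea, the sketch below dispatches on file extension and extracts the kinds of fields mentioned above for an e-mail and for a source file. It is not Gifford et al.'s implementation: the `TRANSDUCERS` registry, the function names and the choice of Python source as the example language are assumptions made for this example.

```python
import ast
from email import message_from_string


def email_transducer(text):
    """Extract sender, recipient and subject from a plain-text e-mail message."""
    msg = message_from_string(text)
    return {"sender": msg["From"], "recipient": msg["To"], "subject": msg["Subject"]}


def python_source_transducer(text):
    """Extract defined procedure names and imported modules from Python source."""
    tree = ast.parse(text)
    procedures, imports = [], []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            procedures.append(node.name)
        elif isinstance(node, ast.Import):
            imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.append(node.module)
    return {"procedures": sorted(procedures), "imports": sorted(imports)}


# Hypothetical registry mapping a file extension to its transducer.
TRANSDUCERS = {".eml": email_transducer, ".py": python_source_transducer}


def extract_metadata(filename, text):
    """Pick a transducer by extension; unknown file types yield no metadata."""
    for ext, transducer in TRANSDUCERS.items():
        if filename.endswith(ext):
            return transducer(text)
    return {}


mail = "From: alice@example.org\nTo: bob@example.org\nSubject: Report\n\nSee attached."
print(extract_metadata("note.eml", mail))
```

A real system would run such transducers whenever a file is created or modified and feed the resulting fields into its index.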
In scientific workflows, provenance of a data file is important. A scientist might want to select a results file by filtering by the input dataset.
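A provenance filter of this kind can be sketched as follows, assuming each results file has a record listing its direct inputs. The `ProvenanceRecord` type and all file names are hypothetical; a real system would capture such records automatically rather than by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    output_file: str
    inputs: tuple  # names of the files read to produce output_file
    program: str


def results_derived_from(records, dataset):
    """Return output files whose direct inputs include the given dataset."""
    return [r.output_file for r in records if dataset in r.inputs]


records = [
    ProvenanceRecord("fit_a.csv", ("survey_2020.csv",), "fit.py"),
    ProvenanceRecord("fit_b.csv", ("survey_2021.csv",), "fit.py"),
    ProvenanceRecord("merged.csv", ("survey_2020.csv", "survey_2021.csv"), "merge.py"),
]

print(results_derived_from(records, "survey_2020.csv"))  # ['fit_a.csv', 'merged.csv']
```

This sketch only looks at direct inputs; following a chain of intermediate files would require walking the records transitively.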
Vasudevan and Pazandak[3] introduce the distinction between integrated and augmented approaches:
They suggest open systems architecture as being well adapted to semantic file system implementations.
Even integrated semantic file systems may choose to expose an interface for compatibility with existing local or distributed file system protocols. For instance, Gifford et al.’s 1991 implementation was fully compatible with NFS.[1]
Extended file attributes provided by the file system can be a way to store the metadata.
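On Linux, Python exposes extended attributes through `os.setxattr` and `os.getxattr`. The sketch below stores a tag list in a single attribute; the attribute name `user.tags` and the comma-separated encoding are assumptions chosen for illustration, not a convention used by any particular tool. The helpers fall back gracefully where the platform or the underlying file system does not support user attributes.

```python
import os
import tempfile

ATTR_NAME = "user.tags"  # hypothetical attribute name in the Linux "user" namespace


def write_tags(path, tags):
    """Store tags as a comma-separated extended attribute; False if unsupported."""
    if not hasattr(os, "setxattr"):
        return False
    try:
        os.setxattr(path, ATTR_NAME, ",".join(tags).encode("utf-8"))
        return True
    except OSError:
        return False


def read_tags(path):
    """Read tags back; an absent or unsupported attribute yields an empty list."""
    if not hasattr(os, "getxattr"):
        return []
    try:
        raw = os.getxattr(path, ATTR_NAME)
    except OSError:
        return []
    return [t for t in raw.decode("utf-8").split(",") if t]


with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
try:
    stored = write_tags(demo_path, ["finance", "2023"])
    print(stored, read_tags(demo_path))
finally:
    os.remove(demo_path)
```

Keeping the metadata in the file's own attributes means it travels with the file on rename, but tags containing commas would need a different encoding, and finding all files with a given tag still requires a separate index.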
A relational database is another very frequent way to store the metadata.
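One possible relational layout, sketched here with Python's built-in `sqlite3` module, uses a many-to-many table between files and tags. The schema, table names and helper functions are assumptions for illustration and do not reproduce the schema of any system listed in this article.

```python
import sqlite3

SCHEMA = """
CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE file_tags (
    file_id INTEGER NOT NULL REFERENCES files(id),
    tag_id INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (file_id, tag_id)
);
"""


def add_file(conn, path, tags):
    """Register a file and link it to each of its tags."""
    conn.execute("INSERT OR IGNORE INTO files (path) VALUES (?)", (path,))
    file_id = conn.execute("SELECT id FROM files WHERE path = ?", (path,)).fetchone()[0]
    for name in tags:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = conn.execute("SELECT id FROM tags WHERE name = ?", (name,)).fetchone()[0]
        conn.execute("INSERT OR IGNORE INTO file_tags VALUES (?, ?)", (file_id, tag_id))


def files_with_all_tags(conn, tags):
    """Conjunctive query: paths of files that carry every requested tag."""
    wanted = sorted(set(tags))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    rows = conn.execute(
        f"""
        SELECT f.path FROM files f
        JOIN file_tags ft ON ft.file_id = f.id
        JOIN tags t ON t.id = ft.tag_id
        WHERE t.name IN ({placeholders})
        GROUP BY f.id
        HAVING COUNT(DISTINCT t.id) = ?
        ORDER BY f.path
        """,
        (*wanted, len(wanted)),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
add_file(conn, "budget.ods", ["finance", "2023"])
add_file(conn, "holiday.jpg", ["photos", "2023"])
print(files_with_all_tags(conn, ["finance", "2023"]))  # ['budget.ods']
```

The `GROUP BY ... HAVING COUNT` pattern keeps only files matched by every requested tag, so the database's own indexes do the work of the conjunctive lookup.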

Name | Type | Metadata | OS | Date | Comment |
---|---|---|---|---|---|
Lineage File System[4] | File system extension | Lineage | Linux | 2005 | Modifies the Linux kernel to log all process creation and file-related system calls. Uses a MySQL database. |
SemFS (formerly TagFS)[5] | File system | Tags | Linux, Windows | 2006 | On Windows, can be mounted as a WebDAV drive. On Linux, based on FUSE. Tags are stored as RDF. Uses an internal file system, not exposed. |
SFS[1] | File system extension | File type-specific | Linux | 1991 | |

Name | Type | Metadata | OS | License | Programming language(s) | Last update | Comment |
---|---|---|---|---|---|---|---|
Be File System (BFS) | File system | | BeOS | Proprietary; last version is freeware | | | Metadata is stored in extended file attributes. Works with the Tracker file manager. |
dantalian | File system extension | Tags | Linux and contiguous POSIX-compatible file systems | Apache 2 | Python | 2016 | Uses symlinks |
dhtfs | User-level file system extension | Tags | Linux | BSD 3-clause | Python | 2009 | Based on FUSE |
Elyse | Graphical file manager | Tags | Windows and macOS | Proprietary, no cost | | 2021 | |
Fuse::TagLayer | File system extension | Tags | Linux | GPL v3 / AL v2 | Perl | 2013 | Based on FUSE |
Tabbles | Graphical file manager | Tags | Windows Vista to 11 | Proprietary, freemium | .NET Framework | | Uses a SQL Server relational database. |
Tag2Find | | Tags | Windows XP and Vista 32-bit | | | 2007 | |
TagsForAll | Graphical file manager | Tags | Windows x64 | Freemium | | 2014 | 70-tag limit in the free version. Metadata is stored in two places: in files as ADS (Alternate Data Streams for NTFS), and in a local database. |
Tagsistant | File system | Tags | Linux | GPL | C | 2017 | Tag-based, based on FUSE |
TagSpaces | Graphical file manager, web or desktop (uses Electron) | Tags | Windows, macOS, Linux, and Android | AGPL (Freemium) | TypeScript, JavaScript, Java, Objective-C | Ongoing | |
tagxfs | File system extension | Tags | Linux | Boost Software License 1.0 | C++ | 2013 | Extends the user space file system to a tag-based hierarchy. |
TMSU | Virtual file system | Tags | | | | 2022 | Uses a SQLite relational database. |
TransparenTag | File system | Tags | Linux, BSD | GPL v2 | OCaml | 2013 | Data and tags are stored as regular files |
WinFS | File system and manager | Any type | Windows XP | Proprietary | .NET Framework | 2006 | Uses a relational database |
xtagfs | File system extension | Tags | Mac OS X | GPL v2 | Python | 2009 | Based on FUSE |
Research & Specifications