COM Structured Storage (variously also known as COM structured storage or OLE structured storage) is a technology developed by Microsoft as part of its Windows operating system for storing hierarchical data within a single file. Strictly speaking, the term structured storage refers to a set of COM interfaces that a conforming implementation must provide, and not to a specific implementation, nor to a specific file format (in fact, a structured storage implementation need not store its data in a file at all). In addition to providing a hierarchical structure for data, structured storage may also provide a limited form of transactional support for data access. Microsoft provides an implementation that supports transactions, as well as one that does not (called simple-mode storage; the latter implementation is limited in other ways as well, although it performs better).
Structured storage is widely used in Microsoft Office applications, although newer releases (starting with Office 2007) use the XML-based Office Open XML by default. It is also an important part of both COM and the related Object Linking and Embedding (OLE) technologies. Other notable applications of structured storage include SQL Server, the Windows shell, and many third-party CAD programs.
Structured storage addresses some inherent difficulties of storing multiple data objects within a single file. One difficulty arises when an object persisted in the file changes in size due to an update. If the application that reads and writes the file expects the objects to remain in a certain order, everything following that object's representation in the file may need to be shifted backward to make room if the object grows, or forward to fill in the space left over if the object shrinks. If the file is large, this can be a costly operation. There are many possible solutions to this difficulty, but often the application programmer does not want to deal with low-level details such as binary file formats.
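The cost described above can be made concrete with a small sketch. It assumes a hypothetical flat layout in which objects are stored back to back, each prefixed by a 4-byte length; the function names and the layout are illustrative only and do not correspond to any real file format.

```python
import struct

def pack_objects(objects):
    """Lay objects out back to back, each prefixed by a 4-byte little-endian length."""
    return b"".join(struct.pack("<I", len(o)) + o for o in objects)

def offsets(objects):
    """Return the byte offset at which each object's record starts."""
    result, pos = [], 0
    for o in objects:
        result.append(pos)
        pos += 4 + len(o)
    return result

def bytes_moved_by_update(objects, index, new_value):
    """Count the bytes after `index` that must be relocated if object `index` changes size."""
    if len(new_value) == len(objects[index]):
        return 0  # same size: overwrite in place, nothing moves
    return sum(4 + len(o) for o in objects[index + 1:])

objects = [b"header", b"A" * 10, b"B" * 1000, b"C" * 5000]
before = offsets(objects)

# Growing the second object by a single byte displaces every later record.
moved = bytes_moved_by_update(objects, 1, b"A" * 11)
objects[1] = b"A" * 11
after = offsets(objects)
flat_file = pack_objects(objects)
```

In this sketch a one-byte growth in a small object forces the two larger objects behind it to be rewritten, which is the kind of work a structured storage implementation takes over from the application.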
Structured storage provides an abstraction known as a stream, represented by the interface IStream. A stream is conceptually very similar to a file, and the IStream interface provides methods for reading and writing similar to file input/output. A stream could reside in memory, within a file, within another stream, etc., depending on the implementation. Another important abstraction is that of a storage, represented by the interface IStorage. A storage is conceptually very similar to a directory on a file system. Storages can contain streams, as well as other storages.
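The relationship between storages and streams can be sketched with a minimal in-memory model. The `Storage` class and its methods below are hypothetical stand-ins invented for illustration, not the COM `IStorage` and `IStream` interfaces; streams are represented by `io.BytesIO`, which offers file-like reading, writing, and seeking.

```python
import io

class Storage:
    """Hypothetical stand-in for a storage: a named container of streams and sub-storages."""

    def __init__(self, name):
        self.name = name
        self.streams = {}    # stream name -> io.BytesIO
        self.storages = {}   # storage name -> Storage

    def create_stream(self, name):
        stream = io.BytesIO()
        self.streams[name] = stream
        return stream

    def create_storage(self, name):
        child = Storage(name)
        self.storages[name] = child
        return child

    def walk(self, prefix=""):
        """Yield (path, size) for every stream, visiting nested storages depth first."""
        path = f"{prefix}/{self.name}"
        for name, stream in sorted(self.streams.items()):
            yield f"{path}/{name}", len(stream.getvalue())
        for child in sorted(self.storages.values(), key=lambda s: s.name):
            yield from child.walk(path)

# A root storage holding one stream and one nested storage with its own stream.
root = Storage("root")
root.create_stream("Summary").write(b"title=Report")
chart = root.create_storage("Chart1")
chart.create_stream("Data").write(b"\x01\x02\x03\x04")

listing = list(root.walk())

# Streams behave like files: seek to the start and read part of the content.
data = chart.streams["Data"]
data.seek(0)
first_two = data.read(2)
```

The nesting mirrors a directory tree, which is why the article compares a storage to a directory and a stream to a file.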
If an application wishes to persist several data objects to a file, one way to do so would be to open an IStorage that represents the contents of that file and save each of the objects within its own IStream. One way to accomplish the latter is through the standard COM interface IPersistStream. OLE depends heavily on this model to embed objects within documents.
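A rough sketch of this persistence model follows. The `save` and `load` methods are loosely modeled on the idea behind IPersistStream but are not its actual method signatures, and the `Point` and `Label` classes, the `save_all` helper, and the use of a plain dictionary as the storage are all assumptions made for illustration.

```python
import io
import struct

class Point:
    """Example object that can write itself to, and read itself from, a stream."""

    def __init__(self, x=0, y=0):
        self.x, self.y = x, y

    def save(self, stream):
        stream.write(struct.pack("<ii", self.x, self.y))

    def load(self, stream):
        self.x, self.y = struct.unpack("<ii", stream.read(8))

class Label:
    """Example variable-size object: a length-prefixed UTF-8 string."""

    def __init__(self, text=""):
        self.text = text

    def save(self, stream):
        data = self.text.encode("utf-8")
        stream.write(struct.pack("<I", len(data)) + data)

    def load(self, stream):
        (length,) = struct.unpack("<I", stream.read(4))
        self.text = stream.read(length).decode("utf-8")

def save_all(objects):
    """Write each named object into its own stream; the dict plays the role of a storage."""
    storage = {}
    for name, obj in objects.items():
        stream = io.BytesIO()
        obj.save(stream)
        storage[name] = stream
    return storage

storage = save_all({"origin": Point(3, -4), "caption": Label("Figure 1")})
origin_bytes_before = storage["origin"].getvalue()

# Growing one object replaces only its own stream; the other stream is untouched.
bigger = io.BytesIO()
Label("Figure 1 (revised)").save(bigger)
storage["caption"] = bigger

restored_point = Point()
storage["origin"].seek(0)
restored_point.load(storage["origin"])

restored_label = Label()
storage["caption"].seek(0)
restored_label.load(storage["caption"])
```

Because each object owns a stream, an object that grows or shrinks does not disturb the bytes of its neighbours, which is the problem the flat layout could not avoid.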
Microsoft's implementation uses a file format known as compound files, and all of the widely deployed structured storage implementations read and write this format. Compound files use a FAT-like structure to represent storages and streams. Chunks of the file, known as sectors (these may or may not correspond to sectors of the underlying file system), are allocated as needed to add new streams and to increase the size of existing streams. If streams are deleted or shrink, leaving unallocated sectors, those sectors can be reused for new streams.
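The allocation scheme can be sketched as a table in which each entry names the next sector of the same stream. This is a simplified illustration, not the on-disk layout: the `SectorTable` class and its methods are hypothetical, and only the two sentinel values are borrowed from Microsoft's published compound file specification.

```python
FREESECT = 0xFFFFFFFF    # marks an unallocated sector
ENDOFCHAIN = 0xFFFFFFFE  # marks the last sector of a stream

class SectorTable:
    """Simplified FAT-like table: fat[i] is the sector that follows sector i."""

    def __init__(self):
        self.fat = []

    def _allocate(self):
        """Reuse the first free sector if there is one, otherwise grow the file."""
        for i, entry in enumerate(self.fat):
            if entry == FREESECT:
                self.fat[i] = ENDOFCHAIN
                return i
        self.fat.append(ENDOFCHAIN)
        return len(self.fat) - 1

    def create_stream(self, sector_count):
        """Allocate a chain of at least one sector and return its first sector index."""
        if sector_count < 1:
            raise ValueError("this sketch only models streams of one or more sectors")
        first = prev = None
        for _ in range(sector_count):
            sector = self._allocate()
            if prev is None:
                first = sector
            else:
                self.fat[prev] = sector  # link the previous sector to the new one
            prev = sector
        return first

    def chain(self, first):
        """Follow the links from `first` and return every sector in the stream."""
        sectors = []
        while first != ENDOFCHAIN:
            sectors.append(first)
            first = self.fat[first]
        return sectors

    def delete_stream(self, first):
        """Mark every sector of the stream as free so it can be reused."""
        for sector in self.chain(first):
            self.fat[sector] = FREESECT

table = SectorTable()
a = table.create_stream(2)   # occupies sectors 0 and 1
b = table.create_stream(2)   # occupies sectors 2 and 3
table.delete_stream(a)       # sectors 0 and 1 become free
c = table.create_stream(3)   # reuses the freed sectors, then appends one more
chain_b = table.chain(b)
chain_c = table.chain(c)
```

Here the third stream ends up in non-contiguous sectors, which shows both how freed space is reused and why a stream's sectors need not be adjacent in the file.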
The following applications use the OLE Structured Storage (Compound Document Format)
During its beta testing phase, Windows 2000 included a feature titled Native Structured Storage (NSS) for storage of Structured Storage documents (like the binary Microsoft Office formats and the thumbs.db file Windows Explorer uses to cache thumbnails), with each stream that makes up a document stored in a separate NTFS data stream. It included utilities that automatically split up the streams in a regular Structured Storage document into NTFS data streams and vice versa. However, the feature was withdrawn after Beta 3 due to incompatibilities with other OS components, and any NSS files were automatically converted to the single data stream format.[1]