We all know that Memory IO is 50-200 times faster than Disk IO!
Caching plays a decent role in boosting read and compaction performance, but some machines and smaller devices (eg: mobile phones) might not have enough memory for a cache, so a configurable cache that can be disabled, partial or full is required.
Note - A Segment (.seg) file in SwayDB is simply a byte array that stores other byte arrays like keys, values, indexes etc (Array<Array<Byte>>). All these bytes can be cached based on any condition, which is configurable.
Configuring IO and Cache
When accessing any file with a custom format we generally
- Open the file (OpenResource).
- Read the file header or info (ReadDataOverview) to understand the file's content, eg: format etc.
- Finally read the content of the file (Compressed or Uncompressed data).
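To make the three steps concrete, here is a minimal sketch using plain java.nio. The file layout (one version byte, a 4-byte payload length, then the payload) and the class name ThreeStepRead are assumptions made up for illustration. This is not SwayDB's Segment format, and the payload here is always uncompressed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical toy format, NOT SwayDB's Segment format:
// byte 0 = format version, bytes 1..4 = payload length (int), then the payload.
final class ThreeStepRead {

    static final int HEADER_SIZE = 1 + Integer.BYTES;

    static void write(Path file, byte version, byte[] payload) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(HEADER_SIZE + payload.length);
        buffer.put(version).putInt(payload.length).put(payload);
        Files.write(file, buffer.array());
    }

    static byte[] read(Path file) throws IOException {
        // Step 1: open the file (compare OpenResource).
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Step 2: read the small header (compare ReadDataOverview).
            ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE);
            readFully(channel, header, 0);
            header.flip();
            byte version = header.get();
            int length = header.getInt();
            if (version != 1) throw new IOException("Unknown version: " + version);

            // Step 3: read the content of the file.
            ByteBuffer content = ByteBuffer.allocate(length);
            readFully(channel, content, HEADER_SIZE);
            return content.array();
        }
    }

    // Keeps reading from the given file position until the buffer is full.
    private static void readFully(FileChannel channel, ByteBuffer target, long position)
            throws IOException {
        while (target.hasRemaining()) {
            int read = channel.read(target, position + target.position());
            if (read < 0) throw new IOException("Unexpected end of file");
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("three-step", ".bin");
        write(file, (byte) 1, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(read(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

Each of the three steps is a separate IO call, which is why it makes sense to choose an IO strategy and a caching policy per step rather than per file.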
The following sample ioStrategy function does exactly that: we get an IOAction that describes what IO is being performed by SwayDB, and in our function we define how we want to perform IO (IOStrategy) for that action and also configure caching for the read data/bytes.
.ioStrategy(
  (IOAction ioAction) -> {
    if (ioAction.isOpenResource()) {
      //Here we are just opening the file so do synchronised IO because
      //blocking when opening a file might be cheaper than thread
      //context switching. Also set cacheOnAccess to true so that other
      //concurrent threads accessing the same file channel do not
      //open multiple channels to the same file.
      return new IOStrategy.SynchronisedIO(true);
    } else if (ioAction.isReadDataOverview()) {
      //Data overview is always small and less than 128 bytes and can be
      //read synchronously to avoid switching threads. Also cache
      //this data (cacheOnAccess) for the benefit of other threads and to save IO.
      return new IOStrategy.SynchronisedIO(true);
    } else {
      //Here we are reading actual content of the file which can be compressed
      //or uncompressed.
      IOAction.DataAction action = (IOAction.DataAction) ioAction;
      if (action.isCompressed()) {
        //If the data is compressed we do not want multiple threads to concurrently
        //decompress it so perform either Async or Sync IO for decompression
        //and then cache the compressed data. You can also read the compressed
        //and decompressed size with the following code
        //IOAction.ReadCompressedData dataAction = (IOAction.ReadCompressedData) action;
        //dataAction.compressedSize();
        //dataAction.decompressedSize();
        return new IOStrategy.AsyncIO(true);
      } else {
        //Else the data is not compressed so we allow concurrent access to it.
        //Here cacheOnAccess can also be set to true but that could allow multiple
        //threads to concurrently cache the same data. If cacheOnAccess is required
        //then use Async or Sync IO instead.
        return new IOStrategy.ConcurrentIO(false);
      }
    }
  }
)
You will find the above ioStrategy property in all data-blocks that form a Segment - SortedKeyIndex, RandomKeyIndex, BinarySearchIndex, MightContainIndex & ValuesConfig.
A Segment itself is also a data-block and its ioStrategy can also be configured via SegmentConfig.
Cache control/limit with MemoryCache
Caching should be controlled so that it does not lead to memory overflow!
You can enable or disable caching for any or all of the following
- Bytes within a Segment (ByteCacheOnly).
- Parsed key-values (KeyValueCacheOnly).
- Or all the above (MemoryCache.All).
By default ByteCacheOnly is used because KeyValueCacheOnly uses an in-memory SkipList, and inserts to a large SkipList are expensive, which is not useful for the general use-case. But KeyValueCacheOnly can be useful for applications that perform multiple reads on the same data, if that data rarely changes.
An Actor configuration is also required here, which manages the cache in the background. You can configure the Actor to be a Basic, Timer or TimerLoop.
The following demonstrates how to configure all caches.
//Byte cache only
.setMemoryCache(
  MemoryCache
    .byteCacheOnlyBuilder()
    .minIOSeekSize(4096)
    .skipBlockCacheSeekSize(StorageUnits.mb(4))
    .cacheCapacity(StorageUnits.gb(2))
    .actorConfig(new ActorConfig.Basic((ExecutionContext) DefaultConfigs.sweeperEC()))
)

//or key-value cache only
.setMemoryCache(
  MemoryCache
    .keyValueCacheOnlyBuilder()
    .cacheCapacity(StorageUnits.gb(3))
    .maxCachedKeyValueCountPerSegment(Optional.of(100))
    .actorConfig(new Some(new ActorConfig.Basic((ExecutionContext) DefaultConfigs.sweeperEC())))
)

//or enable both the above.
.setMemoryCache(
  MemoryCache
    .allBuilder()
    .minIOSeekSize(4096)
    .skipBlockCacheSeekSize(StorageUnits.mb(4))
    .cacheCapacity(StorageUnits.gb(1))
    .maxCachedKeyValueCountPerSegment(Optional.of(100))
    .sweepCachedKeyValues(true)
    .actorConfig(new ActorConfig.Basic((ExecutionContext) DefaultConfigs.sweeperEC()))
)
minIOSeekSize
The blockSize, which sets the minimum number of bytes to read for each IO. For example, in the above configuration, if you ask for 6000 bytes then 4096 * 2 bytes will be read.
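The arithmetic behind that example is a round-up to the next multiple of the block size. The sketch below only reproduces that calculation. The class and method names are hypothetical, and how SwayDB aligns read offsets internally is not covered here.

```java
// Hypothetical helper illustrating the minIOSeekSize arithmetic;
// this is not part of SwayDB's API.
final class BlockAlignedRead {

    // Rounds a requested read size up to the next multiple of blockSize.
    static long alignedReadSize(long requestedBytes, long blockSize) {
        if (requestedBytes <= 0 || blockSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long blocks = (requestedBytes + blockSize - 1) / blockSize; // ceiling division
        return blocks * blockSize;
    }

    public static void main(String[] args) {
        // 6000 bytes do not fit in one 4096 byte block, so two blocks are read.
        System.out.println(alignedReadSize(6000, 4096)); // prints 8192
    }
}
```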
The value to set depends on your machine's block size. On Mac this can be read with the following command:
diskutil info / | grep "Block Size"
which returns
Device Block Size: 4096 Bytes
Allocation Block Size: 4096 Bytes
skipBlockCacheSeekSize
This skips the BlockCache and performs direct IO if the data size is greater than this value.
cacheCapacity
Sets the total memory capacity. On overflow the oldest data in the cache is dropped by the Actor.
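As a rough illustration of dropping the oldest data on overflow, here is a minimal byte-capacity cache built on java.util.LinkedHashMap. It is a sketch of the general idea only: the class ByteCapacityCache is hypothetical, it evicts inline on every put, and it does not model SwayDB's Actor, which manages the cache in the background.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not SwayDB's implementation: a cache limited by the
// total number of cached bytes that drops its oldest entries on overflow.
final class ByteCapacityCache {

    private final long capacityBytes;
    private long usedBytes = 0;
    // Insertion-ordered map, so iteration starts at the oldest entry.
    private final LinkedHashMap<String, byte[]> entries = new LinkedHashMap<>();

    ByteCapacityCache(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    synchronized void put(String key, byte[] value) {
        // Re-inserting a key makes it the newest entry.
        byte[] previous = entries.remove(key);
        if (previous != null) usedBytes -= previous.length;
        entries.put(key, value);
        usedBytes += value.length;

        // On overflow, drop the oldest entries until we are within capacity.
        Iterator<Map.Entry<String, byte[]>> oldestFirst = entries.entrySet().iterator();
        while (usedBytes > capacityBytes && oldestFirst.hasNext()) {
            usedBytes -= oldestFirst.next().getValue().length;
            oldestFirst.remove();
        }
    }

    synchronized byte[] get(String key) {
        return entries.get(key);
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

With a capacity of 10 bytes, inserting three 4-byte values under keys "a", "b" and "c" leaves "b" and "c" cached and drops "a", the oldest entry.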
maxCachedKeyValueCountPerSegment
If set, each Segment is initialised with a dedicated LimitSkipList. This cache is managed by the Actor or by the Segment itself if it gets deleted or when the max limit is reached.
sweepCachedKeyValues
Enables clearing cached key-values via the Actor. If false, key-values are kept in-memory indefinitely unless the Segment gets deleted. This configuration can be used for smaller databases (eg: application configs) that read the same data more often.
Memory-mapping (MMAP)
MMAP can also be optionally enabled for all files.
Map<Integer, String, Void> map =
  MapConfig
    .functionsOff(Paths.get("myMap"), intSerializer(), stringSerializer())
    .setMmapAppendix(true) //enable MMAP for appendix files
    .setMmapMaps(true) //enable MMAP for LevelZero write-ahead log files
    .setSegmentConfig( //configuring MMAP for Segment files
      SegmentConfig
        .builder()
        ...
        //either disable memory-mapping Segments
        .mmap(MMAP.disabled())
        //or enable for writes and reads.
        .mmap(MMAP.writeAndRead())
        //or enable for reads only.
        .mmap(MMAP.readOnly())
        ...
    )
    .get();

map.put(1, "one");
map.get(1); //Optional[one]
Summary
You are in full control of caching & IO and can configure them to suit your application's needs. If your IOStrategy configurations use only AsyncIO and ConcurrentIO then you can truly build reactive applications which are non-blocking end-to-end, other than the file system IO performed by java.nio.* classes. Support for Libio to provide async file system IO can be implemented as a feature if requested.
Useful links
- SwayDB on GitHub.
- Java examples repo.
- Kotlin examples repo.
- Scala examples repo.
- Documentation.