-
The Cloudflare Developer Platform provides a serverless runtime and a bunch of data stores: KV, Durable Objects, Cache, R2, D1 and Analytics Engine. Each of these data stores provides some mechanism for writing data, then reading it back. Specifically, KV, Durable Objects, Cache and R2 are all key-value stores: you can put/get/delete and (ignoring Cache) list keys. For implementing these, Miniflare has a common key-value storage interface that supports putting/getting/deleting keys with metadata and expiry, then listing them based on a key prefix.
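For concreteness, here's a rough sketch of what such a key-value storage interface could look like; the names are hypothetical, not Miniflare's actual `Storage` API:

```ts
// Hypothetical sketch of a key-value storage interface supporting metadata,
// expiry and prefix-based listing; illustrative only.
interface StoredValue {
  value: Uint8Array;
  metadata?: unknown; // arbitrary JSON-serialisable metadata
  expiration?: number; // UNIX timestamp (seconds) after which the key expires
}

interface ListOptions {
  prefix?: string; // only return keys starting with this prefix
  limit?: number; // maximum number of keys to return
  cursor?: string; // opaque pagination cursor
}

interface ListResult {
  keys: { name: string; expiration?: number; metadata?: unknown }[];
  cursor?: string; // set when there are more keys to list
}

interface KeyValueStorage {
  get(key: string): Promise<StoredValue | null>;
  put(key: string, value: StoredValue): Promise<void>;
  delete(key: string): Promise<boolean>;
  list(options?: ListOptions): Promise<ListResult>;
}
```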
Given that we can make breaking changes to the persistence format with the major version bump to Miniflare 3, I think it's time to rethink storage.

## Requirements

### KV
Our KV implementation also needs to support read-only Workers Sites namespaces, backed by an arbitrary directory, with glob-style include/exclude rules.

### Cache

Requires reading multiple ranges from a single value, returned as a multipart response.
### R2

Needs to support streaming reads of very large objects, and a list operation with a rich set of query options.
### D1

Requires exclusive access to an SQLite database.

### Durable Objects

Persistence implemented entirely in

## Proposal

Instead of having a single store for both metadata and large blobs, I propose we split these up. Given the variety of queries required by each data store (especially R2's list), and D1's hard requirement on SQLite, using SQLite for the metadata store seems like a good idea. This also gives us the transactional updates we're looking for, e.g. for multipart uploads. We could then implement our own simple blob store, supporting multi-ranged, streaming reads. Multiple ranges with multipart responses are required by Cache, and streaming reads seem like a good idea for the large objects R2 can support.
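To make the split concrete, here's a hypothetical sketch of what per-bucket metadata for R2 might look like, with records pointing at immutable blobs held in the separate blob store. It uses better-sqlite3; the table and column names are invented, not the real schema:

```ts
// Hypothetical per-bucket metadata tables; blob_id values point into the
// separate blob store rather than embedding object data in SQLite.
import Database from "better-sqlite3";

const db = new Database("r2-bucket.sqlite3"); // one database per bucket

db.exec(`
  CREATE TABLE IF NOT EXISTS objects (
    key       TEXT PRIMARY KEY,
    blob_id   TEXT,               -- pointer into the blob store
    size      INTEGER NOT NULL,
    etag      TEXT NOT NULL,
    uploaded  INTEGER NOT NULL    -- UNIX timestamp (ms)
  );

  -- Multipart uploads can be completed transactionally by inserting pointers
  -- to the already-uploaded part blobs, without copying any blob data.
  CREATE TABLE IF NOT EXISTS object_parts (
    object_key  TEXT NOT NULL REFERENCES objects(key),
    part_number INTEGER NOT NULL,
    blob_id     TEXT NOT NULL,
    size        INTEGER NOT NULL,
    PRIMARY KEY (object_key, part_number)
  );
`);
```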
In-memory and file-system backed implementations would be provided for both stores. For file-system backed stores, a root directory should be provided containing the store's data. We should validate that no store in the same Miniflare instance is rooted in a child directory of any other file-system store's root. We may also want to provide a simple expiring key-value-metadata store abstraction on top of these, for use with KV and Cache.
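As an illustration of that root-directory validation, a small sketch using plain path resolution (the function name and error are placeholders):

```ts
// Hypothetical sketch: ensure no file-system store's root is nested inside
// another store's root within the same Miniflare instance.
import path from "node:path";

function validateStorageRoots(roots: string[]): void {
  const resolved = roots.map((root) => path.resolve(root));
  for (const outer of resolved) {
    for (const inner of resolved) {
      if (outer === inner) continue;
      const relative = path.relative(outer, inner);
      // `inner` is a child of `outer` if the relative path stays inside it.
      if (!relative.startsWith("..") && !path.isAbsolute(relative)) {
        throw new Error(`Storage root "${inner}" is nested inside "${outer}"`);
      }
    }
  }
}
```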
In the future, we may implement Miniflare's simulators in

### SQLite

We should create a new SQLite database for each KV namespace, R2 bucket, D1 database, etc. For the in-memory implementation, we should use SQLite's built-in in-memory databases.

### Blob Store

This should provide an interface like:

```ts
type BlobId = string;

interface Range {
  start: number; // inclusive
  end: number; // inclusive
}

interface MultipartOptions {
  contentLength: number;
  contentType?: string;
}

interface MultipartReadableStream {
  multipartContentType: string;
  body: ReadableStream<Uint8Array>;
}

interface BlobStore {
  get(id: BlobId, range?: Range): Promise<ReadableStream<Uint8Array> | null>;
  get(
    id: BlobId,
    ranges: Range[],
    opts: MultipartOptions
  ): Promise<MultipartReadableStream | null>;
  put(userId: string, stream: ReadableStream<Uint8Array>): Promise<BlobId>;
  delete(id: BlobId): Promise<void>;
}
```
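As a usage sketch of this interface, reading one range and then several ranges as a single multipart body; the store instance, blob ID and option values are assumed for illustration:

```ts
// Hypothetical usage of the BlobStore interface above.
async function readExamples(store: BlobStore, id: BlobId): Promise<void> {
  // Single range: bytes 0-99 inclusive, streamed back directly.
  const single = await store.get(id, { start: 0, end: 99 });
  if (single !== null) {
    const reader = single.getReader();
    let chunk = await reader.read();
    while (!chunk.done) {
      console.log("read", chunk.value.byteLength, "bytes");
      chunk = await reader.read();
    }
  }

  // Multiple ranges: returned as one multipart body, as Cache needs for
  // multi-range requests. contentLength is assumed to be the blob's total
  // length, so a Content-Range header can be generated for each part.
  const multi = await store.get(
    id,
    [
      { start: 0, end: 99 },
      { start: 200, end: 299 },
    ],
    { contentLength: 1024, contentType: "application/octet-stream" }
  );
  if (multi !== null) {
    console.log(multi.multipartContentType); // e.g. "multipart/byteranges; boundary=..."
  }
}
```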
Whilst entire
This interface makes it possible to
This means the file will only be deleted once the streaming read has finished.
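A minimal file-system sketch of how such a blob store could behave, including the deletion behaviour just described. Node.js and POSIX unlink semantics are assumed, the directory layout and names are illustrative, and the multipart overload is omitted:

```ts
// Hypothetical file-backed blob store: random unguessable IDs, ranged
// streaming reads, and deletion via unlink.
import { createReadStream, createWriteStream } from "node:fs";
import { mkdir, unlink } from "node:fs/promises";
import { randomUUID } from "node:crypto";
import { once } from "node:events";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";
import path from "node:path";

class FileBlobStore {
  constructor(private root: string) {}

  private blobPath(id: string): string {
    return path.join(this.root, id);
  }

  // Store a new blob under a random, unguessable ID and return that ID.
  async put(stream: ReadableStream<Uint8Array>): Promise<string> {
    await mkdir(this.root, { recursive: true });
    const id = randomUUID();
    await pipeline(Readable.fromWeb(stream as any), createWriteStream(this.blobPath(id)));
    return id;
  }

  // Stream back a single inclusive byte range (or the whole blob).
  async get(id: string, range?: { start: number; end: number }) {
    const file = createReadStream(
      this.blobPath(id),
      range ? { start: range.start, end: range.end } : {}
    );
    try {
      await once(file, "open"); // rejects with ENOENT if the blob doesn't exist
    } catch {
      return null;
    }
    return Readable.toWeb(file) as unknown as ReadableStream<Uint8Array>;
  }

  // Remove the blob's name from the directory. On POSIX systems, a stream
  // that already has the file open keeps reading; the data is only reclaimed
  // once that handle closes.
  async delete(id: string): Promise<void> {
    await unlink(this.blobPath(id)).catch(() => {});
  }
}
```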
-
This looks great! Still forming thoughts, but some initial ones:
-
Great questions! 😃
I don't think this is a big problem. Because we're using unguessable IDs, there'll be no way to reference the dangling blobs, so they won't affect any other records. Because this would be rare, I don't see the wasted disk usage as a problem. The SQLite metadata store will always be the source of truth, and we'll always be sure to initiate blob deletions only after the transactions deleting the metadata records have committed successfully.
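Sketching that ordering with better-sqlite3, reusing the hypothetical schema from above (the blob store parameter is just anything with a `delete` method):

```ts
// Hypothetical delete flow: metadata transaction commits first, then blobs
// are removed; a crash in between only leaves unreferenced blobs behind.
import Database from "better-sqlite3";

function deleteObject(
  db: Database.Database,
  blobs: { delete(id: string): Promise<void> },
  key: string
) {
  // Collect blob pointers and remove metadata rows in a single transaction.
  const collectAndDelete = db.transaction((k: string): string[] => {
    const ids: string[] = [];
    const object = db.prepare("SELECT blob_id FROM objects WHERE key = ?").get(k) as
      | { blob_id: string | null }
      | undefined;
    if (object?.blob_id) ids.push(object.blob_id);
    const parts = db
      .prepare("SELECT blob_id FROM object_parts WHERE object_key = ?")
      .all(k) as { blob_id: string }[];
    for (const part of parts) ids.push(part.blob_id);
    db.prepare("DELETE FROM object_parts WHERE object_key = ?").run(k);
    db.prepare("DELETE FROM objects WHERE key = ?").run(k);
    return ids;
  });

  const ids = collectAndDelete(key); // transaction has committed here
  // Metadata can never point at missing blob data; at worst, unreferenced
  // blobs linger until these deletions run.
  return Promise.all(ids.map((id) => blobs.delete(id)));
}
```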
I don't think R2 provides a rename-like operation in Worker bindings. For multipart values, we don't copy the parts when assembling the final record; we just store pointers to the existing blobs. If R2 were to add something like this in the future, we could implement it by updating the key in the metadata store and keeping the blob reference the same. This would leave the incorrect key on disk, but I think that's an acceptable tradeoff for the benefits that keeping blob IDs/values immutable gives us.
We'll never actually need to decode the encoded keys. The encoding doesn't really need to be 1-1; that was badly worded on my part. File paths will look something like
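Purely as an illustration of a non-reversible encoding (not the actual scheme), one option is to keep a sanitised, human-readable form of the key for debugging and append a hash of the original key; the file name can't be decoded back into the key, which is fine because the metadata store, not the file name, is the source of truth:

```ts
// Hypothetical key-to-file-name encoding: readable but not decodable.
import { createHash } from "node:crypto";

function keyToFileName(key: string): string {
  const sanitised = key.replace(/[^a-zA-Z0-9._-]/g, "_").slice(0, 64);
  const digest = createHash("sha256").update(key).digest("hex").slice(0, 16);
  return `${sanitised}.${digest}`;
}

// e.g. keyToFileName("assets/img/logo.png") -> "assets_img_logo.png.<hash>"
```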
-
I feel like we need to define a few more things for this proposal: