ReGrid File Storage

Brian Chavez edited this page Sep 11, 2019 · 64 revisions

What is ReGrid?

ReGrid is a distributed large-file store built on top of RethinkDB, inspired by GridFS from MongoDB. With ReGrid, a large file (4 GB, for example) can be broken up into chunks and stored on a RethinkDB cluster. Later, the file can be retrieved by streaming it back to the client. The figure below shows ReGrid storing a large video file in chunks across a three-node cluster.

Figure 1: Physical Layout (Note: Please ask before using figures in presentations, videos, or other works. Thanks.)

Important Terms

  • Physical view refers to the low-level view of the physical topology, location, and layout of raw file data.
  • Logical view refers to the high-level view of the file system's organization of files regardless of the physical layout of data.

Getting Started

Download & Install

NuGet Package: RethinkDb.Driver.ReGrid

Install-Package RethinkDb.Driver.ReGrid

Buckets

A Bucket is a logical set of files organized together. File read/download and write/upload operations are performed using a Bucket.

  • A Bucket requires a RethinkDB database.
  • A RethinkDB database can be partitioned into several Buckets.
  • Multiple Buckets in the same RethinkDB database are differentiated by a Bucket's name.
  • The default name for a Bucket is fs.

The figure below illustrates the logical separation of buckets within a single MyFiles database:

Figure 2: Logical Buckets in MyFiles DB

In Figure 2 above, there are three logical file Bucket stores in the MyFiles RethinkDB database. It is important to note that video.mp4 from the fs bucket is not the same file as video.mp4 from the dev bucket. Buckets can be used to organize files in any way app developers see fit.

To create a Bucket named dev in MyFiles:

    var bucket = new Bucket(conn, "MyFiles", bucketName: "dev");
    bucket.Mount(); // required before use...

The dev Bucket must be mounted before use. Mount ensures that the required tables and indexes exist.

Files

A path is specified when a File is uploaded into a Bucket. Multiple uploads to the same path cause the file to be revisioned. Figure 3 below shows /video.mp4 uploaded and revisioned 5 times.

Figure 3: /video.mp4 revisions

Revision Numbers

Positive                        Negative
0: The original stored file.    -1: The most recent revision.
1: The first revision.          -2: The second most recent revision.
2: The second revision.         -3: The third most recent revision.
etc...                          etc...
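
The two numbering schemes can be read as two ways of indexing the same upload history. The sketch below is illustrative only and does not use the ReGrid API: RevisionSketch and ToHistoryIndex are hypothetical names, and storedCount (the number of stored versions of one path) is an assumed input.

```csharp
using System;

// Hypothetical helper, not part of ReGrid.
public static class RevisionSketch
{
    // Maps a revision number to a zero-based position in the upload history.
    // storedCount is the total number of stored versions of one path
    // (the original plus every later revision).
    public static int ToHistoryIndex(int revision, int storedCount)
    {
        // Non-negative numbers count forward from the original (0);
        // negative numbers count backward from the most recent (-1).
        int index = revision >= 0 ? revision : storedCount + revision;

        if (index < 0 || index >= storedCount)
            throw new ArgumentOutOfRangeException(nameof(revision));

        return index;
    }
}
```

For a path with 6 stored versions (the original plus 5 revisions), revision 0 and revision -6 both resolve to the original, while revision 5 and revision -1 both resolve to the most recent upload.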

Upload

The following code uploads a file to a Bucket:

    // Upload a file using byte[]
    var fileId = bucket.Upload("/video.mp4", videoBytes);

    // Upload a file using an IO stream
    Guid uploadId;
    using (var fileStream = File.Open("C:\\video.mp4", FileMode.Open))
    using (var uploadStream = bucket.OpenUploadStream("/video.mp4"))
    {
        uploadId = uploadStream.FileInfo.Id;
        fileStream.CopyTo(uploadStream);
    }

fileId is the file reference for that specific revision. Many other methods on bucket work with IO streams or offer async variants.

UploadOptions

UploadOptions can be specified to control ChunkSizeBytes, the size of the document chunks stored in RethinkDB. Optionally, additional Metadata can also be stored along with the uploaded file.

    var opts = new UploadOptions();
    opts.SetMetadata(new
        {
            UserId = "123",
            LastAccess = R.Now(),
            Roles = R.Array("admin", "office"),
            ContentType = "application/pdf"
        });

    var id = bucket.Upload(testFile, TestBytes.HalfChunk, opts);

    // Read the file info back and check the stored metadata
    var fileInfo = bucket.GetFileInfo(id);

    fileInfo.Metadata["UserId"].Value<string>().Should().Be("123");
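
To get a feel for how the chunk size affects the number of chunk documents, the arithmetic can be sketched as below. This is a rough illustration that assumes a file is stored as consecutive fixed-size chunks with a possibly shorter final chunk. ChunkMath and ChunkCount are hypothetical names, and the 1 MiB chunk size used in the note is an arbitrary example value, not a ReGrid default.

```csharp
using System;

// Hypothetical helper, not part of ReGrid.
public static class ChunkMath
{
    // Number of chunk documents needed to hold a file of the given length,
    // assuming fixed-size chunks with a possibly shorter last chunk.
    public static long ChunkCount(long fileLength, int chunkSizeBytes)
    {
        if (fileLength < 0)
            throw new ArgumentOutOfRangeException(nameof(fileLength));
        if (chunkSizeBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSizeBytes));

        // Ceiling division: a partial final chunk still needs its own document.
        return (fileLength + chunkSizeBytes - 1) / chunkSizeBytes;
    }
}
```

Under these assumptions, a 4 GB file (4 * 1024 * 1024 * 1024 bytes) with a 1 MiB chunk size needs 4096 chunks, and one extra byte pushes that to 4097. Smaller chunks mean more documents per file, while larger chunks mean fewer but bigger documents.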

Download

    // Download to a byte[]
    var bytes = bucket.DownloadAsBytesByName("/video.mp4");

    // Download revision 0 to a file stream on the client
    using (var localFileStream = File.Open("C:\\video_original.mp4", FileMode.Create))
    {
        bucket.DownloadToStreamByName("/video.mp4", localFileStream, revision: 0);
    }

Use caution with DownloadAsBytes: it returns a byte[], whose maximum size is int.MaxValue. For relatively large files, use DownloadToStream, which has no maximum size limit beyond the host OS limitations on the client side.
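
One way to act on this caution is to check a file's length before deciding which download method to call. The sketch below only encodes the int.MaxValue ceiling quoted above. DownloadSketch and FitsInByteArray are hypothetical names, and how the length is obtained is left to the caller.

```csharp
// Hypothetical helper, not part of ReGrid.
public static class DownloadSketch
{
    // True when a file of this length could be held in a single byte[],
    // using int.MaxValue as the ceiling. Available memory and runtime
    // limits may make the practical ceiling lower.
    public static bool FitsInByteArray(long fileLength)
    {
        return fileLength >= 0 && fileLength <= int.MaxValue;
    }
}
```

A 100 MB file passes this check, while the 4 GB file from the introduction does not and would need a stream-based download.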

Seekable Download Streams

ReGrid supports starting downloads at an offset by seeking into part of a large file.

    var opts = new DownloadOptions { Seekable = true };
    using (var stream = bucket.OpenDownloadStream("/video.mp4", options: opts))
    {
        stream.Seek(1024 * 1024 * 20, SeekOrigin.Begin);

        // start reading 20MB into the file...
    }
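
Because files are stored as chunks, a byte offset can be thought of as a chunk position plus an offset inside that chunk. The sketch below shows that arithmetic only. It is one plausible reading under the assumption of fixed-size chunks, not a description of ReGrid's internals. SeekMath and Locate are hypothetical names.

```csharp
using System;

// Hypothetical helper, not part of ReGrid.
public static class SeekMath
{
    // Splits an absolute byte offset into the zero-based index of the chunk
    // that contains it and the offset inside that chunk.
    public static (long ChunkIndex, int OffsetInChunk) Locate(long offset, int chunkSizeBytes)
    {
        if (offset < 0)
            throw new ArgumentOutOfRangeException(nameof(offset));
        if (chunkSizeBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSizeBytes));

        return (offset / chunkSizeBytes, (int)(offset % chunkSizeBytes));
    }
}
```

With an example chunk size of 1 MiB, seeking to 20 MB plus 100 bytes lands in chunk 20 at offset 100, so a reader that works this way would not need to touch the first 20 chunks.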

Delete

By default, ReGrid Soft deletes files. The examples below show how to delete a file in ReGrid:

    var file = bucket.GetFileInfoByName(testfile);

    // Soft delete
    bucket.DeleteRevision(file.Id, mode: DeleteMode.Soft);

    // Hard delete
    bucket.DeleteRevision(file.Id, mode: DeleteMode.Hard);

Remember, multiple uploads to the same file path do not overwrite a file. Uploading files to the same path causes the file to be revisioned. Deleting a file means deleting a revision of that file.

A convenience method, DeleteAllRevisions, deletes file revisions iteratively, one by one. If a failure occurs partway through the deletion, some revisions might still exist, so the file may not appear fully removed from the file system.

Soft deletes simply set the status flag of a FileInfo document. This operation is fast and atomic.

Hard deletes, like Soft deletes, set the status flag of a FileInfo document. However, Hard delete operations also involve deleting multiple documents, and RethinkDB only supports atomic operations per document. A full and complete Hard delete on a logical File and its revision is therefore inherently non-atomic at the physical layer. If a Hard delete operation fails and is left incomplete, the GridUtility class contains operations to clean up and restart partially deleted files.

Recommended Usage: Always use Soft delete to delete files. The space occupied by Soft deleted files and their associated chunks can be reclaimed later using the GridUtility class. If overwrite semantics are desired, delete the original file before uploading a new file to the same path.
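
The difference between the two delete modes can be illustrated with a toy in-memory model. Nothing below is ReGrid code: ToyStatus, ToyRevision, DeleteSketch, and the failAfter parameter are all hypothetical and exist only to show why a single-flag update is all-or-nothing while a multi-document delete can be interrupted partway.

```csharp
using System.Collections.Generic;

// Hypothetical types for illustration only, not part of ReGrid.
public enum ToyStatus { Completed, Deleted }

// Toy stand-in for one file revision: one info record plus its chunks.
public sealed class ToyRevision
{
    public ToyStatus Status = ToyStatus.Completed;
    public List<byte[]> Chunks = new List<byte[]>();
}

public static class DeleteSketch
{
    // Soft delete: a single-record update, so it either happens or it does not.
    public static void SoftDelete(ToyRevision rev)
    {
        rev.Status = ToyStatus.Deleted;
    }

    // Hard delete: flag the record, then remove chunks one at a time.
    // failAfter simulates an interruption after that many chunk removals
    // (-1 means no interruption). Returns true when every chunk was removed.
    public static bool HardDelete(ToyRevision rev, int failAfter = -1)
    {
        rev.Status = ToyStatus.Deleted;
        int removed = 0;
        while (rev.Chunks.Count > 0)
        {
            if (removed == failAfter) return false; // interrupted: chunks remain
            rev.Chunks.RemoveAt(rev.Chunks.Count - 1);
            removed++;
        }
        return true;
    }

    // Cleanup pass: remove the chunks of a revision already flagged as deleted.
    public static void Reclaim(ToyRevision rev)
    {
        if (rev.Status == ToyStatus.Deleted) rev.Chunks.Clear();
    }
}
```

In this model an interrupted HardDelete leaves a revision that is flagged as deleted but still owns chunks, which is the kind of state a later cleanup pass has to resolve. A SoftDelete followed by a later Reclaim reaches the same end state without ever exposing a half-finished delete to readers.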
