Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

Eric's Object Storage System

NotificationsYou must be signed in to change notification settings

meow-watermelon/eoss

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

57 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Intro

Eric's Object Storage Service(EOSS) is my experimental object storage service. I designed EOSS with following features:

  • support object versioning
  • support HTTP HEAD/GET/PUT/DELETE methods for object operations
  • support object writing operation rollback if failure occurs
  • support safe mode
  • support concurrent writing
  • object metadata and object files should be easy to move around

EOSS uses local filesystem as the storage layer.

Technical Details

Components

EOSS has 3 pieces of components: EOSS Service, Metadata Database and Storage Layer. EOSS Service is a Flask application that is written by Python 3. Metadata Database uses SQLite to store the object metadata. Storage Layer is local filesystem or local-mounted NFS share.

Metadata Database Schema

Metadata Database(MDS) uses SQLite as the metadata store.

Column NameTypeDescription
idStringobject name
filenameStringobject original filename
versionStringobject version
sizeIntegerobject size
timestampIntegerlatest updated timestamp
stateIntegerobject writing state

Object Versioning

To support object versioning, the user needs to specify a special string calledversion salt, this version salt will be used to construct the final version string by base64 method. Once the user passes version string by using the headerX-EOSS-Object-Version, EOSS will construct the a string in the following format:

<object filename>:<version salt>:<version string>

then EOSS applies base64 method on the string as the final object name. Here's an example:

object filename: testfile100mversion salt: snoopyversion string: ver1.0=>testfile100m:snoopy:ver1.0=>dGVzdGZpbGUxMDBtOnNub29weTp2ZXIxLjA=

The version salt should not be shared with end users.

Object Writing State

EOSS uses 3 integers to object writing states. In each phase, EOSS will update MDS with proper state integer. When an object is uploading, following phases are triggered:

  • set an exclusive lock on a lock file with a ".lock" suffix in object name in storage layer
  • object writing request initialized
  • object data is saved with a ".temp" suffix in object name in storage layer
  • object file is renamed to the final object name
NumberDescription
1initial state - object writing request initialized
2object is saved on storage layer with a temp suffix
0final state - object is renamed to final object name

Object Writing Operation Rollback

In order to avoid inconsistent state left if uploading is failed, a rollback operation is necessary to ensure all writing is atomic. Rollback feature only exists in uploading function(HTTP PUT method).

If any MDS update or object storing operations go wrong, the rollback function will be triggered. The rollback function will delete the data on the storage layer first then removing the object record from MDS.

If rollback is failed, it's possible the service is still in inconsistent state. But such leftover states would be cleaned up when service restarts.

Object Concurrent Write

EOSS utilizes a simplified Two-Phase Locking mechanism to address concurrent write scenarios. Prior to initiating the actual write, an exclusive lock is placed on a lock file. Instead of waiting for the transaction to complete, EOSS promptly returns an HTTP response code 409 to the client, indicating the detection of concurrent writes and advising the client to retry at a later time.

Safe Mode

Safe Mode is a special operational mode in EOSS. Once Safe Mode is enabled, all READ operations(HTTP HEAD and GET methods) is still working but all WRITE(HTTP PUT and DELETE methods) operations would be failed.

This feature is useful when service administrator needs to maintain the service but not interrupting users to read objects.

HTTP Response Codes

EOSS obeys most standard HTTP response codes. Following table lists EOSS customized HTTP response codes:

HTTP Response CodeTextDescription
409Object Read/Write Conflictread or write on the same object that has exclusive lock
440Object Initialized Onlyobject is just initialized, not ready for serving
441Object Saved Not Closedobject is saved with a temp suffix, not ready for serving
520MDS Connection Failurefailed to connect MDS
521MDS Execution Failurefailed to execute SQL queries on MDS
522MDS Commit Failurefailed to commit transactions to MDS
523EOSS Internal Exception FailureEOSS encounters exceptions that are unable to catch
524EOSS Inconsistent Condition FailureEOSS encounters internal inconsistency issue
525EOSS Safemode EnabledSafe Mode is enabled on EOSS
526EOSS Rollback DoneRollback procedure is successful
527EOSS Rollback FailedRollback procedure is failed

HTTP Endpoints

Each HTTP response from EOSS would have a special header calledX-EOSS-Request-ID which is used to identify source request.

/eoss/v1/object

object endpoint is used for users to handle object operations like download/upload/delete.

HTTP HEAD Method

HTTP HEAD method is used for checking if object exists or not.

Data Flow

Example
$ curl -I http://localhost:4080/eoss/v1/object/testfile100m -H "X-EOSS-Object-Version: ver1.0"HTTP/1.1 200 OKContent-Type: text/html; charset=utf-8Content-Length: 13X-EOSS-Request-ID: 1010f676-5e58-4681-b0cd-bb117d2440fe

HTTP GET Method

HTTP GET method is used for downloading object.

Data Flow

Example
$ curl http://localhost:4080/eoss/v1/object/testfile100m -H "X-EOSS-Object-Version: ver1.0" -o testfile100m.example -v -s*   Trying 127.0.0.1:4080...* Connected to localhost (127.0.0.1) port 4080 (#0)> GET /eoss/v1/object/testfile100m HTTP/1.1> Host: localhost:4080> User-Agent: curl/7.85.0> Accept: */*> X-EOSS-Object-Version: ver1.0> * Mark bundle as not supporting multiuse< HTTP/1.1 200 OK< Content-Disposition: attachment; filename=testfile100m< Content-Type: application/octet-stream< Content-Length: 104857600< Last-Modified: Wed, 12 Apr 2023 04:27:50 GMT< Cache-Control: no-cache< ETag: "1681273670.2883308-104857600-2117080245"< Date: Sun, 23 Apr 2023 18:06:14 GMT< X-EOSS-Request-ID: 75ac9c35-5a03-47fd-a992-3b553585f9a6

HTTP PUT Method

HTTP PUT method is used for uploading object.

If the object exists in the storage already, HTTP PUT will re-write the same object.

Data Flow

Example
$ curl -X PUT -T testfile.100M http://localhost:4080/eoss/v1/object/testfile100m -H "X-EOSS-Object-Version: ver2.0" -v -s*   Trying 127.0.0.1:4080...* Connected to localhost (127.0.0.1) port 4080 (#0)> PUT /eoss/v1/object/testfile100m HTTP/1.1> Host: localhost:4080> User-Agent: curl/7.85.0> Accept: */*> X-EOSS-Object-Version: ver2.0> Content-Length: 104857600> Expect: 100-continue> * Done waiting for 100-continue* We are completely uploaded and fine* Mark bundle as not supporting multiuse< HTTP/1.1 201 CREATED< Content-Type: text/html; charset=utf-8< Content-Length: 15< X-EOSS-Request-ID: a9ae69a8-38f6-4de2-8489-12ed52aa041fObject Uploaded

HTTP DELETE Method

HTTP PUT method is used for deleting object.

Data Flow

Example
$ curl -X DELETE http://localhost:4080/eoss/v1/object/testfile100m -H "X-EOSS-Object-Version: ver2.0" -v -s*   Trying 127.0.0.1:4080...* Connected to localhost (127.0.0.1) port 4080 (#0)> DELETE /eoss/v1/object/testfile100m HTTP/1.1> Host: localhost:4080> User-Agent: curl/7.85.0> Accept: */*> X-EOSS-Object-Version: ver2.0> * Mark bundle as not supporting multiuse< HTTP/1.1 200 OK< Content-Type: text/html; charset=utf-8< Content-Length: 14< X-EOSS-Request-ID: e28ac80c-4771-4aea-97aa-5142d8548889Object Deleted

/eoss/v1/stats

stats endpoint reads MDS and display a summary in JSON format. Following fields are included:

Field NameDescription
total_number_objectstotal number of objects
total_storage_usagetotal storage space usage in byte
youngest_object_updated_timestampyoungest object updated timestamp
oldest_object_updated_timestampoldest object updated timestamp
number_object_upload_inittotal number of objects that are just initialized(state 1)
number_object_saved_in_temp_nametotal number of objects that are saved with a temp suffix(state 2)
number_object_uploadedtotal number if objects that are uploaded that are in finalized state(state 0)
Example
$ curl http://localhost:4080/eoss/v1/stats -s | json_pp {   "number_object_saved_in_temp_name" : 0,   "number_object_upload_init" : 0,   "number_object_uploaded" : 14,   "oldest_object_updated_timestamp" : 1681488651,   "total_number_objects" : 14,   "total_storage_usage" : 1156579328,   "youngest_object_updated_timestamp" : 1681273670}

Installation

  1. Clone the git repository

  2. Modifyconfig_file_path variable insrc/eoss/__init__.py to specify correct location foreoss.yaml.

  3. Modify theconfig/eoss.yaml configuration file with proper settings. The file controls EOSS backend service settings.

VERSION_SALT: object version salt. default value is the stringsnoopy

STROAGE_PATH: file path location to store objects

METADATA_DB_PATH: metadata database file location

METADATA_DB_TABLE: metadata database table name. default value is the stringmetadata

OBJECT_LOCK_PATH: file path location to store lock files of objects

LOGGING_PATH: EOSS service logging path to store log files

LOG_BACKUP_COUNT: logging rotation count. default value is 10

LOG_MAX_BYTES: maximum size of each log file. default value is 1 GB

SAFEMODE: safe mode flag. default value isFalse

  1. Modify theconfig/eoss-uwsgi.ini configuration file with proper settings. This file controls EOSS HTTP service settings.

chdir: file path that pointssrc directory

  1. Go tosrc directory and runbootstrap-env.py command. This command will bootstrap and create metadata database and necessary directories.

  2. Go tosrc directory and runstart.sh to start EOSS service. This script will triggerpre-start.py first to check and clean up EOSS environment then it will bring up the WSGI HTTP service.

Logging

There are 4 logs for EOSS service.

mds_client.log: MDS database operations log

object_client.log: object operations log

eoss.log: EOSS main service log

access.log: WSGI HTTP service log

Access Log Format
<request_id> <latency> <request remote address> <http request method> <http request path> <http response code> <http request user agent>
Access Log Example
2023-04-11 21:27:59,538 61c439e6-8f5d-4764-bea0-0a385c334d3e 1215 127.0.0.1 PUT /eoss/v1/object/testfile100m 200 curl/7.85.0

TODO

  • UTF-8 support
  • Allow users to re-upload the same object without extra object deleting cycle
  • Add more integration testings for the service

Changelog

0.0.1 - initial commit0.0.2 - [bug][issue#1] - fix concurrent write failures issue0.0.3 - [enhancement][issue#2] - support override writing in PUT method0.0.4 - [enhancement][issue#9] - improve data store resilience

This object storage service is my experimental project, please feel free to submit a bug or feature request. Thanks!

About

Eric's Object Storage System

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp