A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.
DataCanvasIO/dingo
DingoDB is a distributed multi-modal vector database. It combines the features of a data lake and a vector database, allowing for the storage of any type of data (key-value, PDF, audio, video, etc.) regardless of its size. Using DingoDB, you can build your own Vector Ocean (the next-generation data architecture following the data warehouse and the data lake, as introduced by DataCanvas). This enables real-time, low-latency analysis of both structured and unstructured data through a single SQL interface.
- Provides comprehensive data storage solutions, accommodating a wide range of data types including but not limited to embeddings, audio files, text, videos, images, PDFs, and annotations.
- Facilitates efficient querying and vector searching with minimal latency through a single SQL interface.
- Employs a hybrid search mechanism that caters to both structured and unstructured data, supporting operations like metadata querying and vector querying.
- Possesses the ability to dynamically ingest data and construct corresponding indexes in real time, promoting operational efficiency.
- MySQL Compatibility: Built upon the Apache Calcite SQL engine, DingoDB is capable of parsing, optimizing, and executing standard SQL statements, and can handle parts of the TPC-H and TPC-DS (see TPC) queries. Compatible with MySQL Shell and the MySQL-JDBC-Driver client, it integrates with web services, BI tools, and more.
- Supports High Frequency Write Operations: DingoDB is designed to handle high-frequency write operations, such as INSERT, UPDATE, and DELETE, while maintaining strong data consistency using the Raft consensus protocol. In the near future, it will also support the Redis protocol, so that you can use redis-cli to access DingoDB RawKV.
- Facilitates Point Queries and Multi-dimensional Analysis Simultaneously: DingoDB can push down expressions to accelerate queries and quickly carry out multi-dimensional analysis with low latency.
- Distributed Storage Capabilities: As a distributed storage engine, DingoDB has the capacity to store vast amounts of data. It allows easy horizontal scaling of clusters as the data volume grows, and is implemented using dingo-store.
- High Data Reliability and Recovery: Based on the Raft distributed consensus protocol, DingoDB can ensure data reliability and recovery in the event of machine or disk failures and offers a swift automatic recovery mechanism.
- Supports Multiple Vector Searches: DingoDB supports vector searches, which are queries that involve vector data types, such as vectors of integers or vectors of floating-point numbers.
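
To make the idea of a hybrid search (a metadata filter combined with a vector query) more concrete, here is a minimal conceptual sketch in plain Java. It is not DingoDB code and does not use any DingoDB API: the `HybridSearchSketch` class, the `Item` type, and the `search` method are hypothetical names introduced only for illustration. The sketch scans every row by brute force, which is an assumption made for brevity and says nothing about how DingoDB indexes or executes vector queries.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only: none of these types belong to DingoDB.
final class HybridSearchSketch {

    /** One stored row: a scalar metadata field plus an embedding vector. */
    static final class Item {
        final long id;
        final String type;
        final float[] embedding;

        Item(long id, String type, float[] embedding) {
            this.id = id;
            this.type = type;
            this.embedding = embedding;
        }
    }

    /** Squared Euclidean (L2) distance between two vectors of equal length. */
    static double squaredL2(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Keeps rows that pass the metadata filter, then returns the ids of the k rows nearest to the query. */
    static List<Long> search(List<Item> items, Predicate<Item> filter, float[] query, int k) {
        return items.stream()
                .filter(filter)
                .sorted(Comparator.comparingDouble((Item it) -> squaredL2(it.embedding, query)))
                .limit(k)
                .map(it -> it.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(1, "pdf", new float[] {0.0f, 0.0f}),
                new Item(2, "audio", new float[] {0.1f, 0.1f}),
                new Item(3, "pdf", new float[] {1.0f, 1.0f}),
                new Item(4, "pdf", new float[] {5.0f, 5.0f}));

        // Structured part: only rows whose type is "pdf". Vector part: the 2 nearest to (0.9, 0.9).
        List<Long> hits = search(items, it -> "pdf".equals(it.type), new float[] {0.9f, 0.9f}, 2);
        System.out.println(hits); // prints [3, 1]
    }
}
```

In a database, the same intent would typically be expressed as one SQL statement that combines a predicate on ordinary columns with a distance-based ordering on a vector column; the exact syntax is described in the DingoDB documentation and is not reproduced here.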
Welcome to DingoDB. The documentation of DingoDB is located on the website: https://dingodb.readthedocs.io. The main DingoDB projects are as follows:
- DingoDB: A Unified SQL Engine to parse and compute for both structured and unstructured data.
- Dingo-Store: A strongly consistent distributed storage system based on the Raft protocol.
- Dingo-Deploy: The deployment project of compute nodes and storage nodes.
The documentation of DingoDB is located on the website: https://dingodb.readthedocs.io or in the docs/ directory of the source code.
We recommend IntelliJ IDEA to develop the DingoDB codebase. Minimal requirements for an IDE are:
- Support for Java
- Support for Gradle
- Create a personal fork of dingo on GitHub.
- Clone the fork on your local machine. Your remote repo on GitHub is called origin.
- Add the original repository as a remote called upstream.
- If you created your fork a while ago, be sure to pull upstream changes into your local repository.
- Create a new branch to work on. Branch from develop.
- Implement/fix your feature, comment your code.
- Follow the Google code style, including indentation.
- If the project has tests, run them!
- Add unit tests that test your new code.
- In general, avoid changing existing tests, as they also make sure the existing public API is unchanged.
- Add or change the documentation as needed.
- Squash your commits into a single commit with git's interactive rebase.
- Push your branch to your fork on GitHub, the remote origin.
- From your fork, open a pull request against the correct branch: target Dingo's develop branch.
- Once the pull request is approved and merged, you can pull the changes from upstream to your local repo and delete your branch.
- Last but not least: Always write your commit messages in the present tense. Your commit message should describe what the commit, when applied, does to the code, not what you did to the code.
The IntelliJ IDE supports Java and Gradle out of the box. Download it at the IntelliJ IDEA website.
DingoDB is sponsored by DataCanvas, a new platform for real-time data science and data processing.
I highly recommend YourKit Java Profiler for any performance-critical application you make.
Check it out at https://www.yourkit.com/
DingoDB is an open-source project licensed under the Apache License, Version 2.0. We welcome any feedback from the community. For any support or suggestions, please contact us.
If you have any technical questions or business needs, please contact us.
Attach the WeChat QR Code


