The goal is to prevent unsoundness from the safe Rust interface, turning as many illegal operations as possible into compile errors and catching the rest at runtime.

Sorry for the mega-PR, but I don't know if it makes sense to split this up since all changes kind of affect everything.

Error handling, type safety

In general: existing methods on `VectorType` become `unsafe fn xxx_unchecked`, and safe variants are added that check all required invariants.
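A minimal sketch of the checked/unchecked split, assuming a toy in-memory index instead of the real FFI call. `ToyIndex`, `IndexError`, `add` and `add_unchecked` are hypothetical names for illustration, not the signatures in this PR.

```rust
/// Hypothetical error type; the actual error enum in the bindings may differ.
#[derive(Debug, PartialEq)]
pub enum IndexError {
    DimensionMismatch { expected: usize, got: usize },
}

/// Toy stand-in for the FFI-backed index: stores vectors in one flat buffer.
pub struct ToyIndex {
    dimensions: usize,
    data: Vec<f32>,
}

impl ToyIndex {
    pub fn new(dimensions: usize) -> Self {
        Self { dimensions, data: Vec::new() }
    }

    /// Unchecked variant: trusts the caller, like a raw FFI call would.
    ///
    /// # Safety
    /// `vector` must point to at least `self.dimensions` readable `f32` values.
    pub unsafe fn add_unchecked(&mut self, vector: *const f32) {
        // A real binding would pass the raw pointer across the FFI boundary here.
        let slice = unsafe { std::slice::from_raw_parts(vector, self.dimensions) };
        self.data.extend_from_slice(slice);
    }

    /// Safe variant: checks the invariant, then delegates to the unchecked one.
    pub fn add(&mut self, vector: &[f32]) -> Result<(), IndexError> {
        if vector.len() != self.dimensions {
            return Err(IndexError::DimensionMismatch {
                expected: self.dimensions,
                got: vector.len(),
            });
        }
        // SAFETY: the slice has exactly `self.dimensions` elements (checked above).
        unsafe { self.add_unchecked(vector.as_ptr()) };
        Ok(())
    }

    /// Number of stored vectors (0 for a zero-dimensional index).
    pub fn len(&self) -> usize {
        self.data.len().checked_div(self.dimensions).unwrap_or(0)
    }
}
```

The safe method returns `Result` even where nothing can fail today, which leaves room for the underlying invariants to change without breaking callers.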

There's some ambiguity as to what is actually invalid/unsafe here. As far as I can tell, using an `Index` created with one `ScalarKind` (say `int8`) as one of a different type (like inserting an `f64` vector) does not actually cause any memory unsafety. It can garble data though, which I'd say is enough to make the operation `unsafe` (pretty common in the ecosystem AFAIK?).

So I've added a check that the casts that will be done on the C++ side make sense, but it would be good to have more opinions on which ones are ok and which ones are almost certainly mistakes. Some are easy (`f32` to `f16` or `bf16` is ok, `f64` to `b1x8` is not, for example), others less so.
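To make the discussion concrete, here is a sketch of what a conservative cast check could look like. The enum and the function `cast_is_plausible` are hypothetical, and the policy for float-to-`i8` casts is an arbitrary placeholder, since that is exactly the kind of case where opinions are needed.

```rust
/// Hypothetical mirror of the scalar kinds mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScalarKind {
    F64,
    F32,
    F16,
    BF16,
    I8,
    B1x8,
}

/// Returns true if converting `from` into `to` is plausibly intentional.
/// Conservative by design: only the obviously nonsensical casts are rejected
/// with confidence; unclear cases are marked as such.
pub fn cast_is_plausible(from: ScalarKind, to: ScalarKind) -> bool {
    use ScalarKind::*;
    match (from, to) {
        // Same kind on both sides: no conversion happens at all.
        (a, b) if a == b => true,
        // Bit-packed binary vectors do not convert meaningfully to or from
        // numeric kinds (e.g. f64 -> b1x8).
        (B1x8, _) | (_, B1x8) => false,
        // Float-to-float casts may lose precision but keep their meaning
        // (e.g. f32 -> f16, f32 -> bf16).
        (F64 | F32 | F16 | BF16, F64 | F32 | F16 | BF16) => true,
        // Float <-> i8 is one of the unclear cases; this sketch rejects it,
        // which is an assumption, not a settled decision.
        _ => false,
    }
}
```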

Safe `Index` views into file, buffer

Currently you can have an `Index` reading a buffer passed in by the user, which will hopefully stay live as long as the `Index`. Though the method is marked unsafe, it's easy to create a dangling pointer situation, which is unfortunate in Rust of all places. Also, these immutable `Index` views will throw exceptions if any mutating methods are called on them.

My proposed solution is to separate the regular, mutable `Index` that owns its memory and the immutable `IndexView` into different types. This way `IndexView` can be associated with a lifetime so the backing buffer can't be dropped while it's being used, and the `IndexView` object (holding a pointer) cannot be moved between threads.

While we're at it, splitting mutating and readonly methods on `Index` into different traits allows `IndexView` to only expose read methods and prevent some runtime exceptions this way.
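A rough sketch of the type split, using toy types rather than the real bindings. The trait names `IndexRead` and `IndexWrite`, the constructor `from_buffer`, and the `f32`-slice buffer are all assumptions made for illustration.

```rust
use std::marker::PhantomData;

/// Read-only operations, available on both owned indexes and views.
pub trait IndexRead {
    fn len(&self) -> usize;
    fn get(&self, i: usize) -> Option<f32>;
}

/// Mutating operations, only implemented for the owning type.
pub trait IndexWrite: IndexRead {
    fn push(&mut self, value: f32);
}

/// Owns its memory, so it needs no lifetime parameter.
#[derive(Default)]
pub struct Index {
    values: Vec<f32>,
}

impl Index {
    pub fn new() -> Self {
        Self::default()
    }
}

/// Borrows a user buffer. The raw pointer makes the type `!Send` and `!Sync`,
/// and `PhantomData` ties it to the buffer's lifetime `'a`.
pub struct IndexView<'a> {
    ptr: *const f32,
    len: usize,
    _buffer: PhantomData<&'a [f32]>,
}

impl<'a> IndexView<'a> {
    pub fn from_buffer(buffer: &'a [f32]) -> Self {
        Self { ptr: buffer.as_ptr(), len: buffer.len(), _buffer: PhantomData }
    }
}

impl IndexRead for Index {
    fn len(&self) -> usize {
        self.values.len()
    }
    fn get(&self, i: usize) -> Option<f32> {
        self.values.get(i).copied()
    }
}

impl IndexWrite for Index {
    fn push(&mut self, value: f32) {
        self.values.push(value);
    }
}

impl IndexRead for IndexView<'_> {
    fn len(&self) -> usize {
        self.len
    }
    fn get(&self, i: usize) -> Option<f32> {
        if i < self.len {
            // SAFETY: `i` is in bounds and `'a` keeps the buffer alive.
            Some(unsafe { *self.ptr.add(i) })
        } else {
            None
        }
    }
}

/// Generic code can accept either type through the read-only trait.
pub fn sum<I: IndexRead>(index: &I) -> f32 {
    (0..index.len()).filter_map(|i| index.get(i)).sum()
}
```

With this shape, calling `push` on an `IndexView` does not compile because the view never implements `IndexWrite`, and dropping the buffer while a view exists is rejected by the borrow checker.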

Custom metric closure shenanigans

This part could maybe be split into a different PR. First, a lot of duplicated code can be written in a generic way, which is always nice. There's also a memory leak pointed out in #629 (fixed by properly dropping the currently set metric closure when it's overwritten).
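The leak fix boils down to reclaiming the previous closure before its raw pointer is overwritten. A self-contained sketch of that idea follows; `MetricSlot`, `change_metric` on it, and `DropCounter` are hypothetical stand-ins, and the closure signature is simplified to two scalars.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

type Metric = Box<dyn Fn(f32, f32) -> f32>;

/// Holds the currently set metric as a raw (thin) pointer, the way an FFI
/// layer would after handing the address to C++.
pub struct MetricSlot {
    raw: *mut Metric,
}

impl MetricSlot {
    pub fn new() -> Self {
        Self { raw: std::ptr::null_mut() }
    }

    /// Installs a new metric and frees the one it replaces.
    pub fn change_metric(&mut self, metric: Metric) {
        let new_raw = Box::into_raw(Box::new(metric));
        let old_raw = std::mem::replace(&mut self.raw, new_raw);
        if !old_raw.is_null() {
            // SAFETY: `old_raw` came from `Box::into_raw` in an earlier call
            // and has not been freed. Skipping this step is the leak.
            drop(unsafe { Box::from_raw(old_raw) });
        }
    }

    pub fn call(&self, a: f32, b: f32) -> Option<f32> {
        // SAFETY: a non-null `raw` always points to a live `Metric`.
        unsafe { self.raw.as_ref() }.map(|metric| metric(a, b))
    }
}

impl Drop for MetricSlot {
    fn drop(&mut self) {
        if !self.raw.is_null() {
            // SAFETY: same provenance argument as in `change_metric`.
            drop(unsafe { Box::from_raw(self.raw) });
        }
    }
}

/// Test helper: counts how many times a captured value has been dropped.
pub struct DropCounter(pub Arc<AtomicUsize>);

impl Drop for DropCounter {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}
```

Capturing a `DropCounter` in the first closure and then overwriting the metric is one way to write a regression test: the counter should read 1 right after the second `change_metric` call.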

I only noticed that because I wanted to see how ergonomics could be improved here. Currently custom metrics are boxed closures taking `*const T` arguments, which would be very nice to instead turn into `&[T]` slices of correct length/dimensionality. Also, closures are currently double-boxed because `Box<dyn Fn..>` is a wide pointer, and `metric_punned_t` wants a regular pointer, so there are three (if I'm counting correctly) pointer dereferences per call to the metric function (trampoline -> pointer to wide pointer -> dynamic dispatch for `dyn Fn` trait object).

I think it's possible to add a shim in there to turn the pointers into slices without adding another indirection (currently `closure_address` is a pointer to the `Box<dyn>`, when it could instead point to the trait object/wide pointer and the trampoline casts that into a valid trait object reference). But I'm not sure if it's worth it when you could also just allow the user a regular `fn(&[T], &[T]) -> Distance` pointer, which avoids a lot of that and covers many (most?) use cases just fine.
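For comparison, here is a sketch of a third option that is not what the current bindings do: a trampoline monomorphized per closure type, which receives a thin pointer to a state struct and rebuilds `&[f32]` slices before calling the closure with static dispatch. The callback shape `(a, b, state)`, and the names `RegisteredMetric`, `MetricState`, `trampoline` and `free_state` are all assumptions for illustration; the real `metric_punned_t` interface may look different.

```rust
use std::ffi::c_void;

/// State behind the thin pointer: the closure plus the vector length.
struct MetricState<F> {
    dimensions: usize,
    metric: F,
}

/// Hypothetical C-compatible callback shape.
type RawMetric = unsafe extern "C" fn(*const f32, *const f32, *mut c_void) -> f32;

/// One trampoline instance is generated per closure type `F`, so no
/// `dyn Fn` and no second box are involved.
unsafe extern "C" fn trampoline<F>(a: *const f32, b: *const f32, state: *mut c_void) -> f32
where
    F: Fn(&[f32], &[f32]) -> f32,
{
    // SAFETY: `state` was created from a `Box<MetricState<F>>` with this `F`,
    // and the caller passes pointers to `dimensions` readable values.
    let state = unsafe { &*(state as *const MetricState<F>) };
    let a = unsafe { std::slice::from_raw_parts(a, state.dimensions) };
    let b = unsafe { std::slice::from_raw_parts(b, state.dimensions) };
    (state.metric)(a, b)
}

/// Frees the type-erased state; also monomorphized per closure type.
unsafe fn free_state<F>(state: *mut c_void) {
    drop(unsafe { Box::from_raw(state as *mut MetricState<F>) });
}

pub struct RegisteredMetric {
    dimensions: usize,
    call: RawMetric,
    state: *mut c_void,
    free: unsafe fn(*mut c_void),
}

impl RegisteredMetric {
    pub fn new<F>(dimensions: usize, metric: F) -> Self
    where
        F: Fn(&[f32], &[f32]) -> f32 + 'static,
    {
        let state = Box::into_raw(Box::new(MetricState { dimensions, metric })) as *mut c_void;
        Self { dimensions, call: trampoline::<F>, state, free: free_state::<F> }
    }

    /// Stand-in for the C++ side invoking the callback with raw pointers.
    pub fn invoke(&self, a: &[f32], b: &[f32]) -> Option<f32> {
        if a.len() != self.dimensions || b.len() != self.dimensions {
            return None;
        }
        // SAFETY: both slices have `dimensions` elements; `state` is live.
        Some(unsafe { (self.call)(a.as_ptr(), b.as_ptr(), self.state) })
    }
}

impl Drop for RegisteredMetric {
    fn drop(&mut self) {
        // SAFETY: `state` was produced by `Box::into_raw` in `new` for the
        // same closure type that `free` was instantiated with.
        unsafe { (self.free)(self.state) };
    }
}
```

Per call this costs one data pointer dereference plus a direct call, at the price of generating one trampoline per closure type. Whether that trade-off is worth it compared to a plain `fn` pointer is still an open question.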

FFI and closures are a difficult topic though, so it would be cool for others to weigh in and check that what I'm saying is correct.

tnibler added 2 commits

October 28, 2025 12:48

DRAFT: error handling, safe Rust methods

5e7ba0f

Add Sized bound to VectorType trait, mark trait unsafe

0b0f9b5

Every method has a Sized bound anyway, so might as well put it a level higher. Also mark VectorType unsafe, since implementors must follow safety invariants, or else the entire program becomes unsound.
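A small sketch of the shape this commit describes, with a simplified trait. The associated constant `KIND_NAME` and the default method `check_len` are hypothetical; only the `unsafe trait` marker and the trait-level `Sized` bound come from the commit message.

```rust
/// # Safety
/// Implementors promise that the metadata they report (here `KIND_NAME`)
/// matches the in-memory layout of `Self`, because unsafe code on the FFI
/// side relies on it without further checks.
pub unsafe trait VectorType: Sized {
    const KIND_NAME: &'static str;

    /// Shared safety check with a default body. Because `Sized` sits on the
    /// trait itself, `&[Self]` needs no per-method `where Self: Sized`.
    fn check_len(vector: &[Self], dimensions: usize) -> Result<(), String> {
        if vector.len() == dimensions {
            Ok(())
        } else {
            Err(format!(
                "expected {} {} values, got {}",
                dimensions,
                Self::KIND_NAME,
                vector.len()
            ))
        }
    }
}

// SAFETY: the kind names match the actual element types.
unsafe impl VectorType for f32 {
    const KIND_NAME: &'static str = "f32";
}

unsafe impl VectorType for i8 {
    const KIND_NAME: &'static str = "i8";
}
```

Writing `unsafe impl` makes each implementor explicitly take responsibility for the invariants, while the default method means the check is written only once.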

tnibler force-pushed the safe-unsafe-rust branch from b558507 to a69e121

October 28, 2025 21:08

ashvardanian commented Nov 1, 2025

Thanks for the great suggestions, @tnibler! I'm fine with mega-PRs staged for major library rewrites. Assuming we are targeting v3, it will take some time to fully rewrite the lib, but I'll keep you posted there. Your suggestions are invaluable 🤗

Before USearch v3, I'll be releasing ForkUnion v3 and UCall v1. The former has Rust bindings as well, in case you have opinions on better parallel/concurrent abstractions for Rust 🦀

ashvardanian commented Nov 1, 2025

P.S.: I'd love to include your entire commit history and thought process around different abstractions. It would be great if we could follow the same commit naming structure shared across the history of all projects I maintain, to be compatible with TinySemVer automation. Let's prefix the commits with intent verbs, like `Add:`, `Improve:`, `Fix:`, `Docs:`, `Chore:`, or `Break:` 🤗

ashvardanian added the v3 (Breaking changes planned for v3) label

Nov 1, 2025

ashvardanian marked this pull request as ready for review

November 1, 2025 13:03

ashvardanian reviewed

Nov 1, 2025

tnibler commented Nov 1, 2025 (edited)

Oh yes, you're right, the commits are a mess. None of this is final; I was just looking for input and agreement/disagreement on the rough direction. I'll rebase things into a neat order :)

The rough thought process:

- If some methods need/want safe wrappers with error checks around the FFI call, it doesn't really cost anything to have the safe/unsafe `_unchecked` pair for all/most other methods as well, even if they don't do anything special. That way, changes to requirements/invariants for the C++ API can happen without breaking the entire safe Rust API, since everything returns `Result` and users have to handle errors. It just gives a bit more flexibility for semver.
- The safety checks are also mostly the same for all `VectorType` implementors, so just one default implementation in the trait covers all of them, which is nice.
- The readonly/read-write distinction at the type/trait level with `Index` and `IndexView` prevents illegal mutation calls on an immutable `Index` at compile time.
- Separate `Index` and `IndexView` types also allow tracking the lifetime of the backing buffer for mmapped/in-memory views without adding a lifetime parameter to `Index`, which would be really annoying.

Re: casting/number type conversion checks, they're just nice to have. They have to be conservative and really only prevent the obviously nonsensical casts.

tnibler added 5 commits

November 1, 2025 15:49

Use generic change_metric_unchecked method, remove copy-paste impls

7612efa

Add (failing) test for memory leak in Index::change_metric

3f585d0

Fix leaked metric closure in change_metric

e8eb793

Mark MetricFunctionPtr as doc(hidden)

5532ef1

It has to be pub since it's exposed through the public VectorType trait, but details of how FFI calls, trampolines etc. are handled probably shouldn't be part of the public API.

Move mutating/readonly Index methods into separate traits

18c786b

This allows us to remove the unsafe `view_from_buffer` function by introducing a new IndexView type that is associated with the buffer's lifetime. It also prevents invalid usage of immutable Index instances, which would throw C++ exceptions if e.g., add() was called on an immutable view.

tnibler force-pushed the safe-unsafe-rust branch from a69e121 to 18c786b

November 1, 2025 14:49

Breaking revamp of Rust bindings for v3 #670

tnibler commented Oct 28, 2025
