- Notifications
You must be signed in to change notification settings - Fork852
feat: add tag DDL support (CREATE/DROP/SHOW only)#19109
feat: add tag DDL support (CREATE/DROP/SHOW only)#19109TCeason merged 12 commits intodatabendlabs:mainfrom
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When yousign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Uh oh!
There was an error while loading.Please reload this page.
1bfe31d tob8d0aa8CompareCodex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
5f51fe4 to7ce183aCompare
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@drmingdrmer reviewed 5 of 43 files at r1.
Reviewable status: 5 of 44 files reviewed, 3 unresolved discussions (waiting on@TCeason)
src/meta/api/src/tag_api.rs line 99 at r2 (raw file):
}), CreateOption::CreateOrReplace => unreachable!("CreateOrReplace handled above"), },
CreateOption should not be part of the request, it does affect the meta-api behavior. it just affect how to return an error. move these logic to the caller, or caller's caller, whoever that cares about the return value.
Code quote:
Err(existent) => match req.create_option { CreateOption::Create => Err(TagError::already_exists( req.name_ident.tag_name().to_string(), )), CreateOption::CreateIfNotExists => Ok(CreateTagReply { tag_id: *existent.data, }), CreateOption::CreateOrReplace => unreachable!("CreateOrReplace handled above"), },src/meta/api/src/tag_api.rs line 136 at r2 (raw file):
} Ok(Ok(())) }
this check-and-remove is not transactional.
Code quote:
async fn drop_tag(&self, req: DropTagReq) -> Result<Result<(), TagError>, MetaTxnError> { debug!(req :? =(&req); "SchemaApi: {}", func_name!()); let references = collect_tag_references(self,&ListTagReferencesReq { tenant: req.name_ident.tenant().clone(), tag_name: Some(req.name_ident.tag_name().to_string()), object_type: None, }) .await .map_err(MetaTxnError::from)?; if !references.is_empty() { return Ok(Err(TagError::has_references( req.name_ident.tag_name().to_string(), references.len(), ))); } // NOTE: The reference check above and the delete operation below are not transactional. // A concurrent SET/UNSET TAGS call may introduce new references in between, which is an // acceptable trade-off for the initial version of tag DDL support. let removed = self.remove_id_value(&req.name_ident, |_| vec![]).await?; if removed.is_none()&& !req.if_exists { return Ok(Err(TagError::not_found( req.name_ident.tag_name().to_string(), "drop_tag", ))); } Ok(Ok(())) }
src/meta/protos/proto/tag.proto line 36 at r2 (raw file):
string value = 2; string created_on = 3;}
Add doc comment to explain the fields
Code quote:
messageTagMeta {uint64ver=100;uint64min_reader_ver=101;repeatedstringallowed_values=1;stringcomment=2;stringcreated_on=3;optionalstringupdated_on=4;}messageTagRefValue {uint64ver=100;uint64min_reader_ver=101;uint64tag_id=1;stringvalue=2;stringcreated_on=3;}
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Reviewable status: 5 of 44 files reviewed, 3 unresolved discussions (waiting on@drmingdrmer)
src/meta/api/src/tag_api.rs line 99 at r2 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
CreateOption should not be part of the request, it does affect the meta-api behavior. it just affect how to return an error. move these logic to the caller, or caller's caller, whoever that cares about the return value.
Done.
src/meta/api/src/tag_api.rs line 136 at r2 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
this check-and-remove is not transactional.
Done.
src/meta/protos/proto/tag.proto line 36 at r2 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Add doc comment to explain the fields
Done.
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Reviewable status: 4 of 44 files reviewed, 3 unresolved discussions (waiting on@drmingdrmer)
src/meta/api/src/tag_api.rs line 136 at r2 (raw file):
Previously, TCeason wrote…
Done.
Add tag <-> object ref make drop tag transactional.
45f9967 toc5b2a21Compare
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Reviewable status: 4 of 45 files reviewed, 1 unresolved discussion (waiting on@TCeason)
src/meta/api/src/tag_api.rs line 145 at r6 (raw file):
req.name_ident.tag_name().to_string(), "drop_tag", )));
again, if_exists does not affect the tag-api behavior thus it should be done in the caller.
Code quote:
if req.if_exists { return Ok(Ok(())); } return Ok(Err(TagError::not_found( req.name_ident.tag_name().to_string(), "drop_tag", )));f94f4e7 to7a053e2Compare
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 1 comment.
Reviewable status: 4 of 47 files reviewed, 1 unresolved discussion (waiting on@drmingdrmer).
src/meta/api/src/tag_api.rs line 145 at r6 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
again, if_exists does not affect the tag-api behavior thus it should be done in the caller.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When yousign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Uh oh!
There was an error while loading.Please reload this page.
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
you have still been making the mistakes I've pointed out many times in your previous PRs. This PR is literally not ready for review at all. Please re-read each comment I gave you in previous PRs and learn.
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 8 comments.
Reviewable status: 10 of 48 files reviewed, 9 unresolved discussions (waiting on@drmingdrmer).
src/meta/api/src/tag_api.rs line 136 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
If using
txn_cond_eq_keys_with_prefix()to create a condition, you do not need to list them at all.
Done. And I refactor them to check failed stats.
src/meta/app/src/schema/mod.rs line 160 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
import each type one by one. Avoid using
*
Done.
src/meta/app/src/schema/tag/error.rs line 20 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
why not use
UnknownError: which allows to create fromTIdent::unknown_error()
This error used in set_object_tag.
- this function only accpect tag_id, does not have tag name;UnknownError need a TIdent but TagIdIdent only an internal kv key. It's not good to expose it to users.
- The mapping of TagError → ErrorCode::UnknownTag is explicitly defined (the same as impl From for ErrorCode at the bottom of the file). The client can stably obtain an SQL-level error like "UnknownTag", while UnknownError will merely return debug information such as "Unknown __fd_tag_by_id/tenant/xxxx". Moreover, we should not include tenant information in the error.
src/meta/app/src/schema/tag/error.rs line 38 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
None of these three errors explicitly explained: 1: what happened, 2: what to do next.
HasReferencesjust saying about a valid state.Invalidis too generic.ConcurrentModificationdoes not explain why it is an error.
About ConcurrentModification I think this error msg explain the reason for the failure and informs that a retry is needed
src/meta/app/src/schema/tag/mod.rs line 89 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
use
SeqV<T>to include the seq num for updating
It's a retrun value about function:
#[logcall::logcall]#[fastrace::trace]asyncfnlist_tags(&self,tenant:&Tenant) ->Result<Vec<TagInfo>,MetaError>{debug!(tenant:? =(tenant);"SchemaApi: {}", func_name!());let tag_name_key =TagNameIdent::new(tenant,"dummy");let tag_dir =DirName::new(tag_name_key);let tags =self.list_id_value(&tag_dir).await?.map(|(name_ident, tag_id, meta_seqv)|TagInfo{name: name_ident.tag_name().to_string(),tag_id:*tag_id,meta: meta_seqv,}).collect();Ok(tags)}
It can add SeqV but we need to modify the base functionlist_id_value
/// list by name prefix, returns a list of `(name, id, value)`////// Returns an iterator of `(name, id, SeqV<value>)` tuples./// This function list all the ids by a name prefix,/// then get the `id->value` mapping and finally zip these two together.// Using `async fn` does not allow using `impl Iterator` in the return type.// Thus we use `impl Future` instead.fn list_id_value(&self,prefix:&DirName<K>,) ->implFuture<Output =Result<implIterator<Item =(K,DataId<IdRsc>,SeqV<IdRsc::ValueType>)>,MetaError,>,> +Send{
src/meta/app/src/schema/tag/mod.rs line 153 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
do not use the name
Object. It explains nothing.
Done. And I will modify otherobject in GrantObject or OwnershipObject
src/meta/protos/proto/tag.proto line 28 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
if it is a ordered list, then it should be a map instead of a vec.
Why??
https://protobuf.dev/programming-guides/encoding/
Thus, maps are encoded almost exactly like a repeated message field: as a sequence of LEN-typed records, with two fields each. The exception is that order is not guaranteed to be preserved with maps during serialization.src/meta/protos/proto/tag.proto line 44 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
bad name.
valueexplained nothing.
Done.
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@drmingdrmer partially reviewed 11 files, made 8 comments, and resolved 3 discussions.
Reviewable status: 16 of 48 files reviewed, 10 unresolved discussions (waiting on@TCeason).
src/meta/app/src/schema/tag/error.rs line 20 at r11 (raw file):
Previously, TCeason wrote…
This error used in set_object_tag.
- this function only accpect tag_id, does not have tag name;UnknownError need a TIdent but TagIdIdent only an internal kv key. It's not good to expose it to users.
- The mapping of TagError → ErrorCode::UnknownTag is explicitly defined (the same as impl From for ErrorCode at the bottom of the file). The client can stably obtain an SQL-level error like "UnknownTag", while UnknownError will merely return debug information such as "Unknown __fd_tag_by_id/tenant/xxxx". Moreover, we should not include tenant information in the error.
TIdent::unknown_error() is the UnknownTagId error, right?
src/meta/app/src/schema/tag/error.rs line 38 at r11 (raw file):
Previously, TCeason wrote…
About ConcurrentModification I think this error msg explain the reason for the failure and informs that a retry is needed
The error should explain what's the resource that has a race condition causing this error, but not a genericConcurrent. It expressed nothing for the user to figure out what was going wrong. The resource should be a TagName, TagId, TagMeta, or a Tag referrer or referee.
src/meta/app/src/schema/tag/mod.rs line 89 at r11 (raw file):
Previously, TCeason wrote…
It's a retrun value about function:
#[logcall::logcall]#[fastrace::trace]asyncfnlist_tags(&self,tenant:&Tenant) ->Result<Vec<TagInfo>,MetaError>{debug!(tenant:? =(tenant);"SchemaApi: {}", func_name!());let tag_name_key =TagNameIdent::new(tenant,"dummy");let tag_dir =DirName::new(tag_name_key);let tags =self.list_id_value(&tag_dir).await?.map(|(name_ident, tag_id, meta_seqv)|TagInfo{name: name_ident.tag_name().to_string(),tag_id:*tag_id,meta: meta_seqv,}).collect();Ok(tags)}It can add SeqV but we need to modify the base function
list_id_value/// list by name prefix, returns a list of `(name, id, value)`////// Returns an iterator of `(name, id, SeqV<value>)` tuples./// This function list all the ids by a name prefix,/// then get the `id->value` mapping and finally zip these two together.// Using `async fn` does not allow using `impl Iterator` in the return type.// Thus we use `impl Future` instead.fn list_id_value(&self,prefix:&DirName<K>,) ->implFuture<Output =Result<implIterator<Item =(K,DataId<IdRsc>,SeqV<IdRsc::ValueType>)>,MetaError,>,> +Send{
It looks it requires a modification on thelist_id_value to support returning a SeqV.
src/meta/protos/proto/tag.proto line 28 at r11 (raw file):
Previously, TCeason wrote…
Why??
https://protobuf.dev/programming-guides/encoding/
Thus, maps are encoded almost exactly like a repeated message field: as a sequence of LEN-typed records, with two fields each. The exception is that order is not guaranteed to be preserved with maps during serialization.
Because you are gonna have to access it in a map-like way!!
The only usage of it is to check if a string in it or not. so it is a map!!
Do want to iterate over all of the 100 values in the vector to check if one string is in it or not??
src/meta/api/src/tag_api.rs line 271 at r13 (raw file):
// Tag metadata was modified concurrently (e.g., allowed_values changed) Ok(Err(TagError::concurrent_modification())) }
Why did you retry in the transaction loop, but just return the error at once?
Code quote:
let (succ, _) = send_txn(self, TxnRequest::new(txn_conditions, txn_ops)).await?; if succ { Ok(Ok(())) } else { // Tag metadata was modified concurrently (e.g., allowed_values changed) Ok(Err(TagError::concurrent_modification())) }src/meta/api/src/tag_api.rs line 317 at r13 (raw file):
let refs_dir = DirName::new(obj_ref_key); let stream = self.list_pb(&refs_dir).await?; let entries = stream.try_collect::<Vec<_>>().await?;
You don't need to collect the result at once. but instead, map the stream item into the format you want and finally do one collect. use StreamExt::map_ok()
src/meta/app/src/schema/tag/mod.rs line 53 at r13 (raw file):
/// (databases, tables, stages, connections) for governance and classification.#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, Eq)]pub struct TagMeta {
Most**Meta does not need to beserde because they are only serialized in protobuf for being written to meta-service. Check every line of your code you have written, including the attribute declaration make sure every statement does make sense.
Code quote:
#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, Eq)]pub struct TagMeta {src/meta/app/src/schema/tag/mod.rs line 73 at r13 (raw file):
/// Response from creating a tag.#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, Eq)]
serde needed?
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 8 comments.
Reviewable status: 16 of 48 files reviewed, 10 unresolved discussions (waiting on@drmingdrmer).
src/meta/api/src/tag_api.rs line 271 at r13 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Why did you retry in the transaction loop, but just return the error at once?
Done.
src/meta/api/src/tag_api.rs line 317 at r13 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
You don't need to collect the result at once. but instead, map the stream item into the format you want and finally do one collect. use StreamExt::map_ok()
Done.
src/meta/app/src/schema/tag/error.rs line 20 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
TIdent::unknown_error() is the UnknownTagId error, right?
#[logcall::logcall]#[fastrace::trace]asyncfn set_object_tags(&self,req:SetObjectTagsReq,) ->Result<Result<(),TagError>,MetaError>{ ...for(((tag_id, tag_value), meta_opt), tag_meta_key) in req.tags.iter().zip(tag_metas).zip(tag_meta_keys){// Verify tag existsletSome(meta_seqv) = meta_optelse{returnOk(Err(TagError::not_found(*tag_id)));};// Check allowed_values: None means any value is allowedifletSome(allowed) =&meta_seqv.data.allowed_values{if !allowed.contains(tag_value){returnOk(Err(TagError::not_allowed_value(*tag_id, tag_value.clone(),Some(allowed.clone()),)));}} ...}
- Don’t want this error to leak the tenant ID.
- And it can be map to ErrorCode::UnknownTag.
- TagError also can map not_allowed_value Error
src/meta/app/src/schema/tag/error.rs line 38 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
The error should explain what's the resource that has a race condition causing this error, but not a generic
Concurrent. It expressed nothing for the user to figure out what was going wrong. The resource should be a TagName, TagId, TagMeta, or a Tag referrer or referee.
Done. TagMetaConcurrentModification
src/meta/app/src/schema/tag/mod.rs line 89 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
It looks it requires a modification on the
list_id_valueto support returning a SeqV.
Done.
src/meta/app/src/schema/tag/mod.rs line 53 at r13 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Most
**Metadoes not need to beserdebecause they are only serialized in protobuf for being written to meta-service. Check every line of your code you have written, including the attribute declaration make sure every statement does make sense.
Done.
src/meta/app/src/schema/tag/mod.rs line 73 at r13 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
serde needed?
Done.
src/meta/protos/proto/tag.proto line 28 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Because you are gonna have to access it in a map-like way!!
The only usage of it is to check if a string in it or not. so it is a map!!
Do want to iterate over all of the 100 values in the vector to check if one string is in it or not??
Keep aVec here to preserve declaration order so future ON_CONFLICT semantics can support ALLOWED_VALUES_SEQUENCE behavior. In short need to keep order .
- Example: CREATE TAG env_tag ALLOWED_VALUES ('blue','red','green') PROPAGATE= ON_DEPENDENCY ON_CONFLICT= ALLOWED_VALUES_SEQUENCE;-- Suppose a view inherits env_tag from two base tables, one tagged env_tag = 'red' and the other env_tag = 'blue'. Because 'blue' appears before 'red' in the allowed list, the view’s final env_tag value becomes 'blue'. If the conflict involved 'green' and 'red', 'red' would win, and so on.
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 1 comment.
Reviewable status: 13 of 53 files reviewed, 10 unresolved discussions (waiting on@drmingdrmer).
src/meta/app/src/schema/tag/error.rs line 20 at r11 (raw file):
Previously, TCeason wrote…
#[logcall::logcall]#[fastrace::trace]asyncfn set_object_tags(&self,req:SetObjectTagsReq,) ->Result<Result<(),TagError>,MetaError>{ ...for(((tag_id, tag_value), meta_opt), tag_meta_key) in req.tags.iter().zip(tag_metas).zip(tag_meta_keys){// Verify tag existsletSome(meta_seqv) = meta_optelse{returnOk(Err(TagError::not_found(*tag_id)));};// Check allowed_values: None means any value is allowedifletSome(allowed) =&meta_seqv.data.allowed_values{if !allowed.contains(tag_value){returnOk(Err(TagError::not_allowed_value(*tag_id, tag_value.clone(),Some(allowed.clone()),)));}} ...}
- Don’t want this error to leak the tenant ID.
- And it can be map to ErrorCode::UnknownTag.
- TagError also can map not_allowed_value Error
- An error should not contain anytenant info in its
to_string()result.
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'd appreciate it if you could consider my suggestion more carefully before responding in a defensive attitude.
@drmingdrmer reviewed 9 files and all commit messages, made 5 comments, and resolved 6 discussions.
Reviewable status: 22 of 53 files reviewed, 5 unresolved discussions (waiting on@TCeason).
src/meta/api/src/tag_api.rs line 271 at r13 (raw file):
Previously, TCeason wrote…
Done.
It's not done yet. If there is an error, why don't you just retry?
src/meta/app/src/schema/tag/error.rs line 20 at r11 (raw file):
Previously, TCeason wrote…
#[logcall::logcall]#[fastrace::trace]asyncfn set_object_tags(&self,req:SetObjectTagsReq,) ->Result<Result<(),TagError>,MetaError>{ ...for(((tag_id, tag_value), meta_opt), tag_meta_key) in req.tags.iter().zip(tag_metas).zip(tag_meta_keys){// Verify tag existsletSome(meta_seqv) = meta_optelse{returnOk(Err(TagError::not_found(*tag_id)));};// Check allowed_values: None means any value is allowedifletSome(allowed) =&meta_seqv.data.allowed_values{if !allowed.contains(tag_value){returnOk(Err(TagError::not_allowed_value(*tag_id, tag_value.clone(),Some(allowed.clone()),)));}} ...}
- Don’t want this error to leak the tenant ID.
- And it can be map to ErrorCode::UnknownTag.
- TagError also can map not_allowed_value Error
You don't need to link the tenant I D to any error because this error will be converted to error code. So at that step, you can discard some sensitive information.
src/meta/api/src/tag_api.rs line 158 at r12 (raw file):
))); } }
I'm not sure if you can remove this error. If there are references, this function will loop forever.
Code quote:
return Ok(Ok(Some((id_seqv, meta_seqv)))); } else { let reference_count = self.list_pb_vec(&refs_dir).await?.len(); if reference_count > 0 { return Ok(Err(TagError::tag_has_references( name_ident.tag_name().to_string(), reference_count, ))); } }Uh oh!
There was an error while loading.Please reload this page.
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 2 comments.
Reviewable status: 19 of 53 files reviewed, 5 unresolved discussions (waiting on@drmingdrmer).
src/meta/api/src/tag_api.rs line 271 at r13 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
It's not done yet. If there is an error, why don't you just retry?
emm , retry is still needed because it can mask the influence of some meta errors.
src/meta/app/src/schema/tag/error.rs line 20 at r11 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
You don't need to link the tenant I D to any error because this error will be converted to error code. So at that step, you can discard some sensitive information.
ok, you mean this?
#[derive(Clone,Debug, thiserror::Error,PartialEq,Eq)]pubenumTagError{#[error(transparent)]UnknownTagId(#[from]UnknownError<TagIdResource,TagId>),
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@drmingdrmer reviewed 5 files and all commit messages, made 1 comment, and resolved 3 discussions.
Reviewable status: 23 of 53 files reviewed, 3 unresolved discussions (waiting on@TCeason).
src/meta/app/src/schema/tag/error.rs line 26 at r17 (raw file):
pub enum TagError { #[error(transparent)] UnknownTagId(#[from] UnknownError<TagIdResource, TagId>),
do you really include this UnknownError in TagError?
Code quote:
pub enum TagError { #[error(transparent)] UnknownTagId(#[from] UnknownError<TagIdResource, TagId>),
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 2 comments.
Reviewable status: 23 of 53 files reviewed, 3 unresolved discussions (waiting on@drmingdrmer).
src/meta/api/src/tag_api.rs line 271 at r13 (raw file):
Previously, TCeason wrote…
emm , retry is still needed because it can mask the influence of some meta errors.
The condition is txn_cond_eq_seq(&tag_meta_key, meta_seqv.seq). If this condition is not met, it will cause succ to be false, and at this point, it can fail directly.
src/meta/app/src/schema/tag/error.rs line 26 at r17 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
do you really include this UnknownError in TagError?
Yes, I need to include UnknownError in TagMetaError. Because:
set_object_tags contains three errors: 1. The tag does not exist. 2. Whether the value is valid. 3. The tag meta has been modified
- set_object_tags now perform an optimistic retry with txn_backoff to keep allowed_values consistent with the latest tag meta.- remove the TagMetaConcurrentModification errors and corresponding exception code, instead retry logic processing.
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 2 comments.
Reviewable status: 20 of 53 files reviewed, 3 unresolved discussions (waiting on@drmingdrmer).
src/meta/api/src/tag_api.rs line 158 at r12 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
I'm not sure if you can remove this error. If there are references, this function will loop forever.
If succ==false will check has reference
src/meta/api/src/tag_api.rs line 271 at r13 (raw file):
Previously, TCeason wrote…
The condition is txn_cond_eq_seq(&tag_meta_key, meta_seqv.seq). If this condition is not met, it will cause succ to be false, and at this point, it can fail directly.
Now I get you.
/// Attach multiple tags to an object in a single transaction./// Validates tag existence and allowed_values constraints.////// The txn uses optimistic locking:/// - We must re-fetch `TagMeta` and verify values on every retry so each attempt uses the/// latest `allowed_values`./// - The generated conditions include `tag_meta.seq`, ensuring no concurrent update sneaks in/// between validation and commit. If any `TagMeta` changes, the txn fails and the loop/// re-executes the entire validation/write process.
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I'm glad you have finally figured out how to build something that does not smell like AI generated blah.
@drmingdrmer reviewed 3 files and all commit messages, made 3 comments, and resolved 2 discussions.
Reviewable status: 23 of 53 files reviewed, 2 unresolved discussions (waiting on@TCeason).
src/meta/app/src/schema/tag/error.rs line 20 at r11 (raw file):
Previously, TCeason wrote…
ok, you mean this?
#[derive(Clone,Debug, thiserror::Error,PartialEq,Eq)]pubenumTagError{#[error(transparent)]UnknownTagId(#[from]UnknownError<TagIdResource,TagId>),
Yes I mean this.
src/meta/app/src/schema/tag/error.rs line 75 at r19 (raw file):
.join(", ") ), _ => String::new(),
whenallowed_value is None, it should never reach here to create a not-allowed-value error.
Code quote:
pub fn not_allowed_value( tag_id: u64, tag_value: impl Into<String>, allowed_values: Option<Vec<String>>, ) -> Self { let allowed_values_display = match allowed_values { Some(values) if !values.is_empty() => format!( ". Allowed values: [{}]", values .into_iter() .map(|v| format!("'{v}'")) .collect::<Vec<_>>() .join(", ") ), _ => String::new(),
- Proto3 `repeated` fields can’t distinguish `None` from `Some(Vec::new())`, so extending TagMeta with a dedicated `enforce_allowed_values` flag lets us tell whether a tag declared `ALLOWED_VALUES` even when the list is empty. - Split the semantics in Rust too: keep `allowed_values: Vec<String>` strictly for ordered values, and use the boolean to record if the constraint is active instead of overloading `Option<Vec<_>>`. - Update CREATE TAG interpreter, tag validation, error text, and system.tags display to respect the new structure, and refresh the proto conversion tests.
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 1 comment.
Reviewable status: 17 of 53 files reviewed, 2 unresolved discussions (waiting on@drmingdrmer).
src/meta/app/src/schema/tag/error.rs line 75 at r19 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
when
allowed_valueis None, it should never reach here to create a not-allowed-value error.
Yes. I modify this func now it only accept Vec.
BTW: in proto can notrepeated fields can’t distinguishNone fromSome(Vec::new()).
createtablet(idint);create tag tag1 allowed_values'x','y';ALTER TAG tag1 DROP ALLOWED_VALUES'x';ALTER TAG tag1 DROP ALLOWED_VALUES'y';altertable t1set tag tag1='y';-- expect err.
So I modify the tag.proto. Now it's this.
message TagMeta { uint64 ver = 100; uint64 min_reader_ver = 101; // Ordered list of string values that may be set on this tag. // Matches CREATE TAG ... ALLOWED_VALUES semantics: must be declared before // other parameters, accepts up to 5,000 entries, and when omitted any string // (including empty) is accepted. Propagating tags rely on this order to // resolve conflicts when ON_CONFLICT = ALLOWED_VALUES_SEQUENCE (e.g. `CREATE // TAG my_tag ALLOWED_VALUES 'blue', 'red' PROPAGATE = ON_DEPENDENCY // ON_CONFLICT = ALLOWED_VALUES_SEQUENCE` always picks `'blue'`). A conflict // arises when the same downstream object inherits different values: // // Tag definition: my_tag ALLOWED_VALUES 'blue', 'red' // Object A --(my_tag='blue')--> \ // -> Object C (final = 'blue') // Object B --(my_tag='red')--> / // // We keep a repeated field instead of a map to avoid undefined iteration // order or manually maintained indices. With maps, gaps like `{blue:1, red:3}` // (after deletes/edits) would desync the conflict priority from the declared // order, so we store the sequence directly. repeated string allowed_values = 1; string comment = 2; string created_on = 3; optional string updated_on = 4; optional string drop_on = 5; // Distinguishes between `CREATE TAG ... ALLOWED_VALUES` (even if the list is // currently empty) and tags that never configured allowed values. When false, // the tag accepts any string. When true, only entries from `allowed_values` // are valid; the list may be empty when all values have been dropped but the // constraint still exists until an UNSET command runs. bool enforce_allowed_values = 6;}
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@drmingdrmer reviewed 2 files and all commit messages, made 2 comments, and resolved 1 discussion.
Reviewable status: 19 of 53 files reviewed, 3 unresolved discussions (waiting on@TCeason).
src/meta/app/src/schema/tag/mod.rs line 57 at r20 (raw file):
/// The vector may be empty when all values are dropped but the constraint /// remains active. pub allowed_values: Vec<String>,
keep it anOption<Vec<>>
src/meta/protos/proto/tag.proto line 51 at r20 (raw file):
// are valid; the list may be empty when all values have been dropped but the// constraint still exists until an UNSET command runs. bool enforce_allowed_values = 6;
put allowed_values in another message and you can useoption for it. This approach will be more intuitive.
And convert it to aOption<Vec<String>> in FromToProto impl for easily using it.
Code quote:
repeated string allowed_values = 1; string comment = 2; string created_on = 3; optional string updated_on = 4; optional string drop_on = 5;// Distinguishes between `CREATE TAG ... ALLOWED_VALUES` (even if the list is// currently empty) and tags that never configured allowed values. When false,// the tag accepts any string. When true, only entries from `allowed_values`// are valid; the list may be empty when all values have been dropped but the// constraint still exists until an UNSET command runs. bool enforce_allowed_values = 6;
TCeason left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
@TCeason made 2 comments.
Reviewable status: 17 of 53 files reviewed, 3 unresolved discussions (waiting on@drmingdrmer).
src/meta/app/src/schema/tag/mod.rs line 57 at r20 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
keep it an
Option<Vec<>>
Done.
src/meta/protos/proto/tag.proto line 51 at r20 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
put allowed_values in another message and you can use
optionfor it. This approach will be more intuitive.And convert it to a
Option<Vec<String>>in FromToProto impl for easily using it.
Done.And I modify like this:
message TagMeta { uint64 ver = 100; uint64 min_reader_ver = 101; message AllowedValues { // Ordered list of string values that may be set on this tag. // Matches CREATE TAG ... ALLOWED_VALUES semantics: must be declared before // other parameters, accepts up to 5,000 entries, and when omitted any string // (including empty) is accepted. Propagating tags rely on this order to // resolve conflicts when ON_CONFLICT = ALLOWED_VALUES_SEQUENCE (e.g. `CREATE // TAG my_tag ALLOWED_VALUES 'blue', 'red' PROPAGATE = ON_DEPENDENCY // ON_CONFLICT = ALLOWED_VALUES_SEQUENCE` always picks `'blue'`). A conflict // arises when the same downstream object inherits different values: // // Tag definition: my_tag ALLOWED_VALUES 'blue', 'red' // Object A --(my_tag='blue')--> \ // -> Object C (final = 'blue') // Object B --(my_tag='red')--> / // // We keep a repeated field instead of a map to avoid undefined iteration // order or manually maintained indices. With maps, gaps like `{blue:1, red:3}` // (after deletes/edits) would desync the conflict priority from the declared // order, so we store the sequence directly. repeated string values = 1; } optional AllowedValues allowed_values = 1; string comment = 2; string created_on = 3; optional string updated_on = 4; optional string drop_on = 5;}
drmingdrmer left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I reviewed just the meta service part and it looks good enough to me.
@drmingdrmer reviewed 11 files and all commit messages, made 1 comment, and resolved 3 discussions.
Reviewable status: 28 of 53 files reviewed, all discussions resolved.
5346432 intodatabendlabs:mainUh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
I hereby agree to the terms of the CLA available at:https://docs.databend.com/dev/policies/cla/
Summary
Add tag support for Databend objects (databases, tables, stages, connections)
to enable data governance and classification.
This PR implements:
SET/UNSET TAG will be added in a follow-up PR.
Query Layer
New SQL syntax:
New components:
New error codes:
System Table
system.tags exposes all tag definitions:
Design Decisions
Tests
Type of change
This change is