- Notifications
You must be signed in to change notification settings - Fork144
feat(knowledge): introduce the generic SQL knowledge#1389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Uh oh!
There was an error while loading.Please reload this page.
Conversation
coderabbitaibot commentedDec 16, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Warning Rate limit exceeded@goldmedal has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait9 minutes and 19 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see ourFAQ for further information. 📒 Files selected for processing (2)
WalkthroughRefactored Ibis Server to load SQL rules and instructions from filesystem resources, added SQL correction rule and dialect-specific instruction files, expanded Wren Core with a JSON scalar UDF module and BigQuery rewrites, reorganized function utils imports, and added tests for BigQuery JSON transformations. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for usingCodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Actionable comments posted: 7
🧹 Nitpick comments (4)
ibis-server/resources/knowledge/dialects/bigquery.txt (1)
5-7:Inconsistent INTERVAL part naming.Line 5 uses
'7' days(plural) while lines 6-7 use'1' monthand'2' year(singular). Standard SQL INTERVAL syntax typically uses singular forms:INTERVAL '7' day,INTERVAL '1' month, etc. Consider standardizing to singular forms throughout for consistency with SQL standards.- - Example: `timestamp_with_time_zone_column + INTERVAL '7' days` is valid.+ - Example: `timestamp_with_time_zone_column + INTERVAL '7' day` is valid.ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt (1)
1-1:Minor typo: extra space in header.There's a double space between "for" and "DATE" in the header.
-### Instructions for DATE AND TIME FUNCTIONALITY ###+### Instructions for DATE AND TIME FUNCTIONALITY ###ibis-server/app/mdl/knowledge.py (1)
16-26:Consider sorting files for deterministic output.
instructions_path.iterdir()does not guarantee a consistent ordering across different filesystems or Python versions. If the order of instructions matters for downstream consumers, consider sorting the files.- files = [f for f in instructions_path.iterdir() if f.is_file()]+ files = sorted([f for f in instructions_path.iterdir() if f.is_file()])wren-core/core/src/mdl/function/scalar/json.rs (1)
250-268:Consider Time64 for microsecond precision.
as_timeusesTime32(Millisecond)which limits precision to milliseconds. If the source data contains microsecond precision (common in JSON timestamps), this could silently truncate data. Consider usingTime64(Microsecond)for consistency withas_timestampwhich usesMicrosecondprecision.make_udf_function!( ByPassScalarUDF::new( "as_time",- ReturnType::Specific(DataType::Time32(- datafusion::arrow::datatypes::TimeUnit::Millisecond+ ReturnType::Specific(DataType::Time64(+ datafusion::arrow::datatypes::TimeUnit::Microsecond )),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
ibis-server/app/mdl/knowledge.py(1 hunks)ibis-server/resources/knowledge/dialects/bigquery.txt(1 hunks)ibis-server/resources/knowledge/instructions/array_usage.txt(1 hunks)ibis-server/resources/knowledge/instructions/calculated_field.txt(1 hunks)ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt(1 hunks)ibis-server/resources/knowledge/instructions/semi_structured_type.txt(1 hunks)ibis-server/resources/knowledge/instructions/structured_type.txt(1 hunks)ibis-server/resources/knowledge/text_to_sql_rules.txt(1 hunks)wren-core/core/src/mdl/dialect/inner_dialect.rs(1 hunks)wren-core/core/src/mdl/function/dialect/bigquery/aggregate.rs(1 hunks)wren-core/core/src/mdl/function/dialect/bigquery/mod.rs(2 hunks)wren-core/core/src/mdl/function/dialect/bigquery/scalar.rs(1 hunks)wren-core/core/src/mdl/function/dialect/bigquery/window.rs(1 hunks)wren-core/core/src/mdl/function/dialect/mod.rs(0 hunks)wren-core/core/src/mdl/function/mod.rs(1 hunks)wren-core/core/src/mdl/function/scalar/json.rs(1 hunks)wren-core/core/src/mdl/function/scalar/mod.rs(2 hunks)wren-core/core/src/mdl/mod.rs(1 hunks)
💤 Files with no reviewable changes (1)
- wren-core/core/src/mdl/function/dialect/mod.rs
🧰 Additional context used
🧬 Code graph analysis (6)
wren-core/core/src/mdl/dialect/inner_dialect.rs (1)
wren-core/core/src/mdl/dialect/utils.rs (2)
scalar_function_to_sql_internal(51-76)args(30-48)
wren-core/core/src/mdl/function/dialect/bigquery/aggregate.rs (2)
wren-core/core/src/mdl/function/utils.rs (1)
build_document(3-5)wren-core/core/src/mdl/function/remote_function.rs (1)
build_document(228-246)
wren-core/core/src/mdl/function/dialect/bigquery/scalar.rs (2)
wren-core/core/src/mdl/function/utils.rs (1)
build_document(3-5)wren-core/core/src/mdl/function/remote_function.rs (1)
build_document(228-246)
wren-core/core/src/mdl/function/dialect/bigquery/window.rs (1)
wren-core/core/src/mdl/function/utils.rs (1)
build_document(3-5)
ibis-server/app/mdl/knowledge.py (1)
ibis-server/app/model/data_source.py (1)
DataSource(62-221)
wren-core/core/src/mdl/function/scalar/json.rs (2)
wren-core/core/src/mdl/function/utils.rs (1)
build_document(3-5)wren-core/core/src/logical_plan/optimize/simplify_timestamp.rs (1)
is_timestamp(160-162)
🪛 LanguageTool
ibis-server/resources/knowledge/text_to_sql_rules.txt
[grammar] ~2-~2: Ensure spelling is correct
Context: ... CONTEXT ### Now, you aren't querying a tranditional database, but rather a semantic layer, ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[style] ~4-~4: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... with some differences and limitations. Wren SQL doesn't equal to your database SQL....
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~5-~5: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...SQL doesn't equal to your database SQL. Wren Engine translates your Wren SQL to your...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[grammar] ~6-~6: Use a hyphen to join words.
Context: ...se SQL internally. Avoid to use database specific SQL syntax and functions in you...
(QB_NEW_EN_HYPHEN)
[style] ~8-~8: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...h database is connected to the backend. You can find it in thedataSource field i...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[grammar] ~15-~15: Ensure spelling is correct
Context: ...REFER USING CTEs over subqueries. - The Identifer case sensitivity is case-sensitivity. Y...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~40-~40: Ensure spelling is correct
Context: ...status IS NOT NULL`. - If the column is indcated as the primary key or indexed column, p...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
ibis-server/resources/knowledge/instructions/calculated_field.txt
[grammar] ~41-~41: Ensure spelling is correct
Context: ...rs?" For the first query: First try to intepret the user query, the user wants to know ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~42-~42: Ensure spelling is correct
Context: ...0 reviews. Then, according to the above intepretation about the given schema, the term 'Ratin...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
ibis-server/resources/knowledge/dialects/bigquery.txt
[style] ~4-~4: To form a complete sentence, be sure to include a subject.
Context: ...database. -TIMESTAMP WITH TIME ZONE can't be+/- with INTERVAL with non-zero...
(MISSING_IT_THERE)
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt
[style] ~13-~13: To form a complete sentence, be sure to include a subject.
Context: ...en timezone is important. -TIMESTAMP can't be directly compared with `TIMESTAMP...
(MISSING_IT_THERE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: ci
- GitHub Check: test
- GitHub Check: ci
- GitHub Check: cargo check
🔇 Additional comments (15)
wren-core/core/src/mdl/function/mod.rs (1)
7-7:LGTM!Centralizing the
utilsmodule at the function level is a sensible refactor that allows sharingbuild_documentacross both dialect and scalar submodules.wren-core/core/src/mdl/function/dialect/bigquery/aggregate.rs (1)
9-15:LGTM!Import paths correctly updated to reference the centralized
build_documentfromcrate::mdl::function::utils. The reorganization is clean and consistent with the module restructuring inmod.rs.wren-core/core/src/mdl/function/dialect/bigquery/window.rs (1)
5-8:LGTM! Import path consolidation.The import path adjustment for
build_documentis a clean refactoring with no functional changes to the window function definitions.ibis-server/resources/knowledge/instructions/structured_type.txt (1)
1-5:LGTM! Clear STRUCT type documentation.The syntax and examples for STRUCTURED data types are accurate and well-documented.
wren-core/core/src/mdl/function/dialect/bigquery/scalar.rs (1)
11-14:LGTM! Import path consolidation.The import reorganization for
build_documentmaintains existing functionality while improving code organization.wren-core/core/src/mdl/dialect/inner_dialect.rs (3)
225-227:LGTM! Straightforward JSON_EXTRACT mapping.The
get_pathtoJSON_EXTRACTrewrite is clean and correct.
250-253:LGTM! Type casting helper mappings.The mappings to
lax_*variants for type casting are appropriate for BigQuery's JSON handling.
228-249:This is a deliberate optimization, not an arbitrary constraint.The strict requirement for
AS_ARRAYto wrapGET_PATHis not a limitation but an intentional design that enables the optimizer to fuseAS_ARRAY(GET_PATH(col, path))into BigQuery's nativeJSON_EXTRACT_ARRAY(col, path). The single test case at line 3881 ofwren-core/core/src/mdl/mod.rsconfirms this is the intended and standard usage pattern. The function definition documentsas_arrayas converting "a json, variant or object to array," with the example showing exactly this GET_PATH pattern. This targeted transformation is appropriate for the implementation's purpose.Likely an incorrect or invalid review comment.
wren-core/core/src/mdl/function/dialect/bigquery/mod.rs (1)
66-66:LGTM! JSON UDF integration.The addition of JSON scalar UDFs to the BigQuery function set is well-structured and extends the dialect's capabilities appropriately.
Also applies to: 297-302
wren-core/core/src/mdl/function/scalar/mod.rs (2)
14-14:LGTM! JSON module addition.The new JSON module provides structured organization for JSON-related scalar UDFs.
195-229:LGTM! JSON functions collection with future use noted.The
json_functions()provides a comprehensive set of JSON UDFs. The#[allow(dead_code)]annotation indicates this function isn't currently called but is likely reserved for future dialect integrations or bulk registration scenarios. Currently, BigQuery dialect imports specific JSON functions directly from thejson::module.wren-core/core/src/mdl/mod.rs (1)
3855-3887:LGTM! Well-structured test coverage for BigQuery JSON transformations.The test correctly validates that
GET_PATHis transformed toJSON_EXTRACTandAS_ARRAY(GET_PATH(...))is transformed toJSON_EXTRACT_ARRAYfor BigQuery dialect. The test follows the established patterns in this file.ibis-server/app/mdl/knowledge.py (1)
5-5:Relative path may be fragile depending on working directory.The
KNOWLEDGE_RESOURCE_PATHuses a relative path, which assumes the code is always executed from theibis-serverdirectory. If the application is run from a different working directory, the path resolution will fail.Consider using
Path(__file__).parent.parent / "resources" / "knowledge"to make the path relative to the module's location instead of the current working directory.Additionally,
get_text_to_sql_rule()will raise aFileNotFoundErrorif the file doesn't exist, but there's no explicit error handling. Verify this is the intended behavior.-KNOWLEDGE_RESOURCE_PATH = "resources/knowledge"+KNOWLEDGE_RESOURCE_PATH = Path(__file__).parent.parent / "resources" / "knowledge"Also applies to: 13-14
wren-core/core/src/mdl/function/scalar/json.rs (2)
1-32:Well-structured JSON UDF module following Snowflake conventions.The module header clearly documents the design choice to follow Snowflake's naming and behavior. The
get_pathfunction is correctly defined with two string arguments and returnsUtf8. Documentation with examples is helpful.
34-50:Comprehensive type checking and casting UDFs.The
as_*andis_*function pairs provide a complete set of operations for semi-structured data handling. The consistent use ofSignature::coerciblewithlogical_string()input andImmutablevolatility is appropriate for these pure transformation functions.Also applies to: 296-532
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
ibis-server/resources/knowledge/instructions/calculated_field.txt OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
ibis-server/resources/knowledge/instructions/semi_structured_type.txt OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
ibis-server/resources/knowledge/instructions/semi_structured_type.txt OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…ype.txtCo-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…ctionality.txtCo-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (2)
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt (1)
5-9:Clarify quoting convention for example parts across functions.Line 7 (EXTRACT example) shows unquoted parts (year, quarter, month, etc.) while lines 5 and 9 (DATE_TRUNC and DATE_DIFF) show quoted parts. While this reflects actual SQL syntax (EXTRACT keywords don't take quotes), it may confuse users. Consider adding a brief note clarifying which functions require quotes around the part parameter.
ibis-server/resources/knowledge/text_to_sql_rules.txt (1)
10-10:Reduce repetitive sentence structure.Multiple successive sentences begin with "You," creating awkward rhythm and affecting readability.
Consider restructuring to improve flow:
-You should follow the rules below when generating SQL queries for Wren engine.-You should check which database is connected to the backend. You can find it in the `dataSource` field in the context. And check the corresponding instructions for the specific database if needed.+When generating SQL queries for Wren engine, follow the rules below. Check which database is connected to the backend (available in the `dataSource` field in the context) and review the corresponding instructions for that database if needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
ibis-server/resources/knowledge/instructions/array_usage.txt(1 hunks)ibis-server/resources/knowledge/instructions/calculated_field.txt(1 hunks)ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt(1 hunks)ibis-server/resources/knowledge/instructions/semi_structured_type.txt(1 hunks)ibis-server/resources/knowledge/text_to_sql_rules.txt(1 hunks)wren-core/core/src/mdl/dialect/inner_dialect.rs(1 hunks)wren-core/core/src/mdl/lineage.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- ibis-server/resources/knowledge/instructions/calculated_field.txt
- ibis-server/resources/knowledge/instructions/array_usage.txt
🧰 Additional context used
🧬 Code graph analysis (1)
wren-core/core/src/mdl/dialect/inner_dialect.rs (1)
wren-core/core/src/mdl/dialect/utils.rs (2)
scalar_function_to_sql_internal(51-76)args(30-48)
🪛 LanguageTool
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt
[style] ~13-~13: To form a complete sentence, be sure to include a subject.
Context: ...en timezone is important. -TIMESTAMP can't be directly compared with `TIMESTAMP...
(MISSING_IT_THERE)
ibis-server/resources/knowledge/text_to_sql_rules.txt
[style] ~7-~7: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...SQL doesn't equal to your database SQL. Wren Engine translates your Wren SQL to your...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[grammar] ~8-~8: Use a hyphen to join words.
Context: ...se SQL internally. Avoid to use database specific SQL syntax and functions in you...
(QB_NEW_EN_HYPHEN)
[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...h database is connected to the backend. You can find it in thedataSource field i...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: ci
- GitHub Check: test
- GitHub Check: cargo check
- GitHub Check: ci
🔇 Additional comments (6)
ibis-server/resources/knowledge/instructions/semi_structured_type.txt (1)
1-40:Well-structured knowledge document with past issues resolved. ✅The file provides comprehensive guidance for SEMI-STRUCTURED data types and correctly addresses the two issues flagged in previous reviews:
- Grammar has been corrected to "SEMI-STRUCTURED data types don't have" (line 6)
- Table name example correctly uses "users" throughout (line 40)
The instructions are clear, well-organized, and align with the Snowflake-style JSON handling introduced across the codebase. The coverage of GET_PATH, type casting functions (AS_VARCHAR, AS_INTEGER, etc.), json_type metadata rules, and UNNEST examples provides practical guidance for SQL generation on semi-structured data.
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt (1)
1-20:Well-structured knowledge file for date/time operations.The file provides clear, practical guidance on date/time functionality with good coverage of truncation, extraction, date arithmetic, timezone handling, and epoch time conversions. Structure and examples are easy to follow.
wren-core/core/src/mdl/dialect/inner_dialect.rs (3)
225-227:LGTM: Clean mapping to BigQuery JSON_EXTRACT.The
get_pathtoJSON_EXTRACTrewrite is straightforward and aligns with BigQuery's JSON extraction capabilities.
248-259:Type conversion mappings to BigQuery LAX functions are correct.The mappings use BigQuery's LAX_STRING, LAX_INT64, LAX_FLOAT64, and LAX_BOOL functions, which handle conversions between mismatched JSON data types. The implementation correctly maps
as_varchar→lax_string,as_integer→lax_int64,as_double→lax_float64, andas_boolean→lax_bool.
228-247:Add a code comment explaining the AS_ARRAY composition pattern.The implementation intentionally requires
AS_ARRAYto wrap aGET_PATHfunction call, which it then unwraps to invokeJSON_EXTRACT_ARRAYwith the inner arguments. This composition pattern is confirmed by the test case (AS_ARRAY(GET_PATH(json_col, '$.field'))transpiles toJSON_EXTRACT_ARRAY) and documented in the function signature example. Consider adding a code comment explaining why this composition is required in Wren's MDL—for instance, whether it clarifies intent, maintains consistency with the MDL type system, or reflects limitations of the transpiler.wren-core/core/src/mdl/lineage.rs (1)
167-167:LGTM: Correct reference passing through iterator.The code correctly passes
ident(a reference from.iter()onBTreeSet<Column>) toqualify_name_from_column_name, which expects&Column. This is type-safe and avoids unnecessary allocations. Rust's compile-time checks ensure correctness.However, this change appears unrelated to the PR objectives (introducing generic SQL knowledge and semi-structured type operations). If this is incidental cleanup, consider moving it to a separate refactoring PR.
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt OutdatedShow resolvedHide resolved
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
ibis-server/resources/knowledge/text_to_sql_rules.txt (3)
4-4:Fix grammatical error in statement (previously flagged).Line 4 still contains the same grammatical error from the previous review: "No use the reference named" is incorrect phrasing.
Apply this diff:
-You should only use the names of models, views and columns defined in the MDL to generate SQL queries. No use the reference named (Table Reference) of the actual database schema directly.+You should only use the names of models, views and columns defined in the MDL to generate SQL queries. Do not use the reference names (Table Reference) of the actual database schema directly.
7-8:Fix phrasing and hyphenation issues (previously flagged).These lines remain unfixed from the previous review. Line 7 has repetitive sentence structure, and Line 8 lacks proper hyphenation for the compound adjective.
Apply this diff:
-Wren SQL doesn't equal to your database SQL. Wren Engine translates your Wren SQL to your database SQL internally.-Avoid to use database specific SQL syntax and functions in your Wren SQL.+Wren SQL differs from your database SQL. Wren Engine translates your Wren SQL to your database SQL internally.+Avoid using database-specific SQL syntax and functions in your Wren SQL.
17-17:Clarify redundant statement about identifier case sensitivity (previously flagged).Line 17 remains unfixed from the previous review. The phrase "The Identifier case sensitivity is case-sensitivity" is tautological and unclear.
Apply this diff:
-- The Identifier case sensitivity is case-sensitivity. You need to quote the identifier if it contains unicode characters, special characters (except _), or starts with a digit.+- Identifiers are case-sensitive. You must quote identifiers that contain unicode characters, special characters (except _), or start with a digit.
🧹 Nitpick comments (1)
ibis-server/resources/knowledge/text_to_sql_rules.txt (1)
10-10:Fix repetitive sentence structure.Three successive sentences begin with "You," which weakens the clarity of the guidance.
Apply this diff:
-You should check which database is connected to the backend. You can find it in the `dataSource` field in the context. And check the corresponding instructions for the specific database if needed.+You should check which database is connected to the backend. The `dataSource` field in the context shows this information. Check the corresponding instructions for the specific database if needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ibis-server/resources/knowledge/text_to_sql_rules.txt(1 hunks)
🧰 Additional context used
🪛 LanguageTool
ibis-server/resources/knowledge/text_to_sql_rules.txt
[style] ~7-~7: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...SQL doesn't equal to your database SQL. Wren Engine translates your Wren SQL to your...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[grammar] ~8-~8: Use a hyphen to join words.
Context: ...se SQL internally. Avoid to use database specific SQL syntax and functions in you...
(QB_NEW_EN_HYPHEN)
[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...h database is connected to the backend. You can find it in thedataSource field i...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: ci
- GitHub Check: ci
- GitHub Check: cargo check
- GitHub Check: test
…ctionality.txtCo-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ibis-server/app/routers/v3/connector.py (1)
542-550:Thesql_correction_rulemethod lacks explicit error handling for missing files.The
get_sql_correction_rule()method in the Knowledge class callsPath.read_text()without error handling. While resource files are packaged in the repository and guaranteed to exist, explicit error handling would improve robustness. Additionally, the endpoint currently relies on the global exception handler (which returns a generic 500 error) rather than providing specific error handling at the method level. Consider adding try-except blocks to catchFileNotFoundErrorand return a more informative error response, or document why file existence is guaranteed.
🧹 Nitpick comments (2)
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt (1)
5-5:Normalize quoting of example parts across functions.The example parts in DATE_TRUNC (line 5) and DATE_DIFF (line 9) are quoted, but EXTRACT (line 7) uses unquoted parts. For consistency in the knowledge base, either quote all parts uniformly or leave all unquoted.
Apply this diff to quote EXTRACT parts:
- Example parts: year, quarter, month, week, day, hour, minute, second+ Example parts: 'year', 'quarter', 'month', 'week', 'day', 'hour', 'minute', 'second'Also applies to: 7-7, 9-9
ibis-server/app/mdl/knowledge.py (1)
13-13:Consider using Path.joinpath for cleaner path construction.While f-strings work, using
Path.joinpath()or the/operator is more idiomatic and portable.Example refactor:
# Instead of:rules_path=Path(f"{KNOWLEDGE_RESOURCE_PATH}/text_to_sql_rule.txt")# Consider:rules_path=Path(KNOWLEDGE_RESOURCE_PATH)/"text_to_sql_rule.txt"Also applies to: 21-22
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
ibis-server/app/mdl/knowledge.py(1 hunks)ibis-server/app/routers/v3/connector.py(1 hunks)ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt(1 hunks)ibis-server/resources/knowledge/sql_correction_rule.txt(1 hunks)ibis-server/resources/knowledge/text_to_sql_rule.txt(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
ibis-server/app/routers/v3/connector.py (1)
ibis-server/app/mdl/knowledge.py (1)
get_sql_correction_rule(28-30)
ibis-server/app/mdl/knowledge.py (1)
ibis-server/app/model/data_source.py (1)
DataSource(62-221)
🪛 LanguageTool
ibis-server/resources/knowledge/sql_correction_rule.txt
[style] ~43-~43: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...an internal bug of Wren engine. You may need to report the issue to the Wren support te...
(REP_NEED_TO_VB)
[style] ~44-~44: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ... error is from the execution stage, You need to check the Dialect SQL generated by Wren...
(REP_NEED_TO_VB)
[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... those unsupported features. If a not-found table or columns error occu...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...stage, check the location of the error. If it occur the source subquery (typically...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~47-~47: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...ated issues or runtime constraints. You need to analyze the execution context and data ...
(REP_NEED_TO_VB)
[style] ~48-~48: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...alues, or data type mismatches. You may need to check the sample data in the target dat...
(REP_NEED_TO_VB)
[style] ~49-~49: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...ze of the data being processed. You may need to optimize the Wren SQL to reduce its com...
(REP_NEED_TO_VB)
[style] ~51-~51: Consider using a different verb for a more formal wording.
Context: ...dentified error, modify the Wren SQL to correct the issues. This may involve changing t...
(FIX_RESOLVE)
[grammar] ~52-~52: Ensure spelling is correct
Context: ...arget database. 2. You can only modifiy the Wren SQL directly. You are not allo...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~61-~61: Ensure spelling is correct
Context: ... to check their database performance or simpilify their question.
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
ibis-server/resources/knowledge/text_to_sql_rule.txt
[style] ~7-~7: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ren SQL differs from your database SQL. Wren Engine translates your Wren SQL to your...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~10-~10: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...h database is connected to the backend. You can find it in thedataSource field i...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
ibis-server/resources/knowledge/instructions/date_and_time_functionality.txt
[style] ~13-~13: To form a complete sentence, be sure to include a subject.
Context: ...en timezone is important. -TIMESTAMP can't be directly compared with `TIMESTAMP...
(MISSING_IT_THERE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: cargo check
- GitHub Check: ci
- GitHub Check: ci
- GitHub Check: test
🔇 Additional comments (1)
ibis-server/resources/knowledge/text_to_sql_rule.txt (1)
1-75:Well-structured SQL generation guidance.The documentation provides comprehensive rules for generating Wren SQL, covering context, rules, filtering, data types, aggregation, and subquery patterns.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Actionable comments posted: 6
♻️ Duplicate comments (4)
ibis-server/resources/knowledge/sql_correction_rule.txt (2)
52-52:Fix spelling error (previously flagged)."modifiy" should be "modify".
Apply this diff:
- 2. You can only modify the Wren SQL directly. You are not allowed to modify the Planned SQL or Dialect SQL generated by Wren engine. So, you need to think about how to change the Wren SQL to achieve the desired Planned SQL or Dialect SQL.+ 2. You can only modify the Wren SQL directly. You are not allowed to modify the Planned SQL or Dialect SQL generated by Wren engine. So, you need to think about how to change the Wren SQL to achieve the desired Planned SQL or Dialect SQL.
61-61:Fix spelling error (previously flagged)."simpilify" should be "simplify".
Apply this diff:
- 3. If no alternative and a timeout or resource limit error occurs, report the issue to the user as an EXTERNAL ERROR, not a BUG of Wren engine. You may suggest the user to check their database performance or simplify their question.+ 3. If no alternative and a timeout or resource limit error occurs, report the issue to the user as an EXTERNAL ERROR, not a BUG of Wren engine. You may suggest the user to check their database performance or simplify their question.ibis-server/app/mdl/knowledge.py (2)
37-43:Add error handling for file read operations.Similar to
get_text_to_sql_rule, line 43 needs error handling for potential read failures beyond file existence.Apply this diff:
def get_sql_correction_rule(self) -> str: rules_path = Path(f"{KNOWLEDGE_RESOURCE_PATH}/sql_correction_rule.txt") if not rules_path.exists(): raise WrenError( ErrorCode.GENERIC_INTERNAL_ERROR, "SQL correction rule not found." )- return rules_path.read_text()+ try:+ return rules_path.read_text()+ except (PermissionError, OSError) as e:+ raise WrenError(+ ErrorCode.GENERIC_INTERNAL_ERROR,+ f"Failed to read SQL correction rule: {str(e)}"+ )
21-35:Handle individual file read errors in dictionary comprehension.The dictionary comprehension on line 35 calls
file.read_text()for each file without error handling. If any single file is unreadable (due to permissions, encoding issues, etc.), the entire method fails. As noted in past reviews, this can crash the knowledge system unnecessarily.Apply this diff to gracefully skip unreadable files:
- return {file.stem: file.read_text() for file in files}+ result = {}+ for file in files:+ try:+ result[file.stem] = file.read_text()+ except (PermissionError, OSError, UnicodeDecodeError) as e:+ # Log warning and skip unreadable files+ # Consider adding logging here+ continue+ return result
🧹 Nitpick comments (1)
ibis-server/resources/knowledge/sql_correction_rule.txt (1)
38-62:Consider varying sentence structure for improved readability.The static analysis tool suggests reducing repetitive phrasing (e.g., "You need to", "You may need to") and avoiding consecutive sentences starting with "If". While the content is clear, varying the sentence structure could enhance readability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
ibis-server/app/mdl/knowledge.py(1 hunks)ibis-server/resources/knowledge/sql_correction_rule.txt(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ibis-server/app/mdl/knowledge.py (2)
ibis-server/app/model/data_source.py (1)
DataSource(62-221)ibis-server/app/model/error.py (2)
ErrorCode(19-38)WrenError(65-113)
🪛 LanguageTool
ibis-server/resources/knowledge/sql_correction_rule.txt
[style] ~43-~43: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...an internal bug of Wren engine. You may need to report the issue to the Wren support te...
(REP_NEED_TO_VB)
[style] ~44-~44: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ... error is from the execution stage, You need to check the Dialect SQL generated by Wren...
(REP_NEED_TO_VB)
[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... those unsupported features. If a not-found table or columns error occu...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~45-~45: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...stage, check the location of the error. If it occur the source subquery (typically...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~47-~47: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...ated issues or runtime constraints. You need to analyze the execution context and data ...
(REP_NEED_TO_VB)
[style] ~48-~48: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...alues, or data type mismatches. You may need to check the sample data in the target dat...
(REP_NEED_TO_VB)
[style] ~49-~49: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...ze of the data being processed. You may need to optimize the Wren SQL to reduce its com...
(REP_NEED_TO_VB)
[style] ~51-~51: Consider using a different verb for a more formal wording.
Context: ...dentified error, modify the Wren SQL to correct the issues. This may involve changing t...
(FIX_RESOLVE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: ci
- GitHub Check: test
- GitHub Check: cargo check
- GitHub Check: ci
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
goldmedal commentedDec 17, 2025
CI failures are flaky. Let's ignore them |
Uh oh!
There was an error while loading.Please reload this page.
Summary by CodeRabbit
New Features
Documentation
Refactor
Tests
✏️ Tip: You can customize this high-level summary in your review settings.