fix: vrl panic issues (#8355) #8366


Merged

oasisk merged 2 commits into main from fix-vrl-panic-and-ts-handling on Sep 11, 2025

Conversation

@oasisk
Contributor

oasisk commented Sep 10, 2025 (edited by github-actions bot)

PR Type

Bug fix, Enhancement


Description

  • Guard flattening for non-object JSON values

  • Prevent panics by checking null/object types

  • Improve timestamp handling on null values

  • Tighten pipeline flattening conditions
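
For orientation, the guard these bullets describe can be illustrated with a minimal, self-contained sketch. This is not the project's code: `flatten_object` below is a hypothetical stand-in for the repository's `config::utils::flatten::flatten` utility, which only accepts JSON objects. Only non-null objects are flattened; null and other non-object values pass through unchanged instead of causing a panic.

```rust
use serde_json::{json, Map, Value};

// Hypothetical stand-in for the project's flatten utility: it only accepts
// JSON objects and reports an error for every other value kind.
fn flatten_object(value: &Value) -> Result<Value, String> {
    if !value.is_object() {
        return Err(format!("can only flatten objects, got: {value}"));
    }
    let mut out = Map::new();
    flatten_into("", value, &mut out);
    Ok(Value::Object(out))
}

// Recursively copy nested object fields into dotted keys, e.g. {"a":{"b":1}} -> {"a.b":1}.
fn flatten_into(prefix: &str, value: &Value, out: &mut Map<String, Value>) {
    match value {
        Value::Object(map) => {
            for (k, v) in map {
                let key = if prefix.is_empty() {
                    k.clone()
                } else {
                    format!("{prefix}.{k}")
                };
                flatten_into(&key, v, out);
            }
        }
        other => {
            out.insert(prefix.to_string(), other.clone());
        }
    }
}

fn main() {
    // A VRL function may return an object, null, or any other JSON value.
    for vrl_result in [json!({"a": {"b": 1}}), json!(null), json!("plain string")] {
        // The guard from the description: flatten only non-null objects,
        // and pass everything else through unchanged instead of panicking.
        let record = if !vrl_result.is_null() && vrl_result.is_object() {
            flatten_object(&vrl_result).unwrap_or_else(|_| vrl_result.clone())
        } else {
            vrl_result
        };
        println!("{record}");
    }
}
```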


Diagram Walkthrough

```mermaid
flowchart LR
  VRL["VRL result"] -- "may be null/non-object" --> Guard["Check is_object && not null"]
  Guard -- "true" --> Flatten["Flatten JSON"]
  Guard -- "false" --> Bypass["Use value as-is"]
  Timestamp["Ingest timestamp"] -- "null" --> Now["Default to current time"]
  Pipeline["Pipeline nodes"] -- "flatten only objects" --> Flatten
```

File Walkthrough

Relevant files

Bug fix

multi_streams.rs: Safe flattening in multi-stream search
src/handler/http/request/search/multi_streams.rs (+13/-5)

  • Flatten only when value is a non-null object
  • Preserve non-object items in per-query responses
  • Avoid unwrap panics by using ok()/None

functions.rs: Guard VRL transform flattening
src/service/functions.rs (+1/-1)

  • Only flatten VRL return value if it is an object
  • Prevent empty-string result on non-object bypass

ingest.rs: Robust null timestamp handling
src/service/logs/ingest.rs (+10/-4)

  • Default null timestamp to now
  • Keep error on unparsable non-null timestamps

http.rs: Safe flattening in clustered search
src/service/search/cluster/http.rs (+17/-12)

  • Flatten array items only if they are objects
  • Guard VRL result flattening
  • Add conditional hit flattening in query_fn path

Enhancement

batch_execution.rs: Pipeline: guarded flattening and timestamp checks
src/service/pipeline/batch_execution.rs (+26/-18)

  • Flatten only non-null objects across nodes
  • Apply guards before condition/func/destination steps
  • Conditional timestamp handling only for objects
  • Improve error logging and flow control

@github-actions github-actions bot added ☢️ Bug (Something isn't working) and Review effort 2/5 labels on Sep 10, 2025
@github-actions
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Panic

In per-query array handling, v.as_array().unwrap() is still used. Although later guards avoid flattening non-objects, this unwrap() can panic if v is not an array when per_query_resp is true. Consider safe handling or returning None/Error instead of unwrap.

```rust
let flattened_array = v
    .as_array()
    .unwrap()
    .iter()
    .map(|item| {
        if !item.is_null() && item.is_object() {
            config::utils::flatten::flatten(item.clone()).unwrap()
        } else {
            item.clone()
        }
    })
    .collect::<Vec<_>>();
Some(serde_json::Value::Array(flattened_array))
```
Behavior Change

Timestamp parsing now treats explicit null timestamps as "use current time" instead of erroring. Verify this is intended for ingestion semantics and auditing; previously, unparseable values errored, whereas null now silently defaults.

```rust
Some(v) => {
    if !v.is_null() {
        match parse_timestamp_micro_from_value(v) {
            Ok(t) => t,
            Err(_) => return Err(anyhow::Error::msg("Can't parse timestamp")),
        }
    } else {
        Utc::now().timestamp_micros()
    }
}
None => Utc::now().timestamp_micros(),
};
```
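
To see the new semantics in isolation, here is a rough standalone sketch, not the project's code: `resolve_timestamp` and the integer-only `parse_timestamp_micro_from_value` stub are hypothetical, and the "_timestamp" column name is assumed for illustration. A null or missing timestamp now falls back to the current time, while an unparsable non-null value still returns an error.

```rust
use chrono::Utc;
use serde_json::{json, Value};

// Assumed column name, used here only for illustration.
const TIMESTAMP_COL_NAME: &str = "_timestamp";

// Simplified stand-in for the real parser: accept integer microseconds only.
fn parse_timestamp_micro_from_value(v: &Value) -> Result<i64, ()> {
    v.as_i64().ok_or(())
}

// Hypothetical helper mirroring the match above: null/missing -> now, bad value -> error.
fn resolve_timestamp(record: &Value) -> anyhow::Result<i64> {
    Ok(match record.get(TIMESTAMP_COL_NAME) {
        Some(v) => {
            if !v.is_null() {
                match parse_timestamp_micro_from_value(v) {
                    Ok(t) => t,
                    Err(_) => return Err(anyhow::anyhow!("Can't parse timestamp")),
                }
            } else {
                // explicit null: silently default instead of erroring
                Utc::now().timestamp_micros()
            }
        }
        // missing field: same default as before this PR
        None => Utc::now().timestamp_micros(),
    })
}

fn main() {
    println!("{:?}", resolve_timestamp(&json!({ "_timestamp": 1_700_000_000_000_000_i64 })));
    println!("{:?}", resolve_timestamp(&json!({ "_timestamp": null })));    // Ok(now)
    println!("{:?}", resolve_timestamp(&json!({ "msg": "no timestamp" }))); // Ok(now)
    println!("{:?}", resolve_timestamp(&json!({ "_timestamp": "oops" })));  // Err
}
```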
Dropped Records

Multiple branches now skip processing and pushing records unless they are JSON objects (e.g., before timestamp handling and during flatten-after checks). Confirm non-object records should be discarded rather than forwarded, to avoid unintended data loss.

```rust
// handle timestamp before sending to remote_write service
if !flattened && !record.is_null() && record.is_object() {
    record = match flatten::flatten_with_level(
        record,
        cfg.limit.ingest_flatten_level,
    ) {
        Ok(flattened) => flattened,
        Err(e) => {
            let err_msg = format!("DestinationNode error with flattening: {e}");
            if let Err(send_err) = error_sender
                .send((node.id.to_string(), node.node_type(), err_msg, None))
                .await
            {
                log::error!("[Pipeline] {pipeline_name} : DestinationNode failed sending errors for collection caused by: {send_err}");
                break;
            }
            continue;
        }
    };
}
if !record.is_null() && record.is_object() {
    if let Err(e) =
        crate::service::logs::ingest::handle_timestamp(&mut record, min_ts, max_ts)
    {
        let err_msg = format!("DestinationNode error handling timestamp: {}", e);
        if let Err(send_err) = error_sender
            .send((node.id.to_string(), node.node_type(), err_msg, None))
            .await
        {
            log::error!(
                "[Pipeline] {} : DestinationNode failed sending errors for collection caused by: {send_err}",
                pipeline_name
            );
            break;
        }
        continue;
    }
    records.push(record);
    count += 1;
}
```

Contributor

@greptile-apps greptile-apps bot left a comment


Greptile Summary

This PR fixes VRL (Vector Remap Language) panic issues by adding comprehensive null and type validation checks before calling the flatten function throughout the codebase. The changes target four critical files where VRL functions process JSON data:

Core Problem: The flatten utility function expects JSON objects as input and returns errors for other types (null, arrays, primitives). The original code was calling flatten without proper type validation, causing runtime panics when VRL functions returned non-object values.

Changes Made:

  1. Multi-stream search handling (multi_streams.rs): Added !item.is_null() && item.is_object() checks before flattening search results, replacing unsafe .unwrap() calls with .ok() for graceful error handling (see the sketch after this list)
  2. Pipeline batch execution (batch_execution.rs): Added validation checks throughout stream, condition, function, and remote stream nodes to prevent processing of null/non-object records
  3. Function testing (functions.rs): Enhanced the test runner with object type validation before flattening VRL function return values
  4. Cluster search (http.rs): Replaced error propagation patterns with graceful handling using .ok() and added explicit type checks before flatten operations
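
The .unwrap()-to-.ok() swap mentioned in items 1 and 4 can be summarized with a small hedged sketch; try_flatten below is a hypothetical stand-in for the fallible flatten call, not the project's API.

```rust
use serde_json::{json, Value};

// Hypothetical fallible helper standing in for the flatten call:
// it fails for anything that is not a JSON object.
fn try_flatten(v: &Value) -> Result<Value, String> {
    if v.is_object() {
        Ok(v.clone())
    } else {
        Err("can only flatten objects".to_string())
    }
}

fn main() {
    let vrl_output = json!(null); // a VRL function may legitimately return null

    // Old pattern: .unwrap() turns the Err into a panic and takes the worker down.
    // let flat = try_flatten(&vrl_output).unwrap(); // would panic here

    // New pattern: .ok() converts Result<Value, _> into Option<Value>, so a
    // non-object result simply becomes None and processing continues.
    let flat: Option<Value> = try_flatten(&vrl_output).ok();
    assert!(flat.is_none());

    let flat_obj = try_flatten(&json!({ "a": 1 })).ok();
    assert_eq!(flat_obj, Some(json!({ "a": 1 })));
    println!("graceful handling: {flat:?} / {flat_obj:?}");
}
```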

Integration Context: These changes align with OpenObserve's robust data processing pipeline, where VRL functions are used for data transformation across search, ingestion, and pipeline processing. The fixes ensure the system can handle diverse input data types without crashing, maintaining system stability when processing real-world data that may contain unexpected formats or null values. The defensive programming approach allows partial processing success rather than complete system failures.

Confidence score: 4/5

  • This PR significantly improves system stability by preventing VRL-related panics with minimal risk
  • Score reflects well-targeted fixes addressing a specific panic issue with consistent patterns across multiple files
  • Pay close attention to the pipeline batch execution file for potential edge cases in complex data processing scenarios

4 files reviewed, 1 comment


```diff
     .map(|item| {
-        config::utils::flatten::flatten(item.clone()).unwrap()
+        if !item.is_null() && item.is_object() {
+            config::utils::flatten::flatten(item.clone()).unwrap()
```
greptile-apps[bot] (Contributor) commented:


style: Still using .unwrap() here - consider using .ok() for consistency with the other flatten calls

@github-actions
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Possible issue
Prevent panics on non-array values

Avoid .unwrap() on as_array() and during flatten to prevent panics if v is not an array or flattening fails. Gracefully handle non-array or non-object items by skipping or passing them through unchanged.

src/handler/http/request/search/multi_streams.rs [542-560]

```diff
 if per_query_resp {
-    let flattened_array = v
-        .as_array()
-        .unwrap()
-        .iter()
-        .map(|item| {
-            if !item.is_null() && item.is_object() {
-                config::utils::flatten::flatten(item.clone()).unwrap()
-            } else {
-                item.clone()
-            }
-        })
-        .collect::<Vec<_>>();
-    Some(serde_json::Value::Array(flattened_array))
+    if let Some(arr) = v.as_array() {
+        let flattened_array = arr
+            .iter()
+            .map(|item| {
+                if !item.is_null() && item.is_object() {
+                    config::utils::flatten::flatten(item.clone()).unwrap_or(item.clone())
+                } else {
+                    item.clone()
+                }
+            })
+            .collect::<Vec<_>>();
+        Some(serde_json::Value::Array(flattened_array))
+    } else {
+        None
+    }
 } else if !v.is_null() && v.is_object() {
     config::utils::flatten::flatten(v.clone()).ok()
 } else {
     None
 }
```
Suggestion importance[1-10]: 7


Why: Replacing unwrap() on as_array() with if let Some(arr) removes a potential panic and aligns with prior optional handling; the improved code correctly mirrors the new hunk and keeps behavior for objects. Impact is moderate since per-query responses may encounter non-arrays at runtime.

Impact: Medium
General
Keep timestamp field in sync

Treat null timestamps consistently with missing timestamps by defaulting without error, but also update the object so downstream logic sees the finalized timestamp. This avoids silent divergence between timestamp and value[TIMESTAMP_COL_NAME].

src/service/logs/ingest.rs [568-579]

```diff
 let timestamp = match local_val.get(TIMESTAMP_COL_NAME) {
     Some(v) => {
         if !v.is_null() {
             match parse_timestamp_micro_from_value(v) {
                 Ok(t) => t,
                 Err(_) => return Err(anyhow::Error::msg("Can't parse timestamp")),
             }
         } else {
-            Utc::now().timestamp_micros()
+            let now = Utc::now().timestamp_micros();
+            local_val.insert(TIMESTAMP_COL_NAME.to_string(), json::Value::from(now));
+            now
         }
     }
-    None => Utc::now().timestamp_micros(),
+    None => {
+        let now = Utc::now().timestamp_micros();
+        local_val.insert(TIMESTAMP_COL_NAME.to_string(), json::Value::from(now));
+        now
+    }
 };
```
Suggestion importance[1-10]: 6


Why: Ensuring the JSON object gets updated when defaulting the timestamp improves consistency and downstream behavior. The change is accurate to the new hunk context and moderately improves robustness, though not critical.

Impact: Low
Avoid silently dropping records

Log or handle records that are null or non-objects to avoid silently dropping data.
At minimum, count or forward them unchanged so the pipeline doesn't appear to stall
with missing outputs.

src/service/pipeline/batch_execution.rs [861-881]

```diff
 if !record.is_null() && record.is_object() {
     if let Err(e) =
         crate::service::logs::ingest::handle_timestamp(&mut record, min_ts, max_ts)
     {
         let err_msg = format!("DestinationNode error handling timestamp: {}", e);
         if let Err(send_err) = error_sender
             .send((node.id.to_string(), node.node_type(), err_msg, None))
             .await
         {
             log::error!(
                 "[Pipeline] {} : DestinationNode failed sending errors for collection caused by: {send_err}",
                 pipeline_name
             );
             break;
         }
         continue;
     }
-
+    records.push(record);
+    count += 1;
+} else {
+    // Forward or at least account for non-object records to avoid silent drops.
     records.push(record);
     count += 1;
 }
```
Suggestion importance[1-10]: 5


Why: Highlighting potential silent drops for non-object records is reasonable and the proposed forwarding is consistent with prior behavior, but it changes semantics (may push invalid records) and may not be desired. Impact is modest and context-dependent.

Impact: Low

@oasisk oasisk force-pushed the fix-vrl-panic-and-ts-handling branch 5 times, most recently from f5af0c8 to a0cb4a5 on September 11, 2025 11:44
@oasisk oasisk force-pushed the fix-vrl-panic-and-ts-handling branch from a0cb4a5 to c7a6394 on September 11, 2025 13:27
@oasisk oasisk merged commit 49a6744 into main on Sep 11, 2025
28 checks passed
@oasisk oasisk deleted the fix-vrl-panic-and-ts-handling branch on September 11, 2025 14:51

Reviewers

@uddhavdave uddhavdave approved these changes

+1 more reviewer

@greptile-apps greptile-apps[bot] left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

☢️ Bug (Something isn't working), Review effort 2/5

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

2 participants

@oasisk, @uddhavdave
