feat: add organization-level alert deduplication with semantic field groups #9209

Merged
ByteBaker merged 13 commits into main from feat/dedup
Nov 26, 2025

ByteBaker commented Nov 20, 2025 (edited)

Summary

Implements org-level alert deduplication with semantic field groups for intelligent alert suppression and batched notifications. This feature prevents alert storms by deduplicating alerts based on configurable fingerprints and grouping related alerts together.

Core Capabilities

  • Semantic Field Groups: Map field name variations (hostname, host, node) to canonical dimensions for consistent deduplication across different data sources
  • Organization-Level Configuration: Global deduplication settings with cross-alert support to suppress alerts from different rules sharing semantic dimensions
  • Alert Grouping & Batching: Wait-and-collect logic with three send strategies (FirstWithCount, Summary, All) for batched notifications
  • Per-Alert Control: Individual alert deduplication settings with optional fingerprint field customization

Features

Deduplication Modes:

  • Per-alert exact matching using fingerprint-based TTL expiration
  • Cross-alert semantic matching using org-defined dimension groups
  • Configurable time windows (per-alert override or org-level default)
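
As a rough illustration of the per-alert mode, the sketch below hashes the configured fingerprint fields and suppresses a repeat that arrives inside the time window. The function names, the use of `DefaultHasher`, and the choice to measure the window from the last alert that was let through are all assumptions made for this example, not details confirmed by the PR.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical fingerprint: hash of the alert name plus the values of the
/// configured fingerprint fields (a missing field hashes as an empty string).
fn fingerprint(alert: &str, fields: &[&str], row: &HashMap<String, String>) -> u64 {
    let mut hasher = DefaultHasher::new();
    alert.hash(&mut hasher);
    for field in fields {
        field.hash(&mut hasher);
        row.get(*field).map(String::as_str).unwrap_or("").hash(&mut hasher);
    }
    hasher.finish()
}

/// Per-alert override wins; otherwise fall back to the org-level default.
fn effective_window(per_alert: Option<u64>, org_default: u64) -> u64 {
    per_alert.unwrap_or(org_default)
}

/// Remembers when each fingerprint was last let through (unix seconds).
#[derive(Default)]
struct DedupState {
    last_sent: HashMap<u64, u64>,
}

impl DedupState {
    /// Returns true when the same fingerprint was let through less than
    /// `window_secs` ago. An expired or unseen fingerprint is recorded and passes.
    fn should_suppress(&mut self, fp: u64, now: u64, window_secs: u64) -> bool {
        match self.last_sent.get(&fp) {
            Some(&sent_at) if now.saturating_sub(sent_at) < window_secs => true,
            _ => {
                self.last_sent.insert(fp, now);
                false
            }
        }
    }
}
```

With a 60-second window, an alert at t=100 passes, a repeat at t=130 is suppressed, and a repeat at t=160 passes again because the entry has expired.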

Alert Grouping:

  • Batch alerts with same fingerprint before sending notifications
  • Three strategies: first alert with count, summary with names, or all details
  • Configurable wait time (1-300 seconds) and max batch size (1-100 alerts)
  • Background processor checks for expired batches every 1 second
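
The wait-and-collect behaviour described above can be sketched as follows. This is a simplified model under stated assumptions: it uses a plain `HashMap` rather than the `DashMap` mentioned in the commit messages, passes time in as an integer instead of running a real 1-second worker, and the `Grouper`, `Batch`, and `render` names and the message formats are invented for illustration.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
enum SendStrategy {
    FirstWithCount,
    Summary,
    All,
}

struct Batch {
    alerts: Vec<String>,
    opened_at: u64, // unix seconds when the first alert arrived
}

/// Hypothetical in-memory grouper keyed by fingerprint.
struct Grouper {
    wait_secs: u64,
    max_size: usize,
    pending: HashMap<u64, Batch>,
}

impl Grouper {
    fn new(wait_secs: u64, max_size: usize) -> Self {
        Grouper { wait_secs, max_size, pending: HashMap::new() }
    }

    /// Adds an alert to its batch; returns the batch early once it is full.
    fn add(&mut self, fp: u64, alert: &str, now: u64) -> Option<Vec<String>> {
        let batch = self
            .pending
            .entry(fp)
            .or_insert_with(|| Batch { alerts: Vec::new(), opened_at: now });
        batch.alerts.push(alert.to_string());
        let full = batch.alerts.len() >= self.max_size;
        if full {
            return self.pending.remove(&fp).map(|b| b.alerts);
        }
        None
    }

    /// What a periodic tick would call: removes and returns expired batches.
    fn drain_expired(&mut self, now: u64) -> Vec<Vec<String>> {
        let wait = self.wait_secs;
        let expired: Vec<u64> = self
            .pending
            .iter()
            .filter(|(_, b)| now.saturating_sub(b.opened_at) >= wait)
            .map(|(fp, _)| *fp)
            .collect();
        expired
            .into_iter()
            .filter_map(|fp| self.pending.remove(&fp))
            .map(|b| b.alerts)
            .collect()
    }
}

/// Illustrative rendering of a batch for each strategy (formats are made up).
fn render(strategy: SendStrategy, alerts: &[String]) -> String {
    if alerts.is_empty() {
        return String::new();
    }
    match strategy {
        SendStrategy::FirstWithCount => format!("{} (+{} more)", alerts[0], alerts.len() - 1),
        SendStrategy::Summary => format!("{} alerts: {}", alerts.len(), alerts.join(", ")),
        SendStrategy::All => alerts.join("\n"),
    }
}
```

A batch therefore leaves the map in one of two ways: immediately when it reaches the maximum size, or on the first tick after the wait time has elapsed.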

Semantic Field Groups:

  • Built-in presets for common fields (host, IP address, service)
  • Kubernetes resource groups (cluster, namespace, pod, node, container)
  • Field normalization support (lowercase + trim)
  • Overlap handling with first-defined precedence
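
To make the mapping concrete, here is a minimal sketch of how field names might be resolved to canonical dimensions, with lowercase + trim normalization and first-defined precedence when a field name appears in more than one group. The struct fields (`id`, `field_names`, `normalize`) and the helper names are assumptions for this example; the PR does not show the real definitions.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical shape of a semantic field group.
struct SemanticFieldGroup {
    id: String,               // canonical dimension, e.g. "host"
    field_names: Vec<String>, // equivalent field names across data sources
    normalize: bool,          // apply lowercase + trim to values
}

/// Small constructor helper for the example.
fn group(id: &str, names: &[&str], normalize: bool) -> SemanticFieldGroup {
    SemanticFieldGroup {
        id: id.to_string(),
        field_names: names.iter().map(|n| n.to_string()).collect(),
        normalize,
    }
}

/// Maps a result row to canonical dimensions. A field name listed in several
/// groups belongs to the group defined first.
fn extract_dimensions(
    groups: &[SemanticFieldGroup],
    row: &HashMap<String, String>,
) -> BTreeMap<String, String> {
    let mut out = BTreeMap::new();
    for (i, g) in groups.iter().enumerate() {
        for name in &g.field_names {
            // Skip names already claimed by an earlier group.
            if groups[..i].iter().any(|earlier| earlier.field_names.contains(name)) {
                continue;
            }
            if let Some(raw) = row.get(name) {
                let value = if g.normalize {
                    raw.trim().to_lowercase()
                } else {
                    raw.clone()
                };
                out.insert(g.id.clone(), value);
                break; // first matching field name in the group wins
            }
        }
    }
    out
}
```

Under these assumptions, a row with `hostname = " Web-1 "` and a row with `node = "web-1"` both resolve to the dimension `host = "web-1"`, which is what lets alerts from different data sources share a fingerprint.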

Configuration Types

  • SemanticFieldGroup: Field name equivalence definitions with normalization
  • GlobalDeduplicationConfig: Org-level settings with semantic groups and cross-alert dedup
  • DeduplicationConfig: Per-alert fingerprint fields and grouping options
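
For orientation, a per-alert config along these lines might look like the sketch below. Only `DeduplicationConfig`, `group_wait_seconds`, and `max_group_size` are names that appear in this PR; `GroupingOptions`, the other field names, and the `validate` method are hypothetical, with the limits taken from the ranges quoted under Alert Grouping.

```rust
/// Hypothetical grouping options for one alert.
#[derive(Debug, Clone, PartialEq)]
struct GroupingOptions {
    group_wait_seconds: u64, // documented range: 1-300
    max_group_size: usize,   // documented range: 1-100
}

/// Hypothetical per-alert deduplication settings.
#[derive(Debug, Clone, PartialEq)]
struct DeduplicationConfig {
    enabled: bool,
    fingerprint_fields: Vec<String>,
    grouping: Option<GroupingOptions>, // None means grouping is off
}

impl DeduplicationConfig {
    /// Rejects blank fingerprint field names and out-of-range grouping limits.
    fn validate(&self) -> Result<(), String> {
        if self.fingerprint_fields.iter().any(|f| f.trim().is_empty()) {
            return Err("fingerprint field names must not be blank".to_string());
        }
        if let Some(g) = &self.grouping {
            if !(1..=300).contains(&g.group_wait_seconds) {
                return Err(format!(
                    "group_wait_seconds must be 1-300, got {}",
                    g.group_wait_seconds
                ));
            }
            if !(1..=100).contains(&g.max_group_size) {
                return Err(format!(
                    "max_group_size must be 1-100, got {}",
                    g.max_group_size
                ));
            }
        }
        Ok(())
    }
}
```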

API Endpoints

  • GET /{org_id}/alerts/deduplication/config - Retrieve org-level dedup config
  • POST /{org_id}/alerts/deduplication/config - Set org-level dedup config
  • DELETE /{org_id}/alerts/deduplication/config - Delete org-level dedup config

Implementation Details

  • Pure business logic algorithms in enterprise layer (fingerprinting, matching, calculations)
  • OSS service layer orchestrates database operations with enterprise delegation
  • Feature-gated with #[cfg(feature = "enterprise")] for enterprise-only functionality
  • Background job architecture for async batch processing
  • Storage: Org config in KV DB at /alert_config/{org_id}/deduplication
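
The feature gating and storage key can be pictured with a small sketch. The key format comes from the description above; the function names, return types, and error message are assumptions, and the enterprise branch only returns the key where a real build would load the stored config.

```rust
/// Builds the KV key for an org's deduplication config.
fn dedup_config_key(org_id: &str) -> String {
    format!("/alert_config/{org_id}/deduplication")
}

/// Enterprise build: stand-in for loading the stored config.
#[cfg(feature = "enterprise")]
fn get_dedup_config(org_id: &str) -> Result<String, String> {
    Ok(dedup_config_key(org_id))
}

/// OSS build: same signature, but the feature is unavailable.
#[cfg(not(feature = "enterprise"))]
fn get_dedup_config(_org_id: &str) -> Result<String, String> {
    Err("deduplication requires an enterprise build".to_string())
}
```

Keeping both variants behind the same signature means callers compile unchanged in either build, which is one common way to structure this kind of dual implementation.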

Architecture

Alert Triggered
     ↓
Query Evaluation
     ↓
Deduplication Check (fingerprint + TTL)
     ↓
Grouping Enabled? → Add to Batch → Wait/Collect → Send Grouped
     ↓ No
Send Individual Notification
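
Read as a decision function, the flow above might be sketched like this. The `Outcome` enum and `route` function are hypothetical names, and the suppression branch is inferred from the deduplication check rather than drawn explicitly in the diagram.

```rust
/// Possible results of one alert evaluation (names are illustrative).
#[derive(Debug, PartialEq)]
enum Outcome {
    NotFiring,        // query evaluation did not match
    Suppressed,       // duplicate fingerprint inside the TTL window
    Batched,          // added to a pending group, sent later
    SentIndividually, // grouping disabled, notify immediately
}

/// Walks the pipeline stages in the order shown in the diagram.
fn route(query_matched: bool, is_duplicate: bool, grouping_enabled: bool) -> Outcome {
    if !query_matched {
        return Outcome::NotFiring;
    }
    if is_duplicate {
        return Outcome::Suppressed;
    }
    if grouping_enabled {
        Outcome::Batched
    } else {
        Outcome::SentIndividually
    }
}
```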

@github-actions

Failed to generate code suggestions for PR

ByteBaker force-pushed the feat/dedup branch 6 times, most recently from b8a5ee6 to c83de04, November 25, 2025 07:46
ByteBaker added a commit that referenced this pull request Nov 25, 2025

Implements alert correlation that groups related alerts into incidents based on semantic field matching and temporal proximity.

**Backend:**
- Add `correlation.rs` config with validation for correlation dimensions and matching strategies
- Add `alert_incidents` and `alert_incident_alerts` database entities with SeaORM
- Add database migration `m20251107_000003_create_alert_correlation_schema`
- Add `correlation.rs` service with transaction-safe incident creation and matching
- Add `incidents.rs` HTTP handlers for incident CRUD operations (6 endpoints)
- Integrate correlation into alert scheduler to auto-correlate on alert firing
- Add 7 correlation metrics for observability (incidents created, alerts matched, confidence distribution, processing duration, MTTR)
- Update `org_config.rs` with correlation config persistence functions
- Update organization settings to include deduplication config in response

**Frontend:**
- Add `IncidentList.vue` component with status filtering and sortable table
- Add `IncidentDetailsDrawer.vue` for viewing incident details and associated alerts
- Add Incidents tab to `AlertList.vue` for accessing incident management UI
- Add 6 incident API methods to `alerts.ts` service (list, get, update status, config CRUD)

**Fixes:**
- Fix `OrganizationSettingResponse` test to include `deduplication_config` field
- Fix metering init call signature (remove extra argument)
- Comment out data retention usage code pending enterprise module update

Migrated from PR #9011, separated from deduplication feature (PR #9209).
ByteBaker and others added 10 commits November 26, 2025 11:22
…groups

Implements org-level deduplication configuration with semantic field groups for intelligent alert suppression and batched notifications.

Core capabilities:
- Semantic field groups: Map field name variations (`hostname`/`host`/`node`) to canonical dimensions for consistent deduplication across data sources
- Org-level dedup config: Global settings with cross-alert deduplication support to suppress alerts sharing semantic dimensions
- Alert grouping: Wait-and-collect batching with three send strategies (`FirstWithCount`, `Summary`, `All`)
- HTTP API: Endpoints for org-level deduplication configuration management

Features:
- Per-alert fingerprint-based deduplication with TTL expiration
- Cross-alert semantic matching using org-defined dimension groups
- Configurable time windows (per-alert override or org default)
- Background job processes expired batches every 1 second
- Three notification strategies for grouped alerts

Configuration:
- `SemanticFieldGroup`: Define field equivalences with optional normalization
- `GlobalDeduplicationConfig`: Org-level settings stored at `/alert_config/{org_id}/deduplication`
- `DeduplicationConfig`: Per-alert settings with fingerprint fields and grouping options
- Default presets for common semantic groups (host, IP, service, K8s resources)

API Endpoints:
- `GET/POST/DELETE /{org_id}/alerts/deduplication/config`

Implementation:
- Business logic: Pure algorithms in enterprise layer for fingerprinting and matching
- Service layer: Orchestrates DB operations with algorithm delegation
- HTTP handlers: Feature-gated dual implementations for OSS/enterprise builds
- Background jobs: Batch processor for grouped notification delivery

Implemented wait-and-batch mechanism to group multiple alerts with the same fingerprint before sending a single notification, reducing alert fatigue and improving visibility.

**Alert Grouping/Batching:**
- Added `grouping.rs` module with in-memory batch storage using `DashMap`
- Implemented background worker in `alert_grouping.rs` polling every 1s for expired batches
- Integrated grouping logic in `scheduler/handlers.rs` after deduplication
- Supported all three `SendStrategy` variants: `FirstWithCount`, `Summary`, `All`
- Auto-send when `max_group_size` reached or timer expires after `group_wait_seconds`
- Registered background worker in `job/mod.rs`

**Observability - Prometheus Metrics:**
- Added 8 metrics to `metrics.rs`: dedup suppressions/passed/errors, grouping batches pending/sent/size/wait-time/errors
- Instrumented `deduplication.rs` to track suppressions and passed alerts by type (same-alert vs cross-alert)
- Instrumented `grouping.rs` and `alert_grouping.rs` to track batch lifecycle
- All metrics registered in Prometheus registry and exposed at `/metrics` endpoint

**UI Visibility:**
- Added dedup badges to alert names in `AlertList.vue` showing configuration status
- Added dedup column in `AlertHistory.vue` with visual indicators for sent/suppressed/grouped alerts
- Created `DedupSummaryCards.vue` component displaying org-wide stats (total alerts, dedup enabled count, suppression rate, pending batches)
- Added backend API `dedup_stats.rs` with `/alerts/dedup/summary` endpoint
- Removed legacy View History button from alert list page
- Extended `AlertHistoryEntry` with dedup fields: `dedup_enabled`, `dedup_suppressed`, `dedup_count`, `grouped`, `group_size`

**Logging & Debugging:**
- Comprehensive logging throughout grouping flow with `[grouping]` and `[alert_grouping_worker]` prefixes
- Enhanced deduplication logging with `[dedup]` prefix showing fingerprints and occurrence counts
- Added `get_pending_batch_count()` helper for API consumption

**Technical Details:**
- All features properly gated behind `#[cfg(feature = "enterprise")]`
- Backward compatible: grouping disabled by default
- In-memory batches cleared on restart (acceptable for 30s window)
- Thread-safe implementation using `DashMap` and atomic operations

Extended TriggerData with dedup fields for per-execution visibility and added compact dedup column to alert table.

Alert History Tracking:
- Added dedup fields to TriggerData: dedup_enabled, dedup_suppressed, dedup_count, grouped, group_size
- Implemented Default trait for clean initialization with ..Default::default()
- Set tracking fields in handlers.rs when alerts are suppressed, grouped, or sent
- History UI now shows actual dedup activity with icons

Alert List UI:
- Added compact Dedup column in alert table (80px width)
- Shows check icon if dedup enabled, dash if not
- Tooltip displays fingerprint fields and grouping config
- Clean inline status per alert without UI clutter
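
The `..Default::default()` initialization mentioned in this commit can be illustrated with a reduced stand-in for `TriggerData`. The five dedup field names come from the commit message; their types, the `status` placeholder for the pre-existing fields, and the `suppressed_trigger` helper are assumptions.

```rust
/// Reduced, hypothetical version of TriggerData for illustration only.
#[derive(Debug, Default, Clone, PartialEq)]
struct TriggerData {
    status: String, // placeholder for the fields that existed before this PR
    dedup_enabled: bool,
    dedup_suppressed: bool,
    dedup_count: u32,
    grouped: bool,
    group_size: u32,
}

/// Sets only the fields relevant to a suppressed alert; every other field
/// takes its Default value, so adding new fields does not break this code.
fn suppressed_trigger(count: u32) -> TriggerData {
    TriggerData {
        dedup_enabled: true,
        dedup_suppressed: true,
        dedup_count: count,
        ..Default::default()
    }
}
```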

Fixed four failing unit tests:
1. test_organization_config_minimal_serialization - Updated to check correct field name 'alert_dedup_enabled' instead of 'cross_alert_dedup'
2. test_flatten_json_complex - Made JSON comparison order-independent by parsing embedded JSON strings
3. test_flatten_with_level - Added helper function to structurally compare JSON values regardless of key ordering
4. test_trigger_data_field_names - Updated field count from 21 to 26 to reflect new dedup/grouping fields

The tests were failing due to non-deterministic JSON key ordering and outdated field counts.
ByteBaker and others added 3 commits November 26, 2025 11:30
ByteBaker merged commit 44011fb into main Nov 26, 2025
34 checks passed
ByteBaker deleted the feat/dedup branch November 26, 2025 06:39

Reviewers

oasisk approved these changes

3 participants

@ByteBaker @oasisk @Shrinath-O2