[Enhancement] PowerShell - Optimize Entropy Calculation, Add Normalized Entropy, Add Pipeline Benchmark #16707

Merged
w0rk3r merged 6 commits into main from posh_entropy_2
Jan 26, 2026
@w0rk3r (Contributor) commented Dec 26, 2025 (edited)

Proposed commit message

windows: refine PowerShell script entropy pipeline

Replace code-point HashMap counting with a fixed 65k UTF-16 char histogram and skip truncated signature fragments before entropy is computed. Add a normalized entropy field scaled by script length (0–1).

Summary

Related issue:

This PR:

  • Replaces code-point HashMap counting with a fixed 65k UTF-16 char histogram for script entropy, reducing the script processor time and raising pipeline throughput (2924 → 4873 eps in the warm run).
  • Skips truncated signature fragments before entropy is computed.
  • Adds powershell.file.script_block_entropy_normalized = entropy_bits / log2(script_block_length) (0–1).
  • Adds benchmark fixtures to track performance regressions during our research.
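The histogram approach and the normalized field described in the list above can be sketched roughly as follows. This is a minimal Python illustration, not the Painless processor from this PR: the function names `utf16_units` and `script_block_entropy` are hypothetical, and it assumes that script_block_length counts UTF-16 code units.

```python
import math
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a tuple of ints (0..65535)."""
    data = text.encode("utf-16-le")
    return struct.unpack("<%dH" % (len(data) // 2), data)


def script_block_entropy(text):
    """Return (entropy_bits, normalized_entropy) for text.

    Hypothetical sketch: counts UTF-16 code units in a fixed 65,536-slot
    histogram instead of a hash map keyed by code point.
    """
    units = utf16_units(text)
    length = len(units)
    if length == 0:
        return 0.0, 0.0

    hist = [0] * 65536   # one slot per possible UTF-16 code unit
    seen = []            # distinct units, so the full table is never scanned
    for u in units:
        if hist[u] == 0:
            seen.append(u)
        hist[u] += 1

    # Shannon entropy in bits over the observed code units.
    entropy = 0.0
    for u in seen:
        p = hist[u] / length
        entropy -= p * math.log2(p)

    # Normalize by log2(length), the maximum reached when every unit is unique.
    normalized = entropy / math.log2(length) if length > 1 else 0.0
    normalized = min(max(normalized, 0.0), 1.0)  # guard against rounding drift
    return entropy, normalized
```

For example, `script_block_entropy("abab")` gives 1 bit of entropy and a normalized value of 0.5, because the maximum for four characters is log2(4) = 2 bits.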

Old pipeline:

image

Improved pipeline:

image
Complete benchmark output

Old:

PS C:\Users\Jonhnathan\Documents\Github\integrations\packages\windows> .\..\..\elastic-package.exe benchmark pipeline --data-streams powershell_operational --use-test-samples=false
Run pipeline benchmarks for the package
--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   1.10s │
│ eps             │ 2278.94 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ script @ default.yml:322      │ 47.49% │
│ gsub @ default.yml:305        │ 30.36% │
│ fingerprint @ default.yml:311 │  3.19% │
│ set @ default.yml:60          │  2.10% │
│ script @ default.yml:13       │  1.82% │
│ gsub @ default.yml:316        │  1.09% │
│ script @ default.yml:30       │  1.00% │
│ remove @ default.yml:575      │  0.55% │
│ rename @ default.yml:290      │  0.18% │
│ trim @ default.yml:302        │  0.18% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ script @ default.yml:322      │ 208.4µs │
│ gsub @ default.yml:305        │ 133.2µs │
│ fingerprint @ default.yml:311 │    14µs │
│ set @ default.yml:60          │   9.2µs │
│ script @ default.yml:13       │     8µs │
│ gsub @ default.yml:316        │   4.8µs │
│ script @ default.yml:30       │   4.4µs │
│ remove @ default.yml:575      │   2.4µs │
│ rename @ default.yml:290      │   800ns │
│ trim @ default.yml:302        │   800ns │
╰───────────────────────────────┴─────────╯
--- Benchmark results for package: windows - END   ---
Done

--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   0.85s │
│ eps             │ 2923.98 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ script @ default.yml:322      │ 50.53% │
│ gsub @ default.yml:305        │ 34.15% │
│ fingerprint @ default.yml:311 │  2.57% │
│ gsub @ default.yml:316        │  1.17% │
│ script @ default.yml:13       │  0.70% │
│ set @ default.yml:60          │  0.58% │
│ remove @ default.yml:575      │  0.35% │
│ script @ default.yml:30       │  0.35% │
│ rename @ default.yml:290      │  0.12% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ script @ default.yml:322      │ 172.8µs │
│ gsub @ default.yml:305        │ 116.8µs │
│ fingerprint @ default.yml:311 │   8.8µs │
│ gsub @ default.yml:316        │     4µs │
│ script @ default.yml:13       │   2.4µs │
│ set @ default.yml:60          │     2µs │
│ remove @ default.yml:575      │   1.2µs │
│ script @ default.yml:30       │   1.2µs │
│ rename @ default.yml:290      │   400ns │
╰───────────────────────────────┴─────────╯
--- Benchmark results for package: windows - END   ---
Done

Improved:

PS C:\Users\Jonhnathan\Documents\Github\integrations\packages\windows> .\..\..\elastic-package.exe benchmark pipeline --data-streams powershell_operational --use-test-samples=false
Run pipeline benchmarks for the package
--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   0.51s │
│ eps             │ 4892.37 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ gsub @ default.yml:305        │ 55.19% │
│ script @ default.yml:322      │ 28.18% │
│ fingerprint @ default.yml:311 │  4.11% │
│ gsub @ default.yml:316        │  1.96% │
│ script @ default.yml:13       │  0.59% │
│ remove @ default.yml:657      │  0.39% │
│ rename @ default.yml:290      │  0.20% │
│ set @ default.yml:60          │  0.20% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ gsub @ default.yml:305        │ 112.8µs │
│ script @ default.yml:322      │  57.6µs │
│ fingerprint @ default.yml:311 │   8.4µs │
│ gsub @ default.yml:316        │     4µs │
│ script @ default.yml:13       │   1.2µs │
│ remove @ default.yml:657      │   800ns │
│ rename @ default.yml:290      │   400ns │
│ set @ default.yml:60          │   400ns │
╰───────────────────────────────┴─────────╯
--- Benchmark results for package: windows - END   ---
Done

--- Benchmark results for package: windows - START ---
╭─────────────────────────╮
│ parameters              │
├──────────────────┬──────┤
│ source_doc_count │   11 │
│ doc_count        │ 2500 │
╰──────────────────┴──────╯
╭───────────────────────────╮
│ pipeline_performance      │
├─────────────────┬─────────┤
│ processing_time │   0.51s │
│ eps             │ 4873.29 │
╰─────────────────┴─────────╯
╭────────────────────────────────────────╮
│ procs_by_total_time                    │
├───────────────────────────────┬────────┤
│ gsub @ default.yml:305        │ 57.89% │
│ script @ default.yml:322      │ 25.93% │
│ fingerprint @ default.yml:311 │  3.51% │
│ gsub @ default.yml:316        │  1.95% │
│ script @ default.yml:13       │  0.78% │
│ remove @ default.yml:657      │  0.39% │
│ set @ default.yml:60          │  0.19% │
╰───────────────────────────────┴────────╯
╭─────────────────────────────────────────╮
│ procs_by_avg_time_per_doc               │
├───────────────────────────────┬─────────┤
│ gsub @ default.yml:305        │ 118.8µs │
│ script @ default.yml:322      │  53.2µs │
│ fingerprint @ default.yml:311 │   7.2µs │
│ gsub @ default.yml:316        │     4µs │
│ script @ default.yml:13       │   1.6µs │
│ remove @ default.yml:657      │   800ns │
│ set @ default.yml:60          │   400ns │
╰───────────────────────────────┴─────────╯
--- Benchmark results for package: windows - END   ---
Done
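As a quick sanity check on the figures above, the relative gains can be derived from the second (warm) run of each benchmark. The variable names below are illustrative only; the numbers are copied from the benchmark output.

```python
# Warm-run figures copied from the benchmark output above.
old_warm_eps = 2923.98   # old pipeline, second run
new_warm_eps = 4873.29   # improved pipeline, second run

old_script_us = 172.8    # script @ default.yml:322, avg time per doc (old)
new_script_us = 53.2     # script @ default.yml:322, avg time per doc (improved)

# Throughput gain of the whole pipeline.
eps_speedup = new_warm_eps / old_warm_eps

# Per-document speedup of the entropy script processor alone.
script_speedup = old_script_us / new_script_us

print(f"pipeline eps speedup: {eps_speedup:.2f}x")
print(f"script processor speedup: {script_speedup:.2f}x")
```

The whole pipeline gains about 1.67x in throughput, while the script processor itself is a little over 3x faster per document. The difference arises because the gsub processor at default.yml:305 is unchanged and now dominates the total time.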

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

@w0rk3r self-assigned this Dec 26, 2025
@w0rk3r requested review from a team as code owners December 26, 2025 22:21
@w0rk3r added the enhancement, Integration:windows, and Team:Security-Windows Platform labels Dec 26, 2025
@elasticmachine

Pinging @elastic/sec-windows-platform (Team:Security-Windows Platform)

@pierrehilbert added the Team:Elastic-Agent-Data-Plane label Jan 4, 2026
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@mauri870 self-requested a review January 5, 2026 12:15

@mauri870 (Member) left a comment

LGTM, but I'm not very proficient with PowerShell. The code looks fine, but it needs a deeper look from the Windows team.

@andrewkroh added the documentation label Jan 8, 2026

double normalizedEntropy = 0.0;
if (length > 1) {
double maxEntropy = Math.log((double) length) * invLog2; // max bits if every character is unique
Contributor

I think the normalized entropy calculation looks good 👍

Few notes for posterity:

  • For the line double maxEntropy = Math.log((double) length) * invLog2; // max bits if every character is unique I think it makes sense to use length here. Typical normalized entropy calculations (like that for R/Posterior, ref) would use something akin to seenCount instead of length. However, that approach expects the input to be more akin to categories, where a and a are equivalent regardless of their position in the script block. In our case, I think we want the position to matter as well, so each value is by definition unique, making length the correct number to use here (as is correctly done in the code).
  • The pre-output check else if (normalizedEntropy > 1.0) normalizedEntropy = 1.0; is, I think, technically not necessary, as this should not occur. However, I think we should keep this check, as it could catch floating point rounding issues without impacting the integrity of the data result (code is correct as is).
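The difference between the two normalizers discussed in these notes can be illustrated with a small Python sketch. The helper names are hypothetical, and for simplicity it counts Python characters rather than UTF-16 code units; it is not the Painless code under review.

```python
import math
from collections import Counter


def entropy_bits(text):
    """Shannon entropy of the character distribution, in bits."""
    n = len(text)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def normalized_by_length(text):
    """Normalize by log2(length), as the PR does; clamped to [0, 1]."""
    n = len(text)
    if n <= 1:
        return 0.0
    value = entropy_bits(text) / math.log2(n)
    # The upper clamp should never trigger mathematically; it only guards
    # against floating point rounding pushing the ratio slightly above 1.0.
    return min(max(value, 0.0), 1.0)


def normalized_by_seen_count(text):
    """Normalize by log2(number of distinct characters), the categorical form."""
    k = len(set(text))
    if k <= 1:
        return 0.0
    return entropy_bits(text) / math.log2(k)
```

With the input "abab", `normalized_by_seen_count` returns 1.0 because both symbols are equally frequent, while `normalized_by_length` returns 0.5. Only a block whose characters are all distinct, such as "abcd", reaches 1.0 under the length-based form.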

w0rk3r reacted with thumbs up emoji
@elastic-vault-github-plugin-prod

🚀 Benchmarks report

To see the full report, comment with /test benchmark fullreport

@elasticmachine

💚 Build Succeeded

History

cc @w0rk3r

@w0rk3r merged commit da83fd3 into main Jan 26, 2026
8 checks passed
@w0rk3r deleted the posh_entropy_2 branch January 26, 2026 23:31
@elastic-vault-github-plugin-prod

Package windows - 3.4.0 containing this change is available at https://epr.elastic.co/package/windows/3.4.0/

jakubgalecki0 pushed a commit to jakubgalecki0/integrations that referenced this pull request Feb 19, 2026
…ed Entropy, Add Pipeline Benchmark (elastic#16707)
* [Enhancement] PowerShell - Optimize Entropy Calculation, Add Normalized Entropy, Add Pipeline Benchmark
* Update test-powershell-operational-events.json-expected.json
* Update changelog.yml
* rename benchmark file

Reviewers

@eric-forte-elastic left review comments

@gogochan approved these changes

@mauri870 approved these changes

@nfritts approved these changes

@faec: awaiting requested review (faec is a code owner automatically assigned from elastic/elastic-agent-data-plane)

Assignees

@w0rk3r

Labels

  • documentation: Improvements or additions to documentation. Applied to PRs that modify *.md files.
  • enhancement: New feature or request
  • Integration:windows: Windows
  • Team:Elastic-Agent-Data-Plane: Agent Data Plane team [elastic/elastic-agent-data-plane]
  • Team:Security-Windows Platform: Security Windows Platform team [elastic/sec-windows-platform]

Projects

None yet

Milestone

No milestone


8 participants

@w0rk3r @elasticmachine @gogochan @mauri870 @nfritts @eric-forte-elastic @pierrehilbert @andrewkroh
