Search optimization and indexing based on datetime #405
GrzegorzPustulka wants to merge 6 commits into stac-utils:main from GrzegorzPustulka:search_optimization
GrzegorzPustulka commented Jul 8, 2025 • edited
@jonhealy1 The MR is finished and ready for code review.
Related Issue(s):
Index Management System with Time-based Partitioning
Description
This PR introduces a new index management system that enables automatic index partitioning based on dates and index size control with automatic splitting.
How it works
System Architecture
The system consists of several main components:
1. Search Engine Adapters
- `SearchEngineAdapter` - base class
- `ElasticsearchAdapter` and `OpenSearchAdapter` - implementations for specific engines
2. Index Selection Strategies
- `AsyncDatetimeBasedIndexSelector` / `SyncDatetimeBasedIndexSelector` - date-based index filtering
- `UnfilteredIndexSelector` - returns all indexes (fallback)
3. Data Insertion Strategies
Datetime Strategy - Operation Details
Index Format:
Item Insertion Process:
- The system reads the item's date (`properties.datetime`)
- If the index exceeds the size limit (`DATETIME_INDEX_MAX_SIZE_GB`) - splits index
Early Date Handling:
If item has date earlier than oldest index:
Index Splitting:
When index exceeds size limit:
Cache and Performance
IndexCacheManager:
AsyncIndexAliasLoader / SyncIndexAliasLoader:
Configuration
New Environment Variables:
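A minimal sketch of how the two variables named in this description could be read. The dataclass, the `load_settings` helper, and both default values are assumptions for illustration (filtering off, because `false` is described as the backward-compatible setting, and 25 GB, because that is the figure used in Scenario 2 below); the PR's actual parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DatetimeIndexSettings:
    """Hypothetical container for the two settings named in the description."""
    filtering_enabled: bool
    max_size_gb: float


def load_settings(env: Mapping[str, str] = os.environ) -> DatetimeIndexSettings:
    # Assumed defaults: "false" and "25"; the real defaults are not stated here.
    raw_flag = env.get("ENABLE_DATETIME_INDEX_FILTERING", "false")
    raw_size = env.get("DATETIME_INDEX_MAX_SIZE_GB", "25")
    return DatetimeIndexSettings(
        filtering_enabled=raw_flag.strip().lower() in {"1", "true", "yes"},
        max_size_gb=float(raw_size),
    )
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise in tests.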
Usage Examples
Scenario 1: Adding items to new collection
- Item with date `2025-01-15` → creates index `items_collection_2025-01-15`
Scenario 2: Size limit exceeded
- Index `items_collection_2025-01-01` reaches 25GB
- Item with date `2025-03-15` → system splits index:
  - `items_collection_2025-01-01-2025-03-15`
  - `items_collection_2025-03-16`
Scenario 3: Item with early date
- Existing index: `items_collection_2025-02-01`
- Item with date `2024-12-15` → creates: `items_collection_2024-12-15-2025-01-31`
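The naming arithmetic in the three scenarios can be reproduced with a few date helpers. This is a sketch inferred only from the index names shown above; the function names are hypothetical and are not taken from the PR's code.

```python
from datetime import date, timedelta
from typing import Tuple


def open_index_name(collection: str, start: date) -> str:
    """Name of an index that has a start date but no end date yet."""
    return f"items_{collection}_{start.isoformat()}"


def closed_index_name(collection: str, start: date, end: date) -> str:
    """Name of an index that covers a fixed start-end range."""
    return f"items_{collection}_{start.isoformat()}-{end.isoformat()}"


def split_on_size_limit(collection: str, start: date, split_date: date) -> Tuple[str, str]:
    """Scenario 2: close the full index at split_date and open a new one the next day."""
    closed = closed_index_name(collection, start, split_date)
    opened = open_index_name(collection, split_date + timedelta(days=1))
    return closed, opened


def early_index_name(collection: str, item_date: date, oldest_start: date) -> str:
    """Scenario 3: cover the span from the early item to the day before the oldest index."""
    return closed_index_name(collection, item_date, oldest_start - timedelta(days=1))
```

For example, `split_on_size_limit("collection", date(2025, 1, 1), date(2025, 3, 15))` yields the two names listed in Scenario 2.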
Search
The system automatically filters indexes during search:
Query with date range:
Searches only indexes containing items from this period, instead of all collection indexes.
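One way to picture the filtering step is an overlap test between the query's date range and the range encoded in each index name. The sketch below assumes the naming format from the usage scenarios and treats an index without an end date as open-ended; `index_range` and `select_indexes` are hypothetical names, and the PR's selectors may work from aliases or cached metadata instead of parsing names.

```python
import re
from datetime import date
from typing import List, Optional, Tuple

# Trailing "YYYY-MM-DD" or "YYYY-MM-DD-YYYY-MM-DD" in an index name.
_RANGE = re.compile(r"_(\d{4}-\d{2}-\d{2})(?:-(\d{4}-\d{2}-\d{2}))?$")


def index_range(name: str) -> Tuple[date, Optional[date]]:
    """Return (start, end) parsed from the name; end is None for an open index."""
    match = _RANGE.search(name)
    if match is None:
        raise ValueError(f"no date suffix in index name: {name}")
    start = date.fromisoformat(match.group(1))
    end = date.fromisoformat(match.group(2)) if match.group(2) else None
    return start, end


def select_indexes(names: List[str], query_start: date, query_end: date) -> List[str]:
    """Keep only the indexes whose range overlaps [query_start, query_end]."""
    selected = []
    for name in names:
        start, end = index_range(name)
        # An open index (no end date) is treated as extending indefinitely.
        if start <= query_end and (end is None or end >= query_start):
            selected.append(name)
    return selected
```

With three indexes covering December to January, February to mid-March, and mid-March onward, a query for 10 to 20 February would touch only the second one.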
Factories
IndexSelectorFactory:
- `create_async_selector()` / `create_sync_selector()`
IndexInsertionFactory:
SearchEngineAdapterFactory:
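As an illustration of the factory idea for index selectors, the sketch below picks between a date-based strategy and the unfiltered fallback from a single boolean flag. The class names, the `select` signature, and `create_selector` are simplified stand-ins invented for this example; they are not the signatures of `IndexSelectorFactory` or the selector classes in this PR.

```python
from typing import Callable, List


class UnfilteredSelector:
    """Fallback strategy: return every index of the collection."""

    def select(self, indexes: List[str], wanted: Callable[[str], bool]) -> List[str]:
        return list(indexes)


class DatetimeSelector:
    """Date-based strategy: keep only indexes accepted by the `wanted` predicate."""

    def select(self, indexes: List[str], wanted: Callable[[str], bool]) -> List[str]:
        return [name for name in indexes if wanted(name)]


def create_selector(filtering_enabled: bool):
    """Hypothetical factory: the flag decides which strategy callers receive."""
    return DatetimeSelector() if filtering_enabled else UnfilteredSelector()
```

Because both strategies expose the same method, calling code does not need to know whether filtering is enabled.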
Backward Compatibility
- `ENABLE_DATETIME_INDEX_FILTERING=false` → works as before
- All operations have sync and async versions for different usage contexts in the application.
PR Checklist:
- Code is formatted and linted (run `pre-commit run --all-files`)
- Tests pass (run `make test`)