Loki: Add a configurable ability to fudge incoming timestamps for guaranteed query sort order when receiving entries for the same stream that have duplicate timestamps. #6042
…utorSigned-off-by: Edward Welch <edward.welch@grafana.com>
LGTM. I wonder, though, whether this could have been done in the ingester instead, so that you have persistence and can avoid the between-batch problem.
I ran through the same thoughts with Owen and chased this down to the ingester's unordered head block to look at adding it there. It's possible, but it's more complex: instead of a simple check against the next entry, you almost end up in a recursive pattern of checking, finding a dupe, fudging, then checking that for a new dupe, fudging if necessary, checking again, and so on. Also, it could still only guarantee fudging for the active head block and not for already-cut blocks. So my conclusion was that it's a little better, but quite a bit more complex and performance-impacting, to do it in the ingester, and I think this is a better place to start. We can always revisit that path if it looks like it would work better.
What this PR does / why we need it:
Loki will accept entries with duplicate timestamps for the same stream as long as the log content is different.
Loki stores timestamps with nanosecond precision, which makes duplicates unlikely if your source system generates timestamps with this precision. However, many systems do not have this level of precision, and in some cases may only have second-level precision.
This leads to a common enough case where Loki receives multiple entries with the same timestamp.
The problem arises at query time. While Loki can definitely sort entries with different timestamps, when the timestamps are duplicates within the same stream it's currently not possible for Loki to guarantee they will always be displayed exactly as received.
This PR takes a fairly naive approach to solving this problem by intentionally fudging the timestamp of log lines with duplicate timestamps by one nanosecond, so that they are no longer duplicates and will always sort correctly at query time.
Two important things to note
I think, however, that this is a reasonable and simple approach to the most common case: a low-precision timestamp source producing multiple entries at the same timestamp, where it matters more that they sort in the received order at query time than that the timestamp is fudged by one or a few nanoseconds to accomplish this.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Checklist
CHANGELOG.md
.docs/sources/upgrading/_index.md