Closed
Description
In the latest scale test, we ran into coderd restarts. Upon inspecting the logs, we saw that the OOM killer had been invoked and that there were a lot of log messages from the aggregator: "update queue is full".
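For readers unfamiliar with this kind of message, the following is a minimal sketch of how a bounded update queue can end up reporting that it is full. It assumes the queue is a buffered Go channel with a non-blocking send; the names `aggregator`, `update`, `enqueue`, and `dropped` are hypothetical and are not taken from the coderd source.

```go
package main

import "sync/atomic"

// update is a hypothetical stand-in for one batch of agent metrics.
type update struct {
	agentID string
	metrics []float64
}

// aggregator is a simplified, hypothetical model of a metrics aggregator
// fed through a bounded queue. It is not coderd's actual type.
type aggregator struct {
	queue   chan update  // bounded queue of pending updates
	dropped atomic.Int64 // number of updates rejected because the queue was full
}

// newAggregator creates an aggregator whose queue holds at most size updates.
func newAggregator(size int) *aggregator {
	return &aggregator{queue: make(chan update, size)}
}

// enqueue tries to add an update without blocking the caller.
// It returns false when the queue is already full, which is the
// situation a message such as "update queue is full" would report.
func (a *aggregator) enqueue(u update) bool {
	select {
	case a.queue <- u:
		return true
	default:
		a.dropped.Add(1)
		return false
	}
}
```

In a model like this, a consumer that spends too long on each update lets the queue fill up, after which every further `enqueue` call is rejected until the consumer catches up.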
As can be seen from these graphs, both CPU and memory usage spikes coincided with the restart(s).

What can also be observed above is that one coderd instance had its CPU pegged. Upon inspecting the CPU profile and trace, the finger once again points towards the aggregator:

The same is shown in the CPU profile:

One code path that runs for a long time (as shown in the trace above) is this loop:
for _, m := range req.metrics {
// ping @mtojek for insights, since you worked on the initial feature.