- Notifications
You must be signed in to change notification settings - Fork38
Add more columns/dimensions to pg_wait_sampling#97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:master
Are you sure you want to change the base?
Uh oh!
There was an error while loading.Please reload this page.
Conversation
…sionsSometimes it can be useful to have additional info collected with wait_events,so we add two new views/functions that include more information. The structure ofthose views could be changed in new versions of pg_wait_sampling extension
Also fix some typos/reword some sentences
DmitryNFomin commentedMar 26, 2025
@Medvecrab Performance penalty is exact risk that I would like to avoid while still want to have some extra dimensions. So my suggestion to add fields from PGPROC to the "main" views and fields from |
The idea of *_extended views is to add ALL additional fields to them (also any that could be added in the future) so the original views remain the same for backwards compatibility. I agree that if we were rewriting pg_wait_sampling anew, it would make some sense to distribute columns differently, but alas. This is also the reason that with my patch we still set profiling per profile/query_id with only the old GUC variables, not with the new ones When all fields from |
DmitryNFomin commentedMar 27, 2025
agree with your point, just one more comment, can we add isBackgroundWorker from PGPROC? I understand that backend_type is more detailed, but it we can get it without performance penalty while we can distinct regular backends in performance analysis already |
Seems like a fair request, will probably add after/during review |
Following the discussion in#94, I decided to add more columns to pg_wait_sampling_current/history/profile
I've added most of the columns from request in#94, but some fields from "pg_stat_activity" (=
localBackendStatusTable
) like query_text, query_start, query_id (if we actually took it from "pg_stat_activity") work in the following way: they save values at the start of the query and DO NOT clear those fields when the query has ended. So after executing "select 1;" in some client backend we will always sample its query_text and query_start until we execute other query. We specifically avoided this kind of sampling when we were making our way of tracking query_id with Executor hooks.All new columns are added to new views with suffix "_extended". Those views COULD be changed between extension versions, while existing views without those suffixes SHOULD NOT be changed.
One more thing to highlight - since we have to look into
localBackendStatusTable
when we sample some columns, we have reduced perfomance. This is unavoidable and is noted in updated README. BUT, in PostgreSQL 13-16 we can't reliably linkProcGlobal
andlocalBackendStatusTable
arrays and from this was bornget_beentry_by_procpid
. For each interesting process fromProcGlobal
we have to iterate throughlocalBackendStatusTable
. This has O(n^2) time complexity, where n is a number of all backends. Very inefficient. I probably could remake the collector code to iterate throughlocalBackendStatusTable
first and then find corresponding entries inProcGlobal
but I have not investigated it and am not sure it would be faster (it may be faster, depending onProcGlobal
access/iteration)There could some sloppy code, including possible "you should copy the struct here, not use pointer" moments
Everyone is welcome to review the code and share their thoughts.