spotify/postgresql-metrics


Service to extract and provide metrics on your PostgreSQL database.

This tool is a CLI (command line) tool that can be called to extract statistics and create metrics from your PostgreSQL database cluster. The CLI can also run in long running process mode, which periodically sends the gathered metrics forward.

The default metrics format is a Metrics 2.0 compatible JSON, which is created by a set of functions listed in the configuration file.

The extracted metrics can be printed out as direct output of the CLI tool, or sent out of the host the postgresql-metrics process is running on using FFWD.

The FFWD format is plain JSON sent over a UDP socket, so you can use any UDP socket endpoint that understands JSON to consume the metrics.
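
As a quick illustration, any small UDP listener that parses JSON can act as such an endpoint. A minimal sketch in Python (the address and port are arbitrary example values, not FFWD defaults; use whatever your FFWD setup actually sends to):

import json
import socket

# Minimal UDP endpoint that consumes FFWD-style JSON metrics.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 19000))

while True:
    data, _addr = sock.recvfrom(65535)
    try:
        metric = json.loads(data.decode("utf-8"))
    except ValueError:
        continue  # ignore datagrams that are not valid JSON
    print(metric)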

Prerequisites

The versions mentioned below are tested to work, but the code should work on many unlisted versions as well. Just add an issue or send a pull request with missing prerequisites if you test and confirm postgresql-metrics to work on other versions of the mentioned technologies.

  • Python 3.5 or later
  • PostgreSQL 9.3 or later
  • Debian based distribution with systemd (packaging requirement only)

Building and Installing

You can build a Debian package by running the following in the project root:

dpkg-buildpackage -us -uc -b

Notice that postgresql-metrics by default includes a systemd service that runs as a long running process, pushing metrics to FFWD as they are gathered. You need to stop the long running process after installing the package so you can configure it:

sudo systemctl stop postgresql-metrics

If you want to debug the process running as a systemd service, you can run sudo journalctl -u postgresql-metrics to see the service's log.

Edit Configuration

Edit the configuration in /etc/postgresql-metrics/postgresql-metrics.yml and in /etc/postgresql-metrics/default/postgresql-metrics.yml. Notice that the configuration in the default folder will be overwritten value by value by the configuration in the configuration root.
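
The overwriting behaves like a recursive, key-by-key dictionary merge. A minimal sketch of the idea (an illustration only, not the tool's actual implementation):

# Value-by-value configuration overwriting: values from the root
# configuration replace matching values in the default configuration,
# while unrelated default values are kept.
def merge_configs(default, override):
    merged = dict(default)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = value
    return merged

default_conf = {"postgres": {"host": "localhost", "port": 5432, "user": "metrics"}}
root_conf = {"postgres": {"port": 5433}}
print(merge_configs(default_conf, root_conf))
# -> {'postgres': {'host': 'localhost', 'port': 5433, 'user': 'metrics'}}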

If you are not interested in using the default configuration overwriting functionality, just delete one of the configurations mentioned above, and keep using a single configuration file.

Edit at least the values under the postgres section in the configuration to match your PostgreSQL cluster setup. Remember also to list the databases you want to gather metrics from. By database in this context we mean a database name you created within your PostgreSQL cluster.

Prepare Database

Before starting to extract the defined metrics, you need to set up your database cluster using the prepare-db CLI call. This will create the required extensions for your database, and a few functions that are used by the statistics gathering queries from the metrics process. The configured metrics user will also be granted access to the created statistics functions and views.

You need to provide an administrator user to the prepare-db call, which the tool is kind enough to ask for. You don't need to provide credentials if you are running prepare-db as a local user that is configured to be trusted locally by the PostgreSQL cluster (in pg_hba.conf), and is a superuser, like the default postgres user created by some distribution packages (e.g. Debian). You can do the prepare-db call e.g. as follows:

sudo su -c "postgresql-metrics prepare-db" postgres

It is safe to call prepare-db multiple times for the same database (the call is idempotent).
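
The idempotency comes from writing the preparation steps so that repeating them is harmless. A hypothetical sketch of the pattern, assuming psycopg2 (the extension and role names below are placeholders, not necessarily what prepare-db actually creates):

import psycopg2

# Hypothetical illustration of idempotent preparation statements:
# CREATE EXTENSION IF NOT EXISTS and repeated GRANTs can be run
# any number of times without error.
conn = psycopg2.connect(dbname="my_database_name", user="postgres")
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS pg_stat_statements")
    cur.execute("GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public"
                " TO postgresql_metrics_user")
conn.close()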

Grant Access for Metrics User

In addition to granting access to the statistics gathering functions and views within your PostgreSQL cluster (previous step), you also need to add access for the metrics user in the host based access file (pg_hba.conf).

Add one line per database you are monitoring to the end of the pg_hba.conf file for your cluster:

host my_database_name postgresql_metrics_user 127.0.0.1/32 md5  # metrics user access

Replace my_database_name and postgresql_metrics_user with the values you configured into the postgresql-metrics configuration in the Edit Configuration step above.

You need to reload (or restart) your server after editing pg_hba.conf for the changes to take effect.
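
One way to reload without a restart is calling pg_reload_conf() as a superuser, e.g.:

sudo su -c "psql -c 'SELECT pg_reload_conf();'" postgres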

Getting Metrics

After you have postgresql-metrics configured, and the database prepared, you can print out all the metrics that will be extracted from your database by calling:

postgresql-metrics all

You need to call the command above as a user that has access to the WAL directory under PostgreSQL, or the metrics gathering for WAL file amounts will fail. A single failed metric call will not prevent the rest of the gathering process.
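
That fault isolation works roughly like the following sketch (hypothetical names, not the tool's actual code): each gatherer function runs in its own try/except, so one failure is logged and skipped.

import logging

# Illustration of per-metric fault isolation: a failing metric
# function logs its error but does not stop the other functions.
def gather_all(metric_functions):
    results = []
    for func in metric_functions:
        try:
            results.extend(func())
        except Exception:
            logging.exception("metric function %s failed", func.__name__)
    return results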

You can also start the long running process again, if using systemd:

sudo systemctl start postgresql-metrics

Explaining the Gathered Metrics

This section explains the metrics we gather using this tool.

Notice that there are many system specific metrics that you should gather in addition to the Postgres specific metrics, for example:

  • CPU usage
  • Network usage, sent / received bytes per related network interface
  • Memory usage
  • Disk I/O operations
  • I/O await times
  • Disk usage and free space left

Database Specific Metrics

Called once per configured database inside your Postgres cluster.

  • get_stats_disk_usage_for_database: This metric shows the size of each of your databases in bytes. Don't forget to measure the total disk usage of the disk your data directory resides in as well.

  • get_stats_tx_rate_for_database: Shows the rate of transactions executed per second since the last call of this function. Also shows the rate of executed rollbacks separately.

  • get_stats_seconds_since_last_vacuum_per_table: This metric shows the amount of time passed in seconds since the last vacuum was run, per table in your database.

  • get_stats_oldest_transaction_timestamp: This metric shows how long the longest running transaction has been open in your database. This should usually be close to zero, but sometimes, for example when an administrator forgets to close a maintenance connection, you will see this value going up. Long running transactions are bad for your database, so fix the issue as soon as you see this metric increase.

  • get_stats_index_hit_rates: This metric shows the amount of reads hitting the table indexes versus the amount of reads requiring a sequential scan through the table (see the sketch after this list). Depending on your table, the amount of data, and the created indexes, the index hit rate varies. You should understand your data well enough to know when a high index usage is preferable to a low one.

  • get_stats_table_bloat: This metric shows the amount of wasted space in a database table due to the MVCC process. Deletes and updates to the table just mark the obsolete data as free, but do not really delete it. Vacuums free some of this wasted space, but to get totally rid of table bloat you must re-create the table with VACUUM FULL. The current implementation of the table bloat metric is rather heavy, so you might want to disable it in case you see issues with it.
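
As a rough illustration of the index hit rate idea mentioned above, the per-table index scan versus sequential scan counters live in the standard pg_stat_user_tables view. A simplified sketch assuming psycopg2 (the tool's own query may differ):

import psycopg2

# Index scans vs. sequential scans per table; index_hit_rate is NULL
# for tables that have not been scanned at all.
conn = psycopg2.connect(dbname="my_database_name", user="postgresql_metrics_user")
with conn.cursor() as cur:
    cur.execute(
        "SELECT relname, idx_scan, seq_scan,"
        "       idx_scan::float / NULLIF(idx_scan + seq_scan, 0) AS index_hit_rate"
        "  FROM pg_stat_user_tables"
    )
    for relname, idx_scan, seq_scan, index_hit_rate in cur.fetchall():
        print(relname, index_hit_rate)
conn.close()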

Database Cluster (Global) Metrics

Called once for your whole Postgres cluster.

  • get_stats_client_connections: This metric shows the amount of connections open to the database at the moment. The actual metrics gathering connections should be visible here as well. Notice that having more than a hundred connections open is usually a bad thing. Consider using a connection pooler, like pgbouncer.

  • get_stats_lock_statistics: This metric shows locks being waited upon by queries, and the amount of locks granted. In general, having any query waiting for locks for any extended period of time is a sign of problems, like heavy lock contention.

  • get_stats_heap_hit_statistics: This metric shows the amount of reads hitting the memory buffers on your cluster, and also the amount of reads hitting the disk (or disk caches on the operating system). The heap hit ratio is also calculated based on these values (see the sketch after this list).

    Notice that the read amounts are not actual read queries, but the amount of blocks read. You will get a good idea of the amount of reads hitting your database when comparing these values with the transaction rate.

  • get_stats_replication_delays: This metric shows the amount of bytes the replication delay is behind the master, for each slave. If the slave and the master are in a synchronous state, the replication delay is zero.

  • get_stats_wal_file_amount: This metric shows the amount of files in your database cluster's WAL directory (pg_wal or pg_xlog). If the WAL file amount starts to suddenly increase, you probably have issues with your WAL archiving process, which might lead to the disk filling up, and your database cluster crashing.

  • get_xid_remaining_ratio, get_multixact_remaining_ratio, get_multixact_members_remaining_ratio: These metrics show the corresponding remaining % of transaction ids ("xid"), multixact ids ("mxid"), and multixact members that are available for Postgres to use before exhaustion. Useful for ensuring that the vacuuming is working as intended on your Postgres instance.

  • get_multixact_members_per_mxid: This metric emits the number of multixact members there are per multixact ID. A larger number means that it'll be quicker for the multixact members exhaustion to happen (as can be seen in get_multixact_members_usage_ratio).
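
For the heap hit ratio mentioned above, the underlying counters are exposed by the standard pg_statio_user_tables view: heap_blks_hit counts blocks served from shared buffers and heap_blks_read counts blocks read from disk (or the OS cache). A simplified sketch assuming psycopg2 (the tool's own query may differ):

import psycopg2

# Heap hit ratio = buffer hits / (buffer hits + disk reads).
conn = psycopg2.connect(dbname="my_database_name", user="postgresql_metrics_user")
with conn.cursor() as cur:
    cur.execute(
        "SELECT sum(heap_blks_hit), sum(heap_blks_read)"
        "  FROM pg_statio_user_tables"
    )
    hits, reads = cur.fetchone()
    total = (hits or 0) + (reads or 0)
    ratio = float(hits or 0) / float(total) if total else 0.0
    print("heap hit ratio:", ratio)
conn.close()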

Short Overview of Python Modules

  • common.py: Contains code common for the whole package, like logging and configuration parsing.
  • default_metrics.py: Is responsible for turning statistics values into the default metrics JSON format.
  • localhost_postgres_stats.py: Functions for statistics extraction from the local Postgres data directory. These calls are automatically global from a database cluster perspective.
  • metrics_gatherer.py: Functions for calling the statistics extraction functions and converting the results into the correct metrics format. Called from metrics_logic.py.
  • metrics_logic.py: Contains the CLI tool, initialization, and the scheduling logic.
  • postgres_queries.py: Functions for statistics extraction from the Postgres database.
  • prepare_db.py: Code for preparing your databases for the metrics gathering process.

How to Add More Metrics

If you want to add more metrics into the postgresql-metrics tool, you can do it by making the following changes to the source:

  1. If you gather the metric using a Postgres SQL query, add the code into postgres_queries.py, and if you gather the metric by accessing the local Postgres data directory, add the code into localhost_postgres_stats.py.

  2. Write a function for formatting your gathered metric values into the wanted format, as is done in default_metrics.py. You can either expand the default metrics, or write your own format into another module.

  3. Write a function into metrics_gatherer.py, which will call the metric extraction functionality you wrote in the first step, and then the metric value formatting function you wrote in the previous step (see the sketch after this list).

  4. Add the name of your metrics gatherer function, written in the previous step, into the postgresql-metrics configuration file, with the wanted time interval for calling the metric gathering function. Notice that you need to add the function into the correct list of functions depending on whether you gather a metric that covers your whole database cluster, or a metric that targets a single database in your cluster. Data directory based metrics must be 'global' metrics.

  5. Update this README with an explanation of what your new metric is about.
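
To make the shape of steps 1-3 concrete, a hypothetical gatherer could look like the following. Every name here is illustrative (format_my_metric, get_stats_my_new_metric, and the metric dict layout are assumptions); follow the real functions in postgres_queries.py, default_metrics.py, and metrics_gatherer.py when writing yours.

import time

def format_my_metric(value, database_name):
    # Step 2: turn a raw value into a metrics dict. The exact attribute
    # layout is assumed here; mirror what default_metrics.py produces.
    return {"key": "postgresql",
            "attributes": {"database": database_name, "what": "my_new_metric"},
            "value": value,
            "timestamp": int(time.time())}

def get_stats_my_new_metric(cursor, database_name):
    # Steps 1 and 3: run the extraction query (this part would live in
    # postgres_queries.py) and format the result for sending.
    cursor.execute("SELECT count(*) FROM pg_stat_activity")
    (value,) = cursor.fetchone()
    return [format_my_metric(value, database_name)]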
