Monitor data sources and track changes over time 🐿️
niqodea/meerkat
Python library for monitoring data sources and tracking changes over time. Just as meerkats in nature keep vigilant watch over their surroundings, this library helps you maintain awareness of changes in your data sources.
Using `pip`:
pip install git+https://github.com/niqodea/meerkat.git@v0.2.0
`Thing` is the base dataclass that you can extend to represent items you want to monitor. For example, if you're monitoring job postings, you might create a `Job` class that extends `Thing` with fields like `title`, `location`, and `url`. Each `Thing` must have a unique identifier that allows the system to track it over time.
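As a minimal sketch of the `Job` example above: the `Thing` class below is a local stand-in, not the library's actual definition, and keeping the unique identifier as the dictionary key is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    """Local stand-in for the library's base class; the real one may differ."""


@dataclass(frozen=True)
class Job(Thing):
    title: str
    location: str
    url: str


# Assumption: each item is keyed by its unique identifier.
jobs = {
    "ELF-123": Job(
        title="Emergency Present Inspector",
        location="Jingletown",
        url="https://example.com/jobs/ELF-123",  # placeholder URL
    ),
}

# Dataclasses compare field by field, so a changed field makes two items unequal.
promoted = Job(
    title="Chief Present Inspector",
    location="Jingletown",
    url="https://example.com/jobs/ELF-123",
)
has_changed = jobs["ELF-123"] != promoted
```

Field-by-field equality is what makes a dataclass convenient here: detecting an update reduces to comparing the old and new item.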
A data source is any collection of `Thing` objects you want to monitor. The library is flexible and can work with any data source that can be represented as a dictionary of these uniquely identified items.
A meerkat (`Meerkat`) is the main orchestrator that monitors a data source for changes. It periodically:
- Fetches data from the data source
- Tracks changes by comparing against the previous state
- Executes actions in response to detected changes
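The three steps above can be sketched as a small `asyncio` loop. This is only an illustration of the cycle, not the library's `Meerkat` class: `run_meerkat`, its parameters, and the `max_cycles` limit are hypothetical names chosen for this sketch.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Items = Dict[str, str]    # identifier -> item (strings stand in for Thing objects)
Changes = Dict[str, str]  # identifier -> description of what changed


async def run_meerkat(
    fetch: Callable[[], Awaitable[Items]],
    track: Callable[[Items], Changes],
    act: Callable[[Changes], Awaitable[None]],
    interval_seconds: float,
    max_cycles: int,
) -> None:
    """Repeat the fetch -> track -> act cycle a fixed number of times."""
    for _ in range(max_cycles):
        items = await fetch()    # 1. fetch data from the data source
        changes = track(items)   # 2. compare against the previous state
        if changes:
            await act(changes)   # 3. react only when something changed
        await asyncio.sleep(interval_seconds)
```

The `max_cycles` argument exists only to keep the sketch finite; a long-running monitor would presumably loop until it is shut down.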
A fetcher (`Fetcher`) is responsible for:
- Fetching data from the data source
- Converting the data into a dictionary of `Thing` objects (items that can be monitored)
- Communicating errors that occur during fetching
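The responsibilities above might look like the following sketch. Everything here is hypothetical: `ListFetcher`, `FetchError`, the `fetch` method name, and the choice to return errors as values rather than raise them are assumptions, and the library's actual `Fetcher` interface may differ.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Job:
    title: str
    location: str


@dataclass(frozen=True)
class FetchError:
    """Hypothetical error value describing why a fetch failed."""
    message: str


class ListFetcher:
    """Hypothetical fetcher over an in-memory list of raw records."""

    def __init__(self, records: List[dict]) -> None:
        self._records = records

    async def fetch(self) -> Union[Dict[str, Job], FetchError]:
        try:
            # Key each converted item by its unique identifier.
            return {
                r["id"]: Job(title=r["title"], location=r["location"])
                for r in self._records
            }
        except KeyError as exc:
            # Report the problem as a value so the caller can display it.
            return FetchError(f"record is missing field {exc}")
```

A real fetcher would replace the in-memory list with an HTTP request or a database query; the conversion and error-reporting steps stay the same.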
A snapshot manager (`SnapshotManager`) is responsible for:
- Tracking the current state of monitored items
- Detecting changes by comparing against the previous state
- Computing operations (Create, Update, Delete) based on the differences
The standard implementation (`JsonSnapshotManager`) stores snapshots as JSON files on disk, making it easy to inspect the state with a text editor.
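To make the snapshot idea concrete, here is a minimal sketch of diffing against a JSON file on disk. It is not `JsonSnapshotManager` itself: the `Create`/`Update`/`Delete` classes, the `diff` and `track` functions, and the file layout are assumptions, and items are plain dictionaries so they serialize directly. In this sketch a missing snapshot file is treated as an empty previous state.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Union


@dataclass(frozen=True)
class Create:
    after: dict


@dataclass(frozen=True)
class Update:
    before: dict
    after: dict


@dataclass(frozen=True)
class Delete:
    before: dict


Operation = Union[Create, Update, Delete]


def diff(previous: Dict[str, dict], current: Dict[str, dict]) -> Dict[str, Operation]:
    """Compute one operation per identifier whose state differs."""
    operations: Dict[str, Operation] = {}
    for key, item in current.items():
        if key not in previous:
            operations[key] = Create(after=item)
        elif previous[key] != item:
            operations[key] = Update(before=previous[key], after=item)
    for key, item in previous.items():
        if key not in current:
            operations[key] = Delete(before=item)
    return operations


def track(snapshot_file: Path, current: Dict[str, dict]) -> Dict[str, Operation]:
    """Diff against the stored snapshot, then overwrite it with the new state."""
    previous = json.loads(snapshot_file.read_text()) if snapshot_file.exists() else {}
    operations = diff(previous, current)
    # Indented, sorted JSON keeps the file readable in a text editor.
    snapshot_file.write_text(json.dumps(current, indent=2, sort_keys=True))
    return operations
```

Unchanged items produce no operation at all, so an empty result means nothing happened since the last snapshot.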
An action executor (`ActionExecutor`) defines what happens when changes are detected. It receives a dictionary of operations (Create/Update/Delete) and can perform any desired actions in response.
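As a rough illustration, an executor could dispatch on the operation type and record one line per change. The `CollectingExecutor` class, its `execute` method, and the operation classes below are hypothetical stand-ins; the library's `ActionExecutor` interface may look different.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Create:
    after: str


@dataclass(frozen=True)
class Update:
    before: str
    after: str


@dataclass(frozen=True)
class Delete:
    before: str


Operation = Union[Create, Update, Delete]


class CollectingExecutor:
    """Hypothetical executor that records one line per detected change."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def execute(self, operations: Dict[str, Operation]) -> None:
        # Sort by identifier so the report order is stable between runs.
        for key, op in sorted(operations.items()):
            if isinstance(op, Create):
                self.lines.append(f"created {key}: {op.after}")
            elif isinstance(op, Delete):
                self.lines.append(f"deleted {key}: {op.before}")
            else:
                self.lines.append(f"updated {key}: {op.before} -> {op.after}")
```

Swapping the `append` calls for an email, a webhook request, or a desktop notification would give a different action with the same structure.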
The CLI module provides a convenient way to deploy meerkats that report changes to the terminal. One of its main advantages is simplicity, as you only need to implement the following to get started:
- A fetcher to get your data
- A stringifier function to convert your items to human-readable text
The module automatically handles everything else with sensible default implementations.
    import asyncio
    from pathlib import Path

    from meerkat.cli import CliDeployer


    async def main():
        specs = {
            "data-source-name": CliDeployer.MeerkatSpec(
                fetcher=YourFetcher(),  # Placeholder for your own fetcher
                stringifier=lambda x: str(x),  # How to convert things to strings
                snapshot_path=Path("./snapshots"),  # Where to store state
                interval_seconds=60,  # How often to check for changes
            ),
            # Add meerkat specs for other data sources here
            # Each meerkat will operate independently and report to the same terminal
        }
        deployer = await CliDeployer.create(specs)
        await deployer.run()


    asyncio.run(main())
When `CliDeployer` is running, you'll see changes printed to the terminal like this:
    Changes for north-pole-workshop [1999-12-24 12:00:00]
    Created:
    * ELF-123 Title: 'Emergency Present Inspector', Location: 'Jingletown'
    * ELF-124 Title: 'Last-Minute Gift Wrapper', Location: 'Peppermint Port'
    Deleted:
    * ELF-111 Title: 'Coal Distribution Manager', Location: 'Kringle Quarry'
    Updated:
    * ELF-100
      from: Title: 'Junior Reindeer Trainer', Location: 'Snowdrift Haven'
      to: Title: 'Senior Reindeer Trainer', Location: 'Snowdrift Haven'

    Error for y2k-industries [2000-01-01 00:00:00]
    Server responded with HTTP 500: "Date overflow - expected 19XX, got 19100"
Note that the actual output will also be colored, making it easier to read.
- `CTRL+L`: Clear the screen (useful after reading changes)
- `CTRL+D`: Graceful shutdown
This repository also includes an example project that demonstrates how to use Meerkat to monitor job postings from various sources. You can find it in the `job-monitor` directory; keep in mind that you'll need to implement the job fetchers yourself!
Licensed under the MIT License. Check the LICENSE file for details.