kuleuven/mango-ingest


ManGO ingest is a lightweight tool that monitors a local directory for file changes and ingests (part of) them into iRODS. There is no need for cron jobs, as it is based on Python watchdog, which starts its own threads for continuous operation.

The main purpose is to be an easy entry point for ingesting files into iRODS, from where a ManGO Flow task may pick them up and handle further processing.

Current state of supported platforms: beta software

Initial development focuses on Linux, but Windows and Mac OS are also target platforms. It may or may not work for you (yet); please use the issue tracker to report your findings, use cases and more.

Installation

Recommended

  • check out this repository and cd into it
  • run the following commands
$ python -m venv venv
$ . venv/bin/activate
$ pip install --editable src

Afterwards, verify that the executable mango_ingest is available in your PATH:

$ mango_ingest --help

Quick checkout

Just check out the repository and copy the script mango_ingest.py to wherever you want to execute it.

Authentication

Authentication is done by creating an iRODSSession from a configuration file: either the one specified by the environment variable IRODS_ENVIRONMENT_FILE or, as a fallback, the current user's ~/.irods/irods_environment.json.
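The lookup order described above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the helper names resolve_irods_env_file and open_session are hypothetical, and the session is created with the irods_env_file keyword of python-irodsclient's iRODSSession.

```python
import os
from pathlib import Path


def resolve_irods_env_file(environ=None):
    """Return the iRODS environment file to use (hypothetical helper).

    IRODS_ENVIRONMENT_FILE wins when set; otherwise fall back to the
    current user's ~/.irods/irods_environment.json.
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("IRODS_ENVIRONMENT_FILE")
    if explicit:
        return Path(explicit).expanduser()
    return Path.home() / ".irods" / "irods_environment.json"


def open_session(environ=None):
    """Open an iRODS session from the resolved environment file."""
    # Imported lazily so the path logic above works without the
    # python-irodsclient package being installed.
    from irods.session import iRODSSession

    env_file = resolve_irods_env_file(environ)
    return iRODSSession(irods_env_file=str(env_file))
```

Keeping the path resolution separate from the session creation makes it easy to check which file would be used before any connection is attempted.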

Usage

mango_ingest [OPTIONS] [COMMAND [OPTIONS] [ARGS]]

When it detects that a new file has been created, the file is checked against a whitelist (glob patterns and/or a list of regular expressions); if any of those match, it is uploaded to the specified path in iRODS/ManGO.

Ignore glob patterns (--ignore-glob) and ignore regular expressions (--ignore) are evaluated before any --glob and/or --regex.
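The evaluation order can be illustrated with a small Python sketch. The function should_ingest is hypothetical, and two details are assumptions rather than documented behaviour: glob patterns are matched against the file name only, and regular expressions are searched anywhere in the full path.

```python
import fnmatch
import re
from pathlib import Path


def should_ingest(path, globs=(), regexes=(), ignore_globs=(), ignores=()):
    """Decide whether a path passes the filters (hypothetical helper)."""
    name = Path(path).name
    text = str(path)

    # Ignore rules are evaluated first: one hit rejects the file.
    if any(fnmatch.fnmatchcase(name, pattern) for pattern in ignore_globs):
        return False
    if any(re.search(pattern, text) for pattern in ignores):
        return False

    # Then the whitelist: a single matching glob or regex is enough.
    if any(fnmatch.fnmatchcase(name, pattern) for pattern in globs):
        return True
    return any(re.search(pattern, text) for pattern in regexes)
```

With this ordering, a temporary file such as run1.csv.tmp is rejected by an ignore glob of *.tmp even when a whitelist glob like *.csv* would otherwise accept it.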

CUSTOM FILTERS

Custom filters can also be specified with --custom-filter, provided they can be resolved with a dynamic import. The parameter is a string naming the module and function in the form <module>.<function>. That function takes the pathlib.Path of the file to validate as its first positional parameter, followed by an optional set of kwargs. See also the option --filter-kwargs, which accepts a dict/JSON string. (The option summary below lists these two options as --filter-func and --filter-func-kwargs.)
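A custom filter matching that description could look like the sketch below. The filter min_size, its min_bytes keyword and the loader load_callable are all hypothetical; the loader only illustrates how a <module>.<function> string can be resolved with importlib and is not taken from the tool's source.

```python
import importlib
import json
from pathlib import Path


def min_size(path: Path, **kwargs) -> bool:
    """Hypothetical filter: accept regular files of at least `min_bytes` bytes."""
    min_bytes = int(kwargs.get("min_bytes", 1))
    return path.is_file() and path.stat().st_size >= min_bytes


def load_callable(dotted_name: str):
    """Resolve a '<module>.<function>' string with a dynamic import."""
    module_name, _, func_name = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected '<module>.<function>', got {dotted_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# The kwargs arrive as a JSON string and are parsed into a dict
# before being passed to the filter after the path.
filter_kwargs = json.loads('{"min_bytes": 1024}')
```

If min_size were saved in a module my_filters.py that is importable from where the tool runs, it would be referenced as my_filters.min_size, with the JSON string above supplied as the filter kwargs.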

METADATA

In addition, there are a number of ways to add metadata on the fly. A few builtin functions cover the obvious cases, such as metadata that is included in the path (--metadata-path, or shorter --md-path) and file system properties such as modified time (--metadata-mtime) and symlink information.
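To make the idea of path-based metadata concrete, here is a small Python sketch. It assumes that named groups in the regular expression become metadata keys and values; the README does not spell out the exact convention, so treat both that assumption and the helper metadata_from_path as illustrative only.

```python
import re


def metadata_from_path(path, patterns):
    """Collect named regex groups from a path as metadata (hypothetical helper)."""
    metadata = {}
    for pattern in patterns:
        match = re.search(pattern, path)
        if match:
            # Keep only groups that actually took part in the match.
            metadata.update(
                {key: value for key, value in match.groupdict().items() if value is not None}
            )
    return metadata


example = metadata_from_path(
    "/data/projectA/2024-05-01/scan.tif",
    [r"/(?P<project>[^/]+)/(?P<date>\d{4}-\d{2}-\d{2})/"],
)
# example == {"project": "projectA", "date": "2024-05-01"}
```

The check-regex sub command listed under Commands can be used to test a regular expression against a filename.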

You can also add your own handler in much the same way as custom filters; see --help and the --metadata-handler option. An example is included in doc/examples/extract_metadata.py, which relies on the exiftool executable and the corresponding Python module.

ENVIRONMENT VARIABLES

All parameters can also be set via environment variables, using their long name, uppercased and prefixed with MANGO_. For example

  `export MANGO_DESTINATION="/zone/home/project/ingest" `

is the same as specifying the command line option

  `mango_ingest --destination="/zone/home/project/ingest" `
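The naming rule can be written down as a one-line Python helper. The function env_var_name is hypothetical, and replacing dashes with underscores for multi-word options such as --polling-interval is an assumption; the README only shows the single-word --destination case.

```python
def env_var_name(long_option: str, prefix: str = "MANGO_") -> str:
    """Map a long option name to its environment variable name.

    Assumption: dashes inside the option name become underscores.
    """
    return prefix + long_option.lstrip("-").replace("-", "_").upper()


destination_var = env_var_name("--destination")
# destination_var == "MANGO_DESTINATION"
```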

CONFIGURATION FILE

Besides command line options and environment variables, you can also specify a YAML formatted configuration file through the environment variable MANGO_INGEST_CONFIG. It can hold all or a subset of the command line options. It acts as the "default" setting for each option; a value specified by a command line option or environment variable takes precedence.

The builtin sub command generate-config will create such a YAML formatted config file for you.
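The precedence between the three sources can be sketched as a small lookup function. This is an illustration under stated assumptions, not the tool's implementation: resolve_option is a hypothetical helper, the config is represented as an already parsed dict keyed by option name, and the command line is assumed to win over the environment (the README only states that both override the configuration file).

```python
def resolve_option(name, cli_args, environ, config, default=None):
    """Return the effective value of an option (hypothetical helper).

    Assumed order: command line, then MANGO_* environment variable,
    then the configuration file, then the builtin default.
    """
    if cli_args.get(name) is not None:
        return cli_args[name]
    env_key = "MANGO_" + name.upper().replace("-", "_")
    if env_key in environ:
        # Environment values are strings; type conversion is left out here.
        return environ[env_key]
    return config.get(name, default)


effective = resolve_option(
    "destination",
    cli_args={},
    environ={"MANGO_DESTINATION": "/zone/home/project/ingest"},
    config={"destination": "/zone/home/project/default"},
)
# effective == "/zone/home/project/ingest"
```

The show sub command listed under Commands reports the parameters and values as they would actually be used.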

Options:
  -v, --verbose                   Show runtime messages  [default: 0]
  -r, --recursive                 Also watch sub directories
  -p, --path TEXT                 The (local) path to monitor  [default: .]
  -d, --destination TEXT          iRODS destination collection path
  --observer [native|polling]     The observer system to use for getting
                                  changed paths. Defaults to 'polling' which
                                  is recommended for most use cases, but you
                                  can also use 'native' for linux/mac
                                  filesystems when watching for new files that
                                  are directly written into the directory.
                                  Polling is a rather brute force algorithm,
                                  needed for network mounted drives and
                                  windows for example  [default: polling]
  --polling-interval INTEGER      Polling interval in seconds in case the
                                  observer is specified as 'polling'
                                  [default: 5]
  --regex TEXT                    regular expression to match [multiple]
  --glob TEXT                     glob expression to match as a simpler
                                  alternative to --regex [multiple]
  --filter-func TEXT              use an external filter (along regex/glob
                                  patterns), it will be dynamically imported
  --filter-func-kwargs TEXT       A json string that will be parsed as a dict
                                  and injected as kwargs into the filter after
                                  the path
  --ignore TEXT                   regular expression to ignore certain
                                  files/folders [multiple]
  --ignore-glob TEXT              glob patterns to ignore files / folders
                                  [multiple]
  --sync                          Do an initial sync
  --verify-checksum               Verify checksums
  --restart PATH                  Use restart file to retry failed uploads
                                  from a previous run
  --dry-run                       Dry run: do not upload anything, implies
                                  --verbose
  -nw, --no-watch                 Do not start monitoring for future changes,
                                  implies --sync
  --metadata-path, --md-path TEXT
                                  regular expression to extract metadata from
                                  the path [multiple]
  --metadata-mtime, --md-mtime    Add the original modify time as metadata
  --metadata-handler, --md-handler TEXT
                                  a custom PYTHONPATH accessible
                                  module.function to handle metadata
  --metadata-handler-kwargs, --md-handler-kwargs TEXT
                                  kwargs parameters for the metadata-handler
                                  as a json string
  --help                          Show this message and exit.

Commands:
  check-regex      Utility to test a regular expression against a filename...
  clean            Clean up older (default) or all (-a) result files
  examples         Examples
  generate-config  Generate a YAML config template
  show             Show parameter and values as would be used given the...
