A command line interface and Python module for accessing the CKAN Action API
Installation with pip:
pip install ckanapi
Installation with conda:
conda install -c conda-forge ckanapi
The ckanapi command line interface lets you access local and remote CKAN instances for bulk operations and simple API actions.
Simple actions with string parameters may be called directly. The response is pretty-printed to STDOUT.
$ ckanapi action group_list -r https://demo.ckan.org --insecure
[
  "data-explorer",
  "example-group",
  "geo-examples",
  ...
]
Use -r to specify the remote CKAN instance, and -a to provide an API key. Remote actions connect as an anonymous user by default. For this example, we use --insecure as the CKAN demo uses a self-signed certificate.
Local CKAN actions may be run by specifying the config file with -c. If no remote server or config file is specified the CLI will look for a development.ini file in the current directory, much like paster commands.
Local CKAN actions are performed by the site user (default system administrator) when -u is not specified.
To perform local actions with a less privileged user use the -u option with a user name or a name that doesn't exist. This is useful if you don't want things like deleted datasets or private information to be returned.
Note that all actions in the CKAN Action API and actions added by CKAN plugins are supported.
Simple action arguments may be passed in KEY=STRING form for string values or in KEY:JSON form for JSON values.
$ ckanapi action package_show id=my-dataset-name
{
  "name": "my-dataset-name",
  ...
}
$ ckanapi action datastore_info id=my-resource-id-or-alias
{
  "meta": {
    "aliases": [
      "test_alias"
    ],
    "count": 1000,
  ...
}
$ ckanapi action package_search facet.field:'["organization"]' rows:0
{
  "facets": {
    "organization": {
      "org1": 42,
      "org2": 21,
      ...
    }
  },
  ...
}
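To make the two argument forms concrete, here is a minimal Python sketch of how KEY=STRING and KEY:JSON arguments could be turned into an action's parameter dictionary. It is an illustration only, not ckanapi's actual parser. The function name parse_action_args is hypothetical, as is the rule of splitting at whichever separator comes first.

```python
import json
import re

# Split each argument at the first "=" or ":" separator, whichever comes
# first (an assumption made for this sketch).
_ARG = re.compile(r'^([^=:]+)([=:])(.*)$', re.DOTALL)

def parse_action_args(args):
    """Build an action parameter dict from KEY=STRING and KEY:JSON arguments."""
    params = {}
    for arg in args:
        match = _ARG.match(arg)
        if match is None:
            raise ValueError('expected KEY=STRING or KEY:JSON, got %r' % arg)
        key, sep, value = match.groups()
        # "=" keeps the value as a plain string, ":" decodes it as JSON
        params[key] = value if sep == '=' else json.loads(value)
    return params

params = parse_action_args(['q=ponies', 'facet.field:["organization"]', 'rows:0'])
print(params)
```

Note that rows:0 yields the integer 0, while rows=0 would yield the string "0". That difference is why the JSON form is used for lists and numbers.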
Files may be passed for upload using the KEY@FILE form.
$ ckanapi action resource_create package_id=my-dataset-with-files \
    upload@/path/to/file/to/upload.csv
$ ckanapi action package_show id=my-dataset-id > my-dataset.json
$ nano my-dataset.json
$ ckanapi action package_update -I my-dataset.json
$ rm my-dataset.json
$ ckanapi action resource_patch id=my-resource-id size:42000000
Datasets, groups, organizations, users and related items may be dumped to JSON lines text files and created or updated from JSON lines text files.
dump and load jobs can be run in parallel with multiple worker processes using the -p parameter. The jobs in progress, the rate of job completion and any individual errors are shown on STDERR while the jobs run.
There are no parallel limits when running against a CKAN on localhost. When running against a remote site, there's a default limit of 3 worker processes.
The environment variables CKANAPI_MY_SITES and CKANAPI_PARALLEL_LIMIT can be used to adjust these limits. Sites listed in CKANAPI_MY_SITES (a comma-delimited list of CKAN URLs) will not have the PARALLEL_LIMIT applied.
dump and load jobs may be resumed from the last completed record or split across multiple servers by specifying record start and max values.
$ ckanapi dump datasets --all -O datasets.jsonl.gz -z -p 4 -r http://localhost
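A file such as datasets.jsonl.gz can be post-processed from Python, assuming the -z option produces ordinary gzip compression of a JSON lines file (one JSON object per line). As a rough sketch, the example below first writes a tiny stand-in file and then reads it back one record at a time. The records and the file name sample-datasets.jsonl.gz are made up for illustration, and iter_records is a hypothetical helper.

```python
import gzip
import json
import os
import tempfile

def iter_records(path):
    """Yield one decoded JSON object per non-empty line of a gzipped JSON lines file."""
    with gzip.open(path, 'rt', encoding='utf-8') as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

# Stand-in data: real dumps contain full dataset dictionaries.
sample = [
    {'name': 'dataset-1', 'title': 'First'},
    {'name': 'dataset-2', 'title': 'Second'},
]
path = os.path.join(tempfile.mkdtemp(), 'sample-datasets.jsonl.gz')
with gzip.open(path, 'wt', encoding='utf-8') as f:
    for record in sample:
        f.write(json.dumps(record) + '\n')

names = [record['name'] for record in iter_records(path)]
print(names)
```

Reading line by line keeps memory use flat even for large dumps, because only one record is decoded at a time.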
$ ckanapi search datasets include_private=true -O datasets.jsonl.gz -z \
    -c /etc/ckan/production.ini
search is faster than dump because it calls package_search to retrieve many records per call, paginating automatically.
You may add parameters supported by package_search to filter the records returned.
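The same kind of pagination can be approximated from Python with the start and rows parameters of package_search. This is a sketch of the idea rather than the CLI's implementation. iter_datasets and the FakeCKAN stand-in (used so the example runs without a CKAN site) are hypothetical names. Any object with a call_action method, such as RemoteCKAN, could be passed instead.

```python
def iter_datasets(ckan, page_size=100, **search_params):
    """Yield every dataset matching search_params, one page at a time."""
    start = 0
    while True:
        params = dict(search_params, start=start, rows=page_size)
        page = ckan.call_action('package_search', params)
        results = page['results']
        for dataset in results:
            yield dataset
        start += len(results)
        # Stop when a page comes back empty or everything has been seen
        if not results or start >= page['count']:
            break

class FakeCKAN(object):
    """Stand-in for a CKAN client that serves package_search from a list."""
    def __init__(self, datasets):
        self.datasets = datasets
        self.calls = 0

    def call_action(self, action, data_dict):
        assert action == 'package_search'
        self.calls += 1
        start, rows = data_dict['start'], data_dict['rows']
        return {'count': len(self.datasets),
                'results': self.datasets[start:start + rows]}

fake = FakeCKAN([{'name': 'dataset-%d' % i} for i in range(5)])
names = [d['name'] for d in iter_datasets(fake, page_size=2)]
print(names, fake.calls)
```

Five records with a page size of two take three calls here, where fetching them one by one would take a call per record.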
$ ckanapi load datasets -I datasets.jsonl.gz -z -p 3 -c /etc/ckan/production.ini
Datasets, groups, organizations, users and related items may be deleted in bulk with the delete command. This command accepts ids or names on the command line or a number of different formats piped on standard input.
$ ckanapi action package_list -j | ckanapi delete datasets
$ ckanapi action package_search q=ponies | ckanapi delete datasets
$ ckanapi dump groups --all > groups.jsonl
$ grep ponies groups.jsonl | ckanapi delete groups
$ cat users_to_remove.txt
fred
bill
larry
$ ckanapi delete users < users_to_remove.txt
Datasets may be exported to a simplified datapackage.json format (which includes the actual resources, where available).
If the resource url is not available, the resource will be included in the datapackage.json file but the actual resource data will not be downloaded.
$ ckanapi dump datasets --all --datapackages=./output_directory/ -r http://sourceckan.example.com
Run a set of actions from a JSON lines file. For local actions this is much faster than running ckanapi action ... in a shell loop because the local start-up time only happens once.
Batch actions can also be run in parallel with multiple processes and errors logged, just like the dump and load commands.
$ cat update-emails.jsonl
{"action":"package_patch","data":{"id":"dataset-1","maintainer_email":"new@example.com"}}
{"action":"package_patch","data":{"id":"dataset-2","maintainer_email":"new@example.com"}}
{"action":"package_patch","data":{"id":"dataset-3","maintainer_email":"new@example.com"}}
$ ckanapi batch -I update-emails.jsonl
$ cat upload-files.jsonl
{"action":"resource_patch","data":{"id":"408e1b1d-d0ca-50ca-9ae6-aedcee37aaa9"},"files":{"upload":"data1.csv"}}
{"action":"resource_patch","data":{"id":"c1eab17f-c2d0-536d-a3f6-41a3dfe6a2c3"},"files":{"upload":"data2.csv"}}
{"action":"resource_patch","data":{"id":"8ed068c2-4d4c-5f20-90db-39d2d596ce1a"},"files":{"upload":"data3.csv"}}
$ ckanapi batch -I upload-files.jsonl --local-files
The "files" values in the JSON lines file are ignored unless the --local-files parameter is passed. Paths in the JSON lines file reference files on the local filesystem relative to the current working directory.
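A batch file like update-emails.jsonl does not have to be written by hand. As a minimal sketch, the following Python writes one package_patch action per dataset id in the same shape as the example file. The helper name write_batch_file and the temporary output path are illustrative choices.

```python
import json
import os
import tempfile

def write_batch_file(path, dataset_ids, maintainer_email):
    """Write one package_patch action per line, in the shape shown for ckanapi batch."""
    with open(path, 'w') as f:
        for dataset_id in dataset_ids:
            line = {
                'action': 'package_patch',
                'data': {'id': dataset_id, 'maintainer_email': maintainer_email},
            }
            f.write(json.dumps(line) + '\n')

path = os.path.join(tempfile.mkdtemp(), 'update-emails.jsonl')
write_batch_file(path, ['dataset-1', 'dataset-2', 'dataset-3'], 'new@example.com')

# Read the file back to check that every line is a complete JSON object
with open(path) as f:
    lines = [json.loads(line) for line in f]
print(len(lines), lines[0])
```

The resulting file could then be passed to ckanapi batch with -I, as in the example above.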
Simple shell pipelines are possible with the CLI.
$ ckanapi action package_show id=my-dataset \
    | jq '.+{"title":.name}' \
    | ckanapi action package_update -i
$ ckanapi dump datasets --all -q -r http://sourceckan.example.com \
    | ckanapi load datasets
The ckanapi Python module may be used from within a CKAN extension or in a Python 2 or Python 3 application separate from CKAN.
Making a request:
from ckanapi import RemoteCKAN
ua = 'ckanapiexample/1.0 (+http://example.com/my/website)'
demo = RemoteCKAN('https://demo.ckan.org', user_agent=ua)
groups = demo.action.group_list(id='data-explorer')
print(groups)
result:
[u'data-explorer', u'example-group', u'geo-examples', u'skeenawild']
The example above is using an "action shortcut". The .action object detects the method name used ("group_list" above) and converts it to a normal call_action call. This is equivalent code without using an action shortcut:
groups = demo.call_action('group_list', {'id': 'data-explorer'})
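The shortcut mechanism can be pictured with Python's __getattr__ hook. The toy classes below are not ckanapi's implementation, only a sketch of the technique: attribute access on .action returns a function that forwards the attribute name and keyword arguments to call_action. ToyCKAN and ActionShortcut are hypothetical names, and this call_action simply echoes its arguments instead of contacting a server.

```python
class ActionShortcut(object):
    """Turn attribute access into call_action calls."""
    def __init__(self, ckan):
        self._ckan = ckan

    def __getattr__(self, name):
        # Called only for attributes that are not found normally,
        # so any action name works without being declared in advance.
        def action(**kwargs):
            return self._ckan.call_action(name, kwargs)
        return action

class ToyCKAN(object):
    def __init__(self):
        self.action = ActionShortcut(self)

    def call_action(self, action, data_dict=None):
        # A real client would send the request; this one echoes it back.
        return (action, data_dict)

toy = ToyCKAN()
shortcut = toy.action.group_list(id='data-explorer')
direct = toy.call_action('group_list', {'id': 'data-explorer'})
print(shortcut == direct)
```

Because the action name is resolved at call time, a design like this needs no list of known actions, which fits with plugin-provided actions being usable through shortcuts.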
Once again, all actions in the CKAN Action API and actions added by CKAN plugins are supported by action shortcuts and call_action calls.
For example, if the Showcase extension is installed:
from ckanapi import RemoteCKAN
ua = 'ckanapiexample/1.0 (+http://example.com/my/website)'
demo = RemoteCKAN('https://demo.ckan.org', user_agent=ua)
showcases = demo.action.ckanext_showcase_list()
print(showcases)
Query clauses can be combined, as in the following package_search action. This query combines three clauses that are all satisfied by the single example dataset in the Demo CKAN site.
More detailed examples of complex query syntax can be found in the SOLR documentation.
from ckanapi import RemoteCKAN
ua = 'ckanapiexample/1.0 (+http://example.com/my/website)'
demo = RemoteCKAN('https://demo.ckan.org', user_agent=ua)
packages = demo.action.package_search(q='+organization:sample-organization +res_format:GeoJSON +tags:geojson')
print(packages)
Many CKAN API functions can only be used by authenticated users. Use the apikey parameter to supply your CKAN API key to RemoteCKAN:
demo = RemoteCKAN('https://demo.ckan.org', apikey='MY-SECRET-API-KEY')
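To avoid hard-coding the key in scripts, it can be read from the environment instead. This is a minimal sketch: the variable name CKAN_API_KEY and the helper client_kwargs are assumptions for illustration, not something ckanapi reads on its own.

```python
import os

def client_kwargs(environ=os.environ):
    """Build keyword arguments for RemoteCKAN from environment variables."""
    kwargs = {'user_agent': 'ckanapiexample/1.0 (+http://example.com/my/website)'}
    api_key = environ.get('CKAN_API_KEY')  # hypothetical variable name
    if api_key:
        kwargs['apikey'] = api_key  # left out entirely for anonymous access
    return kwargs

print(client_kwargs({'CKAN_API_KEY': 'MY-SECRET-API-KEY'}))
print(client_kwargs({}))

# With ckanapi installed the result can be passed straight through:
# from ckanapi import RemoteCKAN
# demo = RemoteCKAN('https://demo.ckan.org', **client_kwargs())
```

When the variable is unset the client is created without a key and connects as an anonymous user.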
An example of updating a single field in an existing dataset can be seen in the Examples directory.
NotAuthorized - user unauthorized or accessing a deleted item
NotFound - name/id not found
ValidationError - field errors listed in .error_dict
SearchQueryError - error reported from SOLR index
SearchError
CKANAPIError - incorrect use of ckanapi or unable to parse response
ServerIncompatibleError - the remote API is not a CKAN API
When using an action shortcut or the call_action method, failures are raised as exceptions just like when calling get_action from a CKAN plugin:
from ckanapi import RemoteCKAN, NotAuthorized
ua = 'ckanapiexample/1.0 (+http://example.com/my/website)'
demo = RemoteCKAN('https://demo.ckan.org', apikey='phony-key', user_agent=ua)
try:
    pkg = demo.action.package_create(name='my-dataset', title='not going to work')
except NotAuthorized:
    print('denied')
When it is possible to import ckan all the ckanapi exception classes are replaced with the CKAN exceptions with the same names.
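As a sketch of handling more than one of these exceptions, the helpers below return None for a missing dataset and report field errors from .error_dict. The helper names, the FakeCKAN stand-in and the fallback exception classes (defined only so the example runs where ckanapi is not installed) are assumptions for illustration.

```python
try:
    from ckanapi import NotFound, ValidationError
except ImportError:
    # Fallback stand-ins so this sketch runs without ckanapi installed
    class NotFound(Exception):
        pass

    class ValidationError(Exception):
        def __init__(self, error_dict):
            Exception.__init__(self, error_dict)
            self.error_dict = error_dict

def show_dataset_or_none(ckan, name):
    """Return the dataset dict, or None when the name/id is not found."""
    try:
        return ckan.call_action('package_show', {'id': name})
    except NotFound:
        return None

def create_dataset(ckan, **fields):
    """Return (dataset, errors); errors maps field names to messages."""
    try:
        return ckan.call_action('package_create', fields), {}
    except ValidationError as e:
        return None, e.error_dict

class FakeCKAN(object):
    """Stand-in client that always fails, to exercise both handlers."""
    def call_action(self, action, data_dict):
        if action == 'package_show':
            raise NotFound()
        raise ValidationError({'name': ['That URL is already in use.']})

fake = FakeCKAN()
print(show_dataset_or_none(fake, 'no-such-dataset'))
print(create_dataset(fake, name='my-dataset'))
```

Other exceptions, such as NotAuthorized, are deliberately left to propagate so that real permission problems are not hidden.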
File uploads for CKAN 2.2+ are supported by passing file-like objects to action shortcut methods:
from ckanapi import RemoteCKAN
ua = 'ckanapiexample/1.0 (+http://example.com/my/website)'
mysite = RemoteCKAN('http://myckan.example.com', apikey='real-key', user_agent=ua)
mysite.action.resource_create(
    package_id='my-dataset-with-files',
    url='dummy-value',  # ignored but required by CKAN<2.6
    upload=open('/path/to/file/to/upload.csv', 'rb'))
When using call_action you must pass file objects separately:
mysite.call_action('resource_create',
    {'package_id': 'my-dataset-with-files'},
    files={'upload': open('/path/to/file/to/upload.csv', 'rb')})
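The upload examples leave the opened file for Python to close eventually. A small wrapper can close it deterministically with a with block. This is a sketch: upload_resource is a hypothetical helper, and RecordingCKAN stands in for a real client so the example runs without a CKAN site.

```python
import os
import tempfile

def upload_resource(ckan, package_id, path):
    """Create a resource from a local file, closing the file afterwards."""
    with open(path, 'rb') as f:
        return ckan.call_action(
            'resource_create',
            {'package_id': package_id},
            files={'upload': f})

class RecordingCKAN(object):
    """Stand-in client that records what would have been uploaded."""
    def call_action(self, action, data_dict=None, files=None):
        return {'action': action,
                'package_id': data_dict['package_id'],
                'size': len(files['upload'].read())}

# Write a small sample file to upload
path = os.path.join(tempfile.mkdtemp(), 'upload.csv')
with open(path, 'wb') as f:
    f.write(b'a,b\n1,2\n')

result = upload_resource(RecordingCKAN(), 'my-dataset-with-files', path)
print(result)
```

For CKAN versions older than 2.6 the data dict would also need the placeholder url value shown in the shortcut example.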
As of ckanapi 4.0 RemoteCKAN will keep your HTTP connection open using a requests session.
For long-running scripts make sure to close your connections by using RemoteCKAN as a context manager:
from ckanapi import RemoteCKAN
ua = 'ckanapiexample/1.0 (+http://example.com/my/website)'
with RemoteCKAN('https://demo.ckan.org', user_agent=ua) as demo:
    groups = demo.action.group_list(id='data-explorer')
print(groups)
Or by explicitly calling RemoteCKAN.close().
A similar class is provided for accessing local CKAN instances from a plugin in the same way as remote CKAN instances. Unlike CKAN's get_action, LocalCKAN prevents data from one action call leaking into the next, which can cause issues that are very hard to debug.
This class defaults to using the site user with full access.
from ckanapi import LocalCKAN, ValidationError
registry = LocalCKAN()
try:
    registry.action.package_create(name='my-dataset', title='this will work fine')
except ValidationError:
    print('unless my-dataset already exists')
For extra caution pass a blank username to LocalCKAN and only actions allowed by anonymous users will be permitted.
from ckanapi import LocalCKAN
anon = LocalCKAN(username='')
print(anon.action.status_show())
To log extra information whenever LocalCKAN ckanapi commands are executed, enable this config option in your CKAN INI file:
ckanapi.log_local = True
The output of the log will look like:
INFO [ckan.ckanapi] OS User <user> executed LocalCKAN: ckanapi <args>
A class is provided for making action requests to a webtest.TestApp instance for use in CKAN tests:
from ckanapi import TestAppCKAN
from webtest import TestApp
test_app = TestApp(...)
demo = TestAppCKAN(test_app, apikey='my-test-key')
groups = demo.action.group_list(id='data-explorer')
To run the tests:
python setup.py test
🇨🇦 Government of Canada / Gouvernement du Canada
The project files are covered under Crown Copyright, Government of Canada and are distributed under the MIT license. Please see COPYING / COPYING.fr for full details.