Python package to detect suspicious OSM changesets
OSMCha/osmcha
OSM Changeset Analyser, osmcha, is a Python package to detect suspicious OSM changesets. It was designed to be used with osmcha-django, but can also be used standalone or in other projects.
You can report issues or request new features in the osmcha-frontend repository.
pip install osmcha
You can read a replication changeset file directly from the web:
from osmcha.changeset import ChangesetList
c = ChangesetList('https://planet.openstreetmap.org/replication/changesets/002/236/374.osm.gz')
or from your local filesystem:
c = ChangesetList('tests/245.osm.gz')
c.changesets will return a list containing the data of all the changesets listed in the file.
You can filter the changesets by passing a GeoJSON file containing a polygon of your area of interest to ChangesetList as the second argument.
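As a minimal sketch of preparing such a file, the snippet below writes a polygon to disk using only the standard library. The helper name `write_area_geojson`, the file name, and the coordinates are hypothetical, and the layout (a FeatureCollection whose first feature is the polygon) is an assumption about what ChangesetList expects, so check it against the package before relying on it.

```python
import json
import tempfile
from pathlib import Path


def write_area_geojson(path, ring):
    """Write a FeatureCollection holding a single Polygon feature to `path`."""
    if ring[0] != ring[-1]:
        # GeoJSON linear rings must end on the point they start with.
        ring = ring + [ring[0]]
    collection = {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        ],
    }
    Path(path).write_text(json.dumps(collection))
    return collection


# Illustrative bounding box given as (longitude, latitude) pairs.
area_file = Path(tempfile.gettempdir()) / "interest_area.geojson"
area = write_area_geojson(
    area_file,
    [[-46.8, -23.8], [-46.3, -23.8], [-46.3, -23.4], [-46.8, -23.4]],
)

# With osmcha installed, the file is then passed as the second argument:
# c = ChangesetList('tests/245.osm.gz', str(area_file))
```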
Finally, to analyse a specific changeset, do:
from osmcha.changeset import Analyse
ch = Analyse(changeset_id)
ch.full_analysis()
You can customize the detection rules by defining your preferred values when initializing the Analyse class. The default values are shown below.
ch = Analyse(
    changeset_id,
    create_threshold=200,
    modify_threshold=200,
    delete_threshold=30,
    percentage=0.7,
    top_threshold=1000,
    suspect_words=[...],
    illegal_sources=[...],
    excluded_words=[...]
)
The command line interface can be used to verify a specific changeset directly from the terminal.
Usage: osmcha <changeset_id>
osmcha works by analysing how many map features the changeset created, modified or deleted, and by verifying the presence of some suspect words in the comment, source and imagery_used fields of the changeset. Furthermore, we also consider whether the editor software used allows importing data or making mass edits. We consider the following to be powerful editors: JOSM, Merkaartor, level0, QGIS and ArcGIS.
In the Usage section, you can see how to customize some of these detection rules.
We tag a changeset as a possible import if the number of created elements is greater than 70% of the sum of elements created, modified and deleted, and if it creates more than 1000 elements, or more than 200 elements when it was made with one of the powerful editors.
We consider a changeset a mass modification if the number of modified elements is greater than 70% of the sum of elements created, modified and deleted, and if it modifies more than 200 elements.
All changesets that delete more than 1000 elements are considered a mass deletion. If the changeset deletes between 200 and 1000 elements and the number of deleted elements is greater than 70% of the sum of elements created, modified and deleted, it is also tagged as a mass deletion.
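The three rules above can be sketched as one pure function. This is an illustration of the prose, not the package's actual implementation: the function name `classify`, the editor matching by substring, and the returned labels are assumptions. In particular, the 200-element lower bound for deletions follows the text above, while the defaults listed earlier show delete_threshold=30, so the real package may behave differently.

```python
# Editors treated as "powerful" (names taken from the text, lowercased).
POWERFUL_EDITORS = ("josm", "merkaartor", "level0", "qgis", "arcgis")


def classify(create, modify, delete, editor="", percentage=0.7,
             top_threshold=1000, create_threshold=200,
             modify_threshold=200, delete_threshold=200):
    """Return the list of reasons a changeset would be flagged for."""
    total = create + modify + delete
    reasons = []
    if total == 0:
        return reasons

    # Assumption: the editor is identified by a substring of its name.
    powerful = any(name in editor.lower() for name in POWERFUL_EDITORS)

    # Possible import: mostly creations, and many of them.
    if create / total > percentage and (
        create > top_threshold or (powerful and create > create_threshold)
    ):
        reasons.append("possible import")

    # Mass modification: mostly modifications, above the threshold.
    if modify / total > percentage and modify > modify_threshold:
        reasons.append("mass modification")

    # Mass deletion: either very large, or large and mostly deletions.
    if delete > top_threshold or (
        delete > delete_threshold and delete / total > percentage
    ):
        reasons.append("mass deletion")

    return reasons
```

For example, `classify(300, 10, 5, editor="JOSM/1.5")` is flagged as a possible import, while the same counts from an editor outside the list are not, because 300 created elements only exceeds the lower threshold that applies to powerful editors.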
The suspect words are loaded from a yaml file. You can customize the words by setting another default file with an environment variable:
export SUSPECT_WORDS=<path_to_the_file>
or pass a list of words to the Analyse class; see the Customizing Detection Rules section for more information. We use a list of illegal sources to analyse the source and imagery_used fields, and another, more general list to examine the comment field. We also have a list of excluded words to avoid false positives.
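To illustrate how an excluded-words list can prevent false positives, here is a minimal sketch. The function `find_suspect_words` and the example words are hypothetical, and plain case-insensitive substring matching is an assumption; the package's own matching may differ.

```python
def find_suspect_words(text, suspect_words, excluded_words=()):
    """Return the suspect words found in `text`, ignoring excluded words."""
    lowered = text.lower()
    # Blank out excluded words first so they cannot trigger a match.
    for word in excluded_words:
        lowered = lowered.replace(word.lower(), " ")
    return [word for word in suspect_words if word.lower() in lowered]


# "important" contains "import", so it is excluded to avoid a false positive.
comment = "Fixed important street name"
flagged = find_suspect_words(comment, ["import"], excluded_words=["important"])
```

The same helper could be applied to the source and imagery_used fields with the illegal sources list in place of the general one.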
Verifies whether the user has fewer than 5 edits or fewer than 5 mapping days.
Changesets created by users who have received more than one block will be flagged.
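The two user checks can be expressed in a few lines. This is only a sketch of the rules as stated: the function name `user_flags`, its parameters, and the flag labels are hypothetical, and how the package obtains the edit count, mapping days, and block count is not shown here.

```python
def user_flags(edit_count, mapping_days, blocks_received):
    """Return the user-related reasons for flagging a changeset."""
    flags = []
    # New mapper: fewer than 5 edits or fewer than 5 mapping days.
    if edit_count < 5 or mapping_days < 5:
        flags.append("new mapper")
    # More than one block received by the user.
    if blocks_received > 1:
        flags.append("user has multiple blocks")
    return flags
```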
If you need to use OSMCha with another OSM server instance, set the OSM_SERVER_URL environment variable to that server's URL, without a trailing slash. Example:
export OSM_SERVER_URL='https://www.openhistoricalmap.org'
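The sketch below shows why the trailing slash matters when a request URL is built by joining the server URL and an API path. The function `changeset_download_url` is hypothetical, and the default server value is an assumption; the `/api/0.6/changeset/<id>/download` path is the OSM API v0.6 endpoint for downloading a changeset's contents.

```python
import os


def changeset_download_url(changeset_id, default="https://www.openstreetmap.org"):
    """Build the changeset download URL for the configured OSM server."""
    # Assumption: fall back to the main OSM server when the variable is unset.
    server = os.environ.get("OSM_SERVER_URL", default)
    # A trailing slash in `server` would produce "//api" in the result.
    return f"{server}/api/0.6/changeset/{changeset_id}/download"
```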
To run the tests on osmcha:
git clone https://github.com/osmcha/osmcha.git
cd osmcha
pip install -e .[test]
py.test -v
Update the version number in osmcha/__init__.py and execute the following commands:
python setup.py bdist_wheel
twine upload dist/osmcha-{version}...
Check CHANGELOG for the version history.
- osmcha-django - backend and API
- osmcha-frontend - frontend of the OSMCha application
- osm-compare - library that analyses OSM features to feed into OSMCha
GPLv3