Apache2 / Nginx / IIS logs analyzer: parse access logs and view dynamically generated statistics
elB4RTO/LogDoctor
- Overview
- Installation and usage
- Updates
- Before you start
- Logs data
- Statistics
- Extra features
- Final considerations
- Languages
- Contributions
LogDoctor is a parser for web server access logs which lets you view dynamic statistics of the collected data.
Supported web servers are Apache2, Nginx and IIS.
LogDoctor is a hard fork of Craplog.
- From binary:
  - C++ 20
  - Qt6 (Framework 6.6+, Linguist, Widgets, Charts, Sql, Network)
- From source:
  - all the above
  - CMake
  - gcc / clang / msvc
- As Docker:
  - Docker
Download a pre-compiled Release, or follow the step-by-step guide in HOW_TO_COMPILE.md.
Run the executable.
To check for updates, open the menu Utilities → Version check.
When you run LogDoctor for the first time, you will most likely see an empty list of log files.
Head to the configurations section and take a look at least at the logs format settings. Only files containing logs that match the given format will be shown in the list.
Archived (gzipped) log files can be used as well as normal files.
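As an illustration of treating both kinds of file the same way, here is a minimal Python sketch that checks the gzip magic bytes and opens the file accordingly. It is not LogDoctor's implementation; the function name `read_log_lines` is hypothetical.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def read_log_lines(path):
    """Return the lines of a log file, whether it is plain text or gzipped."""
    path = Path(path)
    # Sniff the header instead of trusting the file extension.
    with path.open("rb") as raw:
        is_gzip = raw.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\r\n") for line in handle]
```

Checking the header rather than the `.gz` suffix means rotated files with unusual names are still recognised.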
Parsed data will be stored in an SQLite database, which makes it easy to transport, view and edit as you please.
If LogDoctor's functionalities aren't enough for your needs, you can always use a DB manager or the SQLite API to make your own queries and retrieve the data you need.
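A possible starting point for writing your own queries is to inspect the database first. The sketch below uses Python's standard `sqlite3` module to list every table and its columns. It makes no assumption about LogDoctor's actual schema; the function name `describe_database` and the path you pass to it are up to you.

```python
import sqlite3


def describe_database(db_path):
    """Map each table name in an SQLite file to its list of column names."""
    connection = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        schema = {}
        for table in tables:
            # Quote the identifier, then ask SQLite for the column list.
            quoted = '"' + table.replace('"', '""') + '"'
            columns = connection.execute(f"PRAGMA table_info({quoted})").fetchall()
            # In each table_info row, index 1 holds the column name.
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        connection.close()
```

Once the real table and column names are known, ordinary `SELECT` statements can be run through the same connection API.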
Not all the available log fields (especially for Apache2 and Nginx) are taken into consideration.
The considered fields are:
- Date and Time
- Request stuff: Protocol, Method, URI and Query
- Server stuff: Bytes received, Bytes sent and Time taken
- Client stuff: User-agent, IP address, Cookie and Referrer site
Further information can be found in the wiki or while running LogDoctor.
Various options for log files can be configured.
When you parse a file, it will be hashed using the SHA256 algorithm and the hash will be stored in another database, to keep track of which files you've already used and help you avoid parsing them twice.
The SHA256 algorithm produces an irreversible hash, which means that no information about the file can be retrieved from the hash.
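The idea of tracking already-parsed files by hash can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, and an in-memory set stands in for the hashes database.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_new_file(path, seen_hashes):
    """Record the file's hash and report whether it had not been seen before."""
    file_hash = sha256_of_file(path)
    if file_hash in seen_hashes:
        return False  # same content was already parsed
    seen_hashes.add(file_hash)
    return True
```

Because the hash depends only on the content, a renamed or moved copy of an already-parsed file is still recognised.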
LogDoctor will never collect or use any information about you or the way you use it.
A different logs path can be used for each of the three supported web servers.
It can be the default system folder or any folder you decide to use; just set it in the options.
Before you start parsing logs, you must set up the logs format.
Head to the configurations section, under Logs select the web server you want to configure and click Format.
Once inside the Format section, you can insert the log format string you're using. Don't forget to use the Generate preview button to generate a log line sample and check the correctness of the format!
For reliability reasons, LogDoctor does not support the use of the Carriage Return inside the log format string.
For Apache2, the log format string must be specified. Any valid format is supported.
To retrieve your format string:
- open the configuration file /etc/apache2/apache2.conf
- usually, the line you're looking for is the one starting with LogFormat and ending with combined. It should be somewhere near the end of the file.
- you must not paste the whole line, just the part holding the format string.
Example:
- this is the whole line:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
- this is the format string (please notice that you have to remove the enclosing quotes/apostrophes as well):
%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"
More information can be found in the wiki or while setting the format.
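The extraction described above for Apache2 can be sketched with a regular expression. The helper below is hypothetical and only handles the common form with a nickname at the end (such as `combined`); it returns the text between the enclosing quotes, untouched.

```python
import re

# LogFormat "<format string>" <nickname>
LOGFORMAT_RE = re.compile(
    r'^\s*LogFormat\s+"(?P<format>.*)"\s+(?P<nickname>\S+)\s*$')


def extract_apache_format(line):
    """Return the format string of a LogFormat directive, or None if no match."""
    match = LOGFORMAT_RE.match(line)
    if match is None:
        return None
    # The greedy group stops at the last quote before the nickname,
    # so escaped quotes (\") inside the format string are preserved.
    return match.group("format")
```

Applied to the example line above, it returns the format string shown in the example, without the enclosing quotes.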
For Nginx, the log format string must be specified. Any valid format is supported.
To retrieve your format string:
- open the configuration file /usr/local/etc/nginx/nginx.conf
- usually, the line you're looking for is the one starting with log_format main. It should be somewhere in the middle of the file.
- one important thing: don't paste the indentation and new lines! The default format is usually declared across consecutive, indented lines. You must reduce it to a single continuous string (also removing the apostrophes in the middle of it). The best way is to do this inside the configuration file, then save and restart Nginx to see if any error is thrown.
Example:
- this is the whole line:
log_format main '$remote_addr - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" "$gzip_ratio"';
- this is the resulting format string (please notice that you have to remove the enclosing apostrophes/quotes as well):
$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$gzip_ratio"
More information can be found in the wiki or while setting the format.
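Joining the quoted pieces of an Nginx `log_format` declaration by hand is error-prone, so here is a small Python sketch of the same reduction. The function name is hypothetical, and it assumes every piece is enclosed in apostrophes, as in the example above.

```python
import re


def extract_nginx_format(declaration):
    """Join the apostrophe-quoted pieces of a log_format declaration."""
    # Capture everything between the format name and the closing semicolon.
    match = re.search(r"log_format\s+\S+\s+(.*?);\s*$", declaration, re.DOTALL)
    if match is None:
        return None
    # Keep only the quoted pieces; indentation and new lines between them vanish.
    pieces = re.findall(r"'([^']*)'", match.group(1))
    return "".join(pieces) if pieces else None
```

The same function works whether the declaration is written on one line or spread over several indented lines.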
For IIS, the supported log formats are W3C, NCSA and IIS.
The NCSA and IIS modules don't allow any modification by the user, so nothing more has to be specified.
The W3C module instead allows the user to decide which fields to log, so you must declare the log format string you're using.
To retrieve your format string (for the W3C module only):
- open any of the log files generated by this module
- the line you're looking for is the one starting with #Fields:, usually at the beginning of the file.
Example:
- this is the whole line:
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
- this is the format string:
date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
More information can be found in the wiki or while setting the format.
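For the W3C module, the lookup can be automated by scanning the log file for the `#Fields:` directive. A minimal sketch, with a hypothetical function name:

```python
def extract_w3c_fields(lines):
    """Return the format string from the first '#Fields:' line, or None."""
    prefix = "#Fields:"
    for line in lines:
        if line.startswith(prefix):
            # Drop the directive name and the surrounding whitespace.
            return line[len(prefix):].strip()
    return None
```

It accepts any iterable of lines, so an open file object can be passed directly.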
You can add elements to the blacklist to avoid storing the lines containing those elements.
Each web server has its own list.
As with the blacklist, you can add elements to the warnlist.
Warnlists mark the lines that trigger them with a warning. Warnings can be viewed in the related statistics section.
Each web server has its own lists.
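The difference between the two kinds of list can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the field names, the exact-match comparison and the function name do not come from LogDoctor.

```python
def classify_line(fields, blacklist, warnlist):
    """Return 'drop', 'warn' or 'keep' for one parsed log line.

    `fields` maps a field name to its value; `blacklist` and `warnlist`
    map a field name to the set of values that trigger them.
    """
    for name, banned in blacklist.items():
        if fields.get(name) in banned:
            return "drop"  # blacklisted lines are never stored
    for name, flagged in warnlist.items():
        if fields.get(name) in flagged:
            return "warn"  # stored, but marked with a warning
    return "keep"
```

The blacklist is checked first, so a line matching both lists is dropped rather than stored with a warning.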
Most of the statistics sections allow you to set filters on the log fields, to skim data by only including lines matching those parameters.
In the warning section you can view the lines which are triggering a warning.
Warnings are generated dynamically depending on your warnlists: changing the elements in the warnlists will produce different warnings.
In the speed section you can view how fast your server has been at serving content (if you logged the time taken, of course).
The count section is very simple. It just shows the recurrence of the elements for a specific field.
In the time of day section you can see the traffic, in terms of number of requests logged.
When viewing a period of time, the mean value (of all the logged days in that period) is shown.
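To make the averaging concrete, here is a small Python sketch of a per-hour mean over a period. It assumes that "logged days" means days with at least one request; the function name is hypothetical and the hourly granularity is chosen only for the example.

```python
from datetime import datetime


def mean_requests_per_hour(timestamps):
    """Return 24 values: the mean number of requests for each hour of the day."""
    per_hour = [0] * 24
    days = set()
    for moment in timestamps:
        per_hour[moment.hour] += 1
        days.add(moment.date())
    if not days:
        return [0.0] * 24
    # Divide each hourly total by the number of days that have any log entry.
    return [count / len(days) for count in per_hour]
```

For instance, three requests at 10:xx spread over two logged days give a mean of 1.5 requests for that hour.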
In the relational section you can view how many times a specific field led to another.
This section is more suited for long periods of time.
In the globals section you can have an overview of your logs history.
Use the built-in logs viewer to inspect the content of your log files.
Color schemes will be applied using the currently set log format.
A block-note utility is available at Tools → BlockNote, which can be used to temporarily write text, notes, etc.
Simple mini-games to kill time.
LogDoctor can automatically back up your logs database file, so you can recover your data in case something goes wrong.
Move inside LogDoctor's folder (if you don't know or remember the path, open the Utilities → Infos → Paths menu to view it) and open the folder named "backups".
Here you will find the backups with an increasing index, where '.1' represents the newest.
A new backup is made every time you quit LogDoctor after doing a job which affected the database in any way.
Only the logs-data database will be backed up; the hashes database won't.
This is because a collision between two hashes is extremely unlikely, and the hashes are only useful for a short period of time (that is, until you or your web server delete the original log files).
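The indexed backup scheme, where '.1' is always the newest copy, can be sketched as a simple rotation. The file naming, the `keep` limit and the function name below are assumptions made for the example, not LogDoctor's actual behaviour.

```python
import shutil
from pathlib import Path


def rotate_backups(database, backups_dir, keep=3):
    """Copy `database` into `backups_dir` as '<name>.1', shifting older copies."""
    database = Path(database)
    backups_dir = Path(backups_dir)
    backups_dir.mkdir(parents=True, exist_ok=True)
    # Drop the oldest copy so the shift below never overwrites a file.
    oldest = backups_dir / f"{database.name}.{keep}"
    if oldest.exists():
        oldest.unlink()
    # Shift from the highest index down: .2 becomes .3, then .1 becomes .2.
    for index in range(keep - 1, 0, -1):
        current = backups_dir / f"{database.name}.{index}"
        if current.exists():
            current.rename(backups_dir / f"{database.name}.{index + 1}")
    newest = backups_dir / f"{database.name}.1"
    shutil.copy2(database, newest)
    return newest
```

Shifting from the highest index down keeps every rename target free, so no backup is overwritten during the rotation.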
Estimated parsing speed: 10~200 MB/s
Take this estimate with a grain of salt: it may be higher or lower depending on a variety of factors, such as the build type, your hardware, the complexity of the logs, the complexity of the blacklist and the workload of your system during the execution.
LogDoctor is available in multiple languages, most of which are automatically translated. Want to contribute to improving them?
LogDoctor is under constant development.
If you have suggestions about how to improve it, please open an issue.
If you want to contribute to the code, please read the Contribution Guidelines.
If you want to contribute to the translation, please read the Translation Guidelines.