Naineel Soyantar

Analyzing logs - the lame way

In the previous blog post, we saw how to create a production-ready logging system. In this blog post, we will see how to analyze the logs generated by the system.

🫱 Let's build a production-ready logger using Winston 🫲

While many advanced tools are available for log analysis, such as the ELK stack and Splunk, we will see how to do it the lame way using JavaScript🥱🥱.

The Log Format

The logs generated by the logger are in the following format:

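```
2024-07-04 10:42:09 error: ::ffff:72.200.26.133 GET /api/v1/resource 201 0.082 ms 0.177 ms ::ffff:72.200.26.133
```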

Understanding The Log Format

The log format includes the following fields:

  • timestamp: The date and time at which the log was generated
  • level: The log level (debug, info, warn, error)
  • source IP: The IP address of the client making the request
  • method: The HTTP method of the request (GET, POST, PUT, DELETE)
  • path: The path of the request made by the client
  • status code: The status code of the response sent by the server
  • response time: The time taken by the server to respond to the request
  • total time: The total time taken by the server to process the request
  • remote IP: The IP address of the remote host, logged again at the end of the line
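
Mapping the sample log line above onto these fields:

```
2024-07-04 10:42:09    -> timestamp
error                  -> level
::ffff:72.200.26.133   -> source IP
GET                    -> method
/api/v1/resource       -> path
201                    -> status code
0.082 ms               -> response time
0.177 ms               -> total time
::ffff:72.200.26.133   -> remote IP
```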

The Log Analyzer

Laying The Groundwork For The Log Analyzer

First, create a regular expression to parse the log format.

```javascript
const logLevelPattern =
  /(?<date>\d{4}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}:\d{2})\s+(?<level>\w+):\s+(?<client_ip>::ffff:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|::1)\s+(?<method>\w+)\s+(?<path>\/\S*)\s+(?<status>\d{3})\s+(?<response_time_1>\d+\.\d{3})\s+ms\s+(?<response_time_2>\d+\.\d{3})\s+ms\s+(?<other_ip>::ffff:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|::1)/g;
```


The regular expression seems daunting at first sight, but it's not that hard to understand. It consists of a named group for each field in the log format. The named groups are enclosed in `(?<name>pattern)`.

Using named groups makes it easier to extract the fields from the log line. For example, to extract the `level` field from the log line, we can use the following code:

```javascript
const logLine =
  '2024-07-04 10:42:09 error: ::ffff:72.200.26.133 GET /api/v1/resource 201 0.082 ms 0.177 ms ::ffff:72.200.26.133';

// matchAll is a String method, so call it on the log line, not on the regex
const matches = logLine.matchAll(logLevelPattern);

for (const m of matches) {
  // We are interested in the groups from the match object
  // because we have used named groups in the regex.
  console.log(m.groups);
  console.log(m.groups.level);
}
```

The `match` object will have this kind of output:

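```
{
  date: '2024-07-04',
  time: '10:42:09',
  level: 'error',
  client_ip: '::ffff:72.200.26.133',
  method: 'GET',
  path: '/api/v1/resource',
  status: '201',
  response_time_1: '0.082',
  response_time_2: '0.177',
  other_ip: '::ffff:72.200.26.133'
}
error
```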

The `groups` property of the match object contains the named groups extracted from the log line. We can access the `level` field using `m.groups.level`.

Analyzing The Logs Using The Log Analyzer

Now that we have the regular expression to parse the log format, let's create a log analyzer that reads the logs from a file and analyzes them.

```javascript
import { EventEmitter } from 'events';
import fs from 'fs';
import path from 'path';

class LogAnalyzer extends EventEmitter {
  // Initialize the log analyzer with the log file stream and the log level
  // pattern, and set up the objects that store the analysis results.
  constructor(logFileStream, logLevelPattern) {
    super();
    this.logFileStream = logFileStream;
    this.logLevelPattern = logLevelPattern;
    this.time = {};
    this.paths = {};
    this.ips = {};
    this.responseTime = [];
    this.totalResponseTime = [];
    this.count = 0;
  }

  // Method to start the analysis
  analyze() {
    this.logFileStream.on('ready', () =>
      console.log('================START=======================')
    );
    this.logFileStream.on('data', this.processChunk.bind(this));
    this.logFileStream.on('end', this.finishAnalysis.bind(this));
  }

  // Method to process each chunk of data read from the log file.
  // Note: a log line that straddles a chunk boundary won't match the
  // pattern and is silently skipped -- one of the reasons this is "lame".
  processChunk(chunk) {
    console.log('Processing chunk:', this.count);
    this.logFileStream.pause();
    const output = chunk.toString().matchAll(this.logLevelPattern);
    for (const match of output) {
      // Extract the named groups from the match object
      const { groups } = match;
      // Update the objects with the extracted fields
      this.updateObjects(groups);
    }
    this.count++;
    this.logFileStream.resume();
  }

  // Method to update the objects with the extracted fields
  updateObjects(groups) {
    const hourKey = groups.time.split(':')[0] + ':00';
    this.time[hourKey] = (this.time[hourKey] || 0) + 1;
    this.updateObject(this.paths, groups.path);
    this.updateObject(this.ips, groups.client_ip);
    this.responseTime.push(parseFloat(groups.response_time_1));
    this.totalResponseTime.push(parseFloat(groups.response_time_2));
  }

  // Method to update an object with a key
  updateObject(obj, key) {
    obj[key] = (obj[key] || 0) + 1;
  }

  // Method to finish the analysis
  finishAnalysis() {
    console.log('================END=========================');
    console.log("Let's perform some analysis on the log file");
    this.emit('analysisComplete', this.getAnalysisResults());
  }

  // Method to sort an object by the count of its keys, most used first
  // (max = true) or least used first (max = false).
  sortingObject(obj, max = true) {
    if (max) {
      return Object.entries(obj).sort((a, b) => b[1] - a[1]);
    } else {
      return Object.entries(obj).sort((a, b) => a[1] - b[1]);
    }
  }

  // Method to get the analysis results
  getAnalysisResults() {
    return {
      timeDistribution: this.sortingObject(this.time, true).slice(0, 4),
      mostUsedPathDistribution: this.sortingObject(this.paths, true).slice(0, 4),
      leastUsedPathDistribution: this.sortingObject(this.paths, false).slice(0, 4),
      ipDistribution: this.sortingObject(this.ips, true).slice(0, 4),
      avgResponseTime: this.average(this.responseTime).toFixed(5),
      avgTotalResponseTime: this.average(this.totalResponseTime).toFixed(5),
    };
  }

  // Method to calculate the average of an array of numbers
  average(arr) {
    return arr.reduce((a, b) => a + b, 0) / arr.length;
  }
}

// Usage:
const logLevelPattern =
  /(?<date>\d{4}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}:\d{2})\s+(?<level>\w+):\s+(?<client_ip>::ffff:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|::1)\s+(?<method>\w+)\s+(?<path>\/\S*)\s+(?<status>\d{3})\s+(?<response_time_1>\d+\.\d{3})\s+ms\s+(?<response_time_2>\d+\.\d{3})\s+ms\s+(?<other_ip>::ffff:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|::1)/g;

const logFileStream = fs.createReadStream(
  path.resolve('./logs/combined/generated_logs.log'),
  {
    encoding: 'utf-8',
    highWaterMark: 10 * 1024, // 10KB chunks
  }
);

const analyzer = new LogAnalyzer(logFileStream, logLevelPattern);

analyzer.on('analysisComplete', (results) => {
  console.log('Analysis complete. Results:', results);
});

analyzer.analyze();
```

Workflow:

  • The `logFileStream` is created using `fs.createReadStream` to read the log file in chunks.
  • The `LogAnalyzer` instance is constructed with the `logFileStream` and `logLevelPattern` as arguments.
  • The `analyze` method is called on the `LogAnalyzer` instance to start the analysis.
  • The `processChunk` method runs every time a chunk of data is read from the log file. It extracts the fields from each log line in the chunk using the `logLevelPattern`.
  • Inside `processChunk`, `updateObjects` is called to update the counting objects with the extracted fields.
  • The `finishAnalysis` method runs when the whole file has been read. It emits an `analysisComplete` event with the analysis results.
  • The `sortingObject` method sorts the counting objects by the count of each key.
  • The `getAnalysisResults` method assembles the analysis results.
  • The `average` method calculates the average of an array of numbers.
  • The `analysisComplete` event is listened to, and the analysis results are logged to the console.

The `LogAnalyzer` class reads the log file in chunks and processes each chunk to extract the fields from each log line. It updates the counting objects with the extracted fields and calculates the average response time and total response time. Finally, it emits an `analysisComplete` event with the analysis results.

The analysis results include the time distribution, most used path distribution, least used path distribution, IP distribution, average response time, and average total response time.
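
Each distribution is an array of `[key, count]` pairs produced by `sortingObject`. Here is a minimal illustration of what that sort does, using hypothetical counts rather than real log data:

```javascript
// Hypothetical path counts (not from the real log file):
const pathCounts = { '/': 12, '/login': 5, '/register': 2 };

// What sortingObject(pathCounts, true) boils down to:
const sorted = Object.entries(pathCounts).sort((a, b) => b[1] - a[1]);
console.log(sorted);
// => [ [ '/', 12 ], [ '/login', 5 ], [ '/register', 2 ] ]
```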

Output: the four busiest hours, the four most and least requested paths, the four most frequent client IPs, and the two averages, all printed to the console.

This way, we can analyze the logs generated by the logger in an absolutely lame way 😂😂.

You can find the complete code for the log analyzer here.

Generating Random Logs

Let's address the elephant in the room: not everyone has a log file from a real production system to analyze. So, let's generate some random logs using the `faker` library.

Faker.js documentation

```javascript
import fs from 'fs';
import { faker } from '@faker-js/faker';

// Sample data
const methods = ['GET', 'POST', 'PUT', 'DELETE'];
const paths = ['/', '/api/v1/resource', '/login', '/register'];
const ips = [
  '45.94.188.156', '238.249.31.148', '91.113.1.90', '113.232.207.105',
  '96.129.247.250', '105.171.179.234', '144.42.125.14', '109.111.74.178',
  '72.200.26.133', '83.65.134.149',
];
const statusCodes = [200, 201, 400, 404, 500];
const levels = ['info', 'error', 'warn', 'debug'];

// Function to generate a random timestamp between two dates
const randomTimestamp = (start, end) => {
  const startDate = start.getTime();
  const endDate = end.getTime();
  return new Date(startDate + Math.random() * (endDate - startDate));
};

// Generate log entries
const logEntries = [];
const startDate = new Date(2024, 6, 4, 10, 41, 0); // Months are 0-based in JavaScript
const endDate = new Date(2024, 6, 4, 10, 45, 0);

for (let i = 0; i < 1000; i++) {
  // Generate 1000 random log entries
  const timestamp = randomTimestamp(startDate, endDate);
  // 'YYYY-MM-DD HH:mm:ss' -- note the space, or the regex above won't match
  const dateStr = timestamp.toISOString().replace('T', ' ').substring(0, 19);
  const level = faker.helpers.arrayElement(levels);
  const clientIp = faker.helpers.arrayElement(ips);
  const method = faker.helpers.arrayElement(methods);
  const path = faker.helpers.arrayElement(paths);
  const status = faker.helpers.arrayElement(statusCodes);
  const responseTime1 = (Math.random() * (0.1 - 0.05) + 0.05).toFixed(3);
  const responseTime2 = (Math.random() * (0.3 - 0.1) + 0.1).toFixed(3);
  const otherIp = faker.helpers.arrayElement(ips);

  logEntries.push(
    `${dateStr} ${level}: ::ffff:${clientIp} ${method} ${path} ${status} ${responseTime1} ms ${responseTime2} ms ::ffff:${otherIp}`
  );
}

// Write to log file
fs.writeFileSync('./logs/combined/generated_logs.log', logEntries.join('\n'));
console.log('Log file generated successfully.');
```

This script generates 1000 random log entries and writes them to the log file `generated_logs.log`. The entries have random timestamps, log levels, client IPs, methods, paths, status codes, response times, and remote IPs.
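
A generated entry looks something like this (illustrative values drawn from the sample arrays above):

```
2024-07-04 10:43:17 warn: ::ffff:91.113.1.90 POST /login 404 0.073 ms 0.214 ms ::ffff:144.42.125.14
```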

You can find the complete code for generating random logs here.

Conclusion

In this blog post, we saw how to analyze the logs generated by the logger. We created a log analyzer that reads the logs from a file, parses the log format, and analyzes them. We also generated random logs using the `faker` library to test the log analyzer.

While the log analyzer is a simple way to analyze logs, it is not suitable for large log files or real-time analysis; for those, reach for advanced tools like the ELK stack or Splunk. But for small log files or testing purposes, it is a good starting point.
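
One easy upgrade worth noting: the chunk-based reader silently drops any log line that straddles a chunk boundary. A minimal sketch (not from the original post) that avoids this by reading line by line with Node's built-in `readline` module, reusing the same `logLevelPattern`:

```javascript
import fs from 'fs';
import readline from 'readline';

// Same pattern as before: one named group per log field.
const logLevelPattern =
  /(?<date>\d{4}-\d{2}-\d{2})\s+(?<time>\d{2}:\d{2}:\d{2})\s+(?<level>\w+):\s+(?<client_ip>::ffff:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|::1)\s+(?<method>\w+)\s+(?<path>\/\S*)\s+(?<status>\d{3})\s+(?<response_time_1>\d+\.\d{3})\s+ms\s+(?<response_time_2>\d+\.\d{3})\s+ms\s+(?<other_ip>::ffff:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|::1)/g;

// readline emits one complete line at a time, so no log entry can be
// split across a chunk boundary.
const rl = readline.createInterface({
  input: fs.createReadStream('./logs/combined/generated_logs.log', {
    encoding: 'utf-8',
  }),
  crlfDelay: Infinity, // treat \r\n as a single line break
});

rl.on('line', (line) => {
  for (const m of line.matchAll(logLevelPattern)) {
    console.log(m.groups.level); // or feed m.groups into updateObjects()
  }
});
```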

So... Thanks for reading this blog post. I hope you enjoyed it. If you have any questions or feedback, feel free to leave a comment below. And I'll... brb.

PS:

I typically write these articles in TIL (today I learned) form, sharing the things I learn during my daily work or afterward. I aim to post once or twice a week with everything I have learned in the past week.
