This project implements a hybrid log classification system that combines three complementary approaches, each suited to a different level of log-pattern complexity. Together they cover predictable patterns, complex patterns with ample training data, and complex patterns with little or no labeled data.
Classification Approaches
Regular Expression (Regex):
Handles the simplest, most predictable patterns.
Useful for patterns that are easily captured using predefined rules.
Sentence Transformer + Logistic Regression:
Manages complex patterns when there is sufficient training data.
Utilizes embeddings generated by Sentence Transformers and applies Logistic Regression as the classification layer.
LLM (Large Language Models):
Handles complex patterns when sufficient labeled training data is not available.
Provides a fallback or complementary approach to the other methods.
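The three approaches above can be chained so that cheaper methods run first. The following is a minimal sketch, not the project's actual code: the regex rules, the labels, the try-in-order routing, and the convention that a stage returns None when it cannot decide are all assumptions made for illustration. The embedding and LLM stages are passed in as plain callables so the sketch stays self-contained.

```python
import re
from typing import Callable, Optional

# Hypothetical rules: these patterns and labels are illustrative only,
# not taken from the project.
REGEX_RULES = [
    (re.compile(r"User \w+ logged (in|out)", re.IGNORECASE), "User Action"),
    (re.compile(r"Backup (started|completed|ended)", re.IGNORECASE), "System Notification"),
]

# A stage takes a log message and returns a label, or None if it cannot decide.
Classifier = Callable[[str], Optional[str]]


def classify_with_regex(message: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if no rule matches."""
    for pattern, label in REGEX_RULES:
        if pattern.search(message):
            return label
    return None


def classify_hybrid(
    message: str,
    embedding_classifier: Classifier,
    llm_classifier: Classifier,
) -> str:
    """Try regex first, then the embedding model, then the LLM (assumed order)."""
    for stage in (classify_with_regex, embedding_classifier, llm_classifier):
        label = stage(message)
        if label is not None:
            return label
    return "Unclassified"


if __name__ == "__main__":
    # Stand-in stages that never decide, so only the regex rules apply here.
    undecided = lambda message: None
    print(classify_hybrid("User alice logged in", undecided, undecided))
```

Ordering the stages this way keeps the inexpensive regex check in front of the slower model-based stages. A real deployment might instead route by log source or by model confidence.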
Folder Structure
training/:
Contains the code for training models using Sentence Transformer and Logistic Regression.
Includes the code for regex-based classification.
models/:
Stores the saved models, including Sentence Transformer embeddings and the Logistic Regression model.
resources/:
Contains resource files such as test CSV files, output files, and images.
Root Directory:
Contains the FastAPI server code (server.py).
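As a rough illustration of the kind of code that training/ is described as holding, the sketch below embeds log messages with a Sentence Transformer and fits a Logistic Regression classifier on the embeddings. It is not the project's implementation: the model name all-MiniLM-L6-v2, the 0.5 confidence threshold, the "Unclassified" fallback label, and all function names are assumptions. The third-party imports are done lazily inside the functions, so the small pick_label helper works without sentence-transformers or scikit-learn installed.

```python
from typing import List, Sequence


def pick_label(
    probabilities: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.5,  # assumed cut-off, not from the project
) -> str:
    """Return the most probable label, or "Unclassified" below the threshold."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best] if probabilities[best] >= threshold else "Unclassified"


def train_classifier(
    messages: List[str],
    labels: List[str],
    model_name: str = "all-MiniLM-L6-v2",  # assumed embedding model
):
    """Embed the messages and fit Logistic Regression on the embeddings."""
    # Imported lazily so pick_label can be used without these packages.
    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression

    encoder = SentenceTransformer(model_name)
    embeddings = encoder.encode(messages)  # one dense vector per message
    classifier = LogisticRegression(max_iter=1000)
    classifier.fit(embeddings, labels)
    return encoder, classifier


def predict(encoder, classifier, message: str, threshold: float = 0.5) -> str:
    """Classify one message, falling back when the model is not confident."""
    probabilities = classifier.predict_proba(encoder.encode([message]))[0]
    return pick_label(list(probabilities), list(classifier.classes_), threshold)
```

Returning "Unclassified" for low-confidence predictions gives a caller a natural point at which to hand the message to another method, such as the LLM.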
Setup Instructions
Install Dependencies: Make sure you have Python installed on your system. Install the required Python libraries by running the following command:
pip install -r requirements.txt
Run the FastAPI Server: To start the server, use the following command:
uvicorn server:app --reload
Once the server is running, you can access the API at:
This project, including its code and resources, is intended solely for educational purposes and should not be used for any commercial purposes without proper authorization.