C# implementation of the Normalized Compression Distance (NCD) classification algorithm with Gzip compression
techjb/Gzip-Text-Classifier
This project is a C# implementation of the Normalized Compression Distance (NCD) classification algorithm with Gzip compression, described in this paper.
The original repository, written in Python, can be found at this link.
Also available as a NuGet package.
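
To illustrate the idea behind NCD classification, here is a minimal sketch using only `System.IO.Compression`. It is not the package's internal code: the `NcdSketch` class and its methods are hypothetical names introduced for this example, and joining the two texts with a single space before compressing is an assumption. The distance is computed as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length, and the predicted label is the majority label among the k nearest training texts.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of the package's API.
public static class NcdSketch
{
    // Length in bytes of the gzip-compressed UTF-8 encoding of the text.
    public static int CompressedLength(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        using var output = new MemoryStream();
        // Dispose the GZipStream before reading the length so all data is flushed.
        using (var gzip = new GZipStream(output, CompressionLevel.Optimal, leaveOpen: true))
        {
            gzip.Write(bytes, 0, bytes.Length);
        }
        return (int)output.Length;
    }

    // NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    // Assumption: x and y are joined with a single space before compression.
    public static double Ncd(string x, string y)
    {
        int cx = CompressedLength(x);
        int cy = CompressedLength(y);
        int cxy = CompressedLength(x + " " + y);
        return (cxy - Math.Min(cx, cy)) / (double)Math.Max(cx, cy);
    }

    // k-nearest-neighbor vote: take the k training texts with the smallest NCD
    // to the input and return the most frequent label among them.
    // Ties go to the label whose nearest sample is closest (LINQ sorts are stable).
    // Throws InvalidOperationException if the training set is empty.
    public static string Classify(
        string text,
        IEnumerable<(string Text, string Label)> training,
        int k = 3)
    {
        return training
            .OrderBy(sample => Ncd(text, sample.Text))
            .Take(k)
            .GroupBy(sample => sample.Label)
            .OrderByDescending(group => group.Count())
            .First()
            .Key;
    }
}
```

Calling `NcdSketch.Classify(text, samples, k: 3)` with a list of `(Text, Label)` tuples returns the predicted label. Each prediction compresses the input against every training text, so the cost grows linearly with the size of the training set.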
Predict a CSV test file:

    string trainFile = @"C:\Users\Chus\Downloads\ag_news_train.csv";
    string testFile = @"C:\Users\Chus\Downloads\ag_news_test.csv";

    GzipClassifierOptions gzipClassifierOptions = new()
    {
        TrainFile = trainFile,          // File path of the CSV training file
        ParallelismOnCalc = true,       // Use parallelism for the distance calculation. Default: true
        ParallelismOnTestFile = false,  // Use parallelism across test rows. Default: false
        K = 3,                          // Value of K in k-nearest-neighbor. Default: 3
        TextColumn = 0,                 // Text column number in the CSV file. Default: 0
        LabelColumn = 1,                // Label column number in the CSV file. Default: 1
        HasHeaderRecord = true,         // CSV has a header record. Default: true
        ConsoleOutput = true,           // Write console output during file prediction. Default: true
    };

    GzipClassifier gzipClassifier = new(gzipClassifierOptions);
    double result = gzipClassifier.PredictFile(testFile);
    Console.WriteLine(result);
Single text prediction:

    string text = "Socialites unite dolphin groups Dolphin groups, or \"pods\", rely on socialites to keep them from collapsing, scientists claim.";
    var prediction = gzipClassifier.Predict(text);
    Console.WriteLine(prediction);