| ML.NET | |
|---|---|
| Original author | Microsoft |
| Developer | .NET Foundation |
| Initial release | 7 May 2018[1] |
| Stable release | 3.0.0 / 28 November 2023 |
| Preview release | 3.0.0-preview.23511.1 / 14 October 2023 |
| Written in | C# and C++ |
| Operating system | Linux, macOS, Windows[2] |
| Platform | .NET Core, .NET Framework |
| Type | Machine learning library |
| License | MIT License[3] |
| Website | dot |
| Repository | github |
ML.NET is a free software machine learning library for the C# and F# programming languages.[4][5][6] It also supports Python models when used together with NimbusML. The preview release of ML.NET included transforms for feature engineering, such as n-gram creation, and learners to handle binary classification, multi-class classification, and regression tasks.[7] Additional ML tasks such as anomaly detection and recommendation systems have since been added, and other approaches such as deep learning will be included in future versions.[8][9]
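To make the idea of an n-gram transform concrete, the following is a minimal, library-independent sketch in Python. It is not ML.NET's API or implementation; the function names `word_ngrams` and `ngram_counts` are hypothetical, and tokenization is assumed to be a simple lowercase split on whitespace.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of `text` as a list of tuples.

    Assumption: tokens are produced by lowercasing and splitting on whitespace.
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=2):
    """Count every 1..max_n word n-gram, giving a bag-of-n-grams feature map."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


# Hypothetical example sentence; bigrams such as ("not", "good") capture
# word order that single-word counts would lose.
features = ngram_counts("the movie was not good", max_n=2)
```

A learner for a task such as binary classification would then consume these counts as numeric features.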
ML.NET brings model-based machine learning analytic and prediction capabilities to existing .NET developers. The framework is built upon .NET Core and .NET Standard, inheriting the ability to run cross-platform on Linux, Windows and macOS. Although the ML.NET framework is new, it originated in 2002 as a Microsoft Research project named TMSN (text mining search and navigation) for internal use within Microsoft products. It was renamed TLC (the learning code) around 2011. ML.NET was derived from the TLC library and, according to Dr. James McCaffrey of Microsoft Research, has largely surpassed its parent.[10]
Developers can train a machine learning model, or reuse an existing model from a third party, and run it offline in any environment. This means developers do not need a background in data science to use the framework. Support for the open-source Open Neural Network Exchange (ONNX) deep learning model format was introduced in ML.NET build 0.3. That release included other notable enhancements such as Factorization Machines, LightGBM, Ensembles, the LightLDA transform and OVA.[11] ML.NET integration with TensorFlow has been available since the 0.5 release. Build 0.7 added support for x86 and x64 applications, along with enhanced recommendation capabilities based on Matrix Factorization.[12] A full roadmap of planned features has been made available in the official GitHub repository.[13]
The first stable release of the framework, version 1.0, was announced at the Build developer conference in 2019. It added a Model Builder tool and AutoML (automated machine learning) capabilities.[14] Build 1.3.1 introduced a preview of deep neural network training using C# bindings[15] for TensorFlow, and a database loader that enables model training on databases. The 1.4.0 preview added ML.NET scoring on ARM processors and deep neural network training with GPUs on Windows and Linux.[16]
Microsoft's paper on machine learning with ML.NET demonstrated that it can train sentiment analysis models on large datasets while achieving high accuracy. Its results showed 95% accuracy on Amazon's 9 GB review dataset.[17]
The ML.NET CLI is a command-line interface that uses ML.NET AutoML to perform model training and pick the best algorithm for the data. The ML.NET Model Builder preview[18] is an extension for Visual Studio that uses the ML.NET CLI and ML.NET AutoML to output the best ML.NET model through a GUI.[14]
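The core idea of picking the best algorithm for the data can be illustrated with a small Python sketch: score each candidate on held-out validation data and keep the highest scorer. This is a simplified illustration only and does not reflect how ML.NET AutoML is implemented, which also trains and tunes the candidates; the names `accuracy`, `select_best`, and the two toy candidates are hypothetical.

```python
def accuracy(predict, examples):
    """Fraction of (features, label) pairs that `predict` classifies correctly."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def select_best(candidates, validation):
    """Score every candidate on the validation set and return the best name and all scores."""
    scores = {name: accuracy(fn, validation) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical candidates: already-"trained" classifiers over a single number.
candidates = {
    "always_positive": lambda x: 1,
    "threshold_at_zero": lambda x: 1 if x > 0 else 0,
}

# Hypothetical validation data: (feature, label) pairs.
validation = [(-2.0, 0), (-0.5, 0), (0.3, 1), (1.7, 1)]

best_name, scores = select_best(candidates, validation)
```

Scoring on data that was not used for training is what keeps the comparison from simply rewarding the candidate that memorized its training set.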
AI fairness and explainability have been areas of debate among AI ethicists in recent years.[19] A major issue for machine learning applications is the black box effect, where end users and developers of an application are unsure how an algorithm came to a decision or whether the dataset contains bias.[20] Build 0.8 included model explainability APIs that had been used internally at Microsoft. It added the capability to understand the feature importance of models with the addition of 'Overall Feature Importance' and 'Generalized Additive Models'.[21]
When several variables contribute to the overall score, it is possible to see a breakdown of each variable and which features had the most impact on the final score. The official documentation demonstrates that the scoring metrics can be output for debugging purposes. During training and debugging of a model, developers can preview and inspect live filtered data using the Visual Studio DataView tools.[22]
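One common, model-agnostic way to estimate overall feature importance is permutation importance: shuffle one feature column at a time and measure how much the model's error grows. The Python sketch below illustrates that technique under stated assumptions; it is not ML.NET's explainability API, and the function names, the toy model, and the data are all hypothetical.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model predictions and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Return, per feature, the mean increase in error when that column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: depends on feature 0 and ignores feature 1."""
    return 3.0 * row[0]


# Hypothetical data set with two features per row.
rows = [[float(i), float((i * 7) % 5)] for i in range(6)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
```

Because the toy model ignores the second feature, shuffling that column leaves the error unchanged and its importance is zero, while shuffling the first feature raises the error.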
Microsoft Research announced that Infer.NET, a popular model-based machine learning framework used for research in academic institutions since 2008, has been released as open source and is now part of the ML.NET framework.[23] The Infer.NET framework uses probabilistic programming to describe probabilistic models, which has the added advantage of interpretability. The Infer.NET namespace has since been changed to Microsoft.ML.Probabilistic, consistent with ML.NET namespaces.[24]
Microsoft acknowledged that the Python programming language is popular with data scientists, so it introduced NimbusML, the experimental Python bindings for ML.NET. This enables users to train and use machine learning models in Python. Like Infer.NET, it was made open source.[12]
ML.NET allows users to export trained models to the Open Neural Network Exchange (ONNX) format.[25] This makes it possible to use the models in environments that do not use ML.NET. For example, such models could be run on the client side of a browser using ONNX.js, a JavaScript client-side framework for deep learning models created in the ONNX format.[citation needed]
Along with the rollout of the ML.NET preview, Microsoft rolled out free AI tutorials and courses to help developers understand techniques needed to work with the framework.[26][27][28]
Even though the ML.NET library is new, its origins go back many years. Shortly after the introduction of the Microsoft .NET Framework in 2002, Microsoft Research began a project called TMSN ("text mining search and navigation") to enable software developers to include ML code in Microsoft products and technologies. The project was very successful, and over the years grew in size and usage internally at Microsoft. Somewhere around 2011 the library was renamed to TLC ("the learning code"). TLC is widely used within Microsoft and is currently in version 3.10. The ML.NET library is a descendant of TLC, with Microsoft-specific features removed. I've used both libraries and, in many ways, the ML.NET child has surpassed its parent.