Text recognition, also known as optical character recognition (OCR), is supported by a set of Windows AI APIs that can detect and extract text within images and convert it into machine-readable character streams.
These APIs can identify characters, words, lines, and polygonal text boundaries, and provide confidence levels for each match. They are supported exclusively on devices with a neural processing unit (NPU), where they run with hardware acceleration, making them faster and more accurate than the legacy Windows.Media.Ocr.OcrEngine APIs in the Windows platform SDK.
For API details, see API ref for Text Recognition (OCR).
Use AI Text Recognition features to identify and recognize text in an image. You can also get the text boundaries and confidence scores for the recognized text.
Note
Characters that are illegible or small in size can generate inaccurate results.
In this WinUI example we call a LoadImageBufferFromFileAsync function to get an ImageBuffer from an image file.
In the LoadImageBufferFromFileAsync function, we complete the following steps:
1. Create a StorageFile object from the specified file path.
2. Open a read stream on the StorageFile with OpenAsync.
3. Create a BitmapDecoder for the stream.
4. Call GetSoftwareBitmapAsync on the decoder to get a SoftwareBitmap object.
5. Return an ImageBuffer created from the bitmap with CreateBufferAttachedToBitmap.

C#:

    using System.Threading.Tasks;
    using Microsoft.Windows.AI.Imaging;
    using Microsoft.Graphics.Imaging;
    using Windows.Graphics.Imaging;
    using Windows.Storage;
    using Windows.Storage.Streams;

    public async Task<ImageBuffer> LoadImageBufferFromFileAsync(string filePath)
    {
        // Open the image file and decode it into a SoftwareBitmap.
        StorageFile file = await StorageFile.GetFileFromPathAsync(filePath);
        IRandomAccessStream stream = await file.OpenAsync(FileAccessMode.Read);
        BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);
        SoftwareBitmap bitmap = await decoder.GetSoftwareBitmapAsync();

        if (bitmap == null)
        {
            return null;
        }

        // Wrap the decoded bitmap in an ImageBuffer for the recognizer.
        return ImageBuffer.CreateBufferAttachedToBitmap(bitmap);
    }

C++:

    #include <iostream>
    #include <sstream>
    #include <winrt/Microsoft.Windows.AI.Imaging.h>
    #include <winrt/Windows.Graphics.Imaging.h>
    #include <winrt/Microsoft.Graphics.Imaging.h>
    #include <winrt/Microsoft.UI.Xaml.Controls.h>
    #include <winrt/Microsoft.UI.Xaml.Media.h>
    #include <winrt/Microsoft.UI.Xaml.Shapes.h>

    using namespace winrt;
    using namespace Microsoft::UI::Xaml;
    using namespace Microsoft::Windows::AI;
    using namespace Microsoft::Windows::AI::Imaging;
    using namespace winrt::Microsoft::UI::Xaml::Controls;
    using namespace winrt::Microsoft::UI::Xaml::Media;

    winrt::Windows::Foundation::IAsyncOperation<winrt::hstring> MainWindow::RecognizeTextFromSoftwareBitmap(
        Windows::Graphics::Imaging::SoftwareBitmap const& bitmap)
    {
        // get() blocks until the recognizer is available.
        winrt::Microsoft::Windows::AI::Imaging::TextRecognizer textRecognizer =
            EnsureModelIsReady().get();

        Microsoft::Graphics::Imaging::ImageBuffer imageBuffer =
            Microsoft::Graphics::Imaging::ImageBuffer::CreateForSoftwareBitmap(bitmap);
        RecognizedText recognizedText = textRecognizer.RecognizeTextFromImage(imageBuffer);

        // Concatenate the recognized lines, one per row.
        std::wstringstream stringStream;
        for (const auto& line : recognizedText.Lines())
        {
            stringStream << line.Text().c_str() << std::endl;
        }

        co_return winrt::hstring{ stringStream.str() };
    }

The following example shows how to recognize some text in a SoftwareBitmap object as a single string value:
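1. Create a TextRecognizer object through a call to the EnsureModelIsReady function, which also confirms there is a language model present on the system.
2. Pass the bitmap to the RecognizeTextFromSoftwareBitmap function.
3. Create an ImageBuffer for the bitmap.
4. Call RecognizeTextFromImage on the ImageBuffer to get the recognized text.
5. Append the text of each recognized line to a string and return it.
Note
The EnsureModelIsReady function is used to check the readiness state of the text recognition model (and install it if necessary).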
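The string-building part of RecognizeTextFromSoftwareBitmap can be looked at in isolation. The following is a minimal sketch in standalone C++ with no WinRT types; `JoinLines` is a hypothetical helper name, and the `std::vector<std::wstring>` argument is an assumed stand-in for the text of each recognized line.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: joins lines of text into one string, ending each
// line with a newline. The vector stands in for the Text value of each
// recognized line; it is not part of the Windows AI APIs.
std::wstring JoinLines(const std::vector<std::wstring>& lines)
{
    std::wstringstream stream;
    for (const auto& line : lines)
    {
        stream << line << L'\n';
    }
    return stream.str();
}
```

Because every line is terminated, including the last one, callers that display the result may want to trim the trailing newline.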
C#:

    using System;
    using System.Text;
    using System.Threading.Tasks;
    using Microsoft.Windows.AI.Imaging;
    using Microsoft.Windows.AI;
    using Microsoft.Graphics.Imaging;
    using Windows.Graphics.Imaging;
    using Windows.Storage;
    using Windows.Storage.Streams;

    public async Task<string> RecognizeTextFromSoftwareBitmap(SoftwareBitmap bitmap)
    {
        TextRecognizer textRecognizer = await EnsureModelIsReady();
        ImageBuffer imageBuffer = ImageBuffer.CreateBufferAttachedToBitmap(bitmap);
        RecognizedText recognizedText = textRecognizer.RecognizeTextFromImage(imageBuffer);

        // Concatenate the recognized lines, one per row.
        StringBuilder stringBuilder = new StringBuilder();
        foreach (var line in recognizedText.Lines)
        {
            stringBuilder.AppendLine(line.Text);
        }

        return stringBuilder.ToString();
    }

    public async Task<TextRecognizer> EnsureModelIsReady()
    {
        // Install the model first if it is not yet ready on this device.
        if (TextRecognizer.GetReadyState() == AIFeatureReadyState.NotReady)
        {
            var loadResult = await TextRecognizer.EnsureReadyAsync();
            if (loadResult.Status != AIFeatureReadyResultState.Success)
            {
                throw new Exception(loadResult.ExtendedError.Message);
            }
        }

        return await TextRecognizer.CreateAsync();
    }

C++:

    winrt::Windows::Foundation::IAsyncOperation<winrt::Microsoft::Windows::AI::Imaging::TextRecognizer>
    MainWindow::EnsureModelIsReady()
    {
        // Install the model first if it is not yet ready on this device.
        if (winrt::Microsoft::Windows::AI::Imaging::TextRecognizer::GetReadyState() ==
            AIFeatureReadyState::NotReady)
        {
            // get() blocks until the installation attempt completes.
            auto loadResult = TextRecognizer::EnsureReadyAsync().get();
            if (loadResult.Status() != AIFeatureReadyResultState::Success)
            {
                throw winrt::hresult_error(loadResult.ExtendedError());
            }
        }

        return winrt::Microsoft::Windows::AI::Imaging::TextRecognizer::CreateAsync();
    }

Here we show how to visualize the BoundingBox of each word in a SoftwareBitmap object as a collection of color-coded polygons on a Grid element.
Note
For this example we assume a TextRecognizer object has already been created and passed in to the function.
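The visualization in this section colors each word's outline by its confidence: red below 0.33, yellow below 0.67, and green otherwise. As a sketch, that rule can be written as a small standalone C++ function that is easy to check on its own. `ConfidenceBand`, `ClassifyConfidence`, and `BandColorName` are hypothetical names introduced here for illustration, not part of the Windows AI APIs, and the confidence is assumed to be a value between 0 and 1.

```cpp
#include <string>

// Hypothetical stand-in for the three stroke colors used by the example.
enum class ConfidenceBand { Low, Medium, High };

// Maps a confidence value to a band using the 0.33 and 0.67 cut-offs
// from the visualization example. Values exactly on a cut-off fall
// into the higher band because the comparisons are strict.
constexpr ConfidenceBand ClassifyConfidence(float confidence)
{
    if (confidence < 0.33f) return ConfidenceBand::Low;
    if (confidence < 0.67f) return ConfidenceBand::Medium;
    return ConfidenceBand::High;
}

// Returns the color name the example associates with each band.
inline std::string BandColorName(ConfidenceBand band)
{
    switch (band)
    {
    case ConfidenceBand::Low:    return "red";
    case ConfidenceBand::Medium: return "yellow";
    case ConfidenceBand::High:   return "green";
    }
    return "green"; // Unreachable for valid enum values.
}
```

Keeping the thresholds in one function means the cut-offs can be adjusted or tested without touching any drawing code.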
C#:

    using Microsoft.UI.Xaml.Controls;
    using Microsoft.UI.Xaml.Media;
    using Microsoft.UI.Xaml.Shapes;
    using Microsoft.Windows.AI.Imaging;
    using Microsoft.Graphics.Imaging;
    using Windows.Graphics.Imaging;
    using Windows.Storage;
    using Windows.Storage.Streams;

    public void VisualizeWordBoundariesOnGrid(
        SoftwareBitmap bitmap,
        Grid grid,
        TextRecognizer textRecognizer)
    {
        ImageBuffer imageBuffer = ImageBuffer.CreateBufferAttachedToBitmap(bitmap);
        RecognizedText result = textRecognizer.RecognizeTextFromImage(imageBuffer);

        SolidColorBrush greenBrush = new SolidColorBrush(Microsoft.UI.Colors.Green);
        SolidColorBrush yellowBrush = new SolidColorBrush(Microsoft.UI.Colors.Yellow);
        SolidColorBrush redBrush = new SolidColorBrush(Microsoft.UI.Colors.Red);

        foreach (var line in result.Lines)
        {
            foreach (var word in line.Words)
            {
                // Build a polygon from the four corners of the word's bounding box.
                PointCollection points = new PointCollection();
                var bounds = word.BoundingBox;
                points.Add(bounds.TopLeft);
                points.Add(bounds.TopRight);
                points.Add(bounds.BottomRight);
                points.Add(bounds.BottomLeft);

                Polygon polygon = new Polygon();
                polygon.Points = points;
                polygon.StrokeThickness = 2;

                // Color-code the outline by confidence: red, yellow, or green.
                if (word.MatchConfidence < 0.33)
                {
                    polygon.Stroke = redBrush;
                }
                else if (word.MatchConfidence < 0.67)
                {
                    polygon.Stroke = yellowBrush;
                }
                else
                {
                    polygon.Stroke = greenBrush;
                }

                grid.Children.Add(polygon);
            }
        }
    }

C++:

    void MainWindow::VisualizeWordBoundariesOnGrid(
        Windows::Graphics::Imaging::SoftwareBitmap const& bitmap,
        Grid const& grid,
        TextRecognizer const& textRecognizer)
    {
        Microsoft::Graphics::Imaging::ImageBuffer imageBuffer =
            Microsoft::Graphics::Imaging::ImageBuffer::CreateForSoftwareBitmap(bitmap);
        RecognizedText result = textRecognizer.RecognizeTextFromImage(imageBuffer);

        auto greenBrush = SolidColorBrush(winrt::Microsoft::UI::Colors::Green());
        auto yellowBrush = SolidColorBrush(winrt::Microsoft::UI::Colors::Yellow());
        auto redBrush = SolidColorBrush(winrt::Microsoft::UI::Colors::Red());

        for (const auto& line : result.Lines())
        {
            for (const auto& word : line.Words())
            {
                // Build a polygon from the four corners of the word's bounding box.
                PointCollection points;
                const auto& bounds = word.BoundingBox();
                points.Append(bounds.TopLeft);
                points.Append(bounds.TopRight);
                points.Append(bounds.BottomRight);
                points.Append(bounds.BottomLeft);

                winrt::Microsoft::UI::Xaml::Shapes::Polygon polygon{};
                polygon.Points(points);
                polygon.StrokeThickness(2);

                // Color-code the outline by confidence: red, yellow, or green.
                if (word.MatchConfidence() < 0.33)
                {
                    polygon.Stroke(redBrush);
                }
                else if (word.MatchConfidence() < 0.67)
                {
                    polygon.Stroke(yellowBrush);
                }
                else
                {
                    polygon.Stroke(greenBrush);
                }

                grid.Children().Append(polygon);
            }
        }
    }

We have taken a combination of steps to ensure these imaging APIs are trustworthy, secure, and built responsibly. We recommend reviewing the best practices described in Responsible Generative AI Development on Windows when implementing AI features in your app.