MurageKabui/AutoIT-OCRSpace-UDF

An AutoIt 3 wrapper library around the OCRSpace API.

License

GPL-3.0 license

This UDF provides text capturing support for applications and controls using the OCRSpace API, an online OCR service that converts images of text documents into editable files by using Optical Character Recognition (OCR).

Setting up

Assuming the UDF folder is present in the working directory, include it in your script with the directive:

#include "OCRSpaceUDF\_OCRSpace.au3"

Initialize your preferences beforehand with the function _OCRSpace_SetUpOCR.
At a minimum, you are required to set your API key here.

$a_ocr = _OCRSpace_SetUpOCR("0123456789abcdefABCDEF", 1, False, True, "eng")

Pass the array "handle" returned by the function _OCRSpace_SetUpOCR as the first parameter of
the function _OCRSpace_ImageGetText, along with the rest of the optional or required parameters.

$sText_Detected = _OCRSpace_ImageGetText($a_ocr, @ScriptDir & "\receipt.jpg", 0)

Full script:

#include-once
#include "OCRSpaceUDF\_OCRSpace.au3"

; Get your free key at http://eepurl.com/bOLOcf
$api_key = "0123456789abcdefABCDEF"

; Set up some preferences beforehand.
$a_ocr = _OCRSpace_SetUpOCR($api_key, 1, False, True, "eng", True, Default, Default, Default)

; Scan a receipt (using a local image or remote URL reference).
$sText_Detected = _OCRSpace_ImageGetText($a_ocr, "https://i.imgur.com/eCuYtDe.png", 0)

; Display the result.
ConsoleWrite( _
        " Detected text   :" & $sText_Detected & @CRLF & _
        " Error Returned  :" & @error & @CRLF)
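The scripts in this README print @error next to the detected text, and the sample output further down shows 0 for a successful scan. A minimal sketch of branching on that value follows. It assumes that any non-zero @error signals a failed request or failed parsing, which is the usual AutoIt convention but is not spelled out here; the variable $iError is introduced only for illustration.

```autoit
#include "OCRSpaceUDF\_OCRSpace.au3"

; Placeholder key, as in the examples above.
Local $a_ocr = _OCRSpace_SetUpOCR("0123456789abcdefABCDEF", 1, False, True, "eng")

Local $sText_Detected = _OCRSpace_ImageGetText($a_ocr, @ScriptDir & "\receipt.jpg", 0)
Local $iError = @error ; copy @error right away, the next function call resets it

If $iError Then
    ; Assumption: a non-zero @error means the scan did not succeed.
    ConsoleWrite("OCR failed, @error = " & $iError & @CRLF)
Else
    ConsoleWrite("Detected text: " & $sText_Detected & @CRLF)
EndIf
```

Copying @error into a variable first matters because AutoIt resets it whenever another function is called.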

Functions & Syntax

_OCRSpace_SetUpOCR($s_APIKey, $i_OCREngineID = 1, $b_IsTable = False, $b_DetectOrientation = True, $s_LanguageISO = "eng", $b_IsOverlayRequired = False, $b_AutoScaleImage = False, $b_IsSearchablePdfHideTextLayer = False, $b_IsCreateSearchablePdf = False)

_OCRSpace_ImageGetText($aOCR_OptionsHandle, $sImage_UrlOrFQPN, $iReturnType = 0, $sURLVar = "__OCRSPACE_SEARCHABLE_PDFLINK")

Scanning a PDF

A typical use is extracting text from scanned papers (e.g. invoices, receipts, etc.). This example scans the PDF
file Human_Genome_Project.pdf.

#include "OCRSpaceUDF\_OCRSpace.au3"

; Get your free key at http://eepurl.com/bOLOcf
$api_key = "0123456789abcdefABCDEF"

; Set up some preferences beforehand.
$a_ocr = _OCRSpace_SetUpOCR($api_key, 1, False, True, "eng", True, Default, Default, Default)

; Scan a PDF file (using a PDF URL reference).
$s_textdetected = _OCRSpace_ImageGetText($a_ocr, "https://www.lkouniv.ac.in/site/writereaddata/siteContent/202003271601129023vibha_Human_Genome_Project.pdf", 0)

; Display the result.
ConsoleWrite( _
        " Detected text   :" & $s_textdetected & @CRLF & _
        " Error Returned  :" & @error & @CRLF)

Output:

 Detected text       : Human Genome ProiectIntroductionThe Human Genome Project (HGP) is an internationally collaborative Rnture to identiöy and mark allthe locations of every- gene of the human species. The HGP in the United States was started in 1990and was expected to be a fifteen year effort to map the human genome. There have been a number oftechnological advances since 1990 that have accelerated the progress of the project to a completiondate sometime during the year 2003. The US. HGP is composed of the Depaftnent of Energy (DOE)and the National Institute of Health (NIH) uålich hopes to discoRr 50,000 to 100,000 human genesand make them available for biological study There are a number of other countries that areinvolved in the project, including Australia: Brazil, Canada, France, Germany, Japan, and the UnitedKingdom Besides numerous countries involved in the project there is also a number of commercialcompanies that are invoked in sequencing. The collaborative 3 billion dollar price tag will be used tosequence the possible 3 billion DNA base pairs of human DNA_The possibilities from the information that will be obtained from the project are virtually endless. Itwill most likely change many biological and medical research techniques and many of the practicesused by our medical professionals today. The knowledge that will be obtained will help lead to newways of diagnosing, treating, and possibly preventing diseases. Through the discovery of the humangenome, the possibilities are endless for agriculture, health semices, and new energy sources also. Theend result of the HGP will be information about the structure, and organization of DNA_ aswe know it today.Since the beginning of time, people have yearned to explore the un_known, chaff where they harbeen, and contemplate uhat they har found. The maps we make of these treks enable the nextexplorers to push ever farther the boundaries of our knowledge - about the the sea, the sky, andindeed, ourselves. 
On a new quest to chafi the innermost reaches of the human cell, scientists havenow set out on biology's most important mappmg expedition the Human Genome Project Its mission...
 Error Returned      : 0

Request a searchable PDF

A searchable PDF can be requested and its URL retrieved as follows:

Set the last option of _OCRSpace_SetUpOCR to True:

$a_ocr = _OCRSpace_SetUpOCR($api_key, 1, False, True, "eng", True, Default, Default, True)

Set the name of the variable you want the URL assigned to, in this case MyPDF_URL_:

; Scan a receipt (using an image URI reference). The URL of the requested searchable PDF will be assigned to 'MyPDF_URL_'.
$s_textdetected = _OCRSpace_ImageGetText($a_ocr, "https://i.imgur.com/eCuYtDe.png", 0, "MyPDF_URL_")

Display the results, evaluating the string that names the searchable PDF URL variable:

ConsoleWrite( _
        " Detected text        :" & $s_textdetected & @CRLF & _
        " Error Returned       :" & @error & @CRLF & _
        " Searchable PDF  Link :" & Eval("MyPDF_URL_") & @CRLF)
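Eval on a name that was never assigned fails, so it can help to check the name first. The sketch below uses AutoIt's built-in IsDeclared for that. It assumes the UDF leaves MyPDF_URL_ undeclared when no link is produced, which this README does not confirm; if the UDF instead assigns an empty string, compare the evaluated value against "" as well.

```autoit
#include "OCRSpaceUDF\_OCRSpace.au3"

; Placeholder key; the last option requests a searchable PDF.
Local $a_ocr = _OCRSpace_SetUpOCR("0123456789abcdefABCDEF", 1, False, True, "eng", True, Default, Default, True)
Local $s_textdetected = _OCRSpace_ImageGetText($a_ocr, "https://i.imgur.com/eCuYtDe.png", 0, "MyPDF_URL_")

; IsDeclared returns a non-zero value when a variable with that name exists.
If IsDeclared("MyPDF_URL_") Then
    ConsoleWrite("Searchable PDF link: " & Eval("MyPDF_URL_") & @CRLF)
Else
    ConsoleWrite("No searchable PDF link was assigned." & @CRLF)
EndIf
```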

Advanced Usage

By default, the detected text is returned as a plain string, i.e. when $iReturnType of the
function _OCRSpace_ImageGetText is set to 0.
Setting $iReturnType to 1 returns a 2D array containing the coordinates of the bounding
boxes for each word detected, in the format: #WordDetected, #Left, #Top, #Height, #Width

Example with a URL reference: https://i.imgur.com/eCuYtDe.png

lorem_ipsum.png

Result with ArrayDisplay()

; Iterating the array.
For $i = 0 To UBound($aText_Detected, 1) - 1
    ConsoleWrite( _
            "Word (" & $aText_Detected[$i][0] & ")  Left (" & $aText_Detected[$i][1] & ")" & " Top (" & $aText_Detected[$i][2] & ") Height (" & $aText_Detected[$i][3] & ") Width (" & $aText_Detected[$i][4] & ")" & @CRLF)
Next

Output:

Word (Lorem)  Left (14) Top (17) Height (10) Width (42)
Word (ipsum)  Left (66) Top (16) Height (12) Width (43)
Word (dolor)  Left (119) Top (15) Height (12) Width (42)
Word (sit)  Left (171) Top (15) Height (12) Width (25)
Word (amet,)  Left (206) Top (16) Height (12) Width (40)
Word (consectetur)  Left (259) Top (15) Height (12) Width (95)
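Since each row carries left, top, height and width, the right and bottom edges of a word's bounding box can be derived by simple addition. The sketch below does this for a hand-built array holding the first two rows of the output above, so that it runs without an API call. In a real script the array would come from _OCRSpace_ImageGetText with $iReturnType set to 1, and $iRight and $iBottom are illustrative names.

```autoit
; Sample rows copied from the output above: word, left, top, height, width.
Local $aText_Detected[2][5] = [["Lorem", 14, 17, 10, 42], ["ipsum", 66, 16, 12, 43]]

For $i = 0 To UBound($aText_Detected, 1) - 1
    Local $iRight = $aText_Detected[$i][1] + $aText_Detected[$i][4]  ; left + width
    Local $iBottom = $aText_Detected[$i][2] + $aText_Detected[$i][3] ; top + height
    ConsoleWrite($aText_Detected[$i][0] & ": right edge " & $iRight & ", bottom edge " & $iBottom & @CRLF)
Next
```

For the two sample rows this prints a right edge of 56 and a bottom edge of 27 for "Lorem", and 109 and 28 for "ipsum".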

Other Stuff

Check the API performance and uptime at the API status page.
Register for your free OCR API key at http://eepurl.com/bOLOcf
Subscribe to a PRO plan.
If you want to try all the available features of the OCR API, check out their full documentation.

Plot: https://github.com/users/KabueMurage/projects/7

Other credits

Thanks to AspirinJunkie for the JSON UDF

Legal

Use this however you want, all at your own risk. This code is in no way affiliated with, authorized, maintained, sponsored or endorsed by OCRSpace and/or AutoIt or any of their affiliates or subsidiaries. This is independent and unofficial.

Releases

v1.4.0 (latest), Aug 8, 2022, plus 3 earlier releases
