CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to and the full benefit of U.S. Provisional Patent Application Ser. No. 62/032,543, filed Aug. 2, 2014, and titled “METHODS AND APPARATUS FOR BOUNDED IMAGE DATA ANALYSIS AND NOTIFICATION MECHANISM”, the entire contents of which are incorporated herein by reference.
FIELD OF INVENTION

This invention relates to methods and apparatus for processing an image captured via an image capture device. In some embodiments, the image capture device may be a mobile device, and a processor in the mobile device may perform a preliminary scan and notify the user of the scan result. More particularly, the present invention relates to performing a preliminary bounded image data analysis on a document prior to transmission of the document to a server for continued remote processing, wherein the server may perform a complete characterization of the document and/or store the imaging results within a virtual cloud based image storage.
BACKGROUND

With the increasing use of smart phones equipped with cameras and integrated email and text communication platforms, people have the widespread convenience of electronically sending a picture or document from almost anywhere. However, in many cases a captured image of a document may not be legible when reprinted or processed at the message's destination. A sender may not realize that the document is not legible until it is too late to easily resend. Traditional methods of optical character recognition (OCR) of an entire document before sending are often impractical due to the significant processing power required to locally OCR each document.
In many existing applications, the necessary information is generally minimal and presented in easy-to-read fonts, wherein the image may be small and only specific text may be necessary to complete the action. Current applications are generally not targeted to process large, complex documents.
What is needed, therefore, is an apparatus and associated methods for timely, conveniently, and efficiently capturing an image and notifying a user immediately whether the document can be properly analyzed before electronically transmitting the image or document for remote processing.
SUMMARY

Accordingly, the present invention provides apparatus and methods for a preliminary bounded image data analysis and notification mechanism on an electronic device with image capture capabilities.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate several embodiments of the invention and, together with the description, serve to explain the principles of the invention. Other features, objects, and advantages of the invention will be apparent from the description, drawings, and the claims herein.
A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions. One general aspect includes an image feature processing apparatus including: an image capture device for capturing a set of image data; a preliminary image recognition mechanism operative to identify and analyze potential characteristics included in the set of captured image data, where preliminary image recognition includes automated identification of a feature based upon at least partial recognition of the feature; and a notification mechanism configured to notify a user of a result of the preliminary image recognition as compared to predefined parameters. A notification may include no artificial delay in order to provide a prompt process. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
Implementations may include one or more of the following features. One feature may include an example with an image feature processing apparatus where a preliminary image recognition mechanism provides optical character recognition identifying potential text characters from the image data, where preliminary optical character recognition includes one or both of: partial character recognition of potential text characters and character recognition of a portion of at least one text character. The example may also include a notification mechanism which may notify a user of a preliminary optical character recognition based on predefined parameters such as, but not limited to, minimum font size, percent success of optical character recognition (OCR) attempts, or total time to attempt an OCR, as specified by the user for a document type captured in the image data. Other examples may include an image feature processing apparatus which may be capable of connecting to a cloud storage and processing server. The image feature processing apparatus may include a mobile device. The image feature processing apparatus may include a digital camera. The image feature processing apparatus may also be configured to identify potential text characters from a document image and analyze the potential text characters with optical character recognition to provide a degree of character recognition.
There may be methods which also include transmitting the captured document image to a cloud server for storage of the document image on the cloud server. In some methods a transmission may be based on a result of analyzing the transmitted captured document image where the result includes a success result based upon recognition of one or both of a partial character recognition or a recognition of a portion of at least one text character. The method may further include steps of: transmitting the document image to an external server, where the external server is configured to identify the potential text characters from the document image and analyze the potential text characters with optical character recognition. The method may further include steps of: categorizing the document image based on predefined content criteria of the recognized text. The method may further include steps of: prompting a user to accept or reject the document image. The method may further include steps of prompting the user to input authorization information, where the authorization information is capable of granting access to a secure data storage system.
Some methods may have exemplary steps where a notification is performed and the notifying includes at least one of an audible notification, a visual notification, or a tactile notification. Methods may further include steps of: identifying borders of the document image. The method may also include recognizing an orientation of the at least one potential text character. In some methods, the notification of a fail may include prompting the user to select or capture an alternate document image. Implementations of the described techniques may include hardware, a method or process, or computer software on a computer-accessible medium.
One general aspect may include methods of preliminary optical character recognition and notification, the methods may include steps of: identifying at least one potential text character from a captured document image, where an electronic device with image capture capabilities is configured to store the document image. The methods may include analyzing the at least one potential text character with preliminary optical character recognition, where the analyzing creates results based on an ability to recognize at least one text character, and where the preliminary optical character recognition includes one or both of a partial character recognition and a character recognition of a portion of the at least one text character. Some methods may include communicating the results to a notification device configured to notify a user of the results based on predefined parameters of a fail result or a success result. The method of notifying a user may include at least one of an audible notification, a visual notification, or a tactile notification. Other aspects and examples of these methods may include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A illustrates exemplary embodiments of engaging a Bounded Image Data Analysis (BIDA) on a local system, wherein the local system comprises at least an image-capturing mechanism and a processor, for an exemplary portable computing device.
FIG. 1B illustrates exemplary embodiments of engaging a Bounded Image Data Analysis (BIDA) on a local system, wherein the local system comprises at least an image-capturing mechanism and a processor, for an exemplary portable computing device.
FIG. 2A illustrates exemplary embodiments of the BIDA capture system utilizing the image capturing mechanism on a local system.
FIG. 2B illustrates exemplary embodiments of the BIDA capture system utilizing the image capturing mechanism on a local system.
FIG. 3A illustrates an exemplary step for utilizing a local system with BIDA capabilities and uploading a captured document.
FIG. 3B illustrates an exemplary step for utilizing a local system with BIDA capabilities and uploading a captured document.
FIG. 3C illustrates an exemplary step for utilizing a local system with BIDA capabilities and uploading a captured document.
FIG. 3D illustrates an exemplary step for utilizing a local system with BIDA capabilities and uploading a captured document.
FIG. 3E illustrates an exemplary step for utilizing a local system with BIDA capabilities and uploading a captured document.
FIG. 3F illustrates an exemplary step for utilizing a local system with BIDA capabilities and uploading a captured document.
FIG. 4A illustrates an exemplary step for developing a document image with recognizable specified features through use of overlap differential.
FIG. 4B illustrates an exemplary step for developing a document image with recognizable specified features through use of overlap differential.
FIG. 4C illustrates an exemplary step for developing a document image with recognizable specified features through use of overlap differential.
FIG. 4D illustrates an exemplary step for developing a document image with recognizable specified features through use of overlap differential.
FIG. 4E illustrates an exemplary step for developing a document image with recognizable specified features through use of overlap differential.
FIG. 5A illustrates an exemplary step for manually selecting an area of focus utilizing a BIDA image capture system on a mobile device with image capturing capabilities.
FIG. 5B illustrates an exemplary step for manually selecting an area of focus utilizing a BIDA image capture system on a mobile device with image capturing capabilities.
FIG. 5C illustrates an exemplary step for manually selecting an area of focus utilizing a BIDA image capture system on a mobile device with image capturing capabilities.
FIG. 5D illustrates an exemplary step for manually selecting an area of focus utilizing a BIDA image capture system on a mobile device with image capturing capabilities.
FIG. 5E illustrates an exemplary step for manually selecting an area of focus utilizing a BIDA image capture system on a mobile device with image capturing capabilities.
FIG. 6 illustrates exemplary method steps for utilizing a BIDA image capture system on an electronic device with image capturing and network connection capabilities.
FIG. 7A illustrates exemplary method steps for performing a BIDA image capture process.
FIG. 7B illustrates exemplary method steps for performing a BIDA image capture process.
FIG. 8 illustrates additional aspects of apparatus that may be used in some implementations of the present invention.
FIG. 9 illustrates a block diagram of a mobile device that may be used in some implementations of the present invention.
DETAILED DESCRIPTION

The present invention relates generally to a preliminary image processing system that analyzes a set of image data and characterizes the quality of the set of image data. In some preferred embodiments, the preliminary image processing system is incorporated into a mobile device, such as a smart phone, includes image capture hardware and software, and is further capable of communicating via a distributed communications network to a storage device, such as, for example, a cloud document type storage apparatus. According to the present invention, a set of image data is captured and processed to determine a threshold of image quality (TOIQ) prior to transmission of the set of image data to the storage apparatus. A threshold of image quality may include, for example, user defined parameters for a percent of success of optical character recognition, object image recognition, a potential text character threshold, biometric recognition, or other standard.
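By way of a non-limiting illustration, the threshold-of-image-quality determination described above may be sketched as follows. The parameter names, thresholds, and the `ToiqParameters` structure are illustrative assumptions only and are not fixed by this disclosure:

```python
from dataclasses import dataclass


@dataclass
class ToiqParameters:
    # Illustrative user-defined thresholds; the names and default
    # values are assumptions, not part of the specification.
    min_ocr_success_pct: float = 80.0  # percent of characters recognized
    min_font_px: int = 10              # smallest acceptable glyph height


def meets_toiq(recognized: int, attempted: int, smallest_glyph_px: int,
               params: ToiqParameters) -> bool:
    """Return True when the captured image data meets the threshold of
    image quality (TOIQ) and may be transmitted for remote processing."""
    if attempted == 0:
        return False  # nothing to recognize: fail the preliminary check
    success_pct = 100.0 * recognized / attempted
    return (success_pct >= params.min_ocr_success_pct
            and smallest_glyph_px >= params.min_font_px)
```

In use, the local system would compute `recognized` and `attempted` from its preliminary OCR pass and withhold transmission when `meets_toiq` returns False.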
Glossary

As used herein, the following terms will have the following associated meanings:
- Bounded Image Data Analysis (BIDA): as used herein refers to an analysis of a captured image based on a predefined set of criteria, wherein a local system processes the analysis. The system-defined analysis comprises a preliminary feature quality analysis that may assess whether feature recognition may be possible with an acceptable threshold of image quality during continued remote processing.
- Feature Recognition: as used herein refers to a capability to isolate and identify specified features on an image, wherein the specified feature may depend on a type or category of image data.
- Examples of Feature Recognition may include: physical object recognition, alphanumeric text recognition, barcode recognition, hash code recognition, biometric recognition, or other input recognition that may be captured as a two dimensional (2D) image or a three dimensional (3D) image derived from one of many known techniques of imaging.
- Specified Feature: as used herein refers to a component of an image that when processed may identify and categorize an image. For example, for a contract document, the specified feature may comprise alphanumeric characters and symbols. For a photograph, the specified features may comprise objects, including, for example, an article, person, device, structure, or combinations thereof. For some images, such as a driver's license, the specified features may comprise a combination of characters and objects utilized in facial recognition.
- Local System: as used herein refers to a device or system of devices that may process the BIDA without requiring logical communication with a network.
- Overlap Differential: as used herein refers to an image created through an amalgamation of multiple captured images of the same document, wherein feature recognition may not be possible on an isolated captured image but may be possible on the amalgamated image.
- Authenticated: as used herein refers to a state where the image capture and remote access rights have been verified by a predefined verification mechanism, including, for example, time stamp, user login, unique ID of the image capture device, such as an IMEI, voice recognition, geolocation, fingerprint, or face recognition.
- Continued Remote Process: as used herein refers to a process separate from the BIDA, wherein the separate process occurs after receipt of an image from the local system over a network.
- Field of View: as used herein refers to the viewable boundaries of an image capturing device.
- Area of Focus: as used herein refers to a subset of boundaries within the field of view where the image capturing device may be capable of focusing text or images within the subset.
- Document Image: as used herein refers to a captured view of a document, wherein the captured view may comprise at least one specified feature.
- BIDA Image Capturing System: as used herein refers to an image capturing device component of the local system, which may be configured to capture images that may be compatible with the BIDA. In some embodiments, the BIDA image capturing system may automatically frame and focus the image capturing mechanism of the electronic device. Alternatively, the BIDA image capture system may comprise a manual mode, wherein a user may control the framing and focusing of the image capturing mechanism.
Referring now to FIGS. 1A-1B, exemplary embodiments illustrate some implementations of the present invention for engaging a BIDA on a local system, wherein the local system comprises at least an image-capturing mechanism and a processor on a local system 100, which may be an exemplary portable computing device, such as, for example, a mobile smart phone. FIGS. 1A and 1B illustrate an exemplary embodiment of the BIDA capture system where the system may access and review images stored within a photograph album or other such file storage on the local system 100. FIG. 1A illustrates an image where the BIDA capture system may recognize specified features such as a line of text 121 on a document image 120. The user interface for file-based BIDA may include such features as an icon 140 to designate the local storage of images captured in file-based BIDA.
The user interface may also provide an ability to review the images in the storage location, such as icons for proceeding through the images in a first direction at 154 or a reverse direction at 155. The image may be chosen at 165 or rejected at 160 as non-limiting examples of a user interface. A particular document image 120 may be represented where specified features 121 may include text that may be located where the text may have been successfully scanned. On a successful detection of character text during BIDA, a notification 130 may be provided to the user, such as, for example, one or more of: a vibration, a tone, a verification number certifying acceptance of the image at a defined quality level, an acceptance icon, or other indicator. A verification may include, for example, a universally unique identification number (UUID) that may be stored on one or both of the mobile device and the cloud document mechanism.
FIG. 1B illustrates a document image 125 where the BIDA capture system may not recognize a specified feature 126. In some embodiments, there may still be a notification of some kind, including one or more of an audible, visual, or tactile indicator. In other embodiments, the absence of a notification may be the indicator that the document image does not comprise recognizable features and does not pass a BIDA capture test, such as meeting a threshold of image quality. In some embodiments, a visual notification 131 may highlight the specified feature 126 that may not be recognizable at a certain distance from the camera, which may allow a user to decide to move the camera orientation and/or adjust lighting before taking a new image capture of the document or to accept the document image despite one or more unrecognizable specified features.
Referring now to FIGS. 2A and 2B, exemplary embodiments of the BIDA capture system utilizing the image capturing mechanism on a local system 200 are illustrated for different configuration setups. FIG. 2A may illustrate a local system 200 where the BIDA capture system may not recognize a specified feature 221 within a document 220. This may be due to various factors, including an inadequate focusing condition or the lack of textual data capable of an efficient OCR in the captured image. For example, as illustrated, the document 220 may be within the field of view 210 of the capture system but may be beyond the area of focus 211. Accordingly, the BIDA capture system may not notify the user of a successful scan, wherein the scan meets defined criteria.
FIG. 2B may illustrate an image capture where the BIDA capture system on the local system 200 may recognize the specified feature 221 on the document 220 and provide a notification 230 to the user, such as one or more of a visual, tactile, or audible indication. For example, the notification 230 may comprise a vibration or audio playing of a .wav file, which may articulate or announce the condition of the captured image. In some examples, an alternate notification may occur if feature recognition fails during a BIDA of a captured image.
In some embodiments, the BIDA capture system may engage the targeted document once the local system 200 captures the image. In some embodiments, the BIDA capture system may perform a continuous scan process when the image capturing device is actively capturing images. In such embodiments, the BIDA capture system may control the capturing mechanism and trigger the capture when text is recognized according to a threshold of image quality, where a preliminary feature quality analysis (PFQA) is successful. In still further embodiments, the BIDA may occur multiple times. For example, the BIDA may occur while the camera is active and then again after the image has been captured. Such combinations may allow the user to consistently capture readable images combined with a secondary confirmation scan. In some embodiments, the image capture device may capture a series of images in quick succession or extract images of a document from a video capture. The specific settings may be based on the image capturing capabilities of the local system and/or user preferences.
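The continuous scan process with a secondary confirmation scan may be sketched as follows. The frame and callback representations are illustrative assumptions, since the disclosure does not prescribe a particular camera API:

```python
from typing import Callable, Iterable, Optional


def continuous_scan(frames: Iterable[object],
                    preliminary_pass: Callable[[object], bool],
                    capture: Callable[[object], object]) -> Optional[object]:
    """Scan live preview frames and trigger capture on the first frame
    whose preliminary analysis (BIDA/PFQA) succeeds, then confirm with a
    second scan of the captured image, per the two-pass behavior above."""
    for frame in frames:
        if preliminary_pass(frame):      # first BIDA, on the live preview
            image = capture(frame)
            if preliminary_pass(image):  # secondary confirmation scan
                return image
    return None                          # no frame met the threshold
```

Here `frames` stands in for the live preview stream and `capture` for the triggered capture mechanism; both would be supplied by the device's camera subsystem.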
Referring now to FIGS. 3A-3F, exemplary steps for utilizing a local system with BIDA capabilities and uploading a captured document are illustrated. FIG. 3A illustrates application icons, including a BIDA capture system icon 370 and a camera icon 375, on a portable computing device 300, wherein a user 380 may select the BIDA capture system icon 370. In some embodiments, at FIG. 3B, the BIDA capture system may prompt a user to select between two image selection icons: a photograph album icon 340 and a camera icon 345. The user 380 may click the camera icon 345, which may initiate the image capturing mechanism on the mobile device, or the photograph album icon 340, which may access document images stored on the portable computing device 300.
At FIG. 3C, the user 380 may select a captured image 320 to send to a secure cloud based image storage facility over a network for continued remote processing. In some exemplary embodiments, the user 380 may cycle through captured images in the folder by clicking directional arrows 354, 355. The user 380 may confirm selection of the captured image 320 by selecting the remote secure cloud based image storage 365 icon. In some embodiments, at FIG. 3D, a user 380 may select the folder within the remote secure cloud based image storage to direct the storage of the captured image 320. In other embodiments, not shown, the BIDA may be capable of limited content analysis, wherein the local system may suggest folders in which to place image files.
At FIG. 3E, the user 380 may be prompted to respond to an authentication step. In some exemplary embodiments, the authentication step may be a prompt to connect a dongle or chip to the portable computing device 300, which may be used to confirm the identity of the user 380. In alternate embodiments, the image capturing mechanism may flip and capture a facial image of the user 380. In still further embodiments, the authentication step may comprise voice recognition, and the portable computing device 300 may be programmed to recognize specific words spoken by the user 380. The authentication step may comprise a fingerprint recognition or other biometric as available on some smart mobile devices. At FIG. 3F, the local system may transmit the captured image 320 over a network 390 to a remote secure cloud based image storage 395 for continued remote processing.
Referring now to FIGS. 4A-4E, exemplary steps are illustrated for developing a document image with recognizable specified features through use of overlap differential. At FIG. 4A, a BIDA image capturing device 400 may capture an image of a document 420 comprising three specified feature segments 421, 422, 423. The document 420 may be placed within the field of view 410 of the BIDA image capturing device but out of range of the area of focus 411. This step may allow the local system to recognize the boundaries of the document 420 but may not allow for specified feature 421, 422, 423 recognition.
At steps 4B-4D, the BIDA image capturing device 400 may capture multiple images of the document 420, wherein the recognizable specified features 421, 422, 423 may be different at each step. The images captured at 4B-4D capture focused and recognizable features 421, 422, 423, respectively. At 4E, the local system may process the multiple images utilizing overlap differential, which may develop a single document image 424 comprising completely recognizable features 421, 422, 423. The BIDA image capturing device 400 may present the amalgamated document image 424 and notify 430 the user that the document image 424 comprises recognizable specified features. Embodiments may include notification of recognizable specified features across an entire document or a portion thereof.
In some exemplary embodiments, the images captured inFIGS. 4A-4D may be captured through an automated series of image capture. In some alternate embodiments, the images captured may be extracted from a video segment filming the document. In some embodiments, the user may select multiple images and confirm that the images comprise a single document. In some exemplary embodiments, the BIDA may determine whether multiple images comprise a single document or multiple documents. In some such embodiments, the BIDA may identify the borders of a document and separate multiple documents from a single captured image. This embodiment may be particularly useful when the BIDA may extract the documents from a video segment filming one or more documents.
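The overlap differential amalgamation described above may be sketched as follows. The per-segment, focus-scored representation of a capture is an illustrative assumption, since the disclosure does not fix a data model for feature segments:

```python
def overlap_differential(captures):
    """Amalgamate multiple captures of the same document into a single
    document image.  Each capture is modeled as a dict mapping a
    feature-segment id to a (focus_score, content) pair; for every
    segment, the sharpest version across all captures is retained, so
    that segments unrecognizable in any isolated capture may still be
    recognizable in the amalgamated result."""
    merged = {}
    for capture in captures:
        for segment, (score, content) in capture.items():
            # Keep this segment only if it is sharper than what we have.
            if segment not in merged or score > merged[segment][0]:
                merged[segment] = (score, content)
    return {seg: content for seg, (score, content) in merged.items()}
```

For example, two captures that each focus a different region of the document yield a merged result in which every segment comes from its best-focused capture.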
In BIDA for more complex images, such as three-dimensional object recognition, known techniques, such as simultaneous localization and mapping (SLAM), may be implemented in part within the local system and completed through remote continued processing.
Referring now to FIGS. 5A-5E, exemplary steps for manually selecting an area of focus are illustrated. At 5A, a BIDA image capturing device 500 may capture an image of a document 520 comprising three specified feature segments 521, 522, 523. The document 520 may be placed within the field of view 510 of the BIDA image capturing device but out of range of the area of focus 515. This step may allow the local system to recognize the boundaries of the document 520 but may not allow for specified feature 521, 522, 523 recognition. At 5B, the BIDA image capturing device 500 may present the document image 524 to the user 580. In other embodiments, multiple documents may be located within the field of view 510, and the local system may be capable of identifying multiple borders and isolating the documents.
In some embodiments at 5B, the user 580 may select a particular specified feature 523 deemed significant to the document, wherein the document image may be adequate as long as the selected specified feature may be recognizable. For example, a receipt may comprise the amounts and items of the sale, but it may also comprise an intricate logo for the store. A user may determine that the logo may not be a crucial piece of information for that receipt. Accordingly, the user may specify that the necessary specified features comprise the bottom text 523.
At step 5C, the BIDA image capturing device 500 may capture an image, or be capable of capturing an image, wherein unselected specified features 521 may be recognizable but the selected specified features 522 may be out of the field of view 510. At step 5D, the BIDA image capturing device 500 may capture, or be in a position to capture, a document image, wherein the document image may comprise recognizable specified features 522 that the user 580 may have selected at step 5B. In some embodiments, the feature recognition may trigger a notification 530 from the image capturing device. At step 5E, the document image 525 may be presented to the user 580, who may select the document image 525 for transmission to a secure cloud based image storage for continued remote processing.
Referring now to FIG. 6, an exemplary process 600 for utilizing a BIDA capture system may be illustrated. At 605, a user may activate the BIDA capture system. In some embodiments, the function may occur on a mobile device. In some embodiments, a user may manually activate the BIDA capture system, such as illustrated in FIGS. 1A-1B. In some alternative embodiments, a user may activate the BIDA image capture system indirectly by operating a mobile device comprising the BIDA image capture system. For example, a digital camera may automatically initiate the BIDA image capture system when a user is taking photographs or otherwise capturing image data.
Steps 610, 620 illustrate method steps for utilizing the BIDA image capture system on existing images stored on the mobile device or on a memory card. At 610, the user may select an image file folder, and at 620 may select an image. Steps 615, 625 illustrate method steps for utilizing the BIDA image capture system with an image capturing function. At 615, the user may select the image capture function, and at 625, the user may capture an image. At 630, the user may wait for a confirmation indication from the BIDA image capture system. As described in reference to FIG. 1A and FIG. 1B, the BIDA image capture system may scan a document before the user captures its image at 625.
At 635, a user may receive no response or a negative response from the BIDA image capture system, which may prompt the user to restart the image selection process at 610, 620 or the image capturing process at 615, 625. Alternatively, at 640, a user may receive a confirmation indication. In some embodiments, a successfully recognized document may be automatically stored without requiring further input from a user. In other embodiments, the user may accept and store the document image at 645 or may reject the document at 650. When a document is stored at 645, the application may communicate with a storage system, such as a server, and a control application on the storage system, at 655, may verify the identification of the user associated with the image capture device or the application communicating the image data. In the process of verifying the image, further processing of the document image data may occur on the storage system for appropriate storage as a fully processed digital document. At 660, the user may exit the application directly or indirectly.
Referring now to FIG. 7A, exemplary method steps of a BIDA image capture process 700A are illustrated. At 701, a BIDA image capture system may capture a document image. In some embodiments, a user may select a previously captured document image from a file folder, such as a photograph album, stored within the electronic device. Some embodiments, such as illustrated in reference to FIGS. 1A and 1B, may allow a user to choose to capture a new document image or select from a previously captured document image.
At 702, the BIDA image capture system may identify the borders of the documents within the view using one or more of many well-known techniques, such as contrast edge detection, and at 703 the BIDA image capture system may identify potential characters. The steps at 702, 703 may provide framing and orientation cues, which may allow for more reliable BIDA capture. As described in reference to FIG. 1A and FIG. 1B, in some embodiments, the BIDA image capture system may control the image capturing mechanism of the electronic device, and such framing and focusing may be automatically performed, without requiring further input from the user. At 704, the BIDA image capture system may analyze at least a portion of the potential characters identified at 703.
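A minimal stand-in for the contrast edge detection at step 702 may be sketched as follows; the pixel-grid representation and fixed contrast threshold are illustrative assumptions, and a production system would likely use a full edge detector:

```python
def find_document_borders(gray, threshold=50):
    """Locate a rough bounding box for a document in a grayscale image
    (a list of rows of 0-255 ints) by marking pixels whose horizontal or
    vertical contrast with a neighbor exceeds `threshold`, then taking
    the extremes of the marked pixels.  Returns (x_min, y_min, x_max,
    y_max), or None if no sufficiently contrasting edge is found."""
    rows, cols = len(gray), len(gray[0])
    edges = []
    for y in range(rows):
        for x in range(cols):
            right = abs(gray[y][x] - gray[y][x + 1]) if x + 1 < cols else 0
            down = abs(gray[y][x] - gray[y + 1][x]) if y + 1 < rows else 0
            if max(right, down) > threshold:
                edges.append((x, y))
    if not edges:
        return None
    xs = [x for x, _ in edges]
    ys = [y for _, y in edges]
    return (min(xs), min(ys), max(xs), max(ys))
```

The resulting box could supply the framing and orientation cues mentioned above before character identification at step 703.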
At 705, the results of the analysis may be transmitted to a notification mechanism, wherein the notification mechanism, for example via an OCR processing algorithm, may discern between a fail result and a success result based on predefined parameters. In some embodiments, a success result may comprise a complete text character recognition of at least a portion of the identified potential text characters, which may comprise a portion of the document image text characters. For example, the BIDA image capture system may identify specific text characters, such as those considered more difficult to recognize or more likely to allow full word recognition.
In other embodiments, a success result may comprise a partial text character recognition of at least a portion of the identified potential text characters, wherein predefined portions of text characters may be recognized. For example, analyzing a select portion of a string of text characters may be sufficient to establish legibility of a document image. In still further embodiments, the predefined parameters may be based on a combination of recognition factors.
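One way the predefined parameters above might discern a success result from a fail result is a simple recognized-fraction threshold. This is a hypothetical sketch; the `preliminary_result` function and its `min_fraction` and `min_chars` parameters are names invented for illustration, not part of the disclosure:

```python
def preliminary_result(candidates, recognized, min_fraction=0.6, min_chars=10):
    """Discern a success result from a fail result using predefined parameters.

    candidates:  number of potential text characters identified in the image
    recognized:  how many of those the preliminary pass could recognize
    Success requires recognizing at least min_fraction of the candidates and
    an absolute minimum of min_chars characters; otherwise the result is fail.
    """
    if candidates == 0 or recognized < min_chars:
        return "fail"
    return "success" if recognized / candidates >= min_fraction else "fail"

print(preliminary_result(candidates=120, recognized=95))  # most characters legible -> success
print(preliminary_result(candidates=120, recognized=30))  # too few recognized -> fail
```

A combination of recognition factors, as the text notes, could be modeled by adding further conditions (e.g., per-word confidence) to the same predicate.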
In some embodiments, a preliminary optical character recognition system may transmit a fail result at 706, and the notification mechanism may optionally notify the user at 707. At 708, a fail result may trigger a prompt to the user to select or capture an alternate document. Alternatively, in some embodiments the preliminary optical character recognition system may transmit a success result at 709, and the notification mechanism may notify the user at 710. As discussed above, a notification may include one or more of audible, visual, or tactile notifiers, such as a vibration. In some embodiments, the notification may only occur with a success result at 710.
Where the preliminary optical character recognition system may transmit a success result at 709, the preliminary optical character recognition system may optionally prompt the user to accept or reject the document at 711, which may confirm that the user intended to select and/or capture the particular document image. At 712, the preliminary optical character recognition system may verify access authorization of the electronic device, and at 713, the preliminary optical character recognition system may optionally prompt the user to provide responses to authentication or security inquiries. Such security measures may be significant where the user intends to transmit the document image to a secure external server storage, such as a cloud based image storage system.
In some embodiments, at 714, a preliminary optical character recognition system may be capable of categorizing the document image. For example, based on a predefined classification system and recognition parameters, the preliminary optical character recognition system may be able to analyze identified characters and identify key words, phrases, logos, patterns of words, proximity of words to each other, or other criteria. The analysis may be used to discern between a purchased item, a page of text, a face, an object, an official government document, a receipt, a deed, a birth certificate, and a tax document. In some embodiments, the preliminary optical character recognition system may prompt the user to confirm or reject the categorization, allowing the user to specify the destination file folder or classification of the document image. At 715, the preliminary optical character recognition system may transmit the document image file and related OCR results to an external server, wherein the external server may be capable of a complete optical character recognition analysis. In some embodiments, such as a mobile phone or tablet, the electronic device may be capable of wirelessly connecting to the external server and transmitting the data. Alternatively, in other embodiments such as with a digital camera, the electronic device may require a hard connection to a secondary device, such as a laptop computer, capable of connecting to a network system.
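A keyword-based categorization of the kind described above could, under one set of assumptions, look like the following sketch. The category names, keyword sets, and `min_hits` threshold are illustrative inventions, not the disclosed classification system:

```python
# Hypothetical category keyword sets; a real system would use a richer
# classifier incorporating logos, layout, and word proximity as well.
CATEGORY_KEYWORDS = {
    "receipt": {"total", "subtotal", "cash", "change", "tax"},
    "tax document": {"irs", "form", "taxable", "withholding"},
    "birth certificate": {"birth", "certificate", "born", "registrar"},
}

def categorize(recognized_words, min_hits=2):
    """Pick the category whose keyword set best matches the recognized words.

    Returns the best-matching category, or None when no category reaches
    min_hits keyword matches (the user may then be prompted to classify
    the image manually).
    """
    words = {w.lower() for w in recognized_words}
    best, best_hits = None, 0
    for category, keywords in CATEGORY_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = category, hits
    return best if best_hits >= min_hits else None

print(categorize(["Subtotal", "Tax", "Total", "Change"]))  # -> receipt
```

Returning None rather than a forced guess matches the confirm-or-reject flow the text describes: an unclassified image is handed back to the user.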
Referring now to FIG. 7B, similar to the above example relating to an object containing text, exemplary method steps of a BIDA object image capture process 700B are illustrated. At 721, a BIDA image capture system may capture an object image. In some embodiments, a user may select a previously captured object image from a file folder, such as a photograph album, stored within the electronic device. Some embodiments, such as illustrated in FIG. 1A and FIG. 1B, may allow a user to choose to capture a new object image or select from a previously captured object image.
At 722, the BIDA image capture system may identify the borders of objects within the view, and at 723 the BIDA image capture system may identify potential types of objects using one or more of many well-known techniques, such as contrast edge detection of the objects within the view. The steps at 722, 723 may provide framing and orientation cues, or contrast, which may allow for more reliable BIDA capture. As described in reference to FIG. 1A and FIG. 1B, in some embodiments, the BIDA image capture system may control the image capturing mechanism of the electronic device, and such framing and focusing may be automatically performed, without requiring further input from the user. At 724, the BIDA image capture system may analyze at least a portion of the potential objects identified at 723.
At 725, the results of the analysis may be transmitted to a notification mechanism, wherein the notification mechanism may discern between a fail result and a success result based on predefined parameters. In some embodiments, a success result may comprise a complete object recognition of at least a portion of the identified potential object, which may comprise a portion of the object image. For example, the BIDA image capture system may identify specific object characteristics, such as those considered more difficult to recognize or more likely to allow full object recognition.
In other embodiments, a success result may comprise a partial object recognition of at least a portion of the identified potential object, wherein predefined portions of the object may be recognized. For example, analyzing a select portion of an object may be sufficient to establish recognition of the object within an image. In still further embodiments, the predefined parameters may be based on a combination of recognition factors.
In some embodiments, a preliminary object recognition system may transmit a fail result at 726, and the notification mechanism may optionally notify the user at 727. At 728, a fail result may trigger a prompt to the user to select or capture an alternate object. Alternatively, in some embodiments the preliminary object recognition system may transmit a success result at 729, and the notification mechanism may notify the user at 730. As discussed above, a notification may include one or more of audible, visual, or tactile notifiers, such as a vibration. In some embodiments, the notification may only occur with a success result at 730.
Where the preliminary object recognition system may transmit a success result at 729, the preliminary object recognition system may prompt the user to accept or reject the object at 731, which may confirm that the user intended to select and/or capture the particular object image. At 732, the preliminary object recognition system may verify access authorization of the electronic device, and at 733, the preliminary object recognition system may prompt the user to provide responses to security inquiries. Such security measures may be significant where the user intends to transmit the object image to a secure external server storage, such as a cloud based image storage system.
In some embodiments, at 734, a preliminary object recognition system may be capable of categorizing the object image. For example, based on a predefined classification system and recognition parameters, the preliminary object recognition system may be able to analyze identified object characteristics and identify key words, phrases, logos, patterns, proximity of patterns to each other, or other criteria. At 735, the preliminary object recognition system may transmit the object image file and related recognition results to an external server, wherein the external server may be capable of a complete object recognition or character recognition analysis. In some embodiments, such as a mobile phone or tablet, the electronic device may be capable of wirelessly connecting to the external server and transmitting the data. Alternatively, in other embodiments such as with a digital camera, the electronic device may require a hard connection to a secondary device, such as a laptop computer, capable of connecting to a network system.
Referring now to FIG. 8, additional aspects of controller hardware, which may be included as computer hardware useful for implementing the present invention, are illustrated as a block diagram that may include a controller 850 upon which an embodiment of the invention may be implemented. Controller 850 may include a bus 852 or other communication mechanism for communicating information, and a processor 854 coupled with bus 852 for processing information.
Controller 850 may also include a main memory 856, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 852 for storing information and instructions to be executed by processor 854. Main memory 856 may also be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 854. Controller 850 may further include a read only memory (ROM) 858 or other static storage device 860.
Controller 850 may be coupled via bus 852 to a display 862, such as a cathode ray tube (CRT) or liquid crystal display (LCD), for displaying information to a computer user. An input device 864, including alphanumeric and other keys, or modes of input such as, for example, a microphone and a radio frequency device such as Bluetooth, may be coupled to bus 852 for communicating information and command selections to processor 854. Another type of user input device may be a cursor control 866, such as a mouse, a trackball, a touchpad, a touchscreen, or cursor direction keys for communicating direction information and command selections to processor 854 and for controlling cursor movement on display 862. This input device may typically have two or three degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allow the device to specify positions in a plane, or stereo cameras that process and provide a third axis of input.
Some embodiments of the invention may be related to the use of controller 850 for setting operational parameters. According to one embodiment of the invention, control parameters may be defined and managed by controller 850 in response to processor 854 executing one or more sequences of one or more instructions contained in main memory 856. Such instructions may be read into main memory 856 from another computer-readable medium, such as storage device 860. Execution of the sequences of instructions contained in main memory 856 causes processor 854 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
The term “computer-readable medium” as used herein may refer to any medium that participates in providing instructions to processor 854 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, solid state devices (SSD) or magnetic disks, such as storage device 860. Volatile media may include dynamic memory, such as main memory 856. Transmission media include coaxial cables, copper wire, and fiber optics, including the wires that comprise bus 852. Transmission media may also take the form of infrared and radio frequency transmissions, acoustic or light waves, such as those generated during radio wave and infrared data communications.
Common forms of computer-readable media may include, for example, a memory stick, a hard disk or any other magnetic medium, a CD-ROM, any other optical medium, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 854 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a distributed network such as the Internet. A communication device may receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector may receive the data carried in the infrared signal, and appropriate circuitry can place the data on bus 852. Bus 852 may carry the data to, or otherwise be in logical communication with, main memory 856, from which processor 854 retrieves and executes the instructions. The instructions received by main memory 856 may optionally be stored on storage device 860 either before or after execution by processor 854.
Controller 850 may also include a communication interface 869 coupled to bus 852. Communication interface 869 provides a two-way data communication coupling to a network link 870 that may be connected to a local network 872. For example, communication interface 869 may operate according to the internet protocol. As another example, communication interface 869 may be a local area network (LAN) card allowing a data communication connection to a compatible LAN.
Network link 870 may typically provide data communication through one or more networks to other data devices. For example, network link 870 may provide a connection through local network 872 to a host computer 874 or to data equipment operated by an Internet Service Provider (ISP) 876. Wireless links may also be implemented. ISP 876 in turn may provide data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” 879. Local network 872 and Internet 879 may both use electrical, electromagnetic, or optical signals that carry digital data streams. The signals transmitted through the various networks, and the signals on network link 870 and through communication interface 869 which carry the digital data to and from controller 850, are exemplary forms of carrier waves transporting the information.
In some embodiments, controller 850 may send messages and receive data, including program code, through the network(s), network link 870, and communication interface 869. In the Internet example, a server 890 might transmit a requested code for an application program through Internet 879, ISP 876, local network 872, and communication interface 869.
Processor 854 may execute the received code as it is received, and/or store it in storage device 860 or other non-volatile storage for later execution. Some exemplary controllers 850 may include a personal digital assistant, a mobile phone, a smart phone, a tablet, a netbook, a notebook computer, a laptop computer, a terminal, a kiosk, or other type of automated apparatus. Additional exemplary devices may include any device with a processor executing programmable commands to accomplish the steps described herein.
FIG. 9 is a block diagram of some embodiments of a network access device that may include a mobile device 902. The mobile device 902 comprises an optical capture device 908 to capture an image and convert it to machine-compatible data, and an optical path 906, typically a lens, an aperture, or an image conduit to convey the image from the rendered document to the optical capture device 908. The optical capture device 908 may incorporate a Charge-Coupled Device (CCD), a Complementary Metal Oxide Semiconductor (CMOS) imaging device, or an optical sensor of another type.
A microphone 910 and associated circuitry may convert the sound of the environment, including spoken words, into machine-compatible signals. Input facilities 914 exist in the form of buttons, scroll-wheels, or other tactile sensors such as touch-pads. In some embodiments, input facilities 914 may include a touchscreen display.
Visual feedback to the user is possible through a visual display, touchscreen display, or indicator lights. Audible feedback 934 may come from a loudspeaker or other audio transducer. Tactile feedback may come from a vibrate module 936.
A motion sensor 938 and associated circuitry convert the motion of the mobile device 902 into machine-compatible signals. The motion sensor 938 may comprise an accelerometer, which may be used to sense measurable physical acceleration, orientation, vibration, and other movements. In some embodiments the motion sensor 938 may include a gyroscope or other device to sense different motions.
A location sensor 940 and associated circuitry may be used to determine the location of the device. The location sensor 940 may detect Global Positioning System (GPS) radio signals from satellites or may also use assisted GPS, where the mobile device may use a cellular network to decrease the time necessary to determine location. In some embodiments, the location sensor 940 may use radio waves to determine the distance from known radio sources, such as cellular towers, to determine the location of the mobile device 902. In some embodiments these radio signals may be used in addition to GPS.
The mobile device 902 comprises logic 926 to interact with the various other components, possibly processing the received signals into different formats and/or interpretations. Logic 926 may be operable to read and write data and program instructions stored in associated storage 930, such as RAM, ROM, flash, or other suitable memory. It may read a time signal from the clock unit 928. In some embodiments, the mobile device 902 may have an on-board power supply 932. In other embodiments, the mobile device 902 may be powered from a tethered connection to another device, such as a Universal Serial Bus (USB) connection.
The mobile device 902 also includes a network interface 916 to communicate data to a network and/or an associated computing device. Network interface 916 may provide two-way data communication. For example, network interface 916 may operate according to the internet protocol. As another example, network interface 916 may be a local area network (LAN) card allowing a data communication connection to a compatible LAN. As another example, network interface 916 may be a cellular antenna and associated circuitry which may allow the mobile device to communicate over standard wireless data communication networks. In some implementations, network interface 916 may include a Universal Serial Bus (USB) connection to supply power or transmit data. In some embodiments other wireless links may also be implemented.
As an example of one use of mobile device 902, a reader may scan some text from a newspaper article with mobile device 902. The text is scanned as a bit-mapped image via the optical capture device 908. Logic 926 causes the bit-mapped image to be stored in memory 930 with an associated time-stamp read from the clock unit 928. Logic 926 may also perform optical character recognition (OCR) or other post-scan processing on the bit-mapped image to convert it to text. Logic 926 may optionally extract a signature from the image, for example by performing a convolution-like process to locate repeating occurrences of characters, symbols, or objects, and determine the distance or number of other characters, symbols, or objects between these repeated elements. The reader may then upload the bit-mapped image (or text or other signature, if post-scan processing has been performed by logic 926) to an associated computer via network interface 916.
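The repeated-element signature described above can be approximated, in a greatly simplified text-domain form, by recording the gap between the first two occurrences of each repeated character. The `extract_signature` function and its output format are assumptions introduced for illustration only, not the convolution-like process itself:

```python
def extract_signature(text, max_pairs=5):
    """Build a crude signature from distances between repeated characters.

    For each character that appears more than once, record the gap (number
    of intervening characters) between its first two occurrences. The sorted
    list of (char, gap) pairs serves as a compact signature of the text.
    """
    first_seen, pairs = {}, []
    for i, ch in enumerate(text):
        if ch.isspace():
            continue  # whitespace carries no signature information here
        if ch in first_seen and all(ch != c for c, _ in pairs):
            # First repeat of this character: record the intervening gap.
            pairs.append((ch, i - first_seen[ch] - 1))
        else:
            first_seen.setdefault(ch, i)
    return sorted(pairs)[:max_pairs]

print(extract_signature("banana"))  # -> [('a', 1), ('n', 1)]
```

Such a signature lets an upload be matched against previously scanned passages without transmitting the full image, which is one motivation the passage suggests for extracting it on the device.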
As an example of another use of mobile device 902, a reader may capture some text from an article as an audio file by using microphone 910 as an acoustic capture port. Logic 926 causes the audio file to be stored in memory 930. Logic 926 may also perform voice recognition or other post-scan processing on the audio file to convert it to text. As above, the reader may then upload the audio file (or text produced by post-scan processing performed by logic 926) to an associated computer via network interface 916.
A controller may include one or more of: personal computers, laptops, pad devices, mobile phone devices and workstations located locally or at remote locations, but in communication with the controller. System apparatus may include digital electronic circuitry included within computer hardware, firmware, software, or in combinations thereof. Additionally, aspects of the invention may be implemented manually.
Apparatus of the invention may be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor, and method actions can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output. The present invention may be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program may be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired, and in any case, the language can be a compiled or interpreted language. Suitable processors may include, by way of example, both general and special purpose microprocessors.
Generally, a processor may receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer may include one or more mass storage devices for storing data files; such devices include Solid State Disk (SSD), magnetic disks such as internal hard disks and removable disks, magneto-optical disks, and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including, by way of example, semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM disks. Any of the foregoing may be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
In some embodiments, implementation of the features of the present invention may be accomplished via digital computer utilizing uniquely defined controlling logic, wherein the controller includes an integrated network between and among the various participants in Process Instruments.
The specific hardware configuration used may not be particularly critical, as long as the processing power is adequate in terms of memory, information updating, order execution, redemption and issuance. Any number of commercially available database engines may allow for substantial account coverage and expansion. The controlling logic may use a language and compiler consistent with that on a CPU included in the controller. These selections may be set according to per se well-known conventions in the software community.
The present invention is described herein with reference to block diagrams and functional illustrations of methods and apparatus to implement various aspects of the present invention. It is understood that each block of the block diagrams or operational illustration or function represented, and combinations of blocks in the block diagrams or operational or functional illustrations, may be implemented by automated apparatus, such as analog or digital hardware and computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, cellular device, smart device, ASIC, or other programmable data processing apparatus, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the functions/acts specified in the block diagrams or operational block or blocks. In some implementations, the functions or method steps described in relation to the blocks or functional representations may occur in an order other than the order noted or described herein. For example, blocks or functional representations shown in a succession may be executed substantially concurrently, or the blocks may be executed in an alternate order, depending upon a specific implementation of the present invention. It is therefore understood that, unless otherwise specifically noted and thereby limited, the discussion here is presented in an order to facilitate enablement and understanding and is not meant to limit the invention disclosed.
CONCLUSION

A number of embodiments of the present invention have been described. While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular embodiments of the present invention.
Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in combination in multiple embodiments separately or in any suitable sub-combination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a sub-combination or variation of a sub-combination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous.
Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the claimed invention.