CN110377914A - Character identifying method, device and storage medium - Google Patents

Character identifying method, device and storage medium
Download PDF

Info

Publication number
CN110377914A
Authority
CN
China
Prior art keywords
character
detected
word vector
coding sequence
stroke
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910677203.7A
Other languages
Chinese (zh)
Other versions
CN110377914B (en)
Inventor
李原野
季成晖
卢俊之
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910677203.7A
Publication of CN110377914A
Application granted
Publication of CN110377914B
Active
Anticipated expiration


Abstract

This application discloses a character recognition method, apparatus, and storage medium, relating to the field of information processing. In this application, a stroke coding sequence of a character to be detected is determined according to the stroke order of the character, and contextual information of the character is obtained. A word vector of the character to be detected is then determined according to the stroke coding sequence and the contextual information, and the character is identified according to its word vector and stored mappings between word vectors and characters. In other words, embodiments of this application can identify the character to be detected by combining its glyph features with its contextual semantics, improving the accuracy with which erroneous characters are recognized.

Description

Character identifying method, device and storage medium
Technical field
This application relates to the field of information processing, and in particular to a character recognition method, apparatus, and storage medium.
Background technique
At present, a terminal can display various kinds of text information. Such text may contain erroneous characters, so the terminal may need to identify the erroneous characters in the text information.
In the related art, a terminal usually identifies erroneous characters based on the semantics of Chinese characters. For example, the terminal may obtain the pinyin of a Chinese character and, according to that pinyin, obtain multiple candidate characters whose pinyin is similar. The terminal then compares the degree to which the semantics of the original character match its contextual information against the degree to which the semantics of each candidate character match the same contextual information, and thereby judges whether the Chinese character is erroneous.
However, in many cases similar pinyin does not imply similar semantics; that is, an erroneous character may not result from phonetic similarity at all. In such cases, the erroneous character cannot be detected by the above method.
Summary of the invention
The embodiments of this application provide a character recognition method, apparatus, and storage medium that can be used to improve the accuracy of recognizing erroneous characters. The technical solutions are as follows:
In one aspect, a character recognition method is provided, the method comprising:
determining, according to the stroke order of a character to be detected, a stroke coding sequence of the character to be detected;
obtaining contextual information of the character to be detected;
determining a word vector of the character to be detected according to the stroke coding sequence and the contextual information; and
identifying the character to be detected according to stored mappings between word vectors and characters and the word vector of the character to be detected.
In another aspect, a character recognition apparatus is provided, the apparatus comprising:
a first determining module, configured to determine a stroke coding sequence of a character to be detected according to the stroke order of the character;
a first obtaining module, configured to obtain contextual information of the character to be detected;
a second determining module, configured to determine a word vector of the character to be detected according to the stroke coding sequence and the contextual information; and
an identification module, configured to identify the character to be detected according to stored mappings between word vectors and characters and the word vector of the character to be detected.
In another aspect, a character recognition apparatus is provided, the apparatus comprising a processor, a communication interface, a memory, and a communication bus;
wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program; and
the processor is configured to execute the program stored in the memory to implement the steps of the character recognition method described above.
In another aspect, a computer-readable storage medium is provided, the storage medium storing a computer program that, when executed by a processor, implements the steps of the character recognition method provided above.
The beneficial effects brought by the technical solutions provided in the embodiments of this application include at least the following:
In the embodiments of this application, the stroke coding sequence of a character to be detected is determined according to the stroke order of the character, and contextual information of the character is obtained. The word vector of the character to be detected is then determined according to the stroke coding sequence and the contextual information, and the character is identified according to its word vector and the stored mappings between word vectors and characters. In other words, the embodiments of this application can identify the character to be detected by combining its glyph features with its contextual semantics, improving the accuracy of recognizing erroneous characters.
Detailed description of the invention
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a character recognition method provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of a character recognition apparatus provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of the first determining module provided by an embodiment of this application;
Fig. 4 is a schematic structural diagram of another character recognition apparatus provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of a terminal for performing character recognition provided by an embodiment of this application.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described below in further detail with reference to the accompanying drawings.
Before the embodiments of this application are explained in detail, the application scenarios involved are first introduced.
At present, a terminal can display various kinds of text information. For example, the terminal may display text information entered by a user. Alternatively, the terminal may perform character recognition through OCR (Optical Character Recognition) to obtain text information and then display it. Alternatively, after the terminal starts an application, it may obtain raw data from the application's server for display; for example, after the terminal starts a map application, it may obtain the name of each point of interest on the map from the map application's server and display those names on the map. Because errors may occur while text information is being input or being converted and recognized, the text may contain erroneous characters. On this basis, before displaying text information, the terminal may use the character recognition method provided by the embodiments of this application to identify erroneous characters in the text.
Of course, in some possible scenarios it may be necessary to automatically obtain synonyms or visually similar characters for certain words. In such cases, the related implementations provided by the embodiments of this application can also be used to identify synonyms and visually similar characters.
In addition, it should be noted that in some scenarios the character recognition method provided by the embodiments of this application may also be executed by a server. For example, an application server that stores text information may identify erroneous characters through this character recognition method so as to correct them.
Next, the character recognition method provided by the embodiments of this application is introduced.
Fig. 1 is a flowchart of a character recognition method provided by an embodiment of this application. Referring to Fig. 1, the method may be applied to a smart device; in this embodiment, its application to a terminal is taken as an example for explanation. The method may comprise the following steps:
Step 101: Determine the stroke coding sequence of a character to be detected according to the stroke order of the character.
In this embodiment, the terminal may obtain text information to be detected, which may include multiple Chinese characters. The terminal may take each of these characters in turn as a character to be detected, and detect whether it is erroneous through the character detection method provided by this embodiment. In other words, the character to be detected may refer to any character in the text information to be detected.
According to the stroke order of the character to be detected, the terminal may split the character to obtain a stroke sequence comprising multiple character components. According to a mapping between character components and coding information, the terminal determines the coding information corresponding to each character component in the stroke sequence, and then sorts the determined coding information according to the order of the components in the stroke sequence to obtain the stroke coding sequence.
It should be noted that a Chinese character is usually composed of multiple strokes written according to a fixed order, so the strokes themselves have a sequence. On this basis, the terminal may split the character to be detected into multiple character components according to its stroke order, where each character component is a single stroke. The components obtained from the split are arranged according to the stroke order to obtain the stroke sequence.
The terminal may store the mapping between character components and coding information. After obtaining the stroke sequence, the terminal may look up the coding information corresponding to each character component in order, and then arrange the retrieved coding information according to the order of retrieval, thereby obtaining the stroke coding sequence.
Table 1 shows one mapping between character components and coding information used in this embodiment. As shown in Table 1, each character component corresponds to one numeric piece of coding information; through this mapping, the components split out of a character can be converted into numeric coding information that the terminal can process, that is, the stroke coding sequence.
Table 1: Mapping between character components and coding information

Character component:  一   丨   丿   ㇏   丶   𠃍   …
Coding information:   1    2    3    4    5    6    …
Illustratively, suppose the character to be detected is 国. The terminal may first split 国 into multiple character components according to its stroke order: 丨, 𠃍, 一, 一, 丨, 一, 丶, 一. Next, according to the mapping between character components and coding information shown in Table 1, the coding information for 丨 is 2, for 𠃍 is 6, for 一 is 1, and for 丶 is 5. On this basis, following the order of the above components, the terminal can obtain the stroke coding sequence "2, 6, 1, 1, 2, 1, 5, 1".
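The splitting and mapping above can be sketched as follows. The stroke-order table here is a hypothetical one-entry stand-in for a full stroke-order dictionary, and the component codes follow Table 1:

```python
# Sketch of step 101: split a character into stroke components via a
# stroke-order table, then map each component to its code from Table 1.
STROKE_CODES = {"一": 1, "丨": 2, "丿": 3, "㇏": 4, "丶": 5, "𠃍": 6}
STROKE_ORDER = {"国": ["丨", "𠃍", "一", "一", "丨", "一", "丶", "一"]}

def stroke_coding_sequence(char):
    """Return the stroke coding sequence of a character."""
    return [STROKE_CODES[component] for component in STROKE_ORDER[char]]

print(stroke_coding_sequence("国"))  # [2, 6, 1, 1, 2, 1, 5, 1]
```

A real system would back `STROKE_ORDER` with a stroke-order dictionary covering all Chinese characters.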
Step 102: Obtain the contextual information of the character to be detected.
The stroke coding sequence obtained by splitting a Chinese character can characterize the character's glyph features, but it is not sufficient to characterize the character's semantics. Therefore, in this embodiment the terminal also obtains the contextual information of the character to be detected.
The terminal may obtain the contextual information of the character to be detected through an N-gram model. Illustratively, the terminal may set the size n of the N-gram window according to a window size entered by the user. Then, starting from the position of the character to be detected in the text information, the terminal obtains the n characters before it and the n characters after it, and takes the 2n characters obtained as the contextual information of the character to be detected, where n is an integer greater than or equal to 1.
For example, suppose n = 2, the character to be detected is 幼, and the text information is 第十八幼儿园地址 ("Address of the Eighteenth Kindergarten"). Starting from the position of 幼, the terminal obtains the two preceding characters 十八 and the two following characters 儿园; these four characters constitute the contextual information of 幼.
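As a minimal sketch, the context window of step 102 can be written as a simple slicing function (truncating at the ends of the text, since the embodiment does not specify padding there):

```python
def context_window(text, index, n=2):
    """Return the n characters before and after text[index] (2n total)."""
    before = text[max(0, index - n):index]
    after = text[index + 1:index + 1 + n]
    return before + after

text = "第十八幼儿园地址"
print(context_window(text, text.index("幼")))  # 十八儿园
```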
It should be noted that in this embodiment the terminal may also perform step 102 before step 101, or perform steps 101 and 102 simultaneously; this embodiment imposes no limitation on the order.
Step 103: Determine the word vector of the character to be detected according to the stroke coding sequence and the contextual information.
After obtaining the stroke coding sequence and contextual information of the character to be detected, the terminal may determine multiple segment coding sequences according to the stroke coding sequence, and then process the multiple segment coding sequences and the contextual information of the character to be detected through a specified deep learning model to obtain the word vector of the character to be detected.
In this embodiment, so that the data input to the specified deep learning model has a uniform format, after determining the stroke coding sequence the terminal may detect whether the number of pieces of coding information contained in the sequence equals k. If the sequence contains fewer than k pieces, the terminal may append preset coding information after the last piece, so that the padded stroke coding sequence contains exactly k pieces.
For example, suppose k = 50 and the stroke coding sequence of the character to be detected contains 8 pieces of coding information. The terminal may then append 42 pieces of preset coding information after the 8 pieces, so that the padded stroke coding sequence contains 50 pieces. The preset coding information may be coding information that does not exist in the mapping between character components and coding information; for example, it may be equal to -1, although other values are also possible, and this embodiment imposes no limitation on this.
After padding the stroke coding sequence, the terminal may generate multiple segment coding sequences according to the padded sequence. Illustratively, starting from the first piece of coding information in the stroke coding sequence, the terminal may group that piece together with the m consecutive pieces that follow it into one segment, thereby obtaining the first segment coding sequence. It then groups the second piece together with the m consecutive pieces that follow it into a segment, thereby obtaining the second segment coding sequence, and so on. Here m is an integer greater than or equal to 1 and less than or equal to k; in general, m may be 2, 3, or 4.
For example, for the stroke coding sequence "2, 6, 1, 1, 2, 1, 5, 1, -1, …, -1" obtained above, suppose m = 2. When dividing this sequence, the first piece of coding information, 2, and the 2 consecutive pieces after it, 6 and 1, form one segment, giving the first segment coding sequence "2, 6, 1". Then the second piece, 6, and the 2 consecutive pieces after it, 1 and 1, form a segment, giving the second segment coding sequence "6, 1, 1". The third piece and the two consecutive pieces after it form a segment, giving the third segment coding sequence "1, 1, 2", and so on.
Optionally, in some possible cases the terminal may divide the stroke coding sequence with different values of m in the manner described above to obtain multiple segment coding sequences. For example, with m = 2 the terminal may obtain, through the above method, multiple segment coding sequences each containing three pieces of coding information. The terminal may then set m = 3 and continue dividing the stroke coding sequence through the above method to obtain multiple segment coding sequences each containing four pieces, and may further set m = 4 to obtain multiple segment coding sequences each containing five pieces.
Optionally, if the stroke coding sequence of the character to be detected already contains exactly k pieces of coding information, the terminal may skip the padding and directly generate the multiple segment coding sequences through the method introduced above.
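The padding and window-based segmentation above can be sketched as follows; k = 10 is used here only to keep the example short, and -1 is the assumed padding code:

```python
def pad_sequence(codes, k, pad=-1):
    """Pad a stroke coding sequence with a reserved code up to length k."""
    return codes + [pad] * max(0, k - len(codes))

def segment_sequences(codes, m=2):
    """Group each code with the m consecutive codes that follow it."""
    return [codes[i:i + m + 1] for i in range(len(codes) - m)]

padded = pad_sequence([2, 6, 1, 1, 2, 1, 5, 1], k=10)
print(segment_sequences(padded)[:3])  # [[2, 6, 1], [6, 1, 1], [1, 1, 2]]
```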
After determining the multiple segment coding sequences, the terminal may use them, together with the previously obtained contextual information of the character to be detected, as inputs to the specified deep learning model. The specified deep learning model may generate one segment coding vector from each segment coding sequence, then generate an initial word vector according to the multiple segment coding vectors generated, and finally determine and output the word vector of the character to be detected according to the initial word vector and the contextual information of the character to be detected.
Specifically, the specified deep learning model may assign values to an initial segment vector of a specified dimension according to each segment coding sequence, thereby generating the segment coding vector corresponding to that segment coding sequence. After the multiple segment coding vectors are generated, the model may average the vector elements located at the same position across the segment coding vectors, thereby obtaining the initial word vector. For example, suppose each segment coding vector has 20 dimensions. The model averages the elements at the first position of each segment coding vector, and this average is the first element of the initial word vector; it averages the elements at the second position of each segment coding vector, and this average is the second element of the initial word vector; and so on, obtaining the initial word vector.
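The element-wise averaging that produces the initial word vector can be sketched as:

```python
def initial_word_vector(segment_vectors):
    """Average the elements at each position across the segment vectors."""
    count = len(segment_vectors)
    dim = len(segment_vectors[0])
    return [sum(vec[i] for vec in segment_vectors) / count for i in range(dim)]

print(initial_word_vector([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]))  # [2.0, 3.0, 4.0]
```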
It should be noted that the specified deep learning model may be a skip-gram model, and it may refer to a model that has been trained on a large amount of sample data.
Illustratively, the terminal may obtain multiple sample data items containing a specified character, each sample data item including the stroke coding sequence of the specified character and one kind of contextual information of the specified character; process the multiple sample data items through the specified deep learning model to obtain the word vector corresponding to the specified character; and store the specified character together with its word vector in the mapping between word vectors and characters.
For each Chinese character, the terminal may obtain the sample text information corresponding to that character. For example, for the character 国, the terminal may obtain multiple pieces of sample text information containing 国. The terminal may then obtain multiple sample data items for each character according to that character's sample text information.
It should be noted that, for ease of description, any Chinese character is referred to here as the specified character. After the terminal obtains multiple pieces of sample text information containing the specified character, it may, with reference to the related methods described above, obtain the specified character's contextual information in each piece of sample text information according to the character's position in that text. At the same time, the terminal may, with reference to the related methods described above, determine the stroke coding sequence of the specified character according to its stroke order. The terminal may then take the stroke coding sequence together with each kind of contextual information obtained as one sample data item of the specified character, and input these items to the specified deep learning model. The model processes the multiple sample data items of the specified character, adjusting its model parameters in the process, and finally outputs the word vector corresponding to the specified character. This word vector is the word vector obtained by training the specified deep learning model, and the terminal stores it together with the specified character in the mapping between word vectors and characters.
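The assembly of sample data items described above can be sketched as follows; the sample texts and the precomputed stroke coding sequence are hypothetical inputs, and each occurrence of the specified character yields one (stroke sequence, context) pair:

```python
def build_samples(char, texts, stroke_seq, n=2):
    """Pair the character's stroke coding sequence with each context window."""
    samples = []
    for text in texts:
        for i, c in enumerate(text):
            if c == char:
                context = text[max(0, i - n):i] + text[i + 1:i + 1 + n]
                samples.append((stroke_seq, context))
    return samples

samples = build_samples("国", ["中国地图", "出国旅游"], [2, 6, 1, 1, 2, 1, 5, 1])
print([context for _, context in samples])  # ['中地图', '出旅游']
```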
For each Chinese character, the terminal may process the corresponding character through the above method, training the specified deep learning model with that character's sample data while outputting the trained word vector, and storing the word vector together with the corresponding character.
Step 104: Identify the character to be detected according to the stored mappings between word vectors and characters and the word vector of the character to be detected.
After determining the word vector of the character to be detected through step 103, the terminal may determine the similarity between that word vector and each word vector included in the mapping between word vectors and characters, and identify the character to be detected according to the similarity between each word vector and the word vector of the character to be detected.
Specifically, the terminal may compute the distance between the word vector of the character to be detected and each word vector in the mapping, and characterize the similarity between the character to be detected and the corresponding word vector by this distance: the closer two vectors are, the higher the similarity between them. The implementation of computing the distance between two vectors may follow related implementations and is not described in detail in this embodiment.
After determining the vector distance between the word vector of the character to be detected and each stored word vector, the terminal may obtain, from the multiple vector distances, those smaller than a specified distance threshold. If only one vector distance is smaller than the specified distance threshold, the terminal may obtain the character corresponding to the word vector of that distance; if that character is identical to the character to be detected, the character to be detected is not erroneous. If multiple vector distances are smaller than the specified distance threshold, the terminal may obtain the smallest of them and then obtain the character corresponding to the word vector of that smallest distance. If that character is different from the character to be detected, it may be determined that the character to be detected is likely erroneous; in this case, the terminal may display the character corresponding to the word vector of the smallest vector distance as a recommended character, that is, a correct character recommended to the user for replacing the character to be detected.
Optionally, in some possible cases, if no vector distance is smaller than the specified distance threshold, the terminal may directly determine that the character to be detected is erroneous.
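The threshold logic of step 104 can be sketched as follows, under the assumptions that Euclidean distance is the distance measure and the threshold value is purely illustrative; the stored vectors are toy values, not trained ones:

```python
import math

def identify(char, char_vec, stored, threshold=1.0):
    """Return (is_suspect, recommended_char) for the character's word vector."""
    close = [(math.dist(char_vec, vec), c) for c, vec in stored.items()
             if math.dist(char_vec, vec) < threshold]
    if not close:
        return True, None    # nothing within the threshold: flag as erroneous
    _, best = min(close)     # character of the smallest vector distance
    if best == char:
        return False, None   # closest character matches: not erroneous
    return True, best        # differs: likely erroneous, recommend best

stored = {"国": [0.0, 0.0], "图": [3.0, 3.0]}
print(identify("固", [0.1, 0.1], stored))  # (True, '国')
```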
In this embodiment, the terminal can determine the stroke coding sequence of the character to be detected according to its stroke order and obtain the contextual information of the character. It then determines the word vector of the character to be detected according to the stroke coding sequence and the contextual information, and identifies the character according to its word vector and the stored mappings between word vectors and characters. In other words, this embodiment can identify the character to be detected by combining its glyph features with its contextual semantics, improving the accuracy of recognizing erroneous characters, and in particular the accuracy of recognizing erroneous characters that are visually similar to the correct ones. For example, text information recognized through OCR usually contains such visually similar erroneous characters; when identifying them in such text, the method provided by this embodiment can improve the recognition accuracy.
In addition, the character recognition method of this embodiment can also be used to mine visually similar characters. In this scenario, after computing the vector distances between the word vector of the character to be detected and each stored word vector, the terminal may further determine the likelihood probability between each word vector and the word vector of the character to be detected according to the magnitudes of the vector distances, and then output characters similar to the character to be detected according to the magnitudes of these probabilities.
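One way to turn vector distances into the likelihood probabilities used for mining visually similar characters is a softmax over negative distances. This particular choice is an assumption, since the embodiment does not fix the form of the probability; the stored vectors are toy values:

```python
import math

def similarity_probs(char_vec, stored):
    """Map each stored character to a probability that decays with distance."""
    dists = {c: math.dist(char_vec, vec) for c, vec in stored.items()}
    total = sum(math.exp(-d) for d in dists.values())
    return {c: math.exp(-d) / total for c, d in dists.items()}

stored = {"国": [0.0, 0.0], "图": [3.0, 3.0]}
probs = similarity_probs([0.1, 0.1], stored)
print(max(probs, key=probs.get))  # 国
```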
Next, the character recognition apparatus provided by the embodiments of this application is introduced.
Referring to Fig. 2, an embodiment of this application provides a character recognition apparatus 200, the apparatus comprising:
a first determining module 201, configured to determine the stroke coding sequence of a character to be detected according to the stroke order of the character;
a first obtaining module 202, configured to obtain the contextual information of the character to be detected;
a second determining module 203, configured to determine the word vector of the character to be detected according to the stroke coding sequence and the contextual information; and
an identification module 204, configured to identify the character to be detected according to the stored mappings between word vectors and characters and the word vector of the character to be detected.
Optionally, referring to Fig. 3, the first determining module 201 comprises:
a splitting submodule 2011, configured to split the character to be detected according to its stroke order to obtain a stroke sequence comprising multiple character components;
a determining submodule 2012, configured to determine, according to the mapping between character components and coding information, the coding information corresponding to each character component in the stroke sequence; and
a sorting submodule 2013, configured to sort the determined coding information according to the order of the multiple character components in the stroke sequence to obtain the stroke coding sequence.
Optionally, the second determining module 203 is specifically configured to:
determine multiple segment coding sequences according to the stroke coding sequence; and
process the multiple segment coding sequences and the contextual information of the character to be detected through the specified deep learning model to obtain the word vector of the character to be detected.
Optionally, referring to Fig. 4, the apparatus 200 further comprises:
a second obtaining module 205, configured to obtain multiple sample data items containing a specified character, each sample data item including the stroke coding sequence of the specified character and one kind of contextual information of the specified character;
a processing module 206, configured to process the multiple sample data items through the specified deep learning model to obtain the word vector corresponding to the specified character; and
a storage module 207, configured to store the specified character together with the word vector of the specified character in the mapping between word vectors and characters.
Optionally, the identification module 204 is specifically configured to:
determine the similarity between each word vector included in the mapping relationship between word vectors and characters and the word vector of the character to be detected;
identify the character to be detected according to the similarity between each word vector and the word vector of the character to be detected.
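As an illustration of the similarity-based identification step, the sketch below uses cosine similarity; this is one common choice and an assumption here, since the application does not fix a particular similarity measure.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def identify(char_vector, vector_to_char):
    """Return the stored character whose word vector is most similar to
    the word vector of the character to be detected."""
    best_char, best_sim = None, -1.0
    for vec, char in vector_to_char.items():
        sim = cosine_similarity(char_vector, vec)
        if sim > best_sim:
            best_char, best_sim = char, sim
    return best_char, best_sim

# Toy mapping between hypothetical word vectors and characters:
mapping = {(1.0, 0.0, 0.0): "木", (0.0, 1.0, 0.0): "休"}
char, _ = identify((0.9, 0.1, 0.0), mapping)
print(char)  # 木
```

Returning the most similar stored character lets the device propose a correction when the detected character's vector does not exactly match any stored vector.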
In conclusion, in the embodiments of the present application, the stroke coding sequence of the character to be detected is determined according to the stroke order of the character to be detected, and the contextual information of the character to be detected is obtained. Then, the word vector of the character to be detected is determined according to the stroke coding sequence and the contextual information, and the character to be detected is identified according to the word vector of the character to be detected and the stored mapping relationship between word vectors and characters. That is, the embodiments of the present application can identify the character to be detected by combining the glyph features and the contextual semantics of the character to be detected, which improves the accuracy of recognizing incorrectly written characters.
It should be understood that, when the character recognition device provided by the above embodiments identifies an incorrectly written character, the division into the above functional modules is merely illustrative. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the character recognition device provided by the above embodiments and the character recognition method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which is not repeated here.
Fig. 5 shows a structural block diagram of a terminal 500 for performing character recognition provided by an exemplary embodiment of the present application. The terminal 500 may be a smart phone, a tablet computer, a laptop computer, or a desktop computer. The terminal 500 may also be called user equipment, a portable device for adjusting a neural network model, a laptop device for adjusting a neural network model, a desktop device for adjusting a neural network model, or other names.
In general, the terminal 500 includes a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 501 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor. The main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 501 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 502 may include one or more computer-readable storage media, which may be non-transitory. The memory 502 may also include a high-speed random access memory and a non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 502 is used to store at least one instruction, and the at least one instruction is executed by the processor 501 to implement the character recognition method provided by the method embodiments of the present application.
In some embodiments, the terminal 500 optionally further includes a peripheral device interface 503 and at least one peripheral device. The processor 501, the memory 502, and the peripheral device interface 503 may be connected by a bus or a signal line. Each peripheral device may be connected to the peripheral device interface 503 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 504, a touch display screen 505, a camera 506, an audio circuit 507, a positioning component 508, and a power supply 509.
The peripheral device interface 503 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 501 and the memory 502. In some embodiments, the processor 501, the memory 502, and the peripheral device interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502, and the peripheral device interface 503 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 504 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 504 communicates with a communication network and other communication devices through electromagnetic signals. The radio frequency circuit 504 converts an electric signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 504 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 504 can communicate with other devices for adjusting a neural network model through at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the World Wide Web, metropolitan area networks, intranets, the generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 504 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 505 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 505 is a touch display screen, the display screen 505 also has the ability to acquire touch signals on or above the surface of the display screen 505. The touch signal may be input to the processor 501 as a control signal for processing. In this case, the display screen 505 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 505, arranged on the front panel of the terminal 500; in other embodiments, there may be at least two display screens 505, respectively arranged on different surfaces of the terminal 500 or in a folded design; in still other embodiments, the display screen 505 may be a flexible display screen arranged on a curved surface or a folded surface of the terminal 500. The display screen 505 may even be arranged in a non-rectangular irregular shape, namely a special-shaped screen. The display screen 505 may be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
The camera assembly 506 is used to capture images or video. Optionally, the camera assembly 506 includes a front camera and a rear camera. In general, the front camera is arranged on the front panel of the terminal device, and the rear camera is arranged on the back of the terminal device. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, or the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 506 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 507 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electric signals, and input them to the processor 501 for processing, or input them to the radio frequency circuit 504 to realize voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, respectively arranged at different parts of the terminal 500. The microphone may also be an array microphone or an omnidirectional collection microphone. The speaker is used to convert electric signals from the processor 501 or the radio frequency circuit 504 into sound waves. The speaker may be a traditional membrane speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert electric signals into sound waves audible to humans, but also convert electric signals into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 507 may also include a headphone jack.
The positioning component 508 is used to locate the current geographic position of the terminal 500 to realize navigation or LBS (Location Based Service). The positioning component 508 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of Russia.
The power supply 509 is used to supply power to the various components in the terminal 500. The power supply 509 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 509 includes a rechargeable battery, the rechargeable battery may be a wired charging battery or a wireless charging battery. A wired charging battery is a battery charged through a wired line, and a wireless charging battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charging technology.
In some embodiments, the terminal 500 further includes one or more sensors 510. The one or more sensors 510 include, but are not limited to: an acceleration sensor 511, a gyroscope sensor 512, a pressure sensor 513, a fingerprint sensor 514, an optical sensor 515, and a proximity sensor 516.
The acceleration sensor 511 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 500. For example, the acceleration sensor 511 can be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 501 can control the touch display screen 505 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 511. The acceleration sensor 511 can also be used to collect game or user motion data.
The gyroscope sensor 512 can detect the body direction and rotation angle of the terminal 500, and the gyroscope sensor 512 can cooperate with the acceleration sensor 511 to collect the user's 3D actions on the terminal 500. Based on the data collected by the gyroscope sensor 512, the processor 501 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 513 may be arranged on the side frame of the terminal 500 and/or the lower layer of the touch display screen 505. When the pressure sensor 513 is arranged on the side frame of the terminal 500, the user's grip signal on the terminal 500 can be detected, and the processor 501 performs left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 513. When the pressure sensor 513 is arranged on the lower layer of the touch display screen 505, the processor 501 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 505. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 514 is used to collect the user's fingerprint. The processor 501 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 identifies the user's identity according to the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 501 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 514 may be arranged on the front, back, or side of the terminal 500. When a physical button or a manufacturer logo is provided on the terminal 500, the fingerprint sensor 514 may be integrated with the physical button or the manufacturer logo.
The optical sensor 515 is used to collect the ambient light intensity. In one embodiment, the processor 501 can control the display brightness of the touch display screen 505 according to the ambient light intensity collected by the optical sensor 515. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 505 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 505 is decreased. In another embodiment, the processor 501 can also dynamically adjust the shooting parameters of the camera assembly 506 according to the ambient light intensity collected by the optical sensor 515.
The proximity sensor 516, also called a distance sensor, is generally arranged on the front panel of the terminal 500. The proximity sensor 516 is used to collect the distance between the user and the front of the terminal 500. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 gradually decreases, the processor 501 controls the touch display screen 505 to switch from the screen-on state to the screen-off state; when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 gradually increases, the processor 501 controls the touch display screen 505 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Fig. 5 does not constitute a limitation on the terminal 500, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
In an exemplary embodiment of the present application, a computer-readable storage medium is also provided, for example, a memory including instructions, where the instructions can be executed by the processor in the above terminal device to complete the character recognition method in the above embodiments. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely optional embodiments of the present application and are not intended to limit the present application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included within the protection scope of the present application.

Claims (10)

CN201910677203.7A | Priority date: 2019-07-25 | Filing date: 2019-07-25 | Character recognition method, device and storage medium | Active | Granted as CN110377914B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910677203.7A (CN110377914B) (en) | 2019-07-25 | 2019-07-25 | Character recognition method, device and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910677203.7A (CN110377914B) (en) | 2019-07-25 | 2019-07-25 | Character recognition method, device and storage medium

Publications (2)

Publication Number | Publication Date
CN110377914A | 2019-10-25
CN110377914B | 2023-01-06

Family

ID=68255945

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910677203.7A (Active; granted as CN110377914B) (en) | Character recognition method, device and storage medium | 2019-07-25 | 2019-07-25

Country Status (1)

Country | Link
CN (1) | CN110377914B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111144105A (en)* | 2019-12-17 | 2020-05-12 | Zhejiang Dahua Technology Co., Ltd. | Word and sentence processing method and device and computer storage medium
CN113934922A (en)* | 2020-07-14 | 2022-01-14 | China Mobile (Chengdu) Information and Communication Technology Co., Ltd. | Intelligent recommendation method, device, equipment and computer storage medium
CN114612911A (en)* | 2022-01-26 | 2022-06-10 | Harbin Institute of Technology (Shenzhen) | Stroke-level handwritten character sequence recognition method, device, terminal and storage medium
CN114692641A (en)* | 2020-12-28 | 2022-07-01 | Huawei Technologies Co., Ltd. | Method and device for obtaining characters

Citations (13)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JPH0573726A (en)* | 1991-09-13 | 1993-03-26 | Aiwa Co Ltd | Character recognition device
US20120005222A1 (en)* | 2010-06-30 | 2012-01-05 | Varun Bhagwan | Template-based recognition of food product information
US8160839B1 (en)* | 2007-10-16 | 2012-04-17 | Metageek, Llc | System and method for device recognition based on signal patterns
CN102663380A (en)* | 2012-03-30 | 2012-09-12 | Central South University | Method for identifying character in steel slab coding image
CN105549890A (en)* | 2015-12-29 | 2016-05-04 | Tsinghua University | One-dimensional handwritten character input equipment and one-dimensional handwritten character input equipment
CN105608462A (en)* | 2015-12-10 | 2016-05-25 | Xiaomi Technology Co., Ltd. | Character similarity judgment method and device
CN106709490A (en)* | 2015-07-31 | 2017-05-24 | Tencent Technology (Shenzhen) Co., Ltd. | Character recognition method and device
CN107169763A (en)* | 2017-04-26 | 2017-09-15 | Shen Siyuan | Safe payment method and system based on signature recognition
CN109271497A (en)* | 2018-08-31 | 2019-01-25 | South China University of Technology | A kind of event-driven service matching method based on term vector
CN109299269A (en)* | 2018-10-23 | 2019-02-01 | Alibaba Group Holding Limited | A kind of file classification method and device
CN109858039A (en)* | 2019-03-01 | 2019-06-07 | Beijing QIYI Century Science & Technology Co., Ltd. | A kind of text information identification method and identification device
CN109933686A (en)* | 2019-03-18 | 2019-06-25 | Alibaba Group Holding Limited | Song tag estimation method, apparatus, server and storage medium
CN109992783A (en)* | 2019-04-03 | 2019-07-09 | Tongji University | Chinese word vector modeling method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Chris McCormick: "Word2vec Tutorial - The Skip-Gram Model", HTTP://MCCORMICKML.COM/2016/04/19/WORD2VEC TUTORIAL-THE-SKIP-GRAM-MODEL *
Shaosheng Cao et al.: "cw2vec: Learning Chinese Word Embeddings with Stroke n-gram information", AAAI18 *
Han Zhihao; Hou Jun: "Wavelet-transform license plate recognition method under poor illumination", Information Systems Engineering *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111144105A (en)* | 2019-12-17 | 2020-05-12 | Zhejiang Dahua Technology Co., Ltd. | Word and sentence processing method and device and computer storage medium
CN111144105B (en)* | 2019-12-17 | 2023-03-14 | Zhejiang Dahua Technology Co., Ltd. | Word and sentence processing method and device and computer storage medium
CN113934922A (en)* | 2020-07-14 | 2022-01-14 | China Mobile (Chengdu) Information and Communication Technology Co., Ltd. | Intelligent recommendation method, device, equipment and computer storage medium
CN114692641A (en)* | 2020-12-28 | 2022-07-01 | Huawei Technologies Co., Ltd. | Method and device for obtaining characters
CN114612911A (en)* | 2022-01-26 | 2022-06-10 | Harbin Institute of Technology (Shenzhen) | Stroke-level handwritten character sequence recognition method, device, terminal and storage medium
CN114612911B (en)* | 2022-01-26 | 2022-11-29 | Harbin Institute of Technology (Shenzhen) | Stroke-level handwritten character sequence recognition method, device, terminal and storage medium

Also Published As

Publication number | Publication date
CN110377914B | 2023-01-06

Similar Documents

Publication | Publication Date | Title
US12210569B2 (en)Video clip positioning method and apparatus, computer device, and storage medium
US10956771B2 (en)Image recognition method, terminal, and storage medium
US20220004794A1 (en)Character recognition method and apparatus, computer device, and storage medium
CN110147533B (en)Encoding method, apparatus, device and storage medium
CN110110145B (en)Descriptive text generation method and device
CN109829456A (en)Image-recognizing method, device and terminal
CN110377914A (en)Character identifying method, device and storage medium
CN108615526A (en)The detection method of keyword, device, terminal and storage medium in voice signal
CN108304506B (en)Retrieval method, device and equipment
US11995406B2 (en)Encoding method, apparatus, and device, and storage medium
CN111339737B (en)Entity linking method, device, equipment and storage medium
CN109491924A (en)Code detection method, device, terminal and storage medium
CN108922531B (en)Slot position identification method and device, electronic equipment and storage medium
CN111209377B (en)Text processing method, device, equipment and medium based on deep learning
CN110110045A (en)A kind of method, apparatus and storage medium for retrieving Similar Text
CN110162604A (en)Sentence generation method, device, equipment and storage medium
CN108132790A (en)Detect the method, apparatus and computer storage media of dead code
CN110096525A (en)Calibrate method, apparatus, equipment and the storage medium of interest point information
CN112232059B (en)Text error correction method and device, computer equipment and storage medium
CN110490179A (en)Licence plate recognition method, device and storage medium
CN110555102A (en)media title recognition method, device and storage medium
CN113569561A (en)Text error correction method and device, computer equipment and computer readable storage medium
CN113763931B (en)Waveform feature extraction method, waveform feature extraction device, computer equipment and storage medium
CN110232417A (en)Image-recognizing method, device, computer equipment and computer readable storage medium
CN113486260A (en)Interactive information generation method and device, computer equipment and storage medium

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
