CN110377914A - Character identifying method, device and storage medium - Google Patents

Character identifying method, device and storage medium
Download PDF

Info

Publication number
CN110377914A
Authority
CN
China
Prior art keywords
character
detected
word vector
coding sequence
stroke
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910677203.7A
Other languages
Chinese (zh)
Other versions
CN110377914B (en)
Inventor
李原野
季成晖
卢俊之
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910677203.7A
Publication of CN110377914A
Application granted
Publication of CN110377914B
Active
Anticipated expiration


Abstract

This application discloses a character recognition method, apparatus, and storage medium, relating to the field of information processing. In this application, a stroke coding sequence of a character to be detected is determined according to the stroke order of the character, and contextual information of the character is obtained. A word vector of the character to be detected is then determined according to the stroke coding sequence and the contextual information, and the character is identified according to its word vector and stored mappings between word vectors and characters. In other words, embodiments of this application can identify the character to be detected by combining its glyph features with its contextual semantics, improving the accuracy with which erroneous characters are recognized.

Description

Character identifying method, device and storage medium
Technical field
This application relates to the field of information processing, and in particular to a character recognition method, apparatus, and storage medium.
Background technique
At present, a terminal can display various kinds of text information. Such text may contain erroneous characters, so the terminal may need to identify the erroneous characters in the text information.
In the related art, a terminal usually identifies erroneous characters based on the semantics of Chinese characters. For example, the terminal may obtain the pinyin of a Chinese character and, according to that pinyin, obtain multiple candidate characters whose pinyin is similar. The terminal then compares the degree to which the semantics of the original character match its contextual information against the degree to which the semantics of each candidate character match the same contextual information, and thereby judges whether the Chinese character is erroneous.
However, in many cases similar pinyin does not imply similar semantics; that is, an erroneous character may not result from phonetic similarity at all. In such cases, the erroneous character cannot be detected by the above method.
Summary of the invention
The embodiments of this application provide a character recognition method, apparatus, and storage medium that can be used to improve the accuracy of recognizing erroneous characters. The technical solutions are as follows:
In one aspect, a character recognition method is provided, the method comprising:
determining, according to the stroke order of a character to be detected, a stroke coding sequence of the character to be detected;
obtaining contextual information of the character to be detected;
determining a word vector of the character to be detected according to the stroke coding sequence and the contextual information; and
identifying the character to be detected according to stored mappings between word vectors and characters and the word vector of the character to be detected.
In another aspect, a character recognition apparatus is provided, the apparatus comprising:
a first determining module, configured to determine a stroke coding sequence of a character to be detected according to the stroke order of the character;
a first obtaining module, configured to obtain contextual information of the character to be detected;
a second determining module, configured to determine a word vector of the character to be detected according to the stroke coding sequence and the contextual information; and
an identification module, configured to identify the character to be detected according to stored mappings between word vectors and characters and the word vector of the character to be detected.
In another aspect, a character recognition apparatus is provided, the apparatus comprising a processor, a communication interface, a memory, and a communication bus;
wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program; and
the processor is configured to execute the program stored in the memory to implement the steps of the character recognition method described above.
In another aspect, a computer-readable storage medium is provided, the storage medium storing a computer program that, when executed by a processor, implements the steps of the character recognition method provided above.
The beneficial effects brought by the technical solutions provided in the embodiments of this application include at least the following:
In the embodiments of this application, the stroke coding sequence of a character to be detected is determined according to the stroke order of the character, and contextual information of the character is obtained. The word vector of the character to be detected is then determined according to the stroke coding sequence and the contextual information, and the character is identified according to its word vector and the stored mappings between word vectors and characters. In other words, the embodiments of this application can identify the character to be detected by combining its glyph features with its contextual semantics, improving the accuracy of recognizing erroneous characters.
Detailed description of the invention
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a character recognition method provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of a character recognition apparatus provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of the first determining module provided by an embodiment of this application;
Fig. 4 is a schematic structural diagram of another character recognition apparatus provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of a terminal for performing character recognition provided by an embodiment of this application.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described below in further detail with reference to the accompanying drawings.
Before the embodiments of this application are explained in detail, the application scenarios involved are first introduced.
At present, a terminal can display various kinds of text information. For example, the terminal may display text information entered by a user. Alternatively, the terminal may perform character recognition through OCR (Optical Character Recognition) to obtain text information and then display it. Alternatively, after the terminal starts an application, it may obtain raw data from the application's server for display; for example, after the terminal starts a map application, it may obtain the name of each point of interest on the map from the map application's server and display those names on the map. Because errors may occur while text information is being input or being converted and recognized, the text may contain erroneous characters. On this basis, before displaying text information, the terminal may use the character recognition method provided by the embodiments of this application to identify erroneous characters in the text.
Of course, in some possible scenarios it may be necessary to automatically obtain synonyms or visually similar characters for certain words. In such cases, the related implementations provided by the embodiments of this application can also be used to identify synonyms and visually similar characters.
In addition, it should be noted that in some scenarios the character recognition method provided by the embodiments of this application may also be executed by a server. For example, an application server that stores text information may identify erroneous characters through this character recognition method so as to correct them.
Next, the character recognition method provided by the embodiments of this application is introduced.
Fig. 1 is a flowchart of a character recognition method provided by an embodiment of this application. Referring to Fig. 1, the method may be applied to a smart device; in this embodiment, its application to a terminal is taken as an example for explanation. The method may comprise the following steps:
Step 101: Determine the stroke coding sequence of a character to be detected according to the stroke order of the character.
In this embodiment, the terminal may obtain text information to be detected, which may include multiple Chinese characters. The terminal may take each of these characters in turn as a character to be detected, and detect whether it is erroneous through the character detection method provided by this embodiment. In other words, the character to be detected may refer to any character in the text information to be detected.
According to the stroke order of the character to be detected, the terminal may split the character to obtain a stroke sequence comprising multiple character components. According to a mapping between character components and coding information, the terminal determines the coding information corresponding to each character component in the stroke sequence, and then sorts the determined coding information according to the order of the components in the stroke sequence to obtain the stroke coding sequence.
It should be noted that a Chinese character is usually composed of multiple strokes written according to a fixed order, so the strokes themselves have a sequence. On this basis, the terminal may split the character to be detected into multiple character components according to its stroke order, where each character component is a single stroke. The components obtained from the split are arranged according to the stroke order to obtain the stroke sequence.
The terminal may store the mapping between character components and coding information. After obtaining the stroke sequence, the terminal may look up the coding information corresponding to each character component in order, and then arrange the retrieved coding information according to the order of retrieval, thereby obtaining the stroke coding sequence.
Table 1 shows one mapping between character components and coding information used in this embodiment. As shown in Table 1, each character component corresponds to one numeric piece of coding information; through this mapping, the components split out of a character can be converted into numeric coding information that the terminal can process, that is, the stroke coding sequence.
Table 1: Mapping between character components and coding information

Character component:  一   丨   丿   ㇏   丶   𠃍   …
Coding information:   1    2    3    4    5    6    …
Illustratively, suppose the character to be detected is 国. The terminal may first split 国 into multiple character components according to its stroke order: 丨, 𠃍, 一, 一, 丨, 一, 丶, 一. Next, according to the mapping between character components and coding information shown in Table 1, the coding information for 丨 is 2, for 𠃍 is 6, for 一 is 1, and for 丶 is 5. On this basis, following the order of the above components, the terminal can obtain the stroke coding sequence "2, 6, 1, 1, 2, 1, 5, 1".
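The splitting and mapping above can be sketched as follows. The stroke-order table here is a hypothetical one-entry stand-in for a full stroke-order dictionary, and the component codes follow Table 1:

```python
# Sketch of step 101: split a character into stroke components via a
# stroke-order table, then map each component to its code from Table 1.
STROKE_CODES = {"一": 1, "丨": 2, "丿": 3, "㇏": 4, "丶": 5, "𠃍": 6}
STROKE_ORDER = {"国": ["丨", "𠃍", "一", "一", "丨", "一", "丶", "一"]}

def stroke_coding_sequence(char):
    """Return the stroke coding sequence of a character."""
    return [STROKE_CODES[component] for component in STROKE_ORDER[char]]

print(stroke_coding_sequence("国"))  # [2, 6, 1, 1, 2, 1, 5, 1]
```

A real system would back `STROKE_ORDER` with a stroke-order dictionary covering all Chinese characters.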
Step 102: Obtain the contextual information of the character to be detected.
The stroke coding sequence obtained by splitting a Chinese character can characterize the character's glyph features, but it is not sufficient to characterize the character's semantics. Therefore, in this embodiment the terminal also obtains the contextual information of the character to be detected.
The terminal may obtain the contextual information of the character to be detected through an N-gram model. Illustratively, the terminal may set the size n of the N-gram window according to a window size entered by the user. Then, starting from the position of the character to be detected in the text information, the terminal obtains the n characters before it and the n characters after it, and takes the 2n characters obtained as the contextual information of the character to be detected, where n is an integer greater than or equal to 1.
For example, suppose n = 2, the character to be detected is 幼, and the text information is 第十八幼儿园地址 ("Address of the Eighteenth Kindergarten"). Starting from the position of 幼, the terminal obtains the two preceding characters 十八 and the two following characters 儿园; these four characters constitute the contextual information of 幼.
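As a minimal sketch, the context window of step 102 can be written as a simple slicing function (truncating at the ends of the text, since the embodiment does not specify padding there):

```python
def context_window(text, index, n=2):
    """Return the n characters before and after text[index] (2n total)."""
    before = text[max(0, index - n):index]
    after = text[index + 1:index + 1 + n]
    return before + after

text = "第十八幼儿园地址"
print(context_window(text, text.index("幼")))  # 十八儿园
```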
It should be noted that in this embodiment the terminal may also perform step 102 before step 101, or perform steps 101 and 102 simultaneously; this embodiment imposes no limitation on the order.
Step 103: Determine the word vector of the character to be detected according to the stroke coding sequence and the contextual information.
After obtaining the stroke coding sequence and contextual information of the character to be detected, the terminal may determine multiple segment coding sequences according to the stroke coding sequence, and then process the multiple segment coding sequences and the contextual information of the character to be detected through a specified deep learning model to obtain the word vector of the character to be detected.
In this embodiment, so that the data input to the specified deep learning model has a uniform format, after determining the stroke coding sequence the terminal may detect whether the number of pieces of coding information contained in the sequence equals k. If the sequence contains fewer than k pieces, the terminal may append preset coding information after the last piece, so that the padded stroke coding sequence contains exactly k pieces.
For example, suppose k = 50 and the stroke coding sequence of the character to be detected contains 8 pieces of coding information. The terminal may then append 42 pieces of preset coding information after the 8 pieces, so that the padded stroke coding sequence contains 50 pieces. The preset coding information may be coding information that does not exist in the mapping between character components and coding information; for example, it may be equal to -1, although other values are also possible, and this embodiment imposes no limitation on this.
After padding the stroke coding sequence, the terminal may generate multiple segment coding sequences according to the padded sequence. Illustratively, starting from the first piece of coding information in the stroke coding sequence, the terminal may group that piece together with the m consecutive pieces that follow it into one segment, thereby obtaining the first segment coding sequence. It then groups the second piece together with the m consecutive pieces that follow it into a segment, thereby obtaining the second segment coding sequence, and so on. Here m is an integer greater than or equal to 1 and less than or equal to k; in general, m may be 2, 3, or 4.
For example, for the stroke coding sequence "2, 6, 1, 1, 2, 1, 5, 1, -1, …, -1" obtained above, suppose m = 2. When dividing this sequence, the first piece of coding information, 2, and the 2 consecutive pieces after it, 6 and 1, form one segment, giving the first segment coding sequence "2, 6, 1". Then the second piece, 6, and the 2 consecutive pieces after it, 1 and 1, form a segment, giving the second segment coding sequence "6, 1, 1". The third piece and the two consecutive pieces after it form a segment, giving the third segment coding sequence "1, 1, 2", and so on.
Optionally, in some possible cases the terminal may divide the stroke coding sequence with different values of m in the manner described above to obtain multiple segment coding sequences. For example, with m = 2 the terminal may obtain, through the above method, multiple segment coding sequences each containing three pieces of coding information. The terminal may then set m = 3 and continue dividing the stroke coding sequence through the above method to obtain multiple segment coding sequences each containing four pieces, and may further set m = 4 to obtain multiple segment coding sequences each containing five pieces.
Optionally, if the stroke coding sequence of the character to be detected already contains exactly k pieces of coding information, the terminal may skip the padding and directly generate the multiple segment coding sequences through the method introduced above.
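The padding and window-based segmentation above can be sketched as follows; k = 10 is used here only to keep the example short, and -1 is the assumed padding code:

```python
def pad_sequence(codes, k, pad=-1):
    """Pad a stroke coding sequence with a reserved code up to length k."""
    return codes + [pad] * max(0, k - len(codes))

def segment_sequences(codes, m=2):
    """Group each code with the m consecutive codes that follow it."""
    return [codes[i:i + m + 1] for i in range(len(codes) - m)]

padded = pad_sequence([2, 6, 1, 1, 2, 1, 5, 1], k=10)
print(segment_sequences(padded)[:3])  # [[2, 6, 1], [6, 1, 1], [1, 1, 2]]
```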
After determining the multiple segment coding sequences, the terminal may use them, together with the previously obtained contextual information of the character to be detected, as inputs to the specified deep learning model. The specified deep learning model may generate one segment coding vector from each segment coding sequence, then generate an initial word vector according to the multiple segment coding vectors generated, and finally determine and output the word vector of the character to be detected according to the initial word vector and the contextual information of the character to be detected.
Specifically, the specified deep learning model may assign values to an initial segment vector of a specified dimension according to each segment coding sequence, thereby generating the segment coding vector corresponding to that segment coding sequence. After the multiple segment coding vectors are generated, the model may average the vector elements located at the same position across the segment coding vectors, thereby obtaining the initial word vector. For example, suppose each segment coding vector has 20 dimensions. The model averages the elements at the first position of each segment coding vector, and this average is the first element of the initial word vector; it averages the elements at the second position of each segment coding vector, and this average is the second element of the initial word vector; and so on, obtaining the initial word vector.
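The element-wise averaging that produces the initial word vector can be sketched as:

```python
def initial_word_vector(segment_vectors):
    """Average the elements at each position across the segment vectors."""
    count = len(segment_vectors)
    dim = len(segment_vectors[0])
    return [sum(vec[i] for vec in segment_vectors) / count for i in range(dim)]

print(initial_word_vector([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]))  # [2.0, 3.0, 4.0]
```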
It should be noted that the specified deep learning model may be a skip-gram model, and it may refer to a model that has been trained on a large amount of sample data.
Illustratively, the terminal may obtain multiple sample data items containing a specified character, each sample data item including the stroke coding sequence of the specified character and one kind of contextual information of the specified character; process the multiple sample data items through the specified deep learning model to obtain the word vector corresponding to the specified character; and store the specified character together with its word vector in the mapping between word vectors and characters.
For each Chinese character, the terminal may obtain the sample text information corresponding to that character. For example, for the character 国, the terminal may obtain multiple pieces of sample text information containing 国. The terminal may then obtain multiple sample data items for each character according to that character's sample text information.
It should be noted that, for ease of description, any Chinese character is referred to here as the specified character. After the terminal obtains multiple pieces of sample text information containing the specified character, it may, with reference to the related methods described above, obtain the specified character's contextual information in each piece of sample text information according to the character's position in that text. At the same time, the terminal may, with reference to the related methods described above, determine the stroke coding sequence of the specified character according to its stroke order. The terminal may then take the stroke coding sequence together with each kind of contextual information obtained as one sample data item of the specified character, and input these items to the specified deep learning model. The model processes the multiple sample data items of the specified character, adjusting its model parameters in the process, and finally outputs the word vector corresponding to the specified character. This word vector is the word vector obtained by training the specified deep learning model, and the terminal stores it together with the specified character in the mapping between word vectors and characters.
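The assembly of sample data items described above can be sketched as follows; the sample texts and the precomputed stroke coding sequence are hypothetical inputs, and each occurrence of the specified character yields one (stroke sequence, context) pair:

```python
def build_samples(char, texts, stroke_seq, n=2):
    """Pair the character's stroke coding sequence with each context window."""
    samples = []
    for text in texts:
        for i, c in enumerate(text):
            if c == char:
                context = text[max(0, i - n):i] + text[i + 1:i + 1 + n]
                samples.append((stroke_seq, context))
    return samples

samples = build_samples("国", ["中国地图", "出国旅游"], [2, 6, 1, 1, 2, 1, 5, 1])
print([context for _, context in samples])  # ['中地图', '出旅游']
```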
For each Chinese character, the terminal may process the corresponding character through the above method, training the specified deep learning model with that character's sample data while outputting the trained word vector, and storing the word vector together with the corresponding character.
Step 104: Identify the character to be detected according to the stored mappings between word vectors and characters and the word vector of the character to be detected.
After determining the word vector of the character to be detected through step 103, the terminal may determine the similarity between that word vector and each word vector included in the mapping between word vectors and characters, and identify the character to be detected according to the similarity between each word vector and the word vector of the character to be detected.
Specifically, the terminal may compute the distance between the word vector of the character to be detected and each word vector in the mapping, and characterize the similarity between the character to be detected and the corresponding word vector by this distance: the closer two vectors are, the higher the similarity between them. The implementation of computing the distance between two vectors may follow related implementations and is not described in detail in this embodiment.
After determining the vector distance between the word vector of the character to be detected and each stored word vector, the terminal may obtain, from the multiple vector distances, those smaller than a specified distance threshold. If only one vector distance is smaller than the specified distance threshold, the terminal may obtain the character corresponding to the word vector of that distance; if that character is identical to the character to be detected, the character to be detected is not erroneous. If multiple vector distances are smaller than the specified distance threshold, the terminal may obtain the smallest of them and then obtain the character corresponding to the word vector of that smallest distance. If that character is different from the character to be detected, it may be determined that the character to be detected is likely erroneous; in this case, the terminal may display the character corresponding to the word vector of the smallest vector distance as a recommended character, that is, a correct character recommended to the user for replacing the character to be detected.
Optionally, in some possible cases, if no vector distance is smaller than the specified distance threshold, the terminal may directly determine that the character to be detected is erroneous.
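The threshold logic of step 104 can be sketched as follows, under the assumptions that Euclidean distance is the distance measure and the threshold value is purely illustrative; the stored vectors are toy values, not trained ones:

```python
import math

def identify(char, char_vec, stored, threshold=1.0):
    """Return (is_suspect, recommended_char) for the character's word vector."""
    close = [(math.dist(char_vec, vec), c) for c, vec in stored.items()
             if math.dist(char_vec, vec) < threshold]
    if not close:
        return True, None    # nothing within the threshold: flag as erroneous
    _, best = min(close)     # character of the smallest vector distance
    if best == char:
        return False, None   # closest character matches: not erroneous
    return True, best        # differs: likely erroneous, recommend best

stored = {"国": [0.0, 0.0], "图": [3.0, 3.0]}
print(identify("固", [0.1, 0.1], stored))  # (True, '国')
```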
In this embodiment, the terminal can determine the stroke coding sequence of the character to be detected according to its stroke order and obtain the contextual information of the character. It then determines the word vector of the character to be detected according to the stroke coding sequence and the contextual information, and identifies the character according to its word vector and the stored mappings between word vectors and characters. In other words, this embodiment can identify the character to be detected by combining its glyph features with its contextual semantics, improving the accuracy of recognizing erroneous characters, and in particular the accuracy of recognizing erroneous characters that are visually similar to the correct ones. For example, text information recognized through OCR usually contains such visually similar erroneous characters; when identifying them in such text, the method provided by this embodiment can improve the recognition accuracy.
In addition, the character recognition method of this embodiment can also be used to mine visually similar characters. In this scenario, after computing the vector distances between the word vector of the character to be detected and each stored word vector, the terminal may further determine the likelihood probability between each word vector and the word vector of the character to be detected according to the magnitudes of the vector distances, and then output characters similar to the character to be detected according to the magnitudes of these probabilities.
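One way to turn vector distances into the likelihood probabilities used for mining visually similar characters is a softmax over negative distances. This particular choice is an assumption, since the embodiment does not fix the form of the probability; the stored vectors are toy values:

```python
import math

def similarity_probs(char_vec, stored):
    """Map each stored character to a probability that decays with distance."""
    dists = {c: math.dist(char_vec, vec) for c, vec in stored.items()}
    total = sum(math.exp(-d) for d in dists.values())
    return {c: math.exp(-d) / total for c, d in dists.items()}

stored = {"国": [0.0, 0.0], "图": [3.0, 3.0]}
probs = similarity_probs([0.1, 0.1], stored)
print(max(probs, key=probs.get))  # 国
```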
Next, the character recognition apparatus provided by the embodiments of this application is introduced.
Referring to Fig. 2, an embodiment of this application provides a character recognition apparatus 200, the apparatus comprising:
a first determining module 201, configured to determine the stroke coding sequence of a character to be detected according to the stroke order of the character;
a first obtaining module 202, configured to obtain the contextual information of the character to be detected;
a second determining module 203, configured to determine the word vector of the character to be detected according to the stroke coding sequence and the contextual information; and
an identification module 204, configured to identify the character to be detected according to the stored mappings between word vectors and characters and the word vector of the character to be detected.
Optionally, referring to Fig. 3, the first determining module 201 comprises:
a splitting submodule 2011, configured to split the character to be detected according to its stroke order to obtain a stroke sequence comprising multiple character components;
a determining submodule 2012, configured to determine, according to the mapping between character components and coding information, the coding information corresponding to each character component in the stroke sequence; and
a sorting submodule 2013, configured to sort the determined coding information according to the order of the multiple character components in the stroke sequence to obtain the stroke coding sequence.
Optionally, the second determining module 203 is specifically configured to:
determine multiple segment coding sequences according to the stroke coding sequence; and
process the multiple segment coding sequences and the contextual information of the character to be detected through the specified deep learning model to obtain the word vector of the character to be detected.
Optionally, referring to Fig. 4, the apparatus 200 further comprises:
a second obtaining module 205, configured to obtain multiple sample data items containing a specified character, each sample data item including the stroke coding sequence of the specified character and one kind of contextual information of the specified character;
a processing module 206, configured to process the multiple sample data items through the specified deep learning model to obtain the word vector corresponding to the specified character; and
a storage module 207, configured to store the specified character together with the word vector of the specified character in the mapping between word vectors and characters.
Optionally, the identification module 204 is specifically configured to:
determine the similarity between each word vector included in the mapping relationship between word vectors and characters and the word vector of the character to be detected;
identify the character to be detected according to the similarity between each word vector and the word vector of the character to be detected.
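As an illustration of the similarity-based identification step, the sketch below uses cosine similarity; this is one common choice and an assumption here, since the application does not fix a particular similarity measure.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def identify(char_vector, vector_to_char):
    """Return the stored character whose word vector is most similar to
    the word vector of the character to be detected."""
    best_char, best_sim = None, -1.0
    for vec, char in vector_to_char.items():
        sim = cosine_similarity(char_vector, vec)
        if sim > best_sim:
            best_char, best_sim = char, sim
    return best_char, best_sim

# Toy mapping between hypothetical word vectors and characters:
mapping = {(1.0, 0.0, 0.0): "木", (0.0, 1.0, 0.0): "休"}
char, _ = identify((0.9, 0.1, 0.0), mapping)
print(char)  # 木
```

Returning the most similar stored character lets the device propose a correction when the detected character's vector does not exactly match any stored vector.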
In conclusion, in the embodiments of the present application, the stroke coding sequence of the character to be detected is determined according to the stroke order of the character to be detected, and the contextual information of the character to be detected is obtained. Then, the word vector of the character to be detected is determined according to the stroke coding sequence and the contextual information, and the character to be detected is identified according to the word vector of the character to be detected and the stored mapping relationship between word vectors and characters. That is, the embodiments of the present application can identify the character to be detected by combining the glyph features and the contextual semantics of the character to be detected, which improves the accuracy of recognizing incorrectly written characters.
It should be understood that, when the character recognition device provided by the above embodiments identifies an incorrectly written character, the division into the above functional modules is merely illustrative. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the character recognition device provided by the above embodiments and the character recognition method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, which is not repeated here.
Fig. 5 shows a structural block diagram of a terminal 500 for performing character recognition provided by an exemplary embodiment of the present application. The terminal 500 may be a smart phone, a tablet computer, a laptop computer, or a desktop computer. The terminal 500 may also be called user equipment, a portable device for adjusting a neural network model, a laptop device for adjusting a neural network model, a desktop device for adjusting a neural network model, or other names.
In general, the terminal 500 includes a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 501 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor. The main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 501 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 502 may include one or more computer-readable storage media, which may be non-transitory. The memory 502 may also include a high-speed random access memory and a non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 502 is used to store at least one instruction, and the at least one instruction is executed by the processor 501 to implement the character recognition method provided by the method embodiments of the present application.
In some embodiments, the terminal 500 optionally further includes a peripheral device interface 503 and at least one peripheral device. The processor 501, the memory 502, and the peripheral device interface 503 may be connected by a bus or a signal line. Each peripheral device may be connected to the peripheral device interface 503 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 504, a touch display screen 505, a camera 506, an audio circuit 507, a positioning component 508, and a power supply 509.
The peripheral device interface 503 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 501 and the memory 502. In some embodiments, the processor 501, the memory 502, and the peripheral device interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502, and the peripheral device interface 503 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 504 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 504 communicates with a communication network and other communication devices through electromagnetic signals. The radio frequency circuit 504 converts an electric signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 504 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like. The radio frequency circuit 504 can communicate with other devices for adjusting a neural network model through at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the World Wide Web, metropolitan area networks, intranets, the generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 504 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 505 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 505 is a touch display screen, the display screen 505 also has the ability to acquire touch signals on or above the surface of the display screen 505. The touch signal may be input to the processor 501 as a control signal for processing. In this case, the display screen 505 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 505, arranged on the front panel of the terminal 500; in other embodiments, there may be at least two display screens 505, respectively arranged on different surfaces of the terminal 500 or in a folded design; in still other embodiments, the display screen 505 may be a flexible display screen arranged on a curved surface or a folded surface of the terminal 500. The display screen 505 may even be arranged in a non-rectangular irregular shape, namely a special-shaped screen. The display screen 505 may be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
The camera assembly 506 is used to capture images or video. Optionally, the camera assembly 506 includes a front camera and a rear camera. In general, the front camera is arranged on the front panel of the terminal device, and the rear camera is arranged on the back of the terminal device. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, or the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 506 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 507 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electric signals, and input them to the processor 501 for processing, or input them to the radio frequency circuit 504 to realize voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, respectively arranged at different parts of the terminal 500. The microphone may also be an array microphone or an omnidirectional collection microphone. The speaker is used to convert electric signals from the processor 501 or the radio frequency circuit 504 into sound waves. The speaker may be a traditional membrane speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert electric signals into sound waves audible to humans, but also convert electric signals into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 507 may also include a headphone jack.
The positioning component 508 is used to locate the current geographic position of the terminal 500 to realize navigation or LBS (Location Based Service). The positioning component 508 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of Russia.
The power supply 509 is used to supply power to the various components in the terminal 500. The power supply 509 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 509 includes a rechargeable battery, the rechargeable battery may be a wired charging battery or a wireless charging battery. A wired charging battery is a battery charged through a wired line, and a wireless charging battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charging technology.
In some embodiments, the terminal 500 further includes one or more sensors 510. The one or more sensors 510 include, but are not limited to: an acceleration sensor 511, a gyroscope sensor 512, a pressure sensor 513, a fingerprint sensor 514, an optical sensor 515, and a proximity sensor 516.
The acceleration sensor 511 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 500. For example, the acceleration sensor 511 can be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 501 can control the touch display screen 505 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 511. The acceleration sensor 511 can also be used to collect game or user motion data.
The gyroscope sensor 512 can detect the body direction and rotation angle of the terminal 500, and the gyroscope sensor 512 can cooperate with the acceleration sensor 511 to collect the user's 3D actions on the terminal 500. Based on the data collected by the gyroscope sensor 512, the processor 501 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 513 may be arranged on the side frame of the terminal 500 and/or the lower layer of the touch display screen 505. When the pressure sensor 513 is arranged on the side frame of the terminal 500, the user's grip signal on the terminal 500 can be detected, and the processor 501 performs left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 513. When the pressure sensor 513 is arranged on the lower layer of the touch display screen 505, the processor 501 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 505. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 514 is used to collect the user's fingerprint. The processor 501 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 identifies the user's identity according to the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 501 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 514 may be arranged on the front, back, or side of the terminal 500. When a physical button or a manufacturer logo is provided on the terminal 500, the fingerprint sensor 514 may be integrated with the physical button or the manufacturer logo.
The optical sensor 515 is used to collect the ambient light intensity. In one embodiment, the processor 501 can control the display brightness of the touch display screen 505 according to the ambient light intensity collected by the optical sensor 515. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 505 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 505 is decreased. In another embodiment, the processor 501 can also dynamically adjust the shooting parameters of the camera assembly 506 according to the ambient light intensity collected by the optical sensor 515.
The proximity sensor 516, also called a distance sensor, is generally arranged on the front panel of the terminal 500. The proximity sensor 516 is used to collect the distance between the user and the front of the terminal 500. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 gradually decreases, the processor 501 controls the touch display screen 505 to switch from the screen-on state to the screen-off state; when the proximity sensor 516 detects that the distance between the user and the front of the terminal 500 gradually increases, the processor 501 controls the touch display screen 505 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Fig. 5 does not constitute a limitation on the terminal 500, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
In an exemplary embodiment of the present application, a computer-readable storage medium is also provided, for example, a memory including instructions, where the instructions can be executed by the processor in the above terminal device to complete the character recognition method in the above embodiments. For example, the computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely optional embodiments of the present application and are not intended to limit the present application. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included within the protection scope of the present application.

Claims (10)

CN201910677203.7A | Priority date: 2019-07-25 | Filing date: 2019-07-25 | Character recognition method, device and storage medium | Active | Granted as CN110377914B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910677203.7A (CN110377914B) (en) | 2019-07-25 | 2019-07-25 | Character recognition method, device and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910677203.7A (CN110377914B) (en) | 2019-07-25 | 2019-07-25 | Character recognition method, device and storage medium

Publications (2)

Publication Number | Publication Date
CN110377914A | 2019-10-25
CN110377914B | 2023-01-06

Family

ID=68255945

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910677203.7A (Active; granted as CN110377914B) (en) | Character recognition method, device and storage medium | 2019-07-25 | 2019-07-25

Country Status (1)

Country | Link
CN (1) | CN110377914B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111144105A (en)* | 2019-12-17 | 2020-05-12 | Zhejiang Dahua Technology Co., Ltd. | Word and sentence processing method and device and computer storage medium
CN113934922A (en)* | 2020-07-14 | 2022-01-14 | China Mobile (Chengdu) Information and Communication Technology Co., Ltd. | Intelligent recommendation method, device, equipment and computer storage medium
CN114612911A (en)* | 2022-01-26 | 2022-06-10 | Harbin Institute of Technology (Shenzhen) | Stroke-level handwritten character sequence recognition method, device, terminal and storage medium
CN114692641A (en)* | 2020-12-28 | 2022-07-01 | Huawei Technologies Co., Ltd. | Method and device for obtaining characters

Citations (13)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JPH0573726A (en)* | 1991-09-13 | 1993-03-26 | Aiwa Co Ltd | Character recognition device
US20120005222A1 (en)* | 2010-06-30 | 2012-01-05 | Varun Bhagwan | Template-based recognition of food product information
US8160839B1 (en)* | 2007-10-16 | 2012-04-17 | Metageek, Llc | System and method for device recognition based on signal patterns
CN102663380A (en)* | 2012-03-30 | 2012-09-12 | Central South University | Method for identifying character in steel slab coding image
CN105549890A (en)* | 2015-12-29 | 2016-05-04 | Tsinghua University | One-dimensional handwritten character input equipment and one-dimensional handwritten character input equipment
CN105608462A (en)* | 2015-12-10 | 2016-05-25 | Xiaomi Technology Co., Ltd. | Character similarity judgment method and device
CN106709490A (en)* | 2015-07-31 | 2017-05-24 | Tencent Technology (Shenzhen) Co., Ltd. | Character recognition method and device
CN107169763A (en)* | 2017-04-26 | 2017-09-15 | Shen Siyuan | Safe payment method and system based on signature recognition
CN109271497A (en)* | 2018-08-31 | 2019-01-25 | South China University of Technology | A kind of event-driven service matching method based on term vector
CN109299269A (en)* | 2018-10-23 | 2019-02-01 | Alibaba Group Holding Limited | A kind of file classification method and device
CN109858039A (en)* | 2019-03-01 | 2019-06-07 | Beijing QIYI Century Science & Technology Co., Ltd. | A kind of text information identification method and identification device
CN109933686A (en)* | 2019-03-18 | 2019-06-25 | Alibaba Group Holding Limited | Song tag estimation method, apparatus, server and storage medium
CN109992783A (en)* | 2019-04-03 | 2019-07-09 | Tongji University | Chinese word vector modeling method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Chris McCormick: "Word2vec Tutorial - The Skip-Gram Model", HTTP://MCCORMICKML.COM/2016/04/19/WORD2VEC TUTORIAL-THE-SKIP-GRAM-MODEL *
Shaosheng Cao et al.: "cw2vec: Learning Chinese Word Embeddings with Stroke n-gram information", AAAI18 *
Han Zhihao; Hou Jun: "Wavelet-transform license plate recognition method under poor illumination", Information Systems Engineering *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111144105A (en)* | 2019-12-17 | 2020-05-12 | Zhejiang Dahua Technology Co., Ltd. | Word and sentence processing method and device and computer storage medium
CN111144105B (en)* | 2019-12-17 | 2023-03-14 | Zhejiang Dahua Technology Co., Ltd. | Word and sentence processing method and device and computer storage medium
CN113934922A (en)* | 2020-07-14 | 2022-01-14 | China Mobile (Chengdu) Information and Communication Technology Co., Ltd. | Intelligent recommendation method, device, equipment and computer storage medium
CN114692641A (en)* | 2020-12-28 | 2022-07-01 | Huawei Technologies Co., Ltd. | Method and device for obtaining characters
CN114612911A (en)* | 2022-01-26 | 2022-06-10 | Harbin Institute of Technology (Shenzhen) | Stroke-level handwritten character sequence recognition method, device, terminal and storage medium
CN114612911B (en)* | 2022-01-26 | 2022-11-29 | Harbin Institute of Technology (Shenzhen) | Stroke-level handwritten character sequence recognition method, device, terminal and storage medium

Also Published As

Publication number | Publication date
CN110377914B | 2023-01-06

Similar Documents

Publication | Publication Date | Title
US12210569B2 (en)Video clip positioning method and apparatus, computer device, and storage medium
US10956771B2 (en)Image recognition method, terminal, and storage medium
US20220004794A1 (en)Character recognition method and apparatus, computer device, and storage medium
CN110147533B (en)Encoding method, apparatus, device and storage medium
CN110110145B (en)Descriptive text generation method and device
CN109829456A (en)Image-recognizing method, device and terminal
CN110377914A (en)Character identifying method, device and storage medium
CN108615526A (en)The detection method of keyword, device, terminal and storage medium in voice signal
CN108304506B (en)Retrieval method, device and equipment
US11995406B2 (en)Encoding method, apparatus, and device, and storage medium
CN111339737B (en)Entity linking method, device, equipment and storage medium
CN109491924A (en)Code detection method, device, terminal and storage medium
CN108922531B (en)Slot position identification method and device, electronic equipment and storage medium
CN111209377B (en)Text processing method, device, equipment and medium based on deep learning
CN110110045A (en)A kind of method, apparatus and storage medium for retrieving Similar Text
CN110162604A (en)Sentence generation method, device, equipment and storage medium
CN108132790A (en)Detect the method, apparatus and computer storage media of dead code
CN110096525A (en)Calibrate method, apparatus, equipment and the storage medium of interest point information
CN112232059B (en)Text error correction method and device, computer equipment and storage medium
CN110490179A (en)Licence plate recognition method, device and storage medium
CN110555102A (en)media title recognition method, device and storage medium
CN113569561A (en)Text error correction method and device, computer equipment and computer readable storage medium
CN113763931B (en)Waveform feature extraction method, waveform feature extraction device, computer equipment and storage medium
CN110232417A (en)Image-recognizing method, device, computer equipment and computer readable storage medium
CN113486260A (en)Interactive information generation method and device, computer equipment and storage medium

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
