Specific embodiment
To make the purposes, technical solutions and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows an exemplary system architecture 100 to which the teaching quality assessment method or the teaching quality assessment apparatus of the present application can be applied.
As shown in Fig. 1, the system architecture 100 may include a first terminal device 100, a first network 101, a server 102, a second network 103 and a second terminal device 104. The first network 101 serves as the medium providing the communication link between the first terminal device 100 and the server 102, and the second network 103 serves as the medium providing the communication link between the second terminal device 104 and the server 102. The first network 101 and the second network 103 may include various types of wired or wireless communication links; for example, a wired communication link includes an optical fiber, a twisted pair or a coaxial cable, and a wireless communication link includes a Bluetooth communication link, a Wireless Fidelity (Wi-Fi) communication link or a microwave communication link, etc.
The first terminal device 100 communicates with the second terminal device 104 through the first network 101, the server 102 and the second network 103. The first terminal device 100 sends a message to the server 102, and the server 102 forwards the message to the second terminal device 104; the second terminal device 104 sends a message to the server 102, and the server 102 forwards the message to the first terminal device 100, thereby realizing communication between the first terminal device 100 and the second terminal device 104. The types of messages exchanged between the first terminal device 100 and the second terminal device 104 include control data and service data.
In the present application, the first terminal device 100 is the terminal used by a student attending class and the second terminal device 104 is the terminal used by the teacher giving the class; alternatively, the first terminal device 100 is the terminal used by the teacher and the second terminal device 104 is the terminal used by the student. For example, the service data is a video stream: the first terminal device 100 captures a first video stream of the student attending class through its camera, and the second terminal device 104 captures a second video stream of the teacher teaching through its camera. The first terminal device 100 sends the first video stream to the server 102, the server 102 forwards the first video stream to the second terminal device 104, and the second terminal device 104 displays the first video stream and the second video stream on its interface; likewise, the second terminal device 104 sends the second video stream to the server 102, the server 102 forwards the second video stream to the first terminal device 100, and the first terminal device 100 displays the first video stream and the second video stream.
The class mode of the present application may be one-to-one or one-to-many, i.e. one teacher corresponds to one student or one teacher corresponds to multiple students. Correspondingly, in the one-to-one teaching mode, one terminal used by the teacher communicates with one terminal used by a student; in the one-to-many teaching mode, one terminal used by the teacher communicates with multiple terminals used by students.
Various communication client applications may be installed on the first terminal device 100 and the second terminal device 104, such as video recording applications, video playing applications, voice interaction applications, search applications, instant messaging tools, mailbox clients, social platform software, etc.
The first terminal device 100 and the second terminal device 104 may be hardware or software. When they are hardware, they may be various electronic devices with a display screen, including but not limited to smart phones, tablet computers, laptop computers and desktop computers, etc. When the first terminal device 100 and the second terminal device 104 are software, they may be installed in the electronic devices listed above; they may be implemented as multiple pieces of software or software modules (for example, for providing distributed services), or as a single piece of software or software module, which is not specifically limited herein.
When the first terminal device 100 and the second terminal device 104 are hardware, a display device and a camera may also be provided thereon. The display device may be any device capable of realizing a display function, and the camera is used to capture video streams. For example, the display device may be a cathode ray tube display (CRT), a light-emitting diode display (LED), an electronic ink screen, a liquid crystal display (LCD), a plasma display panel (PDP), etc. The user may use the display devices on the first terminal device 100 and the second terminal device 104 to view displayed information such as text, pictures and video.
It should be noted that the teaching quality assessment method provided by the embodiments of the present application is generally executed by the server 102; correspondingly, the teaching quality assessment apparatus is generally provided in the server 102. For example, the server 102 detects the facial pose of the student in the first video stream captured by the first terminal device 100 and the facial pose of the teacher in the second video stream captured by the second terminal device 104, and assesses the teaching quality information according to the facial pose of the student and the facial pose of the teacher. In addition, when the duration for which the facial pose of the student is continuously an abnormal pose exceeds a duration threshold, the server 102 sends prompt information to the first terminal device 100 to indicate that the student's attention is not focused; when the duration for which the facial pose of the teacher is continuously an abnormal pose exceeds the duration threshold, the server 102 sends prompt information to the second terminal device 104 to indicate that the teaching quality of the teacher is poor.
The server 102 may be a server providing various services, and may be hardware or software. When the server 102 is hardware, it may be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server 102 is software, it may be implemented as multiple pieces of software or software modules (for example, for providing distributed services), or as a single piece of software or software module, which is not specifically limited herein.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation needs.
The teaching quality assessment method provided by the embodiments of the present application is described in detail below with reference to Fig. 2 to Fig. 7. The teaching quality assessment apparatus in the embodiments of the present application may be the server shown in Fig. 2 to Fig. 7.
Referring to Fig. 2, a schematic flowchart of a teaching quality assessment method is provided for an embodiment of the present application. As shown in Fig. 2, the method of the embodiment of the present application may include the following steps:
S201: obtain a first video stream captured by the first terminal device and a second video stream captured by the second terminal device.
A video stream is continuous time-based media transmitted over the Internet or an intranet using streaming technology. Streaming video does not download the entire file before playback; only the beginning of the content is buffered in memory, the data stream is transmitted and played as it arrives, with only some delay at the start. The first terminal device captures a video stream using an internal or external camera and sends the captured video stream to the server; likewise, the second terminal device captures a video stream using an internal or external camera and sends it to the server.
For example, taking the case where the first terminal device is the terminal used by the student and the second terminal device is the terminal used by the teacher: the first terminal device sends the captured video stream of the student attending class to the server, and the second terminal device sends the captured second video stream of the teacher teaching to the server. The server receives the first video stream from the first terminal device and the second video stream from the second terminal device. The server may splice the first video stream and the second video stream into one video stream, and the first terminal device and the second terminal device play the spliced video stream in a single playback window; alternatively, the server may simply forward the first video stream and the second video stream, and the first terminal device and the second terminal device play the first video stream and the second video stream in two separate playback windows. The first video stream and the second video stream carry different user type identifiers, so that the server can distinguish which stream contains the teacher and which stream contains the student. For example, the packet header of the first video stream carries the user type identifier "student" and the packet header of the second video stream carries the user type identifier "teacher", and the server makes the distinction according to these user type identifiers.
The server may periodically acquire the first video stream and the second video stream. The first video stream and the second video stream correspond to the same start time and end time, and the lengths of the first video stream and the second video stream are a preset length. The preset length and the acquisition period may be set according to actual needs, and are not limited in the present application. For example, the server acquires a first video stream and a second video stream of 5 minutes in length every 6 minutes.
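For illustration only, the following is a minimal sketch of such a periodic acquisition loop in Python; the helper names fetch_stream_segment and process_segments are assumptions standing in for whatever streaming interface the server actually exposes, not an implementation prescribed by the present application.

```python
import time

ACQUIRE_PERIOD_S = 6 * 60   # acquisition period: every 6 minutes (example from the text)
SEGMENT_LENGTH_S = 5 * 60   # each acquired segment is 5 minutes long

def fetch_stream_segment(terminal_id: str, length_s: int) -> bytes:
    """Hypothetical helper: pull a segment of the given length from a terminal's live stream."""
    raise NotImplementedError

def process_segments(first_segment: bytes, second_segment: bytes) -> None:
    """Placeholder for the downstream steps S202-S204 (frame extraction, pose recognition, assessment)."""

def acquisition_loop(student_terminal: str, teacher_terminal: str) -> None:
    while True:
        # both segments share the same start and end time, as required above
        first_segment = fetch_stream_segment(student_terminal, SEGMENT_LENGTH_S)
        second_segment = fetch_stream_segment(teacher_terminal, SEGMENT_LENGTH_S)
        process_segments(first_segment, second_segment)
        time.sleep(ACQUIRE_PERIOD_S)
```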
S202: extract the video frames in the first video stream to obtain a first video frame set, and extract the video frames in the second video stream to obtain a second video frame set.
The first video stream and the second video stream each include multiple video frames. Taking the first video stream as an example, the server may extract all video frames in the first video stream as the first video frame set, or may extract some of the video frames in the first video stream as the first video frame set according to a preset rule.
In one embodiment, the server only extracts the key frames in the first video stream and takes the extracted I-frames as the first video frame set; likewise, it only extracts the key frames in the second video stream as the second video frame set. Further, the server may extract key frames from the first video stream at a preset sampling interval as the first video frame set, and extract key frames from the second video stream at the preset sampling interval as the second video frame set. A video frame in the first video frame set or the second video frame set may or may not contain a face.
After extracting the first video frame set and the second video frame set, the server may detect whether a face exists in each video frame. The cases in which a face exists may be divided into: a complete face, a partial face, and a face occluded by an obstruction. The server may detect whether a face exists in a video frame according to a face detection algorithm, and mark the face if one is detected. Face detection algorithms include recognition algorithms based on facial feature points, recognition algorithms based on the whole face image, recognition algorithms based on templates, algorithms using neural networks, and algorithms using support vector machines, which are not limited in the present application.
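As an illustration of one possible face detection step (the present application does not limit the algorithm), the following sketch uses a stock OpenCV Haar cascade; any of the algorithms listed above could be substituted.

```python
import cv2

# Stock frontal-face Haar cascade shipped with OpenCV.
_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(frame):
    """Return a list of (x, y, w, h) boxes for faces found in one BGR video frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return _detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
```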
The server distinguishes, according to the user type identifier of the first video stream and the user type identifier of the second video stream, that the first video stream is generated by capturing the student attending class and the second video stream is generated by capturing the teacher teaching.
S203: identify the facial pose of each video frame in the first video frame set and identify the facial pose of each video frame in the second video frame set.
A facial pose is an expression of the degree of deflection of a person's head relative to a reference axis. Identifying the facial pose is the process by which the server estimates the facial deflection angle from a video frame, the deflection angle indicating the angle of the face around the y-axis. In terms of estimation precision, pose estimation may be divided into two broad categories: coarse estimation and fine estimation. The embodiment of the present application only identifies the deflection angle of a face when a complete, unoccluded face appears in the video frame, i.e. horizontal deflection angle estimation is only performed for a complete face, and the face is identified as a frontal face pose or a side face pose according to the deflection angle. If a video frame includes a partial face, it is further identified whether the partial face is caused by occlusion or by exceeding the capture range of the camera; if no face exists in the video frame, the facial pose is identified as a no-face pose.
In the coarse estimation mode, facial pose estimation is the process of roughly estimating the direction in which the person's face is deflected, for example: the face is deflected to the left or deflected upward.
In the fine estimation mode, facial pose estimation is the precise estimation of the deflection angle in three-dimensional space, that is, the deflection angle of the head relative to a coordinate plane. In the ideal case, for the three coordinate axes, the facial pose ranges are: -90° to 90° around the X-axis, -90° to 90° around the Y-axis, and -90° to 90° around the Z-axis.
In one embodiment, the facial pose is estimated by a template-based method: the image to be identified is compared with known facial pose templates to obtain the facial pose.
In one embodiment, the facial pose is estimated based on a detector array: multiple different face detectors are trained from samples to achieve face detection at different angles. For facial poses, the principle of the detector array approach is to train multiple facial pose classifiers through support vector machines or the AdaBoost cascade iteration algorithm, so as to detect different facial poses.
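A minimal sketch of the detector-array idea is given below, with OpenCV's stock frontal and profile cascades standing in for the trained pose-specific detectors; a production system would train its own classifiers as described above, so this is an assumption for illustration only.

```python
import cv2

# Two stock cascades stand in for the trained pose-specific detectors of the array.
_frontal = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
_profile = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_profileface.xml")

def coarse_pose(frame) -> str:
    """Return a coarse pose label for one BGR video frame: the first detector that fires wins."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if len(_frontal.detectMultiScale(gray, 1.1, 5)) > 0:
        return "frontal_face"
    if len(_profile.detectMultiScale(gray, 1.1, 5)) > 0:
        return "side_face"
    return "no_face"
```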
In one embodiment, the facial pose is estimated based on an elastic model. In facial pose estimation, the images of two different people are never entirely consistent, because the positions of facial features vary from person to person. To train this kind of algorithm, a deformable image based on local feature points (eye corner points, nose tip point, mouth corner points, etc.) is used as a template: facial feature points are manually marked on each training face image, and on each local feature a descriptor is extracted using Gabor jets. These feature points are extracted from multiple different viewing angles, and additional invariance is obtained from the series of descriptors stored at each node; these descriptors are referred to as an elastic bunch graph. To compare a bunch graph with a new face image, the bunch graph is placed on the face image, and the position of each graph node is deformed exhaustively or repeatedly to find the shortest distance between corresponding feature points; this process is elastic graph matching.
In facial pose estimation, a different bunch graph is created for each pose, and each bunch graph is compared with the facial pose image to be tested; the bunch graph with maximum similarity assigns a discrete head pose. Since elastic graph matching (EGM) uses the feature points of key facial regions, it greatly reduces the variation between individuals, which makes the similarity between models correspond more closely to the facial pose than when unadjusted facial regions are used.
S204: assess the teaching quality according to the facial pose of each video frame in the first video frame set and/or the facial pose of each video frame in the second video frame set.
In general, teaching quality is a measure of the effect of the teaching process. The teaching quality may be expressed using multiple quality levels, and the number of quality levels may be set according to actual needs, which is not limited in the present application.
For example, the teaching quality information is divided into 2 quality levels, normal and poor; as another example, the teaching quality information is divided into 3 quality levels, excellent, average and poor. The server may use a pre-trained teaching quality assessment model to evaluate the teaching quality information corresponding to the first video stream and/or the second video stream. The teaching quality assessment model is obtained by training with a training sample set, where the training sample set includes training samples of multiple facial poses and quality level labels, the quality level label indicating the quality level of the training sample, for example: 0 indicates poor, 1 indicates average, and 2 indicates excellent.
In one embodiment, the teaching quality information is divided into two quality levels. The facial pose of a frame of the first video stream is also referred to as a first facial pose, and the facial pose of a frame of the second video stream is also referred to as a second facial pose. When the server monitors that the duration for which the first facial pose and/or the second facial pose is continuously an abnormal pose is greater than the duration threshold, the teaching quality is determined to be poor; otherwise, the teaching quality is normal.
When the scheme of the embodiment of the present application is executed, the server obtains the student video stream and/or the teacher video stream during class, detects the faces in the student video stream and the teacher video stream, identifies the facial poses of the faces, and assesses the teaching quality according to the facial poses. On one hand, the present application can assess the teaching quality in real time, so that the teaching quality fed back in real time can provide a reference for subsequent teaching and problems in the teaching process can be corrected in time, improving the quality and efficiency of the whole teaching process. On the other hand, the teaching quality is assessed through the facial pose; as a significant biological characteristic, the facial pose can be identified in a contactless manner with a simple recognition algorithm, which can improve the accuracy of the identification process.
Referring to Fig. 3, a schematic flowchart of a teaching quality assessment method is provided for an embodiment of the present application. This embodiment is illustrated with the teaching quality assessment method applied to a server. The teaching quality assessment method may include the following steps:
S301: perform model training according to a facial pose sample set to obtain a facial pose recognition model.
The facial pose sample set includes multiple facial pose samples, each facial pose sample carrying a pose label. The facial pose sample set includes samples of multiple different facial poses, and the numbers of samples of the respective poses may be equal or approximately equal, so as to improve the accuracy of the model training.
In one embodiment, the sample types in the facial pose sample set include: no-face pose samples, frontal face pose samples, side face pose samples, occluded pose samples and out-of-frame pose samples. A no-face pose sample indicates that no face exists in the video frame; a frontal face pose sample indicates that a complete face exists in the video frame and the absolute value of the deflection angle of the face is less than an angle threshold; a side face pose sample indicates that a complete face exists in the video frame and the absolute value of the deflection angle of the face is greater than the angle threshold; an occluded pose sample indicates that a face exists in the video frame but is occluded by an obstruction; an out-of-frame pose sample indicates that only a partial face exists in the video frame because the face exceeds the capture range of the camera. For example, the numbers of no-face pose samples, frontal face pose samples, side face pose samples, occluded pose samples and out-of-frame pose samples in the facial pose sample set are each 1000; with the numbers of samples balanced in this way, the facial pose recognition model can be trained faster.
In general, the facial pose recognition model may be a Gaussian mixture model, a neural network model or a hidden Markov model.
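For illustration, a minimal training sketch is given below, assuming the neural-network variant and a feature matrix already extracted from the labeled sample set; scikit-learn's MLPClassifier and the hyperparameters shown are assumptions, not values prescribed by the present application.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

POSE_LABELS = ["no_face", "frontal_face", "side_face", "occluded", "out_of_frame"]

def train_pose_model(features: np.ndarray, labels: np.ndarray) -> MLPClassifier:
    """features: (n_samples, n_dims) face feature vectors; labels: integer indices into POSE_LABELS."""
    model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
    model.fit(features, labels)
    return model
```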
S302: periodically obtain the first video stream captured by the first terminal device.
The first terminal device may capture the first video stream through a camera and send the captured first video stream to the server. Starting from the class start time, the server periodically acquires the first video stream from the first terminal device. The server periodically acquires a first video stream of a preset length; the acquisition period and the preset length may be set according to actual needs, and are not limited in the present application.
S303: extract the key frames in the first video stream to form the first video frame set.
The key frames in the first video stream are extracted to form the first video frame set. A key frame contains complete image information and can be decompressed and decoded without relying on other video frames; the video frames in the first video frame set are the images obtained by decompressing and decoding the key frames.
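A minimal key-frame extraction sketch is given below, assuming the stream segment has been saved to a file and the PyAV library is available; it decodes only the I-frames, as described above.

```python
import av  # PyAV

def extract_key_frames(segment_path: str):
    """Decode only the key frames (I-frames) of a saved stream segment into BGR images."""
    frames = []
    with av.open(segment_path) as container:
        stream = container.streams.video[0]
        stream.codec_context.skip_frame = "NONKEY"  # the decoder drops everything but key frames
        for frame in container.decode(stream):
            frames.append(frame.to_ndarray(format="bgr24"))
    return frames
```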
S304: detect a first face in each video frame in the first video frame set.
Face detection scans the video frame and, if a face is found, returns the position information of the face; face detection is the important foundation of subsequent face analysis. For example, face detection needs to be performed first, and only then can face recognition be performed. If no first face is detected in a video frame, the facial pose of the video frame is determined to be the no-face pose; if a first face is detected in a video frame, the facial pose is further identified.
In some embodiments, a series of rules describing the relationships between facial features are constructed according to existing prior knowledge of faces. A face has features such as symmetrical eyes, eyebrows, a nose and a mouth, and the relative positions or relative distances between these features are, to a certain extent, fixed. The knowledge-based method screens face candidates one by one using the constructed criteria, and finally realizes face detection.
In some embodiments, the feature-based method searches for invariants that remain stable under changing conditions such as different viewing angles, different poses and different illumination, and uses these to find the face. For example, the face may be detected from one or more of edge features, texture features and color features.
In some embodiments, a pre-stored face standard template or facial feature template is used: when detecting a face, the image to be detected is compared with the face standard template. The face template or facial feature template needs to be configured in advance.
S305: a first face is detected.
The first face is detected in the video frame. The detected first face may be a complete face or a partial face; the facial pose is subsequently identified according to the deflection angle, the degree of occlusion and the degree of completeness of the first face.
S306: extract the feature information of the first face.
The feature information of the first face includes color feature information, texture feature information and shape feature information, and the feature information may be represented by a multi-dimensional vector.
S307: input the feature information of the first face into the facial pose recognition model to obtain a facial pose recognition result.
The facial pose recognition result covers multiple facial poses. The facial pose recognition result output by the facial pose recognition model is a score value within a preset range. For example, the preset range is between 0 and 1, and different facial poses are preconfigured with different value intervals; the value interval in which the score value output by the facial pose recognition model falls is determined, and the facial pose associated with that value interval is obtained.
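A minimal sketch of mapping the model's score to a pose via preconfigured value intervals is given below; the interval boundaries shown are illustrative assumptions, not values prescribed by the present application.

```python
# Illustrative interval boundaries; the actual configuration is left open by the method.
POSE_INTERVALS = [
    (0.0, 0.2, "no_face"),
    (0.2, 0.4, "out_of_frame"),
    (0.4, 0.6, "occluded"),
    (0.6, 0.8, "side_face"),
    (0.8, 1.0, "frontal_face"),
]

def pose_from_score(score: float) -> str:
    """Map the model's score (within the preset range 0..1) to its associated facial pose."""
    for low, high, pose in POSE_INTERVALS:
        if low <= score <= high:
            return pose
    raise ValueError(f"score {score} is outside the preset range")
```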
S308: periodically obtain the second video stream captured by the second terminal device.
The second video stream is captured by the second terminal device through a camera; the second terminal device sends the captured second video stream to the server, and the server receives the second video stream from the second terminal device. The server periodically acquires the second video stream from the second terminal device, and the period with which the server acquires the first video stream is the same as the period with which it acquires the second video stream.
For example, as shown in Fig. 4, when the server detects that the preset class start time has been reached, it creates a virtual classroom and adds the first terminal device and the second terminal device to the virtual classroom. The first terminal device starts its camera to capture the first video stream and sends the first video stream to the server; the second terminal device starts its camera to capture the second video stream and sends the second video stream to the server. The first video stream and the second video stream are real-time continuous media streams. The server acquires video streams of preset duration T1 with the same period T2; the values of T1 and T2 may be set according to actual needs, and t0 is the class start time.
S309: extract the key frames in the second video stream to form the second video frame set.
In order to reduce the amount of computation, the server does not need to process every captured video frame; it extracts the key frames in the second video stream to form the second video frame set. A key frame has a complete picture and can be decoded without relying on other frames; a key frame is usually an I-frame. In general, each video frame in the second video frame set is an image obtained by decompressing and decoding a key frame.
S310: detect a second face in each video frame in the second video frame set.
Face detection determines whether a face exists in a video frame; if a face exists, the position information of the face is returned. If no face exists, the facial pose corresponding to the video frame is determined to be the no-face pose.
S311: a second face is detected.
S312: extract the feature information of the second face.
S313: input the feature information of the second face into the facial pose recognition model to obtain a facial pose recognition result.
The detailed process of S309 to S313 may refer to the description of S303 to S307, and is not repeated here.
In one embodiment, the method by which the server identifies the facial pose may also be:
calculating a similarity value between a video frame to be identified and a preset facial pose template, and, when the similarity value is greater than a similarity threshold, determining that the facial pose of the video frame to be identified is the facial pose associated with that facial pose template; the video frame to be identified is any video frame in the first video frame set or the second video frame set.
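A minimal sketch of this template-matching variant is given below, assuming grayscale pose templates no larger than the input frame and a normalized cross-correlation score as the similarity value; the threshold shown is an illustrative assumption.

```python
from typing import Dict, Optional

import cv2
import numpy as np

SIMILARITY_THRESHOLD = 0.7  # illustrative value

def match_pose(frame_gray: np.ndarray, templates: Dict[str, np.ndarray]) -> Optional[str]:
    """templates maps a pose name to a grayscale template image no larger than frame_gray."""
    best_pose, best_score = None, -1.0
    for pose, template in templates.items():
        # normalized cross-correlation as the similarity value
        score = float(cv2.matchTemplate(frame_gray, template, cv2.TM_CCOEFF_NORMED).max())
        if score > best_score:
            best_pose, best_score = pose, score
    return best_pose if best_score > SIMILARITY_THRESHOLD else None
```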
Fig. 5 is a schematic diagram of the interface of the first terminal device or the second terminal device; the first terminal device is taken as an example. The first terminal device is provided with a camera 50. When the class start time arrives, the first terminal device and the second terminal device are added to the virtual classroom. The first video stream captured by the camera 50 of the first terminal device is displayed in a first window 51; the second video stream captured by the camera of the second terminal device is sent to the first terminal device and displayed in a second window 52 of the first terminal device. The interface of the first terminal device further includes a chat window 53, a text input box 54 and a send button 55. The chat window is used to display the chat record between the user of the first terminal device and the user of the second terminal device, the text input box is used by the user to input information such as text, pictures, videos and emoticons, and the send button 55 is used to send the information in the text input box 54.
Fig. 6A to Fig. 6C are schematic diagrams of the respective facial poses. Fig. 6A is a schematic diagram of the frontal face pose: the deflection angle of the face around the y-axis lies between -90° and +90°, and the frontal face pose indicates that the absolute value of the deflection angle of the face around the y-axis is less than an angle threshold, for example 20°. Fig. 6B is a schematic diagram of the side face pose: the side face pose indicates that the absolute value of the deflection angle of the face around the y-axis is greater than the angle threshold. Fig. 6C is a schematic diagram of the out-of-frame pose: the out-of-frame pose indicates that only a partial face exists in the video frame because part of the face is beyond the capture range of the camera.
S314: assess the teaching quality.
The multiple facial poses are divided in advance into abnormal poses and normal poses. In one embodiment, the multiple facial poses include the frontal face pose, the side face pose, the occluded pose, the out-of-frame pose and the no-face pose; the normal pose includes the frontal face pose, and the abnormal poses include the side face pose, the occluded pose, the out-of-frame pose and the no-face pose. A first duration for which the facial pose in the first video frame set is continuously an abnormal pose and a second duration for which the facial pose in the second video frame set is continuously an abnormal pose are calculated. When the first duration and/or the second duration is greater than the duration threshold, prompt information indicating that the teaching quality is poor is generated; when the first duration and the second duration are both less than the duration threshold, prompt information indicating that the teaching quality is normal is generated.
The number of video frames in the first video frame set and the second video frame set whose facial pose is an abnormal pose may be counted; since the duration of each video frame is known, the duration of the abnormal pose can be determined from the number of abnormal-pose video frames and the duration of each frame.
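A minimal sketch of this assessment step is given below, under the assumptions that the pose label of each key frame is already available in time order, that each key frame covers a known fixed interval, and that the frame interval and duration threshold values shown are illustrative only.

```python
ABNORMAL_POSES = {"side_face", "occluded", "out_of_frame", "no_face"}

def longest_abnormal_duration(poses, frame_interval_s: float) -> float:
    """poses: pose label of each key frame in time order; each key frame covers frame_interval_s seconds."""
    longest = current = 0
    for pose in poses:
        current = current + 1 if pose in ABNORMAL_POSES else 0
        longest = max(longest, current)
    return longest * frame_interval_s

def assess(student_poses, teacher_poses, frame_interval_s=2.0, duration_threshold_s=60.0) -> str:
    """Return the prompt to generate; the threshold and frame interval are illustrative values."""
    first_duration = longest_abnormal_duration(student_poses, frame_interval_s)
    second_duration = longest_abnormal_duration(teacher_poses, frame_interval_s)
    if first_duration > duration_threshold_s or second_duration > duration_threshold_s:
        return "teaching quality is poor"
    return "teaching quality is normal"
```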
In one embodiment, the server may send information indicating that the teaching quality is normal to the first terminal device and the second terminal device, so as to feed the current teaching quality back to the teacher or the student in real time.
In one embodiment, when the first duration is greater than the duration threshold, the server may send information indicating that the teaching quality is poor to the first terminal device; when the second duration is greater than the duration threshold, the server may send information indicating that the teaching quality is poor to the second terminal device, so as to feed the current teaching quality back to the users of the first terminal device and the second terminal device in time.
By implementing the embodiments of the present application, the server obtains the student video stream and/or the teacher video stream during class, detects the faces in the student video stream and the teacher video stream, identifies the facial poses of the faces, and assesses the teaching quality according to the facial poses. On one hand, the present application can assess the teaching quality in real time, so that the teaching quality fed back in real time can provide a reference for subsequent teaching and problems in the teaching process can be corrected in time, improving the quality and efficiency of the whole teaching process. On the other hand, the teaching quality is assessed through the facial pose; as a significant biological characteristic, the facial pose can be identified in a contactless manner with a simple recognition algorithm, which can improve the accuracy of the identification process.
The following are apparatus embodiments of the present application, which may be used to execute the method embodiments of the present application. For details not disclosed in the apparatus embodiments of the present application, please refer to the method embodiments of the present application.
Referring to Fig. 7, it shows a schematic structural diagram of a teaching quality assessment apparatus provided by an exemplary embodiment of the present application, hereinafter referred to as apparatus 7. The apparatus 7 may be implemented as all or part of a terminal through software, hardware or a combination of both. The apparatus 7 includes a video acquisition unit 701, a video extraction unit 702, a pose recognition unit 703 and a teaching assessment unit 704.
The video acquisition unit 701 is configured to obtain the first video stream captured by the first terminal device and the second video stream captured by the second terminal device.
The video extraction unit 702 is configured to extract the video frames in the first video stream to obtain the first video frame set and extract the video frames in the second video stream to obtain the second video frame set.
The pose recognition unit 703 is configured to identify the facial pose of each video frame in the first video frame set and identify the facial pose of each video frame in the second video frame set.
The teaching assessment unit 704 is configured to assess the teaching quality according to the facial pose of each video frame in the first video frame set and/or the facial pose of each video frame in the second video frame set.
The identifying of the facial pose of each video frame in the first video frame set and the identifying of the facial pose of each video frame in the second video frame set includes:
calculating a similarity value between a video frame to be identified and a preset facial pose template, and, when the similarity value is greater than a similarity threshold, determining that the facial pose of the video frame to be identified is the facial pose associated with that facial pose template; wherein the video frame to be identified is any video frame in the first video frame set or the second video frame set.
In one embodiment, the pose recognition unit 703 is configured to:
perform feature extraction on a video frame to be identified to obtain an image feature, wherein the video frame to be identified is any video frame in the first video frame set or the second video frame set; and
input the image feature into a preset facial pose recognition model to obtain a facial pose recognition result.
For example, feature extraction is performed on a first video frame to obtain a first image feature, and the first image feature is input into the preset facial pose recognition model to obtain the facial pose recognition result.
In one embodiment, the facial poses include a frontal face pose, a side face pose, an occluded pose, a no-face pose and an out-of-frame pose. The frontal face pose indicates that the absolute value of the deflection angle of the face is less than the angle threshold; the side face pose indicates that the absolute value of the deflection angle of the face is greater than the angle threshold; the occluded pose indicates that the face is occluded by an obstruction; the out-of-frame pose indicates that part of the face is outside the capture range; and the no-face pose indicates that no face exists in the video frame.
In one embodiment, the teaching assessment unit 704 is configured to:
calculate a first duration for which the facial pose in the first video frame set is continuously an abnormal pose and a second duration for which the facial pose in the second video stream is continuously an abnormal pose, the abnormal pose including the side face pose, the occluded pose, the no-face pose or the out-of-frame pose;
generate prompt information indicating that the teaching quality is poor when the first duration and/or the second duration is greater than the duration threshold; or
generate prompt information indicating that the teaching quality is normal when the first duration and the second duration are both less than the duration threshold.
In one embodiment, the video extraction unit 702 is configured to:
extract the key frames in the first video stream to form the first video frame set, and extract the key frames in the second video stream to form the second video frame set.
In one embodiment, the video acquisition unit 701 is configured to:
periodically acquire the first video stream from the first terminal device and the second video stream from the second terminal device; wherein the lengths of the first video stream and the second video stream are a preset duration.
It should be noted that, when the apparatus 7 provided by the above embodiments executes the teaching quality assessment method, the division into the above functional modules is only used as an example; in practical applications, the above functions may be assigned to different functional modules as needed, i.e. the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiment and the teaching quality assessment method embodiments provided by the above embodiments belong to the same concept; for details of the implementation process, refer to the method embodiments, which are not repeated here.
The serial numbers of the above embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments.
The apparatus 7 of the present application obtains the student video stream and/or the teacher video stream during class, detects the faces in the student video stream and the teacher video stream, identifies the facial poses of the faces, and assesses the teaching quality according to the facial poses. On one hand, the present application can assess the teaching quality in real time, so that the teaching quality fed back in real time can provide a reference for subsequent teaching and problems in the teaching process can be corrected in time, improving the quality and efficiency of the whole teaching process. On the other hand, the teaching quality is assessed through the facial pose; as a significant biological characteristic, the facial pose can be identified in a contactless manner with a simple recognition algorithm, which can improve the accuracy of the identification process.
The embodiments of the present application also provide a computer storage medium. The computer storage medium may store multiple instructions, the instructions being suitable for being loaded by a processor to execute the method steps of the embodiments shown in Fig. 2 to Fig. 6C; for the specific execution process, refer to the description of the embodiments shown in Fig. 2 to Fig. 6C, which is not repeated here.
The present application also provides a computer program product storing at least one instruction, the at least one instruction being loaded and executed by the processor to implement the teaching quality assessment method described in each of the above embodiments.
Fig. 8 is a schematic structural diagram of a teaching quality assessment apparatus provided by an embodiment of the present application, hereinafter referred to as apparatus 8. The apparatus 8 may be integrated in the aforementioned server. As shown in Fig. 8, the apparatus includes: a memory 802, a processor 801, an input device 803, an output device 804 and a communication interface.
The memory 802 may be an independent physical unit, and may be connected to the processor 801, the input device 803 and the output device 804 through a bus. The memory 802, the processor 801 and the transceiver may also be integrated together and implemented in hardware, etc.
The memory 802 is used to store a program implementing the above method embodiments or the modules of the apparatus embodiments, and the processor 801 calls the program to perform the operations of the above method embodiments.
The input device 803 includes but is not limited to a keyboard, a mouse, a touch panel, a camera and a microphone; the output device includes but is not limited to a display screen.
The communication interface is used to send and receive various types of messages, and includes but is not limited to a wireless interface or a wired interface.
Optionally, when some or all of the method of the above embodiments is implemented through software, the apparatus may also include only a processor. The memory for storing the program is located outside the apparatus, and the processor is connected to the memory through a circuit/wire for reading and executing the program stored in the memory.
The processor may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
The processor may further include a hardware chip. The above hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD) or a combination thereof. The above PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL) or any combination thereof.
The memory may include volatile memory, such as random-access memory (RAM); the memory may also include non-volatile memory, such as flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); the memory may also include a combination of the above kinds of memory.
The processor 801 calls the program code in the memory 802 to execute the following steps:
obtaining a first video stream captured by the first terminal device and a second video stream captured by the second terminal device;
extracting the video frames in the first video stream to obtain a first video frame set and extracting the video frames in the second video stream to obtain a second video frame set;
identifying the facial pose of each video frame in the first video frame set and identifying the facial pose of each video frame in the second video frame set;
assessing the teaching quality according to the facial pose of each video frame in the first video frame set and/or the facial pose of each video frame in the second video frame set.
In one embodiment, when the processor 801 performs the identifying of the facial pose of each video frame in the first video frame set and the identifying of the facial pose of each video frame in the second video frame set, this includes:
calculating a similarity value between a video frame to be identified and a preset facial pose template, and, when the similarity value is greater than a similarity threshold, determining that the facial pose of the video frame to be identified is the facial pose associated with that facial pose template; wherein the video frame to be identified is any video frame in the first video frame set or the second video frame set.
In one embodiment, when the processor 801 performs the identifying of the facial pose of each video frame in the first video frame set and the identifying of the facial pose of each video frame in the second video frame set, this includes:
performing feature extraction on a video frame to be identified to obtain an image feature, wherein the video frame to be identified is any video frame in the first video frame set or the second video frame set; and
inputting the image feature into a preset facial pose recognition model to obtain a facial pose recognition result.
For example, feature extraction is performed on a first video frame to obtain a first image feature, and the first image feature is input into the preset facial pose recognition model to obtain the facial pose recognition result.
In one embodiment, the facial poses include a frontal face pose, a side face pose, an occluded pose, a no-face pose and an out-of-frame pose. The frontal face pose indicates that the absolute value of the deflection angle of the face is less than the angle threshold; the side face pose indicates that the absolute value of the deflection angle of the face is greater than the angle threshold; the occluded pose indicates that the face is occluded by an obstruction; the out-of-frame pose indicates that part of the face is outside the capture range; and the no-face pose indicates that no face exists in the video frame.
In one embodiment, when the processor 801 performs the assessing of the teaching quality according to the facial pose of the first video frame and/or the facial pose of the second video frame, this includes:
calculating a first duration for which the facial pose in the first video frame set is continuously an abnormal pose and a second duration for which the facial pose in the second video stream is continuously an abnormal pose, the abnormal pose including the side face pose, the occluded pose, the no-face pose or the out-of-frame pose;
generating prompt information indicating that the teaching quality is poor when the first duration and/or the second duration is greater than the duration threshold; or
generating prompt information indicating that the teaching quality is normal when the first duration and the second duration are both less than the duration threshold.
In one embodiment, when the processor 801 performs the extracting of the video frames in the first video stream to obtain the first video frame set and the extracting of the video frames in the second video stream to obtain the second video frame set, this includes:
extracting the key frames in the first video stream to form the first video frame set, and extracting the key frames in the second video stream to form the second video frame set.
In one embodiment, when the processor 801 performs the obtaining of the first video stream captured by the first terminal device and the second video stream captured by the second terminal device, this includes:
periodically acquiring the first video stream from the first terminal device and the second video stream from the second terminal device; wherein the lengths of the first video stream and the second video stream are a preset duration.
The embodiments of the present application also provide a computer storage medium storing a computer program, the computer program being used to execute the teaching quality assessment method provided by the above embodiments.
The embodiments of the present application also provide a computer program product containing instructions which, when run on a computer, cause the computer to execute the teaching quality assessment method provided by the above embodiments.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Therefore, the present application may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical memory, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of guiding a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured article including an instruction apparatus, the instruction apparatus realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and thus the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.