CN107480161A

Movatterモバイル変換

Info

Publication number: CN107480161A
Application number: CN201710391293.4A
Authority: CN
Inventors: R·M·奥尔; M·P·贝纳多; D·J·曼德尔
Original assignee: Apple Computer Inc
Current assignee: Apple Inc
Priority date: 2016-06-08
Filing date: 2017-05-27
Publication date: 2017-12-15

Abstract

Embodiment of the disclosure is related to the intelligent automation assistant probed into for media.It is used to operate intelligent automation assistant to probe into the system of media item and process the invention provides a kind of.In an example process, the phonetic entry for representing the request to one or more media items is received from user.The process determines whether phonetic entry corresponds to the user view for obtaining the personalized recommendation for media item.In response to determining that phonetic entry corresponds to user view of the acquisition for the personalized recommendation of media item, at least one media item is obtained from the corpus specific to user of media item.The corpus specific to user of media item is generated based on the data associated with user.At least one media item is provided.

Description

The intelligent automation assistant probed into for media

The cross reference of related application

Entitled " the INTELLIGENT AUTOMATED that patent application claims were submitted on June 8th, 2016ASSISTANT FOR MEDIA EXPLORATION " U.S.Provisional Serial 62/347,480；In September 15 in 2016Day submit entitled " the INTELLIGENT AUTOMATED ASSISTANT FOR MEDIA EXPLORATION " U.S. is non-Provisional application Ser.No 15/266,956；And the entitled " " INTELLIGENT submitted on May 15th, 2017AUTOMATED ASSISTANT FOR MEDIA EXPLORATION " Danish Patent Application sequence number PA201770338's is excellentFirst weigh, all these patent applications are incorporated by reference in its entirety herein for all purposes accordingly.

Technical field

Present invention relates generally to intelligent automation assistant, and more particularly relate to the intelligent automation that media are probed intoAssistant.

Background technology

Intelligent automation assistant (or digital assistants) can provide favourable interface between human user and electronic equipment.ThisClass assistant can allow user to be interacted using speech form and/or the natural language of textual form with equipment or system.For example,User can be provided the phonetic entry asked comprising user to the digital assistants run on an electronic device.The digital assistants can rootExplain user view according to phonetic entry and user view is transformed into task.Then, can be by performing electronic equipmentOne or more services perform task, and the correlation output that can will be responsive to user's request is back to user.

When managing music or other media, digital assistants can help to search for or play back specific media, particularly existUnder handsfree environment.Specifically, digital assistants effectively can respond to play specific media item, such as by title to requestOr the corpus that clearly identifies of artist or song.However, digital assistants are difficult to based on fuzzy open natural language requestSuch as the media item of correlation is found for recommending the request of song or corpus.

The content of the invention

It is used to operate intelligent automation assistant to probe into the system of media item and process the invention provides a kind of.At oneIn example process, the phonetic entry for representing the request to one or more media items is received from user.The process determinesWhether phonetic entry corresponds to the user view for obtaining the personalized recommendation for media item.In response to determining that phonetic entry is correspondingIn the user view for obtaining the personalized recommendation for media item, at least one is obtained from the corpus specific to user of media itemIndividual media item.The corpus specific to user of media item is generated based on the data associated with user.There is provided this at least oneIndividual media item.

Brief description of the drawings

Fig. 1 is to show to be used to realize the system of digital assistants and the block diagram of environment according to various examples.

Fig. 2A is the portable multifunction device for showing the client-side aspects for realizing digital assistants according to various examplesBlock diagram.

Fig. 2 B are the block diagram for showing the example components for event handling according to various examples.

Fig. 3 shows the portable multifunction device of the client-side aspects for realizing digital assistants according to various examples.

Fig. 4 is the block diagram for showing the exemplary multifunctional equipment with display and touch sensitive surface according to various examples.

Fig. 5 A show the example user of the menu of the application program on the portable multifunction device according to various examplesInterface.

Fig. 5 B are shown according to the exemplary of the multifunctional equipments with the touch sensitive surface separated with display of various examplesUser interface.

Fig. 6 A show the personal electronic equipments according to various examples.

Fig. 6 B are the block diagram for showing the personal electronic equipments according to various examples.

Fig. 7 A are the block diagram for showing digital assistant or its server section according to various examples.

Fig. 7 B show the function of the digital assistants as shown in Figure 7 A according to various examples.

Fig. 7 C show a part for the ontologies according to various examples.

Fig. 8 A-C show the process for being used for the digital assistants that media are probed into according to the operation of various examples.

Fig. 9 A-B show to operate the digital assistants probed into for media according to the user of various examples.

Figure 10 shows to operate the digital assistants probed into for media according to the user of various examples.

Figure 11 shows to operate the digital assistants probed into for media according to the user of various examples.

Figure 12 shows the functional block diagram of the electronic equipment according to various examples.

Embodiment

Accompanying drawing will be quoted in below to the description of example, shows the spy that can be carried out by way of illustration in the accompanying drawingsDetermine example.It should be appreciated that in the case where not departing from the scope of each example, other examples can be used and can make structuralChange.

The routine techniques that media content is probed into using digital assistants is general cumbersome and poorly efficient.Specifically, natural language shapeThe media association requests of formula are for example excessively wide in range or fuzzy, and therefore, it is difficult to be accurately inferred to and the corresponding user view of request.ExampleSuch as, media association requests " it is pleasing to the ear to play some for me " are fuzzy and open, and therefore utilize prior art, numeralAssistant may retrieve the media item incompatible with user preference, and excessive media item may be presented to user, or may be notReturn to any content.This can cause a large amount of follow-up interactions between user and digital assistants, to clarify user view.This can be rightConsumer's Experience has a negative impact.In addition, a large amount of follow-up interactions are poorly efficient relative to the energy consumption of equipment.This consideration forBattery-driven equipment is especially important.

According to some systems, computer-readable media and process as described herein, digital assistants are with more efficient and accurateMode is probed into perform media.In an example process, received from user for representing to one or more media itemsThe phonetic entry of request.The process determines whether phonetic entry corresponds to the user's meaning obtained for the personalized recommendation of media itemFigure.In response to determining that phonetic entry corresponds to user view of the acquisition for the personalized recommendation of media item, from the spy of media itemCorpus due to user obtains at least one media item.At least one media item uses the media sequence mould specific to userType and be acquired.Generated based on the data associated with user media item specific to the corpus of user or specific to userMedia order models.Then at least one media item is provided to user.By using the language specific to user of media itemExpect that storehouse and the media order models specific to user obtain at least one media item, at least one media item meets user preferencePossibility improve.Therefore, it is recommended that the media item more relevant with user, this improves the efficiency and serviceability of digital assistants.

Although description describes various elements using term " first ", " second " etc. below, these elements should not be by artThe limitation of language.These terms are only intended to distinguish an element with another element.For example, the first input can be named as theTwo inputs, and similarly, the second input can be named as the first input, without departing from the scope of various described examples.First input and the second data are input, and are independent and different input in some cases.

The term used in the description to various described examples is intended merely to describe particular example hereinPurpose, and be not intended to be limited.As used in the description in the various examples and appended claimsLike that, singulative "one" (" a ", " an ") and "the" are intended to also include plural form, refer to unless the context clearlyShow.It will be further understood that term "and/or" used herein refers to and covered in associated listed projectAny and all possible combinations of one or more projects.It will be further understood that term " comprising " (" includes "," including ", " comprises " and/or " comprising ") when in this manual use when be specify exist institute it is oldFeature, integer, step, operation, element and/or the part stated, but it is not excluded that in the presence of or addition one or more other are specialSign, integer, step, operation, element, part and/or its packet.

Based on context, term " if " can be interpreted to mean " and when ... when " (" when " or " upon ") or " in response toIt is determined that " or " in response to detecting ".Similarly, based on context, phrase " if it is determined that ... " or " [stated if detectedCondition or event] " can be interpreted to mean " it is determined that ... when " or " in response to determining ... " or " [stated detectingCondition or event] when " or " in response to detecting [condition or event stated] ".

1st, system and environment

Fig. 1 shows the block diagram of the system 100 according to various examples.In some instances, system 100 realizes digital assistants.Term " digital assistants ", " virtual assistant ", " intelligent automation assistant " or " automatic digital assistant " refer to explanation voice and/or textThe natural language of form is inputted to infer user view and be performed based on the user view being inferred at any information of actionReason system.For example, in order to practice the user view being inferred to, system performs one or more of the following：Pass through designThe step of for realizing be inferred to user view and parameter carry out identification mission stream, by the tool from the user view being inferred toBody requirement is input in task flow；Task flow is performed by caller, method, service, API etc.；And generation is to userThe sense of hearing (for example, voice) and/or visual form output response.

As shown in figure 1, in some instances, digital assistants can be implemented according to client-server model.Digital assistantsIt is included in the client-side aspects 102 (hereinafter referred to as " DA clients 102 ") performed on user equipment 104, and in serverThe server portion 106 (hereinafter referred to as " DA servers 106 ") performed in system 108.DA clients 102 by one orMultiple networks 110 are communicated with DA servers 106.It is such as user oriented defeated that DA clients 102 provide client-side functionEnter and output is handled, and communicated with DA servers 106.DA servers 106 are that any number of DA clients 102 provideServer side function, any number of DA clients 102 are each located in respective user equipment 104.

In some instances, DA servers 106 include the I/O interfaces 112 at curstomer-oriented end, one or more processing modules114th, data and model 116 and the I/O interfaces 118 to external service.The I/O interfaces 112 at curstomer-oriented end are advantageous to DA clothesThe input and output processing at the curstomer-oriented end of business device 106.One or more processing modules 114 using data and model 116 comeHandle phonetic entry and inputted based on natural language to determine user view.In addition, one or more processing modules 114 are based onThe user view that is inferred to performs task.In some instances, DA servers 106 by one or more networks 110 come withExternal service 120 is (for example, one or more media services 120-1, one or more navigation Service 120-2, one or more disappearBreath type service 120-3, one or more information service 120-4, calendar service 120-5, telephone service 120-6 etc.) communicated,To complete task or collection information.I/O interfaces 118 to external service facilitate such communication.

Specifically, DA servers 106 are communicated with one or more media services, include searching for and obtaining matchmaker to performThe task of body item.One or more media services 120-1 are implemented in for example one or more remote media servers, and byIt is configured to provide for media item, song, corpus, playlist, video etc..For example, one or more media services includeMedia streaming services, such as Apple Music or iTunes Radio^TM(Apple Inc.(Cupertino,California))。One or more media services 120-1 are configured as receiving media research inquiry (for example, coming from DA servers 106), and makeFor response, there is provided meet one or more media items of media research inquiry.Specifically, inquired about according to media research, search for matchmakerOne or more corpus of body item, to identify one or more media items and provide the one or more media identified.In addition, one or more media services are configured to supply the information associated with media item, such as with certain media items phaseArtistical name, the issuing date of certain media items or the lyrics of certain media items of association.

One or more media services 120-1 include the various corpus of media item.The corpus of the media item includes matchmakerThe corpus specific to user of body item.Generated based on the data associated with relative users media item it is each specific toThe corpus at family.The related data of media include for example indicating media item previously checked, selected by user, having asked, having gathered orUser's input of refusal.In addition, the data of media correlation are included in what is found in the personal library of the media item associated with userMedia item.Therefore, the matchmaker for the media item reflection relative users being comprised in each corpus specific to user of media itemBody preference.In some instances, identify and access based on user profile such as user login information and/or user password informationThe corpus specific to user of each media item.In some instances, the media item in one or more media services 120-1Corpus also include the corpus of one or more second of media item generated based on the issuing date of media item.For example,The corpus of one or more second of media item, which only includes, to be had in the predetermined time range since current dateIssuing date media item.

In some instances, each media item in the corpus of media item includes indicating one or more media parametersMetadata.Media parameter includes such as { title }, { artist }, { school }, { issuing date }, { mood }, { occasion }, { editorList }, { political orientation }, { skills involved in the labour } etc..Therefore based on the media parameter indicated in the metadata of media item comeMedia item in the corpus of search and retrieval media item.Provided below with reference to Fig. 8 A-C on the matchmaker associated with media itemThe additional description of body parameter.

User equipment 104 can be any suitable electronic equipment.In some instances, user equipment is portable multi-functionEquipment (for example, below with reference to equipment 200 described in Fig. 2A), multifunctional equipment are (for example, below with reference to the equipment described in Fig. 4Or personal electronic equipments (for example, equipment 600 described in below with reference to Fig. 6 A-B) 400).Portable multifunction device is for exampleAlso include other functions such as PDA and/or music player functionality mobile phone.The specific example of portable multifunction deviceIncluding from Apple Inc.'s (Cupertino, California)Equipment, iPodEquipment andEquipment.Other examples of portable multifunction device include but is not limited to laptop computer or tablet personal computer.In addition,In some examples, user equipment 104 is non-portable multifunctional equipment.Specifically, user equipment 104 is desktop computer, tripGaming machine, television set or TV set-top box.In some instances, user equipment 104 includes touch sensitive surface (for example, touch-screen is shownDevice and/or touch pad).In addition, user equipment 104 optionally includes other one or more physical user-interface devices, such asPhysical keyboard, mouse and/or control stick.The various examples such as multifunctional equipment of electronic equipment is retouched more fully belowState.

The example of one or more communication networks 110 includes LAN (LAN) and wide area network (WAN), such as internet.OneAny of procotol of individual or multiple uses of communication networks 110, including various wired or wireless agreements, such as Ethernet,USB (USB), live wire (FIREWIRE), global system for mobile communications (GSM), enhanced data gsm environment(EDGE), CDMA (CDMA), time division multiple acess (TDMA), bluetooth, Wi-Fi, internet telephone protocol (VoIP), Wi-MAX,Or any other suitable communication protocol is implemented.

Server system 108 is real on the one or more free-standing data processing equipments or distributed network of computerApply.In some instances, server system 108 also uses third party's service provider (for example, third party cloud service provider)Various virtual units and/or service the potential computing resource and/or infrastructure resources of server system 108 be provided.

In some instances, user equipment 104 is communicated via second user equipment 122 with DA servers 106.Second user equipment 122 is similar or identical with user equipment 104.For example, second user equipment 122 is similar to below with reference to figureEquipment 200,400 or 600 described in 2A, Fig. 4 and Fig. 6 A-B.User equipment 104 is configured as via direct communication connection such asBluetooth, NFC, BTLE etc. are communicatively coupled to second user via such as local Wi-Fi network of wired or wireless networkEquipment 122.In some instances, second user equipment 122 is configured to act as between user equipment 104 and DA servers 106Agency.For example, the DA clients 102 of user equipment 104 be configured as via second user equipment 122 by information (for example,The user's request received at user equipment 104) transmit to DA servers 106.The processing information of DA servers 106 and viaRelated data (for example, data content in response to user's request) is back to user equipment 104 by two user equipmenies 122.

In some instances, user equipment 104 is configured as sending the breviary request of data to second user equipment122, to reduce the information content transmitted from user equipment 104.Second user equipment 122 is configured to determine that added to breviary and askedSide information, with generation transmit to the full request of DA servers 106.The system architecture can advantageously make with finite communicationThe user equipment 104 of ability and/or finite battery charge (for example, wrist-watch or similar compact electronic devices) is by using toolThere are stronger communication capacity and/or the second user of battery electric quantity (for example, mobile phone, laptop computer, tablet personal computer etc.) to setStandby 122 act on behalf of as DA servers 106 to access the service provided by DA servers 106.Although two are merely illustrated in Fig. 1User equipment 104 and user equipment 122, it is to be understood that, in some instances, system 100 may include to carry out with proxy configurationsConfiguration is with the user equipment of the Arbitrary Digit amount and type to be communicated with DA server systems 106.

Although the digital assistants shown in Fig. 1 include client-side aspects (for example, DA clients 102) and server sidePartly both (for example, DA servers 106), but in some instances, the function of digital assistants is implemented as being installed in userFree-standing application program in equipment.In addition, the function between the client part and server section of digital assistants is divided inAlterable in different specific implementations.For example, in some instances, DA clients are only to provide user oriented input and outputProcessing function, and the every other function of digital assistants is delegated to the thin-client of back-end server.

2nd, electronic equipment

The embodiment that attention is drawn to the electronic equipment of the client-side aspects for realizing digital assistants.Figure2A is the block diagram for showing the portable multifunction device 200 with touch-sensitive display system 212 according to some embodiments.Touch-sensitive display 212 is referred to alternatively as or is called " touch-sensitive display system sometimes for being conveniently called " touch-screen " sometimesSystem ".Equipment 200 includes memory 202 (it optionally includes one or more computer-readable recording mediums), memory controlsDevice 222, one or more processing units (CPU) 220, peripheral interface 218, RF circuits 208, voicefrequency circuit 210, raise one's voiceDevice 211, microphone 213, input/output (I/O) subsystem 206, other input control apparatus 216 and outside port 224.IfStandby 200 optionally include one or more optical sensors 264.Equipment 200 optionally include be used for detection device 200 (for example,The touch-sensitive display system 212 of touch sensitive surface, such as equipment 200) on contact intensity one or more contact strengths passSensor 265.One or more tactiles output that equipment 200 optionally includes being used to generate on the device 200 tactile output occursDevice 267 is (for example, raw in the touch-sensitive display system 212 of touch sensitive surface such as equipment 200 or the touch pad 455 of equipment 400Exported into tactile).These parts are communicated optionally by one or more communication bus or signal wire 203.

As used in the specification and claims, term " tactile output " refers to utilize user's by userThe equipment that sense of touch detects is relative relative to the part (for example, touch sensitive surface) of the physical displacement of the previous position of equipment, equipmentIn another part (for example, shell) of equipment physical displacement or part relative to the barycenter of equipment displacement.For example,The part of equipment or equipment connects with user to touching sensitive surface (for example, other parts of finger, palm or user's hand)In the case of touching, the tactile output generated by physical displacement will be construed to sense of touch by user, and the sense of touch corresponds to equipment or setThe change perceived of the physical features of standby part.For example, the movement of touch sensitive surface (for example, touch-sensitive display or Trackpad)" pressing click " or " unclamp and click on " to physical actuation button is optionally construed to by user.In some cases, user willFeel sense of touch, such as " press click " or " unclamp click on ", (example is physically pressed even in the movement by userSuch as, be shifted) the physical actuation button associated with touch sensitive surface when not moving.As another example, even in tactileWhen the smoothness of sensitive surfaces is unchanged, it is touch sensitive surface that the movement of touch sensitive surface, which also optionally can be construed to by user or be sensed," roughness ".Although will be limited by user by the individuation sensory perception of user such explanation of touch, existThe many sensory perceptions touched are that most of users share.Therefore, when tactile output is described as corresponding to the specific of userDuring sensory perception (for example, " pressing click ", " unclamp and click on ", " roughness "), unless otherwise stated, the tactile otherwise generatedCorresponding to equipment or the physical displacement of its part, the sense organ that the physical displacement will generate typical case (or common) user is known for outputFeel.

It should be appreciated that equipment 200 is only an example of portable multifunction device, and equipment 200 optionally hasThan shown more or less parts, two or more parts are optionally combined, or optionally there are these partsDifferent configurations or arrangement.Various parts shown in Fig. 2A are come real with the combination of both hardware, software or hardware and softwareExisting, it includes one or more signal processing circuits and/or application specific integrated circuit.

Memory 202 includes one or more computer-readable recording mediums.Exemplified by computer-readable recording medium if anyIt is shape and non-transient.Memory 202 includes high-speed random access memory and may also include nonvolatile memory, such asOne or more disk storage equipments, flash memory device or other non-volatile solid state memory equipment.Memory controlsThe miscellaneous part of the control device 200 of device 222 accesses memory 202.

In some instances, the non-transient computer readable storage medium storing program for executing of memory 202 be used for store instruction (for example, withIn perform process described below aspect), for instruction execution system, device or equipment such as computer based system,System comprising processor can obtain instruction and the other systems use of execute instruction from instruction execution system, device or equipmentIt is or in connection.In other examples, instruction (for example, aspect for performing process described below) is stored in serverOn the non-transient computer readable storage medium storing program for executing (not shown) of system 108, or can in the non-transient computer of memory 202Read to divide between storage medium and the non-transient computer readable storage medium storing program for executing of server system 108.

Peripheral interface 218 is used to the input peripheral of equipment and output ancillary equipment being couple to the Hes of CPU 220Memory 202.The one or more processors 220 run or perform the various software programs that are stored in memory 202 and/Or instruction set, to perform the various functions of equipment 200 and processing data.In some embodiments, peripheral interface 218,CPU 220 and Memory Controller 222 are implemented in one single chip such as on chip 204.In some other embodiments,They are implemented on independent chip.

RF (radio frequency) circuit 208 receives and sent the RF signals for being also designated as electromagnetic signal.RF circuits 208 turn electric signalBe changed to electromagnetic signal/by electromagnetic signal and be converted to electric signal, and via electromagnetic signal come with communication network and other communicateEquipment is communicated.RF circuits 208 optionally include being used for the well known circuit for performing these functions, including but not limited to antennaSystem, RF transceivers, one or more amplifiers, tuner, one or more oscillators, digital signal processor, encoding and decodingChipset, subscriber identity module (SIM) card, memory etc..RF circuits 208 optionally by radio communication and network and itsHe is communicated at equipment, and the network is such as internet (also referred to as WWW (WWW)), Intranet, and/or wireless network(such as cellular phone network, WLAN (LAN), and/or Metropolitan Area Network (MAN) (MAN)).RF circuits 208 optionally include being used forThe well known circuit of near-field communication (NFC) field is detected, is such as detected by short-haul connections radio unit.Channel radioTrust selection of land and use any one of a variety of communication standards, agreement and technology, including but not limited to global system for mobile communications(GSM), enhanced data gsm environment (EDGE), high-speed downlink packet access (HSDPA), High Speed Uplink Packet connectEnter (HSUPA), evolution, clear data (EV-DO), HSPA, HSPA+, double unit HSPA (DC-HSPDA), Long Term Evolution (LTE),Near-field communication (NFC), WCDMA (W-CDMA), CDMA (CDMA), time division multiple acess (TDMA), bluetooth, bluetoothLow-power consumption, Wireless Fidelity (Wi-Fi) are (for example, IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, IEEE802.11n and/or IEEE 802.11ac), voice over internet protocol (VoIP), Wi-MAX, email protocol (for example, interconnectionNetwork information access protocol (IMAP) and/or post office protocol (POP)), instant message (for example, scalable message processing and exist associationDiscuss (XMPP), for instant message and in the presence of Session initiation Protocol (SIMPLE), instant message and the presence service using extension(IMPS)), and/or Short Message Service (SMS) or when being included in this document submission date also it is untapped go out communication protocolAny other appropriate communication protocol.

Voicefrequency circuit 210, loudspeaker 211 and microphone 213 provide the COBBAIF between user and equipment 200.AudioCircuit 210 receives voice data from peripheral interface 218, voice data is converted into electric signal, and electric signal transmission is arrivedLoudspeaker 211.Loudspeaker 211 converts electrical signals to the audible sound wave of the mankind.Voicefrequency circuit 210 is also received by microphone 213The electric signal changed according to sound wave.Voicefrequency circuit 210 converts electrical signals to voice data, and voice data is transferred into peripheryEquipment interface 218 is for processing.Voice data be retrieved from and/or transmitted by peripheral interface 218 to memory 202 and/Or RF circuits 208.In some embodiments, voicefrequency circuit 210 also includes earphone jack (for example, 312 in Fig. 3).HeadsetJack provides the interface between voicefrequency circuit 210 and removable audio input/output ancillary equipment, and the removable audio is defeatedEnter/export earphone that ancillary equipment such as only exports or with output (for example, single head-receiver or bi-telephone) and input (exampleSuch as, microphone) both headset.

I/O subsystems 206 control such as touch-screen 212 of the input/output ancillary equipment in equipment 200 and other inputsEquipment 216 is coupled to peripheral interface 218.I/O subsystems 206 optionally include display controller 256, optical sensor controlDevice 258 processed, intensity sensor controller 259, tactile feedback controller 261 and one for other inputs or control deviceOr multiple input controllers 260.One or more input controllers 260 receive telecommunications from other input control apparatus 216Number/send electric signal to other input control apparatus 216.Other input control apparatus 216 optionally include physical button (exampleSuch as, push button, rocker buttons etc.), dial, slide switch, control stick, click type rotating disk etc..In some alternative realitiesApply in scheme, one or more input controllers 260 be optionally coupled to any one of the following (or be not coupled toAny one of lower items)：Keyboard, infrared port, USB port and pointing device such as mouse.One or more buttons(for example, 308 in Fig. 3) optionally include pressing for increase/reduction of the volume of loudspeaker 211 and/or microphone 213 controlButton.One or more buttons optionally include push button (for example, 306 in Fig. 3).

Quick push button of pressing releases the locking of touch-screen 212 or begins to use the gesture on touch-screen to come to equipmentThe process being unlocked, such as it is entitled " the Unlocking a Device by submitted on December 23rd, 2005Performing Gestures on an Unlock Image " U.S. Patent application 11/322,549 is United States Patent (USP) 7,Described in 657,849, above-mentioned U.S. Patent application is incorporated by reference in its entirety herein accordingly.Push button is pressed longerly(such as 306) make equipment 200 start shooting or shut down.User can carry out self-defined to the function of one or more buttons.Touch-screen212 are used to realize virtual push button or soft key and one or more soft keyboards.

Touch-sensitive display 212 provides the input interface and output interface between equipment and user.Display controller 256 from touchTouch the reception electric signal of screen 212 and/or send electric signal to touch-screen 212.Touch-screen 212 shows visual output to user.Depending onFeel that output optionally includes figure, text, icon, video and any combination of them (being referred to as " figure ").In some implementationsIn scheme, the visual output of some visual outputs or whole corresponds to user interface object.

Touch-screen 212 have based on tactile and/or tactile contact come receive from user the touch sensitive surface of input, sensor,Or sensor group.Touch-screen 212 and display controller 256 are (with any associated module in memory 202 and/or instructionCollection together) detection touch-screen 212 on contact (and any movement or interruption of the contact), and by detected contact turnThe user interface object (for example, one or more soft keys, icon, webpage or image) for being changed to and being displayed on touch-screen 212Interaction.In an exemplary embodiment, the contact point between touch-screen 212 and user corresponds to the finger of user.

Touch-screen 212 is (luminous using LCD (liquid crystal display) technology, LPD (light emitting polymer displays) technologies or LEDDiode) technology, but other Display Techniques can be used in other embodiments.Touch-screen 212 and display controller 256 useIt is currently known or later by any technology and other proximity sensor arrays or use in a variety of touch-sensing technologies of exploitationIn it is determined that the one or more points contacted with touch-screen 212 other elements come detect contact and its any movement or interruption, shouldA variety of touch-sensing technologies include but is not limited to capacitive character, resistive, infrared and surface acoustic wave technique.In an exemplary implementationIn scheme, using projection-type mutual capacitance detection technology, such as Apple Inc.'s (Cupertino, California)And iPodIt was found that technology.

Touch-sensitive display in some embodiments of touch-screen 212 can be similar to the multiple spot described in following United States Patent (USP)Touch-sensitive touch pad：6,323,846 (Westerman et al.), 6,570,557 (Westerman et al.) and/or 6,677,932(Westerman)；And/or U.S. Patent Publication 2002/0015024A1, each patent document in these patent documents is accordinglyIt is incorporated by reference in its entirety herein.However, touch-screen 212 shows the visual output from equipment 200, and touch-sensitive touch pad is notVisual output is provided.

Touch-sensitive display in some embodiments of touch-screen 212 is described as in following patent application：(1) U.S. Patent application 11/381,313 submitted on May 2nd, 2006, " Multipoint Touch SurfaceController”；(2) U.S. Patent application 10/840,862 submitted on May 6th, 2004, " MultipointTouchscreen”；(3) U.S. Patent application 10/903,964 submitted on July 30th, 2004, " Gestures ForTouch Sensitive Input Devices”；(4) U.S. Patent application 11/048 submitted on January 31st, 2005,264, “Gestures For Touch Sensitive Input Devices”；(5) in U.S. submitted on January 18th, 2005State's patent application 11/038,590, " Mode-Based Graphical User Interfaces For TouchSensitive Input Devices”；(6) U.S. Patent application 11/228,758 submitted for 16th in September in 2005,“Virtual Input Device Placement On A Touch Screen User Interface”；(7) in 2005The U.S. Patent application 11/228,700 that September is submitted on the 16th, " Operation Of A Computer With A TouchScreen Interface”；(8) U.S. Patent application 11/228,737 submitted on September 16th, 2005, " ActivatingVirtual Keys Of A Touch-Screen Virtual Keyboard”；(9) in U.S. submitted on March 3rd, 2006State's patent application 11/367,749, " Multi-Functional Hand-Held Device ".All these patent applications are in fullIt is herein incorporated by reference.

Touch-screen 212 has the video resolution for example more than 100dpi.In some embodiments, touch-screen has about160dpi video resolution.User using any suitable object or additives such as stylus, finger etc. come with touch-screen212 contacts.In some embodiments, by user-interface design mainly to be worked by the contact based on finger and gesture, byIt is larger in the contact area of finger on the touchscreen, therefore this may be accurate not as the input based on stylus.In some embodiment partyIn case, the rough input based on finger is translated as accurate pointer/cursor position or order by equipment, for performing user instituteDesired action.

In some embodiments, in addition to a touch, equipment 200 includes being used to activating or deactivating specific functionTouch pad (not shown).In some embodiments, touch pad is the touch sensitive regions of equipment, and the touch sensitive regions and touch-screen are notTogether, it does not show visual output.Touch pad is the touch sensitive surface separated with touch-screen 212, or is touched by what touch-screen was formedThe extension of sensitive surfaces.

Equipment 200 also includes being used for the power system 262 for various parts power supply.Power system 262 includes electrical managementSystem, one or more power supplys (for example, battery, alternating current (AC)), recharging system, power failure detection circuit, power becomeParallel operation or inverter, power supply status indicator (for example, light emitting diode (LED)) and the life with the electric power in portable setAny other part associated into, management and distribution.

Equipment 200 also includes one or more optical sensors 264.Fig. 2A, which is shown, to be couple in I/O subsystems 206Optical sensor controller 258 optical sensor.Optical sensor 264 may include charge coupling device (CCD) or complementationMetal-oxide semiconductor (MOS) (CMOS) phototransistor.Optical sensor 264 from environment receive by one or more lens andThe light of projection, and convert light to represent the data of image.With reference to image-forming module 243 (also referred to as camera model), optics passesSensor 264 captures still image or video.In some embodiments, optical sensor be located at equipment 200 with before equipmentThe phase of touch-screen display 212 in portion back to rear portion so that touch-screen display is used as being used for still image and/or video figureAs the view finder of collection.In some embodiments, optical sensor is located at the front portion of equipment so that is touching screen display in userThe image of the user is obtained while showing and other video conference participants are checked on device, for video conference.In some implementationsIn scheme, the position of optical sensor 264 can be changed (such as by lens and sensor in slewing shell) by user,So that single optical sensor 264 is used together with touch-screen display, for video conference and still image and/or videoBoth IMAQs.

Equipment 200 optionally also includes one or more contact strength sensors 265.Fig. 2A, which is shown, is couple to I/OThe contact strength sensor of intensity sensor controller 259 in system 206.Contact strength sensor 265 optionally includes oneIndividual or multiple piezoresistive strain instrument, capacitive force transducer, electric force snesor, piezoelectric force transducer, optics force snesor, electricityAppearance formula touch sensitive surface or other intensity sensors are (for example, the sensing of the power (or pressure) for measuring the contact on touch sensitive surfaceDevice).Contact strength sensor 265 receives contact strength information (for example, surrogate of pressure information or pressure information) from environment.In some embodiments, at least one contact strength sensor and touch sensitive surface (for example, touch-sensitive display system 212) juxtapositionArrangement is neighbouring.In some embodiments, at least one contact strength sensor be located at equipment 200 with positioned at equipment 200Front portion on the phase of touch-screen display 212 back to rear portion on.

Equipment 200 also includes one or more proximity transducers 266.Fig. 2A, which is shown, is couple to peripheral interface 218Proximity transducer 266.Alternatively, proximity transducer 266 is couple to the input controller 260 in I/O subsystems 206.It is closeSensor 266 performs as described in following U.S. Patent application：11/241,839, entitled " ProximityDetector In Handheld Device”；11/240,788, entitled " Proximity Detector InHandheld Device”；11/620,702, entitled " Using Ambient Light Sensor To AugmentProximity Sensor Output”；11/586,862, entitled " Automated Response To And SensingOf User activity In Portable Devices "；With 11/638,251, entitled " Methods And Systems ForAutomatic Configuration Of Peripherals ", these U.S. Patent applications are incorporated by reference simultaneously accordinglyEnter herein.In some embodiments, when multifunctional equipment is placed near the ear of user (for example, when user is enteringDuring row call), proximity transducer is closed and disables touch-screen 212.

Equipment 200 optionally also includes one or more tactile output generators 267.Fig. 2A, which is shown, is couple to I/OThe tactile output generator of tactile feedback controller 261 in system 206.Tactile output generator 267 optionally includes oneOr multiple electroacoustic equipments such as loudspeaker or other acoustic components；And/or the electromechanics for converting the energy into linear movement is setStandby such as motor, solenoid, electroactive polymerizer, piezo-activator, electrostatic actuator or other tactiles output generating unit (exampleSuch as, the part exported for converting the electrical signal to the tactile in equipment).Contact strength sensor 265 is from haptic feedback module233 receive touch feedback generation instruction, and the tactile that generation can be felt by the user of equipment 200 on the device 200 is defeatedGo out.In some embodiments, at least one tactile output generator and touch sensitive surface (for example, touch-sensitive display system 212)Alignment is neighbouring, and optionally by vertically (for example, surface inside/outside to equipment 200) or laterally (for example,In the surface identical plane with equipment 200 rearwardly and a forwardly) mobile touch sensitive surface exports to generate tactile.In some implementationsIn scheme, at least one tactile output generator sensor be located at equipment 200 with the touch-screen on the front portion of equipment 200The phase of display 212 back to rear portion on.

Equipment 200 may also include one or more accelerometers 268.Fig. 2A, which is shown, is coupled to peripheral interface 218Accelerometer 268.Alternatively, accelerometer 268 is couple to the input controller 260 in I/O subsystems 206.Accelerometer268 execution as described in following U.S. Patent Publication：20050190059, entitled " Acceleration-basedTheft Detection System for Portable Electronic Devices”；And 20060017692, titleFor " Methods And Apparatuses For Operating A Portable Device Based On AnAccelerometer ", the two U.S. Patent Publications are incorporated by reference in its entirety herein.In some embodiments, believeBreath based on to the analysis from one or more accelerometer received datas and on touch-screen display with longitudinal view orTransverse views are shown.Equipment 200 also include optionally in addition to accelerometer 268 magnetometer (not shown) and GPS (orGLONASS or other Global Navigation Systems) receiver (not shown), for obtaining position and the orientation (example on equipment 200Such as, it is vertical or horizontal) information.

In some embodiments, the software part being stored in memory 202 includes operating system 226, communication module(or instruction set) 228, contact/motion module (or instruction set) 230, figure module (or instruction set) 232, text input module(or instruction set) 234, global positioning system (GPS) module (or instruction set) 235, digital assistants client modules 229 and shouldWith program (or instruction set) 236.In addition, the data storage of memory 202 and model, such as user data and model 231.In addition,In some embodiments, memory 202 (Fig. 2A) or 470 (Fig. 4) storage devices/global internal state 257, such as Fig. 2A andShown in Fig. 4.Equipment/global internal state 257 includes one or more of the following：Applications active state,The applications active state is used to indicate which application program (if any) is currently movable；Dispaly state, this is aobviousShow that state is used to indicate that what application program, view or other information occupy the regional of touch-screen display 212；SensorState, the sensor states include the information that each sensor and input control apparatus 216 of slave unit obtain；And on settingStandby position and/or the positional information of posture.

Operating system 226 is (for example, Darwin, RTXC, LINUX, UNIX, OS X, iOS, WINDOWS or embedded behaviourMake system such as VxWorks) include being used to control and manage general system task (for example, memory management, storage device controlSystem, power management etc.) various software parts and/or driver, and promote between various hardware componenies and software partCommunication.

Communication module 228 promotes to be communicated with other equipment by one or more outside ports 224, and also wrapsInclude for handling by RF circuits 208 and/or the various software parts of the received data of outside port 224.Outside port 224(such as USB (USB), live wire etc.) is suitable to be directly coupled to other equipment, or (such as mutual indirectly by networkNetworking, WLAN etc.) coupling.In some embodiments, outside port be with(Apple Inc. trade mark) equipmentGo up spininess (for example, 30 pins) connector that used 30 needle connectors are same or similar and/or are compatible with.

Contact/motion module 230 optionally detect with touch-screen 212 (with reference to display controller 256) and other touch-sensitive setThe contact of standby (for example, touch pad or physics click type rotating disk).Contact/motion module 230 include various software parts forThe various operations related with contact detection are performed, such as to determine that whether being in contact (for example, detecting finger down event), trueSurely contact intensity (for example, contact power or pressure, or contact power or pressure substitute), determine whether there is contactMovement and track movement (for example, detecting one or more finger drag events) on touch sensitive surface and determine contactWhether stop (for example, detection digit up event or contact disconnect).Contact/motion module 230 receives from touch sensitive surfaceContact data.Determine the movement of contact point optionally include determining the speed (value) of contact point, speed (value and direction) and/Or acceleration (change in value and/or direction), the movement of the contact point are represented by a series of contact data.These operationsOptionally it is applied to single-contact (for example, single abutment) or multiple spot while contacts (for example, " multiple point touching "/multiple handsAbutment).In some embodiments, contact/motion module 230 detects the contact on touch pad with display controller 256.

In some embodiments, contact/motion module 230 determines to operate using one group of one or more intensity thresholdWhether performed (for example, determining that whether user " clicks on " icon) by user.In some embodiments, according to software parameterTo determine at least one subset of intensity threshold (for example, intensity threshold is not Lai really by the activation threshold of specific physical actuation deviceFixed, and can be conditioned in the case where not changing the physical hardware of equipment 200).For example, do not changing Trackpad or touchIn the case of panel type display hardware, mouse " click " threshold value of Trackpad or touch-screen can be configured to the big of predefined threshold valueAny one threshold value in scope.In addition, in some specific implementations, provided to the user of equipment for adjusting one group of intensity thresholdOne or more of intensity threshold (for example, by adjusting each intensity threshold and/or by using to " intensity " parameter beingIrrespective of size click on comes the multiple intensity thresholds of Primary regulation) software design patterns.

Contact/motion module 230 optionally detects the gesture input of user.Different gestures on touch sensitive surface have differenceContact patterns (for example, the different motion of detected contact, timing and/or intensity).Therefore, it is special optionally by detectionDetermine contact patterns and carry out detection gesture.For example, detection finger tapping down gesture includes detection finger down event, then pressed with fingerLower event identical position (or substantially the same position) place (for example, in opening position of icon) detection finger is lifted and (is lifted away from)Event.As another example, finger is detected on touch sensitive surface and gently sweeps gesture including detecting finger down event, is then detectedOne or more finger drag events, and then detection finger lifts and (is lifted away from) event.

Figure module 232 includes being used for being presented and showing the various known of figure on touch-screen 212 or other displaysSoftware part, including for changing the visual impact of shown figure (for example, brightness, transparency, saturation degree, contrastOr other visual signatures) part.As used herein, term " figure " includes any object that can be displayed to user, includingBut it is not limited to text, webpage, icon (user interface object for including soft key), digital picture, video, animation etc..

In some embodiments, figure module 232 stores the data for representing figure to be used.Each figure is appointedSelection of land is assigned corresponding code.Figure module 232 is used for specify figure to be shown one from receptions such as application programsOr multiple codes, coordinate data and other graphic attribute data are also received in the case of necessary, then generate screen picture numberAccording to output to display controller 256.

Haptic feedback module 233 includes being used for the various software parts for generating instruction, and the instruction is by tactile output generator267 use, to produce tactile in response to the one or more positions of user with interacting for equipment 200 and on the device 200Output.

In some instances, the text input module 234 as the part of figure module 232 is provided in various applicationsProgram is (for example, contact person 237, Email 240, IM 241, browser 247 and any other application for needing text inputProgram) in input text soft keyboard.

GPS module 235 determine this information that the position and providing of equipment uses in various application programs (for example,There is provided to the phone 238 for being used for location-based dialing, there is provided be used as picture/video metadata to camera 243, and provide extremelyLocation Based service such as weather desktop small routine, local Yellow Page desktop small routine and map/navigation desktop small routine are providedApplication program).

Digital assistants client modules 229 instruct including various client-side digital assistants, to provide the visitor of digital assistantsFamily side function.For example, digital assistants client modules 229 can be connect by the various users of portable multifunction device 200Mouth is (for example, microphone 213, one or more accelerometers 268, touch-sensitive display system 212, one or more optical sensingsDevice 229, other input control apparatus 216 etc.) come receive sound input (for example, phonetic entry), text input, touch inputAnd/or gesture input.Digital assistants client modules 229 can also pass through the various inputs of portable multifunction device 200Interface (for example, loudspeaker 211, touch-sensitive display system 212, one or more tactile output generators 267 etc.) provides audioThe output of (for example, voice output), video and/or tactile form.For example, output is provided as voice, sound, alarm, textThe combination of message, menu, figure, video, animation, vibration, and/or both of the above or more person.During operation, digital assistantsClient modules 229 are communicated using RF circuits 208 with DA servers 106.

User data and model 231 include the various data associated with user (for example, the vocabulary number specific to userAccording to, user preference data, title pronunciation, the data from user's electronic address list, do list, purchase specific to userThing inventory etc.), to provide the client-side function of digital assistants.In addition, user data and model 231 include being used to handle userInput and determine user view various models (for example, speech recognition modeling, statistical language model, Natural Language Processing Models,Ontologies, task flow model, service model etc.).

In some instances, digital assistants client modules 229 utilize various sensors, subsystem and portable multi-functionThe ancillary equipment of equipment 200 to gather additional information from the surrounding environment of portable multifunction device 200, with establish with user,The context that active user's interaction and/or active user's input are associated.In some instances, digital assistants client modules229 provide contextual information or its subset to DA servers 106 together with user's input, to help to infer the intention of user.In some instances, digital assistants also prepare to export and be sent to user using contextual information to determine how.Up and downLiterary information is referred to as context data.

In some instances, sensor information is included with the contextual information of user's input, such as illumination, environment are made an uproarSound, environment temperature, the image of surrounding environment or video etc..In some instances, contextual information may also include the physics of equipmentState, such as apparatus orientation, device location, device temperature, power level, speed, acceleration, motor pattern, cellular signal are strongDegree etc..In some instances, also by the information related to the application state of DA servers 106 such as portable multifunction device200 running, installation procedure, past and current network activity, background service, error log, resource, which use etc., to be madeThere is provided for the contextual information associated with user's input to DA servers 106.

In some instances, digital assistants client modules 229 select in response to the request from DA servers 106There is provided the information (for example, user data 231) being stored on portable multifunction device 200 to property.In some instances, numberWord assistant client modules 229 are also when being made requests on by DA servers 106 via natural language dialogue or other users interfaceExtract the additional input from user.Additional input is sent to DA servers 106 by digital assistants client modules 229, with sideHelp DA servers 106 to carry out intent inference and/or meet the user view expressed in user asks.

Digital assistants are described in further detail below with reference to Fig. 7 A-C.It should be appreciated that digital assistants client modules 229It may include the submodule of any number of digital assistant module 726 described below.

Application program 236 is included with lower module (or instruction set) or its subset or superset：

Contact module 237 (otherwise referred to as address list or contacts list)；

Phone module 238；

Video conference module 239；

Email client module 240；

Instant message (IM) module 241；

Body-building support module 242；

For still image and/or the camera model 243 of video image；

Image management module 244；

Video player module；

Musical player module；

Browser module 247；

Calendaring module 248；

Desktop small routine module 249, it includes one or more of the following in some instances：Weather desktopSmall routine 249-1, stock market desktop small routine 249-2, calculator desktop small routine 249-3, alarm clock desktop small routine 249-4, wordAllusion quotation desktop small routine 249-5 and other desktop small routines obtained by user, and the desktop small routine 249- 6 that user creates；

For the desktop small routine builder module 250 for the desktop small routine 249-6 for generating user's establishment；

Search module 251；

Video and musical player module 252, it merges video player module and musical player module；

Notepad module 253；

Mapping module 254；And/or

Online Video module 255.

The example for the other applications 236 being stored in memory 202 include other word-processing applications, itsHis picture editting's application program, drawing application program, application program, encryption, the digital rights that application program is presented, supports JAVABenefit management, speech recognition and speech reproduction.

With reference to touch-screen 212, display controller 256, module 230, figure module 232 and text input module 234 are contacted,Contact module 237 is used to manage address list or contacts list (for example, being stored in memory 202 or memory 470In the application program internal state 292 of contact module 237), including：One or more names are added to address list；From logicalOne or more names are deleted in news record；Make one or more telephone numbers, one or more e-mail addresses, one or moreIndividual physical address or other information associate with name；Image is associated with name；Name is classified and sorted；Phone is providedNumber or e-mail address, to initiate and/or promote by phone 238, video conference 239, Email 240 or IM 241Communication of progress etc..

With reference to RF circuits 208, voicefrequency circuit 210, loudspeaker 211, microphone 213, touch-screen 212, display controller256th, contact/motion module 230, figure module 232 and text input module 234, phone module 238 is used to input and phoneThe phone number that one or more of character string, access contact module 237 telephone number, modification corresponding to number have inputtedCode, corresponding telephone number is dialed, is conversated and is disconnected or hang up when session is completed.As described above, channel radio courierWith any one of multiple communication standards, agreement and technology.

With reference to RF circuits 208, voicefrequency circuit 210, loudspeaker 211, microphone 213, touch-screen 212, display controller256th, optical sensor 264, optical sensor controller 258, contact/motion module 230, figure module 232, text input mouldBlock 234, contact module 237 and phone module 238, video conference module 239 include according to user instruction to initiate, carry out andTerminate the executable instruction of the video conference between user and other one or more participants.

With reference to RF circuits 208, touch-screen 212, display controller 256, contact/motion module 230, the and of figure module 232Text input module 234, email client module 240 include creating, send, receive and managing in response to user instructionThe executable instruction of Email.With reference to image management module 244, email client module 240 to be very easy to woundBuild and send with the still image shot by camera model 243 or the Email of video image.

With reference to RF circuits 208, touch-screen 212, display controller 256, contact/motion module 230, the and of figure module 232Text input module 234, instant message module 241 include the executable instruction for following operation：Input and instant message pairCharacter that the character string answered, modification are previously entered, the corresponding instant message of transmission are (for example, using Short Message Service (SMS) or moreMedia information service (MMS) agreement for the instant message based on phone or using XMPP, SIMPLE or IMPS withFor the instant message based on internet), receive instant message and check received instant message.In some embodiment partyIn case, the instant message for transmitting and/or receiving includes figure, photo, audio file, video file and/or MMS and/or increasingOther annexes supported in strong messenger service (EMS).As used herein, " instant message " refers to the message (example based on phoneSuch as, the message transmitted using SMS or MMS) and message based on internet (for example, using XMPP, SIMPLE or IMPS to passBoth defeated message).

With reference to RF circuits 208, touch-screen 212, display controller 256, contact/motion module 230, figure module 232,Text input module 234, GPS module 235, mapping module 254 and musical player module, body-building support module 242 include usingIn the executable instruction of following operation：Create body-building (for example, there is time, distance and/or caloric burn target)；With body-buildingSensor (mobile device) is communicated；Receive workout sensor data；Calibrate the sensor for monitoring body-building；Select body-buildingMusic simultaneously plays out；And display, storage and transmission workout data.

With reference to touch-screen 212, display controller 256, one or more optical sensors 264, optical sensor controller258th, contact/motion module 230, figure module 232 and image management module 244, camera model 243 include being used for following behaviourThe executable instruction of work：Capture still image or video (including video flowing) and store them in memory 202, repairChange the feature of still image or video or delete still image or video from memory 202.

With reference to touch-screen 212, display controller 256, contact/motion module 230, figure module 232, text input mouldBlock 234 and camera model 243, image management module 244 include being used for arranging, change (for example, editor) or otherwiseManipulate, tag, deleting, presenting (for example, in digital slide or photograph album) and storage still image and/or video figureThe executable instruction of picture.

With reference to RF circuits 208, touch-screen 212, display controller 256, contact/motion module 230, the and of figure module 232Text input module 234, browser module 247 include being used for (including searching for, linking to browse internet according to user instructionTo, receive and display webpage or part thereof and the annex and alternative document that link to webpage) executable instruction.

With reference to RF circuits 208, touch-screen 212, display controller 256, contact/motion module 230, figure module 232,Text input module 234, email client module 240 and browser module 247, calendaring module 248 include being used for basisUser instruction creates, shows, changes and stored calendar and the data associated with calendar (for example, calendar, waiting to handle affairsList etc.) executable instruction.

With reference to RF circuits 208, touch-screen 212, display controller 256, contact/motion module 230, figure module 232,Text input module 234 and browser module 247, desktop small routine builder module 250 are used to create desktop little Cheng by userSequence (for example, user's specified portions of webpage are gone in desktop small routine).

With reference to touch-screen 212, display controller 256, contact/motion module 230, figure module 232 and text input mouldBlock 234, search module 251 include being used for according to user instruction come the matching one or more searching bar in searching storage 202The text of part (for example, search term that one or more users specify), music, sound, image, video and/or alternative documentExecutable instruction.

With reference to touch-screen 212, display controller 256, contact/motion module 230, figure module 232, voicefrequency circuit systemSystem 210, loudspeaker 211, RF circuit systems 208 and browser module 247, video and musical player module 252 include allowingUser is downloaded and played back with the music recorded of one or more file formats (such as MP3 or AAC files) storage and otherThe executable instruction of audio files, and for showing, presenting or otherwise playing back video (for example, in touch-screen 212Or on the external display connected via outside port 224) executable instruction.In some embodiments, equipment 200 is appointedSelection of land includes MP3 player, such as iPod (Apple Inc. trade mark) function.

With reference to touch-screen 212, display controller 256, contact/motion module 230, figure module 232 and text input mouldBlock 234, notepad module 253 include creating and managing the executable instruction of notepad, backlog etc. according to user instruction.

With reference to RF circuits 208, touch-screen 212, display controller 256, contact/motion module 230, figure module 232,Text input module 234, GPS module 235 and browser module 247, mapping module 254 can be used for received according to user instruction,It has been shown that, modification and storage map and the data associated with map are (for example, at or near steering direction and ad-hoc locationShop and the relevant data and other location-based data of other points of interest).

With reference to touch-screen 212, display controller 256, contact/motion module 230, figure module 232, voicefrequency circuit210th, loudspeaker 211, RF circuits 208, text input module 234, email client module 240 and browser module 247,Online Video module 255 is included to give an order：Allow user access, browse, receiving (for example, by transmit as a stream and/or underCarry), playback (such as on the touchscreen or on the external display connected via outside port 224), send and have to specialDetermine the Email of the link of Online Video and otherwise manage one or more file formats such as H.264Line video.In some embodiments, using instant message module 241 rather than email client module 240 send toThe link of specific Online Video.It is entitled that the additional description of Online Video application program can be that on June 20th, 2007 submits“Portable Multifunction Device,Method,and Graphical User Interface forPlaying Online Videos " U.S. Provisional Patent Application 60/936,562 and in the mark submitted on December 31st, 2007Entitled " Portable Multifunction Device, Method, and Graphical User Interface forFound in Playing Online Videos " U.S. Patent application 11/968,067, the content of the two patent applications is accordinglyIt is incorporated by reference in its entirety herein.

Above-mentioned each module and application program, which correspond to, to be used to perform above-mentioned one or more functions and in this patent ShenPlease described in method (for example, computer implemented method as described herein and other information processing method) executable instructionCollection.These modules (for example, instruction set) need not be implemented as independent software program, process or module, therefore these modulesEach subset can in various embodiments be combined or otherwise rearrange.For example, video player module canIndividual module (for example, video and musical player module 252 in Fig. 2A) is combined into musical player module.In some realitiesApply in scheme, memory 202 stores the subset of above-mentioned module and data structure.In addition, the storage of memory 202 is not described aboveAdd-on module and data structure.

In some embodiments, equipment 200 is that the operation of predefined one group of function in the equipment uniquely passes throughTouch-screen and/or touch pad are come the equipment that performs.By using touch-screen and/or touch pad as the operation for equipment 200Main input control apparatus, optionally reduce equipment 200 on be physically entered control device (push button, driver plate etc.Deng) quantity.

User circle is uniquely optionally included in come the predefined one group of function of performing by touch-screen and/or touch padNavigation between face.In some embodiments, when user touches touch pad, appoint what is shown in the slave unit 200 of equipment 200What user interface navigation is to main menu, home menus or root menu.In such embodiment, " dish is realized using touch padSingle button ".In some other embodiments, menu button is physics push button or other are physically entered control device,Rather than touch pad.

Fig. 2 B are the block diagrams for showing the example components for event handling according to some embodiments.In some realitiesApply in scheme, memory 202 (Fig. 2A) or memory 470 (Fig. 4) include event classifier 270 (for example, in operating system 226In) and corresponding application program 236-1 (for example, any one application in aforementioned applications program 237-251,255,480-490Program).

The application program 236-1 and answer that event information is delivered to by the reception event information of event classifier 270 and determinationWith program 236-1 application view 291.Event classifier 270 includes event monitor 271 and event dispatcher module274.In some embodiments, application program 236-1 includes application program internal state 292, the application program internal stateIndicate the one or more current applications being displayed on when application program is activity or is carrying out on touch-sensitive display 212Views.In some embodiments, equipment/global internal state 257 is used for which (which determined by event classifier 270Application program is currently movable a bit), and application program internal state 292 is used for determination by thing by event classifier 270The application view 291 that part information is delivered to.

In some embodiments, application program internal state 292 includes additional information, such as one of the followingOr more persons：The recovery information used, instruction are just being employed program 236-1 and shown when application program 236-1 recovers to performInformation or be ready for being employed the user interface state information for the information that program 236-1 is shown, for enabling a user toEnough return to application program 236-1 previous state or the state queue of view and the repetition of prior actions that user takes/Cancel queue.

Event monitor 271 receives event information from peripheral interface 218.Event information is included on subevent (exampleSuch as, as on the touch-sensitive display 212 of a part for multi-touch gesture user touch) information.Peripheral interface 218Transmitting it, (such as proximity transducer 266, accelerometer 268 and/or microphone 213 (pass through from I/O subsystems 206 or sensorVoicefrequency circuit 210)) receive information.The information that peripheral interface 218 receives from I/O subsystems 206 is included from touch-sensitive aobviousShow the information of device 212 or touch sensitive surface.

In some embodiments, event monitor 271 sends the request to ancillary equipment at predetermined intervalsInterface 218.As response, the transmitting event information of peripheral interface 218.In other embodiments, peripheral interface 218Only when exist notable event (for example, receive higher than predetermined noise threshold input and/or receive more than in advance reallyThe input of fixed duration) when ability transmitting event information.

In some embodiments, event classifier 270 also includes hit view determination module 272 and/or life eventIdentifier determining module 273.

When touch-sensitive display 212 shows more than one view, hit view determination module 272 is provided for determining sub- thingThe part software process where occurred in one or more views.The control that view can be checked over the display by userPart and other elements are formed.

The another aspect of the user interface associated with application program is one group of view, otherwise referred to as applies journey hereinSequence view or user interface windows, wherein display information and occur the gesture based on touch.Touch is detected whereinThe sequencing water that (corresponding application programs) application view corresponds in the sequencing or view hierarchies structure of application programIt is flat.For example, detect that the floor level view of touch is called hit view wherein, and be identified as correctly entering thatThe hit view for the initial touch that group event is based at least partially on the gesture for starting based on touch determines.

Hit view determination module 272 and receive the information related to the subevent of the gesture based on touch.Work as application programDuring with the multiple views organized in hierarchy, hit view determination module 272 will hit view, and be identified as should be to sub- thingMinimum view in the hierarchy that part is handled.In most cases, hit view is to initiate subevent (for example, shapeThe first subevent into the subevent sequence of event or potential event) the floor level view that occurs wherein.Once hitView is hit view determination module 272 and identified, hit view just generally receive with its be identified as hit view it is targetedSame touch or all subevents of input source correlation.

It is specific that life event identifier determining module 273 determines which or which view in view hierarchies structure should receiveSubevent sequence.In some embodiments, life event identifier determining module 273 determines that only hit view should just receiveSpecific subevent sequence.In other embodiments, life event identifier determining module 273 determines the physics for including subeventAll views of position are the active views participated in, it is thus determined that all views actively participated in should receive specific subevent sequenceRow.In other embodiments, even if touch subevent is confined to the region associated with a particular figure completely, butHigher view will remain in that view for active participation in hierarchy.

Event information is assigned to event recognizer (for example, event recognizer 280) by event dispatcher module 274.WrappingIn the embodiment for including life event identifier determining module 273, event information is delivered to by living by event dispatcher module 274The dynamic definite event identifier of event recognizer determining module 273.In some embodiments, event dispatcher module 274 existsEvent information is stored in event queue, the event information is retrieved by corresponding event receiver 282.

In some embodiments, operating system 226 includes event classifier 270.Alternatively, application program 236-1 bagsInclude event classifier 270.In still another embodiment, event classifier 270 is independent module, or is stored inA part for another module (such as contact/motion module 230) in memory 202.

In some embodiments, application program 236-1 includes multiple button.onreleases 290 and one or more applicationsViews 291, wherein each application view includes being used to handle occurring to regard in the corresponding of user interface of application programThe instruction of touch event in figure.Application program 236-1 each application view 291 includes one or more event recognitionsDevice 280.Generally, corresponding application programs view 291 includes multiple event recognizers 280.In other embodiments, event recognitionOne or more of device 280 event recognizer is a part for standalone module, the standalone module such as user interface tool bag(not shown) or application program the 236-1 therefrom higher levels of object of inheritance method and other attributes.In some embodimentsIn, corresponding event processing routine 290 includes one or more of the following：Data renovator 276, object renovator 277,GUI renovators 278, and/or the event data 279 received from event classifier 270.Button.onrelease 290 is utilized or adjustedWith data renovator 276, object renovator 277 or GUI renovators 278, with more new application internal state 292.It is alternativeOne or more of ground, application view 291 application view includes one or more corresponding event processing routines290.In addition, in some embodiments, one of data renovator 276, object renovator 277 and GUI renovators 278 orMore persons are included in corresponding application programs view 291.

Corresponding event identifier 280 from event classifier 270 receive event information (for example, event data 279), and fromEvent information identification events.Event recognizer 280 includes Event receiver 282 and event comparator 284.In some embodiment partyIn case, event recognizer 280 also comprises at least the subset of the following：(it is wrapped for metadata 283 and event delivery instruction 288Enclosed tool event delivery instructs).

Event receiver 282 receives event information from event classifier 270.Event information includes for example touching on subeventTouch or touch the information of movement.According to subevent, event information also includes the position of additional information such as subevent.Work as subeventWhen being related to the motion of touch, speed and direction of the event information also including subevent.In some embodiments, event includes settingIt is standby to rotate to another orientation (for example, rotate to horizontal orientation from machine-direction oriented, or vice versa as the same), and event from an orientationInformation includes the corresponding informance of the current orientation (also referred to as equipment posture) on equipment.

Compared with event comparator 284 defines event information with predefined event or subevent, and it is based on being somebody's turn to doCompare to determine event or subevent, or determination or the state of update event or subevent.In some embodiments, eventComparator 284 includes event and defines 286.Event defines 286 definition (for example, predefined subevent sequence) for including event,Such as event 1 (287-1), event 2 (287-2) and other events.In some embodiments, the sub- thing in event (287)Part for example starts including touch, touches and terminate, touch mobile, touch cancellation and multiple point touching.In one example, event 1The definition of (287-1) is the double-click on shown object.For example, double-click the of the predetermined duration included on shown objectOnce touch (touch starts), the first time of predetermined duration lifts (touch terminates), is shown predefining on objectSecond of touch (touch starts) of duration and lifting for the second time (touch terminates) for predetermined duration.Show at anotherIn example, the definition of event 2 (287-2) is the dragging on shown object.For example, dragging is including advance true on shown objectThe movement and touch of the long touch of timing (or contact), touch on touch-sensitive display 212 are lifted (touch terminates).In some embodiments, event also includes the information for being used for one or more associated button.onreleases 290.

In some embodiments, event defines 287 and includes being used for the definition of the event of respective user interfaces object.OneIn a little embodiments, event comparator 284 performs hit test, to determine which user interface object is associated with subevent.For example, shown on touch display 212 in the application view of three user interface objects, when in touch-sensitive display 212On when detecting touch, event comparator 284 performs hit test to determine which of these three user interface objects are usedFamily interface object is associated with the touch (subevent).If each shown object and the corresponding phase of button.onrelease 290Association, the then result that event comparator is tested using the hit determine which button.onrelease 290 should be activated.ExampleSuch as, the selection of event comparator 284 button.onrelease associated with the object of subevent and triggering hit test.

In some embodiments, the definition of corresponding event (287) also includes delay voltage, delay voltage delay eventThe delivering of information, until having determined that whether subevent sequence exactly corresponds to or do not correspond to the event type of event recognizer.

, should when corresponding event identifier 280 determines that any event that subevent sequence is not defined with event in 286 matchesThe entry event of corresponding event identifier 280 is impossible, event fails or event done state, ignores after this based on touchThe follow-up subevent of gesture.In this case, for hit view holding activity other event recognizers (ifWords) continue to track and handle the subevent of the lasting gesture based on touch.

In some embodiments, corresponding event identifier 280 includes having how instruction event delivery system should be heldConfigurable attribute, mark and/or the metadata of list 283 that row is delivered the subevent of the event recognizer of active participation.In some embodiments, metadata 283 includes indicating how event recognizer interacts or how to interact each other and be configurableAttribute, mark and/or list.In some embodiments, metadata 283 includes whether instruction subevent is delivered to view or journeyConfigurable attribute, mark and/or the list of different levels in sequence hierarchy.

In some embodiments, when one or more specific subevents of identification events, corresponding event identifier 280The activation button.onrelease 290 associated with event.In some embodiments, corresponding event identifier 280 will be with eventAssociated event information is delivered to button.onrelease 290.Activation button.onrelease 290 is different from sending subevent(and delaying to send) hits view to corresponding.In some embodiments, event recognizer 280 is dished out and the event phase that is identifiedThe mark of association, and the button.onrelease 290 associated with the mark obtains the mark and performs predefined process.

In some embodiments, event delivery instruction 288 includes event information of the delivering on subevent without activatingThe subevent delivery instructions of button.onrelease.On the contrary, event information is delivered to and subevent string phase by subevent delivery instructionsThe button.onrelease of association or the view for being delivered to active participation.The thing associated with subevent string or the view actively participated inPart processing routine receives event information and performs predetermined process.

In some embodiments, data renovator 276 creates and updated the data used in application program 236-1.For example, data renovator 276 is updated to the telephone number used in contact module 237, or to video playbackVideo file used in device module is stored.In some embodiments, object renovator 277 is created and updated and answeringWith the object used in program 236-1.For example, object renovator 277 creates new user interface object or more new user interface pairThe position of elephant.GUI renovators 278 update GUI.For example, GUI renovators 278 prepare display information and send it to figure mouldBlock 232, for the display on touch-sensitive display.

In some embodiments, one or more button.onreleases 290 include data renovator 276, object updatesDevice 277 and GUI renovators 278 or with access to data renovator 276, object renovator 277 and GUI renovators 278Authority.In some embodiments, data renovator 276, object renovator 277 and GUI renovators 278, which are included in, accordingly shouldWith in program 236-1 or the individual module of application view 291.In other embodiments, they be included in two orIn more software modules.

It should be appreciated that the discussed above of the event handling touched on the user on touch-sensitive display is applied also for using defeatedEnter user's input that equipment carrys out the other forms of operating multifunction equipment 200, not all user's input is all in touch-screenUpper initiation.For example, optionally with single or multiple keyboard pressings or the mouse movement of combination and mouse button is kept to press；TouchContact movement, touch, dragging, rolling etc. in template；Stylus inputs；The movement of equipment；Spoken command；DetectedEyes move；Biological characteristic inputs；And/or as it is corresponding with subevent input be used for define the event to be identified itsThe combination of meaning.

Fig. 3 shows the portable multifunction device 200 with touch-screen 212 according to some embodiments.Touch-screenThe one or more figures of display optionally in user interface (UI) 300.In the present embodiment and it is described belowIn other embodiments, user can be by, for example, one or more fingers 302 (in figure be not drawn to scale) or oneIndividual or multiple stylus 303 (being not drawn to scale in figure) make gesture to select one or more in these figures on figureIndividual figure.In some embodiments, when user interrupts the contact with one or more figures, will occur to one or moreThe selection of figure.In some embodiments, gesture optionally includes one or many touches, one or many gently swept (from a left sideTo the right, from right to left, up and/or down the finger), and/or being in contact with equipment 200 rolling (from right to left, fromFrom left to right, up and/or down).In some specific implementations or in some cases, it will not inadvertently be selected with pattern contactSelect figure.For example, when gesture corresponding with selection is touch, what is swept above application icon gently sweeps gesture optionallyCorresponding application programs will not be selected.

Equipment 200 also includes one or more physical buttons, such as " home " button or menu button 304.Such as preceding instituteState, menu button 304 is used for any application program 236 navigate in the one group of application program performed on the device 200.It is alternativeGround, in some embodiments, menu button are implemented as the soft key in the GUI that is displayed on touch-screen 212.

In some embodiments, equipment 200 includes touch-screen 212, menu button 304, for making equipment power on/offWith for the push button 306 of locking device, one or more volume knobs 308, subscriber identity module (SIM) neck310th, earphone jack 312 and docking/charging external port 224.Push button 306 is optionally used to：By pressing button simultaneouslyButton is set to keep predetermined time interval to make equipment power on/off in pressed status；By pressing button and passing throughRelease button carrys out locking device before predetermined time interval；And/or releasing process is unlocked or initiated to equipment.In alternative embodiment, equipment 200 is also received defeated for the voice that activates or deactivate some functions by microphone 213Enter.One or more contact strengths that equipment 200 also optionally includes being used to detect the intensity of the contact on touch-screen 212 passSensor 265, and/or for generating one or more tactile output generators 267 of tactile output for the user of equipment 200.

Fig. 4 is the block diagram according to the exemplary multifunctional equipment with display and touch sensitive surface of some embodiments.Equipment 400 needs not be portable.In some embodiments, equipment 400 is laptop computer, desktop computer, flat board electricityBrain, multimedia player device, navigation equipment, educational facilities (such as children for learning toy), games system or control device (exampleSuch as, household controller or industrial controller).Equipment 400 generally include one or more processing units (CPU) 410, one orMultiple networks or other communication interfaces 460, memory 470 and one or more communication bus for making these component connections420.Communication bus 420 is optionally including making the circuit of the communication between system unit interconnection and control system part (sometimesIt is called chipset).Equipment 400 includes input/output (I/O) interface 430 with display 440, and the display is typically to touchTouch panel type display.I/O interfaces 430 also optionally include keyboard and/or mouse (or other sensing equipments) 450 and touch pad455th, for generating the tactile output generator 457 of tactile output on device 400 (for example, similar to above with reference to Fig. 2A institutesThe one or more tactile output generators 267 stated), sensor 459 is (for example, optical sensor, acceleration transducer, closeSensor, touch-sensitive sensors, and/or similar to one or more contact strength sensors 265 above with reference to described in Fig. 2AContact strength sensor).Memory 470 include high-speed random access memory such as DRAM, SRAM, DDR RAM or other withMachine access solid-state memory device, and optionally include such as one or more disk storage equipments of nonvolatile memory,Optical disc memory apparatus, flash memory device or other non-volatile solid-state memory devices.Memory 470 optionally includes remoteFrom one or more storage devices of one or more CPU 410 positioning.In some embodiments, memory 470 storage withThe program, the module that are stored in the memory 202 of portable multifunction device 200 (Fig. 2A) journey similar with data structureSequence, module and data structure or their subset.In addition, memory 470 is optionally stored in portable multifunction device 200Appendage, module and the data structure being not present in memory 202.For example, the memory 470 of equipment 400 optionally storesGraphics module 480, module 482, word processing module 484, website creation module 486, disk editor module 488, and/or electricity is presentedSub-table module 490, and the memory 202 of portable multifunction device 200 (Fig. 2A) does not store these modules optionally.

In some instances, each element in the said elements in Fig. 4 can be stored in one or more mentioned aboveMemory devices in.Each module in above-mentioned module corresponds to the instruction set for being used for performing above-mentioned function.Above-mentioned module orProgram (for example, instruction set) need not be implemented as single software program, process or module, and therefore these modules is variousSubset can in various embodiments be combined or otherwise rearrange.In some embodiments, memory 470 is depositedStore up the subset of above-mentioned module and data structure.In addition, memory 470 stores the add-on module and data knot being not described aboveStructure.

Attention is drawn to can be in the embodiment party for the user interface for example realized on portable multifunction device 200Case.

Fig. 5 A show the example of the application menu on the portable multifunction device 200 according to some embodimentsProperty user interface.Similar user interface is realized on device 400.In some embodiments, user interface 500 includes followingElement or its subset or superset：

One or more signal intensities for one or more radio communications such as cellular signal and Wi-Fi signal indicateSymbol 502；

Time 504；

Bluetooth designator 505；

Battery status indicator 506；

Pallet 508 with the icon for commonly using application program, such as：

The icon 516 for being marked as " phone " of o phone modules 238, the icon 516 optionally include missed call or languageThe designator 514 of the quantity of sound message；

The icon 518 of the mark " mail " of o email clients module 240, the icon 518 optionally include not readingThe designator 510 of the quantity of Email；

The icon 520 for being marked as " browser " of o browser modules 247；With

O videos and musical player module 252 (also referred to as iPod (Apple Inc. trade mark) module 252) is markedIt is designated as the icon 522 of " iPod "；And

The icon of other applications, such as：

The icon 524 for being marked as " message " of o IM modules 241；

The icon 526 for being marked as " calendar " of o calendaring modules 248；

The icon 528 for being marked as " photo " of o image management modules 244；

The icon 530 for being marked as " camera " of o camera models 243；

The icon 532 for being marked as " Online Video " of o Online Videos module 255；

The o stock markets desktop small routine 249-2 icon 534 for being marked as " stock market "；

The icon 536 for being marked as " map " of o mapping modules 254；

The o weather desktop small routines 249-1 icon 538 for being marked as " weather "；

The o alarm clock desktop small routines 249-4 icon 540 for being marked as " clock "；

The icon 542 for being marked as " body-building support " of o body-building support module 242；

The icon 544 for being marked as " notepad " of o notepad modules 253；With

O is used for the icon 546 for being marked as " setting " for setting application program or module, and the icon 546 is provided to equipment200 and its various application programs 236 setting access.

It should indicate, the icon label shown in Fig. 5 A is only exemplary.For example, video and music player mouldThe icon 522 of block 252 is optionally marked as " music " or " music player ".Optionally make for various application iconsWith other labels.In some embodiments, the label of corresponding application programs icon includes and the corresponding application programs icon pairThe title for the application program answered.In some embodiments, the label of application-specific icon is different from and the application-specificThe title of application program corresponding to program icon.

Fig. 5 B are shown with (the example of touch sensitive surface 551 separated with display 550 (for example, touch-screen display 212)Such as, Fig. 4 tablet personal computer or touch pad 455) equipment (for example, Fig. 4 equipment 400) on exemplary user interface.Equipment400 also optionally include one or more contact strength sensors of the intensity for detecting the contact on touch sensitive surface 551(for example, one or more of sensor 457 sensor), and/or one for generating tactile output for the user of equipment 400Individual or multiple tactile output generators 459.

Although it will be provided then with reference to the input on touch-screen display 212 (being wherein combined with touch sensitive surface and display)Example in some examples, it is but in some embodiments, defeated on the touch sensitive surface that equipment detection separates with displayEnter, as shown in Figure 5 B.In some embodiments, touch sensitive surface (for example, 551 in Fig. 5 B) have with display (for example,550) main shaft (for example, 552 in Fig. 5 B) corresponding to the main shaft (for example, 553 in Fig. 5 B) on.According to these embodiments,In position corresponding with the relevant position on display, (for example, in Fig. 5 B, 560 correspond to 568 and 562 pairs for equipment detectionShould be in the contact with touch sensitive surface 551 of 570) place (for example, 560 in Fig. 5 B and 562).So, in touch sensitive surface (for example, figureIn 5B when 551) being separated with the display (550 in Fig. 5 B) of multifunctional equipment, detected by equipment on touch sensitive surfaceUser's input (for example, contact 560 and 562 and their movement) is used to manipulate the user interface on display by the equipment.It should be appreciated that similar method is optionally for other users interface as described herein.

In addition, though mostly in reference to finger input (for example, finger contact, singly refer to Flick gesture, finger gently sweeps gesture)To provide following example it should be appreciated that in some embodiments, one or more of these fingers inputFinger input is substituted by the input (for example, input or stylus based on mouse input) from another input equipment.It is for example, lightSweep gesture and (for example, rather than contact) optionally clicked on by mouse, be cursor afterwards along the moving of the path gently swept (for example,Rather than the movement of contact) substitute.And for example, Flick gesture is optionally by when cursor is located above the position of Flick gestureMouse is clicked on (for example, rather than the detection to contact, be termination detection contact afterwards) and substituted.Similarly, detected when simultaneouslyWhen being inputted to multiple users, it should be appreciated that multiple computer mouses are optionally used simultaneously, or mouse and finger contactOptionally it is used simultaneously.

Fig. 6 A show exemplary personal electronic equipments 600.Equipment 600 includes main body 602.In some embodiments,Equipment 600 is included for some or all of the feature described in equipment 200 and 400 (for example, Fig. 2A -4) feature.In some realitiesApply in scheme, equipment 600 has the touch-sensitive display panel 604 of hereinafter referred to as touch-screen 604.Replacement as touch-screen 604Or supplement, equipment 600 have display and touch sensitive surface.As the situation of equipment 200 and 400, in some embodimentsIn, touch-screen 604 (or touch sensitive surface) has the one or more for the intensity for being used to detect applied contact (for example, touch)Intensity sensor.One or more of intensity sensors of touch-screen 604 (or touch sensitive surface) can provide for representing to touchIntensity output data.The user interface of equipment 600 is responded based on touch intensity to touch, it means that different strongThe touch of degree can call the different operating user interfaces in equipment 600.

Technology for detecting and handling touch intensity is seen in for example following related application：May 8 in 2013Entitled " Device, Method, and Graphical User Interface for the Displaying User that day submitsInterface Objects Corresponding to an Application " international patent application serial number PCT/US2013/040061, and entitled " Device, Method, the and Graphical submitted on November 11st, 2013User Interface for Transitioning Between Touch Input to Display OutputRelationships " international patent application serial number PCT/US2013/069483, it is each special in the two patent applicationsProfit application is incorporated by reference in its entirety herein accordingly.

In some embodiments, equipment 600 has one or more input mechanisms 606 and 608.The He of input mechanism 606608 (if including) were physical form.Being physically entered the example of mechanism includes push button and Rotatable mechanism.OneIn a little embodiments, equipment 600 has one or more attachment means.Such attachment means (if including) can allow byEquipment 600 and such as cap, glasses, earrings, necklace, shirt, jacket, bracelet, watchband, bangle, trousers, belt, shoes, moneyBag, knapsack etc. are attached.These attachment means allow user's wearable device 600.

Fig. 6 B show exemplary personal electronic equipments 600.In some embodiments, equipment 600 includes reference chartSome or all of part described in 2A, Fig. 2 B and Fig. 4 part.Equipment 600 has a bus 612, and the bus is by I/O parts614 operatively couple with one or more computer processors 616 and memory 618.I/O parts 614 are connected to display604, the display 604 can be with touch sensing element 622 and optionally also with touch intensity sensing unit 624.In addition, I/OPart 614 is connected with communication unit 630, for using Wi-Fi, bluetooth, near-field communication (NFC), honeycomb and/or other are wirelessThe communication technology receives application program and operating system data.Equipment 600 includes input mechanism 606 and/or 608.For example, inputMechanism 606 is rotatable input equipment or pressable input equipment and rotatable input equipment.In some instances, it is defeatedIt is button to enter mechanism 608.

In some instances, input mechanism 608 is microphone.Personal electronic equipments 600 include for example various sensors, allSuch as GPS sensor 632, accelerometer 634, orientation sensor 640 (for example, compass), gyroscope 636, motion sensor 638And/or its combination, all these equipment are operatively connectable to I/O parts 614.

The memory 618 of personal electronic equipments 600 is can for storing the non-transient computer of computer executable instructionsStorage medium is read, the instruction by one or more computer processors 616 when being performed, such as computer processor is performedTechnique described below and process.The computer executable instructions are also for example in any non-transient computer readable storage medium storing program for executingStored and/or transmitted, for instruction execution system, device or equipment such as computer based system, include processorSystem or can be obtained from instruction execution system, device or equipment the other systems of instruction and execute instruction using or tie with itClose.Personal electronic equipments 600 are not limited to Fig. 6 B part and configuration, but may include the miscellaneous part in various configurations or additionalPart.

As used herein, refer to can be in such as equipment 200,400 and/or 600 (Fig. 2, Fig. 4 and figure for term " showing to represent "6) the user mutual formula graphical user interface object of display screen display.For example, image (for example, icon), button and text(for example, hyperlink), which is each formed, to be shown and can represent.

As used herein, term " focus selector " refers to the user interface just interacted for instruction userThe input element of current portions.In some specific implementations including cursor or other positions mark, cursor serves as " focus selectionDevice " so that when cursor is above particular user interface element (for example, button, window, sliding block or other users interface element)When detect input (for example, pressing on touch sensitive surface (for example, touch sensitive surface 551 in touch pad 455 or Fig. 5 B in Fig. 4)Pressure input) in the case of, the particular user interface element is conditioned according to detected input.Including can realize withThe touch-screen display of the direct interaction of user interface element on touch-screen display is (for example, the touch-sensitive display in Fig. 2ATouch-screen 212 in system 212 or Fig. 5 A) some specific implementations in, the detected contact on touch-screen is served as " burntPoint selection device " so that when on touch-screen display particular user interface element (for example, button, window, sliding block or otherUser interface element) opening position detect input (for example, by contact carry out pressing input) when, the particular user interface memberElement is conditioned according to detected input.In some specific implementations, focus is moved to from a region of user interfaceAnother region of user interface, the movement of the contact in corresponding movement or touch-screen display without cursor is (for example, logicalCross and focus is moved to another button from a button using Tab key or arrow key)；In these specific implementations, focus choosingSelect device mobile according to the focus between the different zones of user interface and move.The specific shape that focus selector is taken is not consideredFormula, focus selector be typically from user's control so as to transmit with the user of user interface expected from interact (for example, pass through toThe element that the user of equipment indicative user interface it is expected to interact) user interface element (or on touch-screen displayContact).For example, when detecting pressing input on touch sensitive surface (for example, touch pad or touch-screen), focus selector (exampleSuch as, cursor, contact or choice box) instruction user it is expected to activate the corresponding button and (rather than set by position above the corresponding buttonThe other users interface element shown on standby display).

As used in the specification and in the claims, " property strengths " of the contact term refers to one based on contactOr the characteristic of the contact of multiple intensity.In some embodiments, property strengths are based on multiple intensity samples.Property strengths are optionalGround is based on (for example, after contact is detected, before detecting that contact is lifted, connecing detecting relative to predefined eventTouch start it is mobile before or after, before detecting that contact terminates, before or after the intensity for detecting contact increases and/Or detect contact intensity reduce before or after) for the predetermined period (for example, 0.05 second, 0.1 second,0.2 second, 0.5 second, 1 second, 2 seconds, 5 seconds, 10 seconds) during collection predefined quantity intensity sample or one group of intensity sample.ConnectTactile property strengths are optionally based on one or more of the following：The maximum of contact strength, the average of contact strength,90% maximum of value, half maximum of contact strength, contact strength at the average value of contact strength, preceding the 10% of contact strengthValue etc..In some embodiments, it is determined that using the duration of contact (for example, being to connect in property strengths during property strengthsDuring tactile intensity average value in time).In some embodiments, by property strengths and one group of one or more intensity thresholdValue is compared, to determine whether executed operates user.For example, the one or more intensity thresholds of the group include the first intensity thresholdValue and the second intensity threshold.In this example, contact of the property strengths not less than first threshold causes the first operation, property strengthsContact more than the first intensity threshold but not less than the second intensity threshold causes the second operation, and property strengths are more than the second thresholdThe contact of value causes the 3rd operation.In some embodiments, using the comparison between property strengths and one or more threshold valuesTo determine whether to perform one or more operations (for example, being to perform corresponding operating to be also to give up performing corresponding operating), withoutIt is to be used to determine to perform the first operation or the second operation.

In some embodiments, identify a part for gesture for determination property strengths.For example, touch sensitive surface receivesContact is gently continuously swept, this is continuously gently swept contact from original position transition and reaches end position, at the end position, connectsTactile intensity increase.In this example, contact the characteristic strength at end position and be based only upon the continuous part for gently sweeping contact,Rather than entirely gently sweep contact (for example, gently sweeping contact portion only at end position).In some embodiments, it is determined thatThe forward direction of the property strengths of contact gently sweeps the intensity application smoothing algorithm of gesture.For example, the smoothing algorithm optionally includesOne or more of the following：It is smooth that moving average smoothing algorithm, triangle smoothing algorithm, median filter are not weightedChange algorithm and/or exponential smoothing algorithm.In some cases, these smoothing algorithms, which eliminate, gently sweeps in the intensity of contactNarrow spike or depression, to realize the purpose for determining property strengths.

Contact performance intensity from the intensity less than light press intensity threshold increase between light press intensity threshold with it is deep byIntensity between Compressive Strength threshold value is sometimes referred to as " light press " input.Contact performance intensity is from less than deep pressing intensity thresholdThe intensity that intensity increases to above deep pressing intensity threshold is sometimes referred to as " deep pressing " input.Contact performance intensity is from less than connecingThe intensity for touching detection intensity threshold value is increased between the intensity contacted between detection intensity threshold value and light press intensity threshold sometimesIt is referred to as detecting the contact on touch-surface.Contact characteristic intensity is reduced to low from the intensity higher than contact detection intensity threshold valueSometimes referred to as detect that contact is lifted from touch-surface in the intensity of contact detection intensity threshold value.In some embodiments,It is zero to contact detection intensity threshold value.In some embodiments, contact detection intensity threshold value and be more than zero.

Herein in some described embodiments, in response to detecting gesture or response including corresponding pressing inputOne or more operations are performed in detecting the corresponding pressing performed using corresponding contact (or multiple contacts) input, wherein extremelyIt is at least partly based on and detects that the intensity of the contact (or multiple contacts) increases to above pressing input intensity threshold value and detectedCorresponding pressing inputs.In some embodiments, in response to detecting that it is strong that the intensity of corresponding contact increases to above pressing inputThreshold value (for example, " downward stroke " of corresponding pressing input) is spent to perform corresponding operating.In some embodiments, pressing inputIntensity including corresponding contact increase to above pressing input intensity threshold value and the contact intensity be decreased subsequently to be less than byPress input intensity threshold value, and in response to detect the intensity of corresponding contact be decreased subsequently to less than pressing input threshold value (for example," up stroke " of corresponding pressing input) perform corresponding operating.

In some embodiments, the accident that equipment uses intensity hysteresis to avoid sometimes referred to as " shaking " inputs, itsMiddle equipment limits or selection has the hysteresis intensity threshold of predefined relation with pressing input intensity threshold value (for example, hysteresis intensityThreshold value than the low X volume unit of pressing input intensity threshold value, or hysteresis intensity threshold be pressing input intensity threshold value 75%,90% or some rational proportion).Therefore, in some embodiments, pressing input includes the intensity of corresponding contact and increases to heightIt is decreased subsequently to be less than the hysteresis for corresponding to pressing input intensity threshold value in the intensity of pressing input intensity threshold value and the contactIntensity threshold, and in response to detecting that the intensity of corresponding contact is decreased subsequently to less than hysteresis intensity threshold (for example, accordingly pressingPress " up stroke " of input) perform corresponding operating.Similarly, in some embodiments, only contact is detected in equipmentIntensity from equal to or less than hysteresis intensity threshold intensity increase to equal to or higher than pressing input intensity threshold value intensity andOptionally contact strength is decreased subsequently to be equal to or less than just detect pressing input during the intensity of hysteresis intensity, and in response toPressing input (for example, according to environment, contact strength increase or contact strength reduce) is detected to perform corresponding operating.

In order to easily explain, optionally, triggered in response to detecting any of following various situations situation to soundThe associated pressing input of Ying Yuyu pressing input intensity threshold values or the operation performed in response to the gesture including pressing inputDescription：The intensity of contact increases to above pressing input intensity threshold value, the intensity of contact from strong less than hysteresis intensity thresholdDegree increase to above pressing input intensity threshold value intensity, contact intensity be decreased below press input intensity threshold value, and/orThe intensity of contact is decreased below hysteresis intensity threshold corresponding with pressing input intensity threshold value.In addition, describe the operations asIn the example that intensity in response to detecting contact is decreased below pressing input intensity threshold value and performed, it is optionally in response to examineThe intensity for measuring contact is decreased below corresponding to and performs behaviour less than the hysteresis intensity threshold of pressing input intensity threshold valueMake.

3rd, digital assistant

Fig. 7 A show the block diagram of the digital assistant 700 according to various examples.In some instances, digital assistant700 realize in freestanding computer system.In some instances, digital assistant 700 is distributed across multiple computers.OneIn a little examples, some modules and function in the module and function of digital assistants are divided into server section and client endPoint, wherein client part be located at one or more user equipmenies (for example, equipment 104,122,200,400 or 600) on andCommunicated by one or more networks with server section (for example, server system 108), as shown in Figure 1.Show at someIn example, digital assistant 700 is the specific implementation of the server system 108 (and/or DA servers 106) shown in Fig. 1.It should be pointed out that digital assistant 700 is only an example of digital assistant, and digital assistant 700 canWith more or less parts than showing, can be combined two or more parts or can have part it is different configure orArrangement.Various parts shown in Fig. 7 A are in hardware, the software instruction for being performed by one or more processors, firmware (bagInclude one or more signal processing integrated circuits and/or application specific integrated circuit) or combinations thereof in realize.

Digital assistant 700 includes memory 702, one or more processors 704, input/output (I/O) interface706 and network communication interface 708.These parts can be communicated with one another by one or more communication bus or signal wire 710.

In some instances, memory 702 includes non-transitory computer-readable medium, such as high-speed random access memoryAnd/or non-volatile computer readable storage medium storing program for executing (for example, one or more disk storage equipments, flash memory device orOther non-volatile solid state memory equipment).

In some instances, I/O interfaces 706 such as show the input-output apparatus 716 of digital assistant 700Device, keyboard, touch-screen and microphone are coupled to subscriber interface module 722.The I/O interfaces 706 combined with subscriber interface module 722User is received to input (for example, phonetic entry, input through keyboard, touch input etc.) and correspondingly handle these inputs.In some examples, such as when digital assistants are being implemented on free-standing user equipment, digital assistant 700 includes relative respectivelyIn the part and I/O communication interfaces described by equipment 200, equipment 400 or equipment 600 in Fig. 2A, Fig. 4, Fig. 6 A-BAny one.In some instances, digital assistant 700 represents the server section of digital assistants specific implementation, and can lead toThe client-side aspects crossed on user equipment (for example, equipment 104, equipment 200, equipment 400 or equipment 600) and userInteract.

In some instances, network communication interface 708 includes one or more wired connection ports 712, and/or wirelessTransmission and receiving circuit 714.One or more wired connection ports are via one or more wireline interfaces such as Ethernet, generalUniversal serial bus (USB), live wire etc. receive and sent signal of communication.Radio-circuit 714 is from communication network and other communication equipmentsReceive RF signals and/or optical signalling and send RF signals and/or optical signalling to communication network and other communications and setIt is standby.Radio communication uses any of a variety of communication standards, agreement and technology, such as GSM, EDGE, CDMA, TDMA, indigo plantTooth, Wi-Fi, VoIP, Wi-MAX or any other suitable communication protocol.Network communication interface 708 makes digital assistant700 by network, such as internet, Intranet and/or wireless network such as cellular phone network, WLAN (LAN) and/Or communication of the Metropolitan Area Network (MAN) (MAN) between other equipment is possibly realized.

In some instances, the computer-readable recording medium storage of memory 702 or memory 702 includes herein belowIn whole or its subset program, module, instruction and data structure：Operating system 718, communication module 720, user interface mouldBlock 722, one or more application programs 724 and digital assistant module 726.Specifically, the meter of memory 702 or memory 702Calculation machine readable storage medium storing program for executing stores the instruction for performing process described below.One or more processors 704 perform theseProgram, module and instruction, and read data from data structure or write data into data structure.

Operating system 718 is (for example, Darwin, RTXC, LINUX, UNIX, iOS, OS X, WINDOWS or embedded operationSystem such as VxWorks) include be used for control and manage general system task (for example, memory management, storage device control,Power management etc.) various software parts and/or driver, and be advantageous between various hardware, firmware and software partCommunication.

Communication module 720 facilitates what is carried out between digital assistant 700 and other equipment by network communication interface 708Communication.For example, communication module 720 and electronic equipment such as equipment 200, the equipment as shown in Fig. 2A, Fig. 4, Fig. 6 A-B respectively400 and the RF circuits 208 of equipment 600 communicated.Communication module 720 also includes being used to handle by radio-circuit 714 and/or hadThe various parts of the received data of line COM1 712.

Subscriber interface module 722 is via I/O interfaces 706 from user (for example, from keyboard, touch-screen, sensing equipment, controlDevice and/or microphone processed) order and/or input are received, and user interface object is generated over the display.Subscriber interface module722 be also prepared for export (for example, voice, sound, animation, text, icon, vibration, touch feedback, illumination etc.) and by its viaI/O interfaces 706 (for example, by display, voice-grade channel, loudspeaker, touch pad etc.) are sent to user.

Application program 724 includes being configured as the program and/or module performed by one or more of processors 704.For example, if digital assistant is implemented on free-standing user equipment, application program 724 may include user application,Such as game, calendar applications, navigation application program or mail applications.If digital assistant 700 is in serverUpper implementation, then application program 724 include such as asset management application, diagnosis of application program or scheduling application.

Memory 702 also stores digital assistant module 726 (or server section of digital assistants).In some instances,Digital assistant module 726 includes following submodule or its subset or superset：Input/output processing module 728, voice turn text(STT) processing module 730, natural language processing module 732, dialogue stream processing module 734, task flow processing module 736, clothesProcessing module 738 of being engaged in and voice synthetic module 740.Each module in these modules, which is respectively provided with, to be helped following system or numeralManage one or more of data and model of module 726 or the access rights of its subset or superset：Ontologies 760, wordConverge and index 744, user data 748, task flow model 754, service model 756 and ASR system.

In some instances, helped using the processing module, data and model implemented on digital assistant module 726, numeralAt least some operations in the executable following operation of reason：Phonetic entry is converted into text；Identify the natural language received from userThe intention of the user expressed in speech input；Energetically draw and obtain the information (example needed for for the fully intention of deduction userSuch as, by eliminating the ambiguity of words, game, purpose etc.)；It is determined that the task flow for realizing the intention being inferred to；And performThe intention that task flow is inferred to realizing.

In some instances, as shown in Figure 7 B, I/O processing modules 728 are entered by the I/O equipment 716 in Fig. 7 A with userRow interaction or by the network communication interface 708 in Fig. 7 A and user equipment (for example, equipment 104, equipment 200, equipment 400 orEquipment 600) interact, with obtain user input (for example, phonetic entry) and provide to user input response (for example,As voice output).I/O processing modules 728 are in company with receiving user's input together or after the user input is received soonOptionally obtain the contextual information associated with user's input from user equipment.Contextual information is included specific to userData, vocabulary, and/or the preference related to user's input.In some instances, contextual information, which is additionally included in, receives useThe software and hardware state of user equipment when family is asked, and/or it is related to the surrounding environment of the user when receiving user's requestInformation.In some instances, I/O processing modules 728 also send the follow-up problem relevant with user's request to user, and fromFamily, which receives, answers.When user's request is received by I/O processing modules 728 and user's request includes phonetic entry, I/O processing mouldsPhonetic entry is forwarded to STT processing modules 730 (or speech recognition device) by block 728, is changed for speech text.

STT processing modules 730 include one or more ASR systems.One or more ASR systems, which can be handled, passes through I/OPhonetic entry received by processing module 728, to produce recognition result.Each ASR system includes front end speech preprocessor.Front end speech preprocessor extracts characteristic features from phonetic entry.For example, front end speech preprocessor is held to phonetic entryTo extract spectral signature, phonetic entry is characterized as representative multi-C vector sequence by the spectral signature for row Fourier transform.In addition,Each ASR system includes one or more speech recognition modelings (for example, sound model and/or language model) and implementation oneOr multiple speech recognition engines.The example of speech recognition modeling includes hidden Markov model, gauss hybrid models, depth nerveNetwork model, n gram language models and other statistical models.The example of speech recognition engine includes being based on dynamic time warpingEngine and engine based on FST (WFST).One or more speech recognition modelings and one or more languagesSound identification engine is used for the characteristic features extracted for handling front end speech preprocessor, to produce instant recognition result (exampleSuch as, phoneme, phone string and sub- word) and finally produce text identification result (for example, words, words string or symbol sebolic addressing).OneIn a little examples, phonetic entry at least partially through third party's service or user equipment (for example, equipment 104,200,400 or600) located, to produce recognition result.Once STT processing modules 730 are generated comprising text string (for example, words, words sequenceRow or symbol sebolic addressing) recognition result, just by the recognition result be sent to natural language processing module 732 for be intended to push awayIt is disconnected.

The more details that relevant voice turns text-processing are that in September, 2011 submission of 20 days is entitled" Consolidating Speech Recognition Results " U.S. Utility Patent patent application serial numbers 13/236,It is described in 942, the entire disclosure is herein incorporated by reference.

In some instances, STT processing modules 730 include and/or accessed to know via phonemic alphabet table modular converter 731The vocabulary of malapropism word.One or more candidates of words of each vocabulary words with being represented in speech recognition phonemic alphabet table pronounceIt is associated.Specifically, it can recognize that the vocabulary of words includes the words associated with multiple candidates pronunciation.For example, vocabulary include withCandidate pronouncesWithAssociated words " tomato ".In addition, vocabulary words is with being based on userThe self-defined candidate pronunciation of previous phonetic entry is associated.Such self-defined candidate's pronunciation is stored in STT processing modules 730In and it is associated with specific user via the user profile in equipment.In some instances, the spelling based on words andOne or more language and/or phoneme rules come determine the candidate of words pronounce.In some instances, candidate's hair is manually generatedSound, such as generated based on known RP.

When receiving phonetic entry, STT processing modules 730 are used to determine phoneme corresponding with phonetic entry (for example, makingWith sound model), and then attempt to determine the words (for example, using language model) for matching the phoneme.If for example, STTProcessing module 730 identifies aligned phoneme sequence/t firstCorresponding to a part for phonetic entry, it is then based on wordThe index 744 that converges determines whether the sequence corresponds to words " tomato ".

In some instances, STT processing modules 730 determine the words in language using approximate matching techniques.Therefore, exampleSuch as, STT processing modules 730 can determine that aligned phoneme sequenceCorresponding to words " tomato ", instant specific phoneme sequenceRow are not a candidate phoneme in the candidate phoneme sequence of the words.

The natural language processing module 732 (" natural language processor ") of digital assistants can be obtained by STT processing modulesThe words of 730 generations or the sequence (" symbol sebolic addressing ") of symbol, and attempt the formula symbol sebolic addressing and identified by digital assistantsOne or more " executable to be intended to " are associated." executable to be intended to " is represented to be performed by digital assistants and can had in taskThe task for the associated task flow implemented in flow model 754.Associated task flow is that digital assistants are taken to perform taskA series of actions by programming and step.The limit of power of digital assistants depends in task flow model 754 implementing simultaneouslyThe value volume and range of product of the task flow of storage, or in other words, the quantity of " executable to be intended to " that is identified depending on digital assistants andSpecies.It is inferred to correctly from user's request with natural language expressing however, the validity of digital assistants additionally depends on assistant" one or more executable be intended to " ability.

In some instances, in addition to the words or the sequence of symbol that are obtained from STT processing modules 730, at natural languageReason module 732 also receives the contextual information associated with user's request, such as from I/O processing modules 728.At natural languageReason module 732 optionally using contextual information clearly, supplement and/or further definition be comprised in from STT processing modulesInformation in 730 symbol sebolic addressings received.Contextual information includes the hardware and/or software of such as user preference, user equipmentState, the previous friendship between sensor information, digital assistants and the user collected soon before, during or after user asksMutually (for example, dialogue), etc..As described herein, in some instances, contextual information is dynamic and during with dialogueBetween, position, content and other factors and change.

In some instances, natural language processing is based on such as ontologies 760.Ontologies 760 are to include many sectionsThe hierarchy of point, each node represent " executable to be intended to " or with one of " executable to be intended to " or other " attributes " or morePerson's related " attribute ".As described above, the task that " executable to be intended to " expression digital assistants are able to carry out, the i.e. task are " canPerform " or can be carried out." attribute " represents the parameter associated with the son aspect of executable intention or another attribute.KnowledgeBetween executable intention node and attribute node in body 760 linking limit by attribute node represent parameter how subordinateIn by executable being intended to node and representing for task.

In some instances, ontologies 760 are made up of executable intention node and attribute node.In ontologies 760Interior, each executable node that is intended to is connected directly to or is connected to one or more category by attribute node among one or moreProperty node.Similarly, each attribute node be connected directly to or by attribute node among one or more be connected to one orMultiple executable intention nodes.For example, as seen in figure 7 c, ontologies 760 include " dining room reservation " node (that is, executable meaningNode of graph).Attribute node " dining room ", " date/time " (for subscribing) and " colleague's number " are connected directly to executable meaningNode of graph (that is, " dining room reservation " node).

In addition, attribute node " style of cooking ", " price range ", " telephone number " and " position " is the son in attribute node " dining room "Node, and " dining room reservation " node (that is, executable to be intended to node) is connected to by middle attribute node " dining room ".AgainSuch as, as seen in figure 7 c, ontologies 760 also include " setting is reminded " node (that is, another executable intention node).Attribute sectionPoint " date/time " (being reminded for setting) and " theme " are each attached to " setting is reminded " node (for reminding).Due to categoryProperty " date/time " to carry out dining room reservation task and setting prompting both tasks it is related, therefore attribute node " date/Time " is connected to both " dining room reservation " node and " setting is reminded " nodes in ontologies 760.

The executable node that is intended to is collectively depicted as being in " domain " together with its concept node connected.In this discussion, each domainWith it is corresponding it is executable be intended to associated, and be related to a group node associated with specific executable intention (and between themRelation).For example, the ontologies 760 shown in Fig. 7 C are included in showing for the dining room subscribing domain 762 in ontologies 760Example and the example for reminding domain 764.Dining room subscribing domain include it is executable be intended to node " dining room reservation ", attribute node " dining room "," date/time " and " colleague's number " and sub- attribute node " style of cooking ", " price range ", " telephone number " and " position ".CarryDomain 764 wake up including can perform intention node " set and remind " and attribute node " theme " and " date/time ".In some examplesIn, ontologies 760 are made up of multiple domains.One or more attribute nodes are shared with other one or more domains in each domain.ExampleSuch as, in addition to dining room subscribing domain 762 and prompting domain 764, " date/time " attribute node is also with many not same areas (for example, rowJourney arranges domain, travel reservations domain, film ticket domain etc.) it is associated.

Although Fig. 7 C show two example domains in ontologies 760, other domains include such as " lookup film "," initiation call ", " search direction ", " arrangement meeting ", " transmission message " and " answer that problem is provided ", " reading rowTable ", " offer navigation instruction ", " instruction for task is provided " etc.." transmission message " domain and " transmission message " executable intentionNode is associated, and also includes attribute node such as " one or more recipients ", " type of message " and " message text ".CategoryProperty node " recipient " further can be defined for example by sub- attribute node such as " recipient's name " and " message addresses ".

In some instances, " lookup media item " domain includes super domain, and the super domain is included with searching or obtaining media itemAssociated many executable intention nodes.For example, " lookup media item " domain includes executable intention node, such as " tool is obtainedHave the media item of nearest release data ", " obtain personalized digital media item recommend " or " obtaining the information associated with media item ".

In some instances, ontologies 760 include digital assistants it will be appreciated that and it is worked all domains (andTherefore executable intention).In some instances, ontologies 760 are such as by adding or remove whole domain or node, orChanged by changing the relation between the node in ontologies 760.

In some instances, by the node clusters associated to multiple related executable intentions in ontologies 760Under " super domain ".For example, " travelling " super domain may include the attribute node relevant with travelling and the executable cluster for being intended to node.The executable intention node relevant with travelling include " plane ticket booking ", " hotel reservation ", " automobile leasing ", " route planning ", " seekLook for point of interest " etc..Executable intention node under same super domain (for example, " travelling " super domain) has multiple shared categoryProperty node.For example, for " plane ticket booking ", " hotel reservation ", " automobile leasing ", " route planning ", " searching point of interest " canPerform be intended to nodes sharing attribute node " original position ", " destination ", " departure date/time ", " date of arrival/time " andOne or more of " colleague's number ".

In some instances, each node in ontologies 760 and and the attribute represented by node or executable intentionOne group of relevant words and/or phrase are associated.The words and/or phrase of the respective sets associated with each node be and nodeAssociated so-called " vocabulary ".The words and/or phrase of the respective sets associated with each node are stored in and by nodesIn the attribute of expression or the executable glossarial index 744 for being intended to be associated.For example, return to Fig. 7 B, the node with " dining room " attributeAssociated vocabulary includes words such as " cuisines ", " drinks ", " style of cooking ", " starvation ", " eating ", " Pizza ", snack food, " meals "Deng.And for example, the vocabulary associated with the node of the executable intention of " initiation call " includes words and phrase such as " is exhaledCry ", " making a phone call ", " dialing ", " with ... take on the telephone ", " calling the number ", " phoning " etc..Glossarial index 744 is optionalGround includes the words and phrase of different language.

Natural language processing module 732 receives symbol sebolic addressing (for example, text string) from STT processing modules 730, and determines to accord withWhich node word in number sequence involves.In some instances, if it find that words or phrase in symbol sebolic addressing and knowledge sheetOne or more of body 760 node is associated (via glossarial index 744), then words or phrase by " triggering " or " activation " thatA little nodes.Based on the quantity and/or relative importance for having activated node, natural language processing module 732 will select executable meaningExecutable being intended to perform digital assistants as user view of the task in figure.In some instances, selection has mostThe domain of more " triggering " nodes.In some instances, selection has highest confidence level (for example, each having triggered node based on itsRelative importance) domain.In some instances, domain is selected based on the combination of the quantity and importance that have triggered node.In some examples, additive factor is further contemplated during node is selected, whether such as digital assistants previously are correctly explainedFrom the similar request of user.

User data 748 includes the information specific to user, such as vocabulary, user preference, user specific to userOther short-term or long-term informations of location, the default language of user and second language, the contacts list of user and every user.In some instances, natural language processing module 732 is supplemented included in user's input using the information specific to userInformation, further to limit user view.For example, " inviting my friends to participate in my birthday party " is asked for user, fromRight language processing module 732 is able to access that user data 748 to determine " friend " is that who and " birthday party " will be in whatWhen where hold, clearly provide this type of information in its request without user.

Other details based on symbol string search ontologies are being filed in the entitled " Method on December 22nd, 2008And Apparatus for Searching Using An Active Ontology " U.S. Utility Patent application sequenceIt is described in row number 12/341,743, the entire disclosure is herein incorporated by reference.

In some instances, once natural language processing module 732 be based on user's request identify executable intention (orDomain), just generating structureization inquires about executable intention to represent to be identified to natural language processing module 732.In some instances,Structuralized query includes the parameter for one or more nodes in the executable domain being intended to, and in the parameter at leastSome parameters are filled with the customizing messages specified in user asks and requirement." me is helped to subscribe evening in sushi shop for example, user saysUpper 7 points of seat ".In this case, natural language processing module 732 can be based on user's input executable intention is correctGround is identified as " dining room reservation ".According to ontologies, the structuralized query in " dining room reservation " domain include parameter such as { style of cooking },{ time }, { date }, { colleague's number } etc..In some instances, based on phonetic entry and using STT processing modules 730 fromThe text drawn in phonetic entry, natural language processing module 732 are directed to dining room subscribing domain generating portion structuralized query, whereinPartial structured inquiry includes parameter { style of cooking=" sushi class " } and { time=" at night 7 points " }.However, in this example,User spoken utterances include the information for being not enough to complete the structuralized query associated with domain.Therefore, based on currently available information, tyingOther not specified call parameters such as { colleague's number } and { date } in structureization inquiry.In some instances, natural language processingSome parameters that module 732 is inquired about using the contextual information interstitital textureization received.For example, in some instances, if withFamily request " nearby " sushi shop, natural language processing module 732 are used for carrying out interstitital texture from the GPS coordinates of user equipment{ position } parameter in inquiry.

In some instances, natural language processing module 732 is by the structuralized query of generation (including any completed ginsengNumber) it is sent to task flow processing module 736 (" task stream handle ").Task flow processing module 736 is configured as from natural languageSay that processing module 732 receives structuralized query, complete structuralized query (if necessary), and perform " completion " user and finally askRequired action.In some instances, various processes provide in task flow model 754 necessary to completing these tasks.In some examples, task flow model 754 include be used for obtain the additional information from user process, and for perform with canPerform the task flow for being intended to associated action.

As described above, in order to complete structuralized query, task flow processing module 736 needs to initiate additional pair with userWords, to obtain additional information, and/or understand fully the language being potentially ambiguous.In view of the necessity of such interaction, at task flowReason module 736 calls dialogue stream processor module 734 to participate in the dialogue with user.In some instances, stream handle is talked withModule 734 determines how (and/or when) and asks additional information to user, and receives and processing user response.At I/OProblem is provided to user and received from user and answered by reason module 728.In some instances, dialogue stream processing module 734 passes throughDialogue output is presented to user from audio and/or video frequency output, and receives via oral or physics (for example, click) responseInput from user.Continue example above, when task flow processing module 736 call dialogue stream processing module 734 with forWhen the structuralized query associated with domain " dining room reservation " is to determine " colleague's number " and " date " information, dialogue stream processing module734 generation problems such as " shared how many people's dining" and " specifically when have dinner" to be sent to user.Come from once receivingThe answer of user, dialogue stream processing module 734 is then inquired about with missing information interstitital textureization, or is transferred information at task flowModule 736 is managed to complete missing information according to structuralized query.

Once task stream handle 736 is intended to complete structuralized query for executable, task flow processing module 736 is justSet about performing the final task associated with can perform intention.Therefore, task flow processing module 736 is wrapped according in structuralized queryThe special parameter contained performs step and the instruction in task flow model.For example, it is intended to appointing for " dining room reservation " for executableBusiness flow model include be used for contact dining room and actually ask special time be directed to specific colleague's number reservation the step of andInstruction.For example, by using structuralized query such as：Dining room is subscribed, dining room=ABC coffee-houses, date=2012/3/12, whenBetween=at 7 points in afternoon, colleague's number=5 people }, task flow processing module 736 performs following steps：(1) ABC coffee-houses are signed inServer or dining room reservation system are such as(2) inputting date in the form on website, time and sameRow number information, (3) submit form, and (4) to make calendar for the reservation in user's calendar.

In some instances, auxiliary of the task flow processing module 736 in service processing module 738 (" service processing module ")Lower the completing to be asked in user's input of the task provides the informedness answer asked in user's input.For example, service centerReason module 738 can represent task flow processing module 736 and initiate call, set calendar, invocation map search, callThe other users application program installed on user equipment interacts with the other applications, and calls third party's service(for example, portal website, social network sites, banking portal site etc. are subscribed in dining room) or interacted with third party's service.In some examplesIn, the agreement and application programming interface needed for each service are specified by the respective service model in service model 756(API).Service processing module 738 is directed to the appropriate service model of service access, and according to service model according to needed for the serviceAgreement and API generation for the service request.

For example, if dining room has enabled online booking service, service model is submitted in dining room, and the service model, which is specified, to be carried outThe call parameter of reservation and the API that the value of call parameter is sent to online booking service.By task stream handle 736During request, service processing module 738 establishes the net with online booking service using the network address being stored in service modelNetwork connects, and by the call parameter (for example, time, date, colleague's number) of reservation with the API's according to online booking serviceForm is sent to online booking interface.

In some instances, natural language processing module 732, dialogue stream processing module 734 and task flow processing module 736Use jointly and repeatedly to infer and limit the intention of user, acquisition information further to define and refine user view, simultaneouslyResponse (that is, export to user, or complete task) is ultimately generated, to meet the intention of user.The response generated is at least portionFulfil the dialogue response to phonetic entry of user view with dividing.In addition, in some instances, the response generated is as languageThe output of sound output.In these examples, the response generated is sent to voice synthetic module 740 (for example, phonetic synthesisDevice), wherein the response can be handled to synthesize the dialogue response of speech form.In another example, the response that is generated be withMeet that the user in phonetic entry asks related data content.

Voice synthetic module 740 is configured as synthesis and is used to present to the voice output of user.The base of voice synthetic module 740Voice output is synthesized in the text that digital assistants provide.For example, the dialogue response generated is the form of text string.Voice closesText string is converted to audible voice output into module 740.Voice synthetic module 740 uses any appropriate phonetic synthesis skillArt is so that from text generation voice output, the speech synthesis technique includes but is not limited to continuous synthesis, Unit selection synthesis, binarySound synthesis, the specific synthesis in domain, formant synthesis, voice parameter synthesis, the synthesis based on hidden Markov model (HMM), withAnd sine wave synthesis.In some instances, voice synthetic module 740 is configured as synthesizing based on phone string corresponding with wordsEach words.For example, phone string is associated with the words in the dialogue response generated.Phone string is stored in related to wordsIn the metadata of connection.Phonetic synthesis model 740 is configured as the phone string in directly processing metadata, to synthesize speech formWords.

It is in some instances, opposite (or in addition to using voice synthetic module 740) with using voice synthetic module 740,Perform phonetic synthesis on remote equipment (for example, server system 108), and by the voice of synthesis send to user equipment withFor exporting to user.For example, this output that digital assistants wherein can occur generate on the server systems some are specificIn implementation.And because server system typically has stronger disposal ability or resource than user equipment, therefore being capable of BillyThe actual result synthesized with client-side obtains higher-quality voice output.

More details about digital assistants are found in the entitled " Intelligent for being filed on January 10th, 2011Automated Assistant " U.S. Utility Patent application number 12/987,982 and it is filed in September in 2011 30Entitled " Generating and Processing Task Items That Represent Tasks toIn Perform " U.S. Utility Patent application number 13/251,088, the entire disclosure is incorporated by reference thisText.

4th, the process for the digital assistants probed into for media is operated

Fig. 8 A-C show the process 800 for being used for the digital assistants that media are probed into according to the operation of various examples.Fig. 9 A-B,Figure 10 and Figure 11 shows to operate the digital assistants on the user equipment 903 probed into for media according to the user 901 of various examplesInteraction.For example, carry out implementation procedure 800 using the one or more electronic equipments for realizing digital assistants.In some examplesIn, perform the process on the client-server system (for example, system 100) for realizing digital assistants.In some instances,User equipment (for example, equipment 104,200,400 or 600) on perform the process.In process 800, some frames optionally byMerge, the order of some frames is optionally changed, and some frames are optionally omitted.Additionally, it should be realized that show at someIn example, the subset below with reference to the feature described in Fig. 8 A-C is only carried out in process 800.

In frame 802, phonetic entry is received (for example, at I/O processing modules 728 and via microphone from user213).The phonetic entry represents the request to one or more media items.For example, with reference to figure 9A, phonetic entry for " he, Siri,Play the hip-hop music that some I like ".In another example shown in Figure 10, phonetic entry for " he, Siri, play oneA little music for being adapted to barbecue ".In another example shown in Figure 11, phonetic entry for " he, Siri, play what some newly went outMusic ".Representing other examples of the phonetic entry of the request to one or more media items includes：" what I should listen ", " push awayRecommend some music ", " what content today provides ", " he, Siri, DJ ", " putting some beautiful beats to me " as me," search recommend playlist ", " playing any pleasing to the ear corpus ", " playing the content that some I like ", " any recommendationBody-building musical ", " music for searching newest distribution ", " the new rock and roll song that hot topic please be play " etc..

At frame 804, it is determined that whether (for example, using natural language processing module 732) be right on the phonetic entry of frame 802Should be in the user view for obtaining the personalized recommendation for media item.Specifically, it is corresponding with phonetic entry to include determination for the determinationUser view (for example, executable be intended to).User view is determined in the way of above with reference to described in Fig. 7 A-C.Specifically,Words in phonetic entry or phrase are parsed and entered with the words of glossarial index (for example, glossarial index 744) or phraseRow compares.The various nodes of the words or phrase of glossarial index and ontologies (for example, ontologies 760) are (for example, can holdRow is intended to or domain) it is associated, therefore based on comparing, corresponding with phonetic entry words or phrase are by " triggering " or " activation ".The node with highest confidence level in the node that selection is activated.Therefore, determined by corresponding with the phonetic entry of frame 802User view is executable intention corresponding with selected node.

Determine whether phonetic entry corresponds to obtained for media item based on selected executable intention nodeThe user view that propertyization is recommended.If there is selected node acquisition to be held for the corresponding of personalized recommendation of media itemRow is intended to, it is determined that phonetic entry corresponds to the user view for obtaining the personalized recommendation for media item.If on the contrary, instituteThe node of selection has the corresponding executable intention outside the personalized recommendation obtained for media item, it is determined that phonetic entryThe user view for obtaining the personalized recommendation for media item is not corresponded to.

In some instances, determine whether phonetic entry corresponds to the user's meaning obtained for the personalized recommendation of media itemFigure includes determining whether phonetic entry includes one or more of multiple predetermined phrases predetermined phrase.SpecificallyGround, glossarial index include corresponding with the executable intention node for obtaining the personalized recommendation for being directed to media item multiple predeterminedPhrase.The multiple predetermined phrase is included for example：" recommending for me ... [music] ", " DJ ", " broadcasting as meSome tune/beats ", " what content I should play ", " playing [music] that some I like ", " search and be adapted to ...Some good [music] " etc..One or more of these phrases phrase is included based on phonetic entry, phonetic entry is mapped to and obtainedThe executable intention of the personalized recommendation for media item is taken, and determines that the phonetic entry corresponds to and obtains for media itemThe user view of personalized recommendation.For example, in figure 9 a, phonetic entry 902 includes phrase and " plays [sound that some I likeIt is happy] ", the phrase is corresponding multiple predetermined for the executable intention node of the personalized recommendation of media item with acquisitionA predetermined phrase in phrase.Therefore, in this example, determine that phonetic entry 902 corresponds to obtain and be directed to mediaThe user view of the personalized recommendation of item.

In some instances, determine whether phonetic entry corresponds to the user's meaning obtained for the personalized recommendation of media itemWhether the quantity for the parameter that figure includes determining to limit in phonetic entry is less than predetermined threshold value.Specifically, if in languageThe quantity of the parameter (for example, media parameter) limited in sound input is less than predetermined threshold value, it is determined that the phonetic entry pairShould be in the user view for obtaining the personalized recommendation for media item.For example, " I should play any content for phonetic entry" beThe request related to playing music.However, the request is wide in range and fuzzy because it does not limit any media parameter, it is all as scheduledThe artist of prestige, corpus, school or, issuing date.In this example, determine that the phonetic entry corresponds to obtain and be directed to mediaThe user view of the personalized recommendation of item, because the quantity of the parameter (for example, media parameter) limited in the phonetic entry is smallIn predetermined threshold value (for example, one).

In some instances, determine whether phonetic entry corresponds to the user view for obtaining personalized recommendation media including trueWhether the fixed phonetic entry is related to user.Specifically, phonetic entry is parsed and is related to user's (example to determine if to includeSuch as, " I ", " being adapted to me ", " give me ", " I " etc.) words or phrase.It is related to for example, determining whether following phrase includesThe words of user：" having anything to recommend mine ", " giving me what a surprise ", " today has anything to recommend my content"In some examples, whether determination process is based on determining in phonetic entry comprising being related to the words or phrase and and media of userRelated words or phrase (for example, " listening ", " music ", " broadcasting ", " tune ", " DJ " etc.).For example, determine that following phrase includesIt is related to the words of user and the words or phrase related to media：" recommending some hip-hop music to me ", " DJ " as me," what I should listen ", " have what can give recommend mine " or " playing some tunes for me ".Therefore, based on phonetic entryComprising the words or phrase for being related to user, determine that the phonetic entry corresponds to the user view for obtaining personalized recommendation media.

In response to determining that phonetic entry corresponds to the user view of acquisition personalized recommendation media, frame 806 is performed.In frameAt 806, at least one media item is obtained (for example, using natural language processing mould from the corpus specific to user of media itemBlock 732, task flow processing module 736, and/or service processing module 738).In some instances, at least one media item bagInclude song, corpus, video, film or playlist.The corpus specific to user of media item is the matchmaker specific to userThe personalized corpus of body item.Specifically, the corpus specific to user of media item based on the data associated with user andGeneration.The more detailed description of the corpus specific to user of media item is provided below with reference to frame 810.At frame 806, fromThe corpus specific to user of media item, which obtains at least one media item, to be included performing one in frame 808-816 described belowIndividual or multiple frames.For example, use natural language processing module 732, task flow processing module 736 and/or service processing module 738One or more of perform frame 808-816.

At frame 808, it is determined that the media parameter limited in phonetic entry is (for example, using natural language processing module732).Then it is corresponding come the executable intention for generating with obtaining the personalized recommendation for being directed to media item using the media parameter of restrictionStructuralized query.Specifically, glossarial index (for example, glossarial index 744) includes and each media in multiple media parametersWords corresponding to parameter or phrase.Therefore, by comparing the words or phrase and the words or phrase of glossarial index of phonetic entryTo determine the media parameter limited in phonetic entry.For example, glossarial index includes the words or short associated with media parameterLanguage { school }.Words or phrase are included for example：" hip-hop ", " R＆B ", " jazz ", " punk ", " rock and roll ", " prevalence ", " allusion "," bluegrass " etc..In the example shown in Fig. 9 A, phonetic entry is determined based on phrase " hip-hop " is detected in phonetic entry 902902 are defined to media parameter { school } " hip-hop ".Therefore, in this example, generate u and obtain the personalization for being directed to media itemStructuralized query is with including media parameter { school }=" hip-hop " corresponding to the executable intention recommended.

Another media parameter that can be determined from phonetic entry is { issuing date }.Media parameter { issuing date } refers toThe issuing date of the media item of user's concern.Issuing date is such as exact date or date range.With the media parameter { issue datePhase } associated words or phrase include for example：" the seventies ", " the eighties ", " the nineties ", " last decade "," 2008 ", " after in March, 2016 " etc..In one example, some the eighties " are played for me based on phonetic entryWords " the eighties " in tune " determines that media parameter { issuing date } is defined to " 1980-1989 " by the phonetic entry.Therefore, in this example, generation structuralized query corresponding with the executable intention for obtaining the personalized recommendation for being directed to media itemWith including media parameter { issuing date }=" 1980- 1989 ".

In some instances, it is limited based on the context of phonetic entry the date or time section in phonetic entry to be explainedStator school and non-limiting issuing date.For example, based on " 70 in phonetic entry " playing some seventies of punk's music for me "Age " determines that the phonetic entry will be defined to " 1970-1979 " period.In response to determining that phonetic entry limits the timeSection, determines whether the phonetic entry limits the school associated with a period.In this example, phonetic entry includes and matchmakerPhrase " punk " corresponding to body parameter { school }.Because the period " seventies " in phonetic entry modifies school " punk ", becauseThis determination phonetic entry is defined in the school " punk " associated with the period " seventies ".In response to determining phonetic entry limitThe fixed school associated with the period that is limiting, the school of period or restriction based on restriction determine subflow group.For example, baseDetermine that subflow is sent " punk's music seventies " in the period " seventies " of restriction and the school " punk " of restriction.Therefore, existIn the example, the structuralized query generated for the executable intention obtained for the personalized recommendation of media item is joined including mediaNumber { school }=" punk's music seventies ".It is worth noting that, with the period of restriction is construed into the media parameter { issue datePhase } on the contrary, the period of the restriction to be more accurately construed to a part for media parameter { school }.In this way, willPhonetic entry is construed to the actual intention for more accurately reflecting user, so as to which more relevant media item is provided to user.For example,At frame 806, the subflow based on determination sends " punk's music seventies " to obtain at least one media item, including issuing dateMedia item outside period 1970-1979.Specifically, each media item at least one media item includes instructionThe metadata of school " punk's music seventies ".

Be determined to be in other media parameters for being limited in phonetic entry include such as { activity }, { mood }, { occasion },{ edit list }, { political orientation } or { skills involved in the labour }.Each media in these media parameters are hereafter described successivelyParameter.For example, media parameter { activity } refer to the activity performed by user and with words or phrase such as " body-building ", " fromHabit ", " barbecue ", " sleep ", " driving ", " study ", " drawing " etc. are associated.In one example, " searched based on phonetic entrySome be adapted to study music " in words " study " come determine the phonetic entry by media parameter activity } be defined to " learningPractise ".In another example shown in Figure 10, phrase corresponding with media parameter { activity } is detected in phonetic entry 1002" barbecue ".Therefore, in this example, determine that media parameter { activity } is defined to " roast " by phonetic entry 1002.

Media parameter { mood } refer to user sensation or psychological condition and with words or phrase such as " cheerful and light-hearted ", " sadWound ", " anger ", " loosening ", " powerful ", " excitement ", " romance " etc. are associated.In one example, " me is given based on phonetic entryRecommend some cheerful and light-hearted music " in words " cheerful and light-hearted " be defined to determining the phonetic entry by media parameter { mood } it is " joyousIt hurry up ".

Media parameter { occasion } refer to the occasion associated with special time period and with words or phrase such as " ChristmasSection ", " birthday ", " summer ", " winter ", " All Saints' Day ", " New Year ", " Easter " etc. are associated.In one example, based on languageSound input " playing some Christmas Day music " in words " Christmas Day " come determine the phonetic entry by media parameter { occasion } limitIt is set to " Christmas Day ".

Media parameter { edit list } refers to the predetermined list of media items compiled by media representatives, such asRolling Stones magazines, Billboard magazines, Shazam etc..Exemplary edit list is included for example：Billboard hundredBig Billboard UK, Billboard hit parades, the Big-corpus lists of Billboard 200, American top 40,500 big songs of Rolling Stones, the Big-corpus of Rolling Stones 500, Rolling Stones hundred are most greatArtist etc..Media parameter { edit list } with and the corresponding words of these lists or phrase be associated.For example, it is based on languagePhrase " hundred big Billboard UKs of Billboard " in sound input " playing song from hundred big Billboard UKs of Billboard for me ", reallyMedia parameter { edit list } is defined to " hundred big Billboard UKs of Billboard " by the fixed phonetic entry.

Media parameter { political orientation } refer to the political orientation of user and with words or phrase such as " conservatives ", " fromBy sending ", " right flank ", " Right deviation ", " left wing ", " "Left"-deviationist " etc. it is associated.In one example, " searched based on phonetic entry for meWords " conservatives " in conservatives' news " is defined to determining the phonetic entry by media parameter { political orientation } " conservativeGroup ".In this example, the alternative media item determined in frame 812 more likely with conservatives media sources (for example, Fox News,Drudge Report etc.) rather than liberal's media source (for example, Huffington Post, New York Times etc.) phaseAssociation.

Media parameter { skills involved in the labour } refers to that user is proficient in degree to technical theme.In request for discussion technical themeDocumentary film when, the media parameter is related.Specifically, media parameter { skills involved in the labour } and words or phrase be such as" technical very strong ", " layman ", " science ", " should be readily appreciated that ", " simple ", " advanced " etc. are associated.In an exampleIn, based on the words " technical very strong " in phonetic entry " searching some technical very strong documentary films for me " come reallyMedia parameter { skills involved in the labour } is defined to " height " by the fixed phonetic entry.In some instances, based on user to being askedTheme be familiar with journey degree of coming infer media parameter { skills involved in the labour }.Specifically, if user often asks relevant universeThe documentary film (for example, being based on user's Request Log) of airship largely has if user has in his/her personal media librariesThe documentary film of spaceship is closed, then can determine that the user is very familiar for the theme of spaceship, and therefore in this example,It is " height " to be inferred to media parameter { skills involved in the labour }.

At frame 810, the corpus specific to user of media item is determined.Determine the language material specific to user of media itemStorehouse includes obtaining the customer identification information associated with user.Customer identification information includes the correspondence for example for accessing media itemThe corpus specific to user user account log-on message or user password information.Then known using customer identification informationNot and access media item multiple corpus specific to user in appropriate media item the corpus specific to user withObtain at least one media item.

In some instances, wherein the user equipment of the phonetic entry of sink block 802 with comprising customer identification information onlyOne user profile (for example, being stored in user data 748) is associated.Therefore, it is based on setting with user at frame 810Standby associated user profile retrieves customer identification information.So as to identify media based on the customer identification information of retrievalThe corpus specific to user of item.

In some instances, customer identification information is retrieved when verifying user identity.Specifically, by using frame 802Phonetic entry performs speaker identification, to verify the identity of user.For example, generated by the phonetic entry compared by frame 802Vocal print and the reference vocal print associated with specific user perform speaker identification.It is if it is determined that defeated by the voice of frame 802Enter the vocal print of generation under conditions of higher than confidence threshold value with reference to voice print matching, then the identity of user is verified.ShouldRecognize, other auth methods, finger print identifying, password authentification etc. can be implemented.In the identity of good authentication user,Retrieval customer identification information (for example, being retrieved from user profile) corresponding with the identity of the empirical tests of user.SoIdentified and accessed corresponding to media item specific to the corpus of user using customer identification information afterwards.Institute based on user is reallyFixed identity, the corresponding language material specific to user of media item is determined from multiple corpus specific to user of media itemStorehouse.

In some instances, the corpus specific to user of media item is stored in the long-range clothes separated with user equipmentIt is engaged on device.For example, the corpus specific to user of media item is as providing the media services of media item (for example, a kind of or moreKind of media services 120-1) a part stored.Customer identification information is needed to access the language material specific to user of media itemStorehouse.In some instances, the crypto token comprising customer identification information generates and is sent to media clothes in a user deviceBusiness.Then media services are decrypted to token and access media item using the customer identification information in decrypted tokenThe corresponding corpus specific to user, to obtain at least one media item.

In some instances, according to the media preferences of specific user come the corpus specific to user of customized media item.For example, the corpus specific to user of media item is generated using media-related data previously associated with user.SpecificallyGround, based on selected by previous user, request or refusal media item and generate the corpus specific to user of media item.ExampleSuch as, if it is determined that user often ask, browse, select or play with some media parameters (for example, { school }=" prevalence " or{ artist }=" Katy Perry ") media item, then generate the corpus specific to user of media item to be advantageous to haveThe media item of those parameters.Similarly, if it is determined that user refuses have other some ginsengs (for example, { mood }=sadness) all the timeRecommendation media item, then generate the corpus specific to user of media item to be unfavorable for the media with those other specifications.

In some instances, the language material specific to user of media item is generated based on the information from user profileStorehouse.User profile includes the information for characterizing user, country /region such as associated with user, the spoken language of user, userAge or the activity often participated in of user.Based on the information, the corpus specific to user of media item is generated, with favourableIn the media item with the media parameter for supplementing the information.For example, if user profile instruction user is mainly spoken English simultaneouslyAnd for 12 years old, then the corpus specific to user of media item is generated, to be advantageous to say or sing English and with nearest (exampleSuch as, over nearly 5 years) media item at issuing date.

In addition, in some instances, the specific of media item is generated based on the personal library of the media item associated with userIn the corpus of user.The personal library of media item includes the media item (for example, song, film etc.) gathered by user.Media itemPersonal library be stored on user equipment and/or be stored on the remote server associated with user account.Generate mediaThe corpus specific to user of item, there are those media being similar in the individual media Xiang Ku of user to be advantageous to media itemThe media item of the media parameter of item.For example, if the individual media Xiang Ku of user includes artist Katy Perry many languagesExpect storehouse, then generation specific to user corpus, be advantageous to artist Katy Perry or similar to Katy PerryThe associated media items of artist such as Avril Lavigne.

In some instances, generate the corpus specific to user of media item so that the media item specific to userCorpus in media item include the metadata for indicating corresponding with respective media item multiple media parameters.Specifically, eachThe metadata of media item limits any one media parameter in above-mentioned media parameter, such as { artist }, { school }, { subflowGroup }, { issuing date }, activity }, { mood }, { occasion }, { edit list }, { political orientation } or { skills involved in the labour }.ProfitRecommend suitable media item to user with metadata based on the media parameter limited in being inputted in user speech.For example, mediaThe corpus specific to user of item includes instrumental music song " the Chariots of with the metadata for indicating following media parameterFire”：{ title }=" Chariots of Fire ", { school }=" original music；Instrumental music ", { composer }=Vangelis,{ issuing date }=" in March, 1981 ", { activity }=" running " and { mood }=" pursuing a goal with determination ".Therefore, if connect at frame 802Phonetic entry " playing some instrumental musics of pursuing a goal with determination for being adapted to run for me " is received, then based on the media parameter limited in phonetic entry(that is, { school }=" instrumental music ", { activity }=" running " and { mood }=" pursuing a goal with determination "), " Chariots of Fire " are matchmaker to songAlternative media item that is being identified in the corpus specific to user of body item and recommending user.

In some instances, the metadata of the media item in the corpus specific to user of media item is based on analysis and matchmakerThe associated concrete property of body item and by Intelligent generation.Specifically, determined by analyzing the voice data of media item eachThe music rhythm (for example, beat number per minute) of media item.Based on identified music rhythm come determine media item specific to{ activity } media parameter of media item in the corpus of user.For example, media item with faster music rhythm with it is more livingBold and vigorous movable body-building, hikings etc. are associated.On the contrary, media item and more passive work with slower music rhythmDynamic sleep, meditation etc. are associated.Therefore, correlation { activity } media parameter determined based on music rhythm is included in phaseIn the metadata for the media item answered.

In addition, in some instances, the note based on each media item transfer to determine media item specific to user's{ mood } media parameter of media item in corpus.For example, the music tone of each media item is analyzed, to determine and audio numberAccording to associated music tone (for example, the big tune of c major, G, A ditties etc.).Media item with master music tone with more actively andCheerful and light-hearted mood " cheerful and light-hearted ", " optimism ", " optimistic ", " excitement " etc. are associated, and the media item with time music tone withMore grieved mood " sadness ", " grief " etc. are associated.

At frame 812, the corpus specific to user of media item is determined based on the media parameter of identified frame 808In multiple alternative media items.For example, search is performed to identify the spy of media item using the media parameter of identified frame 808Include the alternative media item of the metadata of the media parameter of identified frame 808 due to having in the corpus of user.For example,Fig. 9 A are returned, media parameter { school }=" hip-hop " are determined at frame 808, to be limited in phonetic entry 902.In response to trueDetermine phonetic entry and media parameter { school } is defined to " hip-hop ", can search for the corpus specific to user of media item, to knowThere is not include the media item of the metadata of media parameter { school }=" hip-hop ".For example, media item such as J-Kwon" Tipsy ", Jay-Z " 99Problems " and Drake " Over " each have include media parameter { school }=" hip-hop "Metadata.Therefore, in this example, the multiple alternative media items determined from the corpus specific to user of media item includeThese media items.

In another example shown in Figure 10, the offer of user 901 phonetic entry 1002 " he, Siri, play some and be adapted toThe music of barbecue ".In this example, phonetic entry is determined at frame 808, media parameter { activity } is defined to " roast ".In response to determining that media parameter { activity } is defined to " roast " by phonetic entry, the language material specific to user of media item is searched forStorehouse, to identify the media item with the metadata for including media parameter { activity }=" barbecue ".For example, media item such as TheKooks " She Moves in Her Own Way ", Katy Perry " Hot n Cold " and The Beach Boys'" Fun Fun Fun " each have the metadata for including media parameter { activity }=" barbecue ".Therefore, in this example, from matchmakerMultiple alternative media items that the corpus specific to user of body item determines include these media items.

Although Fig. 9 A-B and Figure 10 example are described relative to specific media parameter, but it would be recognized that baseAny media parameter limited in the phonetic entry in frame 902 determines multiple times from the corpus specific to user of media itemSelect media item.For example, in addition to the media parameter { school } described in Fig. 9 A-B and Figure 10 example and { barbecue }, media parameterIncluding { artist }, { medium type }, { school }, { issuing date }, { mood }, { occasion }, { edit list }, { politics is inclinedTo }, { skills involved in the labour } etc..

At frame 814, multiple alternative media items of frame 812 are arranged using the media order models specific to userSequence.Media order models specific to user are stored in such as user data 748 or data and model 116.Using specificThe sequence specific to user of each alternative media item in multiple alternative media items is generated in the media order models of userScore.So as to be ranked up based on the sequence score specific to user to multiple alternative media items.Specific to the sequence of userScore represents that user receives the possibility of alternative media item in the case where media parameter is associated with alternative media item.Specific toThe media order models of user are the statistical machine learning model trained using the data specific to user (for example, neutral netModel, Bayesian model etc.), such as information from user profile, the previous input related to media from user,Or the media item associated with user.In addition, based on the data specific to user then received come continuous updating specific toThe media order models of user.For example, phonetic entry based on frame 802 or being comprised in any in the audio input of frame 824Voice updates the media order models specific to user, as described below.

Information from user profile includes age of user, race, position, occupation etc..Utilize the information to generateSpecific to the media order models of user.For example, if the information instruction user from user profile is conservative to live inIdahoan scientist, then training specific to user media order models, with for more high-tech qualification orMore conservative political orientation associated media item generates more favourable score.

Utilize media order models of the previous input related to media to generate specific to user from user.SpecificallyGround, the previous input related to media from user are included in the media phase received before the phonetic entry of sink block 802Close request, selection and refusal.For example, if the previous media association requests instruction user from user generally asks popular soundFind pleasure in and refuse music of talking and singing, then the media sequence mould specific to user is trained based on the input previously related to mediaType, the sequence score of pop music is more beneficial for generation and is less favorable for the sequence score of Chinese musical telling music.In another exampleIn, input previously related to media is indicated when user browses Online Music shop, and user often checks that issuing date is 20The music item of century 70.Based on the determination, the media order models specific to user are trained, are more beneficial for issuing with generationDate is the sequence score of the media item of the 1970s.

The media item associated with user is included in the media item found in the personal media libraries of user.In some examplesIn, generate the media order models specific to user using the media item in the personal media libraries of user.Specifically, training is specialDue to the media order models of user, to be advantageous to the media parameter of the media item with the personal media libraries similar to userMedia item.For example, train the matchmaker specific to user based on the personal media libraries of the user of many corpus with Jay-ZBody order models, be more beneficial for generation to artist Jay-Z or similar to the related media item of Jay-Z artist mustPoint.

In some instances, (addition, or alternatively) to perform the sequence of frame 814 using general media order models.Specifically, the general of each alternative media item in multiple alternative media items is generated using general media order models to sortPoint.So as to be ranked up based on general sequence score to multiple alternative media items.General media order models are similar to specificIn the media order models of user, the difference is that general media order models are used from a large number of users rather than from one specificallyThe related data of the media of user are trained.General sequence score represents the general supporting rate of media item.Specifically, general matchmakerThe media item that body order models are most frequently asked, check or selected for a large number of users generates more favourable sequence score.

It should be appreciated that in some instances, based on from the media order models specific to user specific to userThe combination of sequence score and the general sequence score from general media order models perform the sequence of frame 814.For example, insertEnter score to generate the combination of each alternative media item sequence score.The sequence score for being then based on combination is come to multiple candidate matchmakersBody item is ranked up.Additionally, it should be realized that in some instances, by general media order models and the row specific to userSequence model integration.For example, the order models specific to user are generated using the related data of the media from a large number of users, butIt is to be adjusted to be advantageous to the user preference specific to indicating in the data of user.

At frame 816, at least one media item is selected from multiple alternative media items based on the sequence of frame 814.For example, shouldN number of alternative media of top ranked in alternative media item or multiple alternative media items of at least one media item including top ranked, wherein N is greater than zero integer.At least one media item obtained at frame 806 is at least one for selected frame 816Media item.Retrieved from the corpus specific to user of media item selected at least one media item (for example, a kind of orAt media service 120-1), and it is provided to user at frame 818.

In some instances, pushing away based on the user couple one or more media parameters associated with least one media itemThe familiarity broken selects at least one media item.For example, the phonetic entry received at frame 802 is " to play one for meA little Michael Jackson song ".In this example, the phonetic entry is determined at frame 808, by media parameter { artFamily } it is defined to " Michael Jackson ".Based on the determination, in the corpus specific to user at frame 812 from media itemIdentify the Michael Jackson songs of multiple candidates.Based on general popularity (for example, being sorted according to general media at frame 814Model) and/or the Michael Jackson songs of multiple candidates are arranged based on the media order models specific to userSequence.Determine user to artist's " Michael Jackson " familiarity.Based on artist " Michael Jackson "Associated data specific to user are determined.For example, referred to based on the related input of the previous media from userShow that user often browses, buys, listens to and/or asked Michael Jackson song or the personal media libraries based on userTo artist, " Michael Jackson " familiarity is higher to determine the user for song comprising a large amount of Jackson.PhaseInstead, infrequently browse, buy, listen to and/or ask Michael based on the related input instruction user of the media from userSong that Jackson song or personal media libraries based on user include considerably less Jackson determines the user to skill" Michael Jackson " familiarity is relatively low by astrologist.Based on identified familiarity, from how first candidate at frame 816Michael Jackson song selection song.Such as, if it is determined that user to artist " Michael Jackson's "Familiarity is relatively low, then the Michael Jackson of most popular or top ranked candidate song is selected at frame 814.ToolBody, from the Michael Jackson of the N candidates of the Michael Jackson of how first candidate song selection top rankedSong played out as playlist.By contrast, if it is determined that user to artist " Michael Jackson's "Familiarity is higher, then selects the candidate's of popular (for example, ranking is higher) and less popular (for example, ranking is relatively low)The combination of Michael Jackson song plays out as playlist.Specifically, based on user to artist" Michael Jackson " familiarity is higher, selects the Michael Jackson of the less popular candidate of greater proportionSong.This is favourable because the user very familiar to artist Michael Jackson may be familiar with it is most popularMichael Jackson song.Such user will wish to hear the combination of Michael Jackson song, includingThe song of the degree of commercialization of the song of popular high commercial and less prevalence relatively low (for example, " deep cuts ").CauseThis, in this example, at frame 816 based on determined by user to artist " Michael Jackson " familiarity comeSelect the average supporting rate of Michael Jackson song.

It should be appreciated that in some instances, the user couple one or more media associated with least one media itemThe familiarity of parameter is directly included in specific in the media order models of user.For example, based on determination user to artFamily " Michael Jackson " are very familiar, specific to user media order models be configured as it is less popular to someMichael Jackson song generates higher sequence score.In this way, the N head candidates of top rankedMichael Jackson song include height on commercialized popular Michael Jackson song and popularity compared withThe combination of low Michael Jackson song.In these examples, selected at least one media item includes rankingThe Michael Jackson of highest N candidates song.

Although above-mentioned frame 806 is performed using the corpus specific to user of media item, but it would be recognized that andIn other examples, other corpus of media item can be used to replace the corpus specific to user of media item.For example,In some examples, at least one media item derives from general (independently of the user's) corpus of media item or based on one or moreThe corpus of the media item of individual specific media parameter generation.

At frame 818, there is provided at least one media item.Specifically, at least one media are provided at user equipment.In some instances, at least one media item plays (for example, using loudspeaker 211) on a user device.At otherIn example, at least one media item is shown on a user device (for example, on touch-screen 212), for user check and/Or selection.In other examples, at least one media item is provided to user (for example, using raising in the form of voice responseSound device 211).

Referring again to example as shown in Figure 9 A, at frame 814 using the media order models specific to user come toThe candidate's hip-hop media item determined at frame 812 is ranked up.Specifically, candidate's hip-hop media item is ranked up so that candidateHave in candidate's hip-hop media item that media item Jay-Z " 99Problems " is determined at frame 812 top ranked.Therefore, existIn the example, at least one media item selected at frame 816 includes media item Jay-Z " 99Problems " and in userThe media item is played into user 901 in equipment 903.

With reference now to example as shown in Figure 10, using the media order models specific to user come to frame at frame 814" barbecue " alternative media item determined at 812 is ranked up.In this example, alternative media item The Kooks " SheHave in the alternative media item that Moves in Her Own Way " are determined at frame 812 top ranked.Therefore, selected at frame 816At least one media item selected include media item The Kooks " She Moves in Her Own Way ", and obtain shouldMedia item simultaneously plays it to user 901 on user equipment 903.It should be appreciated that selected at least one media item canIncluding other media items.For example, alternative media item Katy Perry " Hot n Cold " and The Beach Boys " FunThere is the second high and the 3rd high ranking in the alternative media item that Fun Fun " are determined at frame 812.Selected extremely at frame 816A few media item includes these media items.Therefore, in these examples, media item The is being played on a user deviceKooks " She Moves in Her Own Way " play media item Katy Perry " Hot n Cold " and The afterwardsBeach Boys " Fun Fun Fun ".

In some instances, process 800 allows users to provide to follow up when providing at least one media item at frame 818 to askAsk.For example, at least one media item or the request related at least one media item that user's refusal provides at frame 818 are addedInformation.Frame 820-826 is described in terms of the response of user's reception follow-up voice request and offer to the voice request that follows up.

At frame 820, it is determined that domain corresponding with phonetic entry whether be one in multiple predetermined domains in advance reallyFixed domain.Specifically, only some predetermined domains may extract the follow-up request from user.Therefore, in order to improve efficiency,The ability that follow-up voice request is received from user is realized only for some predetermined domains.For example, multiple predetermined domainsIncluding the domain with the project with a large amount of metadata, such as " lookup media item " domain or " lookup dining room " domain.With a large amount of first numbersAccording to project such as media item and dining room item often extract follow-up from user and ask.It is corresponding with phonetic entry in response to determiningDomain be multiple predetermined domains in a predetermined domain, at frame 820 receive audio input (for example, microphone213 are activated).On the contrary, in response to determining that corresponding with phonetic entry domain is not one in multiple predetermined domains advanceThe domain of determination, process 800 are abandoned receiving audio input (for example, the un-activation of microphone 213) at frame 822.

At frame 824, audio input is received.Specifically, it is defeated to receive audio when at least one media item is provided at frame 818Enter.For example, with reference to figure 9A, once media item Jay-Z " 99 Problems " starts to play on user equipment 903, then userEquipment 903 starts to receive audio input via the microphone of user equipment 903.

At frame 826, determine whether audio input includes voice.The determination is carried out when receiving audio input.Specifically,Once receiving audio input, then audio input is analyzed to determine whether to include sound corresponding with those features of voiceFeature.Specifically, temporal signatures are extracted (for example, zero-crossing rate, short-time energy, spectrum energy or frequency from the audio input receivedCompose flatness) and/or frequency domain character (for example, Mel-frequency Cepstral Coefficients, linear prediction residue error or mel-frequency are discreteWavelet coefficient) and by it compared with human speech model, to determine that audio input includes the possibility of voice.If reallyIts fixed possibility is higher than predetermined value, it is determined that the audio input includes voice.On the contrary, if its possibility is less than advanceThe value of determination, it is determined that the audio input does not include voice.In response to determining that audio input does not include voice, process 800 is pre-Reception audio input at frame 828 is stopped at after the time quantum first determined.For example, with reference to figure 9A, user equipment 903 is receiving quiltIt is defined as stopping receiving audio input after the audio input of the predetermined duration without any voice.

In some instances, degree of the predetermined time quantum based on the ambient noise detected in audio input.Specifically, frame 826 includes determining the amount of the ambient noise (for example, background noise) in audio input.Based on detecting that audio is defeatedAmbient noise level in entering is higher, the audio input not comprising any voice received at frame 824 it is predeterminedTime quantum is reduced.Such as, if it is determined that the amplitude of the ambient noise in audio input is without departing from predetermined threshold value, then process800 stop reception audio input at frame 828 after predetermined time quantum (for example, 7 seconds).However, if it is determined that soundThe amplitude of ambient noise during frequency inputs exceeds predetermined threshold value, then process 800 at frame 828 less than predeterminedStop receiving audio input after second predetermined time quantum (for example, 4 seconds) of time quantum.

In response to determining that audio input includes voice, frame 830 is performed.At frame 830, determine audio input voice whetherCorresponding to phonetic entry identical domain.The determination includes determining user view corresponding with the voice of audio input.According to classThe mode of frame 804 as described above is similar to determine user view.It is determined that user view corresponding with the voice of audio input includesIt is determined that domain corresponding with the voice of audio input.It is then determined that with the voice of audio input corresponding to domain with and frame 802 voiceWhether domain corresponding to input is identical.In response to determining that the voice of audio input does not correspond to and phonetic entry identical domain, process800 abandon providing the response to audio input at frame 832.This is desirable for filtering out babble noise.For example, with reference toFig. 9 A, phonetic entry 902 correspond to " lookup media item " domain.If received when playing Jay-Z song " 99Problems "The babble noise unrelated with searching media item is included to audio input and the audio input, it is determined that the multichannel is overlapped and made an uproarSound is unrelated with phonetic entry 902, and will not provide a user follow-up response (frame 832).

Voice in response to determining audio input corresponds to and phonetic entry identical domain, execution frame 834.In frame 834Place, response is provided according to user view corresponding with the voice of audio input.According to above with reference to similar described in Fig. 7 A-CMode provides response.Specifically, looked into based on identified user view corresponding with the voice of audio input to generate structuringAsk.Then one or more tasks corresponding with user view are performed according to the structuralized query generated.Based on one orMultiple performed tasks provide response.

Referring now to Figure 10 example, user 901 provides the second phonetic entry 1004, and " when this issues" toolBody, in the broadcasting media item The of user equipment 903 Kooks " during She Moves in Her Own Way ", from user 901Receive the second phonetic entry 1004 (frame 824).It is determined that with 1004 corresponding user view (frame 830) of the second phonetic entry.It is based onWords " this " and user equipment 903 in second phonetic entry 1004 play media item " She Moves in Her Own Way "Context come determine the second phonetic entry 1004 correspond to and the identical domain of phonetic entry 902.It is specifically, it is determined that defeated with voiceIt is " lookup media item " domain to enter domain corresponding to 904.In addition, in this example, based in broadcasting media item " She Moves inWords " this " and " distribution " is explained in Her Own Way " context to determine that (frame 830) second phonetic entry 904 corresponds toObtain and the media item " user view at She Moves in Her Own Way " associated issuing dates.It is true in response to thisIt is fixed, perform one or more tasks (frame 834) corresponding with user view.Specifically, song " She Moves in Her are retrievedOwn Way " issuing date (for example, from one or more media services 120-1) and it is provided to user's (frame834).For example, as shown in Figure 10, according to identified user view, voice response 1006 is provided at user equipment 903To user 901.Specifically, voice response 1006 indicates that " She Moves in Her Own Way " issuing date is song" in June, 2006 ".In some instances, in response to the second phonetic entry 1004, release data is addition or alternatively displayed onOn user equipment 903.

Returns frame 804, the phonetic entry in response to decision block 802 do not correspond to the personalized recommendation obtained for media itemUser view, perform Fig. 8 C frame 836.At frame 836, whether the phonetic entry of decision block 802, which corresponds to obtain, has mostThe user view of the media item at nearly issuing date.As described above, user's meaning corresponding with phonetic entry is determined at frame 804Figure.Executable intention node selected in knowledge based body (for example, ontologies 760) carrys out decision block 836.It is if selectedThe node selected is with the corresponding executable intention for obtaining the media item with nearest issuing date, it is determined that phonetic entry is correspondingIn the user view for obtaining the media item with nearest issuing date.If on the contrary, the node is with except acquisition is with nearest hairCorresponding executable intention outside the media item on row date, it is determined that phonetic entry, which does not correspond to acquisition, has the nearest issue dateThe user view of the media item of phase.

In some instances, determine whether phonetic entry corresponds to the user for obtaining the media item with nearest issuing dateIntention includes determining whether phonetic entry is predetermined short including one or more of more than second predetermined phrasesLanguage.Specifically, the executable intention node and second of user view corresponding with media item of the acquisition with nearest issuing dateMultiple predetermined phrases are associated.Predetermined phrase more than second be stored in and obtain user's meaning of media itemIt can perform and be intended in the associated glossarial index (glossarial index 744) of node corresponding to figure.More than second predetermined phraseIncluding the phrase such as " new music ", " issuing recently ", " newest issue ", " newly going out ".Phonetic entry based on frame 802 includesThe predetermined phrase of one or more of individual predetermined phrase more than second, phonetic entry is mapped to be had with acquisitionIt can perform corresponding to the user view of the media item at nearest issuing date and be intended to node.Accordingly, it is determined that the phonetic entry pair of frame 802Should be in the user view for obtaining the media item with nearest issuing date.For example, with reference to figure 11, receive from user's 901Phonetic entry 1102 " he, Siri, some pop musics newly gone out are played to me ".Based on the phonetic entry for including phrase " newly going out "1102, select executable intention node corresponding with the user view for obtaining the media item with nearest issuing date.Therefore, reallyDetermine phonetic entry 1102 and correspond to the user view for obtaining the media item with nearest issuing date.

In response to determining that phonetic entry corresponds to the user view of media item of the acquisition with nearest issuing date, frame is performed838.On the contrary, in response to determining that phonetic entry does not correspond to the user view of media item of the acquisition with nearest issuing date, mistakeJourney 800 is abandoned performing frame 838.For example, as shown in Figure 8 C, there is distribution recently in response to determining that phonetic entry does not correspond to acquisitionThe user view of the media item on date, process 800 terminate.

At frame 838, at least one second media item is obtained from the second corpus of media item.Frame 838 is similar to frame806, unlike using the second corpus of media item and the corpus specific to user of non-media item performs frame 838.In addition, frame 838 includes the frame similar to frame 808-816, different is still, relative to the second corpus of media item rather than matchmakerThe corpus specific to user of body item performs the frame.Second corpus of media item is such as issue date based on media itemPhase and the general corpus of media item generated.Specifically, each media item in the second corpus of media item, which has, is working asIssuing date in the predetermined time range on preceding date.Working as example, the second corpus of media item only includes havingThe media item at the issuing date in preceding three months dates.In some instances, the people based on such as each media item of other factorsGas generates the second corpus of media item.

At frame 840, there is provided at least one second media item.Frame 840 is similar to frame 818.Specifically, at user equipmentAt least one second media item is provided.In some instances, at least one media item is played at user equipment.In other examplesIn, at least one media item is shown on a user device (for example, on touch-screen 212), so that user checks and/or selectsSelect.In other examples, at least one media item is provided to user in the form of voice response.

Frame 838-840 is further described with reference to figure 11.For example, obtained in response to determining that phonetic entry 1102 corresponds toThe user view of media item with nearest issuing date, the digital assistants realized on user equipment 903 from media itemTwo corpus obtain at least one second media item.Second corpus of media item, which only includes, to be had in current date three monthsIssuing date media item.In this example, if current date is on June 1st, 2016, the second corpus of media itemIn each media item have and be no earlier than issuing date on March 1st, 2016.Therefore, obtained from the second corpus of media itemAt least one second media item have be no earlier than on March 1st, 2016 release data.In this example, this at least one" Dangerous Woman ", its issuing date are March 11 in 2016 to song of individual second media item including Ariana GrandeDay.As illustrated, in response to phonetic entry 1102, obtaining song, " Dangerous Woman " are (for example, from one or more matchmakersIn body service 120-1) and played out on user equipment 903.

5th, other electronic equipments

Figure 12 shows the functional block diagram of the electronic equipment 1200 configured according to the principle of the various examples.This setsStandby functional block optionally by the combination of the hardware of the principle that carries out various described examples, software or hardware and software LaiRealize.It will be understood by those of skill in the art that the functional block described in Figure 12 is optionally combined or is separated into sub- frame, withRealize the principle of the various examples.Therefore, description herein optionally supports any possible of functional block as described hereinCombination or separation further limit.

As shown in figure 12, electronic equipment 1200 includes being configured as showing graphic user interface and receives touch from userThe touch screen display unit 1202 of input, the audio input unit for being configured as receiving audio input (for example, phonetic entry)1204th, it is configured as exporting the loudspeaker unit 1205 of audio (for example, voice and/or media content) and is configured as passingThe defeated and communication unit of receive information 1206.Electronic equipment 1200 also includes the processing for being coupled to touch screen display unit 1202Unit 1208 and audio input unit 1204 and communication unit 1206.In some instances, processing unit 1208 includes connecingReceive unit 1210, determining unit 1212, acquiring unit 1214, provide unit 1216, sequencing unit 1218, updating block 1220,Stop element 1222, abandon unit 1224 and selecting unit 1226.

According to some embodiments, the processing unit 1208, which is configured as receiving from user, represents one or more media itemsPhonetic entry (for example, phonetic entry of frame 802) (for example, using receiving unit 1210 and via audio input unit1204).The processing unit 1208 is additionally configured to determine whether phonetic entry corresponds to acquisition and push away for the personalization of media itemThe user view (for example, utilizing determining unit 1212) (for example, frame 804) recommended.The processing unit 1208 is additionally configured to respondIn it is determined that phonetic entry correspond to obtain for media item personalized recommendation user view and from media item specific toThe corpus at family obtains at least one media item (for example, at least one media item of frame 806) (for example, utilizing acquiring unit1214).The corpus specific to user of media item is generated based on the data associated with user (for example, the media of frame 806The corpus specific to user of item).Processing unit 1208 is additionally configured to provide at least one media item (for example, using carryingFor unit and use touch screen display unit 1202 and/or loudspeaker unit 1205) (for example, frame 818).

In some instances, determine whether phonetic entry corresponds to the user's meaning obtained for the personalized recommendation of media itemWhether the quantity for the parameter that figure includes determining to limit in phonetic entry is less than predetermined threshold value (for example, frame 804).

In some instances, determine whether phonetic entry corresponds to the user's meaning obtained for the personalized recommendation of media itemFigure includes determining whether phonetic entry is corresponding multiple short including the user view with obtaining the personalized recommendation for being directed to media itemA phrase (for example, frame 804) in language.

In some instances, determine whether phonetic entry corresponds to the user view for obtaining personalized recommendation media including trueWhether the fixed phonetic entry is related to user's (for example, frame 804).

In some instances, generated based on the media item for previously being selected or being asked by user media item specific to userCorpus (for example, corpus specific to user of the media item of frame 806).

In some instances, based on previously by user refusal media item and generated the language material specific to user of media itemStorehouse (for example, frame 806).

In some instances, personal library based on the media item associated with user and generate media item specific to userCorpus (for example, frame 806).

In some instances, the processing unit 1208 is also configured to use the media order models (example specific to userSuch as, frame 814) come multiple alternative media items of the corpus specific to user from media item are ranked up (for example, utilizeSequencing unit 1218 is ranked up).The media specific to user are generated based on the previous and media association requests from userOrder models.Obtaining a few media item includes selecting at least one media item (example from multiple alternative media items based on sequenceSuch as, frame 816).

In some instances, the processing unit 1208 is additionally configured to receive the second phonetic entry (for example, profit from userWith receiving unit 1210 and via audio input unit 1204).The processing unit 1208 is additionally configured to determine that the second voice is defeatedEnter the refusal (for example, utilizing determining unit 1212) whether corresponded to at least one media item.The processing unit 1208 goes back quiltIt is configured to, in response to determining that the second phonetic entry corresponds to the refusal at least one media item, be updated according to the refusal specificIn the media order models (for example, utilizing updating block 1220) of user.

In some instances, the processing unit 1208 is additionally configured to based on the refusal at least one media item come to comingResequenced from multiple alternative media items of the corpus specific to user of media item (for example, utilizing sequencing unit1218).The processing unit 1208 is additionally configured to select at least one second from multiple alternative media items based on rearrangementMedia item (for example, utilizing selecting unit 1226).

In some instances, the supporting rate based on each media item in multiple alternative media items is come to multiple alternative mediasItem is ranked up (for example, frame 814).

In some instances, each media item in the corpus specific to user of media item includes instruction and media itemAssociated movable metadata.Activity is associated with media item based on the music rhythm of media item.

In some instances, each media item in the corpus specific to user of media item includes instruction and media itemThe metadata of associated mood.Mood is associated with media item based on the music tone of media item.

In some instances, changed handling unit 1208 be additionally configured to determine phonetic entry whether limit it is related to the periodThe occasion (for example, utilizing determining unit 1212) (for example, frame 804) of connection.The processing unit 1208 is additionally configured in response to trueDetermine phonetic entry to limit the occasion associated with the period and at least one media item is obtained based on the occasion (for example, utilizingAcquiring unit 1214), wherein at least one media item includes the metadata (for example, frame 806) of instruction occasion.

In some instances, the processing unit 1208 is additionally configured to determine whether phonetic entry limits and media representatives' phaseThe edit list (for example, utilizing determining unit 1212) (for example, frame 804) of association.The processing unit 1208 is additionally configured to ringShould be in it is determined that phonetic entry be limited the edit list associated with media representatives and arranged based on the editor associated with media representativesTable obtains at least one media item (for example, utilizing acquiring unit) (for example, frame 806).At least one media item includes referring toShow the metadata of the edit list associated with media representatives.

In some instances, the processing unit 1208 is additionally configured to determine whether phonetic entry limits mood (for example, profitWith determining unit 1212) (for example, frame 804).The processing unit 1208 is additionally configured in response to determining that phonetic entry limits feelingsThread and at least one media item (for example, utilizing acquiring unit 1214) is obtained based on the mood, wherein at least one media itemMetadata (for example, frame 806) including indicating mood.

In some instances, the processing unit 1208 be additionally configured to determine phonetic entry whether restriction activity (for example, profitWith determining unit 1212) (for example, frame 804).The processing unit 1208 is additionally configured to live in response to determining that phonetic entry limitsMove and at least one media item (for example, utilizing acquiring unit 1214) is obtained based on the activity, wherein at least one media itemMetadata (for example, frame 806) including instruction activity.

In some instances, the processing unit 1208 is additionally configured to determine) phonetic entry whether limiting time section (exampleSuch as, determining unit 1212 (for example, frame 804) is utilized.The processing unit 1208 is additionally configured in response to determining phonetic entry limitSection of fixing time and determine whether phonetic entry limits the school associated with the period (for example, utilizing determining unit 1212).ShouldProcessing unit 1208 be additionally configured in response to determine phonetic entry to limit the school that is associated with the period and be based on the period andSchool determines that subflow is sent (for example, utilizing determining unit 1212).Sent based on subflow and obtain at least one media item and extremelyA few media item includes the metadata (for example, frame 806) of instruction subflow group.

In some instances, phonetic entry limits the classification of media item, and obtains at least one media item including obtainingThe multiple media items associated with the classification of media item.The processing unit 1208 is additionally configured to determine class of the user to media itemOther familiarity (for example, familiarity of frame 816) (for example, utilizing determining unit 1212).The plurality of media item is averagedFamiliarity of the supporting rate based on user to the classification of media item.

In some instances, the processing unit 1208 is additionally configured to come by using phonetic entry execution speaker identificationDetermine the identity (for example, utilizing determining unit 1212) of user.The processing unit 1208 is additionally configured to use based on determined byThe identified identity at family and the language specific to user of media item is determined from multiple corpus specific to user of media itemExpect storehouse (for example, utilizing determining unit 1212).

In some instances, obtaining at least one media item includes sending crypto token to remote server.Encryption orderBoard includes customer identification information.Crypto token is needed to access the language material specific to user of media item via remote serverStorehouse.

In some instances, the processing unit 1208 is additionally configured to determine domain corresponding with phonetic entry (for example, frame820 domain) whether it is a predetermined domain (for example, utilizing determining unit 1212) in multiple predetermined domains.ShouldProcessing unit 1208 is additionally configured in response to determining that domain corresponding with phonetic entry is one in multiple predetermined domainsPredetermined domain, audio input (for example, audio input of frame 824) is received when providing at least one media item (for example, profitWith receiving unit 1210 and by audio input unit 1204).The processing unit 1208 is additionally configured to determine that audio input isIt is no to include voice (for example, frame 826) (for example, utilizing determining unit 1212).The processing unit 1208 be additionally configured in response toDetermine audio input not include voice and stop receiving audio input after predetermined time quantum (for example, utilizing stoppingUnit 1222) (for example, frame 828).

In some instances, the processing unit 1208 is additionally configured to determine in response to determining audio input to include voiceWhether the voice of audio input corresponds to phonetic entry identical domain (for example, utilizing determining unit 1212) (for example, frame830).The processing unit 1208 is additionally configured to correspond to and phonetic entry identical domain in response to the voice for determining audio inputAnd determine user view (for example, user view of frame 820) corresponding with the voice of audio input (for example, utilizing determining unit1212).The processing unit 1208 is additionally configured to be provided corresponding to the voice of audio input according to user view) it is directed to audioThe response (for example, response of frame 834) of input is (using providing unit 1216.

In some instances, the processing unit 1208 is additionally configured in response to determining that the voice of audio input does not correspond toWith phonetic entry identical domain and abandon providing response (for example, using abandon unit 1224) to audio input (for example, frame832)。

In some instances, degree of the predetermined time quantum based on the ambient noise detected in audio input.

In some instances, there is provided at least one media item includes playing media item.The processing unit 1208 is also configuredTo receive the 3rd phonetic entry (for example, phonetic entry in the audio input of frame 824) when playing media item (for example, utilizingReceiving unit 1210 and via audio input unit 1204).The processing unit 1208 is additionally configured to be based on playing matchmakerBody item and the 3rd phonetic entry determine that user view (for example, user view of frame 820) corresponds to the 3rd phonetic entry (exampleSuch as, determining unit 1212 is utilized).The processing unit 1208 is additionally configured to according to user view corresponding with the 3rd phonetic entryTo provide response (for example, response of frame 834) (for example, using provide unit 1216).

In some instances, the processing unit 1208 is additionally configured in response to determining that phonetic entry does not correspond to acquisition pinDetermine whether phonetic entry corresponds to the user view of the personalized recommendation of media item to obtain with nearest issuing dateThe user view (for example, utilizing determining unit 1212) (for example, frame 836) of media item.The processing unit 1208 is additionally configured toIn response to determine phonetic entry correspond to obtain the user view of the media item with nearest issuing date and from the of media itemTwo corpus obtain at least one second media item (for example, at least one second media item of frame 838) (for example, utilizing acquisitionUnit 1214).Each media item in second corpus of media item has the predetermined time range in current dateInterior issuing date.The processing unit 1208 is additionally configured to provide at least one second media item (for example, using providing unit1216) (for example, frame 840).

In some instances, determine whether phonetic entry corresponds to the user for obtaining the media item with nearest issuing dateIt is intended to include the user view corresponding for determining whether phonetic entry is included with obtaining the media item with nearest issuing dateA phrase (for example, frame 836) more than two in individual phrase.

In some instances, the processing unit 1208 be additionally configured to determine the political orientation associated with user (for example,Utilize determining unit 1212) (for example, frame 814).Media item of the determination based on user's previous Request or consumption.Based on being determinedPolitical orientation and obtain at least one media item.

In some instances, the processing unit 1208 is additionally configured to determine the skills involved in the labour associated with user(for example, utilizing determining unit 1212) (for example, frame 814).Media item of the determination based on user's previous Request or consumption.It is based onIdentified skills involved in the labour and obtain at least one media item.

Above with reference to the operation described in Fig. 8 A-C optionally by the part shown in Fig. 1-4, Fig. 6 A-B and Fig. 7 A-C Lai realIt is existing.For example, the operation of process 800 can handle mould by operating system 718, application program module 724, I/O processing modules 728, STTBlock 730, natural language processing module 732, glossarial index 744, task flow processing module 736, service processing module 738, one kindOr media services one or more of 120-1 or one or more processors 220,410,704 to realize.This areaThose of ordinary skill can know clearly how to realize based on the part described in Fig. 1-4, Fig. 6 A-B and Fig. 7 A-COther processes.

According to some specific implementations, there is provided a kind of computer-readable recording medium is (for example, non-transient computer readable storageMedium), one or more journeys that the computer-readable recording medium storage is performed by the one or more processors of electronic equipmentSequence, one or more programs include being used for the instruction for performing any one of method or process described herein.

According to some specific implementations, there is provided a kind of electronic equipment (for example, portable electric appts), the electronic equipment includeFor performing the device of any one of method or process described herein.

According to some specific implementations, there is provided a kind of electronic equipment (for example, portable electric appts), the electronic equipment includeIt is configured as performing the processing unit of any one of method or process described herein.

According to some specific implementations, there is provided a kind of electronic equipment (for example, portable electric appts), the electronic equipment includeOne or more processors and storage by one or more of computing devices one or more programs memory, this oneIndividual or multiple programs include being used for the instruction for performing any one of method or process described herein.

For illustrative purposes, description above is described by reference to specific embodiment.However, example aboveThe property shown is discussed being not intended to limit or limits the invention to disclosed precise forms.According to teachings above content, veryMore modifications and variations are all possible.It is to best explain these technologies to select and describe these embodimentsPrinciple and its practical application.Others skilled in the art are thus, it is possible to best utilize these technologies and with suitableIn the various embodiments of the various modifications of desired special-purpose.

Although having carried out comprehensive description to the disclosure and example referring to the drawings, it should be noted that, various change and repairChange and will become obvious for those skilled in the art.It should be appreciated that such change and modifications is considered as being wrappedInclude in the range of the disclosure and example being defined by the claims.

As described above, the one side of the technology of the present invention is to gather and using the data derived from various sources, to improveDelivering it to user may perhaps any other content in inspiration interested.The disclosure is expected, in some instances, these institutesThe data of collection may include to uniquely identify or available for the personal information data for contacting or positioning specific people.Such personal letterBreath data may include demographic data, location-based data, telephone number, e-mail address, home address or any otherIdentification information.

Be benefited the present disclosure recognize that may be used in family using such personal information data in the technology of the present invention.For example,The personal information data can be used for delivering user object content interested.Therefore, such personal information data are used to causePlanned control can be carried out to the content delivered.In addition, the disclosure is it is also contemplated that personal information data are beneficial to user'sOther purposes.

The disclosure be contemplated that be responsible for the collections of such personal information data, analysis, openly, transmission, storage or other useThe entity on way will comply with the privacy policy established and/or privacy practice.Specifically, such entity should be carried out and adhere to usingIt is acknowledged as being met or exceeded by the privacy political affairs to safeguarding the privacy of personal information data and the industry of security or administration requestPlan and practice.For example, the personal information from user should be collected for the legal and rational purposes of entity, and do not existShare or sell outside these legal uses.In addition, such collection should be carried out only after user's informed consent.In addition, thisClass entity should take any required step, to ensure and protect the access to such personal information data, and guaranteeAccess personal information data other people observe their privacy policy and program.In addition, this entity can make itself to be subjected toThird party is assessed to prove that it observes the privacy policy accepted extensively and practice.

Regardless of afore-mentioned, the disclosure is it is also contemplated that user optionally prevents to use or access personal information dataEmbodiment.I.e. the disclosure is expected that hardware element and/or software element can be provided, to prevent or prevent to such personal information numberAccording to access.For example, for advertisement delivery service, technology of the invention can be configured as allowing user during registration service" addition " or " exiting " is selected to participate in the collection to personal information data.And for example, user may be selected not as object content delivering clothesBusiness provides positional information.For another example, user may be selected not providing accurate positional information, but granted transmission position area information.

Therefore, although the disclosure is widely covered using personal information data to realize that one or more is various disclosedEmbodiment, but the disclosure it is also contemplated that various embodiments also can in the case where such personal information data need not be accessed quiltRealize.That is, the various embodiments of the technology of the present invention will not due to lack such personal information data all or part of andIt can not be normally carried out.For example, can by the personal information based on non-personal information data or absolute bottom line such as with userContent, other non-personal information available to content delivery services or publicly available information that associated equipment is asked pushes awayDisconnected preference, so as to select content and be delivered to user.

Claims

1. a kind of be used to operate digital assistants to probe into the method for media item, methods described includes：

At the electronic equipment with memory and one or more processors：

The phonetic entry for representing the request to one or more media items is received from user；

Determine whether the phonetic entry corresponds to the user view for obtaining the personalized recommendation for media item；And

In response to determining that the phonetic entry corresponds to user view of the acquisition for the personalized recommendation of media item：

At least one media item, the language specific to user of the media item are obtained from the corpus specific to user of media itemMaterial storehouse is generated based on the data associated with the user；And

At least one media item is provided.

2. according to the method for claim 1, obtained wherein determining whether the phonetic entry corresponds to for media itemWhether the quantity for the parameter that the user view of personalized recommendation includes determining to limit in the phonetic entry is less than threshold value.

3. according to the method for claim 1, obtained wherein determining whether the phonetic entry corresponds to for media itemThe user view of personalized recommendation includes determining whether the phonetic entry includes with obtaining the personalized recommendation for media itemThe user view corresponding to a phrase in multiple phrases.

4. according to the method for claim 1, obtained wherein determining whether the phonetic entry corresponds to for media itemThe user view of personalized recommendation includes determining whether the phonetic entry is related to the user.

5. according to the method for claim 1, wherein the corpus specific to user of the media item is based on previously by instituteState the media item of user's selection or request and generate.

6. according to the method for claim 1, wherein the corpus specific to user of the media item is based on previously by instituteState the media item of user's refusal and generate.

7. according to the method for claim 1, wherein the corpus specific to user of the media item is based on and the useThe personal library of the associated media item in family and generate.

8. the method according to claim 11, in addition to：

Using the media order models specific to user come to the multiple of the corpus specific to user from the media itemAlternative media item is ranked up, and the media order models specific to user are based on multiple previous medias from the userAssociation requests and generate, wherein obtaining at least one media item includes sorting come from the multiple alternative media based on describedItem selects at least one media item.

9. according to the method for claim 8, wherein the multiple alternative media item is based in the multiple alternative media itemThe supporting rate of each media item be ranked up.

10. the method according to claim 11, in addition to：

The second phonetic entry is received from the user；

Determine whether second phonetic entry corresponds to the refusal at least one media item；And

In response to determining that second phonetic entry corresponds to the refusal at least one media item：

Specific to the media order models of user described in being updated according to the refusal.

11. the method according to claim 11, in addition to：

Based on the refusal at least one media item come the institute to the corpus specific to user from media itemMultiple alternative media items are stated to be resequenced；And

At least one second media item is selected from the multiple alternative media item based on the rearrangement.

12. according to the method for claim 1, wherein each media in the corpus specific to user of the media itemItem includes indicating the movable metadata associated with the media item, and the wherein described movable sound based on the media itemHappy rhythm and it is associated with the media item.

13. according to the method for claim 1, wherein each media in the corpus specific to user of the media itemItem includes indicating the metadata of the mood associated with the media item, and wherein described sound of the mood based on the media itemMusical sound adjust and it is associated with the media item.

14. the method according to claim 11, in addition to：

Determine whether the phonetic entry limits the occasion associated with the period；And

In response to determining that the phonetic entry limits the occasion associated with the period, obtained based on the occasion described at leastOne media item, wherein at least one media item includes the metadata for indicating the occasion.

15. the method according to claim 11, in addition to：

Determine whether the phonetic entry limits the edit list associated with media representatives；And

In response to determining that the phonetic entry limits the edit list associated with media representatives, based on media representatives' phaseThe edit list of association obtains at least one media item, wherein at least one media item includes instruction and instituteState the metadata of the associated edit list of media representatives.

16. the method according to claim 11, in addition to：

Determine whether the phonetic entry limits mood；And

In response to determining that the phonetic entry limits mood, at least one media item is obtained based on the mood, whereinAt least one media item includes the metadata for indicating the mood.

17. the method according to claim 11, in addition to：

Determine the phonetic entry whether restriction activity；And

In response to determining the phonetic entry restriction activity, at least one media item is obtained based on the activity, whereinAt least one media item includes indicating the movable metadata.

18. the method according to claim 11, in addition to：

Determine the phonetic entry whether limiting time section；

In response to determining the phonetic entry limiting time section, it is related to the period to determine whether the phonetic entry limitsThe school of connection；And

In response to determining that the phonetic entry limits the school associated with the period, based on the period and the streamSend determine subflow group, wherein at least one media item be based on the subflow send and be acquired, and wherein it is described at leastOne media item includes the metadata for indicating the subflow group.

19. according to the method for claim 1, wherein the phonetic entry limits the classification of media item, wherein described in obtainingAt least one media item includes obtaining the multiple media items associated with the classification of the media item, and also includes：

Familiarity of the user to the classification of the media item is determined, wherein the average support of the multiple media itemThe familiarity of the rate based on the user to the classification of the media item.

20. the method according to claim 11, in addition to：

Speaker identification is performed by using the phonetic entry to determine the identity of the user；And

Based on the identified identity of the user, the media item is determined from multiple corpus specific to user of media itemThe corpus specific to user.

21. according to the method for claim 1, wherein obtain at least one media item include by crypto token send toRemote server, the crypto token include customer identification information, and wherein need the crypto token with via described remoteJourney server accesses the corpus specific to user of the media item.

22. the method according to claim 11, in addition to：

It is determined that whether domain corresponding with the phonetic entry is a predetermined domain in multiple predetermined domains；

In response to determining that domain corresponding with the phonetic entry is a predetermined domain in multiple predetermined domains：

When providing at least one media item, audio input is received；

Determine whether the audio input includes voice；And

In response to determining that the audio input does not include voice, stop receiving audio input after predetermined time quantum.

23. the method according to claim 11, in addition to：

In response to determining that the audio input includes voice：

Determine whether the voice of the audio input corresponds to and the phonetic entry identical domain；

In response to determining that the voice of the audio input corresponds to and the phonetic entry identical domain：

It is determined that user view corresponding with the voice of the audio input；And

The response to the audio input is provided according to the user view corresponding with the voice of the audio input.

24. the method according to claim 11, in addition to：

In response to determining that the voice of the audio input does not correspond to and the phonetic entry identical domain：

Abandon providing the response to the audio input.

25. according to the method for claim 22, wherein the predetermined time quantum is based in the audio inputThe degree of the ambient noise detected.

26. the method according to claim 11, wherein it is described at least one including playing to provide at least one media itemMedia item in media item, and also include：

When playing the media item, the 3rd phonetic entry is received；

User corresponding with the 3rd phonetic entry is determined based on the media item and the 3rd phonetic entry that are playingIt is intended to；And

Response is provided according to the user view corresponding with the 3rd phonetic entry.

27. the method according to claim 11, in addition to：

In response to determining that the phonetic entry does not correspond to user view of the acquisition for the personalized recommendation of media item：

Determine whether the phonetic entry corresponds to the user view for obtaining the media item with nearest issuing date；And

In response to determining that the phonetic entry corresponds to the user view of media item of the acquisition with nearest issuing date：

At least one second media item is obtained from the second corpus of media item, wherein in the second corpus of the media itemEach media item has the issuing date in the predetermined time range of current date；And

At least one second media item is provided.

28. according to the method for claim 1, obtained wherein determining whether the phonetic entry corresponds to distribution recentlyThe user view of the media item on date includes determining whether the phonetic entry includes with obtaining the matchmaker with nearest issuing dateA phrase in more than second individual phrases corresponding to the user view of body item.

29. the method according to claim 11, in addition to：

It is determined that the political orientation associated with user, described to determine based on the previous media item asked or consumed by the user,Wherein described at least one media item is acquired based on identified political orientation.

30. the method according to claim 11, in addition to：

It is determined that the skills involved in the labour associated with user, the determination is based on the previous media asked or consumed by the user, wherein at least one media item is acquired based on identified skills involved in the labour.

31. a kind of computer-readable recording medium, the computer-readable recording medium storage is configured as by electronic equipmentOne or more programs that one or more processors perform, one or more of programs include being used for what is operated belowInstruction：

At least one media item is provided.

32. computer-readable medium according to claim 31, obtained wherein determining whether the phonetic entry corresponds toWhether the quantity of the parameter limited for the user view of the personalized recommendation of media item including determination in the phonetic entryLess than threshold value.

33. computer-readable medium according to claim 31, obtained wherein determining whether the phonetic entry corresponds toInclude determining whether the phonetic entry includes being directed to media item with obtaining for the user view of the personalized recommendation of media itemPersonalized recommendation the user view corresponding to a phrase in multiple phrases.

34. computer-readable medium according to claim 31, obtained wherein determining whether the phonetic entry corresponds toInclude determining whether the phonetic entry is related to the user for the user view of the personalized recommendation of media item.

35. computer-readable medium according to claim 31, wherein the corpus specific to user of the media itemGenerated based on the media item for previously having been selected or having been asked by the user.

36. computer-readable medium according to claim 31, wherein the corpus specific to user of the media itemBased on previously being generated by the media item that the user refuses.

37. computer-readable medium according to claim 31, wherein the corpus specific to user of the media itemPersonal library based on the media item associated with the user and generate.

38. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

39. the computer-readable medium according to claim 38, wherein the multiple alternative media item is based on the multipleThe supporting rate of each media item in alternative media item is ranked up.

40. the computer-readable medium according to claim 38, in addition to：

The second phonetic entry is received from the user；

41. computer-readable medium according to claim 40, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

42. computer-readable medium according to claim 31, wherein the corpus specific to user of the media itemIn each media item include indicating the movable metadata associated with the media item, and wherein described activity is based on instituteState the music rhythm of media item and associated with the media item.

43. computer-readable medium according to claim 31, wherein the corpus specific to user of the media itemIn each media item include the metadata for indicating the mood associated with the media item, and wherein described mood is based on instituteState the music tone of media item and associated with the media item.

44. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

45. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction of following operation：

46. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

Determine whether the phonetic entry limits mood；And

47. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

Determine the phonetic entry whether restriction activity；And

48. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

Determine the phonetic entry whether limiting time section；

49. computer-readable medium according to claim 31, wherein the phonetic entry limits the classification of media item, itsIt is middle to obtain at least one media item including obtaining the multiple media items associated with the classification of the media item, andWherein one or more of programs further comprise the instruction for being operated below：

50. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

51. computer-readable medium according to claim 31, wherein obtaining at least one media item includes addingSecret order board is sent to remote server, and the crypto token includes customer identification information, and wherein needs the crypto tokenTo access the corpus specific to user of the media item via the remote server.

52. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

When providing at least one media item, audio input is received；

Determine whether the audio input includes voice；And

53. computer-readable medium according to claim 52, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

In response to determining that the audio input includes voice：

54. computer-readable medium according to claim 53, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

Abandon providing the response to the audio input.

55. computer-readable medium according to claim 52, wherein the predetermined time quantum is based on describedThe degree of the ambient noise detected in audio input.

56. computer-readable medium according to claim 31, wherein providing at least one media item includes playingMedia item at least one media item, and wherein one or more of programs further comprise be used for carry out it is followingThe instruction of operation：

When playing the media item, the 3rd phonetic entry is received；

57. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

At least one second media item is provided.

58. computer-readable medium according to claim 31, obtained wherein determining whether the phonetic entry corresponds toThe user view of media item with nearest issuing date includes determining whether the phonetic entry includes with obtaining with recentlyA phrase in more than second individual phrases corresponding to the user view of the media item at issuing date.

59. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

60. computer-readable medium according to claim 31, wherein one or more of programs further comprise usingIn the instruction for carrying out following operation：

61. a kind of be used to operate digital assistants to probe into the electronic equipment of media item, the electronic equipment includes：

One or more processors；With

Memory, the memory storage are configured as one or more programs by one or more of computing devices,One or more of programs also include being used for the instruction operated below：

At least one media item is provided.

62. equipment according to claim 61, wherein determining whether the phonetic entry corresponds to obtain is directed to media itemThe user view quantity of parameter that includes determining to limit in the phonetic entry of personalized recommendation whether be less than threshold value.

63. equipment according to claim 61, wherein determining whether the phonetic entry corresponds to obtain is directed to media itemThe user view of personalized recommendation include determining whether the phonetic entry includes pushing away with the personalization obtained for media itemA phrase in multiple phrases corresponding to the user view recommended.

64. equipment according to claim 61, wherein determining whether the phonetic entry corresponds to obtain is directed to media itemThe user view of personalized recommendation include determining whether the phonetic entry is related to the user.

65. equipment according to claim 61, wherein the corpus specific to user of the media item be based on previously byThe media item of user selection or request and generate.

66. equipment according to claim 61, wherein the corpus specific to user of the media item be based on previously byThe media item of user refusal and generate.

67. equipment according to claim 61, wherein the corpus specific to user of the media item be based on it is describedThe personal library for the media item that user is associated and generate.

68. equipment according to claim 61, in addition to：

69. equipment according to claim 68, wherein the multiple alternative media item is based on the multiple alternative media itemIn the supporting rate of each media item be ranked up.

70. equipment according to claim 68, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

The second phonetic entry is received from the user；

71. equipment according to claim 70, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

72. equipment according to claim 61, wherein each matchmaker in the corpus specific to user of the media itemBody item includes indicating the movable metadata associated with the media item, and wherein described activity is based on the media itemMusic rhythm and it is associated with the media item.

73. equipment according to claim 61, wherein each matchmaker in the corpus specific to user of the media itemBody item includes indicating the metadata of the mood associated with the media item, and wherein described mood is based on the media itemMusic tone and it is associated with the media item.

74. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

75. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

76. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

Determine whether the phonetic entry limits mood；And

77. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

Determine the phonetic entry whether restriction activity；And

78. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

Determine the phonetic entry whether limiting time section；

79. equipment according to claim 61, wherein the phonetic entry limits the classification of media item, wherein described in obtainingAt least one media item includes obtaining the multiple media items associated with the classification of the media item, and wherein described oneIndividual or multiple programs further comprise the instruction for being operated below：

80. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

Based on the identity of the identified user, the media are determined from the corpus specific to user of multiple media itemsThe corpus specific to user of item.

81. equipment according to claim 61, wherein obtaining at least one media item includes sending crypto tokenTo remote server, the crypto token includes customer identification information, and wherein needs the crypto token with via describedRemote server accesses the corpus specific to user of the media item.

82. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

When providing at least one media item, audio input is received；

Determine whether the audio input includes voice；And

83. the equipment according to claim 82, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

In response to determining that the audio input includes voice：

84. the equipment according to claim 83, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

Abandon providing the response to the audio input.

85. the equipment according to claim 82, wherein the predetermined time quantum is based in the audio inputThe degree of the ambient noise detected.

86. equipment according to claim 61, wherein providing at least one media item is included at least one described in broadcastingMedia item in individual media item, and wherein one or more of programs further comprise the finger for being operated belowOrder：

When playing the media item, the 3rd phonetic entry is received；

87. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

At least one second media item is provided.

88. equipment according to claim 61, sent out recently wherein determining whether the phonetic entry corresponds to obtain to haveThe user view of the media item on row date includes determining whether the phonetic entry includes with obtaining with nearest issuing dateA phrase in more than second individual phrases corresponding to the user view of media item.

89. equipment according to claim 61, wherein one or more of programs further comprise be used for carry out it is followingThe instruction of operation：

90. equipment according to claim 61, wherein one or more of programs further comprise being used for carrying out it is followingThe instruction of operation：

The determination is based on the previous media item asked or consumed by the user, wherein at least one media item is based on instituteThe skills involved in the labour of determination and be acquired.

91. a kind of equipment, the equipment includes being used for the device for performing the method according to any one of claim 1-30.