CROSS-REFERENCE TO RELATED DOCUMENTS
The present application is a conversion of provisional application No. 60/302,736 to a non-provisional patent application, claims priority benefit under 35 U.S.C. 119(e) of provisional patent application serial No. 60/302,736 filed on Jul. 3, 2001, and incorporates all disclosure of the prior application by reference.[0001]
FIELD OF THE INVENTION
The present invention is in the area of software application development, deployment, and maintenance and pertains particularly to methods and apparatus for platform-independent development, deployment, and maintenance of voice applications in a communications environment.[0002]
BACKGROUND OF THE INVENTION
A speech application is one of the most challenging applications to develop, deploy and maintain in a communications (typically telephony) environment. Expertise required for developing and deploying a viable application includes expertise in computer telephony integration (CTI) hardware and software, voice recognition software, text-to-speech software, and speech application logic.[0003]
With the relatively recent advent of the voice extensible markup language (VXML), the expertise required to develop a speech solution has been reduced somewhat. VXML is a language that enables a software developer to focus on the application logic of the voice application without being required to configure underlying telephony components. Typically, the developed voice application is run on a VXML interpreter that resides and executes on the associated telephony system to deliver the solution.[0004]
As is shown in FIG. 1A (prior art), a typical architecture of a VXML-compliant telephony system comprises a voice application server (110) and a VXML-compliant telephony server (130). Typical steps for development and deployment of a VXML-enabled IVR solution are briefly described below using the elements of FIG. 1A.[0005]
Firstly, a new application database (113) is created or an existing one is modified to support VXML. Application logic 112 is designed in terms of workflow and adapted to handle the routing operations of the IVR system. VXML pages, which are results of functioning application logic, are rendered by a VXML rendering engine (111) based on a specified generation sequence.[0006]
Secondly, an object facade to server 130 is created comprising the corresponding VXML pages and is sent to server 130 over a network (120), which can be the Internet, an Intranet, or an Ethernet network. The VXML pages are integrated into rendering engine 111 such that they can be displayed according to set workflow at server 110.[0007]
Thirdly, the VXML telephony server 130 is configured to enable proper retrieval of specific VXML pages from rendering engine 111 within server 110. A triggering mechanism is provided to server 110 so that when a triggering event occurs, an appropriate outbound call is placed from server 110.[0008]
A VXML interpreter (131), a voice recognition text-to-speech engine (132), and the telephony hardware/software (133) are provided within server 130 and comprise the server function. In prior art, the telephony hardware/software 133 along with the VXML interpreter 131 are packaged as an off-the-shelf IVR-enabling technology. Arguably the most important component of the entire system, however, is the application server 110. The application logic (112) is typically written in a programming language such as Java and packaged as an enterprise Java Bean archive. The presentation logic required is handled by rendering engine 111 and is written in JSP or PERL.[0009]
A drawback to this type of prior-art system is that the developer must be highly skilled in VXML language compilation, database programming, and advanced application server design. Provision of such a prior-art solution typically takes considerable time to develop and deploy. Further, if debugging or user-requested modifications are required, the developer will typically be the only person who understands the system and therefore the only one who can successfully debug or modify it. The ad hoc nature of prior-art design and deployment of the VXML solution greatly increases the production cycle and cost of the IVR solution.[0010]
In light of the above-described drawbacks associated with prior art, improvements are needed in terms of ease of use, including enabling persons regardless of programming experience to develop and deploy an IVR solution. Speed of development and deployment as well as platform independence are also issues that need improvement.[0011]
What is clearly needed is a user-friendly application for developing, deploying and maintaining a platform-independent voice application in a telephony environment without the requirement of traditional developers' skills.[0012]
SUMMARY OF THE INVENTION
In a preferred embodiment of the present invention a system for developing and deploying a voice application over a communications network to one or more recipients is provided, comprising a voice application server connected to a data network for storing and serving voice applications, a network communications server connected to the data network and to the communications network for routing the voice applications to their intended recipients, a computer station connected to the data network having control access to at least the voice application server, and a software application running on the computer station for creating applications and managing their states. The system is characterized in that a developer operating the software application from the computer station creates voice applications through object modeling and linking, stores them for deployment in the application server, and manages deployment and state of deployed applications including scheduled deployment and repeat deployments in terms of intended recipients.[0013]
In preferred embodiments the communications network is a telephony network accessible to the data network, and may also be a data network. In some embodiments the communications network is the Internet network. Also in preferred embodiments the voice application server serves voice applications through the data network to the network communications server. The network communications server may be a telephony server. Further, the telephony server may be VXML compliant and connected to a telephony network. Also the telephony network may be a PSTN network.[0014]
In some cases the computer station is a desktop computer station having a display monitor, and the computer station may be a laptop computer having a wireless connection to the data network. The computer station and the voice application server may share the same physical domain, which could be a building, or a machine.[0015]
In some embodiments the communications server is an Internet server enabled for voice using VXML. In some embodiments recipients may use a telephone to access the voice application, or in some cases a computer. In some embodiments of the invention the system further comprises a voice portal, and the voice portal may be an IVR.[0016]
In another aspect of the invention a software application for designing and deploying voice applications over a communications network is provided, comprising a client interface for creating and managing contacts for receiving voice applications, a client interface for designing dialogs including responses of a voice application, a client interface for creating and managing actions related to dialog deployment based on call exception handling, and a client interface for integrating dialogs and data fetch and send operations to form a complete executable voice application. The software application is characterized in that the operations configured and executed are parameters of dialog objects integrated to form the voice application and wherein execution of a first dialog object results in at least one of an expected state including interaction resulting in subsequent execution of further dialog objects and/or configured actions including data fetch and send operations associated with the objects and with interaction with the dialogs.[0017]
In preferred embodiments of the software the communications network is a telephony network accessible to a data network and in some embodiments a data network. The network may be the Internet. Also the application may be configured to execute on a computer station, which may be a desktop or a portable computer. The client interface is preferably platform independent.[0018]
In some embodiments object modeling is used to create voice applications. There may further be a dialog controller within the application server, the dialog controller adapted to access business rules and to access stored data according to interpretation of dialog responses received. In these embodiments the stored data may be accessed from a third-party source.[0019]
In still another aspect of the invention a method for developing an interactive voice application using an object-oriented software application is provided, comprising the steps of (a) naming and describing a voice application under development; (b) identifying contacts for receiving the voice application; (c) creating at least one executable dialog defining the voice application; (d) applying business rules and exception handling rules for each dialog created; (e) configuring the state of execution of the voice application; and (f) deploying the voice application for launch according to type.[0020]
In some embodiments, in step (c), the dialog is configured for one of inbound deployment or outbound deployment. In step (d) the exceptions may be telephony events. Further, step (e) may include scheduling parameters.[0021]
In yet another aspect of the invention a voice application server for developing and serving voice applications to a distribution point in a network is provided, comprising an instance of voice application development software for developing application components, an input port for enabling network access to the application server, a resource adapter and application program interface for accessing data from internal or external data sources, and application logic for directing application development and deployment. The voice application server is characterized in that a developer designs and configures applications for deployment from the application server from a remote address and wherein the application server serves the completed applications to the distribution point based on predefined rules.[0022]
BRIEF DESCRIPTION OF THE DRAWING FIGURES
FIG. 1A is a block diagram illustrating a basic architecture of a VXML-enabled IVR development and deployment environment according to prior art.[0023]
FIG. 1B is a block diagram illustrating the basic architecture of FIG. 1A enhanced to practice the present invention.[0024]
FIG. 2 is a process flow diagram illustrating steps for creating a voice application shell or container for a VXML voice application according to an embodiment of the present invention.[0025]
FIG. 3 is a block diagram illustrating a simple voice application container according to an embodiment of the present invention.[0026]
FIG. 4 is a block diagram illustrating a dialog object model according to an embodiment of the present invention.[0027]
FIG. 5 is a process flow diagram illustrating steps for voice dialog creation for a VXML-enabled voice application according to an embodiment of the present invention.[0028]
FIG. 6 is a block diagram illustrating a dialog transition flow after initial connection with a consumer according to an embodiment of the present invention.[0029]
FIG. 7 is a plan view of a developer's frame containing a developer's login screen according to an embodiment of the present invention.[0030]
FIG. 8 is a plan view of a developer's frame containing a screen shot of a home page of the developer's platform interface of FIG. 7.[0031]
FIG. 9 is a plan view of a developer's frame containing a screen shot of an address book 911 accessible through interaction with the option Address in section 803 of the previous frame of FIG. 8.[0032]
FIG. 10 is a plan view of a developer's frame displaying a screen 1001 for creating a new voice application.[0033]
FIG. 11 is a plan view of a developer's frame illustrating the screen of FIG. 10 showing further options revealed by scrolling down.[0034]
FIG. 12 is a screen shot of a dialog configuration window illustrating a dialog configuration page according to an embodiment of the invention.[0035]
FIG. 13 is a screen shot 1300 of the dialog design panel of FIG. 12 illustrating progression of dialog state to a subsequent contact.[0036]
FIG. 14 is a screen shot of a thesaurus configuration window activated from the example of FIG. 13 according to a preferred embodiment.[0037]
FIG. 15 is a plan view of a developer's frame illustrating a screen for managing created modules according to an embodiment of the present invention.[0038]
DESCRIPTION OF THE PREFERRED EMBODIMENTS
According to preferred embodiments of the present invention, the inventor teaches herein, in an enabling fashion, a novel system for developing and deploying real-time dynamic or static voice applications in an object-oriented way that enables inbound or outbound delivery of IVR and other interactive voice solutions in supported communications environments.[0039]
FIG. 1A is a block diagram illustrating a basic architecture of a VXML-enabled IVR development and deployment environment according to prior art. As described with reference to the background section, the prior-art architecture of this example is known to and available to the inventor. Developing and deploying voice applications for the illustrated environment, which in this case is a telephony environment, requires a very high level of skill in the art. Elements of this prior-art example that have already been introduced with respect to the background section of this specification shall not be re-introduced.[0040]
In this simplified scenario, voice application server 110 utilizes database/resource adapter 113 for accessing a database or other resources for content. Application logic 112 comprising VXML script, business rules, and underlying telephony logic must be carefully developed and tested before single applications can be rendered by rendering engine 111. Once voice applications are complete and servable from server 110, they can be deployed through data network 120 to telephony server 130 where interpreter 131 and text-to-speech engine 132 are utilized to formulate and deliver the voice application in usable or playable format for telephony software and hardware 133. The applications are accessible to a receiving device, illustrated herein as device 135, a telephone, through the prevailing network 134, which is in this case a public-switched-telephone-network (PSTN) linking the telephony server to the consumer (device 135) generally through a telephony switch (not shown).[0041]
Improvements to this prior-art example in embodiments of the present invention concern and are focused on the capabilities of application server 110 with respect to development and deployment issues and with respect to overall enhancement of response capabilities and options in interaction dialog that is bi-directional. Using the description of the existing, state-of-the-art architecture, the inventor herein describes additional components that are not shown in the prior-art example of FIG. 1A, but are illustrated in a novel version of the example represented herein by FIG. 1B.[0042]
FIG. 1B is a block diagram illustrating the basic architecture of FIG. 1A enhanced to illustrate an embodiment of the present invention. Elements of the prior-art example of FIG. 1A that are also illustrated in FIG. 1B retain their original element numbers and are not re-introduced. For reference purposes an entity (a person) that develops a voice application shall be referred to hereinafter in this specification as either a producer or developer.[0043]
A developer or producer of a voice application according to an embodiment of the present invention operates preferably from a remote computerized workstation illustrated herein as station 140. Station 140 is essentially a network-connected computer station. Station 140 may be housed within the physical domain also housing application server 110. In another embodiment, station 140 and application server 110 may reside in the same machine. In yet another embodiment, a developer may operate station 140 from his or her home office or from any network-accessible location including any wireless location.[0044]
Station 140 is equipped with a client software tool (CL) 141, which is adapted to enable the developer to create and deploy voice applications across the prevailing system represented by servers 110, 130, and by receiving device 135. CL 141 is a Web interface application similar to or incorporated with a Web browser application in this example; however, other network situations may apply instead. CL 141 contains the software tools required for the developer to enable enhancements according to embodiments of the invention. Station 140 is connected to a voice portal 143 that is maintained either on the data network (Internet, Ethernet, Intranet, etc.) and/or within telephony network 134. In this example portal 143 is illustrated logically in both networks. Voice portal 143 is adapted to enable a developer or a voice application consumer to call in and perform functional operations (such as access, monitor, modify) on selected voice applications.[0045]
Within application server 110 there is an instance of voice application development server 142 adapted in conjunction with the existing components 111-113 to provide dynamic voice application development and deployment according to embodiments of the invention.[0046]
Portal 143 is accessible via network connection to station 140 and via a network bridge to a voice application consumer through telephony network 134. In one example, portal 143 is maintained as part of application server 110. Portal 143, in addition to being an access point for consumers, is chiefly adapted as a developer's interface server. Portal 143 is enabled by a SW instance 144 adapted as a server instance to CL 141. In a telephony embodiment, portal 143 may be an interactive voice response (IVR) unit.[0047]
In a preferred embodiment, the producer or developer of a voice application accesses application server 110 through portal 143 and data network 120 using remote station 140 as a "Web interface" and first creates a list of contacts. In an alternative embodiment, station 140 has direct access to application server 110 through a network interface. Contacts are analogous to consumers of created voice applications. CL 141 displays, upon request and in order of need, all of the required interactive interfaces for designing, modifying, instantiating, and executing completed voice applications to launch from application server 110 and to be delivered by server 130.[0048]
The software of the present invention enables voice applications to be modeled as a set of dialog objects having business and telephony (or other communication delivery/access system) rules as parameters without requiring the developer to perform complicated coding operations. A dialog template is provided for modeling dialog states. The dialog template creates the actual speech dialog, specifies the voice application consumer (recipient) of the dialog, captures the response from the voice application consumer and performs any follow-up actions based upon system interpretation of the consumer response. A dialog is a reusable component and can be linked to a new dialog or to an existing (stored) dialog. A voice application is a set of dialogs inter-linked by a set of business rules defined by the voice application producer. Once the voice application is completed, it is deployed by server 110 and is eventually accessible to the authorized party (device 135) through telephony server 130.[0049]
The voice applications are in a preferred embodiment in the form of VXML to run on VXML-compliant telephony server 130. This process is enabled through VXML rendering engine 111. Engine 111 interacts directly with server 130, locates the voice application at issue, retrieves its voice application logic, and dynamically creates the presentation in VXML and forwards it to server 130 for processing and delivery. Once interpreter 131 interprets the VXML presentation it is sent to or accessible to device 135 in the form of an interactive dialog (in this case an IVR dialog). Any response from device 135 follows the same path back to application server 110 for interpretation by engine 111. Server 110 then retrieves the voice application profile from the database accessible through adapter 113 and determines the next business rule to execute locally. Based upon the determination a corresponding operation associated with the rule is taken. A next (if required) VXML presentation is then forwarded to rendering engine 111, which in turn dynamically generates the next VXML page for interpretation, processing and deployment at server 130. This two-way interaction between the VXML-compliant telephony server (130) and the voice application server (110) continues in the form of an automated logical sequence of VXML dialogs until the voice application finally reaches its termination state.[0050]
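Purely as an illustrative sketch of the kind of page rendering engine 111 might dynamically generate for interpreter 131 (the form identifier, grammar file name, and submit address below are hypothetical and are not part of the disclosure), a single dialog prompt of the stock-quote example described later in this specification could be rendered in VXML as follows:

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="select_stock_dialog">
    <field name="selection">
      <!-- Dialog description rendered with the dynamic content already resolved -->
      <prompt>Good morning Leo, what stock quote do you want?</prompt>
      <!-- Grammar built from the recurring stock list resource -->
      <grammar src="portfolio_stocks.grxml" type="application/srgs+xml"/>
    </field>
    <block>
      <!-- The captured response is returned to application server 110 for business rule interpretation -->
      <submit next="http://appserver.example.com/dialog/response" namelist="selection"/>
    </block>
  </form>
</vxml>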
A voice application (set of one or more dialogs) can be delivered to the consumer (target audience) in outbound or inbound fashion. For an inbound voice application, a voice application consumer calls in to voice portal 143 to access the inbound voice application served from server 130. The voice portal can be mapped to a phone number directly or as an extension to a central phone number. In a preferred embodiment the voice portal also serves as a community forum where voice application producers can put their voice applications into groups for easy access and perform operational activities such as voice application linking, reporting, and text-to-speech recording and so on.[0051]
For an outbound voice application there are two sub-types. These are on-demand outbound applications and scheduled outbound applications. For on-demand outbound applications, server 110 generates an outbound call as soon as the voice application producer issues an outbound command associated with the application. The outbound call is made to the target audience and upon the receipt of the call the voice application is launched from server 130. For scheduled outbound applications, the schedule server (not shown within server 110) launches the voice application as soon as the producer-specified date and time has arrived. In a preferred embodiment both on-demand and scheduled outbound application deployment functions support unicast, multicast, and broadcast delivery schemes.[0052]
As described above, a voice application created by application server 110 consists of one or more dialogs. The contents of each dialog can be static or dynamic. Static content originates from the voice application producer; the producer creates the contents when the voice application is created. Dynamic content is sourced from a third-party data source.[0053]
In a preferred embodiment a developer's tool contains an interactive dialog design panel (described in detail later) wherein a producer inputs a reference link in the form of eXtensible Markup Language (XML) into the dialog description or response field. When a dialog response is executed and interpreted by application server 110, the reference link invokes a resource Application-Program-Interface (API) that is registered in resource adapter 113. The API goes out in real time, retrieves the requested data and integrates the returned data into the existing dialog. The resulting and subsequent VXML page being generated has the dynamic data embedded in it.[0054]
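For example, a reference link placed in a dialog description field might take the general form shown below, following the resource tag convention detailed later in this specification with respect to the stock-quote example; the resource label get_account_balance used here is hypothetical and serves only to illustrate a registered adapter resource taking a system-supplied parameter:

<resource type='ADAPTER' name='get_account_balance'>
  <param>
    <resource type='SYSTEM' name='target_contact_id'/>
  </param>
</resource>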
One object of the present invention is a highly dynamic, real-time IVR system that tailors itself automatically to the application developer's specified data source requirement. Another object of the present invention is to enable rapid development and deployment of a voice application without requirement of any prior knowledge of VXML or any other programming technologies. A further object of the present invention is to reduce the typical voice application production cycle and drastically reduce the cost of production.[0055]
FIG. 2 is a process flow diagram illustrating steps for creating a voice application shell or container for a VXML voice application according to an embodiment of the present invention. A developer, utilizing a client application known as a thin client analogous to CL 141 on station 140 described with reference to FIG. 1B, creates a voice application shell or voice application container. At step 201 the developer logs in to the system at a login page. At step 202 the developer creates a contact list of application consumers. Typically a greeting or welcome page would be displayed before step 202. An application consumer is an audience of one or more entities that would have access to and interact with a voice application. A contact list is first created so that all of the intended contacts are available during voice application creation if call routing logic is required later on. The contact list can be entered by the producer one contact at a time, or may be imported as a set list from organizer/planner software, such as Microsoft Outlook™ or perhaps a PDA organizer.[0056]
In one embodiment of the present invention the contact list may reside on an external device accessed by a provided connector (not shown) that is configured properly and adapted for the purpose of accessing and retrieving the list. This approach may be used, for example, if a large, existing customer database is used. Rather than create a copy, the needed data is extracted from the original and provided to the application.[0057]
At step 203, a voice application header is populated. A voice application header is simply a title field for the application. The field contains a name for the application and a description of the application. At step 204, the developer assigns either an inbound or outbound state for the voice application. An outbound application is delivered through an outbound call, while the consumer accesses an inbound voice application.[0058]
In the case of the inbound application, in step 205 the system sets a default addressee for inbound communications. The developer selects a dialog from a configured list in step 206. It is assumed in this example that the dialogs have already been created. At step 207, the developer executes the dialog and it is deployed automatically.[0059]
In the case of an outbound designation in step 204, the developer chooses a launch type in step 208. A launch type can be either an on-demand type or a scheduled type. If the choice made by the developer in step 208 is scheduled, then in step 209 the developer enters all of the appropriate time and date parameters for the launch, including parameters for recurring launches of the same application. In the case of an on-demand selection for application launch in step 208, in step 210 the developer selects one or more contacts from the contact list established in step 202. It is noted herein that step 210 is also undertaken by the developer after step 209 in the case of a scheduled launch. At step 207, the dialog is created. In this step a list of probable dialog responses for a voice application wherein interaction is intended may also be created and stored for use.[0060]
In general sequence, a developer creates a voice application and integrates the application with a backend data source or, optionally, any third party resources and deploys the voice application. The application consumer then consumes the voice application and optionally, the system analyzes any consumer feedback collected by the voice application for further interaction if appropriate. The steps of this example pertain to generating and launching a voice application from “building blocks” that are already in place.[0061]
FIG. 3 is a block diagram illustrating a simple voice application container 300 according to an embodiment of the present invention. Application container 300 is a logical container or "voice application object" 300. Also termed a shell, container 300 is logically illustrated as a possible result of the process of FIG. 2 above. Container 300 contains one or more dialog states illustrated herein as dialogs 301a-n labeled in this example as dialogs 1-4. Dialogs 301a-n are objects and therefore container 300 is a logical grouping of the set of dialog objects 301a-n.[0062]
The represented set of dialog objects 301a-n is interlinked by business rules labeled rules 1-4 in this example. Rules 1-4 are defined by the developer and are rule objects. It is noted herein that there may be many more or fewer dialog objects 301a-n as well as interlinking business rule objects 1-4 comprising container object 300 without departing from the spirit and scope of the present invention. The inventor illustrates four of each entity and deems the representation sufficient for the purpose of explaining the present invention.[0063]
In addition to the represented objects, voice application shell 300 includes a plurality of settings options. In this example, basic settings options are tabled for reference and given the element numbers 305a-c, illustrating three listed settings options. Reading in the table from top to bottom, a first setting, launch type (305a), defines an initial entry point for voice application 300 into the communications system. As described above with reference to FIG. 2, step 204, the choices for launch type 305a are inbound or outbound. In an alternative embodiment, a launch type may be defined by a third party and be defined in some other pattern than inbound or outbound.[0064]
Outbound launch designation binds a voice application to one or more addressees (consumers). The addressee may be a single contact or a group of contacts represented by the contact list or distribution list also described with reference to FIG. 2 above (step 202). When the outbound voice application is launched in this case, it is delivered to the addressee designated on a voice application outbound contact field (not shown). All addressees designated receive a copy of the outbound voice application and have equal opportunity to interact (if allowed) with the voice application dialog and the corresponding backend data resources if they are used in the particular application.[0065]
In the case of an inbound voice application designation for launch type 305a, the system instructs the application to assume a ready stand-by mode. The application is launched when the designated voice application consumer actively makes a request to access the voice application. A typical call center IVR system assumes this type of inbound application.[0066]
Launch time setting (305b) is only enabled as an option if the voice application launch type setting 305a is set to outbound. The launch time setting is set to instruct a novel scheduling engine, which may be assumed to be part of the application server function described with reference to FIG. 1B. The scheduling engine controls the parameter of when to deliver the voice application to the designated addressees. The time setting may reflect on-demand, scheduled launch, or any third-party-defined patterns.[0067]
On-demand gives the developer full control over the launch time of the voice application. The on-demand feature also allows any third-party system to issue a trigger event to launch the voice application. It is noted herein that in the case of third-party control the voice application interaction may transcend more than one communications system and/or network.[0068]
Property setting 305c defines essentially how the voice application should behave in general. Possible state options for setting 305c are public, persistent, or sharable. A public state setting indicates that the voice application should be accessible to anyone within the voice portal domain so that all consumers with minimum privilege can access the application. A persistent state setting for property 305c ensures that only one copy of the voice application is ever active regardless of how many consumers are attempting to access the application. An example of such a scenario would be that of a task-allocation voice application. For example, in a task-allocation scenario there are only a limited number of time slots available for a user to access the application. If the task is a request from a pool of contacts, such as perhaps customer-support technicians, to lead a scheduled chat session, then whenever a time slot has been selected, the other technicians can only select the slots that are remaining. Therefore if there is only one copy of the voice application circulating within the pool of technicians, the application captures the technician's response on a first-come first-serve basis.[0069]
A sharable application state setting for property 305c enables the consumer to "see" the responses of other technicians in the dialog at issue, regardless of whether the voice application is persistent or not. Once the voice application shell is created, the producer can then create the first dialog of the voice application as described with reference to FIG. 2, step 207. It is reminded herein that shell 300 is modeled using a remote and preferably a desktop client that will be described in more detail later in this specification.[0070]
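Purely as an illustrative sketch (the XML element and attribute names below are hypothetical and are not part of the disclosed object model), a voice application shell carrying the settings options 305a-c and a set of rule-linked dialogs might be summarized as follows:

<application name='machine repair request' description='Dispatch a repair request to field technicians'>
  <!-- Settings options corresponding to table 305a-c of FIG. 3 -->
  <launch type='outbound' time='on-demand'/>
  <property public='false' persistent='true' sharable='true'/>
  <!-- Dialog objects interlinked by developer-defined business rules -->
  <dialog ref='dialog 1'>
    <rule response='yes' action='send_dialog' target='dialog 2'/>
    <rule response='no' action='send_dialog' target='dialog 1' contact='next contact in list'/>
  </dialog>
</application>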
FIG. 4 is a block diagram illustrating a dialog object model 400 according to an embodiment of the present invention. Dialog object model 400 is analogous to any of dialog objects 301a-n described with reference to FIG. 3 above. Object 400 models a dialog and all of its properties. A properties object illustrated within dialog object 400 and labeled Object Properties (410) contains the dialog type and properties including behavior states and business rules that apply to the dialog.[0071]
For example, every dialog has a route-to property illustrated in the example as Route To property (411). Property 411 maps to and identifies the source of the dialog. Similarly, every dialog has a route-from property illustrated herein as Route From property (412). Route From property 412 maps to and identifies the recipient contact of the dialog or the dialog consumer.[0072]
Every dialog falls under a dialog type illustrated in this example by a property labeled Dialog Type and given the element number 413. Dialog type 413 may include but is not limited to the following types of dialogs (an illustrative dialog object sketch follows the list):[0073]
1. Radio Dialog: A radio dialog allows a voice application consumer to interactively select one of the available options from an option list after hearing the dialog description.[0074]
2. Bulletin Dialog: A bulletin dialog allows a voice application consumer to interact with a bulletin board-like forum where multiple consumers can share voice messages in an asynchronous manner.[0075]
3. Statement Dialog: A statement dialog plays out a statement to a voice application consumer without expecting any responses from the consumer.[0076]
4. Open Entry Dialog: An open entry dialog allows a voice application consumer to record a message of a pre-defined length after hearing the dialog description.[0077]
5. Third Party Dialog: A third party dialog is a modular container structure that allows the developer to create a custom-made dialog type with its own properties and behaviors. An example would be Nuance's SpeechObject™.[0078]
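By way of a purely illustrative sketch (the element and attribute names below are hypothetical and not part of the disclosed model), a dialog object of the radio type carrying the route-to and route-from properties (411, 412) and dialog type property (413) described above might be expressed as:

<dialog name='machine repair request' type='radio'>
  <!-- Route To (411) and Route From (412) properties of the dialog object -->
  <route_to contact='dispatch producer'/>
  <route_from contact='field technician pool'/>
  <description>A machine on the floor requires service. Can you accept this job?</description>
  <!-- Expected responses for the radio dialog type -->
  <option value='yes'/>
  <option value='no'/>
</dialog>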
Each dialog type has one or more associated business rules tagged to it, enabling determination of a next step in response to a perceived state. A rule compares the application consumer response with an operand defined by the application developer using an operational code such as less than, greater than, equal to, or not equal to. In a preferred embodiment of the invention the parameters surrounding a rule are as follows (an illustrative rule sketch is given after the list):[0079]
If user response is equal to the predefined value, then perform one of the following:[0080]
A. Do nothing and terminate the dialog state.[0081]
B. Do a live bridge transfer to the contact specified. Or,[0082]
C. Send another dialog to another contact.[0083]
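As a hedged, purely illustrative sketch of such a rule (the enclosing element and attribute names are hypothetical and not part of the disclosure), the comparison and its follow-up action might be captured as:

<rule dialog='machine repair request'>
  <!-- Compare the consumer response with the developer-defined operand -->
  <condition operator='equal' operand='yes'/>
  <!-- Follow-up action: send another dialog to another contact -->
  <action type='send_dialog' dialog='repair detail dialog' contact='accepting technician'/>
  <!-- Alternative actions would be 'terminate' or 'bridge_transfer' to a specified contact -->
</rule>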
In the case of an outbound voice application, there are likely to be exception-handling business rules associated with perceived states. In a preferred embodiment of the present invention, exception handling rules are encapsulated into three different events:[0084]
1. An application consumer designated to receive the voice application rejects a request for interacting with the voice application.[0085]
2. An application consumer has a busy connection at the time of launch of the voice application, for example, a telephone busy signal. And,[0086]
3. An application consumer's connection is answered by or is redirected to a non-human device, for example, a telephone answering machine.[0087]
For each of the events above, any one of three follow-up actions is possible according to perceived state (an illustrative sketch follows the list):[0088]
1. Do nothing and terminate the dialog state.[0089]
2. Redial the number.[0090]
3. Send another dialog to another contact.[0091]
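Again purely as an illustration (element and attribute names are hypothetical), the call exception handling rules of an outbound dialog might be summarized in a structure such as:

<exception_rules dialog='machine repair request'>
  <!-- Consumer rejects the request to interact -->
  <on event='caller_reject' action='send_dialog' dialog='machine repair request' contact='next contact in list'/>
  <!-- Busy connection at launch time -->
  <on event='line_busy' action='redial'/>
  <!-- Call answered by or redirected to a non-human device -->
  <on event='voice_mail' action='terminate'/>
</exception_rules>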
FIG. 5 is a process flow diagram illustrating steps for voice dialog creation for a VXML-enabled voice application according to an embodiment of the present invention. All dialogs can be reused for subsequent dialog routing. There is, as previously described, a set of business rules for every dialog and contact pair. A dialog can be active and able to transition from one dialog state to another only when it is rule-enabled.[0092]
At step 501 a developer populates a dialog description field with a dialog description. A dialog description may also contain reference to XML tags as will be described further below. At step 502, parameters of the dialog type are entered based on the assigned type of dialog. Examples of the available parameters were described with reference to FIG. 4 above.[0093]
At step 503 the developer configures the applicable business rules for the dialog type, covering follow-up routines as well. In one embodiment rules configuration at step 503 resolves to step 505 for determining follow-up routines based on the applied rules. For example, at step 505 the developer may select one of three types of transfers: a live transfer as illustrated by step 506; transfer to a next dialog for creation as illustrated by step 507; or dialog completion as illustrated by step 508.[0094]
If the developer does not branch off into configuring sub-routines 506, 507, or 508 from step 505, but rather continues from step 503 to step 504 wherein inbound or outbound designation for the dialog is system assigned, then the process must branch from step 504 to either step 508 or 509, depending on whether the dialog is inbound or outbound. If at step 504 the dialog is inbound, then at step 508 the dialog is completed. If the assignment at step 504 is outbound, then at step 509 the developer configures call exception business rules.[0095]
At step 510, the developer configures at least one follow-up action for system handling of exceptions. If no follow-up actions are required to be specified at step 510, then the process resolves to step 508 for dialog completion. If an action or actions are configured at step 510, then at step 511 the action or actions are executed, such as a system re-dial, which is the action illustrated for step 511.[0096]
In a preferred embodiment, once the voice application has been created, it can be deployed and accessed through the telephone. The method of access, of course, depends on the assignment configured at step 504. For example, if the application is inbound, the application consumer accesses a voice portal to access the application. As described further above, a voice portal is a voice interface for accessing a selected number of functions of the voice application server described with reference to FIG. 1B above. A voice portal may be a connection-oriented-switched-telephony (COST) enabled portal or a data-network-telephony (DNT) enabled portal. In the case of an outbound designation at step 504, the application consumer receives the voice application through an incoming call to the consumer originated from the voice application server. In a preferred embodiment, the outbound call can be either COST based or DNT based depending on the communications environment supported.[0097]
FIG. 6 is a block diagram illustrating a dialog transition flow after initial connection with a consumer according to an embodiment of the present invention. Some of the elements illustrated in this example were previously introduced with respect to the example of FIG. 1B above and therefore shall retain their original element numbers. In this example, an application consumer is logically illustrated as Application Consumer 600 that is actively engaged in interaction with a dialog 601 hosted by telephony server 130. Server 130 is, as previously described, a VXML-compliant telephony server and is so labeled.[0098]
Application server 110 is also actively engaged in the interaction sequence and has the capability to provide dynamic content to consumer 600. As application consumer 600 begins to interact with the voice application represented herein by dialog 601 within telephony server 130, voice application server 110 monitors the situation. In actual practice, each dialog processed and sent to server 130 for delivery to or access by consumer 600 is an atomic unit of the particular voice application being deployed and executed. Therefore dialog 601 may logically represent more than one single dialog.[0099]
In this example, assuming more than one dialog, dialog 601 is responsible during interaction for acquiring a response from consumer 600. Arrows labeled Send and Respond represent the described interaction. When consumer 600 responds to dialog content, the response is sent back along the same original path to VXML rendering engine 111, which interprets the response and forwards the interpreted version to a provided dialog controller 604. Controller 604 is part of application logic 112 in server 110 described with reference to FIG. 1B. Dialog controller 604 is a module that has the ability to perform table lookups, data retrieve and data write functions based on established rules and configured response parameters.[0100]
When dialog controller 604 receives a dialog response, it stores the response corresponding to the dialog at issue (601) to a provided data source 602 for data mining operations and workflow monitoring. Controller 604 then issues a request to a provided rules engine 603 to look up the business rule or rules that correspond to the stored response. Once the correct business rule has been located for the response, the dialog controller starts interpretation. If the business rule accessed requires reference to a third-party data source (not shown), controller 604 makes the necessary data fetch from the source. Any data returned by controller 604 is integrated into the dialog context and passed onward to VXML rendering engine 111 for dialog page generation of a next dialog 601. The process repeats until dialog 601 terminates.[0101]
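As a hedged illustration of the kind of record rules engine 603 might return to controller 604 for a stored response (the enclosing rule, fetch, and next elements are hypothetical; only the inner resource tags follow the convention described later in this specification), consider:

<rule dialog='select stock dialog' response='Oracle'>
  <!-- Third-party data fetch required by the business rule -->
  <fetch>
    <resource type='ADAPTER' name='get_stock_quote'>
      <param>
        <resource type='DIALOG' name='select stock dialog'/>
      </param>
    </resource>
  </fetch>
  <!-- Transition: the fetched quote is embedded in the next dialog returned to the current consumer -->
  <next dialog='quote statement dialog' route='current consumer'/>
</rule>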
In one embodiment, the business rule accessed by controller 604 as a result of a received response from consumer 600 carries a dialog transition state other than back to the current application consumer. In this case controller 604 spawns an outbound call from application server 110 to deliver the next or "generated dialog" to the designated target application consumer. At the same time, the current consumer has his/her dialog state completed as described with reference to FIG. 5, step 508, according to predefined logic specified in the business rule.[0102]
It will be apparent to one with skill in the art that a dialog can contain dynamic content by enabling controller 604 to have access to data source 602 according to rules served by rule engine 603. In most embodiments there are generally two types of dynamic content. Both types are, in preferred embodiments, structured in the form of XML and are embedded directly into the next generated dialog page. The first of the two types of dynamic content is classified as non-recurring. Non-recurring content makes a relative reference to a non-recurring resource label in a resource adapter registry within a resource adapter analogous to adapter 113 of voice application server 110 described with reference to FIG. 1B.[0103]
In the above case, when dialog controller 604 interprets the dialog, it first scans for any resource label. If a match is found, it looks up the resource adapter registry and invokes the corresponding resource API to fetch the required data into the new dialog context. Once the raw data is returned from the third-party data source, it passes the raw data to a corresponding resource filter for further processing. When completed in terms of processing by the filter, the dialog resource label or tag is replaced with the filtered data and is integrated transparently into the new dialog.[0104]
The second type of dynamic content is recurring. Recurring content usually returns more than one set of a name and value pair. An example would be a list of stocks in an application consumer's stock portfolio. For example, a dialog that enables consumer 600 to parrot a specific stock and have the subsequent quote returned through another dialog state is made to use recurring dynamic content to achieve the desired result. Recurring content makes a relative reference to a recurring resource label in the resource adapter registry of voice application server 110. When controller 604 interprets the dialog, it handles the resource in an identical manner to handling of non-recurring content. However, instead of simply returning the filtered data back to the dialog context, it loops through the data list and configures each listed item as a grammar-enabled keyword. In so doing, consumer 600 can parrot one of the items (separate stocks) in the list played in the first dialog and have the response captured and processed for return in the next dialog state. The stock-quote example presented below illustrates possible dialog/response interactions from the viewpoint of consumer 600.[0105]
Voice Application: “Good morning Leo, what stock quote do you want?”[0106]
Application Consumer: “Oracle”[0107]
Voice Application: “Oracle is at seventeen dollars.”[0108]
Voice Application: “Good morning Leo, what stock quote do you want?”[0109]
This particular example consists of two dialogs.[0110]
The first dialog plays out the statement "Good morning Leo, what stock quote do you want?" The dialog is followed by a waiting state that listens for keywords such as Oracle, Sun, Microsoft, etc. The statement consists of two dynamic non-recurring resource labels. The first one is the time of day: good morning, good afternoon, or good evening. The second dynamic content is the name of the application consumer. In this case, the name of the consumer is internal to the voice application server, thus the type of the resource label is SYSTEM. In the actual dialog description field, it may look something like this:[0111]
<resource type=‘ADAPTER’ name=‘time greeting’><resource type=‘SYSTEM’ name=‘target contact’/>, what stock quote do you want?[0112]
Because the dialog is expecting the consumer to say a stock out of his/her existing portfolio, the dialog type is radio dialog, and the expected response property of the radio dialog is[0113]
<resource type=‘ADAPTER’ name=‘stock_list’>[0114]
<param>[0115]
<resource type=‘SYSTEM’ name=‘target_contact_id’/>[0116]
</param>[0117]
</resource>[0118]
This XML resource label tells dialog controller 604 to look for a resource label named stock_list and to invoke the corresponding API with target_contact_id as the parameter. Upon completion of the data fetching, the list of stocks is integrated into the dialog as part of the grammars. Whatever the user responds in terms of stock identification is matched against the grammars at issue (stocks in portfolio) and the grammar return value is assigned to the dialog response, which can then be forwarded to the next dialog as a resource of DIALOG type.[0119]
The producer can make reference to any dialog return values in any subsequent dialog by using <resource type=‘DIALOG’ name=‘dialog_name’>. This rule enables the producer to play out the options the application consumer selected previously in any follow-up dialogs.[0120]
The second dialog illustrated above plays out the quote of the stock selected from the first dialog, then returns the flow back to the first dialog. Because no extra branching logic is involved in this dialog, the dialog type in this case is a statement dialog. The dialog's follow-up action is simply to forward the flow back to the first dialog. In such a case, the dialog statement is: <resource type=‘DIALOG’ name=‘select stock dialog’/>[0121]
<resource type=‘ADAPTER’ name=‘get_stock_quote’>[0122]
<param>[0123]
<resource type=‘DIALOG’ name=‘select stock dialog’/>[0124]
</param>[0125]
</resource>[0126]
Besides making reference to ADAPTER, DIALOG and SYSTEM types, the dialog can also take in other resource types such as SOUND and SCRIPT. SOUND can be used to impersonate the dialog description by inserting a sound clip into the dialog description. For example, to play a sound after the stock quote, the producer inserts <resource type='SOUND' name='beep'/> right after the ADAPTER resource tag. The producer can add a custom-made VXML script into the dialog description by using <resource type='SCRIPT' name='confirm'/> so that in the preferred embodiment, any VXML can be integrated into the dialog context transparently with maximum flexibility and expandability.[0127]
It will be apparent to one with skill in the art that while the examples cited herein use VXML and XML as the mark-up languages and tags, other suitable markup languages can be utilized in place of or integrated with the mentioned conventions without departing from the spirit and scope of the invention. It will also be apparent to the skilled artisan that while the initial description of the invention is made in terms of a voice application server having interface to a telephony server using generally HTTP requests and responses, the present invention can be practiced in any system that is capable of handling well-defined requests and responses across any distributed network.[0128]
FIGS. 7-15 illustrate various displayed Browser frames of a developer platform interface analogous to CL 141 of station 140 of FIG. 1B. Description of the following interface frames and frame contents assumes existence of a desktop computer host analogous to station 140 of FIG. 1B wherein interaction is enabled in HTTP request/response format as would be the case of developing over the Internet network, for example. However, the following description should not limit the method and apparatus of the invention in any way as differing protocols, networks, interface designs and scope of operation can vary.[0129]
FIG. 7 is a plan view of a developer's frame 700 containing a developer's login screen 710 according to an embodiment of the present invention. Frame 700 is presented to a developer in the form of a Web browser container according to one embodiment of the invention. Commercial Web browsers are well known and any suitable Web browser will support the platform. Frame 700 has all of the traditional Web options associated with most Web browser frames including back, forward, Go, File, Edit, View, and so on. A navigation tool bar is visible in this example. Screen 710 is a login page. The developer may, in one embodiment, have a developer's account. In another case, more than one developer may share a single account. There are many possibilities.[0130]
Screen 710 has a field for inserting a login ID and a field for inserting a login personal identification number (PIN). Once login parameters are entered the developer submits the data by clicking on a button labeled Login. Screen 710 may be adapted for display on a desktop computer or any one of a number of other network-capable devices following specified formats for display used on those particular devices.[0131]
FIG. 8 is a plan view of a developer's frame 800 containing a screen shot of a home page of the developer's platform interface of FIG. 7. Frame 800 contains a sectioned screen comprising a welcome section 801, a product identification section 802 and a navigation section 803 combined to fill the total screen or display area. A commercial name for a voice application developer's platform that is coined by the inventor is the name Fonelet. Navigation section 803 is provided to display on the "home page" and on subsequent frames of the software tool.[0132]
Navigation section 803 contains, reading from top to bottom, a plurality of useful links, starting with a link to Home followed by a link to an Address Book. A link for creating a new Fonelet (voice application) is labeled Create New. A link to "My" Fonelets is provided as well as a link to "Options". A standard Help link is illustrated along with a link to Logout. An additional "Options Menu" is the last illustrated link in section 803. Section 803 may have additional links that are visible by scrolling down with the provided scroll bar traditional to the type of display of this example.[0133]
FIG. 9 is a plan view of a developer's frame 900 containing a screen shot of an address book 911 accessible through interaction with the option Address in section 803 of the previous frame of FIG. 8. Screen 911 has an interactive option for listing individual contacts and for listing contact lists. A contact list is a list of voice application consumers and a single contact represents one consumer in this example. However, in other embodiments a single contact may mean more than one entity. Navigation screen 803 is displayed on the left of screen 911. In this example, contacts are listed by First Name followed by Last Name, followed by a telephone number and an e-mail address. Other contact parameters may also be included or excluded without departing from the spirit and scope of the invention. For example the Web site of a contact may be listed and may also be the interface for receiving a voice application. To the left of the listed contacts are interactive selection boxes used for selection and configuration purposes. Interactive options are displayed in the form of Web buttons and adapted to enable a developer to add or delete contacts.[0134]
FIG. 10 is a plan view of a developer's frame 1000 displaying a screen 1001 for creating a new voice application. Screen 1001 initiates creation of a new voice application termed a Fonelet by the inventor. A name field 1002 is provided in screen 1001 for inputting a name for the application. A description field 1003 is provided for the purpose of entering the application's description. A property section 1004 is illustrated and adapted to enable a developer to select from available options listed as Public, Persistent, and Shareable by clicking on the appropriate check boxes.[0135]
A Dialog Flow Setup section is provided and contains a dialog type section field 1005 and a subsequent field for selecting a contact or contact group 1006. After the required information is correctly populated into the appropriate fields, a developer may "create" the dialog by clicking on an interactive option 1007 labeled Create.[0136]
FIG. 11 is a plan view of a developer's frame 1100 illustrating screen 1001 of FIG. 10 showing further options as a result of scrolling down. A calling schedule configuration section 1101 is illustrated and provides the interactive options of On Demand or Scheduled. As was previously described, selecting On Demand enables application deployment at the will of the developer while selecting Scheduled initiates configuration for a scheduled deployment according to time/date parameters. A grouping of entry fields 1102 is provided for configuring Time Zone and Month of launch. A subsequent grouping of entry fields 1103 is provided for configuring the Day of Week and the Day of Month for the scheduled launch. A subsequent grouping of entry fields 1104 is provided for configuring the hour and minute of the scheduled launch. It is noted herein that the options enable a repetitive launch of the same application. Once the developer finishes specifying the voice application shell, he or she can click a Create Dialog button labeled Create to spawn an overlying browser window for dialog creation.[0137]
FIG. 12 is a screen shot of a dialog configuration window 1200 illustrating a dialog configuration page according to an embodiment of the invention. In this window a developer configures the first dialog that the voice application or Fonelet will link to. A dialog identification section 1201 is provided for the purpose of identifying and describing the dialog to be created. A text entry field for entering a dialog name and a text entry field for entering a dialog description are provided. Within the dialog description field, an XML resource tag (not shown) is inserted which, for example, may refer to a resource label machine code registered with a resource adapter within the application server analogous to adapter 113 and application server 110 described with reference to FIG. 1B.[0138]
A section 1202 is provided within screen 1200 and adapted to enable a developer to configure expected responses. In this case the type of dialog is a Radio Dialog. Section 1202 serves as the business rule logic control for multiple-choice-like dialogs. Section 1202 contains a selection option for a Response of Yes or No. It is noted herein that there may be more and different expected responses in addition to a simple yes or no response.[0139]
An adjacent section is provided within section 1202 for configuring any Follow-Up Action to occur as the result of an actual response to the dialog. For example, an option of selecting No Action is provided for each expected response of Yes and No. In the case of a follow-up action, an option for Connect is provided for each expected response. Adjacent to each illustrated Connect option, a Select field is provided for selecting a follow-up action, which may include fetching data.[0140]
A Send option is provided for sending the selected follow-up action, including any embedded data. A follow-up action may be any type of configured response, such as sending a new radio dialog, sending a machine repair request, and so on. A Send To option and an associated Select option are provided for identifying a recipient of a follow-up action and enabling automated send of the action to the recipient. For example, if a first dialog is a request for machine repair service sent to a plurality of internal repair technicians, then a follow-up might be to send the same dialog to the next available contact in the event the first contact refused to accept the job or was not available at the time of deployment.[0141]
In the above case, the dialog may propagate from contact to contact down a list until one of the contacts is available and chooses to interact with the dialog by accepting the job. A follow-up in this case may be to send a new dialog to the accepting contact detailing the parameters of which machine to repair, including the diagnostic data of the problem and when the repair should take place. In this example, an option for showing details is provided for developer review purposes. Interactive options are also provided for creating new or additional responses and for deleting existing responses from the system. It is noted herein that once a dialog and its responses are created, they are reusable over the whole of the voice application and in any specified sequence within a voice application.[0142]
A section 1203 is provided within screen 1200 and adapted for handling Route-To Connection Exceptions. This section enables a developer to configure what to do in case of the possible connection states experienced during application deployment. For example, for a Caller Reject, Line Busy, or connection to Voice Mail, there are options for No Action and for Redial illustrated. It is noted herein that there may be more Exceptions as well as Follow-Up action types than are illustrated in this example without departing from the spirit and scope of the present invention.[0143]
A Send option is provided for each type of exception for re-sending the same or any other dialog that may be selected from an adjacent drop-down menu. For example, if the first dialog is a request for repair services and all of the initial contacts are busy, the dialog may be sent back around to all of the contacts until one becomes available, by first moving to a next contact for send after each busy signal and then beginning at the top of the list again on re-dial. In this case John Doe represents a next recipient after a previous contact rejects the dialog, is busy, or re-directs to voice mail because of unavailability. Section 1203 is only enabled when the voice application is set to outbound. Once the first dialog is created and enabled by the developer, a second dialog may be created if desired by clicking on one of the available buttons labeled Detail. Also provided are interactive buttons for Save Dialog, Save and Close, and Undo Changes.[0144]
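For illustration, the business rules gathered in sections 1202 and 1203 could be captured in a dialog descriptor along the following lines. Element names and values are hypothetical; only the categories of information shown (expected responses, follow-up actions, and connection exceptions) come from the description above.

    <!-- Hypothetical dialog descriptor; element names are illustrative only -->
    <dialog name="Machine Repair Request"
            description="Machine number 008 is broken, are you available to fix it?">
      <response keyword="yes">
        <follow-up action="send" dialog="Repair Details" send-to="accepting contact"/>
      </response>
      <response keyword="no">
        <follow-up action="connect" send-to="next available contact"/>
      </response>
      <!-- Route-To Connection Exceptions; enabled only for outbound applications -->
      <exceptions>
        <on event="caller-reject" action="redial"/>
        <on event="line-busy" action="redial"/>
        <on event="voice-mail" action="redial"/>
      </exceptions>
    </dialog>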
FIG. 13 is a screen shot 1300 of dialog design panel 1200 of FIG. 12 illustrating progression of dialog state to a subsequent contact. The dialog state configured in the example of FIG. 12 is now transmitted from a contact listed in Route From to a contact listed in Route To in section 1301, which is analogous to section 1201 of FIG. 12. In this case, the contacts involved are John Doe and Jane Doe, and the dialog name and description are the same because the dialog is being re-used. The developer does not have to re-enter any of the dialog context. However, because each dialog has a unique relationship with a recipient, the developer must configure the corresponding business rules.[0145]
Sections 1302 and 1303 of this example are analogous to sections 1202 and 1203 of the previous example of FIG. 12. In this case, if John Doe says no to the request for machine repair, then the system carries out a bridge transfer to Jane Doe. In the case of exceptions, shown in the Route-To Connection Exceptions region 1303, all the events are directed to a redialing routine. In addition to inserting keywords such as "Yes" or "No" in the response field 1302, the developer can create a custom thesaurus by clicking on a provided thesaurus icon (not shown in this example). All the created vocabulary in a thesaurus can later be re-used throughout any voice applications the developer creates.[0146]
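For illustration, a bridge transfer of the kind described above could be rendered to the VXML-compliant telephony server as a fragment similar to the following sketch; the destination number and prompt wording are placeholders rather than part of the disclosure.

    <?xml version="1.0"?>
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <form id="handoff">
        <!-- Bridge the current caller to the next contact; the number is a placeholder -->
        <transfer name="handoff_result" dest="tel:+15550100" bridge="true"
                  connecttimeout="30s">
          <prompt>Please wait while fonelet transfers you.</prompt>
        </transfer>
        <block>
          <!-- handoff_result reports the outcome, e.g. busy, noanswer, far_end_disconnect -->
          <prompt>Thanks for using fonelet. Goodbye!</prompt>
        </block>
      </form>
    </vxml>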
FIG. 14 is a screen shot of a thesaurus configuration window 1400 activated from the example of FIG. 13 according to a preferred embodiment. Thesaurus window 1400 has a section 1401 containing a field for labeling a vocabulary word and an associated field for listing synonyms for the labeled word. In this example, the word "no" is associated with the probable responses "no," "nope," and the phrase "I can not make it." In this way voice recognition regimens can be trained in a personalized fashion to accommodate variations in responses that carry the same meaning.[0147]
A vocabulary section 1402 is provided and adapted to list all of the created vocabulary words for a voice application, along with a selection mechanism (a selection bar in this case) for selecting one of the listed words. An option for creating a new word and synonym pair is also provided within section 1402. A control panel section 1403 is provided within window 1400 and adapted with the controls Select From Thesaurus, Update Thesaurus, Delete From Thesaurus, and Exit Thesaurus.[0148]
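For illustration only, such a vocabulary entry might be compiled for the speech recognizer as a small grammar in which each synonym resolves to the single keyword "no." The grammar form shown below (W3C SRGS) is an assumption; the description above specifies only the word and synonym pairing.

    <!-- Illustrative grammar; each utterance below is treated as the keyword "no" -->
    <grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
             root="answer" mode="voice">
      <rule id="answer" scope="public">
        <one-of>
          <item>no</item>
          <item>nope</item>
          <item>I can not make it</item>
        </one-of>
        <!-- Semantic result returned to the application -->
        <tag>no</tag>
      </rule>
    </grammar>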
FIG. 15 is a plan view of a developer's frame 1500 illustrating a screen 1502 for managing created modules according to an embodiment of the present invention.[0149]
After closing all dialog windows, frame 1500 displays screen or page 1502 for module management options. Menu section 803 is again visible. Screen 1502 displays as a result of clicking on the option "My" or My Fonelet in section 803. Screen 1502 lists all voice applications that are already created and usable. In the list, each voice application has a check box adjacent thereto, which can be selected to change the state of the particular application. A column labeled Status is provided within screen 1502 and located adjacent to the list of applications already created.[0150]
The Status column lists the changeable state of each voice application. Available status options include, but are not limited to, the listed states of Inactive, Activated, and Inbound. A column labeled Direct Access ID is provided adjacent to the Status column and is adapted to enable the developer to access a voice application directly through a voice interface in a PSTN network or, in one embodiment, from a DNT voice interface. In a PSTN embodiment, direct access ID capability serves as an extension of a central phone number. A next column labeled Action is provided adjacent to the Direct Access ID column and is adapted to enable a developer to select and apply a specific action regarding the state of a voice application.[0151]
For example, assume that a developer has just finished the voice application identified as Field Support Center (FSC) listed at the top of the application identification list. Currently, the listed state of FSC is Inactive. The developer now activates the associated Action drop down menu and selects Activate to launch the application FSC on demand. In the case of a scheduled launch, the voice application is activated automatically according to the settings defined in the voice application shell.[0152]
As soon as the Activate command has been issued, the on-demand request is queued for dispatching through the system's outbound application server. For example, John Doe then receives a call originating from the voice application server (110) that asks if John wants to take the call. If John responds “Yes,” the voice application is executed. The actual call flow follows:[0153]
System: “Hello John, you received a fonelet from Jim Doe, would you like to take this call?”[0154]
John: “Yes.”[0155]
System: “Machine number 008 is broken, are you available to fix it?”[0156]
John: “No.”[0157]
System: “Thanks for using fonelet. Goodbye!”[0158]
System: Terminate the connection with John, record the call flow to the data source, and spawn a new call to Jane Doe.[0159]
System: “Hello Jane, you received a fonelet from Jim Doe, would you like to take this call?”[0160]
Jane: “Yes.”[0161]
System: “Machine number 008 is broken, are you available to fix it?”[0162]
Jane: “I cannot make it.”[0163]
System: “Please wait while fonelet transfers you to Jeff Doe.”[0164]
System: Carry out the bridge transfer between Jane Doe and Jeff Doe. When the conversation is completed, terminate the connection with Jeff and record the call flow to the data source.[0165]
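For illustration, the repair-request exchange above might be rendered to the VXML interpreter roughly as follows. The prompt text is taken from the example call flow; the field name and the follow-up page reference are placeholders, not part of the disclosure.

    <?xml version="1.0"?>
    <vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
      <form id="repair_request">
        <field name="available" type="boolean">
          <prompt>Machine number 008 is broken, are you available to fix it?</prompt>
          <filled>
            <if cond="available">
              <!-- Contact accepted; hand the result back to the application server -->
              <submit next="repair_details.vxml" namelist="available"/>
            <else/>
              <prompt>Thanks for using fonelet. Goodbye!</prompt>
              <!-- The server records the outcome and spawns a call to the next contact -->
              <exit/>
            </if>
          </filled>
        </field>
      </form>
    </vxml>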
The default textual content of the voice application is generated by the text-to-speech engine hosted on the telephony or DNT server. However, the voice application producer can access the voice portal through the PSTN or DNT server and record his or her voice over any existing prompts in the voice application.[0166]
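In a rendered VXML page, one way to honor such a recording is with an audio element whose inline text remains the text-to-speech fallback if the recorded file is unavailable; the file name below is a placeholder.

    <prompt>
      <!-- Recorded prompt; the inline text is spoken by text-to-speech if the file is missing -->
      <audio src="prompts/greeting_jim_doe.wav">
        Hello John, you received a fonelet from Jim Doe,
        would you like to take this call?
      </audio>
    </prompt>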
It will be apparent to one with skill in the art that the method and apparatus of the present invention may be practiced in conjunction with a CTI-enabled telephony environment wherein developer access for application development is enabled through a client application running on a computerized station connected to a data network also having connectivity to the server spawning the application and telephony components. The method and apparatus of the invention may also be practiced in a system that is DNT-based wherein the telephony server and application server are both connected to a data network such as the well-known Internet network. There are applications for all mixes of communications environments, including any suitable multi-tier system enabled for VXML and/or other applicable markup languages that may serve a similar purpose.[0167]
It will also be apparent to one with skill in the art that modeling voice applications, including individual dialogs and responses, enables any developer to create a limitless variety of voice applications quickly by reusing existing objects in modular fashion, thereby enabling a wide range of useful applications from an existing store of objects.[0168]
The method and apparatus of the invention should be afforded the broadest interpretation under examination in view of the many possible embodiments and uses. The spirit and scope of the invention is limited only by the claims that follow.[0169]