CN107276659B

Info

Publication number: CN107276659B
Application number: CN201710439142.1A
Authority: CN
Inventors: 张国滔; 郑勇; 魏科文; 卫特超; 郑培艺
Original assignee: Shenzhen Water World Co Ltd
Current assignee: Shenzhen Waterward Information Co Ltd
Priority date: 2017-06-12
Filing date: 2017-06-12
Publication date: 2020-10-09
Anticipated expiration: 2037-06-12
Also published as: CN107276659A; WO2018227854A1

Abstract

The invention discloses a voice talkback method, a device and a mobile terminal, wherein the method comprises the following steps: keeping long connection with a server through a satellite mobile communication network; compressing the collected voice information by using a low-rate voice coding algorithm and generating a first voice file; and sending the first voice file to the server so that the server sends the first voice file to a receiving end. The collected voice information is compressed by using the low-rate voice coding algorithm, so that the code rate of the voice information is greatly reduced, the capacity of a sent voice file is reduced, the bandwidth resource of a satellite mobile communication network is saved, and then the low-delay real-time talkback is realized.

Description

Voice talkback method and device and mobile terminal

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a voice intercom method, apparatus, and mobile terminal.

Background

Satellite mobile communication is mobile communication over a regional or global area that uses geostationary orbit satellites or medium or low orbit satellites as relay stations. A satellite mobile communication system generally comprises three parts: communication satellites, consisting of one or more satellites; ground stations, including a system control center and a plurality of gateway stations (i.e., transit stations that connect the public switched telephone network with mobile subscribers); and mobile user communication terminals, including vehicle-mounted, shipborne, airborne, and handheld terminals. Users are free to move within the coverage area of the satellite beam, and the satellite relays signals so that they can communicate with users of terrestrial communication systems, users of private systems, or other mobile users.

Compared with other communication modes, the satellite mobile communication has the advantages of large coverage area, long communication distance, flexible communication, stable and reliable line and the like. Therefore, satellite mobile communication has become an important development direction of communication services.

With the rapid development of satellite mobile communication technology, more and more mobile terminals support satellite mobile communication. Like the public land mobile communication network, the satellite mobile communication network supports internet access over TCP/IP links, so a mobile terminal can perform networked communication through the satellite mobile communication network. However, because the bandwidth of the satellite mobile communication network is narrow, real-time voice intercom cannot be realized with an instant messaging application, which degrades the user experience.

Disclosure of Invention

The invention mainly aims to provide a voice talkback method, a voice talkback device and a mobile terminal, and aims to solve the technical problem that the mobile terminal based on satellite mobile communication cannot utilize instant messaging application to realize real-time voice talkback.

To achieve the above object, an embodiment of the present invention provides a voice intercom method, including the following steps:

keeping long connection with a server through a satellite mobile communication network;

compressing the collected voice information by using a low-rate voice coding algorithm and generating a first voice file;

and sending the first voice file to the server so that the server sends the first voice file to an opposite terminal.

Optionally, after the step of establishing a connection with a server through a satellite mobile communication network and maintaining a long connection, the method further includes:

acquiring a second voice file sent by an opposite terminal from the server;

and outputting the second voice file.

Optionally, the step of obtaining, from the server, the second voice file sent by the peer end includes:

receiving a download address of the second voice file sent by the server;

and downloading the second voice file according to the download address.

Optionally, the step of outputting the second voice file includes:

judging whether the second voice file is a low-rate voice coding file or not;

and when the file is a low-rate voice coding file, decoding the second voice file by using a low-rate voice decoding algorithm and then playing the second voice file.

Optionally, the low-rate speech coding algorithm is an adaptive multi-rate (AMR) algorithm, a mixed excitation linear prediction coding (MELP) algorithm, a code excited linear prediction coding (CELP) algorithm, a sinusoidal transform coding (STC) algorithm, a time-frequency domain interpolation coding (TFI) algorithm, a pitch synchronous excitation linear prediction coding (PSELP) algorithm, a multi-band excitation coding (MBE) algorithm, or a waveform interpolation coding (WI) algorithm.

Optionally, when the low-rate speech compression coding algorithm is an AMR algorithm, the step of compressing the collected speech information by using the low-rate speech compression coding algorithm and generating the speech file includes:

and carrying out compression coding on the acquired voice information by utilizing the AMR algorithm so as to reduce the code rate of the voice information to a preset value and generate a voice file in an AMR format.

Optionally, the preset value is 6.6 kb/s.

Optionally, when the low-rate speech compression coding algorithm is a MELP algorithm, the step of compressing the collected speech information by using the low-rate speech compression coding algorithm and generating the speech file includes:

and carrying out compression coding on the collected voice information by using the MELP algorithm so as to reduce the code rate of the voice information to 2.4kb/s and generate a voice file in the MELP format.

Optionally, the step of sending the voice file to the server includes:

and transmitting the voice file to the server in packets by using the TCP/IP (Transmission Control Protocol/Internet Protocol) protocol.

The embodiment of the invention also provides a voice intercom device, which comprises:

the connection module is used for keeping long connection with the server through a satellite mobile communication network;

the processing module is used for compressing the collected voice information by using a low-rate voice coding algorithm and generating a first voice file;

and the sending module is used for sending the first voice file to the server so that the server sends the first voice file to an opposite terminal.

Optionally, the apparatus further comprises:

the acquisition module is used for acquiring a second voice file sent by the opposite terminal from the server;

and the output module is used for outputting the second voice file.

Optionally, the obtaining module includes:

the receiving unit is used for receiving the download address of the second voice file sent by the server;

and the downloading unit is used for downloading the second voice file according to the downloading address.

Optionally, the output module includes:

the judging unit is used for judging whether the second voice file is a low-rate voice coding file or not;

and the playing unit is used for decoding the second voice file by using a low-rate voice decoding algorithm and then playing the second voice file when the second voice file is a low-rate voice coding file.

Optionally, when the low-rate speech compression coding algorithm is an AMR algorithm, the processing module is configured to: carry out compression coding on the collected voice information by using the AMR algorithm so as to reduce the code rate of the voice information to a preset value and generate a voice file in the AMR format.

Optionally, when the low-rate speech compression coding algorithm is a MELP algorithm, the processing module is configured to: carry out compression coding on the collected voice information by using the MELP algorithm so as to reduce the code rate of the voice information to 2.4 kb/s and generate a voice file in the MELP format.

Optionally, the sending module is configured to: transmit the voice file to the server in packets by using the TCP/IP (Transmission Control Protocol/Internet Protocol) protocol.

The invention also proposes a mobile terminal comprising a memory, a processor and at least one application stored in said memory and configured to be executed by said processor, characterized in that said application is configured for executing the aforementioned voice intercom method.

According to the voice talkback method provided by the embodiment of the invention, a long connection with the server is maintained through the satellite mobile communication network, and the collected voice information is compressed by using the low-rate voice coding algorithm. This greatly reduces the code rate of the voice information and the size of the transmitted voice file, saves the bandwidth resources of the satellite mobile communication network, and thereby realizes low-delay real-time talkback. It solves the prior-art technical problem that a mobile terminal based on satellite mobile communication cannot use an instant messaging application to realize real-time voice talkback, and it improves the user experience.

Drawings

FIG. 1 is a flow chart of a first embodiment of the voice intercom method of the present invention;

FIG. 2 is a flow chart of a second embodiment of the voice intercom method of the present invention;

FIG. 3 is a block diagram of a first embodiment of the voice intercom device of the present invention;

FIG. 4 is a block diagram of a second embodiment of the voice intercom device of the present invention;

FIG. 5 is a block diagram of an acquisition module of the voice intercom device of FIG. 4;

FIG. 6 is a block diagram of an output module of the voice intercom device of FIG. 4.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.

As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.

It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

As will be appreciated by those skilled in the art, "terminal" as used herein includes both devices that have only a wireless signal receiver without transmit capability and devices that have receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device having a single-line display or a multi-line display, or a cellular or other communication device without a multi-line display; a PCS (Personal Communications Service) device, which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "terminal" or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location(s) on earth and/or in space. As used herein, a "terminal device" may also be a communication terminal, a web terminal, or a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a mobile phone with a music/video playing function, or a smart TV, a set-top box, etc.

As will be understood by those skilled in the art, a server as used herein includes, but is not limited to, a computer, a network host, a single network server, a collection of network servers, or a cloud of servers. Here, the cloud is composed of a large number of computers or network servers based on cloud computing, a kind of distributed computing in which a group of loosely coupled computers forms a super virtual computer. In the embodiment of the present invention, the server, the terminal device and the WNS server may communicate with each other through any communication method, including but not limited to mobile communication based on 3GPP, LTE and WiMAX, computer network communication based on the TCP/IP and UDP protocols, and short-range wireless transmission based on Bluetooth and infrared transmission standards.

Referring to fig. 1, a first embodiment of the speech intercom method of the present invention is presented, said method comprising the steps of:

S11, keeping a long connection with the server through the satellite mobile communication network.

In step S11, after the mobile terminal establishes a connection with the server through the satellite mobile communication network, the mobile terminal maintains a long connection with the server at a certain heartbeat cycle; that is, the mobile terminal sends a heartbeat packet to the server once every heartbeat cycle so as to maintain the connection path between the two, thereby enabling low-delay real-time transmission of subsequent voice packets.

Alternatively, the mobile terminal may maintain a long connection with the server at a preset heartbeat cycle.
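The fixed-cycle heartbeat described above can be sketched as follows. This is a minimal illustration and not the patented implementation: the class name `HeartbeatKeeper`, the payload `HEARTBEAT`, and the injected `send` callable are all assumptions introduced here, since the text does not define the heartbeat packet format or the terminal's socket API.

```python
import threading
from typing import Callable

# Hypothetical heartbeat payload; the source text does not define one.
HEARTBEAT = b"\x00HB"


class HeartbeatKeeper:
    """Calls `send(HEARTBEAT)` once per heartbeat cycle until stopped."""

    def __init__(self, send: Callable[[bytes], None], period_s: float) -> None:
        self._send = send
        self._period_s = period_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        # Event.wait() returns False on timeout (time to send a heartbeat)
        # and True as soon as stop() has been called.
        while not self._stop.wait(self._period_s):
            self._send(HEARTBEAT)
```

In a real terminal, `send` would presumably be something like the `sendall` method of the TCP socket already connected to the server, so that the heartbeat travels over the same long connection as the voice packets.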

Optionally, the mobile terminal may perform adaptive adjustment on the heartbeat cycle according to the reference heartbeat cycle and the signal quality of the satellite mobile communication network, obtain an adaptive heartbeat cycle which is relatively long and can maintain stable connection, and maintain long connection with the server according to the adaptive heartbeat cycle. The reference heartbeat period may be a preset heartbeat period, a heartbeat period used in the last connection, a heartbeat period used by other mobile communication networks (such as a public land mobile communication network), and the like.

In the embodiment of the invention, the mobile terminal can be a satellite mobile communication terminal only supporting satellite mobile communication, or a convergence terminal of satellite mobile communication and public land mobile communication supporting both satellite mobile communication and public land mobile communication.

S12, processing the collected voice information by using a low-rate voice coding algorithm and generating a first voice file.

In the embodiment of the invention, the mobile terminal can carry out voice talkback with other terminals through instant messaging applications such as WeChat, YiXin, QQ and the like, and the other terminals are opposite terminals of the mobile terminal. When the mobile terminal sends a voice file to the opposite terminal, the mobile terminal is the sending end and the opposite terminal is the receiving end; when the mobile terminal receives a voice file sent by the opposite terminal, the mobile terminal is the receiving end and the opposite terminal is the sending end.

When the mobile terminal serves as a sending end, voice information is collected through a microphone, the collected voice information is processed through a low-rate voice coding algorithm, and a first voice file is generated.

Optionally, when the mobile terminal collects the voice information, the application processor samples the voice with an 8-bit ADC at a sampling frequency of 8 kHz and digitally records the collected voice information. The code rate of the collected voice information is therefore 8 bits × 8 kHz = 64 kb/s.

The low-rate speech coding algorithm may be any one of speech coding algorithms such as an adaptive multi-rate (AMR) algorithm, a mixed excitation linear prediction coding (MELP) algorithm, a code excited linear prediction coding (CELP) algorithm, a sinusoidal transform coding (STC) algorithm, a time-frequency domain interpolation coding (TFI) algorithm, a pitch synchronous excitation linear prediction coding (PSELP) algorithm, a multi-band excitation coding (MBE) algorithm, a waveform interpolation coding (WI) algorithm, and the like.

For example, taking the AMR algorithm as an example, the mobile terminal performs compression coding on the collected voice information by using the AMR algorithm to reduce the code rate of the voice information to a preset value, and generates a voice file in the AMR format. AMR can adopt nine code rates from 6.6 kb/s to 23.85 kb/s (the rate set of the wideband variant, AMR-WB), and the lowest code rate of 6.6 kb/s is preferred as the preset value. Therefore, the code rate of the voice information is greatly reduced, the size of the voice file is reduced, the bandwidth resource of the satellite mobile communication network is saved, and low-delay real-time talkback is realized.

For another example, taking the MELP algorithm as an example, the mobile terminal performs compression coding on the collected voice information by using the MELP algorithm, so as to reduce the code rate of the voice information to 2.4kb/s, and generate a voice file in the MELP format. Therefore, the code rate of the voice information is greatly reduced, the capacity of the voice file is reduced, the bandwidth resource of the satellite mobile communication network is saved, and the low-delay real-time talkback is realized.
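The effect of the code rates quoted above on file size can be checked with a short calculation. This is an illustrative sketch only: the function names are invented here, and the figures cover the voice payload alone, ignoring any file or frame headers that a real AMR or MELP file would add.

```python
def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int) -> int:
    """Bit rate of uncompressed mono PCM in bits per second."""
    return bits_per_sample * sample_rate_hz


def payload_bytes(bit_rate_bps: float, duration_s: float) -> float:
    """Payload size in bytes for a recording of the given duration."""
    return bit_rate_bps * duration_s / 8


RAW_BPS = pcm_bit_rate(8, 8000)   # 8-bit samples at 8 kHz = 64000 bit/s
AMR_BPS = 6600                    # preferred preset value from the text
MELP_BPS = 2400                   # MELP rate from the text

for name, rate in [("raw PCM", RAW_BPS), ("AMR", AMR_BPS), ("MELP", MELP_BPS)]:
    size = payload_bytes(rate, 10)            # a 10-second voice message
    print(f"{name}: {rate} bit/s, {size:.0f} bytes, "
          f"{RAW_BPS / rate:.1f}x smaller than raw")
```

For a 10-second message this gives 80000 bytes of raw audio, 8250 bytes at 6.6 kb/s and 3000 bytes at 2.4 kb/s, which is the reduction the text relies on to fit voice intercom into a narrow satellite link.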

S13, sending the first voice file to the server so that the server sends the first voice file to the opposite terminal.

In the embodiment of the invention, the satellite communication modem of the mobile terminal establishes communication with the server through a socket, and the mobile terminal preferably transmits the voice file to the server in packets by using the transmission control protocol/internet protocol (TCP/IP). That is, the mobile terminal divides the first voice file into a plurality of voice packets and transmits them to the server in order. After receiving the voice packets, the server stores them in a cache in sequence according to the start identifier and the end identifier of the voice packets to form a voice file, thereby restoring the first voice file.

Each voice packet is a TCP/IP protocol packet, and the TCP/IP protocol packet comprises the following components:

Packet header | Packet length | Packet body

The mobile terminal and the server may agree on the definition of the packet header (for example, set different identifiers), and the server analyzes the packet header of the TCP/IP protocol packet to distinguish whether the network transmitting the voice file is a satellite mobile communication network or a public land mobile communication network, that is, whether the voice file sent by the sending end is a low-rate voice coding file or a common voice coding file.
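The packet layout and the reassembly by start and end identifiers can be sketched as below. The concrete encoding is an assumption made for illustration: a 1-byte header holding hypothetical flag bits (`FLAG_START`, `FLAG_END`, `FLAG_LOW_RATE`) followed by a 2-byte big-endian length. The text only says that the header definition is agreed between the terminal and the server.

```python
import struct
from typing import Iterable, List, Tuple

# Hypothetical header bits; the actual identifiers are not given in the text.
FLAG_START = 0x01     # first packet of a voice file
FLAG_END = 0x02       # last packet of a voice file
FLAG_LOW_RATE = 0x04  # file is low-rate coded (sent over the satellite network)

_HEADER = struct.Struct("!BH")  # 1-byte header flags + 2-byte body length


def split_into_packets(data: bytes, body_size: int, low_rate: bool) -> List[bytes]:
    """Divide a voice file into header | length | body packets."""
    if not 0 < body_size <= 0xFFFF:
        raise ValueError("body_size must fit in the 2-byte length field")
    chunks = [data[i:i + body_size] for i in range(0, len(data), body_size)] or [b""]
    packets = []
    for index, chunk in enumerate(chunks):
        flags = FLAG_LOW_RATE if low_rate else 0
        if index == 0:
            flags |= FLAG_START
        if index == len(chunks) - 1:
            flags |= FLAG_END
        packets.append(_HEADER.pack(flags, len(chunk)) + chunk)
    return packets


def reassemble(packets: Iterable[bytes]) -> Tuple[bytes, bool]:
    """Rebuild the voice file; returns (file bytes, is_low_rate)."""
    buffer = bytearray()
    started = False
    low_rate = False
    for packet in packets:
        flags, length = _HEADER.unpack_from(packet)
        body = packet[_HEADER.size:_HEADER.size + length]
        if len(body) != length:
            raise ValueError("truncated packet")
        if flags & FLAG_START:
            buffer.clear()
            started = True
            low_rate = bool(flags & FLAG_LOW_RATE)
        if not started:
            raise ValueError("packet received before the start identifier")
        buffer += body
        if flags & FLAG_END:
            return bytes(buffer), low_rate
    raise ValueError("end identifier never received")
```

Because TCP delivers bytes in order, the sketch relies on arrival order and carries no sequence number; a design running over an unordered transport would need one.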

The server can adopt a software architecture supporting concurrent access of a plurality of clients, such as MINA, Erlang and the like, and supports multi-user high-concurrency access to the server. For example, a multi-threaded mechanism is employed, with one thread for listening to client requests and multiple threads for handling multiple user concurrent requests.

The specific process on the server side is as follows. A server object is created, which spawns a listening thread; port listening starts and the server begins accepting client connection requests. When a client connects, a client object is created, which spawns a new thread. To exchange data with the client, a data stream transmission object is created and data listening starts; when data is received, its length is checked. When the data length is 0, the connection is judged to be disconnected, and the client object and its now-useless thread are deleted; when the data length is not 0, the data is processed.
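The server-side flow just described (one listening thread, one thread per client, a zero-length read meaning disconnection) can be sketched with the Python standard library. This is an assumption-laden illustration rather than the MINA or Erlang architecture the text mentions; `start_server`, `serve`, `handle_client` and the `on_data` callback are names invented here.

```python
import socket
import threading
from typing import Callable


def handle_client(conn: socket.socket, on_data: Callable[[bytes], bytes]) -> None:
    """Per-client thread: process data until a zero-length read."""
    with conn:                        # closes the client socket on exit
        while True:
            data = conn.recv(4096)
            if len(data) == 0:        # length 0: the peer has disconnected
                break
            conn.sendall(on_data(data))


def serve(listener: socket.socket, on_data: Callable[[bytes], bytes]) -> None:
    """Listening thread: accept clients and give each one its own thread."""
    while True:
        try:
            conn, _addr = listener.accept()
        except OSError:               # listener was closed: stop serving
            return
        threading.Thread(
            target=handle_client, args=(conn, on_data), daemon=True
        ).start()


def start_server(on_data: Callable[[bytes], bytes],
                 host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Open the listening port and start the listening thread."""
    listener = socket.create_server((host, port))
    threading.Thread(target=serve, args=(listener, on_data), daemon=True).start()
    return listener
```

Passing `port=0` lets the operating system pick a free port, which `listener.getsockname()[1]` then reports; a deployed server would use a fixed, agreed port instead.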

After receiving the first voice file, the server may send the voice file to the receiving end in either of two ways: one is to send a download address of the first voice file to the receiving end, so that the receiving end can download the first voice file directly from that address; the other is to transmit the first voice file to the receiving end in packets by using the TCP/IP protocol.

If the receiving end accesses the satellite mobile communication network, the download mode is preferred, because it reduces the time delay. If the receiving end accesses the public land mobile communication network, either mode may be used.
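The delivery rule above amounts to a small decision function. The sketch below is only one way to express it; the enum names and the `land_default` parameter are assumptions, since the text leaves the choice open for receivers on the public land mobile network.

```python
from enum import Enum


class Network(Enum):
    SATELLITE = "satellite mobile communication network"
    LAND = "public land mobile communication network"


class Delivery(Enum):
    DOWNLOAD_ADDRESS = "send a download address"
    PACKET_TRANSFER = "transmit in TCP/IP packets"


def choose_delivery(receiver_network: Network,
                    land_default: Delivery = Delivery.PACKET_TRANSFER) -> Delivery:
    """Pick how the server forwards a voice file to the receiving end."""
    if receiver_network is Network.SATELLITE:
        # The download mode is preferred on the satellite network.
        return Delivery.DOWNLOAD_ADDRESS
    # Either mode is acceptable on the land network; the default is a free choice.
    return land_default
```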

According to the voice talkback method of this embodiment, a long connection with the server is maintained through the satellite mobile communication network, and the collected voice information is compressed by using the low-rate voice coding algorithm. This greatly reduces the code rate of the voice information and the size of the transmitted voice file, saves the bandwidth resources of the satellite mobile communication network, and thereby realizes low-delay real-time talkback. It solves the prior-art technical problem that a mobile terminal based on satellite mobile communication cannot realize real-time voice talkback by using an instant messaging application, and it improves the user experience.

Further, as shown in fig. 2, in the second embodiment of the voice intercom method of the present invention, when the mobile terminal is used as a receiving end, step S11 is followed by:

S14, acquiring the second voice file sent by the opposite terminal from the server.

In the embodiment of the invention, the server preferably sends the download address of the second voice file to the mobile terminal, and the mobile terminal receives the download address sent by the server and downloads the second voice file according to the download address. The second voice file is acquired in a downloading mode, so that the time delay of voice talkback can be reduced, and the user experience is improved.

In other embodiments, the server may also use TCP/IP protocol to packet-transmit the second voice file to the mobile terminal, that is, the server divides the second voice file into a plurality of voice packets, and sequentially transmits the plurality of voice packets to the mobile terminal. After receiving a plurality of voice packets, the mobile terminal stores the voice packets into the cache in sequence according to the starting identifier and the ending identifier of the voice packets to form a voice file, namely, a second voice file is restored.

S15, outputting the second voice file.

In the embodiment of the invention, after receiving the second voice file, the mobile terminal first judges whether the second voice file is a low-rate voice coding file. When the file is a low-rate voice coding file, a low-rate voice decoder decodes the second voice file by using a low-rate voice decoding algorithm, and the file is then played; when the file is a common voice coding file, the second voice file is decoded by a broadband voice decoder and then played.

The mobile terminal may determine whether the second voice file is a low-rate voice encoded file according to the identification information of the second voice file, where the identification information may be set in the header of the voice packet of the second voice file.

For example, when the identification information of the second voice file is the first identification, the second voice file is judged to be a low-rate voice coding file; and when the identification information of the second voice file is the second identification, judging that the second voice file is a common voice coding file.

For another example, when the identification information of the second voice file is the first identification, the second voice file is determined to be a low-rate voice coding file; when the identification information of the second voice file is empty (i.e. no identification), the second voice file is determined to be a normal voice coding file. Or vice versa.
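The identification check and decoder selection described above can be sketched as follows. The identifier values `LOW_RATE_ID` and `NORMAL_ID`, the `absent_means_low_rate` switch, and the decoder callables are hypothetical; the text says only that a first identification, a second identification, or an empty identification may be used, in either polarity.

```python
from typing import Callable, Optional

# Hypothetical identifier values agreed between terminal and server.
LOW_RATE_ID = 1   # "first identification"
NORMAL_ID = 2     # "second identification"

Decoder = Callable[[bytes], bytes]


def is_low_rate(identification: Optional[int],
                absent_means_low_rate: bool = False) -> bool:
    """Classify a voice file from the identification in its packet header."""
    if identification is None:
        # Empty identification: meaning depends on the agreed convention.
        return absent_means_low_rate
    if identification == LOW_RATE_ID:
        return True
    if identification == NORMAL_ID:
        return False
    raise ValueError(f"unknown identification: {identification!r}")


def decode_for_playback(identification: Optional[int], payload: bytes,
                        low_rate_decoder: Decoder,
                        broadband_decoder: Decoder) -> bytes:
    """Route the file to the low-rate or the broadband decoder."""
    if is_low_rate(identification):
        return low_rate_decoder(payload)
    return broadband_decoder(payload)
```

Passing the decoders in as callables keeps the sketch independent of any particular codec library; real AMR or MELP decoders would be plugged in at that point.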

According to the voice talkback method, the second voice file is obtained in a downloading mode, so that the time delay of voice talkback is reduced, and the user experience is improved. And decoding the second voice file by using a low-rate voice decoding algorithm, so that voice talkback with a mobile terminal accessed to the satellite mobile communication network is realized.

In the embodiment of the invention, the mobile terminal accessed to the satellite mobile communication network can carry out voice talkback with other mobile terminals accessed to the satellite mobile communication network and can also carry out voice talkback with other mobile terminals accessed to the public land mobile communication network.

Referring to FIG. 3, a first embodiment of the voice intercom device of the present invention is proposed. The device is applied to a mobile terminal and may, of course, also be applied to other terminal devices. The device includes a connection module 10, a processing module 20 and a sending module 30, wherein:

the connection module 10: for maintaining a long connection with a server through a satellite mobile communication network.

In the embodiment of the present invention, after the connection module 10 establishes a connection with the server through the satellite mobile communication network, it maintains a long connection with the server at a certain heartbeat cycle; that is, the connection module 10 sends a heartbeat packet to the server once every heartbeat cycle so as to maintain the connection path between the two, thereby enabling low-delay real-time transmission of subsequent voice packets.

Alternatively, the connection module 10 may maintain a long connection with the server at a preset heartbeat cycle.

Optionally, the connection module 10 may adaptively adjust the heartbeat cycle according to a reference heartbeat cycle and the signal quality of the satellite mobile communication network, obtain an adaptive heartbeat cycle that is relatively long yet still maintains a stable connection, and maintain the long connection with the server at the adaptive heartbeat cycle. The reference heartbeat cycle may be a preset heartbeat cycle, the heartbeat cycle used in the last connection, a heartbeat cycle used by another mobile communication network (such as a public land mobile communication network), and the like.

For example, the connection module 10 first performs a long connection test at the reference heartbeat cycle. When the long connection can be maintained at the reference heartbeat cycle, the duration is increased from the reference heartbeat cycle and the test is repeated to obtain an adaptive heartbeat cycle that can maintain the long connection; for example, the duration is gradually increased until the long connection can no longer be maintained, and the heartbeat cycle of the previous test is then selected as the adaptive heartbeat cycle. When the long connection cannot be maintained at the reference heartbeat cycle, the duration is decreased from the reference heartbeat cycle and the test is repeated; for example, the duration is gradually decreased until the long connection can be maintained, and the heartbeat cycle of that test is selected as the adaptive heartbeat cycle. Finally, the connection module 10 maintains the long connection with the server at the adaptive heartbeat cycle.
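The search for an adaptive heartbeat cycle described above can be sketched as a step-wise probe. This is a simplified illustration under stated assumptions: the function name, the fixed `step_s` increment, the `min_s` and `max_s` bounds, and the `can_maintain` callback (which stands in for an actual long connection test at a given cycle) are all introduced here and do not appear in the text.

```python
from typing import Callable, Optional


def adaptive_heartbeat_period(reference_s: float, step_s: float,
                              can_maintain: Callable[[float], bool],
                              min_s: float, max_s: float) -> Optional[float]:
    """Return the adaptive heartbeat cycle, or None if no tested cycle works."""
    if can_maintain(reference_s):
        # Lengthen the cycle step by step; keep the last cycle that still held.
        period = reference_s
        while period + step_s <= max_s and can_maintain(period + step_s):
            period += step_s
        return period
    # Shorten the cycle step by step; take the first cycle that holds.
    period = reference_s - step_s
    while period >= min_s:
        if can_maintain(period):
            return period
        period -= step_s
    return None


# Example with a simulated link that drops idle connections after 95 seconds.
link_holds = lambda period_s: period_s <= 95
print(adaptive_heartbeat_period(60, 10, link_holds, min_s=10, max_s=300))
print(adaptive_heartbeat_period(120, 10, link_holds, min_s=10, max_s=300))
```

With the simulated link, both calls settle on a 90-second cycle: the first by lengthening from 60 seconds until the 100-second test fails, the second by shortening from 120 seconds until a test succeeds.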

The processing module 20: the voice processing device is used for compressing the collected voice information by using a low-rate voice coding algorithm and generating a first voice file.

In the embodiment of the invention, the voice intercom device can perform voice intercom with other terminals through instant messaging applications such as WeChat, easy-to-communicate, QQ and the like, and the other terminals are opposite terminals of the mobile terminal. When the mobile terminal sends a voice file to the opposite terminal, the mobile terminal is a sending terminal, and the opposite terminal is a receiving terminal; when the mobile terminal receives the voice file sent by the opposite terminal, the mobile terminal is a receiving terminal, and the opposite terminal is a sending terminal.

When the mobile terminal is used as a transmitting end, theprocessing module 20 collects voice information through a microphone, and processes the collected voice information by using a low-rate voice coding algorithm to generate a first voice file.

Optionally, when theprocessing module 20 collects the voice information, the application processor collects the voice information by using 8-bit ADCs and 8k sampling frequencies, and performs digital recording on the collected voice information. The code rate of the collected voice information is 64 kb/s.

For example, taking the AMR algorithm as an example, theprocessing module 20 performs compression coding on the collected voice information by using the AMR algorithm to reduce the code rate of the voice information to a preset value, and generates a voice file in the AMR format. AMR can adopt nine codes from 6.6kb/s to 23.85kb/s, and the lowest code rate of 6.6kb/s is preferred to be a preset value. Therefore, the code rate of the voice information is greatly reduced, the capacity of the voice file is reduced, the bandwidth resource of the satellite mobile communication network is saved, and the low-delay real-time talkback is realized.

As another example, taking the MELP algorithm, the processing module 20 performs compression coding on the collected voice information by using the MELP algorithm, so as to reduce the code rate of the voice information to 2.4 kb/s, and generates a voice file in the MELP format. Therefore, the code rate of the voice information is greatly reduced, the size of the voice file is reduced, the bandwidth resource of the satellite mobile communication network is saved, and low-delay real-time talkback is realized.
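The code rates quoted above can be compared with a short calculation. The following is a minimal sketch, not part of the described device: the function names and the 10-second clip length are illustrative assumptions, and the sizes cover the coded speech payload only, ignoring any file or frame headers.

```python
# Illustrative arithmetic only: compares the raw PCM code rate with the
# AMR (6.6 kb/s) and MELP (2.4 kb/s) code rates quoted in the text.

def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int) -> int:
    """Code rate of uncompressed mono PCM, in bits per second."""
    return bits_per_sample * sample_rate_hz


def payload_bytes(bit_rate_bps: float, seconds: float) -> float:
    """Size of the coded speech payload in bytes (headers not counted)."""
    return bit_rate_bps * seconds / 8


RAW_BPS = pcm_bit_rate(8, 8000)  # 8-bit samples at 8 kHz
AMR_BPS = 6600                   # lowest AMR code rate named in the text
MELP_BPS = 2400                  # MELP code rate named in the text
CLIP_SECONDS = 10                # hypothetical clip length

for name, rate in [("PCM", RAW_BPS), ("AMR", AMR_BPS), ("MELP", MELP_BPS)]:
    size = payload_bytes(rate, CLIP_SECONDS)
    print(f"{name}: {rate / 1000} kb/s, {size:.0f} bytes, "
          f"{RAW_BPS / rate:.1f}x smaller than raw PCM")
```

Under these assumptions a 10-second clip shrinks from 80000 bytes of raw PCM to 8250 bytes at 6.6 kb/s and 3000 bytes at 2.4 kb/s.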

The sending module 30 is configured to send the first voice file to the server, so that the server sends the first voice file to the opposite terminal.

In the embodiment of the present invention, the sending module 30 preferably uses the TCP/IP protocol to transmit the voice file to the server in packets. That is, the sending module 30 divides the first voice file into a plurality of voice packets and transmits the voice packets to the server in order. After receiving the voice packets, the server stores them in the cache in sequence according to the start identifier and the end identifier of the voice packets to form a voice file, that is, the first voice file is restored.
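The splitting and restoring steps described above can be sketched as follows. This is only an illustration under stated assumptions: the `VoicePacket` type, the boolean start and end flags, and the 512-byte chunk size are hypothetical, since the text does not specify how the start and end identifiers are encoded.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class VoicePacket:
    is_start: bool  # start identifier: first packet of a voice file
    is_end: bool    # end identifier: last packet of a voice file
    body: bytes


def split_voice_file(data: bytes, chunk_size: int = 512) -> List[VoicePacket]:
    """Sender side: divide a voice file into ordered voice packets."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [b""]  # an empty file still yields one start+end packet
    last = len(chunks) - 1
    return [VoicePacket(i == 0, i == last, c) for i, c in enumerate(chunks)]


def reassemble(packets: Iterable[VoicePacket]) -> bytes:
    """Receiver side: cache packet bodies from start to end identifier."""
    cache = bytearray()
    started = False
    for packet in packets:
        if packet.is_start:
            cache.clear()
            started = True
        if not started:
            continue  # ignore packets that arrive before a start identifier
        cache += packet.body
        if packet.is_end:
            return bytes(cache)
    raise ValueError("end identifier not received")
```

Because TCP delivers bytes in order, the receiver in this sketch only needs to append bodies to the cache until it sees the end identifier.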

The TCP/IP protocol packet has the following format: packet header | packet length | packet body

The sending module 30 and the server may agree on the definition of the packet header (for example, by setting different identifiers), and the server parses the packet header of the TCP/IP protocol packet to distinguish whether the network transmitting the voice file is a satellite mobile communication network or a public land mobile communication network, that is, to distinguish whether the voice file sent by the sending end is a low-rate voice coding file or a normal voice coding file.
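One possible way to realize a packet with a header, a length and a body, and to read the agreed identifier back on the server, is sketched below. The field widths (a 1-byte header identifier and a 4-byte length) and the identifier values are assumptions made for illustration; the text leaves the header definition to the agreement between the sending module and the server.

```python
import struct

# Hypothetical identifier values agreed between sender and server.
NETWORK_SATELLITE = 0x01    # voice file is a low-rate voice coding file
NETWORK_PUBLIC_LAND = 0x02  # voice file is a normal voice coding file

# Network byte order: 1-byte header identifier, 4-byte body length.
_PREFIX = struct.Struct("!BI")


def build_packet(network_id: int, body: bytes) -> bytes:
    """Sender side: packet header + packet length + packet body."""
    return _PREFIX.pack(network_id, len(body)) + body


def parse_packet(packet: bytes) -> tuple:
    """Server side: return (network_id, body) from one complete packet."""
    network_id, length = _PREFIX.unpack_from(packet, 0)
    body = packet[_PREFIX.size:_PREFIX.size + length]
    if len(body) != length:
        raise ValueError("truncated packet")
    return network_id, body


def is_low_rate(network_id: int) -> bool:
    """Map the header identifier to the coding type of the voice file."""
    if network_id == NETWORK_SATELLITE:
        return True
    if network_id == NETWORK_PUBLIC_LAND:
        return False
    raise ValueError(f"unknown header identifier: {network_id}")
```

The explicit length field lets the server find packet boundaries in the TCP byte stream before it inspects the body.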

The voice intercom device provided by the embodiment of the invention keeps a long connection with the server through the satellite mobile communication network and compresses the collected voice information by using the low-rate voice coding algorithm. As a result, the code rate of the voice information is greatly reduced, the size of the sent voice file is reduced, the bandwidth resource of the satellite mobile communication network is saved, and low-delay real-time intercom is realized. This solves the technical problem in the prior art that a mobile terminal based on satellite mobile communication cannot realize real-time voice intercom by using an instant messaging application, and improves the user experience.

Further, as shown in fig. 4, in the second embodiment of the voice intercom device of the present invention, the device further includes an obtaining module 40 and an output module 50, where the obtaining module 40 is configured to obtain the second voice file sent by the opposite end from the server, and the output module 50 is configured to output the second voice file.

In the embodiment of the present invention, the server preferably sends the download address of the second voice file to the mobile terminal. In this case, as shown in fig. 5, the obtaining module 40 includes a receiving unit 41 and a downloading unit 42, where the receiving unit 41 is configured to receive the download address of the second voice file sent by the server, and the downloading unit 42 is configured to download the second voice file according to the download address.

In other embodiments, the server may also use the TCP/IP protocol to transmit the second voice file to the mobile terminal in packets, that is, the server divides the second voice file into a plurality of voice packets and transmits the voice packets to the mobile terminal in order. The obtaining module 40 receives the voice packets and stores them in the cache in sequence according to the start identifier and the end identifier of the voice packets to form a voice file, that is, the second voice file is restored.

As shown in fig. 6, the output module 50 includes a judging unit 51 and a playing unit 52, wherein: the judging unit 51 is configured to judge whether the second voice file is a low-rate voice encoded file; the playing unit 52 is configured to decode the second voice file with a low-rate voice decoder using a low-rate voice decoding algorithm and then play it when the second voice file is a low-rate voice encoded file, and to decode the second voice file with a broadband voice decoder and then play it when the second voice file is a normal voice encoded file.

The judging unit 51 may determine whether the second voice file is a low-rate voice encoded file by using identification information of the second voice file, where the identification information may be set in a header of a voice packet of the second voice file.

For example, when the identification information of the second voice file is the first identification, the judging unit 51 judges that the second voice file is a low-rate voice encoded file; when the identification information of the second voice file is the second identification, the judging unit 51 judges that the second voice file is a normal voice encoded file.

As another example, when the identification information of the second voice file is the first identification, the judging unit 51 judges that the second voice file is a low-rate voice encoded file; when the identification information of the second voice file is empty (i.e., no identification), the judging unit 51 judges that the second voice file is a normal voice encoded file. The roles may also be reversed.
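The judging and playing logic described above can be sketched as a small dispatch routine. The identifier values `"LOW"` and `"NORMAL"`, the function names and the decoder callables are hypothetical placeholders; real low-rate and broadband decoders would be supplied by the terminal's audio stack.

```python
from typing import Callable, Optional

FIRST_ID = "LOW"      # hypothetical first identification (low-rate file)
SECOND_ID = "NORMAL"  # hypothetical second identification (normal file)

Decoder = Callable[[bytes], bytes]


def is_low_rate_file(identifier: Optional[str],
                     empty_means_normal: bool = True) -> bool:
    """Judging unit: classify the second voice file by its identification."""
    if identifier == FIRST_ID:
        return True
    if identifier == SECOND_ID:
        return False
    if identifier is None and empty_means_normal:
        return False  # scheme in which an empty identification means normal
    raise ValueError(f"unrecognized identification: {identifier!r}")


def select_decoder(identifier: Optional[str],
                   low_rate_decoder: Decoder,
                   broadband_decoder: Decoder) -> Decoder:
    """Playing unit: pick the decoder that matches the coding type."""
    if is_low_rate_file(identifier):
        return low_rate_decoder
    return broadband_decoder
```

The decoded output of the selected decoder would then be passed to the terminal's audio playback path.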

The voice talkback device of this embodiment acquires the second voice file by downloading, which reduces the time delay of voice talkback and improves the user experience. The second voice file is decoded by using a low-rate voice decoding algorithm, so that voice talkback with a mobile terminal accessing the satellite mobile communication network is realized.

The invention also proposes a mobile terminal comprising a memory, a processor and at least one application stored in the memory and configured to be executed by the processor, the application being configured to perform a voice intercom method. The voice intercom method comprises the following steps: keeping long connection with a server through a satellite mobile communication network; compressing the collected voice information by using a low-rate voice coding algorithm and generating a first voice file; and sending the first voice file to the server so that the server sends the first voice file to a receiving end. The voice intercom method described in this embodiment is the voice intercom method related to the above embodiment of the present invention, and is not described herein again.

Those skilled in the art will appreciate that the present invention includes apparatus directed to performing one or more of the operations described in the present application. These devices may be specially designed and manufactured for the required purposes, or they may comprise known devices in general-purpose computers. These devices have stored therein computer programs that are selectively activated or reconfigured. Such a computer program may be stored in a device (e.g., computer) readable medium, including, but not limited to, any type of disk including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks, ROMs (Read-Only Memories), RAMs (Random Access Memories), EPROMs (Erasable Programmable Read-Only Memories), EEPROMs (Electrically Erasable Programmable Read-Only Memories), flash memories, magnetic cards, or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a bus. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).

It will be understood by those within the art that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. Those skilled in the art will appreciate that the computer program instructions may be implemented by a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the features specified in the block or blocks of the block diagrams and/or flowchart illustrations of the present disclosure.

Those of skill in the art will appreciate that various operations, methods, steps in the processes, acts, or solutions discussed in the present application may be alternated, modified, combined, or deleted. Further, various operations, methods, steps in the flows, which have been discussed in the present application, may be interchanged, modified, rearranged, decomposed, combined, or eliminated. Further, steps, measures, schemes in the various operations, methods, procedures disclosed in the prior art and the present invention can also be alternated, changed, rearranged, decomposed, combined, or deleted.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A voice intercom method is characterized by comprising the following steps:

sending the first voice file to the server so that the server sends the first voice file to an opposite terminal; the mode that the server sends the first voice file to the opposite terminal comprises the following steps: transmitting the first voice file to an opposite terminal by adopting TCP/IP (Transmission control protocol/Internet protocol) sub-packets or sending a download address of the first voice file to the opposite terminal;

wherein the step of maintaining a long connection with a server through a satellite mobile communication network comprises:

carrying out self-adaptive adjustment on the heartbeat period according to the reference heartbeat period and the signal quality of the satellite mobile communication network; the reference heartbeat cycle is a preset heartbeat cycle, a heartbeat cycle used in last connection or a heartbeat cycle used by a mobile communication network;

acquiring a self-adaptive heartbeat cycle which has a larger cycle and can keep stable connection, and keeping long connection with the server by the self-adaptive heartbeat cycle;

the low-rate speech coding comprises an AMR algorithm, and the step of compressing the collected speech information by using the low-rate speech coding algorithm and generating the first speech file comprises the following steps:

carrying out compression coding on the collected voice information by utilizing the AMR algorithm so as to reduce the code rate of the voice information to a preset value;

and generating a voice file in an AMR format.

2. The voice intercom method according to claim 1, wherein the step of establishing a connection with a server through a satellite mobile communication network and maintaining a long connection is followed by further comprising:

acquiring a second voice file sent by an opposite terminal from the server;

and outputting the second voice file.

3. The voice intercom method according to claim 2, wherein said step of obtaining, from the server, the second voice file sent by the opposite terminal comprises:

receiving a download address of the second voice file sent by the server;

and downloading the second voice file according to the download address.

4. The voice intercom method according to claim 2, wherein said step of outputting said second voice file comprises:

judging whether the second voice file is a low-rate voice coding file or not;

5. The speech intercom method according to any one of claims 1-4, characterized in that said low-rate speech coding algorithm is an adaptive multi-rate AMR algorithm, a Mixed excitation Linear predictive coding (MELP) algorithm, a code excited Linear predictive Coding (CELP) algorithm, a Sinusoidal Transform Coding (STC) algorithm, a time-frequency-domain interpolation coding (TFI) algorithm, a pitch synchronous excitation Linear predictive coding (PSELP) algorithm, a multi-band excitation coding (MBE) algorithm or a waveform interpolation coding (WI) algorithm.

6. A voice intercom apparatus, comprising:

a sending module, configured to send the first voice file to the server, so that the server sends the first voice file to an opposite end; the mode that the server sends the first voice file to the opposite terminal comprises the following steps: transmitting the first voice file to an opposite terminal by adopting TCP/IP (Transmission control protocol/Internet protocol) sub-packets or sending a download address of the first voice file to the opposite terminal;

the connection module is also used for adaptively adjusting the heartbeat period according to the reference heartbeat period and the signal quality of the satellite mobile communication network; the reference heartbeat cycle is a preset heartbeat cycle, a heartbeat cycle used in last connection or a heartbeat cycle used by a mobile communication network; acquiring a self-adaptive heartbeat cycle which has a larger cycle and can keep stable connection, and keeping long connection with the server by the self-adaptive heartbeat cycle;

the low-rate speech coding comprises an AMR algorithm, and the processing module is used for carrying out compression coding on the collected speech information by utilizing the AMR algorithm so as to reduce the code rate of the speech information to a preset value and generate a speech file in an AMR format.

7. The voice intercom device according to claim 6, further comprising:

and the output module is used for outputting the second voice file.

8. The voice intercom device according to claim 7, wherein said obtaining module comprises:

9. The voice intercom device according to claim 7, wherein said output module includes:

10. A mobile terminal comprising a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, wherein the application is configured to perform the voice intercom method of any of claims 1-5.