CN110769297A - Audio and video data processing method and system - Google Patents


Info

Publication number
CN110769297A
Authority
CN
China
Prior art keywords
audio
video
video data
ethernet
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201810827620.0A
Other languages
Chinese (zh)
Inventor
沈世国
杨乌拉
庞晓强
王艳辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Visionvera Information Technology Co Ltd
Original Assignee
Visionvera Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Visionvera Information Technology Co Ltd
Priority to CN201810827620.0A
Publication of CN110769297A
Legal status: Withdrawn


Abstract

The embodiment of the invention provides a method and a system for processing audio and video data, wherein the method comprises the following steps: the Ethernet terminal transmits the audio and video data acquired in real time to a first serial queue; the Ethernet terminal encodes audio data in the audio and video data into audio frames in a first audio format and encodes video data in the audio and video data, which has a synchronous relation with the audio data, into video frames in the first video format in the first serial queue; the Ethernet terminal transmits the audio frame and the video frame to a second serial queue; and the Ethernet terminal synchronously sends the audio frames and the video frames in the second serial queue to an Ethernet server, the Ethernet server is used for pushing the received audio frames and the received video frames to a video network terminal in real time, and the video network terminal is used for synchronously analyzing and playing the received audio frames and the received video frames. The embodiment of the invention realizes that the Ethernet terminal can carry out audio and video live broadcast to the video network terminal through the video network.

Description

Audio and video data processing method and system
Technical Field
The invention relates to the technical field of video networking, in particular to a method and a system for processing audio and video data.
Background
The video network is a dedicated real-time network that uses a dedicated protocol on Ethernet hardware to transmit high-definition video at high speed; it is a higher-level form of the Internet.
At present, an Ethernet terminal in an Ethernet network can acquire audio and video data in real time and transmit it to other Ethernet terminals in the same network for display; this is the live broadcast function of the Ethernet terminal. However, the Ethernet terminal cannot transmit the audio and video data acquired in real time to the video network for display on a video network terminal in the video network.
Disclosure of Invention
In view of the above problems, embodiments of the present invention are proposed to provide a processing method of audio-video data and a corresponding processing system of audio-video data, which overcome or at least partially solve the above problems.
In order to solve the above problem, an embodiment of the present invention discloses a method for processing audio and video data. The method is applied to a video network and an Ethernet, the video network comprising a video networking terminal and the Ethernet comprising an Ethernet server and an Ethernet terminal, wherein the Ethernet server is connected with the video networking terminal and with the Ethernet terminal. The method comprises the following steps: the Ethernet terminal transmits the audio and video data acquired in real time to a first serial queue; in the first serial queue, the Ethernet terminal encodes audio data in the audio and video data into audio frames in a first audio format and encodes video data in the audio and video data, which has a synchronous relationship with the audio data, into video frames in a first video format; the Ethernet terminal transmits the audio frames and the video frames to a second serial queue; and the Ethernet terminal synchronously sends the audio frames and the video frames in the second serial queue to the Ethernet server, the Ethernet server is used for pushing the received audio frames and video frames to the video networking terminal in real time, and the video networking terminal is used for synchronously parsing and playing the received audio frames and video frames.
Optionally, the encoding, by the Ethernet terminal in the first serial queue, of the audio data into audio frames in the first audio format and of the video data having a synchronous relationship with the audio data into video frames in the first video format includes: in the first serial queue, the Ethernet terminal encodes the audio data in Pulse Code Modulation (PCM) format in the audio and video data into audio frames in Advanced Audio Coding (AAC) format, and encodes the video data in luminance-chrominance (YUV) format in the audio and video data, which has a synchronous relationship with the PCM audio data, into video frames in H.264 format.
Optionally, the transmitting, by the Ethernet terminal, of the audio frames and the video frames to a second serial queue includes: the Ethernet terminal encapsulates the audio frames and the video frames according to a preset format requirement, and transmits the audio and video data packets obtained by encapsulation to the second serial queue.
Optionally, before the Ethernet terminal transmits the audio and video data acquired in real time to the first serial queue, the method further includes: the Ethernet terminal acquires the audio and video data in real time through a preset audio and video data capture class and a preset audio and video data acquisition device class; the audio and video data capture class is used for capturing audio and video data through the audio and video data acquisition device of the Ethernet terminal, and the audio and video data acquisition device class is used for acquiring attribute information of the audio and video data acquisition device of the Ethernet terminal.
Optionally, before the Ethernet terminal acquires the audio and video data in real time through a preset audio and video data capture class and an audio and video data acquisition device, the method further includes: the Ethernet terminal establishes a network connection with the Ethernet server and maintains a persistent (long-lived) connection with the Ethernet server through a socket interface.
The embodiment of the invention also discloses a system for processing audio and video data. The system is applied to a video network and an Ethernet, the video network comprising a video networking terminal and the Ethernet comprising an Ethernet server and an Ethernet terminal, wherein the Ethernet server is connected with the video networking terminal and with the Ethernet terminal. The Ethernet terminal includes: a transmission module, used for transmitting the audio and video data acquired by the Ethernet terminal in real time to a first serial queue; an encoding module, used for encoding, in the first serial queue, audio data in the audio and video data into audio frames in a first audio format and video data in the audio and video data, which has a synchronous relationship with the audio data, into video frames in a first video format; the transmission module being further used for transmitting the audio frames and the video frames to a second serial queue; and a sending module, used for synchronously sending the audio frames and the video frames in the second serial queue to the Ethernet server, the Ethernet server being used for pushing the received audio frames and video frames to the video networking terminal in real time, and the video networking terminal being used for synchronously parsing and playing the received audio frames and video frames.
Optionally, the encoding module is configured to encode, in the first serial queue, audio data in Pulse Code Modulation (PCM) format in the audio and video data into audio frames in Advanced Audio Coding (AAC) format, and to encode video data in luminance-chrominance (YUV) format in the audio and video data, which has a synchronous relationship with the PCM audio data, into video frames in H.264 format.
Optionally, the transmission module is configured to encapsulate the audio frame and the video frame according to a preset format requirement, and transmit an audio/video data packet obtained by encapsulation to the second serial queue.
Optionally, the Ethernet terminal further includes: an acquisition module, used for acquiring the audio and video data in real time through a preset audio and video data capture class and a preset audio and video data acquisition device class before the transmission module transmits the audio and video data acquired in real time to the first serial queue; the audio and video data capture class is used for capturing audio and video data through the audio and video data acquisition device of the Ethernet terminal, and the audio and video data acquisition device class is used for acquiring attribute information of the audio and video data acquisition device of the Ethernet terminal.
Optionally, the Ethernet terminal further includes: a connection module, used for establishing a network connection with the Ethernet server before the acquisition module acquires the audio and video data in real time through a preset audio and video data capture class and an audio and video data acquisition device, and for maintaining a persistent (long-lived) connection with the Ethernet server through a socket interface.
The embodiment of the invention has the following advantages:
The embodiment of the invention is applied to a video network and an Ethernet, wherein the video network comprises a video network terminal, the Ethernet comprises an Ethernet server and an Ethernet terminal, and the Ethernet server is connected with the video network terminal and with the Ethernet terminal.
In the embodiment of the invention, the Ethernet terminal transmits the audio and video data acquired in real time to the first serial queue, encodes the audio data in the audio and video data into the audio frame in the first audio format in the first serial queue, and encodes the video data which has a synchronous relation with the audio data in the audio and video data into the video frame in the first video format. And the Ethernet terminal transmits the audio frame and the video frame to the second serial queue and synchronously transmits the audio frame and the video frame in the second serial queue to the Ethernet server. The Ethernet server is used for pushing the received audio frames and video frames to the video network terminal in real time, and the video network terminal is used for synchronously analyzing and playing the received audio frames and video frames.
The embodiment of the invention draws on the characteristics of both the Ethernet and the video network. The Ethernet terminal encodes the audio and video data in a local first serial queue and transmits it from a local second serial queue, so that the audio and video data acquired by the Ethernet terminal in real time reaches the Ethernet server in the Ethernet. The Ethernet server pushes the received audio and video data to the video network terminal in the video network, where it is displayed. The Ethernet terminal can thus broadcast live audio and video to the video network terminal through the video network.
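The two-queue flow described above can be sketched as follows. This is only an illustrative model, not the patented implementation: the `SerialQueue` class, the `encode_audio` and `encode_video` placeholders (which merely tag the bytes instead of producing real AAC or H.264), and the `send` callback standing in for the socket to the Ethernet server are all assumptions introduced for the example.

```python
import queue
import threading


class SerialQueue:
    """Runs submitted tasks one at a time, in submission order, on one worker thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel: stop after all earlier tasks have run
                break
            task()

    def submit(self, task):
        self._tasks.put(task)

    def close(self):
        """Wait for every task submitted so far, then stop the worker."""
        self._tasks.put(None)
        self._worker.join()


def encode_audio(pcm):
    # Placeholder for a PCM -> AAC encoder (hypothetical, no real encoding).
    return b"AAC:" + pcm


def encode_video(yuv):
    # Placeholder for a YUV -> H.264 encoder (hypothetical, no real encoding).
    return b"H264:" + yuv


def run_pipeline(captured, send):
    """Encode each (pcm, yuv) pair on the first queue, send it from the second."""
    encode_queue = SerialQueue()  # "first serial queue"
    send_queue = SerialQueue()    # "second serial queue"
    for pcm, yuv in captured:
        def encode(pcm=pcm, yuv=yuv):
            frames = (encode_audio(pcm), encode_video(yuv))
            # Hand the synchronized audio/video pair to the second queue.
            send_queue.submit(lambda: send(frames))
        encode_queue.submit(encode)
    encode_queue.close()  # all encode tasks finished, all send tasks submitted
    send_queue.close()
```

Because each queue has a single worker, frames leave the second queue in the order they were captured, and each audio frame travels together with the video frame it was paired with at capture time.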
Drawings
FIG. 1 is a schematic networking diagram of a video network of the present invention;
FIG. 2 is a schematic diagram of a hardware structure of a node server of the present invention;
FIG. 3 is a schematic diagram of a hardware structure of an access switch of the present invention;
FIG. 4 is a schematic diagram of a hardware structure of an Ethernet protocol conversion gateway of the present invention;
FIG. 5 is a flowchart illustrating steps of an embodiment of a method for processing audio and video data of the present invention;
FIG. 6 is a flowchart illustrating a method for implementing a live broadcast function of an iOS device based on video networking of the present invention;
FIG. 7 is a block diagram of an embodiment of a system for processing audio and video data of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The video network is an important milestone in network development. It is a real-time network that can transmit high-definition video in real time, and it pushes many Internet applications toward high-definition video and high-definition face-to-face communication.
The video network adopts real-time high-definition video switching technology and can integrate dozens of required services, such as video, voice, pictures, text, communication and data, on a single network platform. Examples include high-definition video conferencing, video monitoring, intelligent monitoring analysis, emergency command, digital broadcast television, time-shifted television, network teaching, live broadcast, video on demand (VOD), television mail, personal video recording (PVR), intranet (self-operated) channels, intelligent video broadcast control and information distribution. High-definition video playback is delivered through a television or a computer.
To better understand the embodiments of the present invention, the video network is described below.
Some of the technologies applied in the video network are as follows:
Network Technology
Network technology innovation in the video network improves on traditional Ethernet to cope with the potentially enormous video traffic on the network. Unlike pure packet switching or pure circuit switching, video networking technology uses packet switching to meet the demands of streaming. (Streaming is a data transmission technique that turns received data into a steady, continuous stream and transmits it continuously, so that the sound or image the user receives is smooth and the user can begin viewing before all of the data has been transmitted.) Video networking technology has the flexibility, simplicity and low cost of packet switching while providing the quality and security guarantees of circuit switching, thereby achieving seamless connection of network-wide switched virtual circuits and data formats.
Switching Technology
The video network retains the two advantages of Ethernet, asynchrony and packet switching, and eliminates Ethernet's defects while remaining fully compatible with it. It provides end-to-end seamless connection across the whole network, reaches the user terminal directly, and directly carries IP data packets. User data requires no format conversion anywhere in the network. The video network is a higher-level form of Ethernet and a real-time switching platform; it can achieve network-wide, large-scale, real-time transmission of high-definition video, which the existing Internet cannot, and pushes many network video applications toward high definition and unification.
Server Technology
Server technology on the video network and unified video platform differs from that of a traditional server: its streaming media transmission is connection-oriented, its data processing capacity is independent of traffic and communication time, and a single network layer can carry both signaling and data transmission. For voice and video services, streaming media processing on the video network and unified video platform is much less complex than data processing, and efficiency is improved by more than one hundred times compared with a traditional server.
Storage Technology
To handle media content of very large capacity and very high traffic, the ultra-high-speed storage technology of the unified video platform adopts the most advanced real-time operating system. Program information in a server instruction is mapped to specific hard disk space, and media content no longer passes through the server but is sent directly to the user terminal, so that the user's typical waiting time is less than 0.2 seconds. Optimized sector distribution greatly reduces the mechanical seek motion of the hard disk head; resource consumption is only 20% of that of an IP Internet system of the same grade, yet the concurrent traffic generated is 3 times larger than that of a traditional hard disk array, and overall efficiency is improved by more than 10 times.
Network Security Technology
Through mechanisms such as independent permission control for each service and complete isolation of devices and user data, the structural design of the video network eliminates, at the structural level, the network security problems that trouble the Internet. It generally requires no antivirus programs or firewalls, blocks attacks by hackers and viruses, and provides users with a structurally secure, worry-free network.
Service Innovation Technology
The unified video platform integrates services with transmission: whether for a single user, a private network user or a network aggregate, only one automatic connection is needed. A user terminal, set-top box or PC connects directly to the unified video platform to obtain a wide variety of multimedia video services. The unified video platform uses a menu-style configuration table in place of traditional complex application programming, so that complex applications can be implemented with very little code, enabling unlimited new service innovation.
Networking of the video network is as follows:
The video network is a centrally controlled network structure. The network may be a tree network, a star network, a ring network and so on, but in every case a centralized control node in the network is required to control the whole network.
As shown in fig. 1, the video network is divided into an access network and a metropolitan network.
The devices of the access network part can be mainly classified into 3 types: node server, access switch, terminal (including various set-top boxes, coding boards, memories, etc.). The node server is connected to an access switch, which may be connected to a plurality of terminals and may be connected to an ethernet network.
The node server is a node which plays a centralized control function in the access network and can control the access switch and the terminal. The node server can be directly connected with the access switch or directly connected with the terminal.
Similarly, devices of the metropolitan network portion may also be classified into 3 types: a metropolitan area server, a node switch and a node server. The metro server is connected to a node switch, which may be connected to a plurality of node servers.
This node server is the node server of the access network part; that is, the node server belongs to both the access network part and the metropolitan area network part.
The metropolitan area server is a node which plays a centralized control function in the metropolitan area network and can control a node switch and a node server. The metropolitan area server can be directly connected with the node switch or directly connected with the node server.
Therefore, the whole video network is a network structure with layered centralized control, and the network controlled by the node server and the metropolitan area server can be in various structures such as tree, star and ring.
The access network part can form a unified video platform (circled part), and a plurality of unified video platforms can form a video network; each unified video platform may be interconnected via metropolitan area and wide area video networking.
1. Video networking device classification
1.1 Devices in the video network of the embodiment of the present invention can be mainly classified into 3 types: servers, switches (including Ethernet gateways), and terminals (including various set-top boxes, coding boards, memories, etc.). The video network as a whole can be divided into a metropolitan area network (or national network, global network, etc.) and an access network.
1.2 The devices of the access network part can be mainly classified into 3 types: node servers, access switches (including Ethernet gateways), and terminals (including various set-top boxes, coding boards, memories, etc.).
The specific hardware structure of each access network device is as follows:
Node server:
As shown in FIG. 2, the node server mainly includes a network interface module 201, a switching engine module 202, a CPU module 203, and a disk array module 204.
Packets coming from the network interface module 201, the CPU module 203, and the disk array module 204 all enter the switching engine module 202. The switching engine module 202 looks up the address table 205 for each incoming packet to obtain the packet's direction information, and stores the packet in the queue of the corresponding packet buffer 206 according to that direction information; if the queue of the packet buffer 206 is nearly full, the packet is discarded. The switching engine module 202 polls all packet buffer queues and forwards from a queue if the following conditions are met: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero. The disk array module 204 mainly implements control over the hard disk, including initialization, read-write, and other operations on the hard disk. The CPU module 203 is mainly responsible for protocol processing with the access switch and the terminal (not shown in the figure), for configuring the address table 205 (including a downlink protocol packet address table, an uplink protocol packet address table, and a data packet address table), and for configuring the disk array module 204.
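The lookup, buffering and polling behaviour of the switching engine can be modelled roughly as below. The class name, the dictionary-based address table, the numeric thresholds, and the choice to drop packets whose destination is not in the table are all assumptions made for illustration; the source only specifies the drop-when-nearly-full rule and the two forwarding conditions.

```python
from collections import deque


class SwitchingEngine:
    """Toy model of address lookup, per-port packet queues, and polling."""

    def __init__(self, address_table, queue_limit=4, port_buffer_limit=2):
        self.address_table = address_table          # destination -> output port
        self.queue_limit = queue_limit              # "nearly full" threshold (assumed)
        self.port_buffer_limit = port_buffer_limit  # port send buffer size (assumed)
        ports = set(address_table.values())
        self.queues = {port: deque() for port in ports}
        self.port_buffers = {port: [] for port in ports}
        self.dropped = 0

    def receive(self, dest, packet):
        """Look up the direction of a packet and queue it, or drop it."""
        port = self.address_table.get(dest)
        if port is None or len(self.queues[port]) >= self.queue_limit:
            self.dropped += 1
            return False
        self.queues[port].append(packet)
        return True

    def poll(self):
        """One polling pass: forward at most one packet per eligible queue."""
        forwarded = 0
        for port, pending in self.queues.items():
            buffer_has_room = len(self.port_buffers[port]) < self.port_buffer_limit
            if buffer_has_room and len(pending) > 0:  # conditions 1) and 2)
                self.port_buffers[port].append(pending.popleft())
                forwarded += 1
        return forwarded
```

Here `len(pending) > 0` plays the role of the queue packet counter, and `port_buffers` stands in for the port send buffer that real hardware would drain onto the wire.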
Access switch:
As shown in FIG. 3, the access switch mainly includes a network interface module (downlink network interface module 301, uplink network interface module 302), a switching engine module 303, and a CPU module 304.
A packet (uplink data) coming from the downlink network interface module 301 enters the packet detection module 305. The packet detection module 305 checks whether the Destination Address (DA), the Source Address (SA), the packet type, and the packet length of the packet meet the requirements; if so, it allocates a corresponding stream identifier (stream-id) and the packet enters the switching engine module 303; otherwise, the packet is discarded. A packet (downlink data) coming from the uplink network interface module 302 enters the switching engine module 303, as does a data packet coming from the CPU module 304. The switching engine module 303 looks up the address table 306 for each incoming packet to obtain the packet's direction information. If a packet entering the switching engine module 303 is going from a downlink network interface to an uplink network interface, it is stored in the queue of the corresponding packet buffer 307 in association with its stream-id; if the queue of the packet buffer 307 is nearly full, the packet is discarded. If a packet entering the switching engine module 303 is not going from a downlink network interface to an uplink network interface, it is stored in the queue of the corresponding packet buffer 307 according to its direction information; if the queue of the packet buffer 307 is nearly full, the packet is discarded.
The switching engine module 303 polls all packet buffer queues. In this embodiment of the present invention there are two cases:
If the queue is from a downlink network interface to an uplink network interface, a packet is forwarded when the following conditions are met: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero; 3) a token generated by the rate control module has been obtained.
If the queue is not from a downlink network interface to an uplink network interface, a packet is forwarded when the following conditions are met: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero.
The rate control module 308 is configured by the CPU module 304 and generates tokens at programmable intervals for all packet buffer queues going from downlink network interfaces to uplink network interfaces, in order to control the rate of uplink forwarding.
The CPU module 304 is mainly responsible for protocol processing with the node server, configuration of the address table 306, and configuration of the rate control module 308.
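The token-gated uplink forwarding described above might be modelled as in the following sketch. The `UplinkQueue` class, the use of abstract clock ticks as the "programmable interval", and the decision to let unused tokens accumulate are assumptions for illustration only; the source does not say how tokens are stored or whether they expire.

```python
from collections import deque


class UplinkQueue:
    """Packet buffer queue whose forwarding is gated by rate-control tokens."""

    def __init__(self, token_interval):
        self.packets = deque()
        self.token_interval = token_interval  # ticks between tokens (programmable)
        self.tokens = 0
        self._ticks = 0

    def tick(self):
        """Advance the rate-control clock; issue one token every interval."""
        self._ticks += 1
        if self._ticks % self.token_interval == 0:
            self.tokens += 1  # assumption: unused tokens accumulate

    def try_forward(self, port_buffer_full=False):
        """Forward one packet only if all three uplink conditions hold."""
        if port_buffer_full:      # 1) port send buffer must not be full
            return None
        if not self.packets:      # 2) queue packet counter must be > 0
            return None
        if self.tokens == 0:      # 3) a token must be available
            return None
        self.tokens -= 1
        return self.packets.popleft()
```

A larger `token_interval` lowers the uplink forwarding rate; queues that do not run from a downlink interface to an uplink interface would simply skip the token check.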
Ethernet protocol conversion gateway:
As shown in FIG. 4, the gateway mainly includes a network interface module (a downlink network interface module 401 and an uplink network interface module 402), a switching engine module 403, a CPU module 404, a packet detection module 405, a rate control module 408, an address table 406, a packet buffer 407, a MAC adding module 409, and a MAC deleting module 410.
A data packet coming from the downlink network interface module 401 enters the packet detection module 405. The packet detection module 405 checks whether the Ethernet MAC DA, the Ethernet MAC SA, the Ethernet length or frame type, the video network destination address DA, the video network source address SA, the video network packet type, and the packet length of the packet meet the requirements. If so, it allocates a corresponding stream identifier (stream-id); the MAC deleting module 410 then strips the MAC DA, the MAC SA, and the length or frame type (2 bytes), and the packet enters the corresponding receive buffer. Otherwise, the packet is discarded.
The downlink network interface module 401 checks the send buffer of the port. If there is a packet, it obtains the Ethernet MAC DA of the corresponding terminal from the packet's video network destination address DA, adds the Ethernet MAC DA of the terminal, the MAC SA of the Ethernet protocol conversion gateway, and the Ethernet length or frame type, and sends the packet.
The other modules in the Ethernet protocol conversion gateway function similarly to those of the access switch.
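The MAC adding and MAC deleting steps amount to prepending and stripping a 14-byte Ethernet header (6-byte MAC DA, 6-byte MAC SA, 2-byte length or frame type). A minimal sketch follows; the function names are hypothetical, and the frame type value used in the usage note is only a placeholder, since the source does not state which value the gateway uses.

```python
import struct

# Ethernet header: 6-byte destination MAC, 6-byte source MAC, 2-byte length/type.
ETH_HEADER = struct.Struct("!6s6sH")


def add_mac(vn_packet, mac_da, mac_sa, eth_type):
    """Prepend an Ethernet header to a video network packet (MAC adding)."""
    return ETH_HEADER.pack(mac_da, mac_sa, eth_type) + vn_packet


def delete_mac(frame):
    """Strip the Ethernet header and return (header fields, inner packet)."""
    if len(frame) < ETH_HEADER.size:
        raise ValueError("frame shorter than an Ethernet header")
    mac_da, mac_sa, eth_type = ETH_HEADER.unpack_from(frame)
    return (mac_da, mac_sa, eth_type), frame[ETH_HEADER.size:]
```

For example, `add_mac(packet, terminal_mac, gateway_mac, 0x88B5)` wraps a packet for the downlink direction, and `delete_mac` recovers the original bytes on the way up. In the gateway described above, the terminal MAC would be found from the packet's video network destination address.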
Terminal:
The terminal mainly comprises a network interface module, a service processing module and a CPU module. For example, the set-top box mainly comprises a network interface module, a video and audio coding and decoding engine module and a CPU module; the coding board mainly comprises a network interface module, a video and audio coding engine module and a CPU module; the memory mainly comprises a network interface module, a CPU module and a disk array module.
1.3 The devices of the metropolitan area network part can be mainly classified into 3 types: node servers, node switches, and metropolitan area servers. The node switch mainly comprises a network interface module, a switching engine module and a CPU module; the metropolitan area server mainly comprises a network interface module, a switching engine module and a CPU module.
2. Video networking packet definition
2.1 Access network packet definition
The data packet of the access network mainly comprises the following parts: destination Address (DA), Source Address (SA), reserved bytes, payload (pdu), CRC.
As shown in the following table, the data packet of the access network mainly includes the following parts:
DA | SA | Reserved | Payload | CRC
The Destination Address (DA) consists of 8 bytes. The first byte represents the type of the data packet (e.g. the various protocol packets, multicast data packets, unicast data packets, etc.), with at most 256 possibilities; the second to sixth bytes are the metropolitan area network address; and the seventh and eighth bytes are the access network address.
The Source Address (SA) also consists of 8 bytes and is defined in the same way as the Destination Address (DA).
The reserved field consists of 2 bytes.
The length of the payload depends on the type of the packet: it is 64 bytes if the packet is one of the protocol packets and 1056 bytes if the packet is a unicast data packet, although the payload is not limited to these 2 lengths.
The CRC consists of 4 bytes and is calculated in accordance with the standard ethernet CRC algorithm.
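The field layout above can be exercised with a short packing and parsing sketch. Several details here are assumptions not stated in the source: big-endian byte order, a CRC computed over all preceding fields and appended in big-endian order, and the function names themselves. `zlib.crc32` is used because it implements the same CRC-32 polynomial as Ethernet.

```python
import struct
import zlib

ADDRESS = struct.Struct("!B5sH")  # 1-byte type, 5-byte metro address, 2-byte access address
RESERVED = b"\x00\x00"            # 2 reserved bytes


def make_address(packet_type, metro_addr, access_addr):
    """Build an 8-byte DA or SA."""
    if len(metro_addr) != 5:
        raise ValueError("metropolitan area network address must be 5 bytes")
    return ADDRESS.pack(packet_type, metro_addr, access_addr)


def build_packet(da, sa, payload):
    """Assemble DA | SA | Reserved | Payload | CRC."""
    body = da + sa + RESERVED + payload
    # Assumption: CRC-32 over everything before it, stored big-endian.
    return body + struct.pack("!I", zlib.crc32(body))


def parse_packet(packet):
    """Verify the CRC and split the packet into its fields."""
    body = packet[:-4]
    (crc,) = struct.unpack("!I", packet[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch")
    packet_type, metro, access = ADDRESS.unpack(body[:8])
    return {
        "type": packet_type,
        "metro": metro,
        "access": access,
        "sa": body[8:16],
        "payload": body[18:],
    }
```

With a 64-byte protocol payload the assembled packet is 8 + 8 + 2 + 64 + 4 = 86 bytes long; with a 1056-byte unicast payload it is 1078 bytes.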
2.2 metropolitan area network packet definition
The topology of a metropolitan area network is a graph, and there may be 2 or even more connections between two devices; that is, there may be more than 2 connections between a node switch and a node server or between two node switches. However, the metropolitan area network address of a metropolitan area network device is unique. In order to describe the connection relationships between metropolitan area network devices accurately, the embodiment of the present invention introduces a parameter, the label, which uniquely identifies a connection of a metropolitan area network device.
In this specification, the definition of the label is similar to that of a label in Multi-Protocol Label Switching (MPLS). Assuming that there are two connections between a device A and a device B, there are 2 labels for packets from device A to device B and 2 labels for packets from device B to device A. Labels are classified into incoming labels and outgoing labels: assuming that the label (incoming label) of a packet entering device A is 0x0000, the label (outgoing label) of the packet when it leaves device A may become 0x0001. The network access process of the metropolitan area network is centrally controlled; that is, both address allocation and label allocation in the metropolitan area network are directed by the metropolitan area server, and the node switches and node servers merely carry them out passively. This differs from MPLS, where label allocation is the result of negotiation between the switch and the server.
As shown in the following table, the data packet of the metro network mainly includes the following parts:
DA | SA | Reserved | Label | Payload | CRC
Namely, Destination Address (DA), Source Address (SA), reserved bytes (Reserved), label, payload (PDU), and CRC. The format of the label may be defined as follows: the label is 32 bits, of which the upper 16 bits are reserved and only the lower 16 bits are used, and it is located between the reserved bytes and the payload of the packet.
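As a minimal sketch of the 32-bit label field described above, the following Python packs a 16-bit label into 4 bytes with the upper 16 bits left at zero, and masks those bits off again when reading. Big-endian (network) byte order and the function names are assumptions; the text does not state the byte order.

```python
import struct


def pack_label(label: int) -> bytes:
    """Encode a label as a 32-bit field whose upper 16 bits are reserved (zero)."""
    if not 0 <= label <= 0xFFFF:
        raise ValueError("only the lower 16 bits of the label are used")
    return struct.pack(">I", label)  # big-endian is an assumption


def unpack_label(field: bytes) -> int:
    """Decode the 32-bit field and ignore the reserved upper 16 bits."""
    (value,) = struct.unpack(">I", field)
    return value & 0xFFFF
```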
Based on the above characteristics of the video network, one of the core concepts of the embodiment of the invention is proposed: following the Ethernet protocol and the video network protocol, the Ethernet terminal transmits the audio and video data it acquires in real time, through the Ethernet server, to the video network terminal in the video network, so that the audio and video data acquired in real time by the Ethernet terminal is displayed on the video network terminal.
Referring to fig. 5, a flowchart illustrating steps of an embodiment of a method for processing audio and video data according to the present invention is shown, where the method may be applied to a video network and an ethernet network, where the video network includes a video network terminal, the ethernet network includes an ethernet server and an ethernet terminal, and the ethernet server is respectively connected to the video network terminal and the ethernet terminal, and the method may specifically include the following steps:
Step 501, the Ethernet terminal transmits the audio and video data acquired in real time to a first serial queue.
In the embodiment of the present invention, the Ethernet terminal may be an intelligent terminal in an Ethernet network, such as a smart phone, a tablet computer, and the like. When the Ethernet terminal acquires the audio and video data in real time, it may acquire the audio data and the video data separately in real time and then combine them to form the audio and video data.
The first serial queue in the embodiment of the present invention may be located in an ethernet terminal, and the first serial queue is used for temporarily storing audio and video data. When the ethernet terminal transmits the audio and video data acquired in real time to the first serial queue, the audio data acquired in real time and the video data acquired in real time can be synchronously transmitted to the first serial queue. For example, when the ethernet terminal z1 acquires the audio data s1 at time point t1 and the video data v1 at time point t1, the ethernet terminal z1 transmits the audio data s1 and the video data v1 acquired at time point t1 to the first serial queue. Moreover, in the first serial queue, the audio data and the video data transmitted synchronously still have a synchronous relationship.
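One way to picture the first serial queue is a FIFO whose items each carry the audio and video captured at the same time point. The sketch below is an assumption about how the synchronous relationship could be preserved (one record per time point); the `Capture` type and `enqueue_capture` helper are hypothetical and do not come from the specification.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    """Audio and video acquired at the same time point (hypothetical record)."""
    timestamp_ms: int
    pcm: bytes  # raw audio, e.g. s1
    yuv: bytes  # raw video, e.g. v1


def enqueue_capture(q: "queue.Queue", timestamp_ms: int, pcm: bytes, yuv: bytes) -> None:
    """Put both streams into the queue as one item so they stay paired."""
    q.put(Capture(timestamp_ms, pcm, yuv))


# Mirrors the example in the text: s1 and v1 captured at t1 enter the queue together.
first_queue: "queue.Queue" = queue.Queue()
enqueue_capture(first_queue, 1, b"s1", b"v1")
enqueue_capture(first_queue, 2, b"s2", b"v2")
```

Since `queue.Queue` is first-in, first-out, items leave the queue in capture order, and each item still holds its matching audio and video.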
In a preferred embodiment of the present invention, before the Ethernet terminal transmits the audio and video data acquired in real time to the first serial queue, the Ethernet terminal acquires the audio and video data in real time through a preset audio and video data capture class and a preset audio and video data acquisition device acquisition class. The capture class is used for acquiring audio and video data through the audio and video data acquisition devices of the Ethernet terminal, and the device acquisition class is used for acquiring attribute information of those devices. For example, taking an Ethernet terminal that is a smart phone running iOS: the preset audio and video data capture class is AVCaptureSession, the preset audio and video data acquisition device acquisition class is AVCaptureDevice, and the audio and video data acquisition devices may include a camera, a microphone, and the like. The Ethernet terminal can acquire the original audio and video data through AVCaptureSession and AVCaptureDevice.
In a preferred embodiment of the present invention, before the Ethernet terminal acquires the audio and video data in real time through the preset audio and video data capture class and the audio and video data acquisition device, the Ethernet terminal establishes a network connection with the Ethernet server and maintains a long connection with the Ethernet server through a socket interface. The Ethernet server may be a streaming media server, which serves as the bridge for audio and video data transmission services between the video network and the Ethernet and achieves seamless integration of video network services and Ethernet services: it can securely bring various audio and video resources in the Ethernet into the video network, and it can convert the different audio and video streams in the video network (video conferences, monitoring images, digital television, and the like) into audio and video data supporting the standard Ethernet protocol and output that data to the Ethernet. The Ethernet terminal can log in to the streaming media server by user name and password verification, and after logging in can keep a long connection with the streaming media server through the socket interface. A long connection is a connection mode in which multiple data packets can be sent continuously over one connection; while the connection is held, if no data packet is being sent, both sides need to send link detection packets.
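The link detection behaviour of a long connection can be sketched as a small timer: whenever nothing has been sent for a while, a detection packet is due. The idle limit, the `b"PING"` packet contents, and the `KeepAlive` class are illustrative assumptions; the specification only says that detection packets are sent when no data is flowing.

```python
from typing import Optional


class KeepAlive:
    """Decides when an idle long connection should send a link detection packet."""

    def __init__(self, idle_limit_s: float) -> None:
        self.idle_limit_s = idle_limit_s
        self.last_sent_s = 0.0

    def on_data_sent(self, now_s: float) -> None:
        # Any real data packet resets the idle timer.
        self.last_sent_s = now_s

    def poll(self, now_s: float) -> Optional[bytes]:
        """Return a detection packet if the link has been idle too long, else None."""
        if now_s - self.last_sent_s >= self.idle_limit_s:
            self.last_sent_s = now_s
            return b"PING"  # placeholder packet contents
        return None
```

In use, the sending loop would call `on_data_sent` after each media packet and `poll` periodically, writing any returned bytes to the socket.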
Step 502, the ethernet terminal encodes audio data in the audio and video data into audio frames in a first audio format in the first serial queue, and encodes video data having a synchronization relationship with the audio data in the audio and video data into video frames in the first video format.
In the embodiment of the present invention, the audio data acquired by the Ethernet terminal in real time may be in Pulse Code Modulation (PCM) format, and the video data acquired by the Ethernet terminal in real time may be in YUV format, where "Y" represents luminance and "U" and "V" are the two chrominance (color-difference) components.
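To give a feel for the raw formats named above, the following sketch computes the size of uncompressed PCM audio and of one YUV frame. The specification gives no sample rate, resolution, or chroma subsampling; the 4:2:0 layout assumed in `yuv420_frame_bytes` and the figures in the usage note are illustrative assumptions only.

```python
def pcm_bytes_per_second(sample_rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Raw PCM data rate: one sample per channel per sampling period."""
    return sample_rate_hz * channels * bits_per_sample // 8


def yuv420_frame_bytes(width: int, height: int) -> int:
    """Size of one 8-bit YUV 4:2:0 frame: a full-resolution Y plane
    plus U and V planes at a quarter of the resolution each."""
    return width * height * 3 // 2
```

For instance, 16-bit stereo audio at 44100 Hz is 176400 bytes per second, and a single 1280x720 frame is 1382400 bytes, which suggests why the frames are encoded before being sent over the network.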
When the ethernet terminal encodes the Audio and video data in the first serial queue, the Audio data in the PCM format in the Audio and video data may be encoded into an Audio frame in an Advanced Audio Coding (AAC) format, and the video data in the YUV format having a synchronization relationship with the Audio data in the PCM format in the Audio and video data may be encoded into a video frame in an h.264 format. For example, the ethernet terminal encodes audio data s1 in PCM format into audio frames sz1 in AAC format, and encodes video data v1 in YUV format having a synchronous relationship with the audio data s1 in PCM format into video frames vz1 in h.264 format in the first serial queue.
Step 503, the Ethernet terminal transmits the audio frame and the video frame to the second serial queue.
In this embodiment of the present invention, a second serial queue may be located in the ethernet terminal, where the second serial queue is used to temporarily store audio frames and video frames, and the audio frames and the video frames in the second serial queue are sent to the ethernet server.
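Steps 501 to 503 can be sketched as a two-stage pipeline in which a single worker drains the first queue in order, encodes each item, and places the result in the second queue. This is a sketch under stated assumptions: `fake_aac` and `fake_h264` are placeholders standing in for real AAC and H.264 encoders, and modelling a "serial queue" as a FIFO served by one worker thread is an interpretation rather than something the specification spells out.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the stream


def fake_aac(pcm: bytes) -> bytes:
    """Placeholder for a real AAC encoder."""
    return b"AAC:" + pcm


def fake_h264(yuv: bytes) -> bytes:
    """Placeholder for a real H.264 encoder."""
    return b"H264:" + yuv


def encode_worker(first_q: queue.Queue, second_q: queue.Queue) -> None:
    """Take items from the first queue one at a time, encode, and forward them."""
    while True:
        item = first_q.get()
        if item is STOP:
            second_q.put(STOP)
            return
        timestamp, pcm, yuv = item
        # Audio and video from the same capture stay together after encoding.
        second_q.put((timestamp, fake_aac(pcm), fake_h264(yuv)))


def run_pipeline(captures):
    """Feed captures through both queues and return the encoded frames in order."""
    first_q, second_q = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=encode_worker, args=(first_q, second_q))
    worker.start()
    for capture in captures:
        first_q.put(capture)
    first_q.put(STOP)
    frames = []
    while True:
        item = second_q.get()
        if item is STOP:
            break
        frames.append(item)
    worker.join()
    return frames
```

Using one worker per queue keeps the output order equal to the capture order, so no reordering step is needed before sending.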
In a preferred embodiment of the present invention, when the ethernet terminal transmits the audio frame and the video frame to the second serial queue, the ethernet terminal may encapsulate the audio frame and the video frame according to the requirement of the preset format, and transmit the audio/video data packet obtained by encapsulation to the second serial queue. For example, the ethernet terminal may encapsulate the audio frame and the video frame into a UDP packet according to a User Datagram Protocol (UDP) format requirement, and send the UDP packet to the second serial queue.
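The encapsulation step can be illustrated with a small fixed header placed in front of each encoded frame before it is handed to a UDP socket. The header layout here (media type, timestamp, payload length) is entirely hypothetical; the specification only states that frames are encapsulated according to a preset format such as UDP.

```python
import struct

# Hypothetical header: media type (1 byte), timestamp in ms (8 bytes),
# payload length (4 bytes), all big-endian with no padding.
HEADER = struct.Struct(">BQI")
AUDIO, VIDEO = 0, 1


def encapsulate(media_type: int, timestamp_ms: int, frame: bytes) -> bytes:
    """Prefix an encoded frame with the header to form one datagram payload."""
    return HEADER.pack(media_type, timestamp_ms, len(frame)) + frame


def parse(packet: bytes):
    """Split a datagram payload back into (media_type, timestamp_ms, frame)."""
    media_type, timestamp_ms, length = HEADER.unpack_from(packet)
    frame = packet[HEADER.size:HEADER.size + length]
    if len(frame) != length:
        raise ValueError("truncated packet")
    return media_type, timestamp_ms, frame
```

Carrying the timestamp in every packet gives the receiving side what it needs to match an audio frame with the video frame captured at the same time.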
Step 504, the Ethernet terminal synchronously sends the audio frames and the video frames in the second serial queue to the Ethernet server.
In the embodiment of the invention, the Ethernet server is used for pushing the received audio frames and video frames to the video networking terminal in real time. The ethernet server may be connected to one or more video networking terminals, and the video networking terminals may be smart devices in the video networking, such as a smart phone, a set-top box, and the like. The video network terminal is used for synchronously analyzing and playing the received audio frame and video frame, and specifically, the video network terminal can analyze and play according to the format requirement after the audio frame and the video frame are packaged.
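On the receiving side, synchronous parsing and playing can be sketched as pairing audio and video frames that carry the same timestamp. The specification does not say how the video network terminal matches frames; the timestamp-based pairing and the `SyncPlayer` class below are assumptions made for illustration.

```python
class SyncPlayer:
    """Holds frames until the matching audio/video partner arrives, then plays both."""

    def __init__(self) -> None:
        self._audio = {}
        self._video = {}
        self.played = []  # (timestamp_ms, audio_frame, video_frame) in play order

    def on_frame(self, media_type: str, timestamp_ms: int, payload: bytes) -> None:
        is_audio = media_type == "audio"
        mine, other = (self._audio, self._video) if is_audio else (self._video, self._audio)
        if timestamp_ms in other:
            # The partner is already waiting: release the pair together.
            partner = other.pop(timestamp_ms)
            audio, video = (payload, partner) if is_audio else (partner, payload)
            self.played.append((timestamp_ms, audio, video))
        else:
            mine[timestamp_ms] = payload
```

A frame that arrives first simply waits; nothing is played until both halves of the same time point are present.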
Based on the above description of the method embodiment, a method for implementing a live broadcast function on an iOS device based on the video network is introduced below. As shown in fig. 6, the iOS device logs in to the streaming media server and keeps a long connection, and collects audio data in PCM format and video data in YUV format in real time. The iOS device encodes the audio data in PCM format into audio frames in AAC format by AAC hardware encoding, and encodes the video data in YUV format into video frames in H.264 format by H.264 hardware encoding. The iOS device then sends the audio frames and the video frames to the streaming media server through the socket interface, and the streaming media server pushes the received audio frames and video frames to the streaming media client, where they are parsed and played.
The embodiment of the invention is applied to the video network and the Ethernet, wherein the video network comprises a video network terminal, the Ethernet comprises an Ethernet server and an Ethernet terminal, and the Ethernet server is respectively connected with the video network terminal and the Ethernet terminal.
In the embodiment of the invention, the Ethernet terminal transmits the audio and video data acquired in real time to the first serial queue, encodes the audio data in the audio and video data into the audio frame in the first audio format in the first serial queue, and encodes the video data which has a synchronous relation with the audio data in the audio and video data into the video frame in the first video format. And the Ethernet terminal transmits the audio frame and the video frame to the second serial queue and synchronously transmits the audio frame and the video frame in the second serial queue to the Ethernet server. The Ethernet server is used for pushing the received audio frames and video frames to the video network terminal in real time, and the video network terminal is used for synchronously analyzing and playing the received audio frames and video frames.
The embodiment of the invention applies the characteristics of the Ethernet and the characteristics of the video network, the Ethernet terminal in the Ethernet respectively encodes and transmits audio and video data in a local first serial queue and a local second serial queue, and then the audio and video data acquired by the Ethernet terminal in real time are transmitted to the Ethernet server in the Ethernet, and the Ethernet server pushes the received audio and video data to the video network terminal in the video network, so that the audio and video data acquired by the Ethernet terminal in real time are displayed on the video network terminal, and the Ethernet terminal can perform audio and video live broadcast on the video network terminal through the video network.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or a combination of acts, but those skilled in the art will recognize that the present invention is not limited by the described order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the embodiments of the invention.
Referring to fig. 7, a block diagram of a structure of an embodiment of the processing system for audio and video data according to the present invention is shown, where the system may be applied in a video network and an ethernet network, the video network includes a video network terminal, the ethernet network includes an ethernet server and an ethernet terminal, the ethernet server is respectively connected to the video network terminal and the ethernet terminal, and the ethernet terminal in the system may specifically include the following modules:
The transmission module 701 is configured to transmit the audio and video data acquired by the Ethernet terminal in real time to the first serial queue.
The encoding module 702 is configured to encode, in the first serial queue, audio data in the audio and video data into audio frames in a first audio format, and encode video data in the audio and video data that has a synchronization relationship with the audio data into video frames in a first video format.
The transmission module 701 is further configured to transmit the audio frame and the video frame to the second serial queue.
The sending module 703 is configured to send the audio frames and the video frames in the second serial queue to the Ethernet server synchronously, where the Ethernet server is configured to push the received audio frames and video frames to the video networking terminal in real time, and the video networking terminal is configured to perform synchronous parsing and playing on the received audio frames and video frames.
In a preferred embodiment of the present invention, the encoding module 702 is configured to encode, in the first serial queue, the audio data in Pulse Code Modulation (PCM) format in the audio and video data into audio frames in Advanced Audio Coding (AAC) format, and encode the video data in YUV format that has a synchronization relationship with the audio data in PCM format into video frames in H.264 format.
In a preferred embodiment of the present invention, the transmission module 701 is configured to encapsulate the audio frame and the video frame according to a preset format requirement, and transmit the audio and video data packet obtained by encapsulation to the second serial queue.
In a preferred embodiment of the present invention, the Ethernet terminal further comprises: an acquisition module 704, configured to acquire the audio and video data in real time through a preset audio and video data capture class and a preset audio and video data acquisition device acquisition class before the transmission module 701 transmits the audio and video data acquired in real time to the first serial queue; the capture class is used for acquiring audio and video data through the audio and video data acquisition devices of the Ethernet terminal, and the device acquisition class is used for acquiring attribute information of the audio and video data acquisition devices of the Ethernet terminal.
In a preferred embodiment of the present invention, the Ethernet terminal further comprises: a connection module 705, configured to establish a network connection with the Ethernet server before the acquisition module 704 acquires the audio and video data in real time through the preset audio and video data capture class and the audio and video data acquisition device, and to maintain a long connection with the Ethernet server through a socket interface.
Since the system embodiment is basically similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding description of the method embodiment.
The embodiments in the present specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The method for processing audio and video data and the system for processing audio and video data provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and the implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method and the core idea of the invention. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the scope of application according to the idea of the present invention. In summary, the content of the present specification should not be construed as a limitation on the present invention.

Claims (10)

CN201810827620.0A | 2018-07-25 | 2018-07-25 | Audio and video data processing method and system | Withdrawn | CN110769297A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810827620.0A, CN110769297A (en) | 2018-07-25 | 2018-07-25 | Audio and video data processing method and system

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201810827620.0A, CN110769297A (en) | 2018-07-25 | 2018-07-25 | Audio and video data processing method and system

Publications (1)

Publication Number | Publication Date
CN110769297A (en) | 2020-02-07

Family

ID=69327256

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201810827620.0A (Withdrawn), CN110769297A (en) | Audio and video data processing method and system | 2018-07-25 | 2018-07-25

Country Status (1)

Country | Link
CN (1) | CN110769297A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110545447A (en) * | 2019-07-31 | 2019-12-06 | 视联动力信息技术股份有限公司 | Method and device for synchronizing audio and video
CN111669364A (en) * | 2020-04-26 | 2020-09-15 | 视联动力信息技术股份有限公司 | A method, device, electronic device and medium for data transmission
CN111669364B (en) * | 2020-04-26 | 2023-09-12 | 视联动力信息技术股份有限公司 | Data transmission method, device, electronic equipment and medium
CN113347468A (en) * | 2021-04-21 | 2021-09-03 | 深圳市乐美客视云科技有限公司 | Audio and video transmission method and device based on Ethernet frame and storage medium
CN113347468B (en) * | 2021-04-21 | 2023-01-13 | 深圳市乐美客视云科技有限公司 | Audio and video transmission method and device based on Ethernet frame and storage medium
CN117792555A (en) * | 2023-12-29 | 2024-03-29 | 深圳市汇顶科技股份有限公司 | A vehicle audio data transmission method, equipment, chip and system

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
WW01 | Invention patent application withdrawn after publication

Application publication date: 2020-02-07

