CN104778957B


Info

Publication number: CN104778957B
Application number: CN201510125087.XA
Authority: CN
Inventors: 程胜
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2015-03-20
Filing date: 2015-03-20
Publication date: 2018-03-02
Anticipated expiration: 2035-03-20
Also published as: CN104778957A

Abstract

An embodiment of the present invention discloses a method for song audio processing. The method comprises: acquiring N song audio files, where N is an integer greater than or equal to 1; analyzing the N song audio files to determine M sentence segments in the N song audio files; extracting the M sentence segments from the N song audio files, where M is an integer greater than 1; and splicing the M sentence segments in a preset order to obtain a spliced song audio file. An embodiment of the present invention also provides a song audio processing device. With the embodiments of the present invention, the sentences of a song can be cut into independent sentences and the independent sentences can be spliced into a song, which offers high processing efficiency and is also entertaining.

Description

Translated from Chinese

Method and device for song audio processing

Technical field

The embodiments of the present invention relate to the technical field of audio processing, and in particular to a method and device for song audio processing.

Background

With the rapid development of mobile Internet technology, the demand for music on devices (such as mobile phones, tablet computers, and the touch) and on dedicated players keeps growing. At present, music playback functions in the prior art are limited to improving sound quality, for example by using the processing software built into a device or dedicated player to process low-quality audio files appropriately and thereby improve playback quality. Alternatively, they place high demands on the quality of the audio file itself, so that genuine audio files tend to play back better. Partial processing of audio files, such as cutting and extraction techniques, has received little study.

In the prior art, cutting songs relies mainly on online software, which usually requires manual operation and cannot precisely locate the exact position of each lyric line.

Summary of the invention

Embodiments of the present invention provide a method and device for song audio processing that can cut and splice songs.

A first aspect of the embodiments of the present invention provides a method for song audio processing, comprising:

acquiring N song audio files, where N is an integer greater than or equal to 1;

analyzing the N song audio files to determine M sentence segments in the N song audio files;

extracting the M sentence segments from the N song audio files, where M is an integer greater than 1; and

splicing the M sentence segments in a preset order to obtain a spliced song audio file.

In a first possible implementation, the original singing parts of the N song audio files are extracted; the start time and end time of each voice segment in the original singing parts of the N song audio files are determined, where the start time and end time of each voice segment in the original singing parts are the start time and end time of each sentence segment in the N song audio files; and the N song audio files are cut according to the start time and end time of each voice segment in their original singing parts, to obtain the M sentence segments in the N song audio files.

In a second possible implementation, the accompaniment parts of the N song audio files are extracted; the start time and end time of each tune segment in the accompaniment parts of the N song audio files are determined, where the start time and end time of each voice segment in the accompaniment parts of the N song audio files are the start time and end time of each sentence segment in the N song audio files; and the N song audio files are cut according to the start time and end time of each voice segment in the original singing parts of the M song audio files, to obtain the M sentence segments in the N song audio files.

With reference to the first or second possible implementation of the first aspect, in a third possible implementation, before the N song audio files are acquired, the N song audio files are converted into audio files of a preset format.

With reference to the first or second possible implementation of the first aspect, in a fourth possible implementation, after the M sentence segments are spliced in the preset order to obtain the spliced song audio file, the splice positions of the spliced song audio file are locked, and the splice positions are processed to obtain a seamlessly spliced song audio file.

A second aspect of the embodiments of the present invention provides a song audio processing device, comprising:

an acquisition unit, configured to acquire N song audio files, where N is an integer greater than or equal to 1;

an analysis unit, configured to analyze the N song audio files acquired by the acquisition unit to determine M sentence segments in the N song audio files;

a first extraction unit, configured to extract the M sentence segments from the N song audio files, where M is an integer greater than 1; and

a splicing unit, configured to splice the M sentence segments extracted by the first extraction unit in a preset order to obtain a spliced song audio file.

In a first possible implementation, the analysis unit includes: a second extraction unit, configured to extract the original singing parts of the N song audio files; a first determination unit, configured to determine the start time and end time of each voice segment in the original singing parts of the N song audio files extracted by the second extraction unit, where the start time and end time of each voice segment in the original singing parts are the start time and end time of each sentence segment in the N song audio files; and a first cutting unit, configured to cut the N song audio files according to the start time and end time, determined by the first determination unit, of each voice segment in the original singing parts of the N song audio files, to obtain the M sentence segments in the song audio files.

In a second possible implementation, the analysis unit includes: a third extraction unit, configured to extract the accompaniment parts of the N song audio files; a second determination unit, configured to determine the start time and end time of each tune segment in the accompaniment parts of the N song audio files extracted by the third extraction unit, where the start time and end time of each voice segment in the accompaniment parts of the N song audio files are the start time and end time of each sentence segment in the N song audio files; and a second cutting unit, configured to cut the N song audio files according to the start time and end time, determined by the second determination unit, of each voice segment in the original singing parts of the M song audio files, to obtain the M sentence segments in the N song audio files.

With reference to the first or second possible implementation of the second aspect, in a third possible implementation, the device further includes a conversion unit, configured to convert the N song audio files into audio files of a preset format before the acquisition unit acquires the N song audio files.

With reference to the first or second possible implementation of the second aspect, in a fourth possible implementation, the device further includes: a locking unit, configured to lock the splice positions of the spliced song audio file after the splicing unit has spliced the M sentence segments extracted by the first extraction unit in the preset order to obtain the spliced song audio file; and a processing unit, configured to process the splice positions locked by the locking unit to obtain a seamlessly spliced song audio file.

Implementing the embodiments of the present invention has the following beneficial effects:

In the embodiments of the present invention, N song audio files are acquired, where N is an integer greater than or equal to 1; the N song audio files are analyzed to determine M sentence segments in the N song audio files; the M sentence segments are extracted from the N song audio files, where M is an integer greater than 1; and the M sentence segments are spliced in a preset order to obtain a spliced song audio file. With the embodiments of the present invention, the sentences of a song can be cut into sentence segments and the sentence segments can be spliced into a song audio file, which offers high processing efficiency and is also entertaining.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a first embodiment of a method for song audio processing provided by an embodiment of the present invention;

FIG. 2 is a schematic flowchart of a second embodiment of a method for song audio processing provided by an embodiment of the present invention;

FIG. 3 is a schematic flowchart of a third embodiment of a method for song audio processing provided by an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a first embodiment of a song audio processing device provided by an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a second embodiment of a song audio processing device provided by an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the embodiments of the present invention.

The terms "first", "second", "third", "fourth" and the like in the description, the claims, and the drawings of the present invention are used to distinguish different objects, not to describe a specific order. In addition, the terms "include" and "have", and any variations of them, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units; it optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.

In the embodiments of the present invention, the device may include, but is not limited to: a notebook computer, mobile phone, tablet computer, smart wearable device, player, MP3 player, MP4 player, smart TV, set-top box, server, and so on. The system of the device is its operating system, which may include, but is not limited to: Android, Symbian, Windows, iOS (the mobile operating system developed by Apple Inc.), and so on. An Android device is a device running the Android system, a Symbian device is a device running the Symbian system, and so on. The devices listed are examples only and are not exhaustive.

In the embodiments of the present invention, song audio files may include, but are not limited to: Chinese song audio files, English song audio files, Russian song audio files, Spanish song audio files, classical song audio files, pop song audio files, rock song audio files, light music song audio files, rap song audio files, a cappella song audio files, song audio files taken from videos, and so on. The songs listed are examples only and are not exhaustive.

The format of a song may include, but is not limited to: MP3, MP4, WMV, WAV, FLV, and so on. The formats listed are examples only and are not exhaustive.

A method and device for song audio processing provided by the embodiments of the present invention are described below with reference to FIG. 1 to FIG. 5.

Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of a method for song audio processing provided by an embodiment of the present invention. The method described in this embodiment comprises the following steps:

S101. Acquire N song audio files.

Specifically, the N song audio files may be acquired from the song audio processing device itself, or from another device, for example a storage device mounted on a Blu-ray player, or a mobile terminal. The mobile terminal may be, for example, a mobile phone, tablet computer, notebook computer, handheld computer, mobile Internet device (MID), wearable device (such as a smart watch (e.g., iwatch), smart band, or pedometer), or another terminal device on which an instant messaging client can be installed. The files may also be acquired from the cloud or from a network. N is an integer greater than or equal to 1.

S102. Analyze the N song audio files to determine M sentence segments in the N song audio files.

Specifically, the M sentence segments in the N song audio files may be determined in several ways: by analyzing the lyric information carried by the N song audio files; by receiving the audio content of the N song audio files and locating the gaps between lyric lines in that audio content; or by analyzing the N song audio files in some other way.
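The first option in S102, reading sentence boundaries from lyric information, can be sketched as follows. The text does not name a lyric format, so this sketch assumes, purely for illustration, LRC-style lines with `[mm:ss.xx]` timestamps; the function name `sentence_segments_from_lrc` and the `total_duration` parameter are hypothetical.

```python
import re

# Matches one LRC-style timestamp such as [01:23.45] followed by lyric text.
# (Assumed format; the patent only says "lyric information".)
_LRC_LINE = re.compile(r"\[(\d+):(\d+(?:\.\d+)?)\](.*)")


def sentence_segments_from_lrc(lrc_text, total_duration):
    """Return (start, end, text) tuples, one per lyric line.

    Each sentence is assumed to end where the next one starts; the last
    sentence ends at total_duration (in seconds).
    """
    entries = []
    for line in lrc_text.splitlines():
        match = _LRC_LINE.match(line.strip())
        if not match:
            continue  # skip metadata tags such as [ti:...] and blank lines
        minutes, seconds, text = match.groups()
        text = text.strip()
        if text:
            entries.append((int(minutes) * 60 + float(seconds), text))
    entries.sort(key=lambda entry: entry[0])

    segments = []
    for i, (start, text) in enumerate(entries):
        end = entries[i + 1][0] if i + 1 < len(entries) else total_duration
        segments.append((start, end, text))
    return segments
```

Treating the next line's timestamp as the end of the current sentence is a simplification; instrumental gaps between lines would then be included in the preceding segment.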

S103. Extract the M sentence segments from the N song audio files.

Specifically, the M sentence segments are extracted according to the M sentence segments determined in the N song audio files, where M is an integer greater than 1 and each sentence segment is an independent sentence.

S104. Splice the M sentence segments in a preset order to obtain a spliced song audio file.

Specifically, the M extracted sentence segments are spliced in the preset order to obtain a spliced song. The preset order may include, but is not limited to: chronological order, random order, an order set in advance by the user, and so on.
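The splicing step S104 with its three example orders can be illustrated with a small sketch. It assumes the segments have already been decoded into in-memory sample lists; the `SentenceSegment` type, the `splice` function, and its parameters are hypothetical names introduced here, not part of the described method.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SentenceSegment:
    song_index: int       # which of the N songs the segment came from
    start: float          # start time in seconds within that song
    samples: List[float]  # decoded audio samples of the segment


def splice(segments, order="chronological", user_key=None, seed=None):
    """Concatenate the segments in a preset order and return the samples."""
    if order == "chronological":
        ordered = sorted(segments, key=lambda s: (s.song_index, s.start))
    elif order == "random":
        ordered = list(segments)
        random.Random(seed).shuffle(ordered)
    elif order == "user":
        # user_key is a caller-supplied sort key standing in for
        # "an order set in advance by the user".
        ordered = sorted(segments, key=user_key)
    else:
        raise ValueError(f"unknown order: {order}")

    spliced = []
    for segment in ordered:
        spliced.extend(segment.samples)
    return spliced
```

Here "chronological" is interpreted as song by song, then by start time within each song; other readings of the term are possible.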

As a possible implementation, when splicing multiple songs, consecutive sentences with the same repetition degree in the preset constituent part are merged into paragraphs according to the change rule, where a sentence is an independent sentence in each song. The repetition degree of the sentences in a song may be calculated as follows: count the total number of sentences in the song, calculate the probability of each sentence, sum the probabilities of identical sentences, and define the probability of each of those identical sentences as the sum. It is then determined whether the preset constituent part contains a paragraph whose repetition degree exceeds a preset threshold; the preset threshold may be defined by calculating the probability of each sentence and choosing a threshold greater than that per-sentence probability. If the preset constituent part contains a paragraph whose repetition degree exceeds the preset threshold, the start time and end time corresponding to that paragraph are obtained. The song audio file is then cut according to the start time and end time to obtain the cut paragraph corresponding to them, namely the climax part.
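The repetition-degree calculation described above can be sketched as follows, assuming the sentences of one song are available as text strings in sung order. The function names are hypothetical, and mapping the returned sentence indexes back to start and end times is left to the caller.

```python
from collections import Counter


def repetition_degrees(sentences):
    """Return, for each sentence, its summed probability within one song."""
    total = len(sentences)
    counts = Counter(sentences)
    # Each occurrence has probability 1/total; identical sentences are summed,
    # and every occurrence is assigned that sum.
    return [counts[s] / total for s in sentences]


def repeated_paragraphs(sentences, threshold):
    """Group consecutive sentences of equal degree above the threshold.

    Returns inclusive (first_index, last_index) pairs.
    """
    degrees = repetition_degrees(sentences)
    paragraphs, start = [], None
    for i, degree in enumerate(degrees):
        same_run = start is not None and degree == degrees[start]
        if degree > threshold and not same_run:
            if start is not None:
                paragraphs.append((start, i - 1))  # close the previous run
            start = i
        elif degree <= threshold and start is not None:
            paragraphs.append((start, i - 1))
            start = None
    if start is not None:
        paragraphs.append((start, len(degrees) - 1))
    return paragraphs
```

For a six-sentence song, a sentence sung once has degree 1/6, so any threshold slightly above 1/6 keeps only sentences that occur at least twice; the start time of the first sentence and the end time of the last sentence in a returned pair would then bound the candidate climax paragraph.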

Optionally, before the N song audio files are acquired, they are converted into audio files of a preset format. Song audio files in different formats usually use different encodings, which is why different format types exist. To make splicing fast, the N song audio files may be converted into audio files of a preset format before they are acquired, for example by converting all song audio files to MP3 or WAV.

As a possible implementation, the N song audio files may instead be converted into audio files of the preset format after they are acquired.

Optionally, after the M sentence segments are spliced in the preset order to obtain the spliced song audio file, the splice positions of the spliced song audio file are locked and then processed to obtain a seamlessly spliced song audio file, so that the spliced song sounds seamless.

Specifically, the splice positions of the spliced song audio file are processed to obtain a seamlessly spliced song audio file. The processing may include, but is not limited to: adjusting the tune at the splice positions, inserting a tune with a similar degree of variation, and smoothing splice positions where the amplitude changes.
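The text does not specify how a splice position is smoothed. One common option, shown here only as an assumed example, is a linear crossfade in which the end of one segment fades out while the start of the next fades in. The function name `crossfade` and the `overlap` parameter are hypothetical.

```python
def crossfade(left, right, overlap):
    """Join two sample lists, blending `overlap` samples at the seam."""
    overlap = min(overlap, len(left), len(right))
    if overlap == 0:
        return list(left) + list(right)

    head, tail = left[:-overlap], left[-overlap:]
    blended = []
    for i in range(overlap):
        weight = (i + 1) / (overlap + 1)  # ramps from near 0 to near 1
        blended.append(tail[i] * (1.0 - weight) + right[i] * weight)
    return list(head) + blended + list(right[overlap:])
```

Because the two segments overlap, the result is `overlap` samples shorter than a plain concatenation; a short overlap (tens of milliseconds) is usually enough to avoid an audible click.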

Referring to FIG. 2, FIG. 2 is a schematic flowchart of a second embodiment of a method for song audio processing provided by an embodiment of the present invention, comprising the following steps:

S201. Extract the original singing parts of the N song audio files.

Specifically, the accompaniment music is removed from the N song audio files to obtain their original singing parts; an original singing part is the audio file with the accompaniment music removed.

S202. Determine the start time and end time of each voice segment in the original singing parts of the N song audio files.

Specifically, the extracted original singing parts of the N song audio files are parsed to determine the start time and end time of each voice segment in them, where the start time and end time of each voice segment in the original singing parts of the N song audio files are the start time and end time of each sentence segment in the N song audio files.

S203. Cut the N song audio files according to the start time and end time of each voice segment in their original singing parts, to obtain the M sentence segments in the N song audio files.

In this embodiment of the present invention, the original singing parts of the N song audio files are extracted; the start time and end time of each voice segment in the original singing parts are determined, these being the start time and end time of each sentence segment in the N song audio files; and the N song audio files are cut according to those start and end times to obtain the M sentence segments in the N song audio files. In this way the sentence segments of the song audio files are extracted.
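Steps S202 and S203 can be sketched under stated assumptions: the original singing part is already available as a list of mono samples in the range -1.0 to 1.0, and a voice segment is taken to be a run of frames whose short-term energy exceeds a threshold. The energy criterion, the function names, and all parameter defaults are assumptions made for illustration; the text does not say how the start and end times are detected.

```python
def voice_segments(vocal, sample_rate, frame_size=1024,
                   threshold=0.01, min_gap=0.3):
    """Return (start, end) times in seconds of voiced runs in a vocal track.

    A frame counts as voiced when its mean squared amplitude exceeds
    `threshold`; silent gaps shorter than `min_gap` seconds do not split
    a segment.
    """
    segments = []
    start = last_voiced_end = None
    for offset in range(0, len(vocal), frame_size):
        frame = vocal[offset:offset + frame_size]
        energy = sum(x * x for x in frame) / len(frame)
        frame_start = offset / sample_rate
        frame_end = (offset + len(frame)) / sample_rate
        if energy > threshold:
            if start is not None and frame_start - last_voiced_end >= min_gap:
                segments.append((start, last_voiced_end))  # gap long enough
                start = None
            if start is None:
                start = frame_start
            last_voiced_end = frame_end
    if start is not None:
        segments.append((start, last_voiced_end))
    return segments


def cut(song, sample_rate, segments):
    """Cut the full song (vocals plus accompaniment) at the given times."""
    return [song[int(round(s * sample_rate)):int(round(e * sample_rate))]
            for s, e in segments]
```

The boundaries are found on the vocal track but applied to the complete song, which mirrors the statement that the voice segment times are the sentence segment times.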

Referring to FIG. 3, FIG. 3 is a schematic flowchart of a third embodiment of a method for song audio processing provided by an embodiment of the present invention, comprising the following steps:

S301. Extract the accompaniment parts of the N song audio files.

Specifically, the original singing is removed from the N song audio files to obtain their accompaniment parts; an accompaniment part is the audio file with the original singing removed.

S302. Determine the start time and end time of each tune segment in the accompaniment parts of the N song audio files.

Specifically, the extracted accompaniment parts of the N song audio files are parsed to determine the start time and end time of each tune segment in them, where the start time and end time of each voice segment in the accompaniment parts of the N song audio files are the start time and end time of each sentence segment in the N song audio files.

S303. Cut the N song audio files according to the start time and end time of each voice segment in the original singing parts of the M song audio files, to obtain the M sentence segments in the N song audio files.

In this embodiment of the present invention, the accompaniment parts of the N song audio files are extracted; the start time and end time of each tune segment in the accompaniment parts are determined, where the start time and end time of each voice segment in the accompaniment parts are the start time and end time of each sentence segment in the N song audio files; and the N song audio files are cut according to the start time and end time of each voice segment in the original singing parts of the M song audio files, to obtain the M sentence segments in the N song audio files. In this way the sentence segments of the song audio files are extracted.
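How the accompaniment is separated in S301 is not specified. A well-known rough approximation for stereo recordings, given here only as an assumed example, is center-channel cancellation: when the singing is mixed equally into both channels, subtracting one channel from the other removes it and leaves the content that differs between the channels. The function name `estimate_accompaniment` is hypothetical.

```python
def estimate_accompaniment(left, right):
    """Cancel center-panned content by subtracting the two stereo channels.

    Assumes the singing is mixed equally into both channels; anything else
    that is center-panned (often bass and kick drum) is cancelled as well.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    return [(l - r) / 2.0 for l, r in zip(left, right)]
```

The result is a mono estimate and is only as good as the mixing assumption; dedicated source-separation methods would normally give a cleaner accompaniment track.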

请参阅图4，图4是本发明实施例提供的一种歌曲音频处理装置的第一实施例结构示意图，其中，图4所示的歌曲音频处理装置400可以包括获取单元401、分析单元402、第一提取单元403和拼接单元404，具体如下：Please refer to FIG. 4. FIG. 4 is a schematic structural diagram of a first embodiment of a song audio processing device provided in an embodiment of the present invention, wherein the song audio processing device 400 shown in FIG. 4 may include an acquisition unit 401, an analysis unit 402, The first extraction unit 403 and the splicing unit 404 are as follows:

获取单元401，用于获取N个歌曲音频文件，其中，N为大于或等于1的整数。The acquiring unit 401 is configured to acquire N song audio files, where N is an integer greater than or equal to 1.

分析单元402，分析所述获取单元获取得到的所述N个歌曲音频文件，以确定所述N个歌曲音频文件中的M个语句片段。The analyzing unit 402 is configured to analyze the N song audio files obtained by the acquiring unit, so as to determine the M sentence segments in the N song audio files.

具体的，所述分析单元402包括：第二提取单元(未图示)，用于提取所述N个歌曲音频文件的原唱部分；第一确定单元(未图示)，用于确定所述第二提取单元提取的所述N个歌曲音频文件的原唱部分中的每个语音片段的起始时间和结束时间，所述N个歌曲音频文件的原唱部分中的每个语音片段的起始时间和结束时间为所述N个歌曲音频文件中的每个语句片段的起始时间和结束时间；第一剪切单元(未图示)，用于根据所述第一确定单元确定的所述N个歌曲音频文件的原唱部分中的每个语音片段的起始时间和结束时间对所述N个歌曲音频文件进行剪切，以得到歌曲音频文件中的M个语句片段。Specifically, the analysis unit 402 includes: a second extraction unit (not shown), used to extract the original singing parts of the N song audio files; a first determination unit (not shown), used to determine the The start time and end time of each voice segment in the original part of the N song audio files extracted by the second extraction unit, the start time and end time of each voice segment in the original part of the N song audio files The start time and the end time are the start time and the end time of each sentence segment in the N song audio files; the first cutting unit (not shown) is used for determining according to the first determination unit The N song audio files are cut according to the start time and end time of each voice segment in the original singing part of the N song audio files, so as to obtain M sentence segments in the song audio file.

所述分析单元402包括：第三提取单元(未图示)，用于提取所述N个歌曲音频文件的伴奏部分；第二确定单元(未图示)，用于确定所述第三提取单元提取的所述N个歌曲音频文件的伴奏部分中的每个曲调片段的起始时间和结束时间，所述N个歌曲音频文件的伴奏部分中的每个语音片段的起始时间和结束时间为所述N个歌曲音频文件中的每个语句片段的起始时间和结束时间；第二剪切单元(未图示)，用于根据所述第二确定单元确定的所述M个歌曲音频文件的原唱部分中的每个语音片段的起始时间和结束时间对所述N个歌曲音频文件进行剪切，以得到所述N个歌曲音频文件中的M个语句片段。The analysis unit 402 includes: a third extraction unit (not shown), used to extract the accompaniment parts of the N song audio files; a second determination unit (not shown), used to determine the third extraction unit The start time and the end time of each tune fragment in the accompaniment part of the described N song audio files of extraction, the start time and the end time of each voice fragment in the accompaniment part of the N song audio files are The start time and end time of each sentence segment in the N song audio files; the second cutting unit (not shown), used for the M song audio files determined according to the second determination unit Cut the N song audio files according to the start time and end time of each speech segment in the original singing part, to obtain M sentence segments in the N song audio files.

第一提取单元403，从所述N个歌曲音频文件中提取所述M个语句片段，其中，所述M为大于1的整数。The first extracting unit 403 extracts the M sentence segments from the N song audio files, where M is an integer greater than 1.

拼接单元404，用于按照预设顺序将所述第一提取单元提取的所述M个语句片段进行拼接，以得到拼接歌曲音频文件。The splicing unit 404 is configured to splice the M sentence segments extracted by the first extracting unit in a preset order to obtain spliced song audio files.

Optionally, before the acquisition unit 401 acquires the N song audio files, the song audio processing device further includes: a conversion unit, configured to convert the N song audio files into audio files of a preset format.

Optionally, after the splicing unit 404 splices the M sentence segments extracted by the first extraction unit in the preset order to obtain the spliced song audio file, the song audio processing device further includes: a locking unit (not shown), configured to lock the splicing positions of the spliced song audio file; and a processing unit (not shown), configured to process the splicing positions locked by the locking unit, so as to obtain a seamlessly spliced song audio file.

It can be understood that the functions of the functional modules of the song audio processing device 400 in this embodiment can be implemented according to the methods in the method embodiments. For the specific implementation process, refer to the relevant description of the method embodiments; details are not repeated here.

As can be seen, in this embodiment of the present invention, the acquisition unit 401 acquires N song audio files, where N is an integer greater than or equal to 1; the analysis unit 402 analyzes the N song audio files acquired by the acquisition unit to determine M sentence segments in the N song audio files; the first extraction unit 403 extracts the M sentence segments from the N song audio files, where M is an integer greater than 1; and the splicing unit 404 splices the M sentence segments extracted by the first extraction unit in a preset order to obtain a spliced song audio file. With this embodiment, the sentences of a song can be cut into independent sentences and the independent sentences can be spliced into a song, which offers high processing efficiency and is also entertaining.

Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a second embodiment of a song audio processing device provided by an embodiment of the present invention. The song audio processing device described in this embodiment includes: at least one input device 501; at least one output device 502; at least one processor 503, such as a CPU; and a memory 504. The input device 501, the output device 502, the processor 503 and the memory 504 are connected through a bus 505.

The input device 501 may be a touch panel, an ordinary PC, a liquid crystal display, a touch screen, or the like.

The memory 504 may be a high-speed RAM or a non-volatile memory, such as a disk memory. The memory 504 is used to store a set of program code, and the input device 501, the output device 502 and the processor 503 are configured to call the program code stored in the memory 504 to perform the following operations:

The processor 503 is configured to acquire N song audio files, where N is an integer greater than or equal to 1;

The processor 503 is further configured to analyze the N song audio files to determine M sentence segments in the N song audio files;

The processor 503 is further configured to extract the M sentence segments from the N song audio files, where M is an integer greater than 1;

The processor 503 is further configured to splice the M sentence segments in a preset order to obtain a spliced song audio file.

In some feasible embodiments, the processor 503 is further specifically configured to:

extract the original singing parts of the N song audio files;

determine the start time and end time of each voice segment in the original singing parts of the N song audio files, where the start time and end time of each voice segment in the original singing parts are taken as the start time and end time of the corresponding sentence segment in the N song audio files;

cut the N song audio files according to the start time and end time of each voice segment in the original singing parts of the N song audio files, so as to obtain the M sentence segments in the N song audio files.

Alternatively, extract the accompaniment parts of the N song audio files;

determine the start time and end time of each tune segment in the accompaniment parts of the N song audio files, where the start time and end time of each tune segment in the accompaniment parts are taken as the start time and end time of the corresponding sentence segment in the N song audio files;

cut the N song audio files according to the start time and end time of each tune segment in the accompaniment parts of the N song audio files, so as to obtain the M sentence segments in the N song audio files.
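The determine-then-cut steps above can be sketched as follows. The description does not say how segment boundaries are detected, so this example assumes a simple amplitude threshold on an already separated track (vocals or accompaniment), with short pauses merged into the surrounding segment. The function names, `threshold` and `min_gap` are all hypothetical.

```python
from typing import List, Tuple


def find_segments(track: List[float], rate: int, threshold: float = 0.05,
                  min_gap: float = 0.3) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds of the loud runs in `track`.

    A sample counts as loud when its absolute value reaches `threshold`.
    A segment ends once more than `min_gap` seconds of quiet have passed.
    """
    gap = int(round(min_gap * rate))
    segments: List[Tuple[float, float]] = []
    start = None       # index of the first loud sample of the open segment
    last_loud = 0      # index of the most recent loud sample
    for i, sample in enumerate(track):
        if abs(sample) >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud > gap:
            segments.append((start / rate, (last_loud + 1) / rate))
            start = None
    if start is not None:  # close a segment that runs to the end of the track
        segments.append((start / rate, (last_loud + 1) / rate))
    return segments


def cut(song: List[float], rate: int,
        segments: List[Tuple[float, float]]) -> List[List[float]]:
    """Cut the full song at the boundaries found on the separated track."""
    return [song[int(round(a * rate)):int(round(b * rate))]
            for a, b in segments]
```

The boundaries are measured on the separated track but applied to the full song, which mirrors the text: the segment times of the extracted part are reused as the sentence segment times of the song audio file.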

In some feasible embodiments, before the processor 503 acquires the N song audio files, the processor 503 is further specifically configured to:

convert the N song audio files into audio files of a preset format.
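The description does not name the preset format. Purely as an illustration, the sketch below assumes the preset format is mono 16-bit PCM WAV and that the input is already 16-bit PCM WAV, so only the channel count needs normalizing. It uses Python's standard `wave` and `array` modules; the function name `to_mono_16bit` is hypothetical.

```python
import array
import io
import wave


def to_mono_16bit(wav_bytes: bytes) -> bytes:
    """Convert 16-bit PCM WAV data (any channel count) to mono 16-bit WAV."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM input")
        channels = src.getnchannels()
        rate = src.getframerate()
        pcm = array.array("h")
        pcm.frombytes(src.readframes(src.getnframes()))

    if channels > 1:
        # Average the interleaved channel samples of each frame.
        mono = array.array("h", (sum(pcm[i:i + channels]) // channels
                                 for i in range(0, len(pcm), channels)))
    else:
        mono = pcm

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(mono.tobytes())
    return out.getvalue()
```

Normalizing every input to one format before analysis means the later cutting and splicing steps can assume a single sample layout. Compressed formats would need a decoder, which is outside the scope of this sketch.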

In some feasible embodiments, after the processor 503 splices the M sentence segments in the preset order to obtain the spliced song audio file, the processor 503 is further specifically configured to:

lock the splicing positions of the spliced song audio file;

process the splicing positions of the spliced song audio file, so as to obtain a seamlessly spliced song audio file.
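The description does not specify how the splicing positions are processed. One common way to hide a joint is a short linear crossfade, sketched below as an assumption rather than as the patented method. "Locking" is modeled here as recording the sample index at which each joint begins; the function names and the `overlap` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def crossfade_join(a: Sequence[float], b: Sequence[float],
                   overlap: int) -> List[float]:
    """Join `a` and `b`, blending the tail of `a` into the head of `b`."""
    n = min(overlap, len(a), len(b))
    if n == 0:
        return list(a) + list(b)
    blended = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight moves gradually from `a` to `b`
        blended.append((1 - w) * a[len(a) - n + i] + w * b[i])
    return list(a[:len(a) - n]) + blended + list(b[n:])


def seamless_splice(segments: Sequence[Sequence[float]],
                    overlap: int) -> Tuple[List[float], List[int]]:
    """Splice segments with crossfades; also return each joint's position."""
    out: List[float] = list(segments[0]) if segments else []
    positions: List[int] = []
    for seg in segments[1:]:
        n = min(overlap, len(out), len(seg))
        positions.append(len(out) - n)  # sample index where this joint starts
        out = crossfade_join(out, seg, n)
    return out, positions
```

Each crossfade shortens the result by `overlap` samples, so a small overlap (tens of milliseconds of audio) is usually enough to avoid an audible click without noticeably trimming the sentences.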

In a specific implementation, the input device 501, the output device 502 and the processor 503 described in this embodiment can carry out the implementations described in the embodiments of the song audio processing method shown in FIG. 1 to FIG. 3, as well as the implementation of the song audio processing device described with reference to FIG. 4; details are not repeated here.

The modules or sub-modules in all embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).

The steps in the methods of the embodiments of the present invention may be reordered, combined and deleted according to actual needs.

The units in the devices of the embodiments of the present invention may be combined, divided and deleted according to actual needs.

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of the rights of the present invention. Equivalent changes made according to the claims of the present invention therefore still fall within the scope covered by the present invention.