CN114550715A - A speaker cloud service platform system - Google Patents

A speaker cloud service platform system

Info

Publication number
CN114550715A
Authority
CN
China
Prior art keywords
speaker
search
value
speakers
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210095507.4A
Other languages
Chinese (zh)
Inventor
张学军
李斌
曾泓杰
许先富
张素素
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi University
Priority to CN202210095507.4A
Publication of CN114550715A
Priority to GB2300697.6A (GB2616512B)
Legal status: Pending

Abstract

Translated from Chinese

The invention relates to a speaker cloud service platform system. It addresses the technical problem that existing systems serve a single application and cannot effectively regulate the relationships between speakers. Each speaker is fitted with a positioning detection unit, and the cloud server performs the following steps. Step 1: the cloud server receives the positioning data of the speakers that are in operation. Step 2: the cloud server judges the state of the operating speakers from the positioning data; if the distance between speakers is less than a predefined threshold, it marks the corresponding speakers as "suspected same group" and sends them a "suspected same group confirmation message". Step 3: the cloud server receives the feedback to the confirmation message; if the result is "yes", it transmits the group's data in unified form to any one speaker in the group and controls that speaker to pass the data on within the group; if the result is "NO", it transmits the playback data in turn according to weight priority and controls the volume. This solution addresses the problem well and can be used in speaker cloud services.

Figure 202210095507

Description

Translated from Chinese
A speaker cloud service platform system

Technical field

The invention relates to the field of speaker cloud services, and in particular to a speaker cloud service platform system.

Background

A smart speaker is an upgraded form of the loudspeaker. It lets household consumers access the Internet by voice, for example to request songs, shop online, or check the weather forecast, and it can also control smart home devices, for example opening curtains, setting the refrigerator temperature, or heating the water heater in advance. On June 11, 2018, Baidu released its first own-brand smart speaker, the "Xiaodu Smart Speaker", in Beijing. On June 1, 2019, Baidu's "Xiaodu Smart Speaker Dajingang" (小度智能音箱大金刚) went on sale in the Xiaodu Mall; on November 25, the Huawei SoundX smart speaker, developed jointly by Huawei and Devialet, was officially released.

Existing cloud service platform systems serve a single application and cannot effectively regulate the relationships between speakers. The present invention solves this problem by providing a speaker cloud service platform system.

Summary of the invention

The technical problem to be solved by the invention is that the prior art serves a single application and cannot effectively regulate the relationships between speakers. A new speaker cloud service platform system is provided, which supports multiple applications and handles conflicts and relationships between speakers effectively.

To solve the above technical problem, the following technical solution is adopted:

A speaker cloud service platform system, in which each speaker comprises a voice input module, a network connection unit, and a playback device. The system comprises a cloud server and the network connection units that connect the speakers to the cloud server. Each speaker is provided with a positioning detection unit, and the cloud server receives the data of the positioning detection units in real time. The cloud server performs the following speaker conflict detection steps:

Step 1: the cloud server receives the positioning data of the speakers that are in working state;

Step 2: the cloud server judges the state of the operating speakers from the positioning data; if the distance between speakers is less than a predefined threshold, it marks the corresponding speakers as "suspected same group" and sends them a "suspected same group confirmation message";

Step 3: the cloud server receives the feedback to the "suspected same group confirmation message"; if the result is "yes", it transmits the group's data in unified form to any one speaker in the group and controls that speaker to pass the data on within the group; if the result is "NO", it transmits the playback data in turn according to weight priority and controls the volume.
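The three conflict detection steps can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the planar `(x, y)` coordinates, the function names `find_suspected_groups` and `plan_transmission`, the chaining of nearby speakers into one group, and the choice of the highest-priority member as the relay are all assumptions made for the example (the source only says the data goes to "any" speaker in the group).

```python
from itertools import combinations
from math import hypot


def find_suspected_groups(positions, threshold):
    """Step 2 sketch: group speakers whose distance is below the threshold.

    positions maps a speaker id to assumed planar (x, y) coordinates.
    Speakers are chained: if a is near b and b is near c, all three
    end up in one "suspected same group" set.
    """
    parent = {sid: sid for sid in positions}

    def root(sid):
        # Follow parent links to the representative of sid's group.
        while parent[sid] != sid:
            parent[sid] = parent[parent[sid]]
            sid = parent[sid]
        return sid

    for a, b in combinations(positions, 2):
        (ax, ay), (bx, by) = positions[a], positions[b]
        if hypot(ax - bx, ay - by) < threshold:
            parent[root(a)] = root(b)

    groups = {}
    for sid in positions:
        groups.setdefault(root(sid), set()).add(sid)
    # Only sets with at least two speakers are candidate groups.
    return [members for members in groups.values() if len(members) > 1]


def plan_transmission(group, reply, priority):
    """Step 3 sketch: return (sender, receiver) pairs for one group.

    reply is the "yes"/"NO" feedback; priority lists speaker ids,
    highest priority first.
    """
    ordered = [sid for sid in priority if sid in group]
    if reply.lower() == "yes":
        relay = ordered[0]  # assumption: pick the highest-priority member
        return [("cloud", relay)] + [(relay, sid) for sid in ordered[1:]]
    # "NO": the cloud sends to each speaker in priority order.
    return [("cloud", sid) for sid in ordered]
```

Volume control in the "NO" branch is left out because the source does not say how the volume is chosen.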

Working principle of the invention: the invention uses speaker positioning information as the basis for judging whether speakers are "suspected same group" and, according to the feedback, regulates the relationship between speakers that may be associated or in conflict. The cloud server can also control a variety of Internet applications.

As a further refinement of the above solution, the weight priority is determined by the cloud server as follows:

Step 1.1: check each speaker's network start time; a speaker that came online earlier has higher priority;

Step 1.2: check each speaker's self-check status; a speaker with a better self-check status has higher priority.
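A small sketch of how such an ordering might be computed is given below. The source does not say how the two criteria are combined, so this example assumes that network start time is the primary key and self-check status only breaks ties; the `Speaker` fields and the numeric self-check score are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Speaker:
    speaker_id: str
    online_since: float    # network start time in seconds (assumed representation)
    self_check_score: int  # higher means a better self-check result (assumed scale)


def priority_order(speakers):
    """Return speakers from highest to lowest priority.

    Earlier start time wins (Step 1.1); among equal start times the
    better self-check score wins (Step 1.2, used here as a tie-breaker).
    """
    return sorted(speakers, key=lambda s: (s.online_since, -s.self_check_score))
```

For example, with speakers that came online at times 100, 50, and 100 and scores 1, 0, and 3, the speaker that started at time 50 ranks first and the score decides between the other two.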

Further, the cloud server can also call other applications on the Internet according to instructions from the speaker.

Further, the cloud server also receives voice control signals from the speaker and performs speech recognition, comprising:

Step one: build a historical speech feature map library. A historical speech feature map is obtained by extracting features from pre-entered or historically recorded sentence speech and plotting a sentence speech feature map; the sentence speech feature maps include character, word, and sentence feature maps;

Step two: extract features from the sentence speech collected by the speaker in real time and plot the target sentence speech feature map; any sentence speech feature map in the library is taken as the reference image, and the target sentence speech feature map is taken as the target image;

Step three: binarize the target image, where a value of 1 denotes a speech feature and 0 denotes none. Divide the binarized feature map into a grid of unit cells, define the first grid point (x1, y1) as the origin, and define the search step as L. Starting from the origin, search along the x direction; whenever a point with value 1 is found, record its position and value and number it in sequence; otherwise continue searching;

Step four: update the origin to the point (x1, y1 + N*L) and return to Step three until both the x and y directions have been searched, which completes the preliminary positioning search, where N is an integer and L is a constant;

Step five: take out the points with value 1 one at a time, set the point currently taken out as the origin, set the search step to L/2, and search along the x direction. Points that have already been searched are not searched again. If the search goes out of range, the step is automatically halved and the search continues until the step reaches its minimum. If a new 1-valued point appears during the search, it is marked as a new point requiring a y-direction search and Step six is performed; otherwise Step seven is performed;

Step six: keeping the search step at L/2, search along the y direction. Points that have already been searched are not searched again. If the search goes out of range, the step is automatically halved and the search continues until the step reaches its minimum. If a new 1-valued point appears during the search, it is marked as a new point requiring an x-direction search and Step five is performed; otherwise Step seven is performed;

Step seven: when no new points remain to be searched, end the search and take the set of regions of the 1-valued points found as the valid target image;

Step eight: search and match the valid target image against the historical speech feature map library;

Step nine: invoke the corresponding strategy according to the recognition result.
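Steps three to seven describe a coarse-to-fine search for 1-valued points. The sketch below is a simplified reading, not the exact procedure: `coarse_scan` visits the grid with step L in both directions (Steps three and four), and `refine` probes around each hit along x and y at offsets L/2, L/4, and so on down to 1, queueing every newly found point (a loose stand-in for the alternation between Steps five and six). The function names and the list-of-lists grid are assumptions for the example.

```python
def coarse_scan(grid, step):
    """Scan a binary grid with the given step and collect 1-valued points.

    grid is a list of rows; points are returned as (x, y) in scan order.
    """
    hits = []
    for y in range(0, len(grid), step):
        for x in range(0, len(grid[0]), step):
            if grid[y][x] == 1:
                hits.append((x, y))
    return hits


def refine(grid, hits, step):
    """Grow the set of 1-valued points around the coarse hits.

    Each point is probed along x and y at offsets step/2, step/4, ..., 1.
    Newly found points are queued and probed in turn; points already
    found are never probed twice.
    """
    height, width = len(grid), len(grid[0])
    found = set(hits)
    queue = list(hits)
    while queue:
        x0, y0 = queue.pop()
        offset = max(step // 2, 1)
        while offset >= 1:
            for dx, dy in ((offset, 0), (-offset, 0), (0, offset), (0, -offset)):
                x, y = x0 + dx, y0 + dy
                inside = 0 <= x < width and 0 <= y < height
                if inside and grid[y][x] == 1 and (x, y) not in found:
                    found.add((x, y))
                    queue.append((x, y))
            offset //= 2
    return found
```

The returned set plays the role of the "valid target image" region in Step seven; how the region is cropped from it is not specified in the source.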

Further, Step eight also includes image correction processing, comprising:

Step a: define the valid target image as [Figure BDA0003490869210000041], and take any reference image in the historical speech feature map library, defined as I_C;

Step b: the reference image I_C and the polar-transformed target image [Figure BDA0003490869210000042] are related as follows:

[Figure BDA0003490869210000043]

where α_z is the scale offset parameter and [Figure BDA0003490869210000044] is the rotation offset parameter;

Step c: compute the radial projection of the reference image I_C in the polar coordinate system, [Figure BDA0003490869210000045] [Figure BDA0003490869210000046], and the radial projection of the target image [Figure BDA0003490869210000047], [Figure BDA0003490869210000048]. Take the logarithm of K_C(i) and [Figure BDA0003490869210000049] to obtain LK_C(i) and [Figure BDA00034908692100000410], and take the translation difference between LK_C(i) and [Figure BDA00034908692100000411] as the scale offset parameter α_z;

[Figure BDA00034908692100000412]

i = 1, 2, ..., n_r

[Figure BDA0003490869210000051]

[Figure BDA0003490869210000052]

where [Figure BDA0003490869210000053] is the number of samples in the angular direction at K_i = K_max; ce() denotes the smallest integer greater than or equal to its argument, and fl() denotes the largest integer less than or equal to its argument; the target image has size 2K_max × 2K_max, n_r = K_max is the number of samples in the radial direction, and n_φ = 8K_i is the number of samples in the angular direction;

Step d: using the scale offset parameter from Step c, compute the projections of the reference image I_C and the target image [Figure BDA0003490869210000054] in the radial and angular directions:

[Figure BDA0003490869210000055]

[Figure BDA0003490869210000056]

Normalize [Figure BDA0003490869210000057] and [Figure BDA0003490869210000058], compute the translation of the highest point [Figure BDA0003490869210000059], and from [Figure BDA00034908692100000510] [Figure BDA00034908692100000511] compute the rotation offset parameter [Figure BDA00034908692100000512];

Step e: substitute the rotation offset parameter φ_z and the scale offset parameter α_z into Step A to correct the target image; at the same time, according to [Figure BDA00034908692100000513], compute the position point [Figure BDA00034908692100000514] corresponding to the minimum of ∈_z as the centre point of the target image, which completes the image correction processing.

Further, the search and matching analysis in Step eight also comprises:

Step A: draw concentric circles about the centre point of the target image [Figure BDA00034908692100000515], dividing the speech feature image into B annular regions, then divide each annular region into K sectors, where K and B are predefined constants;

Step B: compute the sector speech feature value V_sqθ of each sector S_sq as Code1;

[Figure BDA0003490869210000061]

where F_sqθ(x, y) is the grey value of each pixel in sector S_sq, P_sqθ is the mean grey value of the pixels in sector S_sq, n_sq is the number of pixels in region S_sq, 0 < sq ≤ B×K−1, and θ = {0°, (360°/K), 2*(360°/K), 3*(360°/K), ... ≤ 180°};

Step C: rotate the speech feature image by (180°/K), repeat Step B, and extract the sector speech feature value V_sqθ of each sector S_sq as Code2;

Step E: rotate Code1 and Code2 by R×(360°/K) (R = 0, 1, 2, ..., K−1) to obtain Code1' and Code2';

Step F: input Code1 and Code2 and Code1' and Code2' from Step E into the historical speech feature map library for matching.
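Steps A to C can be illustrated with the sketch below. The formula for V_sqθ survives only as an image placeholder, so the example assumes it is the mean absolute deviation of the grey values from the sector mean, which is consistent with the symbols F, P, and n defined in the text but is not confirmed by it. The function name `sector_code`, the `max_radius` cut-off, and the handling of empty sectors are also assumptions.

```python
from math import atan2, hypot, pi


def sector_code(image, center, num_rings, num_sectors, max_radius, rotation_deg=0.0):
    """Split an image into rings and sectors and return one value per sector.

    image is a list of rows of grey values. The value for each sector is
    the mean absolute deviation from the sector mean (an assumed stand-in
    for the lost formula). Sectors are ordered ring by ring.
    """
    cx, cy = center
    buckets = [[] for _ in range(num_rings * num_sectors)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            radius = hypot(x - cx, y - cy)
            if radius >= max_radius:
                continue  # pixel lies outside the outermost ring
            ring = min(int(radius * num_rings / max_radius), num_rings - 1)
            angle = (atan2(y - cy, x - cx) * 180.0 / pi - rotation_deg) % 360.0
            sector = int(angle * num_sectors / 360.0) % num_sectors
            buckets[ring * num_sectors + sector].append(value)

    code = []
    for pixels in buckets:
        if not pixels:
            code.append(0.0)  # assumption: empty sectors contribute zero
            continue
        mean = sum(pixels) / len(pixels)
        code.append(sum(abs(p - mean) for p in pixels) / len(pixels))
    return code
```

Under these assumptions, Code1 would be `sector_code(image, center, B, K, max_radius)` and Code2 the same call with `rotation_deg=180.0 / K`.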

Beneficial effects of the invention: the invention uses speaker positioning information as the basis for judging whether speakers are "suspected same group" and, according to the feedback, regulates the relationship between speakers that may be associated or in conflict. The cloud server can also control a variety of Internet applications. Converting speech feature recognition into recognition of the feature map as a whole gives higher recognition efficiency. Pre-correction and positioning of the feature image improve the precision and efficiency of control.

Description of the drawings

The invention is further described below with reference to the drawing and the embodiment.

Figure 1 is a schematic diagram of the speaker conflict detection steps.

Detailed description

To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to an embodiment. It should be understood that the specific embodiment described here only explains the invention and does not limit it.

Embodiment 1

This embodiment provides a speaker cloud service platform system, in which each speaker comprises a voice input module, a network connection unit, and a playback device. The system comprises a cloud server and the network connection units that connect the speakers to the cloud server. Each speaker is provided with a positioning detection unit, and the cloud server receives the data of the positioning detection units in real time. The cloud server performs the following speaker conflict detection steps:

Step 1: the cloud server receives the positioning data of the speakers that are in working state;

Step 2: the cloud server judges the state of the operating speakers from the positioning data; if the distance between speakers is less than a predefined threshold, it marks the corresponding speakers as "suspected same group" and sends them a "suspected same group confirmation message";

Step 3: the cloud server receives the feedback to the "suspected same group confirmation message"; if the result is "yes", it transmits the group's data in unified form to any one speaker in the group and controls that speaker to pass the data on within the group; if the result is "NO", it transmits the playback data in turn according to weight priority and controls the volume.

This embodiment uses speaker positioning information as the basis for judging whether speakers are "suspected same group" and, according to the feedback, regulates the relationship between speakers that may be associated or in conflict. The cloud server can also control a variety of Internet applications.

Specifically, the weight priority is determined by the cloud server as follows:

Step 1.1: check each speaker's network start time; a speaker that came online earlier has higher priority;

Step 1.2: check each speaker's self-check status; a speaker with a better self-check status has higher priority.

Specifically, the cloud server can also call other applications on the Internet according to instructions from the speaker.

Preferably, the cloud server also receives voice control signals from the speaker and performs speech recognition, comprising:

Step one: build a historical speech feature map library. A historical speech feature map is obtained by extracting features from pre-entered or historically recorded sentence speech and plotting a sentence speech feature map; the sentence speech feature maps include character, word, and sentence feature maps;

Step two: extract features from the sentence speech collected by the speaker in real time and plot the target sentence speech feature map; any sentence speech feature map in the library is taken as the reference image, and the target sentence speech feature map is taken as the target image;

Step three: binarize the target image, where a value of 1 denotes a speech feature and 0 denotes none. Divide the binarized feature map into a grid of unit cells, define the first grid point (x1, y1) as the origin, and define the search step as L. Starting from the origin, search along the x direction; whenever a point with value 1 is found, record its position and value and number it in sequence; otherwise continue searching;

Step four: update the origin to the point (x1, y1 + N*L) and return to Step three until both the x and y directions have been searched, which completes the preliminary positioning search, where N is an integer and L is a constant;

Step five: take out the points with value 1 one at a time, set the point currently taken out as the origin, set the search step to L/2, and search along the x direction. Points that have already been searched are not searched again. If the search goes out of range, the step is automatically halved and the search continues until the step reaches its minimum. If a new 1-valued point appears during the search, it is marked as a new point requiring a y-direction search and Step six is performed; otherwise Step seven is performed;

Step six: keeping the search step at L/2, search along the y direction. Points that have already been searched are not searched again. If the search goes out of range, the step is automatically halved and the search continues until the step reaches its minimum. If a new 1-valued point appears during the search, it is marked as a new point requiring an x-direction search and Step five is performed; otherwise Step seven is performed;

Step seven: when no new points remain to be searched, end the search and take the set of regions of the 1-valued points found as the valid target image;

Step eight: search and match the valid target image against the historical speech feature map library;

Step nine: invoke the corresponding strategy according to the recognition result.

Preferably, Step eight also includes image correction processing, comprising:

Step a: define the valid target image as [Figure BDA0003490869210000091], and take any reference image in the historical speech feature map library, defined as I_C;

Step b: the reference image I_C and the polar-transformed target image [Figure BDA0003490869210000092] are related as follows:

[Figure BDA0003490869210000101]

where α_z is the scale offset parameter and [Figure BDA0003490869210000102] is the rotation offset parameter;

Step c: compute the radial projection of the reference image I_C in the polar coordinate system, [Figure BDA0003490869210000103] [Figure BDA0003490869210000104], and the radial projection of the target image [Figure BDA0003490869210000105], [Figure BDA0003490869210000106]. Take the logarithm of K_C(i) and [Figure BDA0003490869210000107] to obtain LK_C(i) and [Figure BDA0003490869210000108], and take the translation difference between LK_C(i) and [Figure BDA0003490869210000109] as the scale offset parameter α_z;

[Figure BDA00034908692100001010]

i = 1, 2, ..., n_r

[Figure BDA00034908692100001011]

[Figure BDA00034908692100001012]

where [Figure BDA00034908692100001013] is the number of samples in the angular direction at K_i = K_max; ce() denotes the smallest integer greater than or equal to its argument, and fl() denotes the largest integer less than or equal to its argument; the target image has size 2K_max × 2K_max, n_r = K_max is the number of samples in the radial direction, and n_φ = 8K_i is the number of samples in the angular direction;

Step d: using the scale offset parameter from Step c, compute the projections of the reference image I_C and the target image [Figure BDA00034908692100001014] in the radial and angular directions:

[Figure BDA00034908692100001015]

[Figure BDA00034908692100001016]

Normalize [Figure BDA00034908692100001017] and [Figure BDA00034908692100001018], compute the translation of the highest point [Figure BDA00034908692100001019], and from [Figure BDA00034908692100001020] [Figure BDA0003490869210000111] compute the rotation offset parameter [Figure BDA0003490869210000112];

Step e: substitute the rotation offset parameter φ_z and the scale offset parameter α_z into Step A to correct the target image; at the same time, according to [Figure BDA0003490869210000113], compute the position point [Figure BDA0003490869210000114] corresponding to the minimum of ∈_z as the centre point of the target image, which completes the image correction processing.
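The idea in Step d of normalizing two angular projections and reading the rotation from the shift of their highest points can be sketched as follows. The projection formulas themselves are lost to image placeholders, so this example simply takes two already-computed angular profiles of equal length as input; the function names and the max-normalization are assumptions, and the sketch only makes sense when each profile has a single clear peak.

```python
def normalize(profile):
    """Scale a profile so that its highest value is 1 (unchanged if all zero)."""
    peak = max(profile)
    return [v / peak for v in profile] if peak else list(profile)


def peak_shift(reference, target):
    """Number of samples by which the target's peak is shifted, modulo the length."""
    ref, tgt = normalize(reference), normalize(target)
    return (tgt.index(max(tgt)) - ref.index(max(ref))) % len(ref)


def rotation_offset_degrees(reference, target):
    """Convert the peak shift into an angle, assuming samples cover 360 degrees evenly."""
    return peak_shift(reference, target) * 360.0 / len(reference)
```

With eight angular samples, a peak that moves by two positions corresponds to a rotation of 90 degrees under these assumptions.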

Preferably, the search and matching analysis in Step eight also comprises:

Step A: draw concentric circles about the centre point of the target image [Figure BDA0003490869210000115], dividing the speech feature image into B annular regions, then divide each annular region into K sectors, where K and B are predefined constants;

Step B: compute the sector speech image feature value V_sqθ of each sector S_sq as Code1;

[Figure BDA0003490869210000116]

where F_sqθ(x, y) is the grey value of each pixel in sector S_sq, P_sqθ is the mean grey value of the pixels in sector S_sq, n_sq is the number of pixels in region S_sq, 0 < sq ≤ B×K−1, and θ = {0°, (360°/K), 2*(360°/K), 3*(360°/K), ... ≤ 180°};

Step C: rotate the speech feature image by (180°/K), repeat Step B, and extract the sector speech feature value V_sqθ of each sector S_sq as Code2;

Step E: rotate Code1 and Code2 by R×(360°/K) (R = 0, 1, 2, ..., K−1) to obtain Code1' and Code2';

Step F: input Code1 and Code2 and Code1' and Code2' from Step E into the historical speech feature map library for matching.
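Rotating a code by R×(360°/K) in Step E amounts to a circular shift of the K sector values inside each ring. The sketch below shows that shift and one possible matching score; the source does not name a distance measure, so the Euclidean distance and the "smallest distance over all rotations" rule are assumptions, as are the function names and the ring-by-ring layout of the code list.

```python
def rotate_code(code, num_rings, num_sectors, r):
    """Circularly shift the sector values of every ring by r positions.

    code is a flat list laid out ring by ring, num_sectors values per ring.
    """
    r %= num_sectors
    rotated = []
    for ring in range(num_rings):
        start = ring * num_sectors
        sectors = code[start:start + num_sectors]
        rotated.extend(sectors[num_sectors - r:] + sectors[:num_sectors - r])
    return rotated


def match_distance(code, template, num_rings, num_sectors):
    """Smallest Euclidean distance between the template and any rotation of code."""
    return min(
        sum((a - b) ** 2
            for a, b in zip(rotate_code(code, num_rings, num_sectors, r), template)) ** 0.5
        for r in range(num_sectors)
    )
```

A library lookup could then pick the stored template with the smallest `match_distance`, which makes the comparison tolerant of rotations in multiples of 360°/K.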

本实施例将音箱定位信息作为判断是否“疑似同组”的依据,并根据反馈结果有效调节可能关联或冲突的音箱与音箱之间的关系。同时云服务器可调控多种互联网应用。将语音的特征识别转换为特征图谱的整体识别,能够有更高的识别效率。通过对于特征图像的预交校对和定位处理,提高了控制的精度和效率。In this embodiment, the speaker positioning information is used as the basis for judging whether it is "suspected to be in the same group", and the relationship between the possible related or conflicting speakers and the speakers is effectively adjusted according to the feedback result. At the same time, the cloud server can control a variety of Internet applications. Converting the feature recognition of speech into the overall recognition of the feature map can have higher recognition efficiency. The accuracy and efficiency of control are improved by pre-posting, proofreading and positioning processing of feature images.

Although illustrative specific embodiments of the present invention are described above so that those skilled in the art can understand the invention, the invention is not limited to the scope of those embodiments. To those of ordinary skill in the art, as long as variations fall within the spirit and scope of the invention as defined and determined by the appended claims, all inventions and creations that make use of the inventive concept are protected.

Claims (6)

1. A speaker cloud service platform system, the speaker comprising a voice input module, a network connection unit and a playback device, characterized in that: the speaker cloud service platform system comprises a cloud server and a network connection unit for connecting the cloud server and the speaker; a positioning detection unit is provided at the speaker, and the cloud server receives the data of the positioning detection unit in real time; the cloud server performs the following speaker conflict detection steps:
Step 1: the cloud server receives the positioning data of the speakers that are in working state;
Step 2: the cloud server judges the state of the operating speakers according to the positioning data; if the positioning distance between speakers is less than a predefined threshold, the corresponding speakers are marked as "suspected to be in the same group" and "suspected same group confirmation information" is sent to them;
Step 3: the cloud server receives the feedback result of the "suspected same group confirmation information"; if the result is "yes", the unified data for the group is transmitted to any one speaker in the group, and that speaker is controlled to perform data transmission within the group; if the result is "NO", the playback data is transmitted in order of weight priority and the volume is controlled.

2. The speaker cloud service platform system according to claim 1, characterized in that the weight priority is determined by the cloud server as follows:
Step 1.1: judge the network start time of each speaker; an earlier time has higher priority;
Step 1.2: judge the self-check status of each speaker; a better self-check status has higher priority.

3. The speaker cloud service platform system according to claim 1, characterized in that the cloud server can also call other applications on the Internet according to instructions from the speaker.

4. The speaker cloud service platform system according to any one of claims 1-3, characterized in that the cloud server also receives the voice control signal of the speaker and performs speech recognition, comprising:
Step 1: establish a historical speech feature library; a historical speech feature map is obtained by performing feature extraction on pre-input or historically recorded sentence speech and drawing a sentence speech feature map, which includes character, word and sentence feature maps;
Step 2: perform feature extraction on the sentence speech collected by the speaker in real time and draw the target sentence speech feature map; define any sentence speech feature map in the historical speech feature library as the reference image, and the target sentence speech feature map as the target image;
Step 3: binarize the target image I_C; a value of 1 is defined as having a speech feature and a value of 0 as having none; divide the binarized feature map into a grid map using unit grids, define the first point (x1, y1) of the grid map as the origin, and define the search matching step as L; starting from the origin, search along the x direction; if a point with value 1 is found, record its position and value and number it in sequence, otherwise continue the search matching;
Step 4: update the point (x1, y1+N*L) to be the origin and return to step 3 until both the x direction and the y direction have been searched, completing the preliminary positioning search matching, where N is an integer and L is a constant;
Step 5: take out the 1-valued points in turn, update the point currently taken out to be the origin, update the search matching step to L/2, and search along the x direction; points that have already been searched are not searched again; when the search goes out of range, the step is automatically halved and the search continues until the step reaches its minimum; a new 1-valued point found during the search is defined as a new point requiring y-direction search matching, and step 6 is performed; otherwise step 7 is performed;
Step 6: keeping the search matching step at L/2, search along the y direction; points that have already been searched are not searched again; when the search goes out of range, the step is automatically halved and the search continues until the step reaches its minimum; a new 1-valued point found during the search is defined as a new point requiring x-direction search matching, and step 5 is performed; otherwise step 7 is performed;
Step 7: when no new points remain to be searched, end the search matching and take the set of regions of the 1-valued points found as the valid target image;
Step 8: search, match and analyze the valid target image against the historical speech feature library;
Step 9: invoke the corresponding strategy according to the recognition result.

5. The speaker cloud service platform system according to claim 4, characterized in that step 8 further includes image proofreading processing, comprising:
Step a: define the valid target image as
Figure FDA0003490869200000031
and define any reference image in the historical speech feature library as I_C;
Step b: define the relationship between the reference image I_C and the polar-transformed target image
Figure FDA0003490869200000032
as follows:
Figure FDA0003490869200000033
where α_z is the scale offset parameter and
Figure FDA00034908692000000310
is the rotation offset parameter;
Step c: calculate the radial projection of the reference image I_C in the polar coordinate system
Figure FDA0003490869200000034
Figure FDA00034908692000000311
and the radial projection
Figure FDA0003490869200000036
of the target image
Figure FDA0003490869200000035
take the logarithm of
Figure FDA00034908692000000312
and
Figure FDA0003490869200000037
to obtain LK_C(i) and
Figure FDA0003490869200000038
and use the translation difference between LK_C(i) and
Figure FDA0003490869200000039
as the scale offset parameter α_z;
Figure FDA0003490869200000041
Figure FDA0003490869200000042
Figure FDA0003490869200000043
where
Figure FDA0003490869200000044
is the number of samples in the angular direction at K_i = K_max, ce() denotes the smallest integer greater than or equal to the value in brackets, and fl() denotes the largest integer less than or equal to the value in brackets; the size of the target image is 2K_max × 2K_max, n_r = K_max is the number of samples in the radial direction, and n_φ = 8K_i is the number of samples in the angular direction;
Step d: according to the scale offset parameter from step c, calculate the projections of the reference image I_C and the target image
Figure FDA0003490869200000045
in the radial and angular directions:
Figure FDA0003490869200000046
Figure FDA0003490869200000047
normalize
Figure FDA0003490869200000048
and
Figure FDA0003490869200000049
and calculate the translation of the highest point
Figure FDA00034908692000000410
then, according to
Figure FDA00034908692000000411
Figure FDA00034908692000000412
calculate the rotation offset parameter
Figure FDA00034908692000000413
Step e: substitute the rotation offset parameter φ_z and the scale offset parameter α_z into step A to correct the target image, and at the same time, according to
Figure FDA00034908692000000414
calculate the position point
Figure FDA00034908692000000415
corresponding to the minimum value of ∈_z as the center point of the target image, completing the image proofreading processing.
6. The speaker cloud service platform system according to claim 4, characterized in that the search matching analysis of step 8 further comprises:
Step A: taking the target image
Figure FDA0003490869200000051
draw concentric circles centered on its center point to divide the speech feature image into B annular regions, and then divide each annular region into K sector regions, where K and B are both predefined constants;
Step B: calculate the sector speech feature value V_sqθ of each sector S_sq as Code1;
Figure FDA0003490869200000052
where F_sqθ(x, y) is the gray value of each pixel in the sector region S_sq, P_sqθ is the mean gray value of the pixels in the sector region S_sq, n_sq is the number of pixels in the region S_sq, 0 < sq ≤ B×K−1, and θ = {0°, (360°/K), 2×(360°/K), 3×(360°/K), ... ≤ 180°};
Step C: rotate the speech feature image by (180°/K), repeat step B, and extract the sector speech feature value V_sqθ of each sector S_sq as Code2;
Step E: rotate Code1 and Code2 by R×(360°/K) (R = 0, 1, 2, ..., K−1) to obtain Code1' and Code2';
Step F: input Code1 and Code2, together with Code1' and Code2' from step E, into the historical speech feature library for matching.
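
The coarse-to-fine search recited in steps 3 to 7 of claim 4 can be illustrated with a simplified Python sketch. It is an interpretation rather than the claimed procedure: the coarse pass samples the binarized image at stride L, and the refinement pass probes each found point along x and y with steps L/2, L/4, ..., 1, skipping points already visited, instead of alternating explicit x-direction and y-direction sweeps. The function names `coarse_scan` and `refine` are hypothetical.

```python
from collections import deque

import numpy as np


def coarse_scan(binary, step):
    """Steps 3-4: visit grid points at stride `step`, keep the 1-valued ones."""
    h, w = binary.shape
    return [(x, y)
            for y in range(0, h, step)
            for x in range(0, w, step)
            if binary[y, x] == 1]


def refine(binary, seeds, step):
    """Steps 5-7 (simplified): grow the hit set with successively halved steps."""
    h, w = binary.shape
    visited = set(seeds)   # points already searched are not searched again
    found = set(seeds)
    queue = deque(seeds)
    while queue:           # stop when no new points remain
        x, y = queue.popleft()
        s = max(step // 2, 1)
        while s >= 1:
            for nx, ny in ((x + s, y), (x - s, y), (x, y + s), (x, y - s)):
                if 0 <= nx < w and 0 <= ny < h and (nx, ny) not in visited:
                    visited.add((nx, ny))
                    if binary[ny, nx] == 1:
                        found.add((nx, ny))
                        queue.append((nx, ny))
            s //= 2
    return found
```

In this sketch the returned set of 1-valued points plays the role of the valid target image; a region that the coarse pass never touches and that is not connected to a found point is left out, which is the price of the reduced number of probes.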
CN202210095507.4A | 2022-01-26 | 2022-01-26 | A speaker cloud service platform system | Pending | CN114550715A (en)

Priority Applications (2)

Application Number | Priority Date | Filing Date | Title
CN202210095507.4A / CN114550715A (en) | 2022-01-26 | 2022-01-26 | A speaker cloud service platform system
GB2300697.6A / GB2616512B (en) | 2022-01-26 | 2023-01-17 | Cloud service platform system for speakers

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210095507.4A / CN114550715A (en) | 2022-01-26 | 2022-01-26 | A speaker cloud service platform system

Publications (1)

Publication Number | Publication Date
CN114550715A (en) | 2022-05-27

Family

ID=81672941

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210095507.4A (Pending, CN114550715A (en)) | A speaker cloud service platform system | 2022-01-26 | 2022-01-26

Country Status (2)

Country | Link
CN (1) | CN114550715A (en)
GB (1) | GB2616512B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104507005A (en)* | 2014-12-05 | 2015-04-08 | 广东欧珀移动通信有限公司 | Grouping control method and system of wireless loudspeaker box
CN104735589A (en)* | 2015-01-30 | 2015-06-24 | 广东欧珀移动通信有限公司 | A volume adjustment system and method for GPS-based smart speaker grouping
CN105845156A (en)* | 2016-03-22 | 2016-08-10 | 广东欧珀移动通信有限公司 | Control method, device and system of music playing system
CN106998514A (en)* | 2016-01-26 | 2017-08-01 | 湖南汇德电子有限公司 | Intelligent multichannel collocation method and system
CN107517462A (en)* | 2017-10-12 | 2017-12-26 | 柴雪 | A kind of multichannel audio amplifier control method and device
CN108513243A (en)* | 2018-02-28 | 2018-09-07 | 成都星环科技有限公司 | A kind of intelligence sound field calibration system
CN108737933A (en)* | 2018-05-30 | 2018-11-02 | 上海与德科技有限公司 | A kind of dialogue method, device and electronic equipment based on intelligent sound box
CN108834138A (en)* | 2018-05-25 | 2018-11-16 | 四川斐讯全智信息技术有限公司 | A kind of distribution method and system based on voice print database
US20190288657A1 (en)* | 2018-03-15 | 2019-09-19 | Harman International Industries, Incorporated | Smart speakers with cloud equalizer
CN209642953U (en)* | 2019-06-05 | 2019-11-15 | 中山市力泰电子工业有限公司 | A new type of wireless transmission audio
CN112134966A (en)* | 2020-11-26 | 2020-12-25 | 飞天诚信科技股份有限公司 | Cloud sound box broadcast voice configuration method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US9898250B1 (en)* | 2016-02-12 | 2018-02-20 | Amazon Technologies, Inc. | Controlling distributed audio outputs to enable voice output
US11653148B2 (en)* | 2019-07-22 | 2023-05-16 | Apple Inc. | Modifying and transferring audio between devices

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104507005A (en)* | 2014-12-05 | 2015-04-08 | 广东欧珀移动通信有限公司 | Grouping control method and system of wireless loudspeaker box
CN104735589A (en)* | 2015-01-30 | 2015-06-24 | 广东欧珀移动通信有限公司 | A volume adjustment system and method for GPS-based smart speaker grouping
CN106998514A (en)* | 2016-01-26 | 2017-08-01 | 湖南汇德电子有限公司 | Intelligent multichannel collocation method and system
CN105845156A (en)* | 2016-03-22 | 2016-08-10 | 广东欧珀移动通信有限公司 | Control method, device and system of music playing system
CN107517462A (en)* | 2017-10-12 | 2017-12-26 | 柴雪 | A kind of multichannel audio amplifier control method and device
CN108513243A (en)* | 2018-02-28 | 2018-09-07 | 成都星环科技有限公司 | A kind of intelligence sound field calibration system
US20190288657A1 (en)* | 2018-03-15 | 2019-09-19 | Harman International Industries, Incorporated | Smart speakers with cloud equalizer
CN108834138A (en)* | 2018-05-25 | 2018-11-16 | 四川斐讯全智信息技术有限公司 | A kind of distribution method and system based on voice print database
CN108737933A (en)* | 2018-05-30 | 2018-11-02 | 上海与德科技有限公司 | A kind of dialogue method, device and electronic equipment based on intelligent sound box
CN209642953U (en)* | 2019-06-05 | 2019-11-15 | 中山市力泰电子工业有限公司 | A new type of wireless transmission audio
CN112134966A (en)* | 2020-11-26 | 2020-12-25 | 飞天诚信科技股份有限公司 | Cloud sound box broadcast voice configuration method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李沛谕: "语音识别技术在智能音箱系统中的应用技术浅析", 中国新通信, no. 20, 20 October 2018 (2018-10-20)*

Also Published As

Publication number | Publication date
GB202300697D0 (en) | 2023-03-01
GB2616512A (en) | 2023-09-13
GB2616512B (en) | 2024-11-06


Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
