CN114612239A - Stock public opinion monitoring and risk control system based on algorithm, big data and artificial intelligence - Google Patents

Info

Publication number
CN114612239A
Authority
CN
China
Prior art keywords
module
bird
data
nest
nest position
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210243161.8A
Other languages
Chinese (zh)
Inventor
刘星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202210243161.8A
Publication of CN114612239A
Legal status: Pending

Abstract

Translated from Chinese

The invention relates to the technical field of financial risk control and management, and in particular to a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence. The system includes a platform architecture unit, a data processing unit, a prediction and identification unit and a risk control management unit. The platform architecture unit manages the platform environment; the data processing unit analyzes data and builds models; the prediction and identification unit predicts risks and performs type identification and degree analysis; the risk control management unit manages and controls risks. Based on the large amount of market data accumulated by financial companies, the invention builds and trains models that can quickly and accurately identify or predict possible risks so that they can be controlled through intervention; it dynamically monitors business processes to reduce risk factors and protect the interests of both parties; and it provides real-time monitoring of the financial securities business, which improves the effect of risk control, improves the business of financial securities companies, fundamentally reduces the degree of risk, and strengthens the risk control system of securities companies.

Figure 202210243161

Description

Translated from Chinese
Stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence

Technical Field

The invention relates to the technical field of financial risk control and management, and in particular to a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence.

Background Art

With the rapid spread of wealth management products, financial securities have become one of the main investment vehicles for most people. Financial securities are securities issued by banks and non-bank financial institutions. However, every wealth management product carries some degree of financial risk, that is, any risk that may cause financial loss to an enterprise or institution. Risk in financial securities is not limited to the asset losses that changes in security prices cause to users; it also includes risks arising from both users and securities companies. Before a transaction, these risks may stem from credit problems of the securities company or the user, from users' insufficient ability to identify risk, or from securities companies not fully disclosing information to users; during a transaction, from policy changes or the influence of public opinion; and after a transaction, from after-sales service. How to identify or predict the risks that may exist in the financial securities business process effectively and in time, and then control them, is a key factor constraining the long-term development of China's financial securities service industry. Practice abroad has shown that Internet financial risk control systems built around big data technology play a major role in preventing financial risk. Securities companies, as an important part of the financial sector, should therefore make full use of their accumulated market data to develop more diversified financial services and improve their risk control systems. At present, however, there is no reasonably comprehensive stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence.

Summary of the Invention

The purpose of the invention is to provide a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, so as to solve the problems raised in the background art above.

To solve the above technical problems, one object of the invention is to provide a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, comprising:

a platform architecture unit, a data processing unit, a prediction and identification unit and a risk control management unit, connected in sequence through network communication. The platform architecture unit connects and manages the equipment, software and technical applications that make up the platform's operating environment. The data processing unit collects a large amount of multi-source data related to financial securities and their risks, organizes and analyzes the data, and builds a complete data analysis model. The prediction and identification unit predicts the risks that may exist in the financial securities business process through in-depth mining and analysis of large amounts of data, and identifies the type and analyzes the degree of each risk. The risk control management unit manages and controls the risks of the financial securities business from multiple aspects, using a variety of risk control measures;

The platform architecture unit includes an infrastructure equipment module, a software environment module, a technical support module and a third-party platform module;

The data processing unit includes a data collection module, a classification and sorting module, a data analysis module and a data model module;

The prediction and identification unit includes a dynamic monitoring module, a risk prediction module, a type identification module and a degree determination module;

The risk control management unit includes a risk control module, a cooperative risk control module, a supervisory intervention module and an improvement measure module.

As a further improvement of this technical solution, the infrastructure equipment module, the software environment module, the technical support module and the third-party platform module are connected in sequence through network communication. The infrastructure equipment module connects and manages the computer equipment that joins the risk control platform system. The software environment module develops, on top of the infrastructure equipment, software and application platforms for risk management of the securities finance business, so as to build the operating environment that supports the system. The technical support module loads intelligent technologies, chiefly artificial intelligence, and introduces a variety of intelligent algorithms to support the smooth operation of the platform system. The third-party platform module connects multiple third-party service platforms, such as financial securities information management platforms and regulatory platforms, to obtain large amounts of supplementary data and supplementary services.

The infrastructure equipment includes, but is not limited to, computers, monitors, tablet PCs, mobile phones, smart sensors, and data acquisition devices (scanners, RFID, ID card OCR, face/fingerprint readers, and the like).

As a further improvement of this technical solution, the signal output of the data collection module is connected to the signal input of the classification and sorting module, the signal output of the classification and sorting module is connected to the signal input of the data analysis module, and the signal output of the data analysis module is connected to the signal input of the data model module. The data collection module obtains a large amount of data related to financial securities from multiple sources by various means. The classification and sorting module classifies and organizes the data according to defined category rules in preparation for later computation and analysis. The data analysis module analyzes the financial securities data using a variety of leading technologies. The data model module builds a data model for risk analysis from the large body of data and the results of the data analysis, then trains and validates it.
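The module chain described above (collect, classify, analyze, then model) can be pictured as a small function pipeline. The following is only a sketch under stated assumptions: the record layout, the `category` field and the function names `classify`, `analyze` and `run_pipeline` are hypothetical and do not come from the patent, and the "analysis" is reduced to counting records per category.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Record = Dict[str, object]  # hypothetical record layout, not from the patent


def classify(records: Iterable[Record], key: str = "category") -> Dict[str, List[Record]]:
    """Classification stage: group raw records by a category field."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        groups[str(record.get(key, "unknown"))].append(record)
    return dict(groups)


def analyze(groups: Dict[str, List[Record]]) -> Dict[str, int]:
    """Toy analysis stage: count the records in each category."""
    return {name: len(items) for name, items in groups.items()}


def run_pipeline(records: Iterable[Record]) -> Dict[str, int]:
    """Chain the stages in the order the text describes: collect, classify, analyze."""
    return analyze(classify(records))


# Hypothetical collected data standing in for the data collection module's output.
sample = [
    {"category": "public_opinion", "text": "news item 1"},
    {"category": "credit", "user": "u1"},
    {"category": "public_opinion", "text": "news item 2"},
]
summary = run_pipeline(sample)
```

In a fuller system the output of `analyze` would feed the model-building stage; that step is omitted here because the patent does not specify it at this level.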

As a further improvement of this technical solution, the data collection module includes a public opinion information module, a user credit module, a company product module and a transaction activity module, which are connected in sequence through network communication and run in parallel. The public opinion information module obtains publicly available historical or real-time public opinion information related to financial securities from the network. The user credit module obtains information related to user credit from users, securities companies, partner banks and similar sources, by lawful means or with user authorization. The company product module obtains information about each securities company, including its operating status, disclosed assets, business lines and specific product details. The transaction activity module obtains information on the full process of transaction activity between users and securities companies.

As a further improvement of this technical solution, the data analysis module includes a neural network module, a machine learning module, a support vector machine module and a collision analysis module, which are connected in sequence through network communication and operate independently. The neural network module uses a neural network training algorithm to tune the weight values so that the prediction performance of the whole network is as good as possible. The BP neural network or support vector machine is trained on the samples in the training sample set and tested on the samples in the test sample set, yielding a BP-neural-network-based stock trend prediction model that is used to predict the trend of the target stock. A parameter optimization unit uses the artificial firefly swarm optimization algorithm to optimize the initial weights and thresholds of the BP neural network, or uses the cuckoo search algorithm to optimize the penalty factor and kernel function parameters of the support vector machine. The machine learning module trains the neural network using machine learning techniques so that the parameters approximate the true model as closely as possible, giving model training advantages in both performance and data utilization. The support vector machine module applies the support vector machine algorithm, separating the data by constructing a separating surface so that relationships can be analyzed. The collision analysis module cross-analyzes data and risk factors from different financial fields and different financial institutions to uncover potential risk situations.

As a further improvement of this technical solution, in the machine learning module, the training algorithm for machine learning is as follows:

First, assign random values to all parameters, and use these randomly generated parameter values to predict the samples in the training data;

Let the predicted target of a sample be [formula image 001] and the true target be y; then define a value loss, calculated as follows:

[formula image 002];

Here loss is called the loss, and the goal of machine learning is to make the sum of the losses over all training data as small as possible;

Furthermore, if the matrix formula for the neural network's prediction given earlier is substituted into [formula image 001], the loss can be written as a loss function of the parameters.
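The loss formula itself survives only as an image placeholder, so the sketch below assumes the common squared-error form, loss = (y_pred - y)^2, and a plain linear predictor in place of the neural network. Both choices, and the names `predict`, `squared_loss` and `total_loss`, are assumptions made for illustration. The point is only to show how substituting the prediction formula turns the summed loss into a function of the parameters.

```python
from typing import Sequence


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Assumed stand-in predictor: a linear model w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def squared_loss(y_pred: float, y_true: float) -> float:
    """Assumed per-sample loss: squared difference between prediction and target."""
    return (y_pred - y_true) ** 2


def total_loss(w: Sequence[float], b: float,
               xs: Sequence[Sequence[float]], ys: Sequence[float]) -> float:
    """Sum of per-sample losses, viewed as a function of the parameters (w, b)."""
    return sum(squared_loss(predict(w, b, x), y) for x, y in zip(xs, ys))


# Two training samples; with w = [2.0], b = 1.0 the predictions are 3.0 and 5.0.
xs = [[1.0], [2.0]]
ys = [3.0, 6.0]
loss_value = total_loss([2.0], 1.0, xs, ys)
```

Training then amounts to searching for the `w` and `b` that make `total_loss` as small as possible, for example by gradient descent.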

As a further improvement of this technical solution, in the support vector machine module, the support vectors are selected as follows:

Taking the linearly separable SVM as an example, regard w as a linear combination of several samples, where the first sample is [formula image 003] and the i-th sample is [formula image 004]. Give each x a coefficient [formula image 005]; then: [formula image 006]. Select some of the [formula image 005] so that their values are nonzero and set the rest to 0. The x vectors that actually contribute to w are then those with nonzero coefficients. These vectors support the normal vector, and are therefore the support vectors;
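The idea that only samples with nonzero coefficients contribute to w can be checked numerically. This sketch follows the text's wording and forms w as the sum of coefficient times sample; the original formula is an image that is not reproduced here, and in the usual dual formulation of the SVM the label y_i is also folded into each term. The function names and the tolerance `tol` are hypothetical.

```python
from typing import List, Sequence


def weight_vector(alphas: Sequence[float], xs: Sequence[Sequence[float]]) -> List[float]:
    """Form w as a linear combination of the samples: w = sum_i alpha_i * x_i."""
    dim = len(xs[0])
    w = [0.0] * dim
    for alpha, x in zip(alphas, xs):
        for j in range(dim):
            w[j] += alpha * x[j]
    return w


def support_indices(alphas: Sequence[float], tol: float = 1e-12) -> List[int]:
    """Indices of samples with nonzero coefficients, i.e. the support vectors."""
    return [i for i, alpha in enumerate(alphas) if abs(alpha) > tol]


xs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
alphas = [0.5, 0.0, -1.0]           # the second sample has coefficient 0
w = weight_vector(alphas, xs)       # only samples 0 and 2 contribute
support = support_indices(alphas)
```

Dropping the zero-coefficient sample from `xs` and `alphas` leaves `w` unchanged, which is exactly what it means for the remaining vectors to "support" the normal vector.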

Suppose the line l has parameters w and b. By computing the distance from each sample to l, one can judge which line is the most suitable separating line. The distance d can be expressed as: [formula image 007]. Each sample in the data set has the form [formula image 008], and the y value of each sample is its label (1 for a positive example, -1 for a negative example; the sign reflects which side of the separating line the sample lies on, with the positive direction of the normal counted as positive);

Multiplying the y value into the right-hand side of the equation gives: [formula image 009]. The y value here is the sample's actual sign. If the estimated value has the same sign as the actual value, the classification is correct and the result is positive; if the classification is wrong, the result is negative;
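The sign behaviour described above can be illustrated with the standard point-to-hyperplane distance, (w.x + b) / ||w||, multiplied by the label. The patent's own expressions are image placeholders, so treating them as this standard form is an assumption; `signed_margin` is a hypothetical name.

```python
import math
from typing import Sequence


def signed_margin(w: Sequence[float], b: float, x: Sequence[float], y: int) -> float:
    """Label times signed distance: positive when x lies on the side its label y predicts."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


w, b = [3.0, 4.0], -5.0
correct = signed_margin(w, b, [3.0, 4.0], 1)     # positive sample on the positive side
wrong = signed_margin(w, b, [3.0, 4.0], -1)      # same point labeled negative
other_side = signed_margin(w, b, [0.0, 0.0], -1) # negative sample on the negative side
```

A correctly classified sample yields a positive value whose magnitude is its distance to the line, while a misclassified sample yields the same magnitude with a negative sign.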

Among all samples, the one closest to the line is chosen as the support vector, and the distance between the support vector and the line is the margin (transition band). Because the SVM seeks the largest possible margin, the final choice of the parameters w and b can be expressed as:

[formula image 010];

Therefore, given a linearly separable training data set, the separating hyperplane obtained by margin maximization is: [formula image 011], and the corresponding classification decision function is: [formula image 012].
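To make the "largest margin" criterion concrete, the sketch below scores a few candidate (w, b) pairs by their smallest label-times-distance value and keeps the best one, then classifies with the sign of w.x + b. This is a brute-force illustration over a hand-picked candidate list, not how an SVM is actually trained (that requires solving a quadratic program), and the standard forms of the margin and decision function are assumed because the patent's formulas are image placeholders. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]          # (feature vector, label in {+1, -1})
Separator = Tuple[Sequence[float], float]     # (w, b)


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def min_margin(w: Sequence[float], b: float, data: List[Sample]) -> float:
    """Smallest label-times-distance over the data set (the margin of this separator)."""
    norm = math.sqrt(dot(w, w))
    return min(y * (dot(w, x) + b) / norm for x, y in data)


def best_separator(candidates: List[Separator], data: List[Sample]) -> Separator:
    """Pick the candidate whose closest sample is farthest away."""
    return max(candidates, key=lambda wb: min_margin(wb[0], wb[1], data))


def decide(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Decision function: the sign of w.x + b, reported as +1 or -1."""
    return 1 if dot(w, x) + b >= 0 else -1


data = [([2.0, 0.0], 1), ([-2.0, 0.0], -1)]
candidates = [([1.0, 0.0], 0.0), ([1.0, 1.0], 0.0), ([1.0, 0.0], 1.0)]
w_star, b_star = best_separator(candidates, data)
```

Here the vertical line through the origin wins because both samples sit at distance 2 from it, whereas the other two candidates leave one sample closer.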

As a further improvement of this technical solution, the signal output of the dynamic monitoring module is connected to the signal input of the risk prediction module, the signal output of the risk prediction module is connected to the signal input of the type identification module, and the signal output of the type identification module is connected to the signal input of the degree determination module. The dynamic monitoring module uses digital technology to replace risk control approaches that take user data from a single past period or point in time as the basis for review and therefore cannot capture continuous data; it emphasizes continuous user information and assigns it a higher weight, thereby achieving dynamic monitoring of financial securities business risk. The risk prediction module uses the constructed risk prediction data model to automatically predict the risk factors that may exist across the full process of transaction activity between users and securities companies. The type identification module identifies the type of a predicted risk according to where it occurs in the transaction activity. The degree determination module automatically evaluates the degree of each risk according to preset risk level classification rules.

The risk types include, but are not limited to, market risk, credit risk, liquidity risk, operational risk, industry risk, legal, regulatory or policy risk, personnel risk, and natural disasters or other emergencies.

As a further improvement of this technical solution, the risk control module, the cooperative risk control module, the supervisory intervention module and the improvement measure module are connected in sequence through network communication. The risk control module controls and manages the various risks that may arise in the financial securities business process at three stages: before, during and after the event. The cooperative risk control module improves the effect of risk control by sharing risk control data and risk control methods across different financial fields and different financial institutions. The supervisory intervention module starts from data supervision: on the basis of further opening up brokers' rights to develop customer information and transaction data, it monitors in real time the sources and channels through which brokers themselves or third-party companies obtain customer information, the process of further in-depth internal data processing, and the databases subsequently built from it, including customer trading habits and credit records. This achieves end-to-end supervision of Internet-based securities business, and intervention by third-party supervision platforms is introduced to keep the risk of the financial securities business low. The improvement measure module allows securities companies to use big data to improve certain business functions and so strengthen their risk control systems.

Ex-ante risk control mainly covers credit investigation, risk pricing, anti-fraud and related aspects.

As a further improvement of this technical solution, the improvement measure module includes a user online module, a transaction account module, a data fusion module and a product innovation module, which are connected in sequence through network communication. The user online module moves online the offline archived records, transaction behavior and other data of a securities company's large base of existing offline customers, so as to accumulate users' online data. The transaction account module enables online payment from users' securities trading accounts, extends their online non-securities-trading functions, and expands the account system to other online platforms in order to accumulate more non-securities-trading data. The data fusion module builds a data fusion platform on top of multiple financial information management platforms to collect and integrate all data, so that individual behavior can be analyzed and predicted along every dimension and individual risk assessment can be carried out across all dimensions quickly and accurately. The product innovation module uses big data analysis as the basis for all-round product innovation, in order to develop customized products and push and market them precisely; it also supports Internet-based product design and risk pricing based on big data.

As a further improvement of this technical solution, when the support vector machine is trained with the constructed sample set, optimizing the penalty factor and kernel function parameters of the support vector machine with the cuckoo search algorithm includes the setting that the smaller the fitness function value of a cuckoo's nest position, the better the solution corresponding to that nest position. The monitoring target determination unit extracts the stock subject from each item of received news public opinion data and counts the items that contain each stock subject. When the proportion of items containing a stock subject among the news public opinion data received this time exceeds a given threshold, that stock subject is determined to be a target stock requiring public opinion monitoring. Specifically:

Let [formula image 013] denote the total number of news public opinion items received by the monitoring target determination unit this time, and let [formula image 014] denote the [formula image 015]-th stock subject that the unit extracted from those items. When the items containing the [formula image 015]-th stock subject satisfy [formula image 016] among the items received this time, the [formula image 015]-th stock subject is determined to be a target stock requiring public opinion monitoring, where [formula image 017] is the given threshold, whose value may be taken as [formula image 018];
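The target-selection rule can be sketched as a share-of-mentions test: a stock subject becomes a monitoring target when the fraction of received items mentioning it exceeds the threshold. The input layout (one set of subjects per news item), the function name `target_stocks` and the threshold value 0.4 used in the example are all hypothetical; the patent's suggested threshold value is an image that is not reproduced here.

```python
from collections import Counter
from typing import Iterable, List, Set


def target_stocks(news_items: Iterable[Set[str]], threshold: float) -> List[str]:
    """Return stock subjects whose share of the received news items exceeds the threshold.

    Each element of news_items is the set of stock subjects extracted from one item.
    """
    items = list(news_items)
    total = len(items)
    if total == 0:
        return []
    counts: Counter = Counter()
    for subjects in items:
        counts.update(set(subjects))   # count each subject at most once per item
    return sorted(s for s, c in counts.items() if c / total > threshold)


# Hypothetical batch of four news items; "A" appears in 3 of 4, "C" in 2, "B" in 1.
batch = [{"A", "B"}, {"A"}, {"C"}, {"A", "C"}]
targets = target_stocks(batch, threshold=0.4)
```

With this batch and threshold, "A" (0.75) and "C" (0.5) qualify, while "B" (0.25) does not.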

The public opinion early-warning unit compiles statistics on the news public opinion data containing the target stock, and issues a warning when the proportion of items with a negative sentiment label among the items containing the target stock exceeds a given warning threshold. Specifically:

Let [formula image 019] denote the total number of news public opinion items containing the [formula image 020]-th target stock that the public opinion monitoring module has received, and let [formula image 021] denote the number of those items whose sentiment label is negative. When the proportion of negatively labeled items among the items containing the target stock satisfies [formula image 022], the public opinion early-warning unit issues a warning, where [formula image 023] is the given warning threshold, whose value may be taken as [formula image 018];
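The warning rule reduces to comparing a negative-sentiment ratio against the warning threshold. In the sketch below the label strings, the function name `should_warn` and the example thresholds are hypothetical; the comparison is strict because the text says the proportion must exceed the threshold.

```python
from typing import Sequence


def should_warn(labels: Sequence[str], warn_threshold: float) -> bool:
    """True when the share of 'negative' labels among a target stock's items exceeds the threshold."""
    if not labels:
        return False                       # nothing received, nothing to warn about
    negative = sum(1 for label in labels if label == "negative")
    return negative / len(labels) > warn_threshold


# Hypothetical sentiment labels for four items about one target stock (2 of 4 negative).
labels = ["negative", "negative", "positive", "neutral"]
warn_low = should_warn(labels, warn_threshold=0.4)
warn_equal = should_warn(labels, warn_threshold=0.5)
```

Because the ratio here is exactly 0.5, a threshold of 0.4 triggers a warning and a threshold of 0.5 does not.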

The public opinion data management unit preprocesses the received news public opinion data containing the target stock. The preprocessing includes word segmentation, stop-word filtering, and deletion of all text unrelated to sentiment information, such as link addresses and contact details;
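A minimal preprocessing sketch along these lines is shown below. It is an illustration under assumptions, not the patent's implementation: the regular expressions for links, e-mail addresses and phone numbers are simplified, the stop-word list is a tiny hypothetical English one, and a regex tokenizer stands in for word segmentation, which for Chinese text would require a dedicated segmenter.

```python
import re
from typing import List

# Simplified patterns for text that carries no sentiment information (assumptions).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")

# Tiny hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "at", "for", "and", "or"}


def preprocess(text: str) -> List[str]:
    """Strip links and contact details, tokenize, then drop stop words."""
    for pattern in (URL_RE, EMAIL_RE, PHONE_RE):
        text = pattern.sub(" ", text)
    tokens = re.findall(r"[A-Za-z']+|\d+", text.lower())   # stand-in for word segmentation
    return [token for token in tokens if token not in STOP_WORDS]


raw = ("ACME shares fell sharply. Details: https://example.com/news "
       "contact ir@acme.com or 555-123-4567")
tokens = preprocess(raw)
```

The link, e-mail address and phone number are removed before tokenization so that their fragments never reach the sentiment classifier.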

Let [formula image 024] denote the final nest position of the [formula image 025]-th cuckoo in the cuckoo population after the [formula image 026]-th iteration update. The Lévy flight mode is used to perform the [formula image 027]-th iteration update on nest position [formula image 024], specifically:

[formula image 029]

[formula image 031]

where [formula image 032] denotes the nest position obtained after the [formula image 034]-th iteration update of nest position [formula image 033] in Lévy flight mode, [formula image 035] denotes the step-size factor, [formula image 036] denotes point-to-point (entry-wise) multiplication, [formula image 037] denotes a random search vector drawn from a Lévy distribution with parameter [formula image 038], [formula image 039] denotes the fitness function value of nest position [formula image 032], and [formula image 040] denotes the fitness function value of nest position [formula image 033];
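The two update formulas are lost as images, so the sketch below shows one common way such a Lévy flight step is written in cuckoo search, under explicit assumptions: the candidate position is the old position plus a step-size factor times a Lévy-distributed random vector, and the candidate replaces the old position only if its fitness value is smaller (the text states elsewhere that a smaller fitness value means a better solution). The Lévy steps use Mantegna's algorithm, which is a choice made here and is not named in the patent. `levy_step`, `levy_update`, `alpha` and `beta` are hypothetical names and defaults.

```python
import math
import random
from typing import Callable, List, Optional, Sequence


def levy_step(beta: float, rng: random.Random) -> float:
    """One Levy-distributed step via Mantegna's algorithm (an assumed choice)."""
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:                        # avoid division by zero in the rare degenerate draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_update(nest: Sequence[float],
                fitness: Callable[[Sequence[float]], float],
                alpha: float = 0.01,
                beta: float = 1.5,
                rng: Optional[random.Random] = None) -> List[float]:
    """Propose nest + alpha * Levy(beta) and keep it only if its fitness is smaller."""
    rng = rng or random.Random()
    candidate = [x + alpha * levy_step(beta, rng) for x in nest]
    return candidate if fitness(candidate) < fitness(nest) else list(nest)


def sphere(position: Sequence[float]) -> float:
    """Toy fitness function for the demonstration: sum of squares, minimized at the origin."""
    return sum(x * x for x in position)


rng = random.Random(0)
nest = [1.0, -2.0]
history = [sphere(nest)]
for _ in range(50):
    nest = levy_update(nest, sphere, rng=rng)
    history.append(sphere(nest))
```

In the patent's setting each nest position would encode a pair of SVM hyperparameters (penalty factor and kernel parameter), and the fitness function would measure the trained model's error; the toy `sphere` function is used here only so the sketch runs on its own.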

After the [formula image 027]-th iteration update of the nest positions in the population in Lévy flight mode, the following steps are used to select nest positions in the population for the [formula image 027]-th preference random update:

(1) Divide into regions the nest positions obtained after the [formula image 027]-th iteration update in Lévy flight mode;

(2) Select nest positions in each of the divided regions for the [formula image 027]-th preference random update;

A region segmentation threshold [Figure DEST_PATH_IMAGE041] is given for the population after the [Figure 102869DEST_PATH_IMAGE027]-th iteration update in the Lévy flight mode, and the value of [Figure 301114DEST_PATH_IMAGE041] is set to [Figure 884542DEST_PATH_IMAGE042], where [Figure DEST_PATH_IMAGE043] denotes the nearest-neighbor segmentation value of nest position [Figure 134258DEST_PATH_IMAGE044], with [Figure DEST_PATH_IMAGE045]; [Figure 181848DEST_PATH_IMAGE046] denotes the nest position in the current population that is the [Figure 562331DEST_PATH_IMAGE020]-th nearest to nest position [Figure 491607DEST_PATH_IMAGE032]; [Figure DEST_PATH_IMAGE047] is a given positive integer satisfying [Figure 615738DEST_PATH_IMAGE048], and the value of [Figure 517835DEST_PATH_IMAGE047] may be taken as [Figure DEST_PATH_IMAGE049]; and [Figure 998495DEST_PATH_IMAGE050] is the number of cuckoos in the population. The nest positions obtained from the [Figure 148033DEST_PATH_IMAGE027]-th Lévy-flight iteration update are divided into regions according to the given region segmentation threshold [Figure 290936DEST_PATH_IMAGE041] by the following steps:

Step 1: Randomly select a nest position in the population. Let [Figure DEST_PATH_IMAGE051] be the nest position selected this time, and let [Figure 904637DEST_PATH_IMAGE052] denote the final nest position of the [Figure DEST_PATH_IMAGE053]-th cuckoo in the cuckoo population after the [Figure 556198DEST_PATH_IMAGE026]-th iteration update, so that [Figure 663831DEST_PATH_IMAGE051] is the nest position obtained by applying the [Figure 607833DEST_PATH_IMAGE034]-th iteration update to nest position [Figure 59040DEST_PATH_IMAGE054] in the Lévy flight mode. Mark the region containing nest position [Figure DEST_PATH_IMAGE055] as [Figure 695875DEST_PATH_IMAGE056], assign nest position [Figure 25225DEST_PATH_IMAGE051] to region [Figure 958546DEST_PATH_IMAGE056], and then screen, one by one, the nest positions in the population that have not yet been assigned to a region, specifically:

Let [Figure 627425DEST_PATH_IMAGE024] denote the final nest position of the [Figure 886368DEST_PATH_IMAGE025]-th cuckoo in the cuckoo population after the [Figure 375118DEST_PATH_IMAGE026]-th iteration update, and let [Figure 377709DEST_PATH_IMAGE044] denote the nest position obtained by applying the [Figure 127676DEST_PATH_IMAGE027]-th iteration update to nest position [Figure 963411DEST_PATH_IMAGE024] in the Lévy flight mode. When nest position [Figure 369302DEST_PATH_IMAGE044] satisfies [Figure DEST_PATH_IMAGE057], nest position [Figure 910005DEST_PATH_IMAGE044] is assigned to region [Figure 350213DEST_PATH_IMAGE056];

After all nest positions in the population that have not been assigned to a region have been screened, go to Step 2;

Step 2: Randomly select a nest position from those in the population that have not yet been assigned to a region. Let [Figure 950959DEST_PATH_IMAGE058] be the nest position selected this time, and let [Figure DEST_PATH_IMAGE059] denote the final nest position of the [Figure 414301DEST_PATH_IMAGE060]-th cuckoo in the cuckoo population after the [Figure 758695DEST_PATH_IMAGE026]-th iteration update, so that [Figure 321919DEST_PATH_IMAGE058] is the nest position obtained by applying the [Figure 44204DEST_PATH_IMAGE027]-th iteration update to nest position [Figure 827987DEST_PATH_IMAGE059] in the Lévy flight mode. Mark the region containing nest position [Figure 926710DEST_PATH_IMAGE058] as [Figure DEST_PATH_IMAGE061], assign nest position [Figure 341510DEST_PATH_IMAGE058] to region [Figure 18479DEST_PATH_IMAGE061], and then screen, one by one, the nest positions in the population that have not yet been assigned to a region, specifically:

When nest position [Figure 456414DEST_PATH_IMAGE044] satisfies [Figure 142610DEST_PATH_IMAGE062], nest position [Figure 349601DEST_PATH_IMAGE044] is assigned to region [Figure 463050DEST_PATH_IMAGE061];

After all nest positions in the population that have not been assigned to a region have been screened, go to Step 3;

Step 3: While the number of nest positions in the population that have not been assigned to a region is not [Figure DEST_PATH_IMAGE063], continue to divide the unassigned nest positions into regions in the manner of Step 2; when that number is [Figure 450598DEST_PATH_IMAGE063], stop dividing the nest positions in the population into regions;
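Steps 1 to 3 above amount to a seed-and-gather partition, which can be sketched as below. The membership condition and the threshold formula are images that are not available in this text, so both are assumptions here: a nest joins a region when its Euclidean distance to the region's seed is at most the threshold, and the threshold is taken as the population mean of each nest's mean distance to its `k` nearest neighbors. The function names `knn_threshold` and `divide_regions` are hypothetical.

```python
import math
import random


def knn_threshold(nests, k):
    """Assumed threshold: average, over all nests, of the mean distance to the k nearest nests."""
    per_nest = []
    for i, p in enumerate(nests):
        dists = sorted(math.dist(p, q) for j, q in enumerate(nests) if j != i)
        per_nest.append(sum(dists[:k]) / k)
    return sum(per_nest) / len(per_nest)


def divide_regions(nests, threshold, rng=random):
    """Partition nest indices into regions by repeated random seeding.

    Step 1/2: pick a random unassigned nest as the seed of a new region and
    gather every unassigned nest within `threshold` of it (assumed criterion).
    Step 3: repeat until no unassigned nest remains.
    """
    unassigned = list(range(len(nests)))
    regions = []
    while unassigned:
        seed = rng.choice(unassigned)
        region = [i for i in unassigned
                  if math.dist(nests[i], nests[seed]) <= threshold]
        regions.append(region)
        members = set(region)
        unassigned = [i for i in unassigned if i not in members]
    return regions
```

The seed always falls in its own region (its distance to itself is zero), so every pass removes at least one nest and the loop terminates with each nest in exactly one region.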

In each of the divided regions, nest positions are selected for the [Figure 674906DEST_PATH_IMAGE027]-th preference random update, specifically:

Let [Figure 1982DEST_PATH_IMAGE064] denote the [Figure DEST_PATH_IMAGE065]-th region obtained by dividing into regions the nest positions resulting from the [Figure 286333DEST_PATH_IMAGE034]-th Lévy-flight iteration update. Define [Figure 495597DEST_PATH_IMAGE066] as the region attribute coefficient of the nest positions in region [Figure 523596DEST_PATH_IMAGE064], with the value of [Figure 705179DEST_PATH_IMAGE066] given by [Figure DEST_PATH_IMAGE067], where [Figure 222748DEST_PATH_IMAGE068] denotes the mean of the proximity distance values of the nest positions in region [Figure 122571DEST_PATH_IMAGE064], with [Figure DEST_PATH_IMAGE069]; [Figure 688681DEST_PATH_IMAGE070] denotes the coefficient of dispersion of the proximity distance values of the nest positions in region [Figure 724770DEST_PATH_IMAGE064], with [Figure DEST_PATH_IMAGE071]; [Figure 413241DEST_PATH_IMAGE072] denotes the [Figure DEST_PATH_IMAGE073]-th nest position in region [Figure 534780DEST_PATH_IMAGE064]; [Figure 904582DEST_PATH_IMAGE074] denotes the proximity distance value of nest position [Figure 60757DEST_PATH_IMAGE072], with [Figure DEST_PATH_IMAGE075]; [Figure 920128DEST_PATH_IMAGE076] denotes the nest position in region [Figure 528964DEST_PATH_IMAGE064] that is nearest to nest position [Figure 436877DEST_PATH_IMAGE072]; and [Figure DEST_PATH_IMAGE077] denotes the number of nest positions in region [Figure 509875DEST_PATH_IMAGE064];
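The two ingredients of the region attribute coefficient named above, the mean and the coefficient of dispersion of the proximity distance values, can be computed as in the sketch below. It assumes that a nest's proximity distance value is its Euclidean distance to the nearest other nest in the same region and that the coefficient of dispersion is the population standard deviation divided by the mean. How the source combines the two into the coefficient itself is in a formula image that is not available here, so the sketch stops at the two statistics. The function names are hypothetical.

```python
import math
import statistics


def proximity_distances(region):
    """Distance from each nest to its nearest neighbor inside the region (assumed definition)."""
    if len(region) < 2:
        raise ValueError("a region needs at least two nests to have a nearest neighbor")
    return [
        min(math.dist(p, q) for j, q in enumerate(region) if j != i)
        for i, p in enumerate(region)
    ]


def region_statistics(region):
    """Return (mean, coefficient of dispersion) of the proximity distances."""
    d = proximity_distances(region)
    mean = statistics.fmean(d)
    # Coefficient of dispersion taken as std / mean; 0.0 when all nests coincide.
    dispersion = statistics.pstdev(d) / mean if mean > 0 else 0.0
    return mean, dispersion
```

For three nests on a line at 0, 1 and 3, the proximity distances are 1, 1 and 2, giving a mean of 4/3 and a dispersion of about 0.354.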

Substitutability detection is performed on the nest positions in region [Figure 477831DEST_PATH_IMAGE064] by the following steps:

Step 1: Let [Figure 573963DEST_PATH_IMAGE078] denote the set of nest positions in region [Figure 285568DEST_PATH_IMAGE064] that have not yet undergone substitutability detection, and let [Figure DEST_PATH_IMAGE079] denote the [Figure 614579DEST_PATH_IMAGE080]-th nest position in set [Figure 475722DEST_PATH_IMAGE078]. Define [Figure DEST_PATH_IMAGE081] as the substitutability coefficient of nest position [Figure 932428DEST_PATH_IMAGE079] in set [Figure 447723DEST_PATH_IMAGE078], with the value of [Figure 495313DEST_PATH_IMAGE081] given by:

[Figure DEST_PATH_IMAGE083]

In the formula, [Figure 805072DEST_PATH_IMAGE084] denotes the [Figure DEST_PATH_IMAGE085]-th nest position in set [Figure 610217DEST_PATH_IMAGE078], and [Figure 663624DEST_PATH_IMAGE086] is the substitutability judgment function between nest position [Figure 300141DEST_PATH_IMAGE079] and nest position [Figure 780801DEST_PATH_IMAGE084]: when [Figure DEST_PATH_IMAGE087], [Figure 338821DEST_PATH_IMAGE086] takes the value [Figure 930340DEST_PATH_IMAGE088]; when [Figure DEST_PATH_IMAGE089], [Figure 218102DEST_PATH_IMAGE086] takes the value [Figure 869663DEST_PATH_IMAGE063];

The substitutability coefficient of each nest position in set [Figure 649400DEST_PATH_IMAGE078] is calculated in this way;

Step 2: In the current set [Figure 44609DEST_PATH_IMAGE078], select the nest position with the largest substitutability coefficient for substitutability detection. Let [Figure 921298DEST_PATH_IMAGE090] be the nest position with the largest substitutability coefficient in the current set [Figure 743761DEST_PATH_IMAGE078], where [Figure 10794DEST_PATH_IMAGE090] is the [Figure DEST_PATH_IMAGE091]-th nest position in set [Figure 209694DEST_PATH_IMAGE078]. Let [Figure 940890DEST_PATH_IMAGE092] denote the set of nest positions in set [Figure 688583DEST_PATH_IMAGE078] that can substitute for nest position [Figure 934254DEST_PATH_IMAGE090], with [Figure DEST_PATH_IMAGE093], where [Figure 691174DEST_PATH_IMAGE094] denotes the [Figure DEST_PATH_IMAGE095]-th nest position in set [Figure 214559DEST_PATH_IMAGE078], and [Figure 441141DEST_PATH_IMAGE096] is the substitutability judgment function between nest position [Figure 682767DEST_PATH_IMAGE090] and nest position [Figure 223470DEST_PATH_IMAGE094]: when [Figure DEST_PATH_IMAGE097], [Figure 398099DEST_PATH_IMAGE096] takes the value [Figure 998845DEST_PATH_IMAGE088]; when [Figure 462187DEST_PATH_IMAGE098], [Figure 806581DEST_PATH_IMAGE096] takes the value [Figure 38979DEST_PATH_IMAGE063]. Let [Figure DEST_PATH_IMAGE099] denote the substitutability coefficient of nest position [Figure 141452DEST_PATH_IMAGE090] in set [Figure 92090DEST_PATH_IMAGE078]. When the value of [Figure 974596DEST_PATH_IMAGE099] satisfies [Figure 327079DEST_PATH_IMAGE100], randomly select [Figure DEST_PATH_IMAGE101] nest positions from set [Figure 331945DEST_PATH_IMAGE092] for the preference random update, mark the nest positions in set [Figure 769879DEST_PATH_IMAGE092] together with nest position [Figure 456075DEST_PATH_IMAGE090] as having undergone substitutability detection, remove these nest positions from set [Figure 663066DEST_PATH_IMAGE078], and go to Step 3. When the value of [Figure 573253DEST_PATH_IMAGE099] satisfies [Figure 498484DEST_PATH_IMAGE102], stop selecting nest positions in region [Figure 722792DEST_PATH_IMAGE064] for the [Figure 49868DEST_PATH_IMAGE027]-th preference random update. Here [Figure DEST_PATH_IMAGE103] is a given substitutability detection threshold, [Figure 396536DEST_PATH_IMAGE103] is a positive integer greater than [Figure 809062DEST_PATH_IMAGE088], and the value of [Figure 837061DEST_PATH_IMAGE103] may be taken as [Figure 18644DEST_PATH_IMAGE104];

Step 3: Continue, in the manner of Step 2, to select the nest position with the largest substitutability coefficient in the current set [Figure 473896DEST_PATH_IMAGE078] for substitutability detection, until all nest positions in set [Figure 904877DEST_PATH_IMAGE078] have undergone substitutability detection;
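The substitutability detection loop in Steps 1 to 3 can be sketched as follows. The judgment conditions, the threshold comparison and the number of nests to pick are formula images that are not available in this text, so the sketch assumes: two nests are mutually substitutable when their Euclidean distance is at most `radius`; the substitutability coefficient is the count of such partners; detection proceeds while the largest count is at least `min_count`; and the number of nests picked is supplied by a caller-provided `pick_count` function. All of these names are hypothetical.

```python
import math
import random


def select_for_update(region, radius, min_count, pick_count, rng=random):
    """Return indices of nests in `region` chosen for the preference random update."""
    untested = list(range(len(region)))
    selected = []
    while untested:
        # Step 1: substitutable partners (and hence the coefficient) for every untested nest.
        partners = {
            i: [j for j in untested
                if j != i and math.dist(region[i], region[j]) <= radius]
            for i in untested
        }
        # Step 2: examine the nest with the largest coefficient.
        best = max(untested, key=lambda i: len(partners[i]))
        if len(partners[best]) < min_count:
            break  # assumed stop condition: coefficient below the detection threshold
        k = min(pick_count(len(partners[best])), len(partners[best]))
        selected.extend(rng.sample(partners[best], k))
        # Mark the nest and its partners as tested and remove them (then Step 3: repeat).
        tested = set(partners[best]) | {best}
        untested = [i for i in untested if i not in tested]
    return selected
```

The intuition this sketch tries to capture is that densely clustered nests carry redundant information, so a few of them can be re-randomized at little cost, while sparse nests are left alone.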

The selected nest positions undergo the [Figure 470988DEST_PATH_IMAGE027]-th preference random update in the following manner:

Let [Figure DEST_PATH_IMAGE105] denote a nest position selected for the [Figure 772656DEST_PATH_IMAGE034]-th preference random update, and let [Figure 398810DEST_PATH_IMAGE106] denote the final nest position of the [Figure DEST_PATH_IMAGE107]-th cuckoo in the population after the [Figure 848245DEST_PATH_IMAGE026]-th iteration update, so that [Figure 218047DEST_PATH_IMAGE105] is the nest position obtained by applying the [Figure 171276DEST_PATH_IMAGE034]-th iteration update to nest position [Figure 374222DEST_PATH_IMAGE106] in the Lévy flight mode. Nest position [Figure 780112DEST_PATH_IMAGE105] then undergoes the [Figure 750342DEST_PATH_IMAGE034]-th preference random update as follows:

[Figure DEST_PATH_IMAGE109]

[Figure DEST_PATH_IMAGE111]

In the formulas, [Figure 761024DEST_PATH_IMAGE112] denotes the nest position obtained by updating nest position [Figure 728980DEST_PATH_IMAGE105] in the preference random update mode; [Figure DEST_PATH_IMAGE113] denotes the fitness function value of nest position [Figure 621849DEST_PATH_IMAGE112]; [Figure 333453DEST_PATH_IMAGE114] denotes the fitness function value of nest position [Figure 198641DEST_PATH_IMAGE105]; [Figure DEST_PATH_IMAGE115] denotes the final nest position of the [Figure 498402DEST_PATH_IMAGE107]-th cuckoo in the population after the [Figure 81831DEST_PATH_IMAGE034]-th iteration update; [Figure 597126DEST_PATH_IMAGE116] and [Figure DEST_PATH_IMAGE117] are two nest positions randomly selected from the nest positions obtained from the current [Figure 316820DEST_PATH_IMAGE034]-th Lévy-flight iteration update, with [Figure 423316DEST_PATH_IMAGE118]; and [Figure DEST_PATH_IMAGE119] denotes a random number between [Figure 494040DEST_PATH_IMAGE063] and [Figure 547447DEST_PATH_IMAGE088].
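The preference random update can be sketched along the lines of the usual cuckoo search random walk. The two formula images are not available in this text, so the sketch assumes the candidate is the current nest plus a uniform random factor in [0, 1) times the difference of two distinct, randomly chosen nests, and that the final position is whichever of the candidate and the current nest has the lower fitness (minimization is assumed). The function name is hypothetical.

```python
import random


def preference_random_update(nest, population, fitness, rng=random):
    """Random-walk update with greedy acceptance (assumed form of the missing formulas)."""
    a, b = rng.sample(population, 2)  # two distinct nests from the Levy-updated population
    r = rng.random()                  # random number in [0, 1)
    candidate = [x + r * (xa - xb) for x, xa, xb in zip(nest, a, b)]
    return candidate if fitness(candidate) < fitness(nest) else list(nest)
```

As with the Lévy flight step, greedy acceptance means the final nest position after the iteration is never worse than the position it started from.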

A second object of the present invention is to provide a method for operating the stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, comprising:

First, on the product architecture of the prediction and risk control platform, the relevant software applications are developed and connected to the financial securities information management platforms, third-party service platforms and regulatory platforms. Large volumes of financial public opinion information, user credit information, information on trading activity between users and securities companies, and data on securities companies and their products are obtained lawfully from multiple channels. After the data have been collected and integrated, algorithmic techniques such as neural networks, machine learning and support vector machines are used to model and train on the data so as to ensure the accuracy of the risk prediction data model. While the platform is running, the model dynamically monitors the financial securities business, predicts in real time the risks that may arise anywhere in the business process, identifies the type of each risk and assesses its degree, so that preset risk control measures can be applied to intervene in or eliminate it. Big data is also used to continuously improve the policies and processes of the financial securities trading business, with the aim of fundamentally reducing its risk.

A third object of the present invention is to provide an operating device for the stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, comprising a processor, a memory, and a computer program stored in the memory and run on the processor, wherein the processor implements the above stock public opinion monitoring and risk control system when executing the computer program.

A fourth object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence.

Compared with the prior art, the beneficial effects of the present invention are:

1. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence builds on the large volume of market data accumulated by financial companies and obtains large amounts of public opinion, financial securities trading activity, user credit and other data from multiple channels. It processes this mass of data, builds a risk prediction data model and trains it with a variety of techniques to improve its accuracy, so that risks that may exist in financial securities business processes can be identified or predicted quickly and accurately and then controlled through intervention;

2. The system attaches importance to users' credit standing and dynamically monitors the financial securities business process. It focuses on ex-ante risk control, gaining an in-depth understanding of each user's situation to support personalized marketing and reducing ex-ante risk factors such as credit, risk pricing and fraud, thereby protecting the interests of both the securities company and the user;

3. The system is based on big data analysis and supported by the Internet finance environment. It improves data accuracy, enables real-time monitoring of the financial securities business, and can combine financial data from different fields to improve risk control. It can also comprehensively improve the business of financial securities companies, fundamentally reducing the degree of risk and strengthening the securities company's risk control system.

Description of the Drawings

FIG. 1 is an exemplary product architecture diagram of the present invention;

FIG. 2 is a structural diagram of the overall platform system device of the present invention;

FIG. 3 is the first partial structural diagram of the platform system device of the present invention;

FIG. 4 is the second partial structural diagram of the platform system device of the present invention;

FIG. 5 is the third partial structural diagram of the platform system device of the present invention;

FIG. 6 is the fourth partial structural diagram of the platform system device of the present invention;

FIG. 7 is the fifth partial structural diagram of the platform system device of the present invention;

FIG. 8 is the sixth partial structural diagram of the platform system device of the present invention;

FIG. 9 is the seventh partial structural diagram of the platform system device of the present invention;

FIG. 10 is a schematic structural diagram of an exemplary electronic computer platform device of the present invention.

The reference numerals in the figures denote:

1. computer processor; 2. display terminal; 3. risk control platform; 4. data storage server; 5. third-party platform server; 6. user;

100. platform architecture unit; 101. infrastructure equipment module; 102. software environment module; 103. technical support module; 104. third-party platform module;

200. data processing unit; 201. data collection module; 2011. public opinion information module; 2012. user credit module; 2013. company product module; 2014. trading activity module; 202. classification and sorting module; 203. data analysis module; 2031. neural network module; 2032. machine learning module; 2033. support vector machine module; 2034. collision analysis module; 204. data model module;

300. prediction and identification unit; 301. dynamic monitoring module; 302. risk prediction module; 303. type identification module; 304. degree determination module;

400. risk control management unit; 401. risk control module; 402. cooperative risk control module; 403. regulatory intervention module; 404. improvement measures module; 4041. user online module; 4042. trading account module; 4043. data fusion module; 4044. product innovation module.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

Embodiment 1

As shown in FIGS. 1 to 10, this embodiment provides a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, comprising a platform architecture unit 100, a data processing unit 200, a prediction and identification unit 300 and a risk control management unit 400, which are connected in sequence through network communication. The platform architecture unit 100 connects and manages the equipment, software and technical applications that make up the platform's operating environment. The data processing unit 200 collects large volumes of diverse data related to financial securities and their risks, organizes and analyzes the data, and builds a complete data analysis model. The prediction and identification unit 300 predicts risks that may exist in the financial securities business process through in-depth mining and analysis of large volumes of data, and identifies the type and analyzes the degree of each risk. The risk control management unit 400 manages and controls the risks of the financial securities business from multiple angles using a variety of risk control measures.

The platform architecture unit 100 comprises an infrastructure equipment module 101, a software environment module 102, a technical support module 103 and a third-party platform module 104;

The data processing unit 200 comprises a data collection module 201, a classification and sorting module 202, a data analysis module 203 and a data model module 204;

The prediction and identification unit 300 comprises a dynamic monitoring module 301, a risk prediction module 302, a type identification module 303 and a degree determination module 304;

The risk control management unit 400 comprises a risk control module 401, a cooperative risk control module 402, a regulatory intervention module 403 and an improvement measures module 404.
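As an illustration only, the unit and module hierarchy listed above can be held in a small lookup structure keyed by reference numeral. The class name `Unit`, the tuple `SYSTEM` and the helper `module_name` are hypothetical and do not come from the source; the numerals and names follow the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    numeral: int
    name: str
    modules: dict  # reference numeral -> module name


# Units in the order in which the text says they are connected.
SYSTEM = (
    Unit(100, "platform architecture unit", {
        101: "infrastructure equipment module",
        102: "software environment module",
        103: "technical support module",
        104: "third-party platform module",
    }),
    Unit(200, "data processing unit", {
        201: "data collection module",
        202: "classification and sorting module",
        203: "data analysis module",
        204: "data model module",
    }),
    Unit(300, "prediction and identification unit", {
        301: "dynamic monitoring module",
        302: "risk prediction module",
        303: "type identification module",
        304: "degree determination module",
    }),
    Unit(400, "risk control management unit", {
        401: "risk control module",
        402: "cooperative risk control module",
        403: "regulatory intervention module",
        404: "improvement measures module",
    }),
)


def module_name(numeral):
    """Look up a module name by its reference numeral, or return None if unknown."""
    for unit in SYSTEM:
        if numeral in unit.modules:
            return unit.modules[numeral]
    return None
```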

In this embodiment, the infrastructure equipment module 101, the software environment module 102, the technical support module 103, and the third-party platform module 104 are connected in sequence through network communication; the infrastructure equipment module 101 is used for connection management of the computing devices that join the risk control platform system; the software environment module 102 is used to develop, on top of the infrastructure equipment, software and application platforms for risk management of the securities and financial business, so as to build the operating environment that supports the system; the technical support module 103 is used to load intelligent technologies, chiefly artificial intelligence, and to introduce multiple intelligent algorithms that support the smooth operation of the platform system; the third-party platform module 104 is used to connect to multiple third-party service platforms, such as financial securities information management platforms and regulatory platforms, to obtain large amounts of supplementary data and supplementary services.

The infrastructure equipment includes, but is not limited to, computers, monitors, tablet PCs, mobile phones, smart sensors, and data acquisition devices (scanners, RFID readers, ID card OCR devices, face/fingerprint readers, etc.).

In this embodiment, the signal output of the data collection module 201 is connected to the signal input of the classification and sorting module 202, the signal output of the classification and sorting module 202 is connected to the signal input of the data analysis module 203, and the signal output of the data analysis module 203 is connected to the signal input of the data model module 204; the data collection module 201 is used to obtain a large amount of data related to financial securities from multiple sources through various means; the classification and sorting module 202 is used to classify, summarize, and organize the large amount of data according to defined category rules for later calculation and analysis; the data analysis module 203 is used to analyze the financial securities data using a variety of leading techniques; the data model module 204 is used to build, train, and validate a risk analysis data model on the basis of the large amount of data and the results of the data analysis.

Further, the data collection module 201 includes a public opinion information module 2011, a user credit module 2012, a company product module 2013, and a transaction activity module 2014; the public opinion information module 2011, the user credit module 2012, the company product module 2013, and the transaction activity module 2014 are connected in sequence through network communication and run in parallel; the public opinion information module 2011 is used to obtain publicly available historical or real-time public opinion information related to financial securities from the network; the user credit module 2012 is used to obtain information related to user credit from users, securities companies, partner banks, and similar sources, by legal means or with the user's authorization; the company product module 2013 is used to obtain information on each securities company, including its operating status, disclosed assets, business lines, and specific product details; the transaction activity module 2014 is used to obtain full-process information on the trading activity between users and securities companies.

Further, the data analysis module 203 includes a neural network module 2031, a machine learning module 2032, a support vector machine module 2033, and a collision analysis module 2034; the neural network module 2031, the machine learning module 2032, the support vector machine module 2033, and the collision analysis module 2034 are connected in sequence through network communication and operate independently; the neural network module 2031 is used to adjust the network weights to their best values through the training algorithm of the neural network so that the prediction performance of the whole network is as good as possible: a BP neural network or a support vector machine is trained on the samples in the training sample set and tested on the samples in the test sample set, so as to build a stock trend prediction model based on the BP neural network for predicting the trend of a target stock, and a parameter optimization unit either optimizes the initial weights and thresholds of the BP neural network with the artificial firefly swarm optimization algorithm or optimizes the penalty factor and kernel function parameters of the support vector machine with the cuckoo search algorithm; the machine learning module 2032 is used to train the neural network with machine learning techniques so that the parameters approximate the true model as closely as possible, giving model training advantages in both performance and data utilization; the support vector machine module 2033 is used to apply the support vector machine algorithm, which separates the data by constructing a separating surface, for relationship analysis; the collision analysis module 2034 is used to perform collision analysis on data and risk factors from different financial fields and different financial institutions to uncover potential risks. The data analysis module 203 also trains the support vector machine on the constructed sample set; during this training, the cuckoo search algorithm is used to optimize the penalty factor and kernel function parameters of the support vector machine, and the smaller the fitness function value of a cuckoo's nest position, the better the solution corresponding to that nest position.

Specifically, in the machine learning module 2032, the training algorithm for machine learning is:

First, assign random values to all parameters, and use these randomly generated parameter values to predict the samples in the training data;

Let the predicted target of a sample be $\hat{y}$ and the true target be $y$; then define a value $\text{loss}$, calculated as follows:

$$\text{loss} = (\hat{y} - y)^2;$$

Here, $\text{loss}$ is called the loss, and the goal of machine learning is to make the sum of the losses over all training data as small as possible;

Furthermore, if the matrix formula for the neural network prediction given earlier is substituted for $\hat{y}$, the loss can be written as a loss function of the parameters.
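The training idea described here (random initial parameters, a squared loss per sample, and a summed loss to be made small) can be sketched as follows. This is a minimal illustration under stated assumptions: the linear model, the toy data, the learning rate, and the use of plain gradient descent are all choices made for this sketch and are not specified in the description.

```python
import random


def predict(params, x):
    """Linear prediction: weighted sum of the inputs plus a bias (assumed model)."""
    *weights, bias = params
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def total_loss(params, samples):
    """Sum of the squared losses (prediction - target)^2 over all samples."""
    return sum((predict(params, x) - y) ** 2 for x, y in samples)


def gradient_step(params, samples, lr=0.01):
    """One gradient-descent step on the summed squared loss (assumed optimizer)."""
    grads = [0.0] * len(params)
    for x, y in samples:
        err = 2.0 * (predict(params, x) - y)
        for j, xi in enumerate(x):
            grads[j] += err * xi      # derivative with respect to weight j
        grads[-1] += err              # derivative with respect to the bias
    return [p - lr * g for p, g in zip(params, grads)]


random.seed(0)
samples = [((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]  # toy data: y = 2x + 1
params = [random.uniform(-1.0, 1.0) for _ in range(2)]   # random initial values
initial = total_loss(params, samples)
for _ in range(2000):
    params = gradient_step(params, samples)
final = total_loss(params, samples)
```

Because the loss is written as a function of the parameters, any optimizer that reduces `total_loss` could be substituted for the gradient step used here.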

Specifically, in the support vector machine module 2033, the support vectors are selected as follows:

Taking the linearly separable SVM as an example, regard $w$ as a linear combination of several samples, where the first sample is $x_1$ and the $i$-th is $x_i$; assign each $x_i$ a coefficient $\alpha_i$, so that $w = \sum_i \alpha_i x_i$; choose some of the $\alpha_i$ to be nonzero and set the rest to 0; the only vectors that then contribute to $w$ are the $x_i$ whose coefficients are nonzero; these vectors support the normal vector and are therefore the support vectors;

If a line $l$ has parameters $w$ and $b$, the most suitable separating line is found by computing the distance from each sample to $l$; the distance $d$ can be expressed as $d = \dfrac{w \cdot x + b}{\lVert w \rVert}$; each sample in the data set has the form $(x_i, y_i)$, where $y_i$ is the label of the sample (1 for a positive example, -1 for a negative example; the sign reflects which side of the separating line the sample lies on, with the positive direction of the normal counted as positive);

Multiplying the right-hand side of the equation by $y$ gives $y \, \dfrac{w \cdot x + b}{\lVert w \rVert}$, where $y$ is the actual label of the sample; if the estimated value has the same sign as the actual value, that is, the classification is correct, the result is positive; if the classification is wrong, the result is negative;

Among all samples, those closest to the line are selected as support vectors, and the distance between the support vectors and the line is the transition band (margin); because the SVM seeks the largest possible transition band, the final choice of the parameters $w$ and $b$ can be expressed as:

$$\arg\max_{w,b} \left\{ \frac{1}{\lVert w \rVert} \min_i \left[ y_i \, (w \cdot x_i + b) \right] \right\};$$

Therefore, given a linearly separable training data set, the separating hyperplane obtained by margin maximization is $w^* \cdot x + b^* = 0$, and the corresponding classification decision function is $f(x) = \operatorname{sign}(w^* \cdot x + b^*)$.
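The quantities used in this selection rule (signed distance to the separating line, label-weighted distance, smallest distance over all samples, and the sign-based decision) can be computed directly for a candidate pair of parameters. The sketch below only evaluates a given line; it does not solve the maximization problem. The function names and the toy samples are assumptions made for illustration.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the line w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, samples):
    """Smallest label-weighted distance over all samples.

    The value is positive only when every sample lies on the side of the
    line that matches its label (+1 or -1).
    """
    return min(y * signed_distance(w, b, x) for x, y in samples)


def decide(w, b, x):
    """Classification decision: the sign of w.x + b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable toy data: (point, label)
samples = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]
candidate_a = ((1.0, 1.0), -2.0)   # line x1 + x2 - 2 = 0
candidate_b = ((1.0, 0.0), -1.0)   # line x1 - 1 = 0
margin_a = margin(*candidate_a, samples)
margin_b = margin(*candidate_b, samples)
```

Of the two candidate lines, the one with the larger `margin` value is the better separating line in the sense described above; a full SVM solver searches over all $w$ and $b$ rather than a fixed list of candidates.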

In this embodiment, the signal output of the dynamic monitoring module 301 is connected to the signal input of the risk prediction module 302, the signal output of the risk prediction module 302 is connected to the signal input of the type identification module 303, and the signal output of the type identification module 303 is connected to the signal input of the degree determination module 304; the dynamic monitoring module 301 is used to apply digital technology in place of risk control approaches that take user data from a single past period or point in time as the basis for review and therefore cannot capture continuous data, to emphasize user information that has continuity and assign it a higher weight, and thereby to dynamically monitor the risks of the financial securities business; the risk prediction module 302 is used to automatically predict, through the constructed risk prediction data model, risk factors that may exist throughout the whole process of trading activity between users and securities companies; the type identification module 303 is used to identify the type of a predicted risk according to where it occurs in the trading activity; the degree determination module 304 is used to automatically evaluate the degree of each risk according to preset risk level classification rules.

The risk types include, but are not limited to, market risk, credit risk, liquidity risk, operational risk, industry risk, legal, regulatory, or policy risk, personnel risk, and natural disasters or other emergencies.

In this embodiment, the risk control module 401, the cooperative risk control module 402, the supervisory intervention module 403, and the improvement measure module 404 are connected in sequence through network communication; the risk control module 401 is used to control and manage the various risks that may arise in the financial securities business process before, during, and after the event; the cooperative risk control module 402 is used to improve the effect of risk control by sharing risk control data and risk control methods across different financial fields and different financial institutions; the supervisory intervention module 403 is used to start from data supervision: on the basis of further opening up brokers' permission to develop customer information and transaction data, it monitors in real time the sources and channels through which brokers or third-party companies obtain customer-related information, the process of further in-depth internal processing of the data, and the databases subsequently built from it, including those covering customer trading habits and credit information, so as to supervise the whole process of the securities business moving online, and it introduces intervention by third-party regulatory platforms to keep the risk of the financial securities business low; the improvement measure module 404 is used by securities companies to improve certain business functions with big data in order to strengthen their risk control systems.

Ex-ante risk control mainly covers credit investigation, risk pricing, and anti-fraud.

Further, the improvement measure module 404 includes a user online module 4041, a transaction account module 4042, a data fusion module 4043, and a product innovation module 4044; the user online module 4041, the transaction account module 4042, the data fusion module 4043, and the product innovation module 4044 are connected in sequence through network communication; the user online module 4041 is used to bring online the data, such as archived records and transaction behavior, that a securities company holds offline for its large base of existing offline customers, so as to accumulate online data for those users; the transaction account module 4042 is used to enable online payment for users' securities trading accounts, extend the accounts' online non-securities-trading functions, and expand the account system to other online platforms to accumulate more non-securities-trading data; the data fusion module 4043 is used to build a data fusion platform on top of multiple financial information management platforms to collect and integrate all data, so that individual behavior can be analyzed and predicted along every dimension and individual risk assessment across all dimensions is fast and accurate; the product innovation module 4044 is used to take big data analysis as the basis for all-round product innovation, so as to develop customized products and push and market them precisely, to design products for the internet, and to price risk using big data.

This embodiment also provides an operating method for the stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence, including:

First, on the product architecture of the prediction and risk control platform, develop the relevant software applications and connect the financial securities information management platforms, third-party service platforms, and regulatory platforms; obtain, by legal means and from multiple channels, large amounts of financial public opinion information, user credit information, information on trading activity between users and securities companies, and data on securities companies and their products; after collecting and integrating the data, model and train on it using algorithmic techniques such as neural networks, machine learning, and support vector machines to ensure the accuracy of the risk prediction data model; while the platform is running, use the model to dynamically monitor the financial securities business, predict in real time the risks that may arise throughout the business process, identify the type of each risk and assess its degree, and apply preset risk control measures to intervene in or eliminate it; and use big data to continuously improve the policies and processes of the financial securities trading business, with the aim of fundamentally reducing its risk.

In a preferred embodiment, the monitoring target determination unit is used to extract the stock subjects from each item of received news public opinion data and to count the news public opinion data containing each stock subject; when the proportion of news public opinion data containing a stock subject among the news public opinion data received this time exceeds a given threshold, that stock subject is determined to be a target stock requiring public opinion monitoring, specifically:

Let $N$ denote the total number of items of news public opinion data received by the monitoring target determination unit this time, and let $s_i$ denote the $i$-th stock subject extracted by the monitoring target determination unit from the news public opinion data received this time; when the news public opinion data containing the $i$-th stock subject satisfies $\dfrac{N(s_i)}{N} > T$ among the news public opinion data received this time, where $N(s_i)$ is the number of received items that contain $s_i$, the $i$-th stock subject is determined to be a target stock requiring public opinion monitoring; here $T$ is the given threshold, and the value of $T$ can be taken as [formula image 018].
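The proportion test for selecting target stocks can be sketched as follows. This is an illustrative sketch only: the function name, the input format (one set of stock subjects per news item), and the threshold value used in the example are assumptions, since the threshold value in the original is given only as a formula image.

```python
from collections import Counter


def select_target_stocks(news_subjects, threshold):
    """Return the stock subjects whose share of the received news exceeds the threshold.

    news_subjects: one collection of stock subjects per received news item.
    threshold: the given proportion threshold (value chosen by the caller).
    """
    total = len(news_subjects)
    if total == 0:
        return []
    # Count each subject at most once per news item.
    counts = Counter(s for subjects in news_subjects for s in set(subjects))
    return sorted(s for s, c in counts.items() if c / total > threshold)


# Hypothetical batch of four news items and a hypothetical threshold of 0.5
batch = [{"A"}, {"A", "B"}, {"A"}, {"C"}]
targets = select_target_stocks(batch, 0.5)
```

In this batch, subject "A" appears in three of four items, so it is the only subject whose proportion exceeds 0.5.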

The public opinion early-warning unit is used to count the news public opinion data containing the target stock, and to issue a warning when the proportion of news public opinion data with a negative sentiment label among the news public opinion data containing the target stock exceeds a given warning threshold, specifically:

Let $M_k$ denote the total number of items of news public opinion data containing the $k$-th target stock that the public opinion monitoring module has received, and let $M_k^{-}$ denote the number of those items whose sentiment label is negative; when the proportion of news public opinion data with a negative sentiment label among the news public opinion data containing the target stock satisfies $\dfrac{M_k^{-}}{M_k} > T_w$, the public opinion early-warning unit issues a warning, where $T_w$ is the given warning threshold, and the value of $T_w$ can be taken as [formula image 018].
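The early-warning rule reduces to a ratio check over the sentiment labels of the news items that mention a target stock. The sketch below is a minimal illustration; the function name, the label strings, and the threshold used in the example are assumptions.

```python
def should_warn(sentiment_labels, warning_threshold):
    """Return True when the share of negative labels exceeds the warning threshold.

    sentiment_labels: one label ("positive", "neutral" or "negative") per news
    item that contains the target stock.
    """
    total = len(sentiment_labels)
    if total == 0:
        return False  # no news for this stock, so nothing to warn about
    negative = sum(1 for label in sentiment_labels if label == "negative")
    return negative / total > warning_threshold


# Hypothetical labels for one target stock and a hypothetical threshold of 0.5
labels = ["negative", "negative", "positive"]
warn = should_warn(labels, 0.5)
```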

The public opinion data management unit is used to preprocess the received news public opinion data containing the target stock; the preprocessing includes word segmentation, stop-word filtering, and deletion of all text unrelated to sentiment information, such as link addresses and contact details.
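The preprocessing steps listed here (removing links and contact details, segmenting into words, filtering stop words) can be sketched with the standard library. This is an illustration under stated assumptions: the regular expressions, the tiny stop-word list, and the use of simple English tokenization are placeholders; Chinese text would need a dedicated word segmenter, which is not shown.

```python
import re

# Hypothetical patterns for link addresses and contact details.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
PHONE_RE = re.compile(r"\+?\d[\d\- ]{6,}\d")

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "at", "for", "and", "or"}


def preprocess(text):
    """Strip links and contact details, tokenize, and drop stop words."""
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = PHONE_RE.sub(" ", text)
    tokens = re.findall(r"[a-z]+", text.lower())  # stand-in for word segmentation
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess(
    "Shares of ACME fell sharply, see https://example.com/news or call 010-1234-5678"
)
```

The resulting token list is what a later feature-extraction step would consume.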

Specifically, by establishing a stock public opinion monitoring system, this embodiment analyzes stock public opinion data effectively, which helps to track public sentiment and the development of public opinion in the stock market in a timely manner and thereby to guide the healthy development of the securities market.

In a preferred embodiment, the sentiment classification unit builds a sentiment classifier based on a support vector machine using the following steps:

Step (1): Collect stock-related news public opinion data carrying sentiment labels, and clean the collected news public opinion data to remove the noise data in it;

Step (2): Preprocess the cleaned news public opinion data and extract features from it to construct feature vectors; build a sample set with the feature vectors of the news public opinion data as the input sample values and the sentiment labels carried by the news public opinion data as the output sample values;

Step (3): Train and test the support vector machine on the constructed sample set, thereby building the sentiment classifier based on the support vector machine.

The sentiment labels include positive, neutral, and negative.

When the support vector machine is trained on the constructed sample set, the cuckoo search algorithm is used to optimize the penalty factor and kernel function parameters of the support vector machine, and the smaller the fitness function value of a cuckoo's nest position, the better the solution corresponding to that nest position.

Specifically, this embodiment uses a support vector machine to classify the sentiment of the acquired news public opinion data. To address the difficulty of determining the best parameters of the support vector machine, as well as the weak local search ability and low optimization accuracy of the cuckoo search algorithm, the cuckoo search algorithm is used to optimize the parameters of the support vector machine, and the preference random update mode of the cuckoo search algorithm is improved within its iterative process; this improves the search accuracy of the cuckoo search algorithm, so that the optimal parameters found improve the classification accuracy of the support vector machine.

In a preferred embodiment, let $X_i(t)$ denote the final nest position of the $i$-th cuckoo in the cuckoo population after the $t$-th iterative update; the Lévy flight mode is used to perform the $(t+1)$-th iterative update on the nest position $X_i(t)$, specifically:

[formula image 029]

[formula image 031]

where $X_i'(t+1)$ denotes the nest position obtained by applying the $(t+1)$-th Lévy flight update to the nest position $X_i(t)$, $\alpha$ denotes the step size factor, $\oplus$ denotes point-to-point (entry-wise) multiplication, $\mathrm{Levy}(\lambda)$ denotes a random search vector drawn from a Lévy distribution with parameter $\lambda$, $f(X_i'(t+1))$ denotes the fitness function value of the nest position $X_i'(t+1)$, and $f(X_i(t))$ denotes the fitness function value of the nest position $X_i(t)$.
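For orientation, the textbook form of a Lévy flight update in cuckoo search can be sketched as below. This is not the embodiment's own formula, which is not recoverable from the text; it is the standard variant, assuming Mantegna's algorithm for sampling the Lévy step and a greedy rule that keeps the new position only when its fitness value is smaller. The sphere function stands in for the real fitness (for example, a classification error of the support vector machine), and all parameter values are hypothetical.

```python
import math
import random


def levy_step(dim, lam=1.5, rng=random):
    """Draw a Levy-distributed step vector with Mantegna's algorithm (assumed sampler)."""
    sigma_u = (
        math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
        / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))
    ) ** (1 / lam)
    return [
        rng.gauss(0, sigma_u) / abs(rng.gauss(0, 1)) ** (1 / lam)
        for _ in range(dim)
    ]


def levy_update(nest, fitness, alpha=0.01, lam=1.5, rng=random):
    """Propose nest + alpha * Levy(lam); keep whichever position has the smaller fitness."""
    step = levy_step(len(nest), lam, rng)
    candidate = [x + alpha * s for x, s in zip(nest, step)]
    return candidate if fitness(candidate) < fitness(nest) else nest


def sphere(position):
    """Stand-in fitness function: smaller is better."""
    return sum(x * x for x in position)


rng = random.Random(0)
start = [1.0, -2.0]          # hypothetical initial nest position
nest = start
for _ in range(200):
    nest = levy_update(nest, sphere, rng=rng)
```

Because of the greedy acceptance rule, the fitness value of `nest` never increases from one iteration to the next, which matches the convention that a smaller fitness value means a better solution.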

After the nest positions in the population have undergone the $(t+1)$-th iterative update in the Lévy flight mode, nest positions are selected from the population for the $(t+1)$-th preference random update using the following steps:

(1) divide the nest positions that have undergone the $(t+1)$-th iterative update in the Lévy flight mode into regions;

(2) select nest positions in each of the divided regions for the $(t+1)$-th preference random update.

Let $D(t+1)$ be the region segmentation threshold given for the population after the $(t+1)$-th iterative update in the Lévy flight mode; the value of $D(t+1)$ is set to [formula image 042], where $d_i(t+1)$ denotes the nearest-neighbor split value of the nest position $X_i'(t+1)$, with [formula image 045]; $X_{i,k}'(t+1)$ denotes the $k$-th nearest nest position to the nest position $X_i'(t+1)$ in the current population; $K$ is a given positive integer, with [formula image 048], and the value of $K$ can be taken as [formula image 049]; $n$ is the number of cuckoos in the population. The nest positions that have undergone the $(t+1)$-th iterative update in the Lévy flight mode are divided into regions according to the given region segmentation threshold $D(t+1)$ using the following steps:

Step 1: Randomly select a nest position from the population. Let [IMAGE051] be the nest position selected this time, and let [IMAGE052] denote the final nest position of the [IMAGE053]-th cuckoo in the population after the [IMAGE026]-th iteration update; then [IMAGE051] is the nest position obtained by applying the [IMAGE034]-th iteration update to nest position [IMAGE054] in Lévy flight mode. Mark the region containing nest position [IMAGE055] as [IMAGE056], assign nest position [IMAGE051] to region [IMAGE056], and screen the nest positions in the population that have not yet been assigned to a region one by one, as follows:

Let [IMAGE024] denote the final nest position of the [IMAGE025]-th cuckoo in the population after the [IMAGE026]-th iteration update, and let [IMAGE044] denote the nest position obtained by applying the [IMAGE027]-th iteration update to nest position [IMAGE024] in Lévy flight mode. When nest position [IMAGE044] satisfies [IMAGE057], assign nest position [IMAGE044] to region [IMAGE056].

Once all unassigned nest positions in the population have been screened, go to Step 2.

Step 2: Randomly select a nest position from the nest positions in the population that have not yet been assigned to a region. Let [IMAGE058] be the nest position selected this time, and let [IMAGE059] denote the final nest position of the [IMAGE060]-th cuckoo in the population after the [IMAGE026]-th iteration update; then [IMAGE058] is the nest position obtained by applying the [IMAGE027]-th iteration update to nest position [IMAGE059] in Lévy flight mode. Mark the region containing nest position [IMAGE058] as [IMAGE061], assign nest position [IMAGE058] to region [IMAGE061], and screen the remaining unassigned nest positions in the population one by one, as follows:

When nest position [IMAGE044] satisfies [IMAGE062], assign nest position [IMAGE044] to region [IMAGE061].

Once all unassigned nest positions in the population have been screened, go to Step 3.

Step 3: When the number of nest positions in the population not yet assigned to a region is not [IMAGE063], continue dividing the unassigned nest positions into regions as in Step 2; when that number is [IMAGE063], stop dividing the nest positions in the population into regions.
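The three partitioning steps can be sketched in Python as a greedy, seed-based grouping. This is only an illustration under stated assumptions: the membership criteria appear in the source as formula images, so the test used here (Euclidean distance to the seed no greater than the threshold) is an assumption, as are the function name `partition_into_regions` and its parameters.

```python
import math
import random


def partition_into_regions(nests, threshold, rng=None):
    """Group nest positions into regions around randomly chosen seeds.

    ASSUMPTION: a nest joins the seed's region when its Euclidean distance
    to the seed is at most `threshold`; the source gives the real criterion
    only as a formula image.
    """
    rng = rng or random.Random()
    unassigned = list(range(len(nests)))
    regions = []
    while unassigned:  # Step 3: repeat until no unassigned nest remains
        seed = rng.choice(unassigned)  # Steps 1 and 2: pick a random seed
        region = [i for i in unassigned
                  if math.dist(nests[i], nests[seed]) <= threshold]
        regions.append(region)
        members = set(region)
        unassigned = [i for i in unassigned if i not in members]
    return regions


nests = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
regions = partition_into_regions(nests, threshold=1.0, rng=random.Random(0))
```

Each region is returned as a list of indices into `nests`, so the original positions stay untouched and can be looked up later.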

Nest positions are then selected in each of the resulting regions for the [IMAGE027]-th preference random update, as follows:

Let [IMAGE064] denote the [IMAGE065]-th region obtained by dividing the nest positions after the [IMAGE034]-th iteration update in Lévy flight mode. Define [IMAGE066] as the region attribute coefficient of the nest positions in region [IMAGE064], with the value of [IMAGE066] given by: [IMAGE067], where [IMAGE068] denotes the mean of the proximity distance values of the nest positions in region [IMAGE064], with [IMAGE069]; [IMAGE070] denotes the dispersion coefficient of the proximity distance values of the nest positions in region [IMAGE064], with [IMAGE071]; here [IMAGE072] denotes the [IMAGE073]-th nest position in region [IMAGE064]; [IMAGE074] denotes the proximity distance value of nest position [IMAGE072], with [IMAGE075]; [IMAGE076] denotes the nest position in region [IMAGE064] nearest to nest position [IMAGE072]; and [IMAGE077] denotes the number of nest positions in region [IMAGE064].
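The two statistics named above, the mean and the dispersion coefficient of the proximity distance values, can be sketched as follows. The exact formulas are images in the source, so this sketch assumes Euclidean nearest-neighbor distances and the usual coefficient of variation (population standard deviation divided by the mean). It deliberately does not combine them into the region attribute coefficient, because that formula is not recoverable here. The function names are hypothetical.

```python
import math
import statistics


def proximity_distances(region):
    """Distance from each nest to its nearest other nest in the same region."""
    return [min(math.dist(p, q) for j, q in enumerate(region) if j != i)
            for i, p in enumerate(region)]


def proximity_stats(region):
    """Return (mean, dispersion coefficient) of the proximity distances.

    ASSUMPTION: dispersion coefficient = population std / mean.
    """
    if len(region) < 2:
        return 0.0, 0.0  # a single nest has no neighbor to measure against
    d = proximity_distances(region)
    mean = statistics.fmean(d)
    cv = statistics.pstdev(d) / mean if mean > 0 else 0.0
    return mean, cv


mean, cv = proximity_stats([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)])
```

In this small example the nearest-neighbor distances are 1, 1 and 2, so a tightly and evenly packed region would show a small mean and a dispersion coefficient close to zero.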

The nest positions in region [IMAGE064] are tested for substitutability by the following steps:

Step 1: Let [IMAGE078] denote the set of nest positions in region [IMAGE064] that have not yet been tested for substitutability, and let [IMAGE079] denote the [IMAGE080]-th nest position in set [IMAGE078]. Define [IMAGE081] as the substitutability coefficient of nest position [IMAGE079] in set [IMAGE078], with the value of [IMAGE081] given by:

[IMAGE083]

where [IMAGE084] denotes the [IMAGE085]-th nest position in set [IMAGE078], and [IMAGE086] is the substitutability judgment function between nest positions [IMAGE079] and [IMAGE084]: when [IMAGE087] holds, [IMAGE086] takes the value [IMAGE088]; when [IMAGE089] holds, [IMAGE086] takes the value [IMAGE063].

The substitutability coefficient of every nest position in set [IMAGE078] is computed in this way.

Step 2: In the current set [IMAGE078], select the nest position with the largest substitutability coefficient for substitutability testing. Let [IMAGE090] be that nest position, where [IMAGE090] is the [IMAGE091]-th nest position in set [IMAGE078], and let [IMAGE092] denote the set of nest positions in [IMAGE078] that can substitute for [IMAGE090], with [IMAGE093], where [IMAGE094] denotes the [IMAGE095]-th nest position in set [IMAGE078] and [IMAGE096] is the substitutability judgment function between nest positions [IMAGE090] and [IMAGE094]: when [IMAGE097] holds, [IMAGE096] takes the value [IMAGE088]; when [IMAGE098] holds, [IMAGE096] takes the value [IMAGE063]. Let [IMAGE099] denote the substitutability coefficient of nest position [IMAGE090] in set [IMAGE078]. When the value of [IMAGE099] satisfies [IMAGE100], randomly select [IMAGE101] nest positions from set [IMAGE092] for the preference random update, mark both the nest positions in set [IMAGE092] and nest position [IMAGE090] as tested for substitutability, remove these tested nest positions from set [IMAGE078], and go to Step 3. When the value of [IMAGE099] satisfies [IMAGE102], stop selecting nest positions in region [IMAGE064] for the [IMAGE027]-th preference random update. Here [IMAGE103] is a given substitutability detection threshold, a positive integer greater than [IMAGE088]; the value of [IMAGE103] may be taken as [IMAGE104].

Step 3: Continue as in Step 2, selecting the nest position with the largest substitutability coefficient in the current set [IMAGE078] for substitutability testing, until every nest position in set [IMAGE078] has been tested for substitutability.
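The substitutability detection loop can be sketched as a greedy selection over one region. Several details exist only as formula images in the source, so the following are assumptions rather than the patent's definitions: two nests count as substitutable when they lie within `radius` of each other, the loop continues while the largest coefficient is at least `min_count`, and `n_pick` nests are drawn each round. All names are hypothetical.

```python
import math
import random


def select_for_preference_update(region, radius, min_count, n_pick, rng=None):
    """Return indices into `region` chosen for the preference random update."""
    rng = rng or random.Random()
    unchecked = set(range(len(region)))  # nests not yet tested
    chosen = []

    def substitutes(i):
        # ASSUMPTION: nest j can substitute for nest i when it is within `radius`.
        return [j for j in unchecked
                if j != i and math.dist(region[i], region[j]) <= radius]

    while unchecked:
        # Step 2: take the nest with the largest substitutability coefficient
        # (ties broken by lowest index so the run is reproducible).
        best = max(sorted(unchecked), key=lambda i: len(substitutes(i)))
        subs = substitutes(best)
        if len(subs) < min_count:
            break  # largest coefficient is below the threshold: stop for this region
        chosen.extend(rng.sample(subs, min(n_pick, len(subs))))
        unchecked -= set(subs) | {best}  # mark as tested and remove (Step 3 loops)
    return chosen


region = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (10.0, 10.0)]
chosen = select_for_preference_update(region, radius=0.5, min_count=2,
                                      n_pick=1, rng=random.Random(0))
```

Under these assumptions an isolated nest, such as the last one in the example, is never selected, which matches the stated intent of updating only nests that have enough stand-ins nearby.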

Each selected nest position undergoes the [IMAGE027]-th preference random update as follows:

Let [IMAGE105] denote a nest position selected for the [IMAGE034]-th preference random update, and let [IMAGE106] denote the final nest position of the [IMAGE107]-th cuckoo in the population after the [IMAGE026]-th iteration update; then [IMAGE105] is the nest position obtained by applying the [IMAGE034]-th iteration update to nest position [IMAGE106] in Lévy flight mode. [IMAGE105] undergoes the [IMAGE034]-th preference random update as follows:

[IMAGE109]

[IMAGE111]

where [IMAGE112] denotes the nest position obtained by iteratively updating nest position [IMAGE105] in preference random update mode; [IMAGE113] denotes the fitness function value of nest position [IMAGE112]; [IMAGE114] denotes the fitness function value of nest position [IMAGE105]; [IMAGE115] denotes the final nest position of the [IMAGE107]-th cuckoo in the population after the [IMAGE034]-th iteration update; [IMAGE116] and [IMAGE117] are two nest positions randomly selected from the nest positions obtained after the current [IMAGE034]-th iteration update in Lévy flight mode, with [IMAGE118]; and [IMAGE119] denotes a random number between [IMAGE063] and [IMAGE088].
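The symbols defined above (a candidate position, two randomly chosen nests, a random number, and a comparison of fitness values) match the shape of the biased random walk used in standard cuckoo search. The sketch below shows that standard form; the patent's own two formulas are images, so treating them as equivalent is an assumption. It also assumes a random number in [0, 1) and that a lower fitness value is better.

```python
import random


def preference_random_update(nest, nest_a, nest_b, fitness, rng=None):
    """Biased random walk toward the difference of two other nests.

    ASSUMPTIONS: candidate = nest + r * (nest_a - nest_b) with r in [0, 1),
    and the candidate replaces the nest only if its fitness is strictly lower.
    """
    rng = rng or random.Random()
    r = rng.random()
    candidate = [x + r * (a - b) for x, a, b in zip(nest, nest_a, nest_b)]
    return candidate if fitness(candidate) < fitness(nest) else list(nest)


def sphere(x):
    """Toy fitness function: squared distance from the origin."""
    return sum(v * v for v in x)


start = [2.0, 2.0]
improved = preference_random_update(start, [0.0, 0.0], [1.0, 1.0], sphere,
                                    rng=random.Random(1))
rejected = preference_random_update(start, [1.0, 1.0], [0.0, 0.0], sphere,
                                    rng=random.Random(1))
```

The greedy acceptance step means a nest never gets worse from this update, so population diversity is added only where it does not cost fitness.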

Specifically, the present invention addresses two shortcomings: the optimal parameters of a support vector machine are difficult to determine, and the cuckoo search algorithm has weak local search ability and low optimization accuracy. The cuckoo search algorithm is used to optimize the parameters of the support vector machine, and the preference random update mode of the cuckoo search algorithm is improved within its iterative process. This improves the search accuracy of the cuckoo search algorithm, so that the optimal parameters obtained can improve the classification accuracy of the support vector machine. The defining feature of the standard cuckoo search algorithm is its Lévy flight mode, which combines frequent short-distance exploration with occasional longer-distance migration in the strategy for finding the optimal solution within the constrained space; that is, a search in Lévy flight mode is weighted toward local search. Therefore, when the nest positions of the population are distributed fairly comprehensively after each iteration update, the next iteration update in Lévy flight mode can achieve a more comprehensive local search, improving the local search accuracy of the population and increasing the probability of finding the optimal solution. The traditional preference random update mode compares a generated random number with the discovery probability to decide which nest positions undergo the preference random update. This selection is highly random and can easily destroy the comprehensiveness of the nest position distribution in the current population. To make better use of the preference random update mode to enhance population diversity, while ensuring that the nest positions obtained after the current Lévy flight iteration update can guide the next Lévy flight update fairly comprehensively, this embodiment divides the nest positions in the population into regions, grouping similar nest positions into one region, and uses substitutability detection to determine how many substitutable solutions each nest position has within its region. When a nest position has many substitutable solutions in its region, a certain number of nest positions are randomly selected from that nest position and its substitutable solutions for the preference random update. This keeps the nest positions in the region distributed fairly comprehensively, preserving the local search accuracy of the Lévy flight search over that region in the next iteration, while still applying the preference random update mode to increase population diversity. The method therefore achieves better search accuracy than the standard cuckoo search algorithm.
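The mix of frequent short moves and occasional long jumps that characterizes Lévy flight is commonly generated with Mantegna's algorithm. The sketch below shows that common approach; this section does not give the patent's own Lévy update formula, so the step scaling `alpha`, the exponent `beta = 1.5`, and the update toward the current best nest are conventional choices assumed for illustration.

```python
import math
import random


def levy_step(beta=1.5, rng=None):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    rng = rng or random.Random()
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
               ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_flight_update(nest, best, alpha=0.01, beta=1.5, rng=None):
    """Move each coordinate by a Lévy step scaled by its offset from `best`.

    ASSUMPTION: this is the conventional cuckoo search update, not a formula
    taken from the patent.
    """
    rng = rng or random.Random()
    return [x + alpha * levy_step(beta, rng) * (x - b)
            for x, b in zip(nest, best)]


moved = levy_flight_update([1.0, 2.0], [1.0, 0.0], rng=random.Random(1))
```

For support vector machine tuning, each nest would hold the candidate parameters (for example a penalty factor and a kernel parameter) and the fitness would be a validation error; that wiring is left out here because the source does not specify it.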

As shown in FIG. 1, this embodiment also provides an exemplary product architecture of the stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, comprising a computer processor 1 and its associated display terminal 2. A risk control platform 3 is loaded in the computer processor 1, and a data storage server 4 is communicatively connected to the computer processor 1 externally. The data storage server 4 collects massive data from public and third-party platform servers 5 to serve the risk control platform 3, and a user 6 can access the risk control platform 3 through the computer processor 1.

As shown in FIG. 10, this embodiment also provides an operating device for the stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence. The device comprises a processor, a memory, and a computer program stored in the memory and run on the processor.

The processor comprises one or more processing cores and is connected to the memory through a bus. The memory stores program instructions, and when the processor executes the program instructions in the memory, the above stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence is implemented.

Optionally, the memory may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.

In addition, the present invention also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the above stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence is implemented.

Optionally, the present invention also provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the above aspects of the stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence.

Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disk or the like.

The foregoing shows and describes the basic principles, main features and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited by the above embodiments. The above embodiments and the description set out only preferred examples of the present invention and are not intended to limit it. Various changes and improvements may be made without departing from the spirit and scope of the present invention, and all such changes and improvements fall within the scope of the claimed invention. The claimed scope of the present invention is defined by the appended claims and their equivalents.

Claims (10)

1. A stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, characterized by comprising a platform architecture unit (100), a data processing unit (200), a prediction and identification unit (300) and a risk control management unit (400); the platform architecture unit (100), the data processing unit (200), the prediction and identification unit (300) and the risk control management unit (400) are connected in sequence through network communication; the platform architecture unit (100) is used to connect and manage the equipment, software and technical applications that constitute the platform operating environment; the data processing unit (200) is used to collect a large amount of multivariate data related to financial securities and their risks, to organize and analyze the data, and to establish a complete data analysis model; the prediction and identification unit (300) is used to predict, through in-depth mining and analysis of a large amount of data, the risks that may exist in the financial securities business process, and to identify the type and analyze the degree of each risk; the risk control management unit (400) is used to manage and control the risks of the financial securities business from multiple aspects and by multiple risk control means;

the platform architecture unit (100) comprises an infrastructure equipment module (101), a software environment module (102), a technical support module (103) and a third-party platform module (104);

the data processing unit (200) comprises a data collection module (201), a classification and sorting module (202), a data analysis module (203) and a data model module (204);

the prediction and identification unit (300) comprises a dynamic monitoring module (301), a risk prediction module (302), a type identification module (303) and a degree determination module (304);

the risk control management unit (400) comprises a risk control module (401), a cooperative risk control module (402), a supervisory intervention module (403) and an improvement measure module (404);

when the stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence is running, relevant software applications are first developed on the product architecture of the prediction and risk control platform, and the financial securities information management platforms, third-party service platforms and regulatory platforms are connected; a large amount of financial public opinion information, user credit information, information on transaction activities between users and securities companies, and data on securities companies and their products is obtained from multiple channels by legal means; after the data is collected and integrated, neural network, machine learning and support vector machine algorithms are used to model and train on the data so as to ensure the accuracy of the risk prediction data model; during operation of the platform, the model dynamically monitors the financial securities business, predicts in real time the risks that may arise across the whole business process, identifies the type of each risk and assesses its degree, so that preset risk control means are used to intervene in or eliminate it; and big data is used to continuously improve the policies and procedures of the financial securities trading business, with a view to fundamentally reducing the risks of that business.

2. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence according to claim 1, characterized in that: the infrastructure equipment module (101), the software environment module (102), the technical support module (103) and the third-party platform module (104) are connected in sequence through network communication; the infrastructure equipment module (101) is used to connect and manage the electronic computer equipment joined to the risk control platform system; the software environment module (102) is used to develop, on the basis of the infrastructure equipment, software and application platforms for risk management of the securities finance business, so as to build the operating environment that supports the system; the technical support module (103) is used to load intelligent technologies, chiefly artificial intelligence, and to introduce a variety of intelligent algorithms to support the smooth operation of the platform system; the third-party platform module (104) is used to connect multiple third-party service platforms, such as financial securities information management platforms and regulatory platforms, to obtain a large amount of supplementary data and supplementary services; wherein the infrastructure equipment includes but is not limited to computers, monitors, tablet PCs, mobile phones, smart sensors and data acquisition devices (scanners, RFID, ID card OCR, face/fingerprint recognizers, etc.).

3. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence according to claim 1, characterized in that: the signal output end of the data collection module (201) is connected to the signal input end of the classification and sorting module (202), the signal output end of the classification and sorting module (202) is connected to the signal input end of the data analysis module (203), and the signal output end of the data analysis module (203) is connected to the signal input end of the data model module (204); the data collection module (201) is used to obtain a large amount of data related to financial securities from multiple sources by multiple means; the classification and sorting module (202) is used to classify, summarize and organize the large amount of data according to certain category rules for later calculation and analysis; the data analysis module (203) is used to analyze the financial securities data using a variety of world-leading technologies; the data model module (204) is used to build, train and verify a data model for risk analysis on the basis of a large amount of data and according to the results of the data analysis.

4. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence according to claim 3, characterized in that: the data collection module (201) comprises a public opinion information module (2011), a user credit reporting module (2012), a company product module (2013) and a transaction activity module (2014); the public opinion information module (2011), the user credit reporting module (2012), the company product module (2013) and the transaction activity module (2014) are connected in sequence through network communication and run in parallel; the public opinion information module (2011) is used to obtain public historical or real-time public opinion information related to financial securities from the network; the user credit reporting module (2012) is used to obtain information related to user credit from users, securities companies, cooperating banks and other parties, by legal means or with the user's authorization; the company product module (2013) is used to obtain information on each securities company, including its operating conditions, public assets, business and specific product details; the transaction activity module (2014) is used to obtain information on the whole process of transaction activities between users and securities companies.

5.
the stock public opinion monitoring and risk control system based on algorithm, big data, artificial intelligence according to claim 3, is characterized in that: described data analysis module (203) comprises neural network module (2031), machine learning module (2032 ), support vector machine module (2033) and collision analysis module (2034); the neural network module (2031), the machine learning module (2032), the support vector machine module (2033) and the collision analysis module (2034) are connected and run independently through network communication in turn; the neural network module (2031) is used to adjust the value of the algorithm weight to the best through the training algorithm of the neural network, so that the prediction effect of the entire network is the best, using The samples in the training sample set are used to train the BP neural network or the support vector machine, and the samples in the test sample set are used to test the BP neural network or the support vector machine, so as to construct a stock trend prediction model for predicting the trend of the target stock. 
The optimization unit uses the artificial firefly swarm optimization algorithm to optimize the initial weights and thresholds of the BP neural network, or uses the cuckoo search algorithm to optimize the penalty factor and kernel function parameters of the support vector machine; the machine learning module (2032) uses Use machine learning-related technologies to train neural networks, so that the parameters are as close to the real model as possible, so that model training can obtain dual advantages in performance and data utilization; the support vector machine module (2033) is used for Using the algorithm of support vector machine, the data is separated by constructing segmentation planes for relationship analysis; the collision analysis module (2034) is used for collision analysis of data and risk factors from different financial fields and different financial institutions to mine Potential Risk Situation.6.根据权利要求5所述的基于算法、大数据、人工智能的股票舆情监测和风控系统,其特征在于:所述机器学习模块(2032)中,机器学习的训练算法为:6. the stock public opinion monitoring and risk control system based on algorithm, big data, artificial intelligence according to claim 5, is characterized in that: in described machine learning module (2032), the training algorithm of machine learning is:首先给所有参数赋上随机值,使用这些随机生成的参数值,来预测训练数据中的样本;First assign random values to all parameters, and use these randomly generated parameter values to predict the samples in the training data;设样本的预测目标为yp,真实目标为y,那么定义一个值loss,计算公式如下:Assuming that the predicted target of the sample is yp and the real target is y, then define a value loss, and the calculation formula is as follows:
Figure FDA0003543567060000041
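The formula image is not reproduced in this text. As a minimal sketch of the training procedure in claim 6, the example below assumes a squared-error loss, (y_p - y)^2, and a one-variable linear predictor trained by gradient descent. The loss form, the model, and the names `predict`, `total_loss` and `train` are illustrative assumptions, not the patent's definition.

```python
import random


def predict(params, x):
    """Hypothetical one-variable linear predictor: y_p = w * x + b."""
    w, b = params
    return w * x + b


def squared_loss(y_p, y):
    """Assumed loss form; the patent's own formula is only available as an image."""
    return (y_p - y) ** 2


def total_loss(params, data):
    """Sum of the per-sample losses over all training data."""
    return sum(squared_loss(predict(params, x), y) for x, y in data)


def train(data, steps=2000, lr=0.01, seed=0):
    """Start from random parameters, then reduce the summed loss by gradient descent."""
    rng = random.Random(seed)
    w, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
data = [(float(x), 2.0 * x + 1.0) for x in range(5)]
params = train(data)
```

Writing the summed loss as a function of `params` mirrors the claim's remark that the loss can be expressed as a loss function of the parameters.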
Here, loss is the loss value, and the goal of machine learning is to make the sum of the losses over all training data as small as possible; furthermore, if the matrix formula for the neural network's prediction is substituted for y_p, the loss can be written as a loss function of the parameters.
7. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence according to claim 5, characterized in that: in the support vector machine module (2033), the support vectors are selected as follows:
Taking a linearly separable SVM as an example, w is regarded as a linear combination of several samples. The first sample is x_1 and the i-th sample is x_i; each x is given a coefficient α, so that:
Figure FDA0003543567060000042
Some of the α are chosen to be nonzero and the rest are set to 0. The x vectors that actually contribute to w are those with nonzero coefficients; these vectors support the normal vector and are therefore the support vectors;
If the straight line l has parameters w and b, the distance from each sample to l is computed to judge which line is the most suitable dividing line; the distance d can be expressed as:
Figure FDA0003543567060000043
Let the samples in each data set have the form T = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}, where the y value of each sample is its label (1 for a positive example, -1 for a negative example; the sign reflects which side of the dividing line the sample lies on, with the positive direction of the normal counted as positive);
Multiplying the right-hand side of the equation by the y value gives:
Figure FDA0003543567060000044
Here y is the actual sign (label) of the sample. If the estimated value has the same sign as the actual value, the classification is correct and the result is positive; if the classification is wrong, the result is negative;
Among all samples, those closest to the line are selected as the support vectors, and the distance between the support vectors and the line is the transition band (the margin). Because the SVM seeks the largest possible transition band, the final choice of the parameters w and b can be expressed as:
Figure FDA0003543567060000045
Therefore, given a linearly separable training data set, the separating hyperplane obtained by maximizing the margin is y(x) = w^T Φ(x) + b, and the corresponding classification decision function is f(x) = sign(w^T Φ(x) + b).
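The decision function and margin described in claim 7 can be illustrated with a small sketch. It assumes Φ is the identity map, uses the standard signed distance y(w·x + b)/||w|| (the patent's own distance formula is only available as an image), and picks w and b by hand for the toy data; all function names are hypothetical.

```python
import math


def decision_value(w, b, x):
    """y(x) = w^T x + b, taking the feature map Phi to be the identity (an assumption)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """f(x) = sign(w^T x + b), returning +1 or -1."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def signed_distance(w, b, x, y):
    """Label-weighted distance: positive when the sample is classified correctly."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * decision_value(w, b, x) / norm


def margin(w, b, samples):
    """Smallest signed distance over the data; the samples attaining it are the support vectors."""
    return min(signed_distance(w, b, x, y) for x, y in samples)


# Toy data and a hand-picked separating line x1 + x2 - 3 = 0 (illustrative values only).
samples = [((1.0, 1.0), -1), ((0.0, 0.0), -1), ((3.0, 2.0), 1)]
w, b = (1.0, 1.0), -3.0
```

An SVM trainer would search for the w and b that maximize `margin`; the sketch only evaluates it for one candidate line.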
8. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence according to claim 1, characterized in that: the signal output of the dynamic monitoring module (301) is connected to the signal input of the risk prediction module (302), the signal output of the risk prediction module (302) is connected to the signal input of the type identification module (303), and the signal output of the type identification module (303) is connected to the signal input of the degree determination module (304); the dynamic monitoring module (301) is used, by means of digital technology, to replace risk control approaches that cannot capture continuous data with an approach that takes user data from past periods or points in time as the audit basis, attaching importance to user information that has continuity and giving it a higher weight, thereby achieving dynamic monitoring of financial securities business risks; the risk prediction module (302) is used to automatically predict, through the constructed risk prediction data model, the risk factors that may exist in the whole process of transaction activities between users and securities companies; the type identification module (303) is used to identify the type of a predicted risk according to where it occurs in the transaction activity; the degree determination module (304) is used to automatically evaluate the degree of each risk according to preset risk level classification rules. The risk types include but are not limited to market risk, credit risk, liquidity risk, operational risk, industry risk, legal, regulatory or policy risk, personnel risk, and natural disasters or other emergencies.
9. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence according to claim 1, characterized in that: the risk control module (401), the cooperative risk control module (402), the supervisory intervention module (403) and the improvement measures module (404) are connected in sequence through network communication; the risk control module (401) is used to control and manage the various risks that may arise in the financial securities business process from three aspects: before, during and after the event; the cooperative risk control module (402) is used to improve the effect of risk control by sharing risk control data and risk control methods across different financial fields and different financial institutions; the supervisory intervention module (403) is used to start from data supervision and, on the basis of allowing securities firms' permissions to develop customer information and transaction data to be further opened up, to monitor in real time the sources and channels through which securities firms themselves or third-party enterprises obtain customer-related information, the subsequent internal deep processing of that data, and the databases built afterwards, including customer trading habits and credit information, thereby achieving full-process supervision of Internet-based securities business, and to introduce the intervention means of third-party supervision platforms to keep the risk of the financial securities business low; the improvement measures module (404) is used by securities companies to improve certain business functions with big data so as to strengthen their risk control systems. Ex-ante risk control mainly covers credit reporting, risk pricing, anti-fraud and the like; the improvement measures module (404) comprises a user online module (4041), a transaction account module (4042), a data fusion module (4043) and a product innovation module (4044); the user online module (4041), the transaction account module (4042), the data fusion module (4043) and the product innovation module (4044) are connected in sequence through network communication; the user online module (4041) is used to bring online the data that a securities company holds offline for its large base of existing offline customers, such as archived records and transaction behavior, so as to supplement users' online data; the transaction account module (4042) is used to enable online payment from users' securities trading accounts, extend their online non-securities-trading functions, and extend the account system to other online platforms so as to accumulate more non-securities-trading data; the data fusion module (4043) is used to build a data fusion platform on top of multiple financial information management platforms to collect and integrate all data, so that individual behavior can be analyzed and predicted along every dimension and individual risk can be assessed across all dimensions quickly and accurately; the product innovation module (4044) is used to take big data analysis as the basis for all-round product innovation, in order to develop customized products and push and market them precisely, and to carry out Internet-based product design and risk pricing using big data.
10. The stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence according to claim 5, characterized in that: when the support vector machine is trained with the constructed sample set, optimizing the penalty factor and kernel function parameters of the support vector machine with the cuckoo search algorithm includes setting that the smaller the fitness function value corresponding to a cuckoo's nest position, the better the solution corresponding to that nest position; a monitoring target determination unit is used to extract the stock subjects from each piece of received news public opinion data and to count the news public opinion data containing each stock subject; when the proportion of news public opinion data containing a stock subject in the news public opinion data received this time exceeds a given threshold, that stock subject is determined to be a target stock requiring public opinion monitoring, specifically:
Let n denote the total number of news public opinion data items received this time by the monitoring target determination unit, and let n_r denote the r-th stock subject extracted by the monitoring target determination unit from the news public opinion data received this time. When the news public opinion data containing the r-th stock subject satisfy, within the news public opinion data received this time:
Figure FDA0003543567060000061
the r-th stock subject is determined to be a target stock requiring public opinion monitoring, where U is a given threshold whose value may be taken as
Figure FDA0003543567060000062
A public opinion early-warning unit is used to count the news public opinion data containing the target stock and to issue an early warning when the proportion of news public opinion data whose sentiment label is negative, among the news public opinion data containing the target stock, exceeds a given early-warning threshold, specifically:
Let N_e denote the total number of news public opinion data items containing the e-th target stock received by the public opinion monitoring module, and let N'_e denote the number of those items whose sentiment label is negative. When the proportion of news public opinion data with a negative sentiment label among the news public opinion data containing the target stock satisfies:
Figure FDA0003543567060000071
the public opinion early-warning unit issues an early warning, where E is the given early-warning threshold whose value may be taken as
Figure FDA0003543567060000072
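The two proportion tests above can be sketched as follows. The exact inequalities and the suggested values of U and E are only available as images, so the strict "greater than" comparisons, the threshold values passed in, and the names `target_stocks` and `should_warn` are assumptions made for illustration.

```python
def target_stocks(news_subjects, U):
    """Return the stock subjects whose share of the received news items exceeds U.

    news_subjects holds one set of stock subjects per news item, so each item
    is counted at most once per subject.
    """
    n = len(news_subjects)
    counts = {}
    for subjects in news_subjects:
        for subject in subjects:
            counts[subject] = counts.get(subject, 0) + 1
    return {s for s, c in counts.items() if n > 0 and c / n > U}


def should_warn(sentiment_labels, E):
    """Warn when the share of negative labels among a target stock's news exceeds E."""
    total = len(sentiment_labels)
    if total == 0:
        return False
    negative = sum(1 for label in sentiment_labels if label == "negative")
    return negative / total > E


# Illustrative inputs only; 0.5 is a placeholder threshold, not the patent's value.
news = [{"A"}, {"A", "B"}, {"C"}, {"A"}]
targets = target_stocks(news, U=0.5)
alarm = should_warn(["negative", "negative", "positive"], E=0.5)
```

Representing each news item as a set keeps the count equal to "number of items mentioning the subject", which is what the proportion in the claim describes.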
A public opinion data management unit preprocesses the received news public opinion data containing the target stock; the preprocessing includes word segmentation, stop-word filtering, and deletion of all text unrelated to sentiment information, such as link addresses and contact details;
Let x_i(t) denote the final nest position of the i-th cuckoo in the cuckoo population after the t-th iterative update. The (t+1)-th iterative update of the nest position x_i(t) is performed using the Lévy flight mode, specifically:
Figure FDA0003543567060000073
Figure FDA0003543567060000074
where X_i(t+1) denotes the nest position after the (t+1)-th iterative update of x_i(t) using the Lévy flight mode, α denotes the step-size factor,
Figure FDA0003543567060000075
denotes point-to-point (element-wise) multiplication, L(λ) denotes a random search vector drawn from the Lévy distribution with parameter λ, f(X_i(t+1)) denotes the fitness function value of the nest position X_i(t+1), and f(x_i(t)) denotes the fitness function value of the nest position x_i(t);
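A minimal sketch of one Lévy flight update follows. The update formulas themselves are only available as images, so this uses the form common in the cuckoo search literature: a candidate x + α·L(λ), with Lévy steps generated by Mantegna's algorithm, kept only if its fitness is smaller. The greedy comparison is inferred from the mention of f(X_i(t+1)) and f(x_i(t)), and the default values of `alpha` and `lam` are placeholders.

```python
import math
import random


def levy_step(dim, lam=1.5, rng=random):
    """Draw a dim-dimensional Levy-distributed step using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
        / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))
    ) ** (1 / lam)
    step = []
    for _ in range(dim):
        u = rng.gauss(0.0, sigma)
        v = max(abs(rng.gauss(0.0, 1.0)), 1e-12)  # guard against division by zero
        step.append(u / v ** (1 / lam))
    return step


def levy_update(x, fitness, alpha=0.01, lam=1.5, rng=random):
    """Propose x + alpha * L(lam) and keep it only if the fitness decreases.

    A smaller fitness value is treated as better, as stated in the claim.
    """
    step = levy_step(len(x), lam, rng)
    candidate = [xi + alpha * si for xi, si in zip(x, step)]
    return candidate if fitness(candidate) < fitness(x) else list(x)


def sphere(v):
    """Toy fitness function used only for illustration."""
    return sum(c * c for c in v)


nest = [1.0, -2.0, 0.5]
updated = levy_update(nest, sphere, rng=random.Random(0))
```

In the patent's setting the position vector would hold the SVM penalty factor and kernel parameter, and `fitness` would be a validation error rather than the toy `sphere` function.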
After the (t+1)-th iterative update of the nest positions in the population using the Lévy flight mode, nest positions are selected from the population for the (t+1)-th preference random update by the following steps:
(1) divide into regions the nest positions obtained from the (t+1)-th Lévy flight update;
(2) select nest positions in each divided region for the (t+1)-th preference random update;
A region segmentation threshold D(t+1) is given for the population after the (t+1)-th Lévy flight update, and its value is set as:
Figure FDA0003543567060000076
where D_i(t+1) denotes the neighbor segmentation value of the nest position X_i(t+1), and
Figure FDA0003543567060000077
Figure FDA0003543567060000078
denotes the nest position in the current population that is e-th closest to X_i(t+1); K is a given positive integer with K < N, the value of K may be taken as 5, and N is the number of cuckoos in the population. The nest positions obtained from the (t+1)-th Lévy flight update are divided into regions according to the given threshold D(t+1) by the following steps:
Step 1: Randomly select a nest position from the population and let X_a(t+1) be the selected position, where x_a(t) denotes the final nest position of the a-th cuckoo in the cuckoo population after the t-th iterative update and X_a(t+1) denotes the nest position after the (t+1)-th Lévy flight update of x_a(t). Mark the region containing X_a(t+1) as Ω_1(t+1) and assign X_a(t+1) to Ω_1(t+1). Then screen, in turn, the nest positions in the population that have not yet been assigned to a region, specifically:
Let x_i(t) denote the final nest position of the i-th cuckoo in the cuckoo population after the t-th iterative update, and let X_i(t+1) denote the nest position after the (t+1)-th Lévy flight update of x_i(t). When the nest position X_i(t+1) satisfies:
Figure FDA0003543567060000081
X_i(t+1) is assigned to the region Ω_1(t+1);
After all unassigned nest positions in the population have been screened, go to Step 2;
Step 2: Randomly select a nest position from those in the population not yet assigned to a region and let X_b(t+1) be the selected position, where x_b(t) denotes the final nest position of the b-th cuckoo in the cuckoo population after the t-th iterative update and X_b(t+1) denotes the nest position after the (t+1)-th Lévy flight update of x_b(t). Mark the region containing X_b(t+1) as Ω_2(t+1) and assign X_b(t+1) to Ω_2(t+1). Then screen, in turn, the nest positions in the population that have not yet been assigned to a region, specifically:
When the nest position X_i(t+1) satisfies:
Figure FDA0003543567060000082
X_i(t+1) is assigned to the region Ω_2(t+1);
After all unassigned nest positions in the population have been screened, go to Step 3;
Step 3: While the number of nest positions in the population not yet assigned to a region is not 0, continue dividing them into regions in the manner of Step 2; when that number is 0, stop dividing the nest positions in the population into regions;
Nest positions are then selected within each divided region for the (t+1)-th preference random update, as follows.
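Steps 1 to 3 of the region division can be sketched as below. The formulas for D_i(t+1), D(t+1) and the membership condition are only available as images, so the sketch assumes Euclidean distance, takes D_i as the mean distance to the K nearest nests, takes D as the mean of all D_i, and assigns a nest to a region when its distance to that region's randomly chosen seed is at most D. Each of these choices is an assumption.

```python
import math
import random


def neighbor_value(i, nests, K):
    """Assumed D_i: mean Euclidean distance from nest i to its K nearest nests."""
    dists = sorted(math.dist(nests[i], nests[j]) for j in range(len(nests)) if j != i)
    return sum(dists[:K]) / K


def segmentation_threshold(nests, K):
    """Assumed D: mean of the neighbor values over the whole population."""
    return sum(neighbor_value(i, nests, K) for i in range(len(nests))) / len(nests)


def divide_regions(nests, K, rng=random):
    """Repeatedly pick a random unassigned seed and gather the nests within D of it."""
    D = segmentation_threshold(nests, K)
    unassigned = list(range(len(nests)))
    regions = []
    while unassigned:  # Step 3: repeat until every nest belongs to a region
        seed = rng.choice(unassigned)  # Steps 1 and 2: random seed for a new region
        region = [i for i in unassigned if math.dist(nests[i], nests[seed]) <= D]
        regions.append(region)
        unassigned = [i for i in unassigned if i not in region]
    return regions


# Two well-separated toy clusters (illustrative positions only).
nests = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
regions = divide_regions(nests, K=2, rng=random.Random(1))
```

Because the seed is random, the number of regions can differ between runs; the loop always terminates because each seed is at distance 0 from itself and so joins its own region.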
Let Ω_l(t+1) denote the l-th region obtained by dividing the nest positions after the (t+1)-th Lévy flight update, and define F_l(t+1) as the regional attribute coefficient of the nest positions in the region Ω_l(t+1), whose value is:
Figure FDA0003543567060000091
where
Figure FDA0003543567060000092
denotes the mean of the proximity distance values of the nest positions in the region Ω_l(t+1), and
Figure FDA0003543567060000093
Figure FDA0003543567060000094
denotes the dispersion coefficient of the proximity distance values of the nest positions in the region Ω_l(t+1), and
Figure FDA0003543567060000095
where X_{l,o}(t+1) denotes the o-th nest position in the region Ω_l(t+1), d_{l,o}(t+1) denotes the proximity distance value of the nest position X_{l,o}(t+1), and
Figure FDA0003543567060000096
Figure FDA0003543567060000097
Figure FDA0003543567060000098
denotes the nest position in the region Ω_l(t+1) closest to X_{l,o}(t+1), and M_l(t+1) denotes the number of nest positions in the region Ω_l(t+1);
The substitutability of the nest positions in the region Ω_l(t+1) is tested by the following steps:
Step 1: Let N_l(t+1) denote the set of nest positions in the region Ω_l(t+1) that have not yet undergone substitutability testing, let X'_{l,j}(t+1) denote the j-th nest position in N_l(t+1), and define Y'_{l,j}(t+1) as the substitutability coefficient of X'_{l,j}(t+1) in N_l(t+1), whose value is:
Figure FDA0003543567060000099
where X'_{l,k}(t+1) denotes the k-th nest position in N_l(t+1), and ρ(X'_{l,j}(t+1), X'_{l,k}(t+1)) is the substitutability judgment function between the nest positions X'_{l,j}(t+1) and X'_{l,k}(t+1); when
Figure FDA00035435670600000910
Figure FDA00035435670600000911
ρ(X'_{l,j}(t+1), X'_{l,k}(t+1)) takes the value 1, and when
Figure FDA00035435670600000912
Figure FDA00035435670600000913
ρ(X'_{l,j}(t+1), X'_{l,k}(t+1)) takes the value 0;
The substitutability coefficient of each nest position in the set N_l(t+1) is computed by the above method;
Step 2: In the current set N_l(t+1), select the nest position with the largest substitutability coefficient for substitutability testing. Let X'_{l,c}(t+1) be that position, the c-th nest position in N_l(t+1), and let N'_{l,c}(t+1) denote the set of nest positions in N_l(t+1) that can substitute for X'_{l,c}(t+1), with N'_{l,c}(t+1) = {X'_{l,s}(t+1) : ρ(X'_{l,c}(t+1), X'_{l,s}(t+1)) = 1 and X'_{l,s}(t+1) ∈ N_l(t+1)}, where X'_{l,s}(t+1) denotes the s-th nest position in N_l(t+1) and ρ(X'_{l,c}(t+1), X'_{l,s}(t+1)) is the substitutability judgment function between the nest positions X'_{l,c}(t+1) and X'_{l,s}(t+1); when
Figure FDA0003543567060000104
ρ(X'_{l,c}(t+1), X'_{l,s}(t+1)) takes the value 1, and when
Figure FDA0003543567060000105
ρ(X'_{l,c}(t+1), X'_{l,s}(t+1)) takes the value 0. Let Y'_{l,c}(t+1) denote the substitutability coefficient of X'_{l,c}(t+1) in N_l(t+1). When Y'_{l,c}(t+1) > Y, randomly select from the set N'_{l,c}(t+1)
Figure FDA0003543567060000101
nest positions for the preference random update, mark the nest positions in N'_{l,c}(t+1) and the nest position X'_{l,c}(t+1) as having undergone substitutability testing, remove them from the set N_l(t+1), and go to Step 3; when Y'_{l,c}(t+1) ≤ Y, stop selecting nest positions in the region Ω_l(t+1) for the (t+1)-th preference random update, where Y is the given substitutability testing threshold, a positive integer greater than 1, whose value may be taken as 4;
步骤3:继续按照步骤2中的方式在当前集合Nl(t+1)中选取具有最大可替代性系数的鸟巢位置进行可替代性检测,直到集喝Nl(t+1)中的鸟巢位置都进行了可替代性检测;Step 3: Continue to select the bird's nest position with the largest substitutability coefficient in the current set Nl (t+1) according to the method in step 2 for substitutability detection, until the bird's nest in the set Nl (t+1) Substitutable testing has been carried out on the location;设置选取的鸟巢位置采用下列方式进行第(t+1)次的偏好随机更新:Set the selected bird's nest location to perform the (t+1)th random update of the preference in the following way:设Xp(t+1)表示选取的进行第(t+1)次的偏好随机更新的鸟巢位置,xp(t)表示种群中的第p只布谷鸟在第t次迭代更新后的最终鸟巢位置,则Xp(t+1)表示采用莱维飞行模式对鸟巢位置xp(t)进行第(t+1)次迭代更新后的鸟巢位置,则Xp(t+1)采用下列方式进行第(t+1)次的偏好随机更新:Let Xp (t+1) denote the selected bird’s nest position for the (t+1)-th preference random update, and xp (t) denote the final position of the p-th cuckoo in the population after the t-th iteration update. The bird's nest position, then Xp (t+1) represents the bird's nest position after the (t+1)th iteration update of the bird's nest position xp (t) using the Levy flight mode, then Xp (t+1) adopts the following The (t+1)th preference random update is performed in the following way:
Figure FDA0003543567060000102
Figure FDA0003543567060000103
In the formulas, χp(t+1) denotes the bird's nest position obtained by iteratively updating the bird's nest position Xp(t+1) in the preference random update mode; f(χp(t+1)) denotes the fitness function value of the bird's nest position χp(t+1); f(Xp(t+1)) denotes the fitness function value of the bird's nest position Xp(t+1); xp(t+1) denotes the final bird's nest position of the p-th cuckoo in the population after the (t+1)-th iteration update; Xp1(t+1) and Xp2(t+1) are two bird's nest positions selected at random from the bird's nest positions obtained by the current (t+1)-th iteration update in Lévy flight mode, with Xp1(t+1) ≠ Xp2(t+1); and rand denotes a random number between 0 and 1.
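The two formulas referenced above survive only as figure placeholders in this text, so their exact form cannot be confirmed here. As an illustration only, the following sketch assumes the form commonly used for the biased random walk in cuckoo search: a candidate χp = Xp + rand · (Xp1 − Xp2), followed by a greedy choice between χp and Xp based on fitness. The function name `preference_random_update`, the `minimize` flag, and the `sphere` test function are hypothetical and do not come from the patent; whether a lower or higher fitness value counts as better is likewise an assumption.

```python
import random
from typing import Callable, List, Sequence


def preference_random_update(
    X_p: Sequence[float],
    levy_positions: Sequence[Sequence[float]],
    fitness: Callable[[Sequence[float]], float],
    rng: random.Random,
    minimize: bool = True,
) -> List[float]:
    """Return the final nest position x_p(t+1) for one selected nest.

    Assumed update (not confirmed by the source text):
        chi_p = X_p + rand * (X_p1 - X_p2)
        x_p   = chi_p if chi_p has better fitness than X_p, else X_p
    """
    # Pick two nests at distinct indices among the Levy-updated positions.
    # The text requires X_p1 != X_p2; distinct indices are used here, which
    # assumes the stored positions themselves are not identical.
    i, j = rng.sample(range(len(levy_positions)), 2)
    X_p1, X_p2 = levy_positions[i], levy_positions[j]

    r = rng.random()  # "rand": a random number between 0 and 1
    chi_p = [x + r * (a - b) for x, a, b in zip(X_p, X_p1, X_p2)]

    f_chi, f_X = fitness(chi_p), fitness(X_p)
    # Assumption: with minimize=True a smaller fitness value is better.
    improved = f_chi < f_X if minimize else f_chi > f_X
    return chi_p if improved else list(X_p)


def sphere(x: Sequence[float]) -> float:
    """Hypothetical fitness function used only for this illustration."""
    return sum(v * v for v in x)


rng = random.Random(0)
nests = [[1.0, 2.0], [0.5, -1.0], [-2.0, 0.3]]
updated = preference_random_update(nests[0], nests, sphere, rng)
```

Because the selection step is greedy, the returned position is never worse than Xp(t+1) under the chosen fitness direction, which is the property the definitions of f(χp(t+1)) and f(Xp(t+1)) suggest.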
CN202210243161.8A | 2022-03-11 | 2022-03-11 | Stock public opinion monitoring and wind control system based on algorithm, big data and artificial intelligence | Pending | CN114612239A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210243161.8A, CN114612239A (en) | 2022-03-11 | 2022-03-11 | Stock public opinion monitoring and wind control system based on algorithm, big data and artificial intelligence

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210243161.8A, CN114612239A (en) | 2022-03-11 | 2022-03-11 | Stock public opinion monitoring and wind control system based on algorithm, big data and artificial intelligence

Publications (1)

Publication Number | Publication Date
CN114612239A (en) | 2022-06-10

Family

ID=81862616

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210243161.8A (Pending), CN114612239A (en) | Stock public opinion monitoring and wind control system based on algorithm, big data and artificial intelligence | 2022-03-11 | 2022-03-11

Country Status (1)

Country | Link
CN (1) | CN114612239A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN116012019A (en)* | 2023-03-27 | 2023-04-25 | 北京力码科技有限公司 | Financial wind control management system based on big data analysis
CN118552315A (en)* | 2024-07-30 | 2024-08-27 | 上海大智慧信息科技有限公司 | Real-time monitoring system and device for stock abnormal transaction behavior
CN120470582A (en)* | 2025-07-08 | 2025-08-12 | 福州高岭数据科技有限公司 | An attack defense method and device for e-commerce intelligent customer service large model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109583738A (en)* | 2018-11-22 | 2019-04-05 | 第创业证券股份有限公司 | A kind of device and method for bond risk control
CN111061792A (en)* | 2019-12-16 | 2020-04-24 | 杭州城市大数据运营有限公司 | Financial service management system
CN111192144A (en)* | 2020-01-03 | 2020-05-22 | 湖南工商大学 | Financial data prediction method, device, equipment and storage medium
CN113034284A (en)* | 2021-04-14 | 2021-06-25 | 刘星 | Stock tendency analysis and early warning system based on algorithm, big data and block chain
CN113065962A (en)* | 2021-03-31 | 2021-07-02 | 北京安九信息技术有限公司 | Stock price transaction risk assessment method, system and device for listed companies
CN113393331A (en)* | 2021-06-10 | 2021-09-14 | 罗忠明 | Database and algorithm based big data insurance accurate wind control, management, intelligent customer service and marketing system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109583738A (en)* | 2018-11-22 | 2019-04-05 | 第创业证券股份有限公司 | A kind of device and method for bond risk control
CN111061792A (en)* | 2019-12-16 | 2020-04-24 | 杭州城市大数据运营有限公司 | Financial service management system
CN111192144A (en)* | 2020-01-03 | 2020-05-22 | 湖南工商大学 | Financial data prediction method, device, equipment and storage medium
CN113065962A (en)* | 2021-03-31 | 2021-07-02 | 北京安九信息技术有限公司 | Stock price transaction risk assessment method, system and device for listed companies
CN113034284A (en)* | 2021-04-14 | 2021-06-25 | 刘星 | Stock tendency analysis and early warning system based on algorithm, big data and block chain
CN113393331A (en)* | 2021-06-10 | 2021-09-14 | 罗忠明 | Database and algorithm based big data insurance accurate wind control, management, intelligent customer service and marketing system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN116012019A (en)*2023-03-272023-04-25北京力码科技有限公司Financial wind control management system based on big data analysis
CN118552315A (en)*2024-07-302024-08-27上海大智慧信息科技有限公司Real-time monitoring system and device for stock abnormal transaction behavior
CN120470582A (en)*2025-07-082025-08-12福州高岭数据科技有限公司 An attack defense method and device for e-commerce intelligent customer service large model
CN120470582B (en)*2025-07-082025-09-19福州高岭数据科技有限公司 An attack defense method and device for e-commerce intelligent customer service large model

Similar Documents

Publication | Publication Date | Title
CN114612239A (en) | Stock public opinion monitoring and wind control system based on algorithm, big data and artificial intelligence
CN114078050A (en) | Loan overdue prediction method and device, electronic equipment and computer readable medium
Xu et al. | Novel key indicators selection method of financial fraud prediction model based on machine learning hybrid mode
CN110930038A (en) | Loan demand identification method, loan demand identification device, loan demand identification terminal and loan demand identification storage medium
CN117934162A (en) | Multi-dimensional dynamic assessment of real estate mortgage financial risk prevention and control method and system
CN114897564A (en) | Target customer recommendation method and device, electronic equipment and storage medium
CN116385151A (en) | Method and computing device for risk rating prediction based on big data
CN113393316B (en) | Loan overall process accurate wind control and management system based on massive big data and core algorithm
Chen et al. | Predicting a corporate financial crisis using letters to shareholders.
Zang | Construction of Mobile Internet Financial Risk Cautioning Framework Based on BP Neural Network
CN117933568A (en) | Operation decision method, apparatus, device, medium and program product
CN117172910A (en) | Credit evaluation method and device based on EBM model, electronic equipment and storage medium
CN116844287A (en) | Cash amount prediction method, cash amount prediction device, computer equipment and storage medium
CN116384750A (en) | Method and computing device for generating marking sample and training risk rating prediction model
CN116384751A (en) | Method and computing device for carrying out standardized risk index and risk rating prediction
CN117764692A (en) | Method for predicting credit risk default probability
Nazari et al. | Evaluating the effectiveness of data mining techniques in credit scoring of bank customers using mathematical models: a case study of individual borrowers of Refah Kargaran Bank in Zanjan Province, Iran
Lohse | Machine Learning in Banking: Exploring the feasibility of using consumer level bank transaction data for credit risk evaluation
CN120181980A (en) | Method for constructing a serious negative risk prediction model for inclusive enterprises and a Scorenegative model for inclusive credit
CN118333737A (en) | Method for constructing retail credit risk prediction model and consumer credit business Scorebetai model
CN119477509A (en) | Methods for building retail credit risk prediction models and Scorealpha2 models for credit cards and special installment businesses
CN118333739A (en) | Method for constructing retail credit risk prediction model and retail credit business Scoremult model
CN119887363A (en) | Method for constructing retail credit risk prediction model and Internet credit service Scoregamma model
CN120106960A (en) | Methods for building inclusive credit risk prediction models and inclusive credit Scorezeta model
Faris et al. | Using Artificial Intelligence and Deep Learning Applications in Credit Risk Analysis
Faris et al.Using Artificial Intelligence and Deep Learning Applications in Credit Risk Analysis

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication

Application publication date: 20220610

