









Technical Field
The invention relates to the technical field of financial risk control management, and in particular to a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence.
Background Art
With the rapid spread of wealth management products, financial securities have become one of the main investment vehicles for most people. Financial securities are securities issued by banks and non-bank financial institutions. However, every wealth management product carries some degree of financial risk, that is, any risk that may cause financial loss to an enterprise or institution. Risk in financial securities is not limited to the asset losses that users suffer from changes in security prices; it also includes risks arising from both users and securities companies. Before a transaction, these may stem from credit problems of the securities company or the user, the user's insufficient ability to identify risk, or the securities company's incomplete disclosure of information to users; during a transaction, from policy changes or the influence of public opinion; and after a transaction, from after-sales service. How to identify or predict, effectively and in time, the risks that may exist in the financial securities business process, and how to control them effectively, is a key constraint on the long-term development of China's financial securities service industry. Practice abroad has shown that an Internet financial risk control system built around big data technology plays a major role in guarding against financial risk. Securities companies, as an important part of the financial sector, should therefore make full use of their accumulated data to develop more diversified financial services and improve their risk control systems. However, there is currently no reasonably comprehensive stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence.
Summary of the Invention
The object of the invention is to provide a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, so as to solve the problems raised in the background art above.
To solve the above technical problem, a first object of the invention is to provide a stock public opinion monitoring and risk control system based on algorithms, big data and artificial intelligence, comprising:
a platform architecture unit, a data processing unit, a prediction and identification unit and a risk control management unit, which are connected in sequence via network communication; the platform architecture unit connects and manages the equipment, software and technical applications that make up the operating environment of the platform; the data processing unit collects a large amount of diverse data related to financial securities and their risks, organizes and analyzes the data, and builds a sound data analysis model; the prediction and identification unit predicts the risks that may exist in the financial securities business process through in-depth mining and analysis of large amounts of data, and identifies the type and analyzes the degree of each risk; the risk control management unit manages and controls the risks of the financial securities business from multiple aspects using multiple risk control methods;
the platform architecture unit includes an infrastructure equipment module, a software environment module, a technical support module and a third-party platform module;
the data processing unit includes a data collection module, a classification and sorting module, a data analysis module and a data model module;
the prediction and identification unit includes a dynamic monitoring module, a risk prediction module, a type identification module and a degree determination module;
the risk control management unit includes a risk control module, a cooperative risk control module, a supervisory intervention module and an improvement measure module.
As a further improvement of this technical solution, the infrastructure equipment module, the software environment module, the technical support module and the third-party platform module are connected in sequence via network communication; the infrastructure equipment module connects and manages the electronic computer equipment joined to the risk control platform system; the software environment module develops, on the basis of the infrastructure equipment, software and application platforms for risk management of the securities and financial business, so as to build the operating environment that supports the system; the technical support module loads intelligent technologies, chiefly artificial intelligence, and introduces multiple intelligent algorithms to support the smooth operation of the platform system; the third-party platform module connects multiple third-party service platforms, such as financial securities information management platforms and supervisory platforms, to obtain a large amount of supplementary data and supplementary services.
The infrastructure equipment includes, but is not limited to, computers, displays, tablet PCs, mobile phones, smart sensors and data acquisition devices (scanners, RFID, ID-card OCR, face/fingerprint recognizers, etc.).
As a further improvement of this technical solution, the signal output of the data collection module is connected to the signal input of the classification and sorting module, the signal output of the classification and sorting module is connected to the signal input of the data analysis module, and the signal output of the data analysis module is connected to the signal input of the data model module; the data collection module obtains a large amount of data related to financial securities from multiple sources by various means; the classification and sorting module classifies, summarizes and sorts the data according to defined category rules in preparation for later calculation and analysis; the data analysis module applies a variety of leading technologies to analyze the financial securities data; the data model module builds a data model for risk analysis on the basis of the large amount of data and the results of the data analysis, and trains and validates it.
As a further improvement of this technical solution, the data collection module includes a public opinion information module, a user credit module, a company product module and a transaction activity module, which are connected in sequence via network communication and run in parallel; the public opinion information module obtains publicly available historical or real-time public opinion information related to financial securities from the network; the user credit module obtains information related to user credit from users, securities companies, partner banks and similar sources, by lawful means or with the user's authorization; the company product module obtains information on each securities company, including its operating condition, disclosed assets, lines of business and details of specific products; the transaction activity module obtains information on the whole process of transactions between users and securities companies.
As a further improvement of this technical solution, the data analysis module includes a neural network module, a machine learning module, a support vector machine module and a collision analysis module, which are connected in sequence via network communication and operate independently; the neural network module uses the training algorithm of the neural network to tune the weights to their best values so that the prediction performance of the whole network is optimal: the BP neural network or support vector machine is trained on the samples in the training sample set and tested on the samples in the test sample set, so as to build a stock trend prediction model based on the BP neural network for predicting the trend of the target stock; a parameter optimization unit optimizes the initial weights and thresholds of the BP neural network with the artificial glowworm swarm optimization algorithm, or optimizes the penalty factor and kernel function parameters of the support vector machine with the cuckoo search algorithm; the machine learning module trains the neural network using machine learning techniques so that the parameters approximate the true model as closely as possible, giving model training the dual advantages of performance and data utilization; the support vector machine module applies the support vector machine algorithm, separating the data by constructing a separating surface for relationship analysis; the collision analysis module cross-analyzes data and risk factors from different financial fields and different financial institutions to uncover potential risks.
As a further improvement of this technical solution, in the machine learning module, the machine learning training algorithm is:
First, assign random values to all parameters and use these randomly generated parameter values to predict the samples in the training data;
Let the predicted target of a sample be $y_p$ and the true target be $y$; then define a value loss, computed as:
$$\mathrm{loss} = (y_p - y)^2;$$
where loss is called the loss, and the goal of machine learning is to make the sum of the losses over all training data as small as possible;
Furthermore, if the matrix formula for the neural network's prediction given earlier is substituted into $y_p$, the loss can be written as a loss function of the parameters.
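The training procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: a one-variable linear model stands in for the network's prediction $y_p$, plain gradient descent is assumed as the optimizer, and the names `predict`, `total_loss`, `train` and `DATA`, the learning rate and the step count are all hypothetical.

```python
import random


def predict(params, x):
    """Linear stand-in for the network's prediction y_p (an assumption)."""
    w, b = params
    return w * x + b


def total_loss(params, data):
    """Sum over all samples of loss = (y_p - y)^2."""
    return sum((predict(params, x) - y) ** 2 for x, y in data)


def train(data, steps=500, lr=0.02, seed=0):
    rng = random.Random(seed)
    # Assign random values to all parameters, as the text describes.
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    initial = total_loss((w, b), data)
    for _ in range(steps):
        # Gradient of the summed squared loss with respect to w and b
        # (gradient descent is an assumed optimizer, not named in the text).
        grad_w = sum(2 * (predict((w, b), x) - y) * x for x, y in data)
        grad_b = sum(2 * (predict((w, b), x) - y) for x, y in data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b), initial, total_loss((w, b), data)


# Hypothetical toy data generated from y = 2x + 1.
DATA = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
```

Calling `train(DATA)` returns the fitted parameters together with the summed loss before and after training, which makes the reduction in total loss directly visible.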
As a further improvement of this technical solution, in the support vector machine module, the support vectors are selected as follows:
Taking the linearly separable SVM as an example, regard $w$ as a linear combination of samples: the first sample is $x_1$, the $i$-th is $x_i$, and each $x_i$ is given a coefficient $\alpha_i$, so that $w = \sum_i \alpha_i x_i$. Choose some of the $\alpha_i$ to be nonzero and set the rest to 0; then the only vectors that actually contribute to $w$ are the $x_i$ whose coefficients are nonzero. These vectors support the normal vector and are therefore the support vectors;
If a straight line $l$ has parameters $w$ and $b$, the most suitable separating line is judged by computing the distance from each sample to $l$. The distance can be written as $d = \dfrac{w \cdot x + b}{\lVert w \rVert}$. Each sample in the data set has the form $(x_i, y_i)$, where $y_i$ is the label of the sample (1 for a positive example, -1 for a negative example; the sign reflects which side of the separating line the sample lies on, with the side in the positive direction of the normal counted as positive);
Multiplying the right-hand side by the label gives $d_i = \dfrac{y_i\,(w \cdot x_i + b)}{\lVert w \rVert}$, where $y_i$ is the actual sign of the sample. If the estimate has the same sign as the actual value, the classification is correct and the result is positive; if the classification is wrong, the result is negative;
Among all samples, those closest to the line are chosen as support vectors, and the distance between the support vectors and the line is the margin (transition band). Because the SVM seeks the largest possible margin, the final choice of the parameters $w$ and $b$ can be expressed as:
$$\arg\max_{w,b}\left\{\frac{1}{\lVert w \rVert}\,\min_i\left[\,y_i\,(w \cdot x_i + b)\right]\right\};$$
Therefore, given a linearly separable training data set, the separating hyperplane obtained by margin maximization is $w^{*} \cdot x + b^{*} = 0$, and the corresponding classification decision function is $f(x) = \operatorname{sign}(w^{*} \cdot x + b^{*})$.
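The margin criterion above can be illustrated with a short Python sketch. It does not solve the SVM optimization problem; it only evaluates the margin of a few hand-picked candidate lines and keeps the one with the largest margin. The function names, the sample points in `SAMPLES` and the candidate lines in `CANDIDATES` are hypothetical.

```python
import math


def functional_margin(w, b, x, y):
    """y * (w . x + b): positive when the sample is classified correctly."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def geometric_margin(w, b, samples):
    """Distance from the closest sample to the line, i.e. the margin."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(functional_margin(w, b, x, y) for x, y in samples) / norm


def best_line(candidates, samples):
    """Pick the candidate (w, b) whose closest sample is farthest away."""
    return max(candidates, key=lambda wb: geometric_margin(wb[0], wb[1], samples))


def decide(w, b, x):
    """Classification decision function sign(w . x + b)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical linearly separable samples: (point, label).
SAMPLES = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]
# Hypothetical candidate lines: (w, b).
CANDIDATES = [((1.0, 1.0), -2.0), ((1.0, 0.0), -1.0)]
```

In practice the maximizing $w$ and $b$ would be found by a quadratic programming solver rather than by enumerating candidates; the enumeration is used here only to keep the example self-contained.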
As a further improvement of this technical solution, the signal output of the dynamic monitoring module is connected to the signal input of the risk prediction module, the signal output of the risk prediction module is connected to the signal input of the type identification module, and the signal output of the type identification module is connected to the signal input of the degree determination module; the dynamic monitoring module uses digital technology to replace risk control that takes user data from a single past period or point in time as the basis for review, and so cannot capture continuous data, with risk control that values continuous user information and assigns it a higher weight, thereby dynamically monitoring the risks of the financial securities business; the risk prediction module uses the constructed risk prediction data model to automatically predict the risk factors that may exist throughout the transaction process between users and securities companies; the type identification module identifies the type of a predicted risk according to where it occurs in the transaction process; the degree determination module automatically evaluates the degree of each risk according to preset risk level classification rules.
The risk types include, but are not limited to, market risk, credit risk, liquidity risk, operational risk, industry risk, legal, regulatory or policy risk, personnel risk, and natural disasters or other emergencies.
As a further improvement of this technical solution, the risk control module, the cooperative risk control module, the supervisory intervention module and the improvement measure module are connected in sequence via network communication; the risk control module controls and manages the various risks that may arise in the financial securities business process at three stages: before, during and after a transaction; the cooperative risk control module improves the effect of risk control by sharing risk control data and risk control methods across different financial fields and different financial institutions; the supervisory intervention module starts from data supervision: while allowing brokers' rights to develop customer information and transaction data to be further relaxed, it monitors in real time the sources and channels through which brokers or third-party enterprises obtain customer-related information, the processes by which the data are further processed internally, and the databases subsequently built from them (including customer trading habits and credit information), so as to supervise Internet-based securities business end to end, and it brings in intervention by third-party supervisory platforms to keep the risk of the financial securities business low; the improvement measure module allows securities companies to use big data to improve certain business functions and so strengthen their risk control systems.
Ex-ante risk control mainly covers credit investigation, risk pricing and anti-fraud.
As a further improvement of this technical solution, the improvement measure module includes a user onboarding module, a transaction account module, a data fusion module and a product innovation module, which are connected in sequence via network communication; the user onboarding module moves online the offline records, trading behavior and other data of a securities company's large base of existing offline customers, so as to accumulate users' online data; the transaction account module enables online payment from users' securities trading accounts, extends their online non-securities functions, and links the account system to other online platforms in order to accumulate more non-securities transaction data; the data fusion module builds a data fusion platform on top of multiple financial information management platforms to collect and integrate all data, so that individual behavior can be analyzed and predicted along every dimension and individual risk can be assessed quickly and accurately across all dimensions; the product innovation module uses big data analysis as the basis for product innovation across the board, so that customized products can be developed and precisely pushed and marketed, Internet-based product design can be carried out, and big data can be used for risk pricing.
As a further improvement of this technical solution, when the support vector machine is trained with the constructed sample set, optimizing the penalty factor and kernel function parameters of the support vector machine with the cuckoo search algorithm includes setting that the smaller the fitness function value of a nest position, the better the solution corresponding to that nest position; the monitoring target determination unit extracts the stock subjects from each item of received news public opinion data and counts the news public opinion data containing each stock subject; when the proportion of news public opinion data containing a stock subject among the news public opinion data received this time exceeds a given threshold, that stock subject is determined to be a target stock requiring public opinion monitoring, specifically:
Let $M$ denote the total number of news public opinion data items received this time by the monitoring target determination unit, let $s_i$ denote the $i$-th stock subject that the unit extracts from them, and let $M(s_i)$ denote the number of those items that contain $s_i$. When $M(s_i)/M > H$, the $i$-th stock subject is determined to be a target stock requiring public opinion monitoring, where $H$ is a given threshold;
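The target stock rule can be sketched in Python as below. The function name `find_target_stocks`, the representation of each news item as a set of stock subjects, and the example threshold values are assumptions made for illustration; extraction of stock subjects from raw text is outside the sketch.

```python
def find_target_stocks(news_items, threshold):
    """Return stock subjects whose share of the received news exceeds threshold.

    news_items: list of collections, each holding the stock subjects
    mentioned in one news item (a hypothetical input format).
    """
    total = len(news_items)
    if total == 0:
        return []
    counts = {}
    for subjects in news_items:
        # Count each subject at most once per news item.
        for subject in set(subjects):
            counts[subject] = counts.get(subject, 0) + 1
    # Strict inequality: the proportion must exceed the threshold.
    return sorted(s for s, c in counts.items() if c / total > threshold)
```

For example, with four news items in which one subject appears three times and two others once each, a threshold of 0.3 selects only the first subject.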
The public opinion early-warning unit counts the news public opinion data containing the target stock and issues an early warning when the proportion of items whose sentiment label is negative among the news public opinion data containing the target stock exceeds a given early-warning threshold, specifically:
Let $N_j$ denote the total number of news public opinion data items containing the $j$-th target stock received by the public opinion monitoring module, and let $N_j^{-}$ denote the number of those items whose sentiment label is negative. When $N_j^{-}/N_j > R$, the public opinion early-warning unit issues an early warning, where $R$ is a given early-warning threshold;
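A minimal Python sketch of the early-warning rule follows. The label string `"negative"`, the function name `should_warn` and the threshold values used in the example are hypothetical; how sentiment labels are produced is not covered here.

```python
def should_warn(sentiment_labels, warn_threshold):
    """Warn when the share of negative items exceeds the warning threshold.

    sentiment_labels: one label per news item about a single target stock
    (the label vocabulary is an assumption of this sketch).
    """
    if not sentiment_labels:
        return False
    negative = sum(1 for label in sentiment_labels if label == "negative")
    return negative / len(sentiment_labels) > warn_threshold
```

With two negative labels out of four, the negative share is 0.5, so a threshold of 0.4 triggers a warning while a threshold of 0.5 does not, because the comparison is strict.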
The public opinion data management unit preprocesses the received news public opinion data containing the target stock; the preprocessing includes word segmentation, stop-word filtering, and deletion of all text unrelated to sentiment information, such as link addresses and contact details;
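The preprocessing steps can be sketched in Python as follows. This is an illustration under assumptions: the regular expressions for links, e-mail addresses and phone numbers are simple hypothetical patterns, and a word-character regex stands in for word segmentation, which for Chinese text would require a dedicated segmenter.

```python
import re

# Hypothetical patterns for text that carries no sentiment information.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
PHONE_RE = re.compile(r"\+?\d[\d\- ]{6,}\d")


def preprocess(text, stop_words):
    """Strip links and contact details, segment into words, drop stop words."""
    for pattern in (URL_RE, EMAIL_RE, PHONE_RE):
        text = pattern.sub(" ", text)
    # Stand-in for word segmentation (an assumption of this sketch).
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stop_words]
```

The output is a list of tokens that could then be passed to a sentiment classifier.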
Let $x_i(t)$ denote the final nest position of the $i$-th cuckoo in the cuckoo population after the $t$-th iterative update. The Lévy flight mode is used to perform the $(t+1)$-th iterative update of the nest position $x_i(t)$.
The update yields the nest position after the $(t+1)$-th Lévy flight update. It combines a step factor, entrywise (point-to-point) multiplication with a random search vector drawn from a Lévy distribution with a given parameter, and the fitness function values of two nest positions;
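For orientation, a Lévy flight update in the commonly used cuckoo search form can be sketched in Python as below. It is not the update formula of this disclosure, which also involves fitness function values. The sketch assumes Mantegna's algorithm for drawing the Lévy step, hypothetical values `alpha=0.01` and `beta=1.5`, and a greedy acceptance rule that keeps the candidate only if its fitness value is smaller, in line with the convention that a smaller fitness value is better.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Lévy-distributed step with Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0, sigma_u)
    v = rng.gauss(0, 1)
    # Guard against a (practically impossible) zero denominator.
    return u / max(abs(v), 1e-12) ** (1 / beta)


def levy_update(x, fitness, rng, alpha=0.01):
    """Propose x + alpha * Levy entrywise; keep it only if fitness improves."""
    candidate = [xi + alpha * levy_step(rng) for xi in x]
    return candidate if fitness(candidate) < fitness(x) else x
```

Repeated calls to `levy_update` never increase the fitness value of the nest position, because a worse candidate is discarded.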
After the $(t+1)$-th Lévy flight update of the nest positions in the population, nest positions are selected from the population for the $(t+1)$-th preference random update by the following steps:
(1) divide the nest positions obtained from the $(t+1)$-th Lévy flight update into regions;
(2) select nest positions in each region for the $(t+1)$-th preference random update;
A region segmentation threshold is given for the population after the $(t+1)$-th Lévy flight update. Its value is set from the nearest-neighbor segmentation values of the nest positions, where the nearest-neighbor segmentation value of a nest position is determined from the nest positions in the current population closest to it, up to a given positive integer number of neighbors chosen with reference to the number of cuckoos in the population. The nest positions obtained from the $(t+1)$-th Lévy flight update are divided into regions according to the given region segmentation threshold by the following steps:
Step 1: Randomly select a nest position in the population, mark the region in which it lies as a new region, and assign the selected nest position to that region. Then screen, in turn, the nest positions in the population that have not yet been assigned to a region: a nest position that satisfies the segmentation condition with respect to the region segmentation threshold is assigned to the same region.
When all unassigned nest positions in the population have been screened, go to Step 2;
Step 2: Randomly select a nest position from those in the population not yet assigned to any region, mark the region in which it lies as a new region, and assign the selected nest position to that region. Then screen, in turn, the nest positions in the population still unassigned: a nest position that satisfies the segmentation condition is assigned to this region.
When all unassigned nest positions in the population have been screened, go to Step 3;
Step 3: While the number of nest positions in the population not assigned to a region is not zero, continue to partition them as in Step 2; when that number is zero, stop partitioning the nest positions in the population;
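The three partitioning steps can be sketched in Python as below. The segmentation condition itself is not reproduced in the text, so the sketch assumes one plausible reading: a nest position joins a region when its Euclidean distance to the randomly chosen seed of that region does not exceed the threshold. The function name `partition_regions` and the fixed `threshold` argument are hypothetical.

```python
import math
import random


def partition_regions(nests, threshold, rng):
    """Greedy seed-based partition of nest positions into regions.

    Returns a list of regions, each a list of indices into `nests`.
    Assumed condition: distance to the region's seed <= threshold.
    """
    unassigned = list(range(len(nests)))
    regions = []
    # Step 3: repeat until no nest position is left unassigned.
    while unassigned:
        # Steps 1 and 2: pick a random unassigned nest as the seed of a new region.
        seed = rng.choice(unassigned)
        # Screen the unassigned nests; the seed itself always qualifies.
        region = [i for i in unassigned
                  if math.dist(nests[i], nests[seed]) <= threshold]
        regions.append(region)
        unassigned = [i for i in unassigned if i not in region]
    return regions
```

Every nest position ends up in exactly one region, and the number of regions depends on both the threshold and the random choice of seeds.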
Nest positions are selected in each region for the $(t+1)$-th preference random update as follows:
For each region obtained by partitioning the nest positions after the $(t+1)$-th Lévy flight update, a regional attribute coefficient is defined for the nest positions in the region. It is computed from the mean and the coefficient of dispersion of the proximity distance values of the nest positions in the region, and from the number of nest positions in the region, where the proximity distance value of a nest position is defined with respect to the nest position in the region nearest to it;
The nest positions in a region are tested for substitutability by the following steps:
Step 1: Take the set of nest positions in the region that have not yet been tested for substitutability. For each nest position in this set, a substitutability coefficient is defined
from a substitutability judgment function evaluated between that nest position and the other nest positions in the set; the judgment function takes one of two values depending on whether its condition is satisfied;
Compute the substitutability coefficient of every nest position in the set in this way;
Step 2: Select the nest position with the largest substitutability coefficient in the current set for substitutability testing, and form its set of substitutable nest positions, that is, the nest positions in the current set for which the judgment function indicates substitutability. When the substitutability coefficient satisfies the test against the given substitutability detection threshold (a positive integer), randomly select nest positions from the substitutable set for preference random update, mark the selected nest position and its substitutable nest positions as tested, remove them from the current set, and go to Step 3; otherwise, stop selecting nest positions in the region for the $(t+1)$-th preference random update;
Step 3: Continue as in Step 2, selecting the nest position with the largest substitutability coefficient in the current set for substitutability testing, until all nest positions in the set have been tested;
The selected nest positions undergo the preference random update of the current iteration in the following manner:
Consider a nest position selected for the preference random update. It is associated with one cuckoo in the population: it is the nest position obtained by applying the Lévy flight mode, in the current iteration, to that cuckoo's final nest position from the previous iteration. The preference random update is then performed as follows:
In the formula, the terms denote: the nest position obtained after the selected nest position is updated in the preference random update mode; the fitness function values of the two nest positions being compared; the final nest position of the cuckoo after the current iteration update; two nest positions selected at random, subject to a stated constraint, from the nest positions updated by the Lévy flight mode in the current iteration; and a random number drawn from a stated interval.
The second object of the present invention is to provide an operating method for the stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence, comprising:
First, on the product architecture of the prediction and risk control platform, the relevant software applications are developed and connected to the financial securities information management platforms, third-party service platforms, and regulatory platforms. Large volumes of financial public opinion information, user credit information, information on trading activity between users and securities companies, and data on securities companies and their products are obtained from multiple channels by lawful means. After the data are collected and integrated, algorithmic techniques such as neural networks, machine learning, and support vector machines are used to model and train on the data so as to ensure the accuracy of the risk prediction data model. While the platform is running, the model dynamically monitors the financial securities business, predicts in real time the risks that may arise anywhere in the business process, identifies the type of each risk, and assesses its severity, so that preset risk control measures can be applied to intervene in or eliminate it. Big data are also used to continually improve the policies and procedures of the financial securities trading business, with the aim of fundamentally reducing its risk.
The third object of the present invention is to provide an operating device for the stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence, comprising a processor, a memory, and a computer program stored in the memory and run on the processor, wherein the processor implements the above system when executing the computer program.
The fourth object of the present invention is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence.
Compared with the prior art, the present invention has the following beneficial effects:
1. The stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence builds on the large volume of market data accumulated by financial companies. It obtains public opinion data, financial securities trading activity, user credit data, and similar data from multiple channels, processes this mass of data, builds a risk prediction data model, and trains the model with several techniques to improve its accuracy, so that risks that may exist in the financial securities business process can be identified or predicted quickly and accurately and then controlled;
2. The system attaches importance to users' credit standing and dynamically monitors the financial securities business process. It concentrates on ex-ante risk control, gaining a detailed understanding of each user so that marketing can be personalized and ex-ante risk factors such as credit, risk pricing, and fraud can be reduced, protecting the interests of both the securities company and the user;
3. The system is based on big data analysis and supported by the Internet finance environment. It improves data accuracy, monitors the financial securities business in real time, can combine financial data from different fields to improve risk control, and can comprehensively improve the business of financial securities companies, fundamentally reducing the level of risk and strengthening the securities company's risk control system.
Description of the drawings
FIG. 1 is an exemplary product architecture diagram of the present invention;
FIG. 2 is a structural diagram of the overall platform system device of the present invention;
FIG. 3 is the first structural diagram of a partial platform system device of the present invention;
FIG. 4 is the second structural diagram of a partial platform system device of the present invention;
FIG. 5 is the third structural diagram of a partial platform system device of the present invention;
FIG. 6 is the fourth structural diagram of a partial platform system device of the present invention;
FIG. 7 is the fifth structural diagram of a partial platform system device of the present invention;
FIG. 8 is the sixth structural diagram of a partial platform system device of the present invention;
FIG. 9 is the seventh structural diagram of a partial platform system device of the present invention;
FIG. 10 is a schematic structural diagram of an exemplary electronic computer platform device of the present invention.
The reference numerals in the figures denote:
1, computer processor; 2, display terminal; 3, risk control platform; 4, data storage server; 5, third-party platform server; 6, user;
100, platform architecture unit; 101, infrastructure equipment module; 102, software environment module; 103, technical support module; 104, third-party platform module;
200, data processing unit; 201, data collection module; 2011, public opinion information module; 2012, user credit module; 2013, company product module; 2014, trading activity module; 202, classification and sorting module; 203, data analysis module; 2031, neural network module; 2032, machine learning module; 2033, support vector machine module; 2034, collision analysis module; 204, data model module;
300, prediction and identification unit; 301, dynamic monitoring module; 302, risk prediction module; 303, type identification module; 304, severity determination module;
400, risk control management unit; 401, risk control module; 402, cooperative risk control module; 403, regulatory intervention module; 404, improvement measures module; 4041, user onboarding module; 4042, trading account module; 4043, data fusion module; 4044, product innovation module.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment 1
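As shown in FIGS. 1 to 10, this embodiment provides a stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence, comprising
a platform architecture unit 100, a data processing unit 200, a prediction and identification unit 300, and a risk control management unit 400, which are connected in sequence by network communication. The platform architecture unit 100 connects and manages the equipment, software, and technical applications that make up the platform's operating environment. The data processing unit 200 collects a large volume of diverse data related to financial securities and their risks, sorts and analyzes the data, and builds a complete data analysis model. The prediction and identification unit 300 predicts the risks that may exist in the financial securities business process through in-depth mining and analysis of the data, and identifies the type and analyzes the severity of each risk. The risk control management unit 400 manages and controls the risks of the financial securities business from several angles using a variety of risk control measures.
The platform architecture unit 100 comprises an infrastructure equipment module 101, a software environment module 102, a technical support module 103, and a third-party platform module 104;
The data processing unit 200 comprises a data collection module 201, a classification and sorting module 202, a data analysis module 203, and a data model module 204;
The prediction and identification unit 300 comprises a dynamic monitoring module 301, a risk prediction module 302, a type identification module 303, and a severity determination module 304;
The risk control management unit 400 comprises a risk control module 401, a cooperative risk control module 402, a regulatory intervention module 403, and an improvement measures module 404.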
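In this embodiment, the infrastructure equipment module 101, the software environment module 102, the technical support module 103, and the third-party platform module 104 are connected in sequence by network communication. The infrastructure equipment module 101 connects and manages the electronic computer equipment that joins the risk control platform system. The software environment module 102 develops, on top of the infrastructure equipment, software and application platforms for risk management of the securities finance business, so as to build the operating environment that supports the system. The technical support module 103 loads intelligent technologies, chiefly artificial intelligence, and introduces a variety of intelligent algorithms to keep the platform system running smoothly. The third-party platform module 104 connects to multiple third-party service platforms, such as financial securities information management platforms and regulatory platforms, to obtain supplementary data and supplementary services.
The infrastructure equipment includes, but is not limited to, computers, monitors, tablet PCs, mobile phones, smart sensors, and data acquisition devices (scanners, RFID, ID card OCR, face or fingerprint readers, and the like).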
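In this embodiment, the signal output of the data collection module 201 is connected to the signal input of the classification and sorting module 202, the signal output of the classification and sorting module 202 is connected to the signal input of the data analysis module 203, and the signal output of the data analysis module 203 is connected to the signal input of the data model module 204. The data collection module 201 obtains a large volume of data related to financial securities from multiple sources by various means. The classification and sorting module 202 classifies, summarizes, and sorts the data according to given category rules in preparation for later computation and analysis. The data analysis module 203 analyzes the financial securities data using a variety of leading techniques. The data model module 204 builds a data model for risk analysis from the data and the analysis results, and trains and validates it.
Further, the data collection module 201 comprises a public opinion information module 2011, a user credit module 2012, a company product module 2013, and a trading activity module 2014, which are connected in sequence by network communication and run in parallel. The public opinion information module 2011 obtains public historical or real-time public opinion information related to financial securities from the network. The user credit module 2012 obtains information related to user credit from users, securities companies, partner banks, and similar sources, by lawful means or with the user's authorization. The company product module 2013 obtains information on each securities company, including its operating status, disclosed assets, business lines, and product details. The trading activity module 2014 obtains information on the whole process of trading activity between users and securities companies.
Further, the data analysis module 203 comprises a neural network module 2031, a machine learning module 2032, a support vector machine module 2033, and a collision analysis module 2034, which are connected in sequence by network communication and operate independently. The neural network module 2031 uses a neural network training algorithm to adjust the weights to their best values so that the network as a whole predicts as well as possible. Samples in the training set are used to train a BP neural network or a support vector machine, and samples in the test set are used to test it, so as to build a BP-neural-network-based stock trend prediction model for predicting the trend of a target stock. A parameter optimization unit optimizes the initial weights and thresholds of the BP neural network with the artificial glowworm swarm optimization algorithm, or optimizes the penalty factor and kernel function parameters of the support vector machine with the cuckoo search algorithm. The machine learning module 2032 uses machine learning techniques to train the neural network so that its parameters approximate the true model as closely as possible, giving model training advantages in both performance and data utilization. The support vector machine module 2033 uses the support vector machine algorithm to separate the data by constructing a separating surface, for the purpose of relationship analysis. The collision analysis module 2034 performs collision analysis on data and risk factors from different financial fields and different financial institutions to uncover latent risks. The data analysis module 203 also trains the support vector machine on the constructed sample set; during this training, the cuckoo search algorithm is used to optimize the penalty factor and kernel function parameters of the support vector machine, with the convention that the smaller the fitness function value of a cuckoo's nest position, the better the solution that nest position represents.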
Specifically, in the machine learning module 2032, the training algorithm is as follows:
First, all parameters are assigned random values, and these randomly generated parameter values are used to predict the samples in the training data;
Let the predicted target of a sample be $y_p$ and the true target be $y$; a value loss is then defined by the following formula:
$loss = (y_p - y)^2$;
Here loss is called the loss, and the goal of machine learning is to make the sum of the losses over all training data as small as possible;
Furthermore, if the matrix formula for the neural network's prediction given earlier is substituted for $y_p$, the loss can be written as a function of the parameters, called the loss function.
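The training objective described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-sample loss is taken to be the squared error, the predictor is a hypothetical linear model (the source does not fix the model form here), and the names `predict`, `squared_loss`, `total_loss`, and the toy data are invented for the example.

```python
import random

def predict(params, x):
    """Hypothetical linear predictor: weighted sum of features plus a bias."""
    *weights, bias = params
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def squared_loss(y_pred, y_true):
    """Per-sample loss, assumed here to be the squared error."""
    return (y_pred - y_true) ** 2

def total_loss(params, samples):
    """Sum of the per-sample losses over all training data."""
    return sum(squared_loss(predict(params, x), y) for x, y in samples)

# Toy training data generated by y = 2 * x + 1 (illustrative only).
samples = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0)]

random.seed(0)
random_params = [random.uniform(-1, 1), random.uniform(-1, 1)]  # random start
exact_params = [2.0, 1.0]                                       # true model

print(total_loss(random_params, samples))  # positive: random guess is off
print(total_loss(exact_params, samples))   # 0.0: the true parameters fit exactly
```

Training then amounts to moving the parameters from the random starting point toward values that make `total_loss` as small as possible.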
Specifically, in the support vector machine module 2033, the support vectors are selected as follows:
Take the linearly separable SVM as an example. Regard $w$ as a linear combination of a number of samples, with the first sample denoted $x_1$ and the $i$-th sample $x_i$. Assign each $x_i$ a coefficient $\alpha_i$, so that $w = \sum_i \alpha_i x_i$. Choose some of the $\alpha_i$ to be nonzero and set the rest to 0; the only vectors that actually contribute to $w$ are then the $x_i$ whose coefficients are nonzero. These vectors support the normal vector and are therefore the support vectors;
Suppose a straight line $l$ has parameters $w$ and $b$. Which line is the most suitable separating line is judged by computing the distance from each sample to $l$; the distance can be expressed as $d = \frac{w^T x + b}{\|w\|}$. Each sample in the data set has the form $(x_i, y_i)$, where $y_i$ is the sample's label (1 for a positive example, -1 for a negative example); the sign reflects the side of the separating line on which the sample lies, with the positive direction of the normal counted as positive;
Multiplying the right-hand side by the label gives $y \cdot d = \frac{y\,(w^T x + b)}{\|w\|}$, where $y$ is the sample's actual label. If the estimate has the same sign as the actual value, the classification is correct and the result is positive; if the classification is wrong, the result is negative;
Among all samples, those closest to the line are selected as the support vectors, and the distance between the support vectors and the line is the transition band. Because the SVM seeks the widest possible transition band, the final choice of $w$ and $b$ can be expressed as:
$\arg\max_{w,b} \left\{ \frac{1}{\|w\|} \min_i \left[ y_i\,(w^T x_i + b) \right] \right\}$;
Therefore, given a linearly separable training data set, the separating hyperplane obtained by margin maximization is $w^* \cdot x + b^* = 0$, and the corresponding classification decision function is $f(x) = \operatorname{sign}(w^* \cdot x + b^*)$.
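To make the margin criterion concrete, the following Python sketch compares two hand-picked candidate lines on a toy data set and keeps the one with the larger minimum label-weighted distance. It is not an SVM solver: the candidate lines, the data, and the function names are assumptions chosen for illustration, and a real implementation would optimize over all $w$ and $b$.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the line (hyperplane) w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, samples):
    """Smallest value of y * d over all samples.

    It is positive only when every sample lies on the correct side.
    """
    return min(y * signed_distance(w, b, x) for x, y in samples)

def classify(w, b, x):
    """Decision function: the sign of w.x + b, returned as +1 or -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy linearly separable data: (point, label).
samples = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Two hypothetical candidate separating lines, each given as (w, b).
candidates = {
    "A": ((1.0, 1.0), -2.0),  # the line x1 + x2 = 2
    "B": ((1.0, 0.0), -1.0),  # the line x1 = 1
}

# Keep the candidate whose closest sample is farthest away.
best = max(candidates, key=lambda name: margin(*candidates[name], samples))
print(best)  # A
```

Both lines separate the data, but line A leaves a wider transition band (about 1.41 against 1.0), which is why the max-min criterion prefers it.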
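In this embodiment, the signal output of the dynamic monitoring module 301 is connected to the signal input of the risk prediction module 302, the signal output of the risk prediction module 302 is connected to the signal input of the type identification module 303, and the signal output of the type identification module 303 is connected to the signal input of the severity determination module 304. The dynamic monitoring module 301 uses digital technology to replace risk control approaches that take user data from a single past period or point in time as the basis for review and therefore cannot capture continuous data; it emphasizes continuous user information and assigns it a higher weight, thereby dynamically monitoring the risks of the financial securities business. The risk prediction module 302 uses the risk prediction data model to automatically predict the risk factors that may exist anywhere in the process of trading activity between users and securities companies. The type identification module 303 identifies the type of a predicted risk from its position in the trading activity. The severity determination module 304 automatically assesses the severity of each risk according to preset risk grading rules.
The risk types include, but are not limited to, market risk, credit risk, liquidity risk, operational risk, industry risk, legal, regulatory, or policy risk, personnel risk, and natural disasters or other emergencies.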
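In this embodiment, the risk control module 401, the cooperative risk control module 402, the regulatory intervention module 403, and the improvement measures module 404 are connected in sequence by network communication. The risk control module 401 controls and manages the various risks that may arise in the financial securities business process at three stages: before, during, and after the event. The cooperative risk control module 402 improves risk control by sharing risk control data and methods across different financial fields and different financial institutions. The regulatory intervention module 403 starts from data supervision: while allowing brokers greater freedom to develop customer information and trading data, it monitors in real time the sources and channels through which brokers or third-party companies obtain customer information, the subsequent internal processing of the data, and the databases built from it, such as those covering customer trading habits and credit. This provides end-to-end supervision of Internet-based securities business, and the intervention measures of third-party regulatory platforms are introduced to keep the risk of the financial securities business low. The improvement measures module 404 allows the securities company to use big data to improve certain business functions and so strengthen its risk control system.
Ex-ante risk control mainly covers credit investigation, risk pricing, and anti-fraud.
Further, the improvement measures module 404 comprises a user onboarding module 4041, a trading account module 4042, a data fusion module 4043, and a product innovation module 4044, which are connected in sequence by network communication. The user onboarding module 4041 moves online the offline archives, trading behavior records, and similar data of the securities company's large base of existing offline customers, so as to accumulate users' online data. The trading account module 4042 enables online payment from users' securities trading accounts, extends their online non-securities-trading functions, and expands the account system to other online platforms to accumulate more non-securities-trading data. The data fusion module 4043 builds a data fusion platform on top of multiple financial information management platforms to collect and integrate all the data, so that individual behavior can be analyzed and predicted along every dimension and individual risk assessment can be carried out quickly and accurately. The product innovation module 4044 uses big data analysis as the basis for all-round product innovation, in order to develop customized products, push and market them precisely, design Internet-based products, and price risk using big data.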
This embodiment also provides an operating method for the stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence, comprising:
First, on the product architecture of the prediction and risk control platform, the relevant software applications are developed and connected to the financial securities information management platforms, third-party service platforms, and regulatory platforms. Large volumes of financial public opinion information, user credit information, information on trading activity between users and securities companies, and data on securities companies and their products are obtained from multiple channels by lawful means. After the data are collected and integrated, algorithmic techniques such as neural networks, machine learning, and support vector machines are used to model and train on the data so as to ensure the accuracy of the risk prediction data model. While the platform is running, the model dynamically monitors the financial securities business, predicts in real time the risks that may arise anywhere in the business process, identifies the type of each risk, and assesses its severity, so that preset risk control measures can be applied to intervene in or eliminate it. Big data are also used to continually improve the policies and procedures of the financial securities trading business, with the aim of fundamentally reducing its risk.
In a preferred embodiment, a monitoring target determination unit extracts the stock subject from each item of news public opinion data received and counts the items that contain each stock subject. When the proportion of the received news public opinion data that contains a given stock subject exceeds a given threshold, that stock subject is determined to be a target stock requiring public opinion monitoring. Specifically:
Consider the total number of items of news public opinion data received by the monitoring target determination unit on this occasion, and any one of the stock subjects extracted from that data. When the items containing that stock subject make up a proportion of the received data that exceeds the given threshold, the stock subject is determined to be a target stock requiring public opinion monitoring.
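A minimal Python sketch of this proportion test follows. It assumes that stock subjects have already been extracted from each news item; the function name `find_target_stocks`, the ticker-like subject names, and the threshold value 0.5 are hypothetical and are not taken from the source.

```python
from collections import Counter

def find_target_stocks(news_items, threshold):
    """Return the stock subjects whose share of the batch exceeds threshold.

    news_items is a list of sets; each set holds the stock subjects found in
    one news item (the extraction step itself is outside this sketch).
    """
    total = len(news_items)
    if total == 0:
        return []
    # Count, for each stock subject, how many news items mention it.
    counts = Counter(s for subjects in news_items for s in subjects)
    return sorted(s for s, n in counts.items() if n / total > threshold)

# Illustrative batch of five news items.
batch = [{"AAA"}, {"AAA", "BBB"}, {"CCC"}, {"AAA"}, {"BBB"}]
targets = find_target_stocks(batch, threshold=0.5)
print(targets)  # ['AAA']
```

Here "AAA" appears in three of five items (0.6), so it alone passes the assumed threshold of 0.5.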
A public opinion early-warning unit counts the news public opinion data that contains the target stock and issues a warning when the proportion of that data whose sentiment label is negative exceeds a given warning threshold. Specifically:
Consider the total number of items of news public opinion data received by the public opinion monitoring module that contain a given target stock, and the number of those items whose sentiment label is negative. When the ratio of the negative items to the total exceeds the given warning threshold, the public opinion early-warning unit issues a warning.
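The warning rule can be sketched in the same way. In the Python example below, the function name `should_warn`, the label strings, and the threshold value 0.4 are assumptions made for illustration only.

```python
def should_warn(labels, warning_threshold):
    """Return True when the share of negative labels exceeds the threshold.

    labels holds the sentiment label of each news item about one target stock.
    """
    if not labels:
        return False
    negative = sum(1 for label in labels if label == "negative")
    return negative / len(labels) > warning_threshold

mostly_negative = ["negative", "negative", "positive", "neutral"]
mostly_positive = ["negative", "positive", "neutral", "positive"]

print(should_warn(mostly_negative, 0.4))  # True  (2 of 4 items are negative)
print(should_warn(mostly_positive, 0.4))  # False (1 of 4 items is negative)
```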
A public opinion data management unit preprocesses the received news public opinion data containing the target stock; the preprocessing comprises word segmentation, stop-word filtering, and deletion of all text that carries no sentiment information, such as link addresses and contact details.
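The preprocessing steps can be illustrated with a short Python sketch. It is an English-language stand-in: Chinese text would need a dedicated word segmenter, whereas this example splits on non-alphanumeric characters. The regular expressions are deliberately crude, and the stop-word list and all names are assumptions made for the example.

```python
import re

# Crude patterns for text that carries no sentiment information.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
PHONE_RE = re.compile(r"\+?\d[\d\- ]{6,}\d")

# Tiny illustrative stop-word list.
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "at", "for", "on", "in"}

def preprocess(text):
    """Strip links and contact details, tokenize, and drop stop words."""
    for pattern in (URL_RE, EMAIL_RE, PHONE_RE):
        text = pattern.sub(" ", text)
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

example = "Shares of ACME fell sharply, see https://example.com or call 555-123-4567"
print(preprocess(example))
# ['shares', 'acme', 'fell', 'sharply', 'see', 'or', 'call']
```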
Specifically, by establishing a stock public opinion monitoring system, this embodiment analyzes stock public opinion data effectively, which helps in following public sentiment and the development of public opinion in the stock market in a timely manner and thereby guides the healthy development of the securities market.
In a preferred embodiment, a sentiment classification unit builds a support-vector-machine-based sentiment classifier by the following steps:
Step (1): Collect stock-related news public opinion data carrying sentiment labels, and clean the collected data to remove noise;
Step (2): Preprocess the cleaned news public opinion data and extract features to construct feature vectors; build a sample set with the feature vector of each item as the input sample value and the sentiment label of that item as the output sample value;
Step (3): Train and test a support vector machine on the constructed sample set, thereby building the support-vector-machine-based sentiment classifier.
The sentiment labels are positive, neutral, and negative.
When the support vector machine is trained on the constructed sample set, the cuckoo search algorithm is used to optimize its penalty factor and kernel function parameters, with the convention that the smaller the fitness function value of a cuckoo's nest position, the better the solution that nest position represents.
Specifically, this embodiment uses a support vector machine to classify the sentiment of the acquired news public opinion data. It addresses two shortcomings: the difficulty of determining the best parameters for the support vector machine, and the weak local search ability and low optimization accuracy of the cuckoo search algorithm. The cuckoo search algorithm is used to optimize the parameters of the support vector machine, and the preference random update mode of the cuckoo search algorithm is improved within its iterative process. This raises the search accuracy of the cuckoo search algorithm, so that the optimized parameters improve the classification accuracy of the support vector machine.
In a preferred embodiment, consider the final nest position of a cuckoo in the population after an iteration update. The Lévy flight mode is applied to this nest position to perform the next iteration update, specifically:
In the formula, the terms denote: the nest position obtained after the Lévy flight mode is applied to the nest position in the next iteration update; the step size factor; element-wise multiplication; a random search vector drawn from a Lévy distribution with a given parameter; and the fitness function values of the two nest positions being compared.
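For orientation, the Python sketch below shows one common way to implement a Lévy flight update in cuckoo search. It is an illustration under several assumptions, not the exact update of this embodiment: the Lévy steps are generated with Mantegna's algorithm, the candidate replaces the current nest only if its fitness value is smaller (consistent with the convention that smaller fitness is better), and the values of `alpha` and `beta`, the sphere fitness function, and all names are hypothetical.

```python
import math
import random

def levy_step(beta, rng):
    """Draw one Lévy-distributed step using Mantegna's algorithm (assumed)."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def levy_update(nest, fitness, alpha=0.01, beta=1.5, rng=random):
    """Propose a Lévy flight move and keep it only if the fitness improves."""
    # Each coordinate moves by the step factor times an independent Lévy step.
    candidate = [x + alpha * levy_step(beta, rng) for x in nest]
    return candidate if fitness(candidate) < fitness(nest) else nest

def sphere(position):
    """Toy fitness function: smaller is better, with the minimum at the origin."""
    return sum(x * x for x in position)

rng = random.Random(42)
nest = [1.0, -2.0]
for _ in range(200):
    nest = levy_update(nest, sphere, rng=rng)
print(sphere(nest) <= 5.0)  # True: greedy acceptance never worsens fitness
```

In the setting of this document, a nest position would encode the penalty factor and kernel function parameters of the support vector machine, and the fitness function would measure the classification error.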
After the nest positions in the population have been updated by the Lévy flight mode in the current iteration, nest positions are selected from the population for the preference random update of the current iteration by the following steps:
(1) Divide the nest positions updated by the Lévy flight mode in the current iteration into regions;
(2) Select nest positions in each of the divided regions for the preference random update of the current iteration.
A region segmentation threshold is given for the population after the Lévy flight update of the current iteration. Its value is set from the neighbor segmentation values of the nest positions, where the neighbor segmentation value of a nest position is defined with respect to the nest position that ranks at a given place among those nearest to it in the current population; this rank is a given positive integer subject to a constraint involving the number of cuckoos in the population. Using the given region segmentation threshold, the nest positions updated by the Lévy flight mode in the current iteration are divided into regions by the following steps:
Step1: Randomly select one nest position in the population, namely the nest position obtained by applying the Lévy flight mode, in the current iteration, to a cuckoo's final nest position from the previous iteration. Label the region in which this nest position lies, assign the nest position to that region, and then screen in turn the nest positions in the population that have not yet been assigned to a region, as follows:
For each such nest position, likewise obtained by applying the Lévy flight mode to a cuckoo's final nest position from the previous iteration, when it satisfies the assignment condition, assign it to the region;
When all nest positions in the population not yet assigned to a region have been screened, proceed to Step2;
Step2: Randomly select one nest position from the nest positions in the population that have not been assigned to a region; this is again a nest position obtained by applying the Lévy flight mode to a cuckoo's final nest position from the previous iteration. Label the region in which this nest position lies as a new region, assign the nest position to that region, and then screen in turn the nest positions in the population that have not yet been assigned to a region, as follows:
When a nest position satisfies the assignment condition, assign it to the region;
When all nest positions in the population not yet assigned to a region have been screened, proceed to Step3;
Step3: When the number of nest positions in the population not yet assigned to a region is not zero, continue dividing those nest positions into regions in the manner of Step2; when that number is zero, stop dividing the nest positions in the population into regions.
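The control flow of Step1 to Step3 can be sketched in Python as follows. The assignment condition is not recoverable from the text, so this sketch assumes a nest joins a region when its Euclidean distance to the region's randomly chosen seed nest does not exceed the threshold. That condition, the function name `partition_regions`, and the toy coordinates are assumptions for illustration.

```python
import math
import random

def partition_regions(nests, threshold, rng=random):
    """Group nests into regions around randomly chosen seed nests.

    Returns a list of regions, each a list of indices into nests.
    """
    unassigned = list(range(len(nests)))
    regions = []
    while unassigned:                    # Step3: repeat until none remain
        seed = rng.choice(unassigned)    # Step1 / Step2: pick a random seed
        # Assumed condition: within the threshold distance of the seed.
        region = [i for i in unassigned
                  if math.dist(nests[i], nests[seed]) <= threshold]
        regions.append(region)
        unassigned = [i for i in unassigned if i not in region]
    return regions

# Two well-separated clusters of nest positions (illustrative only).
nests = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
regions = partition_regions(nests, threshold=1.0, rng=random.Random(0))
print(sorted(sorted(r) for r in regions))  # [[0, 1], [2, 3]]
```

Because the seed always satisfies the distance test against itself, every pass assigns at least one nest, so the loop terminates.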
Nest positions are selected in each of the divided regions for the preference random update of the current iteration, as follows:
Consider one of the regions obtained by dividing the nest positions updated by the Lévy flight mode in the current iteration. Define the region attribute coefficient of the nest positions in that region; it is computed from two quantities: the mean of the proximity distance values of the nest positions in the region, and the dispersion coefficient of those proximity distance values. Here the proximity distance value of a nest position is determined by the nest position in the region closest to it, and the number of nest positions in the region also enters the calculation;
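The two ingredients named above can be computed as in the Python sketch below. It assumes the proximity distance value of a nest is its Euclidean distance to the nearest other nest in the region, and that the dispersion coefficient is the population standard deviation divided by the mean (the coefficient of variation). How the two are combined into the region attribute coefficient is not recoverable from the text, so that step is omitted; the function name is hypothetical.

```python
import math
import statistics

def proximity_stats(region):
    """Return (mean, dispersion) of nearest-neighbour distances in a region.

    region is a list of nest positions given as coordinate tuples.
    """
    if len(region) < 2:
        return 0.0, 0.0
    # Distance from each nest to its closest other nest in the region.
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(region) if j != i)
        for i, p in enumerate(region)
    ]
    mean = statistics.fmean(nearest)
    # Assumed dispersion measure: coefficient of variation.
    dispersion = statistics.pstdev(nearest) / mean if mean > 0 else 0.0
    return mean, dispersion

evenly_spaced = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(proximity_stats(evenly_spaced))  # (1.0, 0.0)
```

Evenly spaced nests give a dispersion of zero; unevenly spaced nests give a positive value.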
Substitutability detection is performed on the nest positions in the region according to the following steps:
Step 1: Form the set of nest positions in the region that have not yet undergone substitutability detection. For each nest position in this set, define its substitutability coefficient in the set. The coefficient is built from a substitutability judgment function evaluated between that nest position and each other nest position in the set; the judgment function takes one of two values according to whether its condition on the pair of nest positions holds.
Compute the substitutability coefficient of every nest position in the set in this way;
Step 2: From the current set, select the nest position with the largest substitutability coefficient for substitutability detection. Its set of substitutable nest positions consists of the nest positions in the current set for which the substitutability judgment function, evaluated against the selected nest position, takes the value that indicates substitutability. The substitutability coefficient of the selected nest position is then compared with a given substitutability detection threshold, which is a positive integer. If the coefficient passes the threshold test, a number of nest positions are chosen at random from the selected nest position and its substitutable nest positions for the preference random update; the selected nest position and all of its substitutable nest positions are marked as having undergone substitutability detection and removed from the set, and the procedure goes to Step 3. If the coefficient fails the threshold test, selection of nest positions in the region for the preference random update of the current iteration stops;
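Step 3: Continue selecting the nest position with the largest substitutability coefficient from the current set for substitutability detection, in the manner of Step 2, until every nest position in the set has undergone substitutability detection.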
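The three steps above can be sketched as a loop. This is an illustration under stated assumptions, not the patented formulas: the judgment function is taken to return 1 when two nests lie within a hypothetical distance `radius`, the threshold test is taken to be "coefficient greater than or equal to `threshold`", and `n_pick` (the number of nests chosen per group) is a hypothetical parameter.

```python
import math
import random


def substitutable(a, b, radius):
    """Hypothetical judgment function: 1 if the nests are within radius."""
    return 1 if math.dist(a, b) <= radius else 0


def select_for_update(region, radius, threshold, n_pick, rng):
    """Return indices of nests chosen for the preference random update."""
    remaining = list(range(len(region)))  # nests not yet checked
    selected = []
    while remaining:
        # Step 1: substitutability coefficient of each unchecked nest.
        coeff = {
            i: sum(substitutable(region[i], region[j], radius)
                   for j in remaining if j != i)
            for i in remaining
        }
        # Step 2: examine the nest with the largest coefficient.
        best = max(remaining, key=coeff.get)
        if coeff[best] < threshold:  # assumed direction of the test
            break
        group = [best] + [
            j for j in remaining
            if j != best and substitutable(region[best], region[j], radius)
        ]
        selected.extend(rng.sample(group, min(n_pick, len(group))))
        # Mark the whole group as checked; Step 3 repeats on the rest.
        remaining = [i for i in remaining if i not in group]
    return selected


# Hypothetical region: a tight cluster of four nests plus one isolated nest.
region = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
picked = select_for_update(region, radius=0.5, threshold=2, n_pick=2,
                           rng=random.Random(0))
```

In the sample data only the cluster qualifies, so two of its four nests are picked and the isolated nest is left for the next Lévy flight update unchanged.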
The selected nest positions undergo the preference random update of the current iteration as follows:
Take a nest position selected for the preference random update, that is, the nest position of one cuckoo in the population after the Lévy flight update of the current iteration; the final nest position of that cuckoo after the previous iteration is also available. The update produces a new nest position by the random preference update mode, which becomes the final nest position of that cuckoo for the current iteration. It depends on the fitness function values of the nest positions involved, on two distinct nest positions chosen at random from the nest positions obtained by the Lévy flight update of the current iteration, and on a random number drawn from a given interval.
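For orientation, the sketch below shows the preference random update in the form commonly used by the standard cuckoo search algorithm: a biased random walk along the difference of two randomly chosen nests, followed by greedy acceptance on fitness. It is not the formula of this embodiment, which is not recoverable from the text. The function name, the minimisation convention, and the random factor drawn from [0, 1) are assumptions.

```python
import random


def preference_random_update(nest, population, fitness, rng):
    """Standard-style biased random walk with greedy acceptance
    (lower fitness is treated as better, an assumption)."""
    p, q = rng.sample(population, 2)  # two distinct random nests
    r = rng.random()                  # random factor in [0, 1)
    candidate = [x + r * (a - b) for x, a, b in zip(nest, p, q)]
    # Keep the candidate only if it improves on the current nest.
    return candidate if fitness(candidate) < fitness(nest) else nest


def sphere(x):
    """Hypothetical test objective: sum of squares."""
    return sum(v * v for v in x)


rng = random.Random(1)
population = [[rng.uniform(-5, 5) for _ in range(2)] for _ in range(6)]
updated = preference_random_update(population[0], population, sphere, rng)
```

Because of the greedy acceptance, the returned nest is never worse than the input nest under the chosen objective.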
Specifically, the present invention addresses two shortcomings: the optimal algorithm parameters of a support vector machine are difficult to determine, and the cuckoo search algorithm has weak local search ability and low optimization accuracy. The parameters of the support vector machine are optimized by the cuckoo search algorithm, and the preference random update mode of the cuckoo search algorithm is improved within its iterative process. This raises the search accuracy of the cuckoo search algorithm, so that the optimal parameters found can improve the classification accuracy of the support vector machine. The most prominent feature of the standard cuckoo search algorithm is its Lévy flight mode, which combines frequent short-distance exploration with occasional longer-distance migration in the strategy for finding the optimal solution in the constrained space; that is, the Lévy flight search places more emphasis on local search. Therefore, when the nest positions of the population are distributed comprehensively after each iterative update, the next Lévy flight update can carry out a more comprehensive local search, which improves the local search accuracy of the population and increases the probability of finding the optimal solution.
The traditional preference random update mode compares a generated random number with the discovery probability to decide which nest positions undergo the preference random update. This selection is highly random and can easily destroy the comprehensiveness of the nest position distribution in the current population. To make better use of the preference random update mode for enhancing population diversity, while ensuring that the nest positions obtained by the current Lévy flight update can guide the next Lévy flight update comprehensively, this embodiment partitions the nest positions in the population into regions, grouping similar nest positions into one region. Substitutability detection then determines how many substitutable solutions each nest position has within its region. When a nest position has many substitutable solutions in its region, a certain number of nest positions are chosen at random from that nest position and its substitutable solutions for the preference random update. This keeps the distribution of nest positions in the region comprehensive, which preserves the local search accuracy when the Lévy flight mode searches that region in the next iteration, and it still applies the preference random update mode, which increases population diversity. The result is better search accuracy than the standard cuckoo search algorithm.
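The mix of frequent short steps and occasional long jumps described above can be illustrated with Mantegna's algorithm, a common way to draw Lévy-distributed step lengths. This is a minimal sketch of a standard technique; the text does not say how the embodiment generates its steps. The names `mantegna_sigma`, `levy_step`, and `levy_flight`, the step scale `alpha`, and the exponent `beta = 1.5` are illustrative assumptions.

```python
import math
import random


def mantegna_sigma(beta):
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(beta, rng):
    """One heavy-tailed step: mostly small, occasionally very large."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_flight(nest, best, alpha, beta, rng):
    """Move a nest by a Levy step scaled by its offset from the best nest
    (a simplified form often used in cuckoo search implementations)."""
    return [x + alpha * levy_step(beta, rng) * (x - b)
            for x, b in zip(nest, best)]


rng = random.Random(42)
steps = [levy_step(1.5, rng) for _ in range(1000)]
moved = levy_flight([1.0, 2.0], [0.5, 0.5], alpha=0.01, beta=1.5, rng=rng)
```

Sorting the absolute values in `steps` shows the heavy tail: most values are small, while a few are much larger than the rest.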
As shown in FIG. 1, this embodiment also provides an exemplary product architecture of the stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence. It includes a computer processor 1 and its supporting display terminal 2. A risk control platform 3 is loaded in the computer processor 1, and a data storage server 4 is communicatively connected to the computer processor 1 externally. The data storage server 4 collects massive data from public and third-party platform servers 5 to serve the risk control platform 3, and a user 6 can access the risk control platform 3 through the computer processor 1.
As shown in FIG. 10, this embodiment also provides an operating device for the stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence. The device includes a processor, a memory, and a computer program stored in the memory and running on the processor.
The processor includes one or more processing cores and is connected to the memory through a bus. The memory stores program instructions, and the processor implements the above stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence when it executes the program instructions in the memory.
Optionally, the memory can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
In addition, the present invention provides a computer-readable storage medium that stores a computer program. When the computer program is executed by a processor, it implements the above stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence.
Optionally, the present invention also provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the stock public opinion monitoring and risk control system based on algorithms, big data, and artificial intelligence in each of the above aspects.
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be carried out by hardware, or by a program that instructs the relevant hardware. The program can be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing shows and describes the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited by the above embodiments. The embodiments and the description present only preferred examples of the invention and are not intended to limit it. Various changes and improvements can be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection is defined by the appended claims and their equivalents.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210243161.8A (CN114612239A) | 2022-03-11 | 2022-03-11 | Stock public opinion monitoring and risk control system based on algorithm, big data and artificial intelligence |
| Publication Number | Publication Date |
|---|---|
| CN114612239A (en) | 2022-06-10 |
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 2022-06-10 |