Technical Field
This application relates to the field of mobile communication technology, and in particular to a fraud user early warning method and apparatus, an electronic device, and a storage medium.
Background
In recent years, telecom fraud has occurred frequently and caused serious social harm, and its prevention and control have received close attention. Telephone fraud is a high-incidence crime with severe social impact: fraudsters impersonate others over the phone to obtain property by deception, which undermines social stability. As fraudsters adopt new technologies, telephone fraud has become increasingly variable and adversarial. At present, users mark fraudulent calls with the marking function on their mobile phones, so that a server can build a fraud number tag library from the marked calls and alert users to incoming calls based on that library. However, this approach has the following drawbacks:
Incomplete data: each party builds its own fraud number tag library separately. Because each party's user base is limited, number tagging is incomplete, which makes it difficult to guard comprehensively against telecom fraud and harassment.
Data authenticity cannot be guaranteed: because there are no restrictions on who may tag numbers, and because users make mistakes, each party's fraud number tag library contains malicious and erroneous tags. The owners or operators of a tag library may also, for their own benefit, collude with offenders to tamper with the tag data, so the authenticity of the data is questionable.
Reminders only after the fact, with no advance defense: the number type can be indicated only once the call reaches the user, and reminders are issued only afterwards; confirmed malicious numbers cannot be intercepted in advance.
Because of these drawbacks of user-based fraud number tagging, current fraud early warning is inefficient.
Summary of the Invention
Embodiments of this application provide a fraud user early warning method and apparatus, an electronic device, and a storage medium, to address the low efficiency of current fraud early warning.
In a first aspect, an embodiment of this application provides a fraud user early warning method, including:
obtaining operation domain data and business domain data of a user to be analyzed;
inputting the operation domain data and the business domain data into a person-ID mismatch user prediction model to obtain a prediction result output by the person-ID mismatch user prediction model, wherein the person-ID mismatch user prediction model is used to determine whether user information is consistent with ID document information;
if the prediction result is that the user to be analyzed is a person-ID mismatch user, obtaining communication features of the person-ID mismatch user; and
inputting the communication features of the person-ID mismatch user into a fraud user identification model to obtain a fraud warning level output by the fraud user identification model, wherein the fraud user identification model is used to predict a fraud level.
In one embodiment, after inputting the communication features of the person-ID mismatch user into the fraud user identification model and obtaining the fraud warning level output by the fraud user identification model, the method further includes:
processing the phone number of the person-ID mismatch user according to the fraud warning level.
In one embodiment, the fraud warning level includes at least a first warning level, a second warning level, and a third warning level; and processing the phone number of the person-ID mismatch user according to the fraud warning level includes any one of the following:
if the fraud warning level is the first warning level, suspending the communication function of the phone number of the person-ID mismatch user;
if the fraud warning level is the second warning level, outputting the phone number of the person-ID mismatch user and receiving a review result for that phone number, and, if the review result is a fail, suspending the communication function of the phone number of the person-ID mismatch user;
if the fraud warning level is the third warning level, outputting the phone number of the person-ID mismatch user and receiving a review result for that phone number, and, if the review result is a fail and a second real-name authentication is not completed within a preset duration, suspending the communication function of the phone number of the person-ID mismatch user.
In one embodiment, after suspending the communication function of the phone number of the person-ID mismatch user, the method further includes:
if the person-ID mismatch user completes the second real-name authentication, restoring the communication function of the phone number of the person-ID mismatch user.
In one embodiment, after inputting the communication features of the person-ID mismatch user into the fraud user identification model and obtaining the fraud warning level output by the fraud user identification model, the method further includes:
updating a preset fraud user list based on the phone number of the person-ID mismatch user and the fraud warning level.
In one embodiment, the fraud user identification model is built by the following steps:
collecting communication features of fraud users as a positive sample set, and collecting communication features of non-fraud users as a negative sample set;
generating multiple decision trees from the positive sample set and the negative sample set using a random forest algorithm; and
building the fraud user identification model based on the multiple decision trees.
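The three construction steps above can be illustrated with a small from-scratch random forest. This is a minimal sketch under stated assumptions, not the implementation described in this application: the Gini impurity split criterion, the tree depth, the number of trees, and every function name (`build_tree`, `train_forest`, `predict_forest`) are hypothetical, and the sketch outputs a binary fraud / non-fraud vote rather than a warning level, because the text does not state how levels are derived from the trees.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, n_features, rng, depth=0, max_depth=3):
    """Grow one tree; each node considers only a random subset of features."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for f in rng.sample(range(len(rows[0])), n_features):
        for threshold in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:  # no usable split on the sampled features
        return majority(labels)
    _, f, threshold, left, right = best

    def grow(indices):
        return build_tree([rows[i] for i in indices], [labels[i] for i in indices],
                          n_features, rng, depth + 1, max_depth)

    return (f, threshold, grow(left), grow(right))


def predict_tree(tree, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] <= threshold else right
    return tree


def train_forest(positive, negative, n_trees=25, seed=0):
    """Train trees on bootstrap samples of the positive (1) and negative (0) sets."""
    rng = random.Random(seed)
    rows = positive + negative
    labels = [1] * len(positive) + [0] * len(negative)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees: 1 = fraud-like, 0 = not fraud-like."""
    return majority([predict_tree(tree, row) for tree in forest])
```

Each row here is a list of numeric feature values, so categorical communication features would need to be encoded numerically first. In practice an established library implementation would normally be preferred over hand-written trees.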
In one embodiment, after building the fraud user identification model based on the multiple decision trees, the method further includes:
obtaining the phone numbers of users who have not passed the second real-name authentication;
building a fraud number library based on the phone numbers of the users who have not passed the second real-name authentication; and
training the fraud user identification model using a radial basis function, based on the fraud number library.
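For readers unfamiliar with radial basis functions, the following minimal sketch shows the forward pass of a radial basis function network. It rests on assumptions: the text does not say which radial basis function is used or how it is combined with the decision trees, so the Gaussian kernel, the width `sigma`, and the names `gaussian_rbf` and `rbf_network` are all hypothetical.

```python
import math


def gaussian_rbf(x, center, sigma=1.0):
    """Gaussian radial basis function: depends only on the distance from x to center."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def rbf_network(x, centers, weights, sigma=1.0, bias=0.0):
    """Output of a single-output RBF network: a weighted sum of hidden-unit activations."""
    return bias + sum(w * gaussian_rbf(x, c, sigma) for w, c in zip(weights, centers))
```

In a network of this kind, training would typically choose the centers (for example from known fraud samples) and fit the output weights; that fitting step is not shown here.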
In a second aspect, an embodiment of this application provides a fraud user early warning apparatus, including:
a first obtaining module, configured to obtain operation domain data and business domain data of a user to be analyzed;
a first input module, configured to input the operation domain data and the business domain data into a person-ID mismatch user prediction model to obtain a prediction result output by the person-ID mismatch user prediction model, wherein the person-ID mismatch user prediction model is used to determine whether user information is consistent with ID document information;
a second obtaining module, configured to obtain communication features of the person-ID mismatch user if the prediction result is that the user to be analyzed is a person-ID mismatch user; and
a second input module, configured to input the communication features of the person-ID mismatch user into a fraud user identification model to obtain a fraud warning level output by the fraud user identification model, wherein the fraud user identification model is used to predict a fraud level.
In a third aspect, an embodiment of this application provides an electronic device, including a processor and a memory storing a computer program, wherein the processor implements the fraud user early warning method of the first aspect when executing the program.
In a fourth aspect, an embodiment of this application provides a storage medium, the storage medium being a computer-readable storage medium that includes a computer program, wherein the computer program implements the fraud user early warning method of the first aspect when executed by a processor.
The fraud user early warning method and apparatus, electronic device, and storage medium provided by the embodiments of this application use the person-ID mismatch user prediction model together with the operation domain data and business domain data to determine whether the user to be analyzed is a person-ID mismatch user. Once the user is determined to be a person-ID mismatch user, the fraud user identification model is applied to that user's communication features to determine the user's fraud warning level. This improves the accuracy of fraud user prediction and makes it possible to intercept malicious numbers according to the fraud warning level, which in turn improves the efficiency of fraud early warning.
Brief Description of the Drawings
To describe the technical solutions of this application or of the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the fraud user early warning method provided by an embodiment of this application;
FIG. 2 is a structural diagram of the radial basis function neural network used in the fraud user early warning method provided by an embodiment of this application;
FIG. 3 is an architecture diagram of a fraud user early warning system to which the fraud user early warning method provided by an embodiment of this application can be applied;
FIG. 4 is a schematic diagram of the functional modules of an embodiment of the fraud user early warning apparatus of this application;
FIG. 5 is a schematic structural diagram of the electronic device provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application are described clearly and completely below with reference to the drawings in the embodiments of this application. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
The fraud user early warning method and apparatus, electronic device, and storage medium provided by this application are described in detail below with reference to the embodiments.
FIG. 1 is a schematic flowchart of the fraud user early warning method provided by an embodiment of this application. Referring to FIG. 1, an embodiment of this application provides a fraud user early warning method, which may include:
Step 100: obtain operation domain data and business domain data of a user to be analyzed.
It should be noted that the method may be executed by a computer device, such as a mobile phone, tablet computer, notebook computer, handheld computer, in-vehicle electronic device, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA).
In this application, the business domain data may include network data, such as signaling, map data, alarms, faults, and network resources. The operation domain data may include user data and service data, such as users' consumption habits, terminal information, average revenue per user (ARPU) grouping, service content, and service audience.
Step 200: input the operation domain data and the business domain data into the person-ID mismatch user prediction model to obtain the prediction result output by the person-ID mismatch user prediction model.
In this application, the person-ID mismatch user prediction model is used to determine whether user information is consistent with ID document information.
The person-ID mismatch user prediction model may be a model built on a random tree algorithm.
The random tree algorithm is a typical classification method. It first processes the data and uses an inductive algorithm to generate readable rules and a decision tree, and then uses the resulting decisions to analyze new data. Typical algorithms of this kind include Iterative Dichotomiser 3 (ID3), C4.5, which classifies based on the information gain ratio, and Classification and Regression Trees (CART). In one embodiment of this application, the C4.5 algorithm may be used.
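As a worked illustration of the information gain ratio that C4.5 uses to choose a split, the sketch below computes it for one categorical feature. The function names and the toy data are hypothetical and serve only to show the calculation; they are not taken from this application.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy, in bits, of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """C4.5 gain ratio: information gain divided by the split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)  # entropy of the feature's own value distribution
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a yes/no feature and a person-ID mismatch label (1 or 0).
feature = ["yes", "yes", "no", "no", "yes", "no"]
mismatch = [1, 1, 0, 0, 1, 0]
ratio = gain_ratio(feature, mismatch)
```

In this toy data the feature separates the two classes perfectly, so the information gain equals the label entropy of 1 bit, the split information is also 1 bit, and the gain ratio is 1. A feature that takes a single value for every user has zero split information, and the sketch returns 0 for it.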
The prediction result may be either that the user to be analyzed is a person-ID mismatch user or that the user to be analyzed is not a person-ID mismatch user.
Step 300: if the prediction result is that the user to be analyzed is a person-ID mismatch user, obtain the communication features of the person-ID mismatch user.
The communication features in this application may include, but are not limited to, one or more of the following: user home location, payment amount, whether the user is a contract customer, average duration per call, frequency of outgoing calls made from high-risk locations, whether the user is registered under a real name, account balance, whether the user is a campus customer, number of outgoing calls to strangers, number of outgoing calls to strangers made from high-risk locations, account-opening channel, time on network, whether multiple numbers are registered under one ID, number of roaming calls, size of the contact circle, user star rating, customer credit limit, number of voice calls, frequency of outgoing calls while roaming, number of call counterparts outside the contact circle, user brand, whether the user is a key person of a group customer, outgoing call frequency, whether the user has roamed to a high-risk location, dispersion of called numbers, user plan, whether the user is a 4G customer, number of outgoing long-distance voice calls, person-ID information, number of phone numbers under the same ID card, dispersion of network access locations, and number of calls between the usual place of use and high-risk locations.
For a single user, each communication feature corresponds to one fixed attribute value; for example, the user home location is Beijing and the payment amount is 100 yuan.
Step 400: input the communication features of the person-ID mismatch user into the fraud user identification model to obtain the fraud warning level output by the fraud user identification model.
In this application, the fraud user identification model is used to predict the fraud level.
In this application, one or more of the communication features of the person-ID mismatch user may be randomly combined and then input into the fraud user identification model.
The fraud user identification model in this application may be an identification model built by generating multiple decision trees from a positive sample set and a negative sample set with a random forest algorithm and combining those decision trees.
The fraud warning levels in this application may include, but are not limited to, a first warning level, a second warning level, and a third warning level.
In this application, the first warning level may be a high-risk warning level, the second warning level may be a medium-risk warning level, and the third warning level may be a low-risk warning level. The criteria that distinguish high, medium, and low risk can be set according to actual needs.
The fraud user early warning method provided by the embodiments of this application uses the person-ID mismatch user prediction model together with the operation domain data and business domain data to determine whether the user to be analyzed is a person-ID mismatch user. Once the user is determined to be a person-ID mismatch user, the fraud user identification model is applied to that user's communication features to determine the user's fraud warning level. This improves the accuracy of fraud user prediction and makes it possible to intercept malicious numbers according to the fraud warning level, which in turn improves the efficiency of fraud early warning.
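The two-stage flow of steps 100 to 400 can be sketched as a short function that calls the second model only for users flagged by the first. This is an illustrative sketch, not the implementation of this application: the function `fraud_warning_level`, the stand-in models, and the keys `distinct_terminals` and `calls_to_strangers` are hypothetical placeholders for trained models and real data.

```python
from typing import Callable, Mapping, Optional

Features = Mapping[str, object]


def fraud_warning_level(
    operation_data: Features,
    business_data: Features,
    mismatch_model: Callable[[Features, Features], bool],
    load_comm_features: Callable[[], Features],
    fraud_model: Callable[[Features], str],
) -> Optional[str]:
    """Return a fraud warning level, or None if the user is not flagged as a mismatch."""
    # Step 200: predict whether the user is a person-ID mismatch user.
    if not mismatch_model(operation_data, business_data):
        return None
    # Step 300: communication features are fetched only for flagged users.
    comm_features = load_comm_features()
    # Step 400: the second model maps those features to a warning level.
    return fraud_model(comm_features)


# Hypothetical stand-in models, for illustration only.
def toy_mismatch_model(operation_data, business_data):
    return operation_data.get("distinct_terminals", 0) > 3


def toy_fraud_model(comm_features):
    return "first" if comm_features.get("calls_to_strangers", 0) > 100 else "third"
```

Passing the feature loader as a callable reflects the order of the steps: communication features are obtained only after the first model flags the user, so unflagged users incur no second lookup.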
In one embodiment, after inputting the communication features of the person-ID mismatch user into the fraud user identification model and obtaining the fraud warning level output by the fraud user identification model, the method further includes:
Step 500: process the phone number of the person-ID mismatch user according to the fraud warning level.
After obtaining the fraud warning level of the person-ID mismatch user, this application can determine, according to that level, whether the communication function of the user's phone number needs to be suspended, so as to reduce the social harm caused by fraud users.
Specifically, processing the phone number of the person-ID mismatch user according to the fraud warning level includes any one of the following:
Step 501: if the fraud warning level is the first warning level, suspend the communication function of the phone number of the person-ID mismatch user.
If the fraud warning level is determined to be high-risk, the communication function of the phone number of the person-ID mismatch user is suspended.
It should be noted that a suspended phone number cannot be used for communication until its communication function is restored.
This reduces the social harm caused by fraud users.
Step 502: if the fraud warning level is the second warning level, output the phone number of the person-ID mismatch user and receive a review result for that phone number; if the review result is a fail, suspend the communication function of the phone number of the person-ID mismatch user.
If the fraud warning level is determined to be medium-risk, the phone number of the medium-risk user, that is, the person-ID mismatch user, is output for manual review. If the received review result is a pass, the communication function of the user's phone number is not suspended; if the received review result is a fail, the communication function of the phone number of the person-ID mismatch user is suspended.
Step 503: if the fraud warning level is the third warning level, output the phone number of the person-ID mismatch user and receive a review result for that phone number; if the review result is a fail and the second real-name authentication is not completed within the preset duration, suspend the communication function of the phone number of the person-ID mismatch user.
If the fraud warning level is determined to be low-risk, the phone number of the low-risk user is output for manual review. If the received review result is a pass, the communication function of the user's number is not suspended. If the received review result is a fail, it is determined whether the user completes the second real-name authentication within the preset duration: if the user does, the communication function of the user's number is not suspended; if the user does not, the communication function of the phone number of the person-ID mismatch user is suspended.
Preferably, the preset duration in this application may be 3 hours. The user can complete the second real-name authentication within 3 hours to avoid suspension of the number's communication function.
This embodiment handles high-risk, medium-risk, and low-risk users differently, which effectively curbs fraud. For low-risk users, a preset duration is granted, which reduces the impact on customers and improves the user experience.
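The decision logic of steps 501 to 503 can be summarized as a single function. The sketch below is a hedged illustration: the enum `WarningLevel`, the function `should_suspend`, and the boolean inputs are hypothetical names, and the manual review and second real-name authentication are assumed to have been carried out elsewhere, with only their outcomes passed in.

```python
from enum import Enum
from typing import Optional


class WarningLevel(Enum):
    FIRST = "high-risk"
    SECOND = "medium-risk"
    THIRD = "low-risk"


def should_suspend(
    level: WarningLevel,
    review_passed: Optional[bool] = None,
    reauthenticated_in_time: Optional[bool] = None,
) -> bool:
    """Decide whether the number's communication function is suspended."""
    if level is WarningLevel.FIRST:
        return True  # step 501: suspend directly, no review needed
    if review_passed is None:
        raise ValueError("a manual review result is required for this level")
    if level is WarningLevel.SECOND:
        return not review_passed  # step 502: suspend only if the review fails
    # Step 503: suspend only if the review fails and the second real-name
    # authentication is not completed within the preset duration.
    if review_passed:
        return False
    if reauthenticated_in_time is None:
        raise ValueError("the second real-name authentication outcome is required")
    return not reauthenticated_in_time
```

For example, a low-risk user who fails the manual review but completes the second real-name authentication within the preset duration is not suspended, while a medium-risk user who fails the review is suspended immediately.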
Further, after suspending the communication function of the phone number of the person-ID mismatch user, the method further includes:
if the person-ID mismatch user completes the second real-name authentication, restoring the communication function of the phone number of the person-ID mismatch user.
In this application, if it is determined that a person-ID mismatch user whose phone number has been suspended has completed the second real-name authentication, the communication function of that user's phone number is restored.
Users whose communication function has been suspended include high-risk, medium-risk, and low-risk users. Such users can perform the second real-name authentication on the client.
Specifically, the customer enters the mobile phone number on the client and logs in with an SMS verification code. After logging in, the customer completes supplementary registration through the password service or recent call records. After registration succeeds, the user uploads a photo of the ID card, and the system uses optical character recognition to compare the uploaded ID card photo with the pre-stored identity information database, to ensure that the user's identity information is genuine. After the identity information is verified, the system turns on the camera and audio recording and generates a prompt asking the user to read out preset digits. The backend uses voice recognition to check whether the digits read match the system's digits and whether they are read by the user in person, and uses face comparison and silent liveness detection to determine whether the person in the video is the user and whether the video shows a live person.
If the digits are read correctly and the video is confirmed to show the user in person and live, the second real-name authentication passes. The user's identity information is sent to the backend, and a backend interface is called with the uploaded identity information to complete the supplementary record for the customer. At the same time, the communication function of the user's phone number is restored.
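The pass condition of the second real-name authentication, and the restoration that follows it, can be sketched as below. This is a minimal sketch under assumptions: the class `ReauthChecks`, its field names, and the in-memory `suspended` set are hypothetical, and the individual checks (optical character recognition, voice recognition, face comparison, liveness detection) are assumed to be performed by other components that report only a true or false outcome.

```python
from dataclasses import dataclass


@dataclass
class ReauthChecks:
    """Outcome of each check described above (hypothetical field names)."""
    id_matches_records: bool      # OCR result agrees with the stored identity data
    digits_read_correctly: bool   # spoken digits match the prompted digits
    voice_is_user: bool           # the digits were read by the user in person
    face_matches_id: bool         # face comparison against the ID photo
    liveness_passed: bool         # silent liveness detection

    def passed(self) -> bool:
        # Authentication passes only if every individual check passes.
        return all(vars(self).values())


def handle_reauthentication(number: str, checks: ReauthChecks, suspended: set) -> bool:
    """Restore the number's communication function if authentication passes."""
    if not checks.passed():
        return False
    suspended.discard(number)  # restoring means removal from the suspended set
    return True
```

Requiring every check to pass is the conservative reading of the description; how partial failures or retries are handled is not specified in the text.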
This embodiment provides a second real-name authentication method. Normal customers who have been misjudged need a quick way to recover service; providing one reduces the sense of unfriendly treatment and the impact on normal customers.
Further, after inputting the communication features of the person-ID mismatch user into the fraud user identification model and obtaining the fraud warning level output by the fraud user identification model, the method further includes:
Step 600: update the preset fraud user list based on the phone number of the person-ID mismatch user and the fraud warning level.
After obtaining the fraud warning level, this application can add the phone number of the person-ID mismatch user together with its fraud warning level to the preset fraud user list as one data item, yielding an updated preset fraud user list.
The preset fraud user list in this application may be a pre-built list that stores information such as the phone numbers of fraud users and their fraud warning levels.
在一个实施例中,对预设诈骗用户清单进行数据更新,还包括以下步骤:In one embodiment, updating the data of the preset fraud user list also includes the following steps:
Based on the collected operational domain location data, the communication behavior of high-risk users who have traveled to preset high-risk areas is obtained at preset intervals.
Specifically, operational domain location data such as A-interface signaling and map data is collected across systems and used to track, in real time, the communication behavior of high-risk users roaming to preset high-risk areas. The preset high-risk areas can be configured according to actual conditions and are not specifically limited here.
Based on the high-risk communication behavior, suspected fraud users and their fraud warning levels are determined by stream computing.
Specifically, based on the high-risk communication behavior and the known attributes of the users involved, for example whether the user is a star-rated user, the time on network, and the dispersion of called numbers, stream computing determines the suspected fraud users and their fraud warning levels once every preset time period.
Stream computing is more timely than offline computing, but unlike real-time computing it still incurs some delay.
Preferably, in this embodiment the fraud warning level of a fraud user may be updated once every hour.
The preset fraud user list is updated based on the suspected fraud users and their fraud warning levels.
Specifically, the preset fraud user list is updated according to the fraud users determined by stream computing and their corresponding fraud warning levels.
This embodiment applies stream computing to users' high-risk communication behavior and updates the preset fraud user list, improving the accuracy and timeliness of fraud user prediction.
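The periodic update described in this embodiment can be sketched as a micro-batch loop: events are grouped into one-hour windows, each window is scored, and the latest level for a number overwrites the earlier one. The scoring rule `toy_level`, its thresholds, the level names, and the event tuple layout are illustrative assumptions; the specification names the attributes (star rating, time on network, dispersion of called numbers) but not how they are combined.

```python
from collections import defaultdict
from typing import Optional

INTERVAL_S = 3600  # one-hour update period preferred in this embodiment


def window_of(timestamp_s: int) -> int:
    """Index of the fixed-size window that a timestamp falls into."""
    return timestamp_s // INTERVAL_S


def toy_level(is_star_user: bool, months_on_network: int,
              callee_dispersion: float) -> Optional[str]:
    """Hypothetical scoring rule; every threshold here is an assumption."""
    if is_star_user or months_on_network >= 24:
        return None
    if callee_dispersion >= 0.9:
        return "high"
    if callee_dispersion >= 0.6:
        return "medium"
    if callee_dispersion >= 0.3:
        return "low"
    return None


def run_windows(events, fraud_list):
    """Apply events window by window and return the updated list.

    events: iterable of (timestamp_s, phone, is_star, months, dispersion).
    fraud_list: dict mapping phone number to warning level (not mutated).
    """
    by_window = defaultdict(list)
    for event in events:
        by_window[window_of(event[0])].append(event)
    current = dict(fraud_list)
    for window in sorted(by_window):  # process windows in time order
        for _, phone, star, months, dispersion in by_window[window]:
            level = toy_level(star, months, dispersion)
            if level is not None:
                current[phone] = level  # the latest window wins
    return current
```

A production system would more likely run this on a stream-processing engine; the plain loop only illustrates the windowing and the overwrite semantics.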
In one embodiment, the fraud user identification model is built through the following steps:
Collect the communication features of fraud users as a positive sample set, and the communication features of non-fraud users as a negative sample set;
Specifically, fraud users include fraud users captured in real time and users in the system's pre-stored fraud library. Their communication features are collected as the positive sample set, for example the language information, SMS information, data traffic information, location information, time-on-network information, and terminal roaming indicators of the fraud numbers, drawn from the communication features described above or from other communication features. Non-fraud users include users who have completed secondary real-name authentication and non-fraud users pre-stored in the system; their communication features are collected as the negative sample set.
Using the random forest algorithm, generate multiple decision trees from the positive sample set and the negative sample set;
Here, random forest is a relatively recent machine learning model. The classic machine learning model is the neural network, which has a history of more than half a century. Neural networks predict accurately but are computationally expensive. Classification tree algorithms appeared in the 1980s; they classify or regress by repeatedly splitting the data in two, which greatly reduces computation. In 2001 it was proposed to combine classification trees into a random forest: randomize the use of variables (columns) and of data (rows), grow many classification trees, and aggregate their results. Random forests improve prediction accuracy without a significant increase in computation. They are insensitive to multicollinearity, are fairly robust to missing and imbalanced data, and can handle up to several thousand explanatory variables well; the random forest algorithm has been described as one of the best algorithms currently available.
The main advantages of the random forest algorithm for potential-customer mining are as follows. It performs well on data sets: the two sources of randomness make a random forest less prone to overfitting and give it good noise tolerance, which helps with the very large customer data sets of potential-customer mining. It can handle very high-dimensional data without feature selection and adapts well to data sets: it handles both discrete and continuous data, and the data set need not be normalized, so the input potential-customer data needs no preprocessing. Variable importance rankings can be obtained quickly (in two ways: by the increase in the out-of-bag (OOB) misclassification rate, and by the decrease in the Gini index at splits). Interactions between features can be detected during training. The algorithm parallelizes well and can take full advantage of a Hadoop parallel big-data platform.
The random forest algorithm builds multiple decision trees from the positive and negative sample sets, each tree being used to label one class of high-frequency fraud numbers. For each tree, k samples are drawn at random with replacement from the training sample set N to form a new training set, and g features are drawn for each sample. New data is classified according to the score formed by the decision trees' votes, and the set of features with the best values is then selected according to classification quality. A random forest is essentially an improvement on the decision tree algorithm that combines many decision trees. Each tree is built from an independently drawn sample, all trees in the forest share the same distribution, and the classification error depends on the classification ability of each tree and on the correlation between trees. Feature selection splits each node using a random method and then compares the errors produced in the different cases. The detectable internal estimation error, classification ability, and correlation determine which valuable features are selected. A single decision tree may have little classification ability, but once a large number of trees have been generated at random, the most likely class and the most valuable features for a test sample can be chosen from the statistics of every tree's classification result.
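The bagging-and-voting scheme described above (k samples drawn with replacement, g randomly chosen features per tree, majority vote) can be illustrated with a deliberately small forest. This is a sketch, not the specification's implementation: each "tree" is a one-level decision stump, labels are assumed to be 0 (non-fraud) or 1 (fraud), and the names `fit_stump`, `fit_forest`, `predict`, `n_trees`, and `seed` are hypothetical.

```python
import random
from collections import Counter


def fit_stump(sample, feature_ids):
    """Fit a one-level tree: the (feature, threshold) with fewest errors."""
    best = None  # (errors, feature, threshold, label_for_left_side)
    for f in feature_ids:
        for x, _ in sample:
            thr = x[f]
            for left_label in (0, 1):
                errors = sum(
                    (left_label if xi[f] <= thr else 1 - left_label) != yi
                    for xi, yi in sample
                )
                if best is None or errors < best[0]:
                    best = (errors, f, thr, left_label)
    _, f, thr, left_label = best
    return lambda x: left_label if x[f] <= thr else 1 - left_label


def fit_forest(data, n_trees, k, g, seed=0):
    """data: list of (feature_tuple, label) pairs; returns a list of trees."""
    rng = random.Random(seed)
    n_features = len(data[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in range(k)]    # k rows, with replacement
        feature_ids = rng.sample(range(n_features), g)   # g distinct features
        forest.append(fit_stump(sample, feature_ids))
    return forest


def predict(forest, x):
    """Majority vote over the predictions of all trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]
```

Replacing the stump with a full tree grown by information gain would bring the sketch closer to the description; the stump keeps the example short while preserving the two sources of randomness.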
The key to building a decision tree is choosing the split points: a greedy algorithm ranks candidate split points from largest to smallest by the purity difference they produce. Purity is quantified with the ID3 algorithm, which uses information gain as the attribute-selection measure and splits on the attribute with the largest information gain after splitting. The information entropy of a set is computed as follows:
$$\mathrm{info}(D) = -\sum_{i=1}^{m} p_i \log_2 p_i$$
where info(D) is the information entropy of set D, p_i is the probability that the i-th class occurs in D, and m is the number of classes.
The expected information entropy after partitioning the set by a feature attribute is computed as follows:
$$\mathrm{info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|}\,\mathrm{info}(D_j)$$
where info_A(D) is the expected information entropy of D partitioned by A, D is the training set, A is the feature attribute, and D_j is the subset of D that takes the j-th of the v values of A.
The information gain after partitioning the set by a feature attribute is computed as follows:
$$\mathrm{gain}(A) = \mathrm{info}(D) - \mathrm{info}_A(D)$$
where gain(A) is the information gain obtained by partitioning on feature attribute A, info(D) is the information entropy of set D, and info_A(D) is the expected information entropy of D partitioned by A.
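The entropy and information-gain quantities defined above can be checked on a toy data set. The sketch below assumes the standard ID3 definitions with base-2 logarithms and a categorical attribute; the function names mirror the symbols info(D), info_A(D), and gain(A) but are otherwise hypothetical.

```python
import math
from collections import Counter


def info(labels):
    """Information entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_after_split(values, labels):
    """Expected entropy after partitioning the labels by an attribute's values."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(group) / n * info(group) for group in groups.values())


def gain(values, labels):
    """Information gain of the attribute: info(D) - info_A(D)."""
    return info(labels) - info_after_split(values, labels)
```

With two fraud and two non-fraud labels, an attribute that separates the classes perfectly has a gain of one bit, while an attribute that is independent of the label has a gain of zero, so ID3 would split on the former.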
All features are ordered recursively by information gain to build the whole decision tree. Trees built within a random forest need no pruning, so they fit the training data very precisely. An unpruned tree may overfit and be less accurate on other data, but in ensemble learning the joint decision of many trees avoids the overfitting of any single tree.
Specifically, multiple training samples are drawn, a decision tree is built from the training set and its features, and this step is repeated to build multiple decision trees. The training set consists of the positive sample set and the negative sample set.
Build the fraud user identification model based on the multiple decision trees.
Specifically, for the training data, each decision tree makes its decision to determine the user's fraud warning level; the classification result of each tree is evaluated, a subset of feature types is selected, and the suspected fraud user identification model is built.
Further, after the fraud user identification model is built based on the multiple decision trees, the method also includes:
Obtain the phone numbers of users who did not pass secondary real-name authentication;
Build a fraud number library from the phone numbers of users who did not pass secondary real-name authentication;
Based on the fraud number library, train the fraud user identification model using radial basis functions.
Specifically, to further improve the identification accuracy of the fraud user identification model and to optimize the feature variables, the number of parameters, and the parameter weight coefficients intelligently, this embodiment puts the numbers identified as high-risk, medium-risk, and low-risk through secondary real-name authentication, combines the authentication result with historical data such as the basic attributes, roaming attributes, behavioral attributes, and location trajectories of the fraud-related numbers, and sends the phone numbers of users who failed real-name authentication to the fraud number library, thereby building the library.
Figure 2 is a structural diagram of the radial basis function neural network used in the fraud user early warning method provided by an embodiment of the present application. As shown in Figure 2, the radial basis function neural network has three layers: an input layer, a hidden layer, and an output layer. The transformation from the input layer to the hidden layer is nonlinear, and the output layer is a linear weighted combination of the hidden neurons' outputs.
When a Gaussian function is chosen as the radial basis function, the output of the radial basis function neural network is given by:
$$\hat{y}_j(t) = \sum_{i} w_{ij}\, G_i\big(x(t)\big)$$
where \hat{y}_j(t) is the j-th output, j is the output index, x(t) is the input vector, w_ij is the synaptic weight between the i-th hidden neuron and the j-th output neuron, G_i is the Gaussian function of the i-th hidden neuron, and μ_i and σ_i are the center and width of that Gaussian function.
The main task of the algorithm is to estimate the three parameters w_ij, μ_i, and σ_i of the radial basis function neural network. They are estimated from a given set of input-output training pairs by adjusting w_ij, μ_i, and σ_i until the value of J is minimized. J is computed as follows:
$$J = \sum_{k} \big( y(k) - \hat{y}(k) \big)^2$$
where y(k) is the output data of an input-output pair, and \hat{y}(k) is the output computed from the pair's input data using the output formula of the radial basis function neural network.
Once the radial basis function neural network has been trained successfully, the previously unknown parameters w_ij, μ_i, and σ_i of the model are known, and the predicted output can be obtained from an input vector x(t) using the output formula.
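The forward pass and the training objective J of the radial basis function network can be sketched as follows. Two points are assumptions rather than statements of the specification: the Gaussian is taken as exp(-||x - μ||² / (2σ²)), and J is taken as a plain sum of squared errors. The function names are hypothetical, and no optimizer is shown; any method that adjusts the weights, centers, and widths to reduce `cost` would fit the description.

```python
import math


def gaussian(x, mu, sigma):
    """Assumed Gaussian basis: exp(-||x - mu||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, mu))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_output(x, centers, widths, weights):
    """Network output for input x.

    weights[i][j] is the weight from hidden neuron i to output neuron j.
    """
    hidden = [gaussian(x, mu, sigma) for mu, sigma in zip(centers, widths)]
    n_outputs = len(weights[0])
    return [
        sum(hidden[i] * weights[i][j] for i in range(len(hidden)))
        for j in range(n_outputs)
    ]


def cost(pairs, centers, widths, weights):
    """Assumed sum-of-squared-errors J over (input, target) training pairs."""
    total = 0.0
    for x, y in pairs:
        y_hat = rbf_output(x, centers, widths, weights)
        total += sum((yj - yhj) ** 2 for yj, yhj in zip(y, y_hat))
    return total
```

In the setting described here, x would hold features of a number in the fraud number library (basic, roaming, behavioral, and location attributes), and the target would encode the authentication outcome.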
Using the above algorithm together with historical data on identified suspected fraud customers, such as basic attributes, roaming attributes, behavioral attributes, and location trajectories, this embodiment can accurately predict customers' communication behavior and use it as input data for the suspected fraud user identification model, improving identification accuracy.
Further, the present application also provides a fraud user early warning system.
Figure 3 is an architecture diagram of a fraud user early warning system to which the fraud user early warning method provided by an embodiment of the present application can be applied. As shown in Figure 3, the system includes a person-ID separation identifier, a suspected fraud identification module, a tiered handling module, a fraud number library, a suspected fraud number identification optimizer, a real-name authentication handler, and a fraud-related number communication restoration handler. In one embodiment, the method includes the following steps:
Step 700: Obtain the mobile customer number to be analyzed, entered through the customer information input interface.
Step 701: Analyze and extract the customer's features, then feed the customer information into both the person-ID separation identifier and the suspected fraud identification module.
Step 702: The person-ID separation identifier identifies numbers whose registered identity and actual user differ. Numbers identified as high-risk suspected separation numbers are passed to the suspected fraud identification module as an important factor; medium- and low-risk suspected separation numbers go directly to real-name authentication handling.
Step 703: The suspected fraud identification module contains a suspected fraud number identifier and a near-real-time fraud number identifier. It identifies suspected fraud numbers among the input customers and outputs them by tier to the tiered handling module;
Step 704: The high-risk fraud number shutdown handler automatically disables outgoing calls, outgoing SMS, and Internet access for the identified high-risk fraud numbers. The medium-risk fraud number shutdown handler first offers business staff a medium-risk number review function; if the review decision is immediate shutdown, the handler automatically disables outgoing calls, outgoing SMS, and Internet access for the number. The low-risk fraud number shutdown handler first offers business staff a low-risk number review function; for a number reviewed as fraud-related, the system allows a 3-hour calling window, and if secondary authentication is not performed or fails within those 3 hours, the handler automatically disables outgoing calls, outgoing SMS, and Internet access for the number.
Step 705: After the tiered handling module has handled the fraud-related number, it submits the number to the real-name authentication handler; the user submits real-name authentication information, and the system determines whether the number passes real-name authentication.
Step 706: Numbers that pass real-name authentication are sent to the fraud-related number communication restoration handler, which restores outgoing calls, outgoing SMS, and Internet access for numbers on which these functions were suspended. Numbers that fail real-name authentication are sent to the fraud number library, and the numbers in the library are sent to the suspected fraud number identification optimizer for analysis of number attributes and behavior, in order to optimize the suspected fraud number identification model.
Step 707: Present the result from the real-name authentication handler to the user.
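The tiered handling of steps 704 to 706 amounts to a small decision procedure, sketched below under stated assumptions. The `Level` enum, the function names, the boolean inputs, and the returned strings are all hypothetical; only the three tiers, the staff review, and the 3-hour window come from the steps above.

```python
from enum import Enum

GRACE_PERIOD_S = 3 * 3600  # the 3-hour window named in step 704


class Level(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def should_shut_down(level, review_flags_fraud=False,
                     authenticated=False, elapsed_s=0):
    """Decide whether calling, SMS, and Internet access are disabled.

    HIGH: always, with no review.
    MEDIUM: only if staff review decides on shutdown.
    LOW: only if review flags the number and the grace period has elapsed
         without a successful secondary authentication.
    """
    if level is Level.HIGH:
        return True
    if level is Level.MEDIUM:
        return review_flags_fraud
    return review_flags_fraud and not authenticated and elapsed_s >= GRACE_PERIOD_S


def after_authentication(passed, fraud_library, number):
    """Step 706: restore service on success, otherwise record the number."""
    if passed:
        return "restore"
    fraud_library.add(number)  # later feeds the identification optimizer
    return "keep_suspended"
```

Keeping the decision separate from the action that actually disables or restores service makes the policy easy to test without touching network elements.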
Embodiments of the present invention use a method and device that predict suspected telecom fraud users based on real-name authentication, implementing a closed loop of "signaling collection - signaling transmission - fraud-related number detection - fraud-related number shutdown - convenient secondary real-name authentication - fast automatic reactivation" that the system handles automatically end to end. A complete fraud monitoring, analysis, early warning, and handling process is established that moves the line of defense forward and responds promptly to emerging and trend-level problems. Fraud numbers are identified with high accuracy and shut down efficiently, which curbs the rise in fraud reports and notifications while reducing customer complaints about misjudged numbers.
Further, the present application also provides a fraud user early warning device.
Referring to Figure 4, Figure 4 is a schematic diagram of the functional modules of an embodiment of the fraud user early warning device of the present application.
The fraud user early warning device includes:
a first acquisition module 410, configured to acquire operational domain data and business domain data of a user to be analyzed;
a first input module 420, configured to input the operational domain data and the business domain data into a person-ID separation user prediction model and obtain the prediction result output by that model, where the person-ID separation user prediction model is used to determine whether the user information is consistent with the ID document information;
a second acquisition module 430, configured to acquire the communication features of the person-ID separation user if the prediction result is that the user to be analyzed is a person-ID separation user;
a second input module 440, configured to input the communication features of the person-ID separation user into a fraud user identification model and obtain the fraud warning level output by that model, where the fraud user identification model is used to predict the fraud level.
The fraud user early warning device provided by the embodiments of the present application uses the person-ID separation user prediction model with operational domain data and business domain data to determine whether the user to be analyzed is a person-ID separation user. Once the user is determined to be one, the fraud user identification model is applied to the user's communication features to determine the fraud warning level. This improves the accuracy of fraud user prediction and makes it easier to intercept malicious numbers according to the fraud warning level, which in turn improves the efficiency of fraud early warning.
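The cooperation of modules 410 to 440 can be sketched as a two-stage pipeline in which the second model runs only for users flagged by the first. The function `fraud_warning` and its parameter names are hypothetical, and both models are passed in as plain callables because the specification does not fix their interfaces.

```python
from typing import Callable, Mapping, Optional, Sequence


def fraud_warning(
    operational_data: Mapping,
    business_data: Mapping,
    separation_model: Callable[[Mapping, Mapping], bool],
    get_comm_features: Callable[[], Sequence[float]],
    fraud_model: Callable[[Sequence[float]], str],
) -> Optional[str]:
    """Return a fraud warning level, or None for users who are not flagged.

    Stage 1 (modules 410 and 420): person-ID separation prediction.
    Stage 2 (modules 430 and 440): fraud level from communication features,
    fetched only when stage 1 flags the user.
    """
    if not separation_model(operational_data, business_data):
        return None
    return fraud_model(get_comm_features())
```

Fetching the communication features lazily mirrors the conditional wording of module 430 and avoids collecting them for users the first model clears.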
In one embodiment, the second input module 440 is further configured to:
process the phone number of the person-ID separation user according to the fraud warning level.
In one embodiment, the second input module 440 includes a processing unit configured to:
if the fraud warning level is a first warning level, shut down the communication functions of the person-ID separation user's phone number;
if the fraud warning level is a second warning level, output the person-ID separation user's phone number and receive a review result for that phone number, and, if the review result is a fail, shut down the communication functions of the phone number;
if the fraud warning level is a third warning level, output the person-ID separation user's phone number and receive a review result for that phone number, and, if the review result is a fail and secondary real-name authentication is not completed within a preset period, shut down the communication functions of the phone number.
In one embodiment, the processing unit further includes a restoration unit configured to:
restore the communication functions of the person-ID separation user's phone number if the user completes secondary real-name authentication.
In one embodiment, the second input module 440 is further configured to:
update the data of the preset fraud user list based on the person-ID separation user's phone number and the fraud warning level.
Figure 5 is a schematic diagram of the physical structure of an electronic device. As shown in Figure 5, the electronic device may include a processor 510, a communication interface 520, a memory 530, and a communication bus 540, where the processor 510, the communication interface 520, and the memory 530 communicate with one another over the communication bus 540. The processor 510 can call a computer program in the memory 530 to perform the steps of the fraud user early warning method, for example:
acquiring operational domain data and business domain data of a user to be analyzed;
inputting the operational domain data and the business domain data into a person-ID separation user prediction model and obtaining the prediction result output by that model, where the person-ID separation user prediction model is used to determine whether the user information is consistent with the ID document information;
if the prediction result is that the user to be analyzed is a person-ID separation user, acquiring the communication features of the person-ID separation user;
inputting the communication features of the person-ID separation user into a fraud user identification model and obtaining the fraud warning level output by that model, where the fraud user identification model is used to predict the fraud level.
In addition, the logic instructions in the memory 530 may be implemented as software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
In another aspect, embodiments of the present application also provide a storage medium. The storage medium is a computer-readable storage medium that stores a computer program, and the computer program causes a processor to perform the steps of the methods provided by the above embodiments, for example:
acquiring operational domain data and business domain data of a user to be analyzed;
inputting the operational domain data and the business domain data into a person-ID separation user prediction model and obtaining the prediction result output by that model, where the person-ID separation user prediction model is used to determine whether the user information is consistent with the ID document information;
if the prediction result is that the user to be analyzed is a person-ID separation user, acquiring the communication features of the person-ID separation user;
inputting the communication features of the person-ID separation user into a fraud user identification model and obtaining the fraud warning level output by that model, where the fraud user identification model is used to predict the fraud level.
The computer-readable storage medium may be any available medium or data storage device that the processor can access, including but not limited to magnetic storage (for example floppy disks, hard disks, magnetic tape, and magneto-optical (MO) disks), optical storage (for example CD, DVD, BD, and HVD), and semiconductor memory (for example ROM, EPROM, EEPROM, non-volatile memory (NAND flash), and solid-state drives (SSD)).
The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, a person skilled in the art can clearly understand that each embodiment may be implemented by software together with the necessary general-purpose hardware platform, or by hardware. On this understanding, the above technical solution, in essence, or the part of it that contributes to the prior art, may be embodied as a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
| Application Number | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202310930064.0A | CN116963072A (en) | 2023-07-27 | 2023-07-27 | Fraud user early warning method and device, electronic equipment and storage medium |
| Publication Number | Publication Date |
|---|---|
| CN116963072A (en) | 2023-10-27 |
| Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN202310930064.0A | Pending | CN116963072A (en) | 2023-07-27 | 2023-07-27 | Fraud user early warning method and device, electronic equipment and storage medium |
| Country | Link |
|---|---|
| CN (1) | CN116963072A (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118523969A (en)* | 2024-07-24 | 2024-08-20 | 浙江鹏信信息科技股份有限公司 | Campus phishing early warning method and system based on DPI and readable medium |
| CN119893509A (en)* | 2025-01-24 | 2025-04-25 | 联通在线信息科技有限公司 | Multi-feature-based regional risk assessment method and system |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115409521A (en)* | 2021-05-27 | 2022-11-29 | 中国移动通信集团内蒙古有限公司 | Method, device, electronic equipment and storage medium for predicting telecommunications fraud users |
| CN116033428A (en)* | 2021-10-22 | 2023-04-28 | 中国移动通信集团黑龙江有限公司 | Method and device for processing abnormal telecommunication users |
| Publication | Title |
|---|---|
| CN107609708B (en) | User loss prediction method and system based on mobile game shop |
| US11580094B2 (en) | Real-time anomaly determination using integrated probabilistic system |
| CN110163618B (en) | Abnormal transaction detection method, device, equipment and computer-readable storage medium |
| CN107222865B (en) | Communication swindle real-time detection method and system based on suspicious actions identification |
| CN112581259B (en) | Account risk identification method and device, storage medium and electronic equipment |
| CN111614690A (en) | Abnormal behavior detection method and device |
| US20230208875A1 (en) | Method of fraud detection in telecommunication using big data mining techniques |
| CN111654866A (en) | Method, device and computer storage medium for preventing mobile communication from fraud |
| CN113627566B (en) | Phishing early warning method and device and computer equipment |
| CN116963072A (en) | Fraud user early warning method and device, electronic equipment and storage medium |
| CN108243049A (en) | Telecom fraud identification method and device |
| KR102332997B1 (en) | Server, method and program that determines the risk of financial fraud |
| CN111915468A (en) | Novel anti-fraud active inspection and early warning system for network |
| CN116996325B (en) | Network security detection method and system based on cloud computing |
| CN115409518A (en) | User transaction risk warning method and device |
| CN115409521A (en) | Method, device, electronic equipment and storage medium for predicting telecommunications fraud users |
| CN110493476B (en) | Detection method, device, server and storage medium |
| CN117407800A (en) | A social media robot detection method and system based on random forest and XGBoost model |
| CN111611519A (en) | Method and device for detecting personal abnormal behaviors |
| Ni et al. | A Victim-Based Framework for Telecom Fraud Analysis: A Bayesian Network Model |
| CN119313359A (en) | Method for identifying illegal store registration and its device, equipment and medium |
| CN114189585B (en) | Method, device and computing equipment for detecting abnormal harassment calls |
| CN117319552A (en) | Abnormal number monitoring method and device, storage medium and electronic equipment |
| CN116033428A (en) | Method and device for processing abnormal telecommunication users |
| Yan et al. | An adaptive graph neural networks based on cost-sensitive learning for fraud detection |
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |