CN109871861B - System and method for providing coding for target data - Google Patents

System and method for providing coding for target data

Info

Publication number
CN109871861B
CN109871861B (application CN201811612338.7A)
Authority
CN
China
Prior art keywords
data
training
code
coding
target data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811612338.7A
Other languages
Chinese (zh)
Other versions
CN109871861A (en)
Inventor
白雪珂
舒南飞
赵林
林文辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aisino Corp
Original Assignee
Aisino Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aisino Corp
Priority to CN201811612338.7A
Publication of CN109871861A
Application granted
Publication of CN109871861B
Legal status: Active
Anticipated expiration

Links

Images

Landscapes

Abstract

The invention discloses a system and method for providing codes for target data, belonging to the technical field of tax data application. The system of the invention comprises: a training module, which acquires training data comprising a classification coding table and historical invoice data, trains on the classification coding table and the historical invoice data, obtains training results, and generates a plurality of training models based on those results; a model merging module, which merges the multiple groups of training models and superimposes identical data sets in the training results; and a code providing module, which reads the training result data, is connected with a plurality of interfaces, receives target data to be coded transmitted by the interfaces, and provides coding information for the target data to be coded. According to the invention, historical data is preprocessed thoroughly and effectively according to actual data conditions, interference information is removed, and training accuracy is improved.

Description

System and method for providing coding for target data
Technical Field
The present invention relates to the field of tax data application technology, and more particularly, to a system and method for providing encoding for target data.
Background
The "Tax Classification Coding Table for Goods and Services" recently issued by the State Taxation Administration strictly classifies goods and services into 4,207 categories: 675 major categories and 3,532 minor categories. A 2016 State Taxation Administration notice required, on a trial basis, that billing software add tax classification codes and code-related functions. In addition, invoices issued locally contain a large number of erroneous or inaccurate manually labeled codes, which mislead statistics and analysis aimed at preventing enterprise tax evasion and leakage, such as analysis of enterprise tax rates and of input and output items for sold commodities. Because invoicing clerks and tax-office data analysts are limited in expertise and time, manually coding and classifying massive numbers of commodity and service names is infeasible. To make invoicing more convenient and tax-office data analysis more accurate, a classification recommendation system relying on big data technology and machine learning models was specially designed.
Naive Bayes is a classification method based on Bayes' theorem and the conditional-independence assumption over features. For a given training data set, the joint probability distribution of input and output is first learned under that assumption; then, for a given input, Bayes' theorem is applied to the learned model to find the output with the highest posterior probability. Testing shows that a classification model trained with naive Bayes achieves good accuracy. The main idea is: in the training stage, the word segments of the commodity names in the training sample set are taken as input, the prior probability of every category (code) is obtained, and the conditional probability of every word segment given each code is calculated; in the classification stage, the commodity name is segmented and the probability of every candidate code is computed by Bayes' theorem. However, naive Bayes is a purely statistical algorithm, and the conditional-independence assumption imposes limitations: it cannot handle out-of-vocabulary words; categories with few samples, or with too many labeling errors in the samples, are classified inaccurately; words at different positions within a commodity name are treated identically; and computing the probability of every category on each recognition is expensive and consumes substantial communication bandwidth. These drawbacks prevent practical use, so an optimization scheme is needed.
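The training and classification stages described above can be sketched as a minimal multinomial naive Bayes classifier over product-name word segments. This is an illustrative sketch, not the patent's exact formulation: the Laplace smoothing and log-space scoring are assumptions, and all names are hypothetical.

```python
from collections import defaultdict
from math import log


class NaiveBayesCoder:
    """Multinomial naive Bayes: product-name word segments -> classification code."""

    def __init__(self):
        self.class_counts = defaultdict(int)   # code -> number of training samples
        self.token_counts = defaultdict(lambda: defaultdict(int))  # code -> token -> count
        self.vocab = set()
        self.total = 0

    def train(self, samples):
        """samples: iterable of (tokens, code) pairs, tokens already segmented."""
        for tokens, code in samples:
            self.class_counts[code] += 1
            self.total += 1
            for t in tokens:
                self.token_counts[code][t] += 1
                self.vocab.add(t)

    def recommend(self, tokens):
        """score(code) = log P(code) + sum_t log P(t | code), Laplace-smoothed."""
        best, best_score = None, float("-inf")
        v = len(self.vocab)
        for code, n in self.class_counts.items():
            score = log(n / self.total)
            denom = sum(self.token_counts[code].values()) + v
            for t in tokens:
                score += log((self.token_counts[code][t] + 1) / denom)
            if score > best_score:
                best, best_score = code, score
        return best
```

Note that this sketch also exhibits the drawbacks the paragraph lists: every candidate code is scored on every query, and an unseen token contributes only the smoothing term.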
A traditional database system stores all data on disk, so data reads require frequent disk access, and performance is poor when the data volume is large and reads are frequent. In recent years memory capacity has kept growing while its price has kept falling, and at the same time demands on the real-time response capability of database systems have kept increasing, so making full use of in-memory technology to improve database performance has become a research hot spot.
Disclosure of Invention
In view of the above, the present invention proposes a system for providing encoding for target data, the system comprising:
the training module is used for acquiring training data, wherein the training data comprises a classification coding table and historical invoice data, training on the classification coding table and the historical invoice data, acquiring training results, and generating a plurality of training models based on the training results;
the model merging module is used for continuously merging the multiple groups of training models and superposing identical data sets in the training results;
the code providing module is used for reading the training result data; the code providing module is connected with a plurality of interfaces, receives target data to be coded transmitted by the interfaces, and provides coding information for the target data to be coded.
Optionally, training the classification code table and the historical invoice data includes:
filtering the training data, or correcting mislabeled code content in the training data, to obtain corrected training data;
preprocessing the corrected training data, wherein the preprocessing filters out time information, blank spaces and punctuation present in the data;
performing word segmentation and cleaning on the corrected training data; adding position weights to the segmented and cleaned data; extracting unit and specification-model data from the corrected training data; obtaining the record frequency corresponding to the classification coding table from the corrected training data; obtaining rule-set training data based on that record frequency; and combining the extracted unit and specification-model data with the rule-set training data to obtain sample training data;
and constructing a training result data set based on the sample training data and storing the training result data set in a distributed file system hdfs.
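As an illustration of the preprocessing step above, the following sketch strips time information, spaces and punctuation from a commodity name. The regular expressions are assumptions — the patent names only the categories of noise, not the exact patterns:

```python
import re

# Hypothetical patterns for the three noise categories the patent names.
TIME_RE = re.compile(r"\d{4}[-/年]\d{1,2}[-/月](\d{1,2}日?)?|\d{1,2}:\d{2}(:\d{2})?")
PUNCT_RE = re.compile(r"[,.;:!?，。；：！？、()（）\[\]【】*#%]")
SPACE_RE = re.compile(r"\s+")


def preprocess(name: str) -> str:
    """Remove time stamps, then punctuation, then all whitespace."""
    name = TIME_RE.sub("", name)
    name = PUNCT_RE.sub("", name)
    name = SPACE_RE.sub("", name)
    return name
```

The same routine would apply both to the corrected training data here and to incoming target data during recommendation.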
Optionally, the cleaning process includes: filtering digital adjective connection pattern data, filtering brand part-of-speech data, filtering nouns, adjectives, verbs, filtering a plurality of adjectives and adjectives in a noun connection pattern.
Optionally, the training result data set includes: commodity-name word-segment, code, position-weight and frequency data; commodity-unit, code and frequency data; commodity specification-model, code and frequency data; pre-segmentation code and frequency data; and post-segmentation code and frequency data.
Optionally, providing the coding information to the target data to be coded includes:
acquiring prior probability and conditional probability based on training result data;
broadcasting the prior probabilities and conditional probabilities when the target data to be coded are batch data on a big data cluster and recommended codes need to be obtained in batches;
segmenting and filtering the commodity names of the target data; adding position weights to the commodity-name segments; obtaining, from the conditional probability data, the conditional probabilities of the candidate codes of the target data; obtaining, from the prior probability data, the prior probabilities of those candidate codes; computing, for each candidate code, the product of its conditional probability and its prior probability; and taking the code with the maximum product as the recommended code.
Optionally, the system further comprises: a Web-end code providing module, used for importing the training result data set stored in the distributed file system HDFS into a PostgreSQL database of the production environment and providing a Web-end target data acquisition interface; after target data are acquired, the codes, code names and recommendation probabilities of the top five categories are returned for the acquired target data.
Optionally, the system further comprises: an in-memory-database-end code providing module, which loads the training result data set and the correction data into the data structure server redis;
the correction data are accurately or preferentially recommended commodity names and their code data;
and the codes, code names and recommendation probabilities of the top five categories are returned for the acquired target data.
Optionally, the memory database end code providing module,
writes recommended target data and the recommended codes of the target data into a cache with a preset expiration time, and matches related information from the cache on each code recommendation.
Optionally, the in-memory-database-end code providing module is configured to, if the acquired target data matches correction data in the data structure server redis, take the code corresponding to the correction data as the first recommended code with a probability of 0.5, and to normalize the remaining recommended-code probabilities and multiply them by 0.5 as recommendation positions two through five.
Optionally, the system further comprises: an online information feedback module, which acquires whichever of the top five recommended categories the user actively selects in the Web-end code recommendation module, and feeds that recommended code information back to the training model.
Optionally, the prior probability is obtained by dividing a code's frequency in the training result data by the total frequency in the pre-segmentation code-frequency data of the training result data set.
Optionally, the conditional probability is obtained by dividing the frequency of a name word-segment, unit or specification-model entry in the training result data set by the frequency of its corresponding code.
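Under the frequency-table reading above, the two probabilities are plain frequency ratios. A sketch, with the dict shapes as hypothetical stand-ins for the stored frequency tables:

```python
def prior_probability(code_freq, code):
    """P(code): the code's frequency divided by the total frequency
    in the pre-segmentation code-frequency data."""
    return code_freq[code] / sum(code_freq.values())


def conditional_probability(segment_code_freq, code_freq, segment, code):
    """P(segment | code): the (segment, code) frequency divided by
    the code's own frequency; unseen pairs get probability 0."""
    return segment_code_freq.get((segment, code), 0) / code_freq[code]
```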
The invention also provides a method for providing coding for target data, the method comprising:
acquiring training data, wherein the training data comprises a classification coding table and historical invoice data, training on the classification coding table and the historical invoice data, acquiring training results, and generating a plurality of training models based on the training results;
continuously merging the multiple groups of training models based on the plurality of groups of training models, and superposing identical data sets in the training results;
and the coding module, connected with a plurality of interfaces, receives target data to be coded transmitted by the interfaces and provides coding information for the target data to be coded.
Optionally, training the classification code table and the historical invoice data includes:
filtering the training data, or correcting mislabeled code content in the training data, to obtain corrected training data;
preprocessing the corrected training data, wherein the preprocessing filters out time information, blank spaces and punctuation present in the data;
performing word segmentation and cleaning on the corrected training data; adding position weights to the segmented and cleaned data; extracting unit and specification-model data from the corrected training data; obtaining the record frequency corresponding to the classification coding table from the corrected training data; obtaining rule-set training data based on that record frequency; and combining the extracted unit and specification-model data with the rule-set training data to obtain sample training data;
and constructing a training result data set based on the sample training data and storing the training result data set in a distributed file system hdfs.
Optionally, the cleaning process includes: filtering digital adjective connection pattern data, filtering brand part-of-speech data, filtering nouns, adjectives, verbs, filtering a plurality of adjectives and adjectives in a noun connection pattern.
Optionally, the training result data set includes: commodity-name word-segment, code, position-weight and frequency data; commodity-unit, code and frequency data; commodity specification-model, code and frequency data; pre-segmentation code and frequency data; and post-segmentation code and frequency data.
Optionally, providing the coding information to the target data to be coded includes:
acquiring prior probability and conditional probability based on training result data;
broadcasting the prior probabilities and conditional probabilities when the target data to be coded are batch data on a big data cluster and recommended codes need to be obtained in batches;
segmenting and filtering the commodity names of the target data; adding position weights to the commodity-name segments; obtaining, from the conditional probability data, the conditional probabilities of the candidate codes of the target data; obtaining, from the prior probability data, the prior probabilities of those candidate codes; computing, for each candidate code, the product of its conditional probability and its prior probability; and taking the code with the maximum product as the recommended code.
Optionally, the method further comprises: importing the training result data set stored in the distributed file system HDFS into a PostgreSQL database of the production environment and providing a Web-end target data acquisition interface; after target data are acquired, the codes, code names and recommendation probabilities of the top five categories are returned for the acquired target data.
Optionally, the method further comprises:
loading the training result data set and the correction data into the data structure server redis;
the correction data being accurately or preferentially recommended commodity names and their code data;
and returning, for the acquired target data, the codes, code names and recommendation probabilities of the top five categories.
Optionally, the method further comprises:
writing recommended target data and the recommended codes of the target data into the data structure server redis cache with a preset expiration time, and matching related information from the cache on each code recommendation.
Optionally, the method further comprises:
if the acquired target data matches correction data in the data structure server redis, taking the code corresponding to the correction data as the first recommended code with a probability of 0.5, and normalizing the remaining recommended-code probabilities and multiplying them by 0.5 as recommendation positions two through five.
Optionally, the method further comprises:
and acquiring any type of recommended code information actively selected by the user in the first five analogies recommended by the Web end code recommendation module, and feeding back the information to the training model.
Optionally, the prior probability is obtained by dividing a code's frequency in the training result data by the total frequency in the pre-segmentation code-frequency data of the training result data set.
Optionally, the conditional probability is obtained by dividing the frequency of a name word-segment, unit or specification-model entry in the training result data set by the frequency of its corresponding code.
According to the invention, historical data is preprocessed thoroughly and effectively according to actual data conditions, interference information is removed, and training accuracy is improved;
the invention simultaneously provides several code recommendation interfaces, including batch identification, Web-end code recommendation and quick recommendation based on the data structure server redis, together with a data storage method that improves the performance of the data structure server redis; in addition, the model merging and online information feedback modules further improve model recommendation accuracy;
the invention better solves short-text classification-coding problems, such as coding similar commodity and food names, in fields such as tax administration and food and drug supervision.
Drawings
FIG. 1 is a block diagram of a system for providing encoding for target data in accordance with the present invention;
FIG. 2 is a flow chart of a method for providing encoding for target data in accordance with the present invention.
Detailed Description
The exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, however, the present invention may be embodied in many different forms and is not limited to the examples described herein, which are provided to fully and completely disclose the present invention and fully convey the scope of the invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to be limiting of the invention. In the drawings, like elements/components are referred to by like reference numerals.
Unless otherwise indicated, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. In addition, it will be understood that terms defined in commonly used dictionaries should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
The present invention provides a system 200 for providing encoding for target data, as shown in FIG. 1, the system 200 comprising:
The training module 201 acquires training data, where the training data comprises a classification coding table and historical invoice data, and trains on the classification coding table and the historical invoice data, which includes:
filtering the training data, or correcting mislabeled code content in the training data, to obtain corrected training data;
preprocessing the corrected training data, wherein the preprocessing filters out time information, blank spaces and punctuation present in the data;
performing word segmentation and cleaning on the corrected training data; adding position weights to the segmented and cleaned data; extracting unit and specification-model data from the corrected training data; obtaining the record frequency corresponding to the classification coding table from the corrected training data; obtaining rule-set training data based on that record frequency; and combining the extracted unit and specification-model data with the rule-set training data to obtain sample training data;
and constructing a training result data set based on the sample training data and storing the training result data set in a distributed file system hdfs.
The cleaning process comprises the following steps: filtering the digital adjective connection mode data, filtering brand part-of-speech data, filtering nouns, adjectives, verbs, filtering a plurality of adjectives and adjectives in the noun connection mode;
the training result data set includes: commodity-name word-segment, code, position-weight and frequency data; commodity-unit, code and frequency data; commodity specification-model, code and frequency data; pre-segmentation code and frequency data; and post-segmentation code and frequency data;
acquiring training results, and generating a plurality of training models based on the training results;
The model merging module 202 continuously merges the multiple groups of training models and superimposes identical data sets in the training results;
the code providing module 203 reads the training result data; it is connected with a plurality of interfaces, receives target data to be coded transmitted by the interfaces, and provides coding information for the target data to be coded;
providing encoding information for target data to be encoded includes:
acquiring prior probability and conditional probability based on training result data;
broadcasting the prior probabilities and conditional probabilities when the target data to be coded are batch data on a big data cluster and recommended codes need to be obtained in batches; preprocessing the target data, wherein the preprocessing filters out time information, blank spaces and punctuation present in the target data;
segmenting and filtering the preprocessed commodity names of the target data; adding position weights to the commodity-name segments; obtaining, from the conditional probability data, the conditional probabilities of the candidate codes of the target data; obtaining, from the prior probability data, the prior probabilities of those candidate codes; computing, for each candidate code, the product of its conditional probability and its prior probability; and taking the code with the maximum product as the recommended code.
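The scoring step above — combining each candidate code's prior with the position-weighted conditional probabilities of the segments and taking the argmax — might be sketched as follows. Raising each conditional probability to its position weight is one plausible reading of "adding position weights"; the patent does not fix the exact formula, and all names here are hypothetical:

```python
def recommend_code(segments, weights, prior, cond):
    """Pick the code maximizing prior[code] * prod(cond[(seg, code)] ** weight).

    `segments` are the filtered commodity-name word segments, `weights` their
    position weights, `prior` maps code -> P(code), and `cond` maps
    (segment, code) -> P(segment | code). Unseen pairs fall back to a small
    floor probability so one missing segment does not zero the score.
    """
    best_code, best_score = None, -1.0
    for code, p in prior.items():
        score = p
        for seg, w in zip(segments, weights):
            score *= cond.get((seg, code), 1e-9) ** w
        if score > best_score:
            best_code, best_score = code, score
    return best_code
```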
The prior probability is obtained by dividing a code's frequency in the training result data by the total frequency in the pre-segmentation code-frequency data of the training result data set.
The conditional probability is obtained by dividing the frequency of a name word-segment, unit or specification-model entry in the training result data set by the frequency of its corresponding code.
The Web-end code providing module 204 imports the training result data set stored in the distributed file system HDFS into a PostgreSQL database of the production environment and provides a Web-end target data acquisition interface; after target data are acquired, it returns the codes, code names and recommendation probabilities of the top five categories for the acquired target data.
The in-memory-database-end code providing module 205 loads the training result data set and the correction data into the data structure server redis:
First, the total frequency count is stored under the key "code:sum" with type String.
Second, because each code is 19 characters long, I/O pressure during recommendation would be high, so the codes in the result set are replaced with numbers from 1 to N. Specifically: the codes appearing in the code-name data, the post-segmentation code-frequency data and the pre-segmentation code-frequency data obtained from the rule set are numbered, forming a code-number correspondence; the three data sets are then stored in redis as hash tables whose key is "code:$number" ($number denotes the variable number), whose fields are "name", "token" and "doc" respectively, and whose value is the name.
Further, the generated code-number correspondence is serialized into a string of the form "0:code1,1:code2,…" and stored in redis under the key "code:total".
Then, the name word-segment data, unit data and specification-model data are each stored in redis as hash tables, where the key is the word segment, unit or specification model, the field is the position (the position weights of units and specification models are 0), and the value is a string of the form "number1:freq1,number2:freq2,…" listing all codes corresponding to the same name and position.
The correction data are accurately or preferentially recommended commodity names and their code data. Codes in the manual correction data are replaced with numbers using the code-number table generated in the previous step; the key stored in redis is "artificial:$name:dw:$dw:ggxh:$ggxh" and the value is the corresponding number.
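The code-to-number substitution described above — replacing 19-character codes with compact integer ids to reduce redis I/O, and rendering the "code:total" string — can be sketched as follows. Sorting before numbering is an assumption, made only so the mapping is deterministic:

```python
def build_code_numbering(codes):
    """Map each classification code to a compact integer id and render the
    'code:total' string of the form "0:code1,1:code2,...".

    Returns (code -> id dict, serialized correspondence string); a caller
    would store the string in redis under the key "code:total".
    """
    code_to_id = {code: i for i, code in enumerate(sorted(set(codes)))}
    total = ",".join(
        f"{i}:{code}"
        for code, i in sorted(code_to_id.items(), key=lambda kv: kv[1])
    )
    return code_to_id, total
```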
The codes, code names and recommendation probabilities of the top five categories are returned for the acquired target data.
The in-memory-database-end code providing module 205 writes recommended target data and the recommended codes of the target data into the cache with a preset expiration time, and matches related information from the cache on each code recommendation.
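A minimal in-memory stand-in for this write-through cache with a preset expiration time; with a real redis client, put/get would map roughly to SETEX and GET. All names here are hypothetical:

```python
import time


class ExpiringCache:
    """Cache of (commodity name -> recommended codes) with a fixed TTL."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # name -> (codes, expiry timestamp)

    def put(self, name, codes):
        """Write a recommendation with the preset expiration time."""
        self._store[name] = (codes, time.monotonic() + self.ttl)

    def get(self, name):
        """Return cached codes, or None on a miss or an expired entry."""
        hit = self._store.get(name)
        if hit is None:
            return None
        codes, expires = hit
        if time.monotonic() >= expires:
            del self._store[name]
            return None
        return codes
```

On each recommendation the module would call `get` first and fall back to the full probability computation only on a miss.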
If the acquired target data matches correction data in the data structure server redis, the in-memory-database-end code providing module 205 takes the code corresponding to the correction data as the first recommended code with a probability of 0.5, and normalizes the remaining recommended-code probabilities and multiplies them by 0.5 as recommendation positions two through five.
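The probability adjustment described here — pinning the corrected code first at probability 0.5 and rescaling the model's normalized recommendations by 0.5 into the next four positions — can be sketched as follows; the list shapes are assumptions:

```python
def blend_with_correction(recommended, corrected_code):
    """Combine model recommendations with a manually corrected code.

    `recommended` is a list of (code, probability) pairs sorted descending.
    The corrected code is placed first with probability 0.5; the top four
    remaining model codes are normalized and scaled by 0.5.
    """
    rest = [(c, p) for c, p in recommended if c != corrected_code][:4]
    total = sum(p for _, p in rest) or 1.0  # guard against an empty tail
    return [(corrected_code, 0.5)] + [(c, 0.5 * p / total) for c, p in rest]
```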
The online information feedback module 206 acquires whichever of the top five recommended categories the user actively selects in the Web-end code recommendation module, and feeds that recommended code information back to the training model.
The invention also proposes a method for providing coding for target data, as shown in fig. 2, comprising:
Training data is obtained, wherein the training data comprises a classification coding table and historical invoice data; training the classification coding table and the historical invoice data comprises the following steps:
filtering the training data, or correcting mislabeled code content in the training data, to obtain corrected training data;
preprocessing the corrected training data, wherein the preprocessing filters out time information, blank spaces and punctuation present in the data;
performing word segmentation and cleaning on the corrected training data; adding position weights to the segmented and cleaned data; extracting unit and specification-model data from the corrected training data; obtaining the record frequency corresponding to the classification coding table from the corrected training data; obtaining rule-set training data based on that record frequency; and combining the extracted unit and specification-model data with the rule-set training data to obtain sample training data;
and constructing a training result data set based on the sample training data and storing the training result data set in a distributed file system hdfs.
The cleaning process comprises the following steps: filtering digital adjective connection pattern data, filtering brand part-of-speech data, filtering nouns, adjectives, verbs, filtering a plurality of adjectives and adjectives in a noun connection pattern.
The training result data set includes: commodity-name word-segment, code, position-weight and frequency data; commodity-unit, code and frequency data; commodity specification-model, code and frequency data; pre-segmentation code and frequency data; and post-segmentation code and frequency data;
acquiring training results, and generating a plurality of training models based on the training results;
Based on the plurality of groups of training models, the groups are continuously merged and identical data sets in the training results are superimposed;
the training result data is read; the coding module is connected with a plurality of interfaces, receives target data to be coded transmitted by the interfaces, and provides coding information for the target data to be coded;
providing encoding information for target data to be encoded includes:
acquiring prior probability and conditional probability based on training result data;
broadcasting the prior probabilities and conditional probabilities when the target data to be coded are batch data on a big data cluster and recommended codes need to be obtained in batches;
segmenting and filtering the commodity names of the target data; adding position weights to the commodity-name segments; obtaining, from the conditional probability data, the conditional probabilities of the candidate codes of the target data; obtaining, from the prior probability data, the prior probabilities of those candidate codes; computing, for each candidate code, the product of its conditional probability and its prior probability; and taking the code with the maximum product as the recommended code;
the prior probability is obtained by dividing a code's frequency in the training result data by the total frequency in the pre-segmentation code-frequency data of the training result data set.
The conditional probability is obtained by dividing the frequency of a name word-segment, unit or specification-model entry in the training result data set by the frequency of its corresponding code.
The training result data set stored in the distributed file system HDFS is imported into a PostgreSQL database of the production environment to provide a Web-end target data acquisition interface; after target data are acquired, the codes, code names and recommendation probabilities of the top five categories are returned for the acquired target data.
The training result data set and the correction data are loaded into the data structure server redis;
the correction data are accurately or preferentially recommended commodity names and their code data;
and the codes, code names and recommendation probabilities of the top five categories are returned for the acquired target data.
Recommended target data and the recommended codes of the target data are written into the data structure server redis cache with a preset expiration time, and related information is matched from the cache on each code recommendation.
If the acquired target data match an entry in the Redis correction data, the code recorded in the correction data is returned as the first recommendation with probability 0.5, and the probabilities of the model's recommended codes are normalized and multiplied by 0.5 to fill the remaining four positions.
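One reading of this blending rule, as a hedged sketch (function and variable names are illustrative): the corrected code takes the first slot with probability 0.5, and the model's top remaining codes are renormalized so that they share the other 0.5 across positions two through five:

```python
def blend_with_correction(model_top, corrected_code):
    """Place the corrected code first with probability 0.5, then renormalize
    up to four of the model's remaining recommendations to share the other
    0.5, so the returned list is a valid five-slot distribution."""
    rest = [(c, p) for c, p in model_top if c != corrected_code][:4]
    total = sum(p for _, p in rest) or 1.0  # avoid division by zero
    return [(corrected_code, 0.5)] + [(c, 0.5 * p / total) for c, p in rest]
```

The relative order of the model's own candidates is preserved; only their mass is rescaled to make room for the correction entry at the top.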
Any recommendation that the user actively selects from the top five candidates returned by the Web-side code recommendation module is captured and fed back to the training model.
According to the invention, the historical data are preprocessed thoroughly and effectively according to the actual data conditions, interference information is removed, and training accuracy is improved;
the invention simultaneously provides several code recommendation interfaces (batch recognition, Web-side recommendation, and fast Redis-based recommendation) together with a data storage scheme that improves Redis performance; in addition, the model merging and online feedback modules further improve recommendation accuracy;
the invention effectively solves short-text classification-coding problems, such as similar goods and food names, in fields such as taxation and food and drug supervision.

Claims (18)


Publications (2)

CN109871861A, published 2019-06-11
CN109871861B, granted 2023-05-23

Family ID: 66917238



Citations (4)

* Cited by examiner, † Cited by third party

CN107678858A (priority 2017-09-30, published 2018-02-09) Guangdong OPPO Mobile Telecommunications Co., Ltd.: Application processing method, device, storage medium and electronic equipment
CN107704892A (priority 2017-11-07, published 2018-02-16) Ningbo Aisino Aerospace Information Co., Ltd.: Commodity code classification method and system based on a Bayesian model
CN107862046A (priority 2017-11-07, published 2018-03-30) Ningbo Aisino Aerospace Information Co., Ltd.: Tax commodity code classification method and system based on short-text similarity
CN108491887A (priority 2018-03-29, published 2018-09-04) Anhui Aerospace Information Co., Ltd.: Method for acquiring commodity tax classification codes

Family Cites Families (1)

US10290055B2 (priority 2006-04-21, granted 2019-05-14) Refinitiv US Organization LLC: Encoded short message service text messaging systems and methods




Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
