Disclosure of Invention
1. Problems to be solved
To address the difficulty and inaccuracy of creating marketing content with existing approaches, the invention provides an analysis method for quantitatively analyzing the interaction and sales indexes of content labels. In this method, the influence of each content tag on content interaction and commodity sales is quantitatively evaluated through a content interaction index and a commodity sales index, so that content tags can be compared quantitatively; the creation of marketing content thereby becomes simple, feasible and accurate, and labor and resource costs are saved.
2. Technical solution
In order to solve the problems, the invention adopts the following technical scheme.
An analysis method for quantitatively analyzing interaction and sales indexes of content tags, comprising the steps of:
acquiring a content database of the social media platform, labeling the content in the content database to obtain content labels, and adding commodity SPU (Standard Product Unit) labels to the content database;
acquiring an electronic commerce commodity database of an electronic commerce platform in a social media platform, and adding a commodity SPU label to the commodity database;
calculating content interaction indexes of all content tags and commodity sales indexes of all content tags;
and carrying out visual analysis on the content interaction index and the commodity sales index.
Further, the calculating the content interaction index includes the following steps:
determining the commodity class in the content database, and counting the number n of contents carrying content label k in that commodity class within a set time period, to obtain the content set $C_k = \{\text{content}_1, \text{content}_2, \ldots, \text{content}_n\}$;
calculating an interaction value $E_i$ for each content i in the content set:
$E_i$ = number of likes + number of forwards + number of collections + number of comments of content i; or
$E_i$ = number of likes, or number of forwards, or number of collections, or number of comments of content i;
calculating the content interaction index $Y_k$ of content label k:
[formula not reproduced in the source text] or [formula not reproduced in the source text]
Further, the calculating the commodity sales index includes the steps of:
determining the commodity class in the content database, and counting the number n of contents carrying content label k in that commodity class within a set time period, to obtain the content set $C_k = \{\text{content}_1, \text{content}_2, \ldots, \text{content}_n\}$;
determining the m commodity SPUs corresponding to a single content i in the content set, to obtain the commodity SPU set $P_i = \{SPU_1, SPU_2, \ldots, SPU_m\}$ corresponding to content i;
determining the sales of a single commodity SPU on each date within the set time period;
determining the sales contribution value of a single content i to a single commodity $SPU_j$ on a given date d: [formula not reproduced in the source text]
where $\mathrm{Sales}(j, d)$ is the sales of commodity $SPU_j$ on date d; $W_i$ is the weight of the influence of content i on the sales of commodity $SPU_j$ on date d; and u is the total number of contents that mention $SPU_j$ and whose effective time window covers date d;
determining the total sales contribution value of a single content i to the commodities it mentions, $P_i = \{SPU_1, SPU_2, \ldots, SPU_m\}$: [formula not reproduced in the source text]
where t is the length, in days, of the time window, counted from the release date, within which the content is taken to influence commodity sales;
determining the cumulative influence on commodity sales of the content set $C_k$ corresponding to content label k: [formula not reproduced in the source text]
calculating the sales index $X_k$ of content label k: [formula not reproduced in the source text] or [formula not reproduced in the source text], where p is the total number of commodity SPUs corresponding to the content set $C_k$ of content label k after deduplication.
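The sales-index steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, since the formulas themselves are not legible in the text: each day's sales of an SPU are assumed to be split among the contents whose time window covers that day in proportion to their weights W, and $X_k$ is assumed to be either the cumulative contribution of the content set or that value divided by the number of distinct SPUs p. The data layout, function names and sample figures are all hypothetical.

```python
from datetime import date, timedelta

def active(content, d, t):
    """True if date d falls inside the content's t-day influence window."""
    return content["released"] <= d < content["released"] + timedelta(days=t)

def contribution(content, spu, d, all_contents, sales, t):
    """Assumed rule: weight-proportional share of the SPU's sales on date d."""
    if spu not in content["spus"] or not active(content, d, t):
        return 0.0
    # Every content that mentions this SPU and is still inside its window on d.
    peers = [c for c in all_contents if spu in c["spus"] and active(c, d, t)]
    total_weight = sum(c["weight"] for c in peers)
    if total_weight == 0:
        return 0.0
    return sales.get((spu, d), 0.0) * content["weight"] / total_weight

def sales_index(tag_contents, all_contents, sales, t=14, per_spu=False):
    """Cumulative contribution of the contents carrying one label (assumed X_k)."""
    total = 0.0
    spus = set()
    for c in tag_contents:
        spus |= c["spus"]
        for spu in c["spus"]:
            for offset in range(t):
                d = c["released"] + timedelta(days=offset)
                total += contribution(c, spu, d, all_contents, sales, t)
    if per_spu:
        # Assumed variant: divide by p, the number of distinct SPUs.
        return total / len(spus) if spus else 0.0
    return total

# Hypothetical sample data: two contents, two SPUs, a 2-day window.
c1 = {"id": 1, "released": date(2023, 2, 1), "spus": {"SPU1"}, "weight": 1.0}
c2 = {"id": 2, "released": date(2023, 2, 1), "spus": {"SPU1", "SPU2"}, "weight": 3.0}
sales = {("SPU1", date(2023, 2, 1)): 100.0, ("SPU2", date(2023, 2, 2)): 40.0}

x_total = sales_index([c2], [c1, c2], sales, t=2)
x_per_spu = sales_index([c2], [c1, c2], sales, t=2, per_spu=True)
```

Note that the denominators are computed over all contents mentioning the SPU, not only those carrying label k, which is one possible reading of the definition of u.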
Further, the visual analysis of the content interaction index and the commodity sales index specifically includes the following steps:
establishing a two-dimensional coordinate system, and taking a sales index corresponding to the content label as an X axis; the interaction index corresponding to the content label is taken as a Y axis;
and determining the mean sales index and the mean interaction index over all content labels, dividing the coordinate plane into four regions along these two means, and placing each content label in one of the four regions according to its sales index and interaction index.
Further, labeling the content in the content database to obtain the content labels specifically includes the following steps:
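The four-region division can be sketched as follows. The region names, the treatment of values equal to the mean (counted as "high") and the sample index values are assumptions made for illustration only.

```python
from statistics import mean

def assign_quadrants(indexes):
    """indexes: {label: (sales_index, interaction_index)} -> {label: region}."""
    x_mean = mean(x for x, _ in indexes.values())
    y_mean = mean(y for _, y in indexes.values())
    regions = {}
    for label, (x, y) in indexes.items():
        # Assumption: a value equal to the mean counts as "high".
        sales = "high sales" if x >= x_mean else "low sales"
        interaction = "high interaction" if y >= y_mean else "low interaction"
        regions[label] = f"{sales} / {interaction}"
    return regions

# Hypothetical index values for four labels.
indexes = {
    "whitening": (120, 900),
    "moisturizing": (30, 1000),
    "spot lightening": (150, 200),
    "anti-aging": (20, 100),
}
regions = assign_quadrants(indexes)
```

With these sample values the means are 80 (sales) and 550 (interaction), so each of the four labels lands in a different region.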
constructing a commodity knowledge graph;
constructing a content class database; acquiring a media content database, screening content data related to the categories of all commodities from the media content database by utilizing a commodity knowledge graph, and constructing a content category database;
information extraction is carried out on the content database to construct a content label tree of the content, and data in the content database is labeled according to the content label tree of the content; wherein the method specifically comprises the following steps:
first, extracting character entities, product entities, brand entities and commodity attribute entities from the content class database using a RaNER model; then performing semantic recognition on the class database using a large language model combined with an information extraction prompt and a chain-of-thought summary prompt, and extracting character entities, network hotword entities, user pain point entities and product characteristic entities; finally, fusing the entity results to obtain the final entities;
converting the final entities into word vectors through a text vectorization model; then grouping the word vectors into a plurality of clusters through a clustering algorithm; then generalizing the words in each cluster into one or more labels through a large language model, and constructing the tree-structured content label tree and the keywords of each label from the keyword types output by the large language model;
labeling the content text in the class database according to the class content label tree.
Further, the step of labeling the content text includes the steps of:
when labeling the content text, determining whether entity extraction has already been performed on the content text; if it has, the extracted entity words become candidate labels; if it has not, entity extraction is performed and the labels corresponding to the identified entity words are added to the candidate label set;
matching the content text against the keywords and regular expressions corresponding to all labels in the label tree, and adding the matched labels to the candidate label set of the content text;
using a large language model with a discriminative prompt to judge all candidate labels screened out for the content text, and determining whether each candidate label matches the meaning of the corresponding content text; if a candidate label matches, it is confirmed, and if it does not match, it is corrected.
Further, the construction of the content class database specifically includes the following steps:
collecting information of each social media content to form a media content database;
text matching is carried out on text information in a media content database by utilizing the commodity knowledge graph, and a content primary screening database of commodity class is established;
respectively converting the picture type content and the video type content in the primary screening database into text content;
and carrying out fine screening classification on the primary screening database, and judging whether the text content is related to commodity class.
Still further, only the original text description information is stored in the media content database: for the image-text content, storing the content title, the content text and the picture link content; for video content, its content title and video link content are stored.
3. Advantageous effects
Compared with the prior art, the invention has the beneficial effects that:
(1) The method acquires the content database of the social media platform and the electronic commerce commodity database of the electronic commerce platform within the social media platform, labels the content in the content database, calculates the content interaction index and the commodity sales index of each content label using the electronic commerce commodity database, and then performs visual analysis on the two indexes. The data come entirely from information published on social media pages and do not depend on tracking user behavior links, which greatly widens the range of users to whom the method is applicable. Meanwhile, the influence of each content tag on content interaction and commodity sales is quantitatively evaluated through the content interaction index and the commodity sales index, so that content tags can be compared quantitatively; when a brand manufacturer or content creator needs to create commodity marketing content, the better-performing tags in the category can be selected, which makes the creation of marketing content simple, feasible and accurate and saves labor and resource costs;
(2) The invention constructs a commodity knowledge graph and uses the entities and relations in it to construct the content class database from the media content database, so that the content class database is constructed efficiently and working efficiency is effectively improved. Once the content class database is constructed, information is extracted from it to construct the content tag tree: a RaNER model identifies concrete entities, a large language model combined with an information extraction prompt and a chain-of-thought summary prompt identifies abstract entities, and different types of entities are thus extracted by different models, which effectively mitigates the inaccurate extraction and incomplete entity recall that result from using a single model. Finally, the content text is labeled through the content label tree. The overall process is simple, the entity extraction accuracy is higher, and the overall labeling accuracy is therefore higher; efficiency is also high, which reduces labor and time costs;
(3) When labeling the content text, the invention first determines whether entity extraction has already been performed on the content text, which improves working efficiency and saves time: if it has, the extracted entity words become candidate labels; if it has not, entity extraction is performed and the resulting entity words become candidate labels. The candidate label set is then verified to reduce potential errors. Recall is improved as far as possible (omissions are reduced), and the semantic understanding capability of the large language model is used to filter out, as far as possible, entities that match a keyword but are semantically wrong, as well as entity words produced by the large language model that do not exist in the original text, thereby improving accuracy;
(4) When constructing the content class database, the method first performs primary screening on the media content database to obtain a primary screening database, then converts the picture and video content in the primary screening database into text content, and finally performs fine screening and classification to obtain the final class database. The low-cost, rapid primary screening narrows the data range before the fine screening is carried out, so that, overall, efficiency is improved and cost is reduced while accuracy is maintained; and because only the original text description information is stored in the media content database, storage cost is greatly reduced.
Detailed Description
The invention is further described below in connection with specific embodiments and the accompanying drawings.
Referring to fig. 1 to 2, fig. 1 is a schematic flow chart of the present application; fig. 2 is a schematic diagram of a process of sales index calculation of content tags.
In this embodiment, as shown in fig. 1, an analysis method for quantitatively analyzing interaction and sales indexes of content tags includes the following steps:
S1: acquiring a content database of the social media platform, and labeling the content in the content database to obtain content labels. The content database stores: the content information; the interaction counts of each content by date (the numbers of likes, comments, forwards and collections); the commodity SPUs identified in the content; the content labels assigned to the content according to a structured label tree; and the relationship between each commodity SPU and the content tags describing that commodity SPU;
acquiring an electronic commerce commodity database of an electronic commerce platform in a social media platform: the electronic commerce commodity database is the commodity database of the electronic commerce platform within the social media platform, in which brands, classes and SPUs (Standard Product Units) have been cleaned.
In one specific example, the content data in the content database is shown in table 1 below; the interaction data of the content data according to the date is shown in table 2; the data of the commodity SPU corresponding to the content data and the content tag describing the commodity SPU are shown in the following table 3:
Table 1 Content data in the content database
Table 2 Interaction data of the content data by date
| Date | Content id | Likes | Forwards | Comments | Collections |
| 2023-2-1 | Content 1 | … | … | … | … |
| 2023-2-2 | Content 1 | … | … | … | … |
| 2023-2-3 | Content 1 | … | … | … | … |
Table 3 Commodity SPUs corresponding to the content data and the content tags describing each commodity SPU
| Content id | SPU | Content label |
| Content 1 | SPU 1 | Efficacy-whitening |
| Content 1 | SPU 1 | Efficacy-spot lightening |
| Content 1 | SPU 2 | Efficacy-moisturizing |
As can be seen from Tables 1 to 3, two commodities, SPU 1 and SPU 2, are mentioned in content 1. Two content tags, efficacy-whitening and efficacy-spot lightening, are identified in the description of SPU 1; one content tag, efficacy-moisturizing, is identified in the description of SPU 2.
In a specific example, the electronic commerce commodity database stores fields including commodity information and the cleaned brand, class and SPU, and stores the sales volume and sales amount of each commodity by date. One commodity id corresponds to one commodity link on the e-commerce platform, and one SPU may correspond to a plurality of commodity links. The electronic commerce commodity data in the electronic commerce commodity database are shown in Table 4 below, and the sales data of the commodity data by date are shown in Table 5 below;
Table 4 Electronic commerce commodity data in the electronic commerce commodity database
Table 5 Sales data of the commodity data by date
| Date | Commodity id | Sales volume | Sales amount | Price |
| 2023-2-1 | Commodity 1 | … | … | … |
| 2023-2-2 | Commodity 1 | … | … | … |
| 2023-2-3 | Commodity 1 | … | … | … |
In one specific embodiment, the labeling process for the content in the content database according to the structured label tree to obtain the content label specifically includes the following steps:
S11: constructing a commodity knowledge graph;
In a specific embodiment, the commodity knowledge graph can be constructed as follows: commodity information (including the platform, commodity id, commodity name, specification, commodity parameters and the like) is collected from each electronic commerce platform to form a commodity database; entity recognition is performed on the commodity text information in the commodity database using a RaNER model to obtain brand, class and commodity attribute entities and related keywords; and the commodity knowledge graph is constructed from the co-occurrence relations of the entities within commodities. The commodity knowledge graph is structured as triplets.
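As an illustration of how co-occurrence relations could be turned into triplets, the following sketch counts how often two entities appear in the same commodity and emits a triplet for each pair seen often enough. It assumes the entities have already been recognized (the RaNER step is not shown); the relation name `co_occurs_with`, the `min_count` threshold and the sample entities are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_triples(commodity_entities, min_count=2):
    """commodity_entities: one list of (entity_type, text) pairs per commodity.

    Returns (head, relation, tail) triplets for entity pairs that co-occur
    in at least min_count commodities.
    """
    counts = Counter()
    for entities in commodity_entities:
        # Sort so that each unordered pair is always counted under one key.
        for head, tail in combinations(sorted(set(entities)), 2):
            counts[(head, tail)] += 1
    return [
        (head, "co_occurs_with", tail)
        for (head, tail), n in sorted(counts.items())
        if n >= min_count
    ]

# Hypothetical recognized entities for three commodities.
commodities = [
    [("brand", "BrandA"), ("class", "face cream"), ("attribute", "whitening")],
    [("brand", "BrandA"), ("class", "face cream")],
    [("brand", "BrandB"), ("class", "face cream"), ("attribute", "whitening")],
]
triples = build_triples(commodities)
```

Here only the pairs (whitening, face cream) and (BrandA, face cream) occur in two commodities, so only those two triplets are produced.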
S12: constructing a content class database; acquiring a media content database, screening content data related to the categories of all commodities from the media content database by utilizing a commodity knowledge graph, and constructing a content category database;
Specifically, the construction of the content class database comprises the following steps:
S121: collecting information on each piece of social media content to form a media content database. Content on social media mainly takes the form of image-text content and video. In this step, only the original text description information is stored in the media content database: for image-text content, the content title, the content text and the picture links are stored; for video content, the content title and the video link are stored. Multimedia data such as pictures and videos are not stored, which greatly reduces storage cost;
S122: performing text matching on the text information in the media content database using the commodity knowledge graph, and establishing a primary screening database for the commodity class. It should be noted that content on social media covers topics in many fields, both related and unrelated to commodities, so the content related to commodities of a given class needs to be screened out. Meanwhile, the media content database contains massive data (more than 1 billion items); content types other than text can be processed further only after being converted into text, converting video into text is costly and time-consuming, and classifying massive data with an AI model is also costly. Therefore, text matching is first performed on the text information to obtain a primary screening database; this low-cost, rapid primary screening narrows the data range, maintaining efficiency while reducing cost;
In the primary screening stage, text matching is performed on the text information in the media content database using the keywords of the various entities in the commodity knowledge graph, including the keywords of brand, class and attribute entities; for video content, this step matches only the video title. In this way, content data related to commodities of a given class can be screened quickly from massive content data to establish a primary screening database for that class. A class primary screening database, for example one for the cosmetics and personal care class, may be established for every target class the business requires.
S123: respectively converting the picture type content and the video type content in the primary screening database into text content; because the primary screening database in step S122 contains only text information such as the title of the content of the picture and the video, in order to further analyze the text information related to the commodity contained in the picture and the video, the content in the picture and the video needs to be converted into text;
Specifically, for picture content, the characters in the picture are converted into text content using OCR. For video content, the video is converted into text content using both OCR and ASR. Before OCR processing, frames are extracted from the video at a fixed time interval, for example one frame per second, which converts the video into a group of pictures; the characters in the pictures are converted into text content using OCR, and the position and size of the characters are used to filter out, as far as possible, secondary text in the background while retaining important content text such as subtitles. The speech in the video is converted into text content using ASR. (OCR text, ASR text) fields are thereby added to each piece of video content, and the two texts are combined to obtain the final text content of the video. The on-screen text and the speech in a video each carry part of the language information and each miss some of it, and OCR and ASR each have a certain error rate; converting a video into text with both OCR and ASR and using the results together in subsequent steps therefore helps to analyze the language information in the video more completely.
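The position-and-size filtering and the addition of the (OCR text, ASR text) fields can be sketched as below. The sketch assumes the OCR engine already returns text boxes with a normalized vertical position and height for each sampled frame; the `OcrBox` type, the thresholds and the removal of subtitles repeated across consecutive frames are illustrative assumptions, not details given in the description.

```python
from dataclasses import dataclass

@dataclass
class OcrBox:
    text: str
    y_center: float  # vertical center: 0.0 is the top of the frame, 1.0 the bottom
    height: float    # text height as a fraction of the frame height

def keep_subtitles(boxes, min_y=0.7, min_height=0.03):
    """Keep text that sits low in the frame and is large enough to be a subtitle."""
    return [b.text for b in boxes if b.y_center >= min_y and b.height >= min_height]

def video_to_text(frames, asr_text):
    """frames: one list of OcrBox per sampled frame (e.g. one frame per second)."""
    lines = []
    for boxes in frames:
        for text in keep_subtitles(boxes):
            # Assumption: a subtitle shown across consecutive frames is kept once.
            if not lines or lines[-1] != text:
                lines.append(text)
    return {"ocr_text": " ".join(lines), "asr_text": asr_text}

# Hypothetical OCR output for three sampled frames.
frames = [
    [OcrBox("SHOP SIGN", 0.20, 0.02), OcrBox("this cream is great", 0.85, 0.05)],
    [OcrBox("this cream is great", 0.85, 0.05)],
    [OcrBox("it keeps skin moist", 0.86, 0.05)],
]
record = video_to_text(frames, asr_text="this cream is great it keeps skin moist")
```

The background text "SHOP SIGN" is dropped because it is small and near the top of the frame, while the two subtitle lines are retained.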
S124: performing fine screening and classification on the primary screening database to determine whether the text content is related to the commodity class. After the text content of the picture and video content in the primary screening database is obtained in step S123, it is necessary to further determine whether the content is related to commodities of the class. The primary screening uses fast keyword matching to narrow down massive data and recall potentially relevant content, but keywords can be ambiguous; for example, the milk brand "bright" may be matched by "sunny". Further fine screening of the class primary screening database is therefore required. The traditional approach is to annotate data manually and train a supervised text classification model to determine whether the content text is related to a given commodity class. Because many commodity classes need to be processed, retraining classification models on manually annotated data is costly.
In step S124, the fine screening classification is specifically as follows. Brand and class keywords are matched against the data in the primary screening database. Content in which both a brand entity and a class entity are matched is determined to be related to the commodity class; no model judgment is needed in this case, which improves speed and reduces computing power consumption. Content in which only a brand keyword or only a class keyword is matched is more likely to be ambiguous, so a text classifier is built from a large language model with a prompt such as "whether the above text describes a product of skin care category"; the content text is classified into the corresponding class, and the content class database is then constructed.
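The decision rule of the fine screening can be sketched as follows. The large language model classifier is represented by a caller-supplied function, here replaced by a stand-in; the function names, keyword lists and sample texts are hypothetical.

```python
def fine_screen(text, brand_keywords, class_keywords, llm_is_related):
    """Return True if the text is judged related to the commodity class."""
    has_brand = any(k in text for k in brand_keywords)
    has_class = any(k in text for k in class_keywords)
    if has_brand and has_class:
        return True                   # both matched: accept without a model call
    if has_brand or has_class:
        return llm_is_related(text)   # only one matched: ambiguous, ask the classifier
    return False                      # nothing matched (not expected after primary screening)

# Stand-in for the LLM classifier; it records which texts it was asked about.
llm_calls = []

def fake_llm_is_related(text):
    llm_calls.append(text)
    return "cream" in text

brands, classes = ["BrandA"], ["face cream"]
both = fine_screen("BrandA face cream review", brands, classes, fake_llm_is_related)
brand_only = fine_screen("BrandA opened a new store", brands, classes, fake_llm_is_related)
```

Only the second text reaches the classifier, which is where the saving in model calls described above comes from.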
S13: information extraction is carried out on the content database to construct a content label tree of the content, and data in the content database is labeled according to the content label tree of the content; wherein the method specifically comprises the following steps:
S131: first, extracting character entities, product entities, brand entities and commodity attribute entities from the content class database using a RaNER model; then performing semantic recognition on the class database using a large language model combined with an information extraction prompt and a chain-of-thought summary prompt, and extracting character entities, network hotword entities, user pain point entities and product characteristic entities; finally, fusing the entity results to obtain the final entities;
In this step, RaNER extracts class information, brand information and some commodity attribute information accurately, but performs only moderately on semantically rich content text because its semantic understanding is limited; for example, it cannot interpret an expression such as "desert dry skin". A large language model (LLM) is therefore also used for entity recognition, with several types of prompt. An LLM is an artificial intelligence model aimed at understanding and generating human language; such models are trained on large amounts of text data and can perform a wide range of NLP tasks, including entity recognition, text summarization, translation and sentiment analysis. LLM extraction still has some problems, such as insufficient entity recall. In this scheme, entity recognition is therefore performed on each piece of content text with several types of prompt, and the results are fused. Using several types of prompt together gives better label word extraction than a single prompt, which further ensures the accuracy of entity extraction and, in turn, of the subsequent construction of the content label tree. The entity words extracted from the content text by the RaNER model and the LLM, together with the correspondence between content id and entity word id, are stored in a word extraction database table, and the results can be reused later when labeling the content according to the structured tag tree.
S132: converting the final entities (the entity words and their types, such as attribute words extracted from the commodity, for example functional efficacy, and other types of entity word) into word vectors through a text vectorization model; then grouping the word vectors into a plurality of clusters through a clustering algorithm (word vectors with high similarity are gathered into one cluster). Because clustering has certain limitations, words in the same cluster may still have different meanings after simple word vector clustering; the words in each cluster are therefore generalized into one or more labels using the semantic understanding capability of the large language model, and the keyword types output by the large language model are used to construct the tree-structured content label tree and the keywords of each label;
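The clustering step can be illustrated with a small sketch. The description does not name a clustering algorithm, so a simple greedy cosine-similarity grouping is used here as a stand-in; the two-dimensional vectors are toy values rather than the output of a real text vectorization model, and the generalization of each cluster into labels by the large language model is not shown.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def greedy_cluster(vectors, threshold=0.9):
    """Put each word in the first cluster whose seed vector is similar enough."""
    clusters = []  # list of (seed_vector, [words])
    for word, vec in vectors.items():
        for seed, words in clusters:
            if cosine(seed, vec) >= threshold:
                words.append(word)
                break
        else:
            clusters.append((vec, [word]))  # no close cluster: start a new one
    return [words for _, words in clusters]

# Toy word vectors (hypothetical values).
vectors = {
    "whitening": [1.0, 0.1],
    "brightening": [0.9, 0.2],
    "moisturizing": [0.1, 1.0],
    "hydrating": [0.2, 0.9],
}
clusters = greedy_cluster(vectors)
```

With these vectors the words fall into two clusters, which a later step would generalize into labels such as a whitening label and a moisturizing label.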
S133: labeling the content text in the class database according to the class content label tree. Labeling the content text in step S133 includes the following steps:
S1331: when labeling the content text, determining whether entity extraction has already been performed on the content text (that is, whether entities have been extracted using the RaNER model and the LLM); if it has, the extracted entity words become candidate labels; if it has not, entity extraction is performed (including RaNER entity recognition and recognition by the LLM using the information extraction prompt and the chain-of-thought summary prompt), and the labels corresponding to the identified entity words are added to the candidate label set;
S1332: matching the content text against the keywords and regular expressions corresponding to all labels in the label tree, and adding the matched labels to the candidate label set of the content text;
S1333: the candidate label set obtained in the previous two steps may contain errors, and labels identified through keywords and the like may be semantically wrong. In this step, a large language model with a discriminative prompt judges all candidate labels screened out for the content text; the prompt asks whether the content text carries the given content labels, so as to determine whether each candidate label matches the meaning of the corresponding content text. If a candidate label matches, it is confirmed; if it does not match, it is corrected. In this way, recall is improved through several complementary methods (omissions are reduced), while the semantic understanding capability of the LLM is used to filter out, as far as possible, entities that match a keyword but are semantically wrong, as well as entity words given by the LLM in the previous step that do not exist in the original text, thereby improving both precision and recall.
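Steps S1331 to S1333 can be sketched as follows. Entity extraction and the LLM judgment are represented by plain inputs and a stand-in function; the label tree contents, keywords and patterns are hypothetical. As a simplification, a candidate label that the judge rejects is dropped here rather than corrected.

```python
import re

# Hypothetical flattened label tree: label -> matching rules.
LABEL_TREE = {
    "efficacy-whitening": {"keywords": ["whitening", "brighten"], "patterns": [r"skin\s+tone"]},
    "efficacy-moisturizing": {"keywords": ["moisturizing"], "patterns": [r"hydrat\w+"]},
}

def candidate_labels(text, entity_labels, label_tree):
    """Collect candidates from extracted entities plus keyword and regex matches."""
    candidates = set(entity_labels)               # S1331: labels from entity words
    lowered = text.lower()
    for label, rule in label_tree.items():        # S1332: keyword and regex matching
        keyword_hit = any(k in lowered for k in rule["keywords"])
        pattern_hit = any(re.search(p, lowered) for p in rule["patterns"])
        if keyword_hit or pattern_hit:
            candidates.add(label)
    return candidates

def confirm_labels(text, candidates, judge):
    """S1333: keep only the candidates that judge(text, label) accepts."""
    return {label for label in candidates if judge(text, label)}

def fake_judge(text, label):
    # Stand-in for the LLM with a discriminative prompt: rejects negated whitening.
    return not (label == "efficacy-whitening" and "not whitening" in text.lower())

text = "This serum is not whitening, but it is deeply hydrating."
candidates = candidate_labels(text, set(), LABEL_TREE)
confirmed = confirm_labels(text, candidates, fake_judge)
```

The keyword match proposes the whitening label even though the text negates it, which is exactly the kind of semantic error the judging step is meant to remove.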
In summary, in the present application, labeling the content in the content database according to the structured label tree to obtain the content labels proceeds as follows. A commodity knowledge graph is constructed, and the entities and relations in it are used to construct the content class database from the media content database, so that the content class database is constructed efficiently and working efficiency is effectively improved. Once the content class database is constructed, information is extracted from it to construct the content tag tree: a RaNER model identifies concrete entities, a large language model combined with the information extraction prompt and the chain-of-thought summary prompt identifies abstract entities, and different types of entities are thus extracted by different models, which effectively mitigates the inaccurate extraction and incomplete entity recall that result from using a single model. Finally, the content text is labeled through the content label tree. The overall process is simple, the entity extraction accuracy is higher, and the overall labeling accuracy is therefore higher; efficiency is also high, which reduces labor and time costs.
S2: calculating the content interaction index and the commodity sales index of each content tag. The indexes are calculated from the data in the content database and the e-commerce commodity database of step S1, for a given commodity class and a set time period (start date $D_s$ and end date $D_e$). Within a commodity class, for a content label k, let $X_k$ be the sales index of content label k and $Y_k$ the interaction index of content label k;
specifically, the step S2 of calculating the content interaction index of the content tag includes the following steps:
s21: determining commodity class in content database, and counting all content number n with content label k in set time period (the set time period can be confirmed according to actual situation, usually determining a start date and an end date, and the days between the start date and the end date are set time period) to obtain content set Ck = { content 1, content 2,., content n };
S22: calculating an interaction value $E_i$ for each content in the content set:
$E_i$ = number of endorsements of the content + number of forwards of the content + number of collections of the content + number of ratings of the content; or
$E_i$ = number of endorsements of the content, or number of forwards of the content, or number of collections of the content, or number of ratings of the content; the specific interaction value can be chosen according to the actual situation;
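S23: calculating the content interaction index $Y_k$ of the content label k from the interaction values $E_i$ of the contents in $C_k$, either with or without log processing.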
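Steps S21 to S23 can be sketched in Python as follows. This is a minimal illustration, not the implementation of the application: the record layout `Content` and the helper names are hypothetical, and the aggregation used for $Y_k$ (the mean of $E_i$ over $C_k$, optionally log-transformed) is an assumption, since the text only states that the log processing is optional.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass
class Content:
    # Hypothetical record layout for one row of the content database.
    content_id: str
    category: str
    labels: frozenset
    published: date
    endorsements: int
    forwards: int
    collections: int
    ratings: int


def select_contents(contents, category, label, start, end):
    """S21: contents of one commodity class carrying label k within [start, end]."""
    return [
        c for c in contents
        if c.category == category
        and label in c.labels
        and start <= c.published <= end
    ]


def interaction_value(c):
    """S22: E_i as the sum of the four interaction counts."""
    return c.endorsements + c.forwards + c.collections + c.ratings


def interaction_index(selected, use_log=True):
    """S23 (assumed form): mean E_i over C_k, optionally log-transformed."""
    if not selected:
        return None  # no content carries the label in the period
    mean_e = sum(interaction_value(c) for c in selected) / len(selected)
    if not use_log:
        return mean_e
    # The logarithm is undefined for a zero mean; report None in that case.
    return math.log(mean_e) if mean_e > 0 else None
```

Passing `use_log=False` returns the untransformed mean, which corresponds to the variant without log processing. Replacing `interaction_value` with a single count (for example only the endorsements) gives the alternative definition of $E_i$ in S22.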
Specifically, as shown in FIG. 2, calculating the commodity sales index of the content tag in step S2 includes the following steps:
S201: determining the commodity class in the content database, and counting the number n of all contents carrying the content label k of the commodity class in the set time period to obtain the content set $C_k = \{\text{content } 1, \text{content } 2, \ldots, \text{content } n\}$;
S202: for each content i in the content set, assuming that m commodity SPUs are mentioned in the content i, determining the m commodity SPUs corresponding to the single content i to obtain the commodity SPU set $P_i = \{SPU_1, SPU_2, \ldots, SPU_m\}$ corresponding to the single content i;
S203: determining the sales of a single commodity SPU on a certain date d in the set time period. By way of illustration: for content i, starting from the release date of content i, the length of the time window in which content i affects the sales of the commodity $SPU_j$ is set to t = 14 days (30 days or another length is also possible; the time window can be chosen according to the actual situation). Within these 14 days, the sales contributions of content i to the commodity $SPU_j$ are $[S(j,0), S(j,1), \ldots, S(j,13)]$ respectively. For the commodity $SPU_j$, the total sales on a certain date d within the set time window is $\text{Sales}(j,d)$, which can be obtained from the database by an SQL query (the sum of the sales of all commodity links corresponding to $SPU_j$ on date d);
S204: determining, for a certain date d, the contribution value of a single content to the sales of a single commodity SPU:
$$S_i(j,d) = \text{Sales}(j,d) \times \frac{W_i}{\sum_{q=1}^{u} W_q}$$
wherein $\text{Sales}(j,d)$ is the sales of the commodity $SPU_j$ on date d; $W_i$ is the weight of the influence of a single content i on the sales of the commodity $SPU_j$ on date d; and u is the total number of contents that mention $SPU_j$ and whose validity time window covers date d;
in step S204, the sales $\text{Sales}(j,d)$ of $SPU_j$ on a certain date d are split among the contents mentioning $SPU_j$ to obtain the sales contribution value $S_i(j,d)$ of content i to the commodity $SPU_j$ on date d;
Suppose that, for a certain date d, a total of u contents {content 1, content 2, ..., content u} were all released within the preceding 14 days and all mention the commodity $SPU_j$. Let $W_i$ be the weight of the influence of content i on the sales of $SPU_j$ on date d. $W_i$ may take different values: for example, $W_i$ may be the interaction amount of content i, or another value may be substituted according to the actual situation. The sales contribution value of a single content i to the commodity $SPU_j$ on date d is thereby determined.
To further facilitate understanding of the sales contribution value $S_i(j,d)$ of content i to the commodity $SPU_j$ on date d in step S204, an example is given. FIG. 3 shows 3 pieces of content: content 1 and content 3 mention the commodities $SPU_1$ and $SPU_2$, and content 2 mentions the commodity $SPU_1$. Suppose the commodity $SPU_1$ has sales on day d9 (the sum of the sales of its commodity links) and $SPU_2$ has sales on day d10. The sales of each commodity SPU on each date are divided, according to the weights, among all contents whose time window covers that day, and the sales distributed to each content across all commodity SPUs are then accumulated as the contribution value of that content to commodity sales.
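The weighted split described for step S204 can be sketched as follows. This is an illustration under stated assumptions rather than the method as filed: the dictionary layouts and function names are hypothetical, the split is taken to be proportional to $W_i$, the window is the 14-day example from S203, and any numbers fed to it are invented (FIG. 3 gives no values).

```python
from collections import defaultdict
from datetime import date, timedelta


def attribute_sales(contents, sales, window_days=14):
    """Split Sales(j, d) among the contents that mention SPU j and whose
    time window (release date plus `window_days` days) covers date d.

    contents: {content_id: (release_date, weight, set_of_spus)}  (hypothetical layout)
    sales:    {(spu, date): sales_amount}
    Returns   {(content_id, spu, date): contribution}.
    """
    contributions = {}
    for (spu, d), amount in sales.items():
        # Contents mentioning this SPU whose window covers date d.
        active = [
            (cid, weight)
            for cid, (released, weight, spus) in contents.items()
            if spu in spus
            and timedelta(0) <= d - released < timedelta(days=window_days)
        ]
        total_weight = sum(w for _, w in active)
        if total_weight == 0:
            continue  # no content (or only zero-weight content) to credit
        for cid, weight in active:
            # Assumed proportional split: Sales(j, d) * W_i / sum of weights.
            contributions[(cid, spu, d)] = amount * weight / total_weight
    return contributions


def total_per_content(contributions):
    """Accumulate each content's contributions over all SPUs and dates."""
    totals = defaultdict(float)
    for (cid, _spu, _d), value in contributions.items():
        totals[cid] += value
    return dict(totals)
```

With three contents weighted 20, 30 and 50 that all mention one SPU, sales of 1000 on a covered date would be split 200, 300 and 500; a content released outside the window receives nothing.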
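S205: determining the total sales contribution value of the commodities $P_i = \{SPU_1, SPU_2, \ldots, SPU_m\}$ mentioned by a single content i:
$$S_i = \sum_{j=1}^{m} \sum_{d=0}^{t-1} S_i(j,d)$$
wherein t is the length of the time window, set after the content is released, within which the content affects the commodity sales calculation, and d is counted in days from the release date of content i as in S203; when t is 14, then
$$S_i = \sum_{j=1}^{m} \sum_{d=0}^{13} S_i(j,d)$$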
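S206: determining the cumulative value of the impact of the content set $C_k$ corresponding to the content tag k on commodity sales, that is, the sum over all contents in $C_k$ of the total sales contribution values determined in S205:
$$\sum_{i \in C_k} \sum_{j=1}^{m} \sum_{d=0}^{t-1} S_i(j,d)$$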
S207: calculating the sales index $X_k$ of the content tag k from the cumulative value obtained in S206 and p, either with or without log processing, wherein p is the deduplicated total number of commodity SPUs corresponding to the content set $C_k$ of the content tag k, which can be obtained by statistics from the database.
It should be noted that the log processing of the sales index $X_k$ and the interaction index $Y_k$ of the content label k can be omitted; however, because very large extreme values easily appear in big data, a better visual effect is obtained after the log transformation, which makes visual analysis more convenient for the user.
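Steps S205 to S207 can be sketched as follows, starting from per-content total sales contributions that are assumed to be already available. The sketch is hedged: the function name and dictionary layouts are hypothetical, and the way the cumulative value and p are combined (cumulative contribution divided by p, optionally log-transformed) is an assumption, because the text names the inputs of $X_k$ but the formula itself is not legible.

```python
import math


def sales_index(content_totals, content_spus, label_contents, use_log=True):
    """Assumed form of S206-S207 for one content label k.

    content_totals: {content_id: total sales contribution of that content (S205)}
    content_spus:   {content_id: set of SPUs mentioned by that content}
    label_contents: iterable of the content ids in C_k
    """
    ids = list(label_contents)
    # S206: cumulative sales impact of the content set C_k.
    cumulative = sum(content_totals.get(cid, 0.0) for cid in ids)
    # p: number of distinct SPUs mentioned anywhere in C_k (deduplicated).
    spus = set()
    for cid in ids:
        spus |= set(content_spus.get(cid, ()))
    p = len(spus)
    if p == 0:
        return None  # the label's contents mention no SPU
    per_spu = cumulative / p  # assumed normalisation by p
    if not use_log:
        return per_spu
    # The logarithm is undefined for non-positive values; report None then.
    return math.log(per_spu) if per_spu > 0 else None
```

Comparing the output for `use_log=True` and `use_log=False` on data with a few very large sellers shows why the log variant is easier to plot: the transformed values occupy a much narrower range.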
S3: performing visual analysis on the content interaction index and the commodity sales index; the method specifically comprises the following steps:
S31: establishing a two-dimensional coordinate system, with the sales index $X_k$ of the content label k as the X axis and the interaction index $Y_k$ of the content label k as the Y axis;
S32: determining the mean of the sales indexes and the mean of the interaction indexes of all content labels, dividing the coordinate plane into four areas by these two means, and placing each content label into one of the four areas according to its sales index and interaction index. Specifically, after the mean of the sales index and the mean of the interaction index are calculated, one line is drawn through the mean on the X axis and another through the mean on the Y axis; the two mutually perpendicular lines intersect to form four areas: a sightseeing area (lower left), an opportunity area (upper left), a strong area (upper right) and a trust area (lower right). The sales index and interaction index values of all content labels are then placed into the corresponding areas: in the strong area, both the sales index and the interaction index are above their respective means; in the opportunity area, the sales index is below the mean and the interaction index is above the mean; in the sightseeing area, both the sales index and the interaction index are below their respective means; in the trust area, the sales index is above the mean and the interaction index is below the mean.
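The four-area division of S32 can be sketched as a small classification function. The function name and input layout are hypothetical, and the handling of labels that fall exactly on a mean line (treated here as "above") is an assumption, since the text does not specify that boundary case.

```python
from statistics import mean


def classify_labels(indices):
    """Assign each content label to one of the four areas of S32.

    indices: {label: (sales_index, interaction_index)}
    Returns  {label: "strong" | "opportunity" | "trust" | "sightseeing"}.
    """
    x_mean = mean(x for x, _ in indices.values())  # mean sales index
    y_mean = mean(y for _, y in indices.values())  # mean interaction index
    areas = {}
    for label, (x, y) in indices.items():
        high_sales = x >= x_mean          # assumption: ties count as above
        high_interaction = y >= y_mean    # assumption: ties count as above
        if high_sales and high_interaction:
            areas[label] = "strong"        # upper right
        elif high_interaction:
            areas[label] = "opportunity"   # upper left
        elif high_sales:
            areas[label] = "trust"         # lower right
        else:
            areas[label] = "sightseeing"   # lower left
    return areas
```

The returned mapping can be passed directly to any plotting tool to colour the points of the scatter plot built in S31 by area.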
By calculating quantitative indexes of the content tags in terms of both content interaction and commodity sales, the present application helps customers find content tags that perform better in content interaction, in commodity sales, or in both; the quantitative comparison of the content tags makes the creation of marketing content simple, feasible and accurate, and saves labor cost and resource cost.
The examples of the present invention merely describe preferred embodiments of the present invention and are not intended to limit its spirit and scope; those skilled in the art may make various changes and modifications to the technical solution of the present invention without departing from the spirit of the present invention.