CN120106939A - Commodity search method and its device, equipment and medium - Google Patents

Commodity search method and its device, equipment and medium

Info

Publication number
CN120106939A
Authority
CN
China
Prior art keywords
commodity
user
commodities
target
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510176006.2A
Other languages
Chinese (zh)
Inventor
许强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shangyun Network Technology Co ltd
Original Assignee
Guangzhou Shangyun Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shangyun Network Technology Co ltd
Priority to CN202510176006.2A
Publication of CN120106939A
Legal status: Pending

Abstract

Translated from Chinese

The present application relates to a commodity search method and its device, equipment, and medium. The method comprises: performing a coarse recall operation on candidate commodities based on a target search statement input by a user and the commodity information of the candidate commodities, so as to recall the partial commodities whose commodity information matches the target search statement; inputting the target word segments of the target search statement and the word segments corresponding to the commodity information of the partial commodities into a word weight model to obtain weight values for the partial commodities, and marking the partial commodities with levels based on the weight values; and, for the commodities in each level, inputting the user's user portrait feature labels and the commodity feature labels of the commodities in the different levels into a click-through rate model to calculate the click probability of the commodities in each level, and sorting the commodities in each level by click probability to form a final ordered commodity list that is fed back to the user. The present application achieves more accurate commodity search ranking and effectively improves user experience and purchase conversion rate.

Description

Commodity searching method and device, equipment and medium thereof
Technical Field
The application relates to the technical field of electronic commerce, in particular to a commodity searching method and device, equipment and medium thereof.
Background
With the explosive development of electronic commerce, the independent station has become an important e-commerce model, typically built and operated autonomously by a merchant. An independent station often carries a large number of on-sale commodities, so consumers spend a significant amount of time looking for the commodities they want and must wade through a large amount of irrelevant commodity information, which leaves them facing information overload. This problem not only drives consumers away but also limits the independent station's potential to improve conversion. Independent stations therefore actively adopt commodity recommendation strategies to help users make purchase decisions quickly, improve the purchase conversion rate, reduce information overload, provide a more intelligent and efficient shopping experience, and increase the station's commodity sales.
In the prior art, commodity recommendation in the independent station scenario usually relies on one of two search modes: Elasticsearch (ES) search or semantic search. ES search is based on full-text retrieval and returns a list of commodities matching the keywords input by the user. However, ES search is limited in handling complex queries and in semantic understanding: it can often only match keywords, has difficulty understanding the user's true intent, and tends to recall a large number of weakly relevant commodities, so the search results are not accurate enough. Semantic search attempts to understand the intent of a user query through natural language processing techniques, but when the user's search words are short or ambiguous, it has difficulty capturing the user's real-time intent accurately.
Therefore, when commodities are recalled using only ES search or semantic search, the commodity recommendation list fed back to the user suffers from insufficient relevance and disordered ranking, fails to match the user's intent accurately, and lacks personalized recommendation. The user still faces a large amount of irrelevant commodity information, the information overload problem is not effectively solved, and user experience and purchase conversion rate ultimately suffer.
Disclosure of Invention
A primary object of the present application is to solve at least one of the above problems and to provide a commodity searching method, and an apparatus, device and medium thereof.
To achieve the objects of the application, the application adopts the following technical solutions:
A commodity searching method provided according to one of the objects of the application comprises the following steps:
performing a coarse recall operation on candidate commodities based on a target search statement input by a user and commodity information of the candidate commodities, and recalling the partial commodities whose commodity information matches the target search statement;
inputting the target word segments of the target search statement and the word segments corresponding to the commodity information of the partial commodities into a word weight model to obtain weight values corresponding to the partial commodities, and performing hierarchical marking on the partial commodities based on the weight values;
and, for the commodities in each level, inputting the user portrait feature labels of the user and the commodity feature labels of the commodities in the different levels into a click-through rate model, calculating the click probability of the commodities in each level, and sorting the commodities in each level according to the click probability to form a final ordered commodity list that is fed back to the user.
In another aspect, a commodity searching apparatus according to one of the objects of the present application includes:
The commodity coarse recall module is used for carrying out coarse recall operation on the candidate commodity based on a target search statement input by a user and commodity information of the candidate commodity, and recalling partial commodity of which the commodity information is matched with the target search statement;
The hierarchical marking module is used for inputting target word segmentation of the target search statement and word segmentation corresponding to commodity information of the partial commodity into a word weight model to obtain a weight value corresponding to the partial commodity, and performing hierarchical marking on the partial commodity based on the weight value;
The ordered commodity list forming module is used for inputting the user portrait characteristic labels of the user and commodity characteristic labels of commodities in different levels into the click through rate model respectively aiming at commodities in each level, calculating click probability of the commodities in each level respectively, and sorting the commodities in each level according to the click probability to form a final ordered commodity list to be fed back to the user.
In yet another aspect, a computer device provided according to one of the objects of the present application comprises a central processor and a memory, the central processor being configured to invoke and run a computer program stored in the memory to perform the steps of the commodity searching method of the present application.
In yet another aspect, a computer program product provided according to another object of the application comprises a computer program/instructions which, when executed by a processor, implement the steps of the method described in any of the embodiments of the application.
The technical scheme of the application has various advantages, including but not limited to the following aspects:
According to the application, a coarse recall is first performed based on the target search statement input by the user and the commodity information of the candidate commodities, recalling the partial commodities whose commodity information matches the target search statement. This preliminary screening quickly narrows the search range and selects, from a large number of candidate commodities, a set that is more relevant to the user's search intent. On the basis of the recalled partial commodities, the method performs hierarchical marking by introducing a word weight model and personalized sorting by means of a click-through rate model, further improving the accuracy and degree of personalization of the search results.
Specifically, the word weight model calculates a weight value for each commodity from the word segments of the target search statement and the word segments of the commodity information, and the commodities are hierarchically marked based on these weight values. In this way, the coarsely recalled partial commodities are divided into levels by degree of match: commodities in higher levels match the user's search intent more closely, while commodities in lower levels are less relevant to the user. After the coarsely recalled partial commodities are hierarchically marked, the user's user portrait feature labels and the commodity feature labels are input into the click-through rate model, the click probability of the commodities in each level is calculated, and the marked commodities are sorted accordingly, so that the display order of the commodities is further optimized and better fits the user's personalized needs. This personalized sorting based on user behavior and preferences not only raises the probability that users find the commodities they want, but also reduces the time and effort they spend screening commodities, significantly improving the shopping experience.
Finally, the commodity list produced by hierarchical marking and personalized sorting guides the user toward a purchase decision more effectively, which improves the click rate and conversion rate of commodities and brings higher commercial value to the e-commerce platform and its merchants.
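The three stages summarized above (coarse recall, weight-based level marking, and click-probability sorting inside each level) can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's implementation: `word_weight` and `click_probability` are hypothetical stand-ins for the word weight model and the click-through rate model, the level thresholds are assumed values, and concatenating levels from best to worst is an assumed way of forming the final list.

```python
from collections import defaultdict

# Hypothetical stand-in for the word weight model: the fraction of query
# tokens that also appear in the commodity's tokens.
def word_weight(query_tokens, item_tokens):
    if not query_tokens:
        return 0.0
    return len(set(query_tokens) & set(item_tokens)) / len(set(query_tokens))

# Hypothetical stand-in for the click-through rate model: the fraction of
# the commodity's feature labels that match the user's portrait labels.
def click_probability(user_labels, item_labels):
    if not item_labels:
        return 0.0
    return len(set(user_labels) & set(item_labels)) / len(set(item_labels))

def level_of(weight, thresholds=(0.99, 0.5)):
    # Assumed thresholds: level 0 is the best match, larger numbers are weaker.
    for level, t in enumerate(thresholds):
        if weight >= t:
            return level
    return len(thresholds)

def search(query_tokens, user_labels, candidates):
    # Stage 1: coarse recall, keep items sharing at least one token with the query.
    recalled = [c for c in candidates if set(query_tokens) & set(c["tokens"])]
    # Stage 2: mark each recalled item with a level derived from its weight.
    levels = defaultdict(list)
    for c in recalled:
        levels[level_of(word_weight(query_tokens, c["tokens"]))].append(c)
    # Stage 3: sort inside each level by click probability, then concatenate
    # the levels from best to worst so that level order always dominates.
    ordered = []
    for level in sorted(levels):
        ordered.extend(sorted(
            levels[level],
            key=lambda c: click_probability(user_labels, c["labels"]),
            reverse=True,
        ))
    return [c["id"] for c in ordered]

# Illustrative data only; IDs, tokens, and labels are made up for the sketch.
candidates = [
    {"id": 1001, "tokens": ["summer", "new", "dress"], "labels": ["dress", "casual"]},
    {"id": 1002, "tokens": ["summer", "new", "dress"], "labels": ["dress", "formal"]},
    {"id": 1004, "tokens": ["summer", "t-shirt"], "labels": ["t-shirt", "casual"]},
    {"id": 1005, "tokens": ["winter", "coat"], "labels": ["coat"]},
]
result = search(["summer", "new", "dress"], ["formal", "dress"], candidates)
print(result)
```

In this sketch, personalization only reorders commodities that share a level, so a weakly matching commodity cannot overtake a strongly matching one no matter how high its click probability is.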
Drawings
The foregoing and/or additional aspects and advantages of the application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a network architecture of an exemplary e-commerce platform of the present application;
FIG. 2 is a flow chart of an exemplary embodiment of a method for searching for merchandise according to the present application;
FIG. 3 is a flow chart of training a click through rate model in accordance with an embodiment of the present application;
FIG. 4 is a flow chart of determining user portrait characteristic labels in an embodiment of the present application;
FIG. 5 is a schematic flow chart of a coarse recall candidate commodity based on target segmentation in an embodiment of the present application;
FIG. 6 is a schematic flow chart of a coarse recall candidate commodity based on semantic similarity in an embodiment of the present application;
FIG. 7 is a schematic flow chart of hierarchical tagging of a portion of a commodity according to an embodiment of the present application;
FIG. 8 is a flow chart of determining weight values of a portion of an article according to an embodiment of the present application;
FIG. 9 is a schematic block diagram of a commodity searching apparatus according to the present application;
FIG. 10 is a schematic structural diagram of a computer device used in the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the network architecture shown in FIG. 1, the e-commerce platform 82 is deployed on the internet to provide corresponding services to its users, and the merchant user's device 80 and the consumer user's device 81 are likewise connected to the internet to use the services provided by the e-commerce platform 82.
The exemplary e-commerce platform 82 provides matching of supply and demand for products and/or services to the public by means of an internet infrastructure, in the e-commerce platform 82, the products and/or services are provided as merchandise information, and for simplicity of description, the concept of merchandise, products, etc. is used in the present application to refer to the products and/or services in the e-commerce platform 82, and specifically may be physical products, digital products, tickets, service subscriptions, other off-line fulfillment services, etc.
In practice, each party's entity can access the e-commerce platform 82 under a user identity and take part in the business activities enabled by the platform by using the various online services it provides. These entities may be natural persons, legal persons, social organizations, etc. Corresponding to the merchant and consumer entities in commerce, the e-commerce platform 82 has two broad categories of users: merchant users and consumer users. A merchant uses the online services of the e-commerce platform 82 under the identity of a merchant user, while a merchant's consumers, whether actual or potential, use them under the identity of a consumer user. In actual business activities, the same entity may act both as a merchant user and as a consumer user, so these roles should be understood flexibly.
The infrastructure for deploying the e-commerce platform 82 mainly comprises a background architecture and front-end equipment, wherein the background architecture runs various online services through a service cluster, including middleware or front-end services facing a platform side, services facing a consumer, services facing a merchant and the like to enrich and perfect service functions of the services, and the front-end equipment mainly comprises terminal equipment used by a user as a client to access the e-commerce platform 82, including but not limited to various mobile terminals, personal computers, point-of-sale equipment and the like. For example, a merchant user can enter commodity information for his online store through his terminal device 80 or generate his commodity information by using an interface opened by the e-commerce platform, and a consumer user can access a webpage of the online store implemented by the e-commerce platform 82 through his terminal device 81, trigger a shopping process by a shopping key provided on the webpage, and invoke various online services provided by the e-commerce platform 82 in the shopping process, thereby achieving the purpose of ordering shopping.
In some embodiments, the e-commerce platform 82 may be implemented by a processing facility including a processor and memory that stores a set of instructions that, when executed, cause the e-commerce platform 82 to perform e-commerce and support functions in accordance with the present application. The processing facility may be part of a server, client, network infrastructure, mobile computing platform, cloud computing platform, fixed computing platform, or other computing platform, and may provide electronic components of the e-commerce platform 82, merchant devices, payment gateways, application developers, marketing channels, transport providers, client devices, point-of-sale devices, and the like.
The e-commerce platform 82 may be implemented as online services such as cloud computing services, software as a service (SaaS), infrastructure as a service (IaaS), platform as a service (PaaS), desktop as a service (DaaS), hosted software as a service, mobile backend as a service (MBaaS), information technology management as a service (ITMaaS), and the like. In some embodiments, the various features of the e-commerce platform 82 may be adapted to operate on a variety of platforms and operating systems. For an online store, for example, the administrator user may enjoy the same or similar functionality whether in iOS, Android, HarmonyOS, web page, or other embodiments.
The e-commerce platform 82 may implement a separate independent station for each merchant to run its online stores, providing the merchant with a corresponding commerce management engine instance with which to establish, maintain, and run one or more online stores in one or more independent stations. The commerce management engine instance can be used for content management, task automation, and data management of one or more online stores, and the various specific business processes of the online stores can be configured through interfaces, built-in components, and the like to support business activities. The independent station is an infrastructure of the e-commerce platform 82 with cross-border service functionality, on which merchants can maintain their online stores more centrally and autonomously. Independent stations typically have merchant-specific domain names and storage space and are relatively independent of one another. The e-commerce platform 82 may provide standardized or personalized technical support for a large number of independent stations, so that merchant users may customize their own commerce management engine instances and use them to maintain one or more online stores that they own.
A merchant user may configure and maintain an online store in the background by logging in to the commerce management engine instance with an administrator identity. Supported by the various online services provided by the infrastructure of the e-commerce platform 82, the merchant user may configure the functions of the online store, consult various data, and so on. For example, the merchant user may manage various aspects of the online store, such as viewing the store's recent activity, updating its inventory, and managing orders, recent access activity, and total order activity. By obtaining reports or metrics, the merchant user may also view more detailed information about the business and about visitors to the online store, such as sales summaries of the merchant's overall business and specific sales and engagement data for active sales and marketing channels.
The e-commerce platform 82 may provide a communications facility and an associated merchant interface for electronic communications and marketing, for example using an electronic message aggregation facility to collect and analyze communication interactions among merchants, consumers, merchant devices, customer devices, point-of-sale devices, and the like, such as to increase the potential for product sales. For example, a consumer may have a question about a product, which may lead to a dialogue between the consumer and the merchant (or an automated processor-based agent acting on behalf of the merchant); the communications facility handles the interaction and provides the merchant with an analysis of how to increase the probability of a sale.
In some embodiments, an application suitable for installation on a terminal device may be provided to serve the access requirements of different users, so that users can access the e-commerce platform 82 from the terminal device by running the application, for example to reach the merchant background module of an online store in the e-commerce platform 82. To support business activities carried out through these functions, the e-commerce platform 82 may implement the relevant functions as middleware or online services and open corresponding interfaces, and a toolkit for accessing those interfaces may then be embedded in the application to extend its functions and carry out tasks. The commerce management engine may include a series of basic functions and expose them through APIs to online services and/or applications, which use the corresponding functions by remotely calling the corresponding APIs.
With the support of the various components of the commerce management engine instance, the e-commerce platform 82 may provide online shopping functionality that lets merchants establish contact with customers in a flexible and transparent manner. Consumer users may purchase items online, create commodity orders, provide delivery addresses for the items in those orders, and complete payment confirmation. The merchant may then review the order and fulfill or cancel it. The audit component carried by the commerce management engine instance supports compliant use of the business process, ensuring that an order is suitable for fulfillment before it is actually fulfilled. Orders can sometimes be fraudulent and require verification (e.g., an identity card check); a payment method that requires the merchant to wait until funds are confirmed as received can help guard against such risk. Order risk may be generated by fraud detection tools submitted by third parties through an order risk API or the like. Before fulfillment, the merchant may need to acquire payment information, or wait to receive it, so that the order can be marked as paid before the merchant prepares to deliver the product. Corresponding reviews can be made for cases such as these. The audit flow may be implemented by a fulfillment component.
Merchants can review and adjust the work by means of the fulfillment component and trigger related fulfillment services, for example: a manual fulfillment service, which can be used when a merchant picks and packs a product in a box, purchases a shipping label and enters its tracking number, or simply marks an item as fulfilled; a custom fulfillment service, which can define sending an email as notification; an API fulfillment service, which can trigger a third-party application to create a fulfillment record at the third party; a legacy fulfillment service, which can trigger a custom API call from the commerce management engine to a third party; and a gift card fulfillment service, which can provide for generating a number and activating the gift card. Merchants may print shipping slips using an order printer application. The fulfillment process may be performed when the items are packaged in boxes and ready for shipment, tracking, delivery, verification by the consumer, etc.
The e-commerce platform 82 may provide a plug-in service to all independent station stores within it: a plug-in system, developed and maintained by the e-commerce platform, that merchants of the independent station stores can use. The plug-in system provides a plurality of plug-ins with diverse functions, including but not limited to the following. The intelligent customer service plug-in can act in a customer service role for the independent station store, automatically replying to some preset consultation questions and/or scripts based on the store's preset question-and-answer data. The commodity image design plug-in can further design commodity images, such as the head image and commodity detail images, using layer template data created by the merchant or default layer template data, so as to enhance the marketing effect of the commodity images. The commodity transfer list search plug-in can search for corresponding commodity transfer lists using the store's commodity title data. The data analysis plug-in can help the store collect, analyze, and display its operating data, including traffic sources, customer behavior, and sales trends, so that the merchant can make better-informed business decisions. The inventory management plug-in can monitor and manage the store's inventory data in real time and raise an alert when inventory is insufficient, so as to avoid overstock and out-of-stock conditions. The promotion plug-in can create and track sales promotion activities such as coupons, time-limited discounts, and member points, so as to attract consumers and improve sales. A further plug-in can use artificial intelligence technology to create promotional content based on the store's product data and/or marketing campaign data, for distribution through social media and search channels related to the store. The store decoration plug-in can match corresponding front-end decoration interface templates based on the store's product data and/or marketing campaign data, so as to optimize the visual effect of store decoration and improve its efficiency.
The commodity searching method of the present application may be programmed as a computer program product and deployed for execution on a client or a server. For example, in the exemplary application scenario of the present application, the method may be deployed and implemented on a server of an e-commerce customer service platform.
Referring to fig. 2, in an exemplary embodiment of the application, the method for searching a commodity includes the following steps:
Step S5100: based on a target search statement input by a user and commodity information of candidate commodities, perform a coarse recall operation on the candidate commodities and recall the partial commodities whose commodity information matches the target search statement;
Each independent station of the e-commerce platform provides a search statement input component on its search page so that users can search for the commodities they want; the user can enter keywords, phrases, or complete sentences to express the corresponding search intent. In this step, a coarse recall operation is performed on the candidate commodities based on the search statement (i.e., the target search statement) that the user enters in the input component of a specific independent station and on the commodity information of the candidate commodities. The candidate commodities comprise the commodities currently on sale in the independent station. In the coarse recall operation, the commodities whose commodity information matches the target search statement are recalled to form the partial commodities. The commodity information includes the commodity title, description, feature labels, category, price, stock status, and the like, and is used to describe the attributes, status, and performance of a commodity comprehensively.
In one embodiment, the user enters a target search statement, such as "summer new dress", through the search statement input component provided on the independent station's search page. Based on the target search statement, a coarse recall operation is performed on the candidate commodities using the Elasticsearch (ES) search engine. ES is a distributed search engine that supports efficient full-text retrieval and structured data queries and is suitable for fast recall over massive commodity data. In ES, the commodity information of the candidate commodities (including commodity title, description, feature labels, category, price, stock status, etc.) is indexed in advance and an inverted index structure is constructed. The inverted index is built by segmenting the commodity information into words and generating mapping data between each term and the unique identifiers of the commodities that contain it. For example, the commodity title "summer new dress" is segmented into "summer", "new", and "dress", and a mapping from each term to commodity IDs is established in the inverted index. After the user inputs a target search statement, ES first segments it to obtain a target word segment set. For example, the target search statement "summer new dress" is segmented into "summer", "new", and "dress". ES then uses the target word segment set and the inverted index to recall an initial commodity set in which each commodity contains at least one of the target words; this is detailed in a later embodiment and is not repeated here.
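The inverted-index recall described in this embodiment can be illustrated with a small in-memory sketch. It does not call Elasticsearch; it only mirrors the idea of mapping terms to commodity IDs and recalling every commodity that contains at least one target word. The whitespace tokenizer stands in for a real word segmenter, and the titles and IDs are illustrative assumptions.

```python
from collections import defaultdict

def tokenize(text):
    # Assumption: whitespace tokenization stands in for a real word segmenter.
    return text.lower().split()

def build_inverted_index(titles):
    # Map each term to the set of commodity IDs whose title contains it.
    index = defaultdict(set)
    for item_id, title in titles.items():
        for term in tokenize(title):
            index[term].add(item_id)
    return index

def recall(index, query):
    # Union of the posting sets: a commodity is recalled if it contains
    # at least one of the target words.
    hits = set()
    for term in tokenize(query):
        hits |= index.get(term, set())
    return hits

# Illustrative catalogue; the titles and IDs are made up for the sketch.
titles = {
    1001: "summer new dress",
    1004: "summer T-shirt",
    1007: "winter coat",
}
index = build_inverted_index(titles)
recalled = recall(index, "summer new dress")
print(sorted(recalled))
```

Because the lookup touches only the posting sets of the query terms, the cost of recall depends on how many commodities contain those terms rather than on the size of the whole catalogue.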
To enhance recall effects, the ES also supports fuzzy queries and synonym extensions, thereby recalling more relevant items. After recalling the initial commodity set, the ES calculates a score for each commodity based on the correlation of the commodity information with the target search statement. The relevance score comprehensively considers the frequency of occurrence, the position weight and the field weight (such as the title weight is higher than the description weight) of the target word in the commodity title, the description and the feature labels. For example, the article of merchandise ID 1001 is entitled "summer new dress," described as "fashion and slimming design," characterized by "summer," "dress," "new," and having a higher correlation score, while the article of merchandise ID 1004 is entitled "summer T-shirt," described as "breathable and comfortable," characterized by "summer," "T-shirt," and having a lower correlation score. And finally, the ES sorts the initial commodity set according to the relevance score, and selects commodities with the score higher than a preset threshold value as partial commodities to complete the coarse recall operation. Through the high-efficiency retrieval capability of the ES, partial commodities matched with the target search statement can be recalled from massive candidate commodities, and basic data support is provided for subsequent refined sorting.
In another embodiment, the user enters the target search term through a search term input component provided by a stand alone search page, such as "frivolous skirt fit for summer wear". And performing rough recall operation on the candidate commodity by utilizing a semantic search technology based on the target search statement. Semantic search can more accurately capture a user's search intent by understanding the semantics of a search term rather than simple keyword matching. Specifically, the target search statement and the commodity information (including commodity titles, descriptions, feature tags, etc.) of the candidate commodity are converted into semantic vector representations. The conversion of semantic vectors can be accomplished through pre-trained language models (e.g., BERT, sentence-BERT, etc.) that can map text to a high-dimensional vector space such that semantically similar text is closer in vector space. Then, a similarity score between the semantic vector of the target search term and the semantic vector of each candidate item is calculated. The similarity score is generally calculated by using cosine similarity, the range of values is [ -1,1], and the closer the value is to 1, the more similar the semantics. In order to improve recall effect, the similarity score can be weighted and adjusted by combining the historical click rate, purchase conversion rate, user evaluation score and other behavior data of the commodity. For example, for a commodity with a higher historical click rate, its similarity score may be appropriately increased. And finally, sorting the candidate commodities according to the similarity scores, screening commodities with scores higher than a preset threshold value as partial commodities, and completing the coarse recall operation. 
Through the semantic search technology, the search intention of the user can be more accurately understood, partial commodities which are semantically matched with the target search statement are recalled, and a high-quality data base is provided for subsequent refined sorting.
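The cosine-similarity recall described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the three-dimensional vectors stand in for embeddings that a pre-trained language model would produce, and the names `cosine_similarity`, `semantic_recall`, the commodity vectors, and the threshold of 0.4 are assumptions introduced for the example.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; the value lies in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def semantic_recall(query_vec, item_vecs, threshold):
    """Return (commodity_id, score) pairs above the threshold, best first."""
    scored = [(item_id, cosine_similarity(query_vec, vec))
              for item_id, vec in item_vecs.items()]
    kept = [(item_id, score) for item_id, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy vectors standing in for model embeddings.
query_vec = [1.0, 0.0, 1.0]
item_vecs = {
    1001: [1.0, 0.0, 1.0],  # same direction as the query
    1002: [1.0, 1.0, 0.0],  # partially aligned
    1004: [0.0, 1.0, 0.0],  # orthogonal to the query
}
recalled = semantic_recall(query_vec, item_vecs, threshold=0.4)
recalled_ids = [item_id for item_id, _ in recalled]
```

In this toy data, commodity 1004 is orthogonal to the query vector and is therefore dropped by the threshold, while 1001 and 1002 are kept in descending order of similarity.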
In another embodiment, the coarse recall operation can be performed on the candidate commodities by combining the ES search technology and the semantic search technology. Combining the two takes into account both the efficiency of keyword matching and the accuracy of semantic understanding, so that the partial commodities highly matched with the target search statement are recalled.
Step S5200, inputting target word segmentation of the target search statement and word segmentation corresponding to commodity information of the partial commodity into a word weight model to obtain a weight value corresponding to the partial commodity, and carrying out hierarchical marking on the partial commodity based on the weight value;
A target word segmentation set of the target search statement and the word segmentation set corresponding to the commodity information of the partial commodities are acquired. The target word segmentation set is obtained by performing word segmentation on the target search statement. The word segmentation set corresponding to the commodity information of the partial commodities is obtained by performing word segmentation on the commodity title, description, and feature labels. The target word segmentation set and the word segmentation set corresponding to the commodity information are then input into a word weight model. In one embodiment, the word weight model is a machine-learning-based model that calculates the weight value of each commodity according to the matching degree between the target words and the commodity information words, the search frequency of the target words, the latest search time, and other factors. The word weight model first calculates the matching degree score between a target word and the commodity information words; for example, the matching degree score between a target word and a commodity title word is 1.0, and the matching degree score between a target word and a commodity description word is 0.8. Then, the matching degree score is weighted based on the search frequency and the latest search time of the target word. The higher the search frequency, the greater the importance of the target word and the higher the corresponding forward weight; the more recent the latest search time, the stronger the timeliness of the target word and the less its weight is attenuated (in the example below, the decay weight is 0.9 for a search 1 day ago and 0.7 for a search 7 days ago).
For example, the target word "summer" has a search frequency of 1000 times/month and a latest search time of 1 day ago, giving a forward weight of 1.2 and a decay weight of 0.9; the target word "new" has a search frequency of 500 times/month and a latest search time of 7 days ago, giving a forward weight of 1.0 and a decay weight of 0.7. The forward weight and the decay weight are added to obtain the comprehensive weight of each target word: 2.1 for "summer" and 1.7 for "new". Finally, each comprehensive weight is multiplied by the corresponding matching degree score, and the weighted matching degree scores of all target words are accumulated to obtain the weight value of the commodity. The partial commodities are then hierarchically marked by comparing the weight value against preset level thresholds. For example, with preset level thresholds of 5.0, 4.0, and 3.0, a commodity with a weight value not less than 5.0 is marked as the first level, a commodity with a weight value not less than 4.0 and less than 5.0 is marked as the second level, and a commodity with a weight value not less than 3.0 and less than 4.0 is marked as the third level. According to this embodiment, the partial commodities can be finely layered according to the matching degree between the target words and the commodity information and the importance of the target words.
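The arithmetic of this example can be checked with a short sketch. It assumes the forward and decay weights are already known per target word (the text does not give the functions that derive them from search frequency and recency), the matching degree scores of 1.0 and 0.8 are picked for illustration, and commodities below the lowest threshold are left unlabelled, which is a choice made here rather than something the text specifies. All function names are hypothetical.

```python
def composite_weight(forward, decay):
    """Comprehensive weight of a target word: forward weight plus decay weight."""
    return forward + decay

def commodity_weight(word_stats, match_scores):
    """Sum over target words of (comprehensive weight x matching degree score)."""
    return sum(
        composite_weight(forward, decay) * match_scores.get(word, 0.0)
        for word, (forward, decay) in word_stats.items()
    )

def level_of(weight, thresholds=(5.0, 4.0, 3.0)):
    """Map a weight value to level 1, 2, or 3; None if below every threshold."""
    for level, threshold in enumerate(thresholds, start=1):
        if weight >= threshold:
            return level
    return None  # assumption: no level is assigned below the lowest threshold

# Forward and decay weights taken from the example in the text.
word_stats = {"summer": (1.2, 0.9), "new": (1.0, 0.7)}
# Hypothetical matching degree scores for one commodity.
match_scores = {"summer": 1.0, "new": 0.8}

weight = commodity_weight(word_stats, match_scores)  # 2.1 * 1.0 + 1.7 * 0.8
level = level_of(weight)
```

With these inputs the weight value is about 3.46, which falls in the range [3.0, 4.0) and is therefore marked as the third level.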
Step S5300, inputting the user portrait characteristic labels of the user and the commodity characteristic labels of commodities in different levels into a click through rate model respectively aiming at the commodities in each level, calculating the click probability of the commodities in each level respectively, and sorting the commodities in each level according to the click probability to form a final ordered commodity list and feeding the final ordered commodity list back to the user.
The user portrait feature labels of the user and the commodity feature labels of the commodities in the different levels are acquired. User portrait feature labels are obtained by analyzing and modeling the user's historical behavior data (such as purchase records and click records), and include, but are not limited to, general labels such as "new product sensitive", "hot product sensitive", "activity sensitive", and "price sensitive". For example, the user portrait feature labels of user A are "new product sensitive", "hot product sensitive", and "activity sensitive", indicating that user A has a high interest in new, hot, and activity commodities. The commodity feature labels are obtained by extracting and modeling commodity information (such as title, description, classification, and price) and are used to describe the attributes, states, and performance of the commodity. It should be noted that the commodity feature labels correspond to the user portrait labels; for example, the user portrait label "new product sensitive" corresponds to the commodity feature label "new". Commodity feature labels include, but are not limited to, "new", "hot", "activity", and the like.
The user portrait feature labels and the commodity feature labels are input into a click-through rate model (CTR model). The CTR model is a machine-learning-based model that predicts the probability of a user clicking a commodity according to the matching degree between the user portrait features and the commodity features. The CTR model first performs feature crossing and embedding mapping on the user portrait feature labels and commodity feature labels, converting them into high-dimensional vector representations. A matching score between the user vector and the commodity vector is then calculated and converted into a click probability value by an activation function (e.g., the sigmoid function).
To improve prediction accuracy, the CTR model may also be dynamically adjusted in combination with real-time performance data of the commodity (such as current click-through rate, inventory status, and promotional information). For example, for commodities whose inventory is running low, the click probability value may be appropriately increased. Finally, the commodities in each level are sorted by click probability to form the final ordered commodity list, which is fed back to the user. In this way, an accurate ordered commodity list can be generated according to the personalized needs of the user and the real-time performance of the commodities, improving the user's search experience and the purchase conversion rate.
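The scoring and per-level sorting can be illustrated with a small sketch. It assumes the matching score is a plain dot product between already-embedded user and commodity vectors, and that the final list is formed by concatenating the levels in order (first level first); the text states only that commodities are sorted within each level. The vectors and the function names are hypothetical.

```python
import math

def sigmoid(x):
    """Squash a matching score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def click_probability(user_vec, item_vec):
    """Dot-product matching score passed through the sigmoid activation."""
    return sigmoid(sum(u * p for u, p in zip(user_vec, item_vec)))

def rank_by_level(user_vec, items):
    """items: (commodity_id, level, item_vec) triples.

    Commodities are ordered by level first, then by descending click
    probability inside each level.
    """
    scored = [(level, -click_probability(user_vec, vec), item_id)
              for item_id, level, vec in items]
    return [item_id for _, _, item_id in sorted(scored)]

# Hypothetical embedded vectors.
user_vec = [1.0, 0.5]
items = [
    (1001, 1, [0.2, 0.1]),  # level 1, matching score 0.25
    (1002, 1, [0.9, 0.4]),  # level 1, matching score 1.1
    (1003, 2, [2.0, 2.0]),  # level 2, matching score 3.0
]
ordered_ids = rank_by_level(user_vec, items)
```

Commodity 1003 has the highest click probability but stays behind both first-level commodities, because in this sketch the click probability only reorders commodities inside their own level.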
As can be appreciated from the exemplary embodiments of the present application, the technical solution of the present application has various advantages, including but not limited to the following aspects:
The application first recalls, based on the target search statement input by the user and the commodity information of the candidate commodities, the partial commodities whose commodity information matches the target search statement by means of ES search and/or semantic search. The full-text retrieval capability of ES search and the deeper semantic understanding of semantic search ensure the breadth and relevance of the recalled commodities. This preliminary screening quickly narrows the search range and selects, from massive candidate commodities, a commodity set highly relevant to the user's search intention. On the basis of the recalled partial commodities, the application performs hierarchical marking by introducing the word weight model and personalized sorting by means of the click-through rate model, so that the accuracy and degree of personalization of the search results are further optimized.
Specifically, the word weight model calculates the weight value of each commodity according to the matching degree between the words of the target search statement and the words of the commodity information, the search frequency of the target words, the latest search time, and other factors, and the commodities are hierarchically marked based on the weight value. In this way, the coarsely recalled partial commodities are divided into levels based on matching degree: commodities in higher levels better match the user's search intention, while commodities in lower levels are less relevant to the user. After the coarsely recalled partial commodities are hierarchically marked, the user portrait feature labels and commodity feature labels are input into the click-through rate model, the click probability of the commodities in each level is calculated, and the hierarchically marked commodities are sorted accordingly, so that the display order of the commodities is further optimized and better fits the personalized needs of the user. This personalized sorting based on user behavior and preferences not only increases the probability that the user finds the commodities they like, but also reduces the time and effort the user spends screening commodities, significantly improving the user's shopping experience.
Finally, the commodity list subjected to hierarchical marking and personalized sorting can guide the user to make a purchase decision more effectively, so that the click rate and conversion rate of commodities are improved, and higher commercial value is brought to an e-commerce platform and merchants.
In a further embodiment, referring to fig. 3, based on a target search statement input by a user and commodity information of a candidate commodity, performing a coarse recall operation on the candidate commodity, and before recalling a part of commodities with commodity information matched with the target search statement, the method includes the following steps:
Step S6100, a training sample set is obtained, wherein the training sample set comprises a plurality of user history behavior samples and corresponding supervision labels thereof, each user history behavior sample comprises a user portrait characteristic label of a user and a commodity characteristic label of a candidate commodity, and the supervision labels represent whether the user clicks the corresponding candidate commodity;
In one embodiment, a collection program is set up to collect user behavior samples within a specific time period (hereinafter referred to as user historical behavior samples). Each sample records a search behavior of the user, together with the user portrait feature labels, the commodity feature labels of the candidate commodity, and whether the candidate commodity was clicked. The user portrait feature labels may be obtained from a user information base; how they are determined is described in the later embodiments and is not repeated here. Likewise, the commodity feature labels of the candidate commodities may be obtained from a commodity information base. The user portrait feature labels and the commodity feature labels are combined into user historical behavior samples, and a supervision label is added to each sample. The supervision label characterizes whether the user clicked the corresponding candidate commodity. For example, the supervision label of the combined sample for user A and commodity ID 1001 is 1 (clicked), and the supervision label of the combined sample for user A and commodity ID 1004 is 0 (not clicked). Finally, all the user historical behavior samples and their corresponding supervision labels form the training sample set. The training sample set is used for the subsequent training of the click-through rate model (CTR model), as described below.
Step S6200, calling the training sample set to train the click-through rate model to a convergence state, so that the model acquires the ability to predict the click probability of the user on a candidate commodity according to the user portrait feature labels and the commodity feature labels of the candidate commodity.
In one embodiment, feature engineering is performed on the user portrait feature labels (e.g., "new product sensitive", "hot product sensitive", "activity sensitive") and the commodity feature labels (e.g., "new", "hot") in the training sample set, including feature encoding, feature crossing, and feature embedding mapping. For example, the user portrait feature label "new product sensitive" is encoded as [1,0], the commodity feature label "new" is encoded as [0,1], and both are mapped to high-dimensional vector representations through embedding layers. Next, the network structure of the click-through rate model (CTR model) is constructed. The CTR model typically employs a deep neural network (DNN) or a Wide & Deep model, in which the wide part handles sparse features (e.g., user portrait features and commodity features) and the deep part learns higher-order interactions between features. For example, the user portrait feature vector U1 of user A and the commodity feature vector P1 of commodity ID 1001 undergo feature crossing and interaction through the DNN to generate a matching degree score. The training sample set is then input into the CTR model for training. During training, a cross-entropy loss function is used to calculate the error between the click probability predicted by the model and the true supervision label, and the model parameters are updated through the back-propagation algorithm. To improve the generalization ability of the click-through rate model, regularization techniques (e.g., L2 regularization) and optimization algorithms (e.g., the Adam optimizer) may also be employed in the training process. Finally, the CTR model is trained over multiple rounds of iteration until the model loss function converges to a preset threshold range. For example, after 100 rounds of training, the model loss function value stabilizes below 0.01, indicating that the model has reached a converged state.
According to the embodiment, a high-precision CTR model can be trained, so that the capability of predicting the click probability of the user on the candidate commodity according to the user portrait characteristic label and the commodity characteristic label is learned, and support is provided for subsequent commodity sequencing and recommendation.
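To make the training loop concrete, the sketch below trains a much simpler stand-in for the CTR model: a logistic regression with a hand-built cross feature, cross-entropy loss, L2 regularization, and plain full-batch gradient descent. It is not the DNN or Wide & Deep network described above, and it does not use the Adam optimizer; the feature layout, sample data, hyperparameters, and function names are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Predicted click probability for feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train(samples, labels, epochs=3000, lr=0.5, l2=0.001):
    """Full-batch gradient descent on mean cross-entropy plus an L2 penalty on w."""
    n = len(samples)
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict(w, b, x) - y  # d(cross-entropy)/d(logit)
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * (g / n + l2 * wi) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Hypothetical features: [user is "new product sensitive",
#                         commodity is labelled "new",
#                         cross feature = product of the two]
samples = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
labels = [1, 0, 0, 0]  # supervision labels: clicked only when both match

w, b = train(samples, labels)
p_match = predict(w, b, [1, 1, 1])
p_no_match = predict(w, b, [1, 0, 0])
```

After training, `p_match` is well above 0.5 and `p_no_match` well below it, which is the behavior the supervision labels ask for. A production model would replace the hand-built cross feature with learned embeddings and a deeper network.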
In the embodiment, the click probability of the user on the candidate commodity can be accurately predicted by acquiring the historical behavior sample of the user and the supervision label thereof, constructing a high-quality training sample set and combining the feature engineering processing and the deep neural network model training. The trained CTR model can accurately predict the click probability of the user on the commodity according to the user portrait characteristic label and the commodity characteristic label, and provides high-precision data support for subsequent commodity ordering and recommendation.
In a further embodiment, referring to fig. 4, before obtaining the training sample set, the method includes the following steps:
Step S7100, based on the unique identifier of the user, acquiring the behavior data of the user at each independent station in the e-commerce platform, and determining the corresponding user portrait data, wherein the behavior data comprises purchase behavior data, shopping behavior data, and click behavior data;
The historical behavior data of the user is extracted from the user behavior logs of each independent station in the e-commerce platform through the user unique identifier (such as a user ID, user email address, or user mobile phone number). The user behavior log records all interaction behaviors of the user at each independent station, including purchase behavior data, shopping behavior data, and click behavior data. In one embodiment, the extracted user behavior data is cleaned and preprocessed to remove duplicate records, invalid records, and abnormal data, ensuring the accuracy and integrity of the data. The user portrait data of the user is then generated based on the cleaned behavior data. The generated user portrait data may include the user information and behavior data of the user, and is used for the subsequent determination of the user portrait feature labels. This step extracts valuable user portrait data from the historical behavior data of the user and provides a data basis for the subsequent generation of user portrait feature labels.
The data of multiple independent stations cover the behavior of the user under different scenes, the interest preference, consumption capability, purchasing habit and the like of the user can be reflected more comprehensively, the data association across the independent stations can be realized by utilizing the unique identification of the user, and the integrity and the continuity of the behavior data of the user are ensured. By integrating the multi-station data, the consumer characteristics of the user can be more fully characterized.
Step S7200, inputting the user portrait data and a preset general label set into a pre-trained label definition model, determining a user portrait characteristic label of a corresponding user, and storing the user portrait characteristic label in association with the unique user identifier, wherein the general label set comprises a plurality of user portrait characteristic labels.
The general label set is a predefined, standardized label set used to describe the features of users in a unified way. For example, "new product sensitivity" in the user portrait data is mapped to the general label "new product sensitive", and "hot product sensitivity" is mapped to the general label "hot product sensitive". The user portrait data and the general label set are input into a pre-trained label definition model. The label definition model is a rule-based or machine-learning-based model that generates the user portrait feature labels of a user according to the matching degree between the user portrait data and the general labels. The generated user portrait feature labels are then stored in association with the user unique identifier (e.g., the user ID). This step standardizes the user portrait data into unified user portrait feature labels and stores them in association with the user unique identifier, so that they can later be retrieved directly by that identifier.
In the embodiment, the standardized user portrait characteristic label can be efficiently and accurately generated and stored in association with the unique identifier of the user by acquiring the behavior data of each independent station of the user in the e-commerce platform and combining the predefined general label set and the label definition model. According to the method and the device, based on the unique user identification, the behavior characteristics of the user at multiple independent stations of the E-commerce platform can be comprehensively captured, the user portraits can be uniformly described through the standardized labels, the accuracy and consistency of the user portraits are remarkably improved, and high-quality data support is provided for follow-up personalized recommendation and accurate marketing.
In a further embodiment, referring to fig. 5, based on a target search sentence input by a user and commodity information of a candidate commodity, performing a coarse recall operation on the candidate commodity, recall a part of commodities with commodity information matched with the target search sentence, including the following steps:
Step S5110, performing word segmentation processing on target search sentences input by a user to obtain a plurality of target word segments to form a target word segment set;
The target search statement input by the user on the search page is acquired, and word segmentation is then performed on it using a word segmentation tool or a word segmentation model. The word segmentation tool may be a dictionary-based tool (e.g., jieba) or a machine-learning-based word segmentation model (e.g., BERT-based word segmentation). To improve the accuracy of word segmentation, the context semantics can be taken into account during segmentation. For example, for the target search statement "light and thin skirt fit for summer wear", the word segmentation tool can identify "fit for summer wear" as one semantic unit and segment the statement into "fit", "summer", "wear", "light and thin", and "skirt". The word segmentation result is then cleaned and de-duplicated, removing stop words (such as "and" and "having") and invalid characters (such as punctuation marks) to ensure the purity and validity of the target word segmentation set. Finally, the cleaned word segmentation results form the target word segmentation set. In this way, the target search statement input by the user is segmented into meaningful word units, providing basic data for the subsequent inverted index query and commodity recall.
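The cleaning and de-duplication stage can be sketched as below. Whitespace splitting is used here only as a stand-in tokenizer so the example runs without third-party packages; in practice the `tokenize` argument would be replaced by a real segmenter such as the ones named above. The stop-word list and the function name `build_target_segments` are assumptions.

```python
import string

# Hypothetical stop-word list for the English rendering of the example.
STOP_WORDS = {"for", "and", "the", "of"}

def build_target_segments(query, tokenize=str.split):
    """Tokenize, strip punctuation, drop stop words, and de-duplicate in order."""
    seen = set()
    segments = []
    for token in tokenize(query.lower()):
        token = token.strip(string.punctuation)  # remove invalid characters
        if not token or token in STOP_WORDS or token in seen:
            continue
        seen.add(token)
        segments.append(token)
    return segments

segments = build_target_segments("Light and thin skirt, fit for summer wear!")
```

The resulting list preserves the original word order, which keeps later steps such as position weighting possible.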
Step S5120, determining, based on the target word segmentation set and a preset inverted index, an initial commodity set whose entries include at least one of the target words, wherein the inverted index is constructed by performing word segmentation on the commodity titles, commodity descriptions, and commodity feature labels of the candidate commodities to generate mapping relation data between each entry and the unique identifiers of the commodities containing that entry;
The target word segmentation set and the preset inverted index are obtained. The inverted index is constructed by performing word segmentation on the commodity titles, commodity descriptions, and commodity feature labels of the candidate commodities to generate mapping relation data between each entry and the unique identifiers of the commodities containing that entry. A mapping from each entry to commodity IDs is established in the inverted index; for example, "summer" maps to commodity IDs 1001, 1004, and 1005, and "dress" maps to commodity IDs 1001, 1002, and 1003. Then, based on the target word segmentation set and the inverted index, a Boolean query or a match query is executed to recall an initial commodity set whose entries include at least one of the target words. For example, if the target word "summer" maps to commodity IDs 1001, 1004, and 1005 and the target word "dress" maps to commodity IDs 1001, 1002, and 1003, the initial commodity set is 1001, 1002, 1003, 1004, and 1005. The inverted index also supports fuzzy queries and synonym expansion to recall more related commodities. For example, the target word "dress" is expanded by synonyms to "skirt" and "long skirt". Through this step, the initial commodity set matching the target search statement can be efficiently screened from a large number of candidate commodities.
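A minimal in-memory version of this lookup is sketched below. It mirrors the example mappings in the text ("summer" to 1001, 1004, 1005 and "dress" to 1001, 1002, 1003), treats the query as a Boolean OR over the target words, and handles synonym expansion with a plain dictionary. Commodity 1006, the synonym table, and the function names are hypothetical; a search engine such as ES would provide this functionality in practice.

```python
from collections import defaultdict

def build_inverted_index(item_terms):
    """Map each entry (term) to the set of commodity IDs that contain it."""
    index = defaultdict(set)
    for item_id, terms in item_terms.items():
        for term in terms:
            index[term].add(item_id)
    return index

def recall_initial_set(index, target_segments, synonyms=None):
    """Boolean OR query: any commodity containing a target word or a synonym."""
    synonyms = synonyms or {}
    recalled = set()
    for segment in target_segments:
        for term in [segment, *synonyms.get(segment, [])]:
            recalled |= index.get(term, set())
    return sorted(recalled)

# Segmented commodity information; 1006 is a hypothetical extra commodity.
item_terms = {
    1001: ["summer", "new", "dress"],
    1002: ["dress"],
    1003: ["dress"],
    1004: ["summer", "t-shirt"],
    1005: ["summer"],
    1006: ["skirt"],
}
index = build_inverted_index(item_terms)

initial_set = recall_initial_set(index, ["summer", "dress"])
expanded_set = recall_initial_set(index, ["summer", "dress"],
                                  synonyms={"dress": ["skirt", "long skirt"]})
```

Without synonym expansion the query returns the five commodities listed in the text; with the hypothetical synonym table it additionally recalls commodity 1006 through "skirt".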
Step S5130, calculating the relevance score between each commodity and the target search statement based on the commodity information of each commodity in the initial commodity set, and screening the commodities in the initial commodity set according to the relevance score to obtain the partial commodities, thereby completing the coarse recall operation.
The commodity information of each commodity in the initial commodity set is acquired, including commodity titles, commodity descriptions, commodity feature labels, and the like. The relevance score between each commodity and the target search statement is calculated based on the target word segmentation set of the target search statement and the commodity information. The calculation comprehensively considers the occurrence frequency, position weight, and field weight of the target words in the commodity title, commodity description, and commodity feature labels. For example, the target word "summer" appears 1 time in the commodity title, 1 time in the commodity description, and 1 time in the commodity feature labels, for a total frequency of 3; the target word "new" appears 1 time in the commodity title, 0 times in the commodity description, and 1 time in the commodity feature labels, for a total frequency of 2. Then, the weighted score of each target word in the commodity information is calculated according to the preset field weights (for example, a title weight of 0.5, a description weight of 0.3, and a label weight of 0.2) and position weights (for example, a word nearer the front of the title has a higher position weight). For example, the target word "summer" has a weighted score of 1×0.5=0.5 in the commodity title, 1×0.3=0.3 in the commodity description, and 1×0.2=0.2 in the commodity feature labels, for a total weighted score of 0.5+0.3+0.2=1.0. Finally, the weighted scores of all the target words are added to obtain the relevance score between the commodity and the target search statement.
Then, based on the relevance score, either the commodities whose score is higher than a preset threshold are directly screened out as the partial commodities, or the commodities in the initial commodity set are sorted by score and a preset number of them are screened out as the partial commodities, completing the coarse recall operation. Through this step, the partial commodities highly relevant to the target search statement can be screened from the initial commodity set.
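The field-weighted score in this example can be reproduced with the sketch below. It uses the field weights from the text (0.5, 0.3, 0.2) and omits the position weight, since the text gives no formula for it. The commodity data, the 1.5 threshold, and the function names are assumptions; `select_partial` covers both screening options (threshold or a preset number).

```python
FIELD_WEIGHTS = {"title": 0.5, "description": 0.3, "tags": 0.2}

def relevance_score(target_segments, item_fields):
    """Sum of (occurrence count x field weight) over target words and fields."""
    score = 0.0
    for segment in target_segments:
        for field, weight in FIELD_WEIGHTS.items():
            score += item_fields.get(field, []).count(segment) * weight
    return score

def select_partial(scores, threshold=None, top_k=None):
    """Keep commodities above a threshold, or the top_k best, sorted by score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    if threshold is not None:
        ranked = [item_id for item_id in ranked if scores[item_id] > threshold]
    return ranked if top_k is None else ranked[:top_k]

# Hypothetical segmented commodity information.
items = {
    1001: {"title": ["summer", "new", "dress"],
           "description": ["summer", "fashion"],
           "tags": ["summer", "dress", "new"]},
    1004: {"title": ["summer", "t-shirt"],
           "description": ["breathable"],
           "tags": ["summer", "t-shirt"]},
}
summer_score = relevance_score(["summer"], items[1001])  # 0.5 + 0.3 + 0.2
scores = {item_id: relevance_score(["summer", "new"], fields)
          for item_id, fields in items.items()}
by_threshold = select_partial(scores, threshold=1.5)
by_count = select_partial(scores, top_k=1)
```

For commodity 1001 the word "summer" contributes 1.0 and "new" contributes 0.7, giving a relevance score of about 1.7; commodity 1004 only matches "summer" in the title and labels and scores about 0.7.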
In the embodiment, the word segmentation processing is performed on the target search statement input by the user, and the preset inverted index is combined, so that the initial commodity set matched with the target search statement can be recalled efficiently and accurately. According to the word segmentation processing and inverted index query method, the initial commodity set matched with the target search statement can be rapidly and accurately screened from a large number of candidate commodities, and the recall relevance and coverage rate can be improved through fuzzy query and synonym expansion, so that the matching accuracy is improved.
In a further embodiment, referring to fig. 6, based on a target search sentence input by a user and commodity information of a candidate commodity, performing a coarse recall operation on the candidate commodity, recall a part of commodities with commodity information matched with the target search sentence, including the following steps:
Step S5140, inputting the target word segmentation set of the target search statement, the commodity title, commodity description and commodity feature labels of the candidate commodities into a semantic similarity model, and calculating the semantic similarity score of the target search statement and each candidate commodity;
And acquiring a target word segmentation set of the target search statement, and commodity titles, commodity descriptions and commodity feature labels of the candidate commodities. And inputting the target word segmentation set and commodity information into the semantic similarity model. The semantic similarity model is a model based on a pre-training language model (such as BERT, sentence-BERT and the like), and can map texts to a high-dimensional vector space and measure the semantic similarity by calculating cosine similarity among vectors. Then, a similarity score between the semantic vector of the target search term and the semantic vector of each candidate item is calculated. The similarity score is typically calculated using cosine similarity. In order to improve the accuracy of the semantic similarity score, the semantic similarity model may also be optimized in combination with context information. For example, for the target search statement "light and thin skirt fit for summer wear", the semantic similarity model can identify "fit for summer wear" as a semantic unit and match it with "fit for summer wear" in the commodity description, thereby improving the similarity score. Through the method, the semantic similarity score of the target search statement and each candidate commodity can be accurately calculated.
Step S5150, calculating the comprehensive score of each candidate commodity by combining the historical click rate, purchase conversion rate, user evaluation score and preset weight coefficient of the candidate commodity based on the semantic similarity score;
The semantic similarity score, historical click rate, purchase conversion rate and user evaluation score of each candidate commodity are acquired, together with the preset weight coefficients, which may be set by a person of ordinary skill in the art based on experience or data analysis. Each index is then normalized so that its value lies between 0 and 1. Next, the weighted score of each index is calculated according to its preset weight coefficient. Finally, the weighted scores of all indexes are added to obtain the comprehensive score of the candidate commodity. In this way, multidimensional indexes such as semantic similarity, user behavior data and commodity evaluation are considered together, and the resulting comprehensive score provides comprehensive and accurate data support for the subsequent commodity screening.
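The normalization and weighted summation of step S5150 might look like the sketch below. The text only requires that each index be brought into the range 0 to 1; min-max normalization across the candidate set is an assumption made here, as are the function names and the dictionary layout.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1]; a constant list maps to all zeros (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(candidates, weights):
    """Weighted sum of normalized indexes for each candidate commodity.

    candidates: {commodity_id: {index_name: raw_value}}
    weights:    {index_name: preset weight coefficient}
    """
    ids = list(candidates)
    totals = {item_id: 0.0 for item_id in ids}
    for index_name, weight in weights.items():
        normalized = min_max_normalize([candidates[i][index_name] for i in ids])
        for item_id, value in zip(ids, normalized):
            totals[item_id] += weight * value
    return totals
```

With the index names used in the text, `weights` would carry one coefficient each for semantic similarity, historical click rate, purchase conversion rate and user evaluation score.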
Step S5160, screening out candidate commodities with the comprehensive score higher than a preset threshold value as the partial commodities, and completing the coarse recall operation.
The comprehensive scores of all candidate commodities and the preset threshold value are acquired, the comprehensive score of each candidate commodity is compared with the preset threshold value, and the candidate commodities whose comprehensive score is higher than the threshold value are screened out. The screened candidate commodities are taken as the partial commodities, which completes the coarse recall operation. Through this step, the partial commodities with high comprehensive scores are selected from the candidate commodities.
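The threshold screening of step S5160 reduces to a single filter. The sketch below assumes the comprehensive scores are held in a dictionary keyed by commodity ID; the name `coarse_recall` and the example threshold are hypothetical.

```python
def coarse_recall(scores, threshold):
    """Return the IDs of candidate commodities whose score is strictly above the threshold."""
    return [item_id for item_id, score in scores.items() if score > threshold]


# Hypothetical usage: only commodities scoring above 0.5 are recalled.
recalled = coarse_recall({"1001": 0.8, "1002": 0.4, "1003": 0.6}, threshold=0.5)
```

A commodity whose score equals the threshold is excluded here, following the wording "higher than a preset threshold value".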
In this embodiment, the semantic similarity score is combined with multidimensional indexes such as user behavior data (e.g., historical click rate, purchase conversion rate and user evaluation score) to calculate a comprehensive score for each candidate commodity, so that the partial commodities that closely match the target search statement can be screened out. Specifically, the semantic similarity model calculates the semantic similarity score between the target search statement and each candidate commodity, and the pre-trained language model (such as BERT or Sentence-BERT) captures the semantic association between the search statement and the commodity information, which improves the accuracy of semantic understanding. The semantic similarity score is then weighted together with behavior data of the candidate commodity, such as its historical click rate, purchase conversion rate and user evaluation score, to generate the comprehensive score. This multidimensional calculation accounts not only for the accuracy of semantic matching but also for user behavior feedback and commodity performance data, so the comprehensive score is more representative and practical. Finally, the candidate commodities whose comprehensive score is higher than the preset threshold value are screened out, completing the coarse recall operation. Combining semantic understanding with behavior data analysis improves the accuracy and relevance of commodity recall and provides high-quality data support for the subsequent fine ranking and recommendation, thereby improving the user's search experience and the purchase conversion rate.
In a further embodiment, referring to Fig. 7, inputting the target word segmentation of the target search sentence and the word segmentation corresponding to the commodity information of the partial commodities into a word weight model to obtain the weight values corresponding to the partial commodities, and performing hierarchical marking on the partial commodities based on the weight values, includes the following steps:
Step S5210, obtaining target word segmentation of the target search statement, and counting the search frequency and the last search time of the target word segmentation within a preset time range based on preset historical search log data;
The target word segmentation set of the target search sentence is acquired, and the search records related to the target word segmentation are extracted from the preset historical search log data. The historical search log data records all search behaviors of users within a preset time range (such as the past 30 days), including the search sentences, the search times and the clicks on search results. Based on the extracted search records, the search frequency of each target word within the preset time range is counted. The last search time of each target word is then extracted from the search records.
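The statistics gathered in step S5210 can be sketched as a single pass over the log. The log layout used here, a list of `(timestamp, segmented terms)` pairs, is an assumption for illustration, as are the function name `term_stats` and the 30-day default window taken from the example in the text.

```python
from datetime import datetime, timedelta


def term_stats(log, target_terms, now, window_days=30):
    """Count how often each target term was searched in the window and when it was last searched.

    log: iterable of (searched_at: datetime, terms: list of str), one entry per search.
    """
    cutoff = now - timedelta(days=window_days)
    stats = {t: {"frequency": 0, "last_search": None} for t in target_terms}
    for searched_at, terms in log:
        if searched_at < cutoff or searched_at > now:
            continue  # outside the preset time range
        for term in set(terms):  # count a term once per search
            if term in stats:
                entry = stats[term]
                entry["frequency"] += 1
                if entry["last_search"] is None or searched_at > entry["last_search"]:
                    entry["last_search"] = searched_at
    return stats
```

Terms that never appear in the window keep a frequency of 0 and a last search time of `None`, which a downstream weighting step would need to handle explicitly.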
Step S5220, inputting the target word, the search frequency, the search time and the word corresponding to the commodity information of the partial commodity into a word weight model to obtain a weight value corresponding to the partial commodity;
The target word segmentation set, the search frequency and latest search time of each target word, and the word segmentation set corresponding to the commodity information of the partial commodities are input into the word weight model. The word weight model is a machine-learning-based model that calculates the weight value of each commodity from the matching degree between the target words and the commodity information words, the search frequency of the target words and their latest search time. The word weight model first calculates the matching degree score between each target word and the commodity information words. The matching degree score is then weighted based on the search frequency and the latest search time of the target word: the higher the search frequency, the greater the importance of the target word and the higher the corresponding forward weighting value; the closer the latest search time, the stronger the timeliness of the target word and the lower the corresponding attenuation weighting value. The forward weighting value and the attenuation weighting value are added to obtain the comprehensive weighting value of the target word. Finally, the comprehensive weighting value is multiplied by the matching degree score, and the weighted matching degree scores of all target words are accumulated to obtain the weight value of the commodity. The matching degree between the target words and the commodity information, together with the importance and timeliness of the target words, is thus taken into account when calculating the weight values of the partial commodities, which provides data support for the subsequent hierarchical marking.
Step S5230, marking the hierarchy of the partial commodity according to the matching relation between the weight value and the preset hierarchy threshold value of each hierarchy.
The weight values of the partial commodities are acquired (for example, commodity ID 1001 has a weight value of 5.3, commodity ID 1002 has a weight value of 4.5, and commodity ID 1003 has a weight value of 3.8), together with the preset level thresholds (for example, a first level threshold of 5.0, a second level threshold of 4.0 and a third level threshold of 3.0). The weight value of each commodity is then compared with the preset level thresholds to determine the level to which the commodity belongs. For example, the weight value 5.3 of commodity ID 1001 is greater than the first level threshold 5.0, so it is marked as the first level; the weight value 4.5 of commodity ID 1002 lies between the second level threshold 4.0 and the first level threshold 5.0, so it is marked as the second level; and the weight value 3.8 of commodity ID 1003 lies between the third level threshold 3.0 and the second level threshold 4.0, so it is marked as the third level. Finally, the hierarchical marking result is stored in association with the unique commodity identifier. Through this step, the partial commodities are finely layered according to the matching relation between the weight values and the level thresholds.
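The level assignment of step S5230 can be sketched with the example thresholds given above (5.0, 4.0, 3.0). Two details are assumptions not stated in the text: a weight exactly equal to a threshold is placed in that threshold's level, and a weight below every threshold receives no level (`None`). The function name `assign_level` is hypothetical.

```python
def assign_level(weight, thresholds=(5.0, 4.0, 3.0)):
    """Return the 1-based level for a weight value, or None if it is below all thresholds.

    thresholds must be listed from the first (highest) level downward.
    """
    for level, threshold in enumerate(thresholds, start=1):
        if weight >= threshold:
            return level
    return None


# The worked example from the text: commodity ID -> weight value.
weights = {1001: 5.3, 1002: 4.5, 1003: 3.8}
levels = {item_id: assign_level(w) for item_id, w in weights.items()}
```

Storing `levels` keyed by commodity ID corresponds to storing the marking result in association with the unique commodity identifier.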
In this embodiment, the weight values of the partial commodities are calculated dynamically by combining the search frequency of the target words, their latest search time and their matching degree with the commodity information words. Calculating the weight value from these multidimensional data (search frequency, search time and matching degree) improves the accuracy and practicality of the commodity weight values, and thereby improves commodity recommendation accuracy and user satisfaction.
In a further embodiment, referring to fig. 8, inputting the target word, the search frequency, the search time, and the word corresponding to the commodity information of the partial commodity into a word weight model to obtain a weight value corresponding to the partial commodity, including:
Step S5221, invoking a word weight model to calculate the matching degree of the target word and the word corresponding to the commodity information, and obtaining a corresponding matching degree score;
The target word segmentation set and the word segmentation set corresponding to the commodity information of the partial commodities are input into the word weight model. The word weight model generates the matching degree score by calculating the semantic similarity or the exact matching degree between each target word and the commodity information words.
Step S5222, determining a forward weighting value of the matching degree score corresponding to the target word segment based on the search frequency and a preset first mapping relation;
The search frequency of the target word and the preset first mapping relation are acquired. The first mapping relation defines the correspondence between the search frequency and the forward weighting value, and is typically expressed as a piecewise function or a linear function. For example, the forward weighting value is 1.0 when the search frequency is 0 to 500 times per month, 1.2 when it is 501 to 1000 times per month, and 1.5 when it is 1001 times per month or more. The corresponding forward weighting value is then determined from the search frequency of the target word and the first mapping relation. Through this step, the importance of the target word is adjusted dynamically according to its search frequency.
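The piecewise example of the first mapping relation translates directly into code. The breakpoints and values below are the ones given in the text; the function name `forward_weight` and the rejection of negative frequencies are assumptions.

```python
def forward_weight(searches_per_month):
    """Map a monthly search frequency to a forward weighting value (piecewise example)."""
    if searches_per_month < 0:
        raise ValueError("search frequency cannot be negative")
    if searches_per_month <= 500:
        return 1.0
    if searches_per_month <= 1000:
        return 1.2
    return 1.5
```

A linear mapping, which the text also allows, would replace the three branches with a single expression in the frequency.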
Step S5223, determining an attenuation weighted value of the matching degree score corresponding to the target word segment based on the search time and a preset second mapping relation;
The latest search time of the target word and the preset second mapping relation are acquired. The second mapping relation defines the correspondence between the search time and the attenuation weighting value, and is typically expressed as a piecewise function or an exponential decay function. The corresponding attenuation weighting value is then determined from the latest search time of the target word and the second mapping relation.
Step S5224, adding the forward weighted value and the attenuation weighted value to obtain a comprehensive weighted value of the target word;
The forward weighted value and the attenuation weighted value are added to obtain the comprehensive weighted value of the target word. The comprehensive weighted value thus reflects both the search frequency and the timeliness of the target word, and provides data support for the subsequent weight value calculation.
Step S5225, multiplying the comprehensive weighted value by the matching degree score of the corresponding target word, and accumulating the weighted matching degree scores of all the target words obtained by multiplication to obtain the final weight value of the partial commodity.
The comprehensive weighted value of each target word and its corresponding matching degree score are acquired. The comprehensive weighted value of each target word is multiplied by its matching degree score to obtain a weighted matching degree score. The weighted matching degree scores of all target words are then accumulated to obtain the final weight value of the commodity. Through this step, the importance, timeliness and matching degree of the target words are considered together in the final weight value of each of the partial commodities, which provides data support for the subsequent hierarchical marking.
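Steps S5224 and S5225 amount to computing, for one commodity, the sum over target words of (forward weighted value + attenuation weighted value) × matching degree score. A minimal sketch follows; the per-word dictionary layout and the name `final_weight` are assumptions, and the three inputs per word are taken as already computed by the earlier steps.

```python
def final_weight(term_features):
    """Accumulate weighted matching degree scores over all target words for one commodity.

    term_features: list of dicts with keys "forward", "decay" and "match",
    one dict per target word.
    """
    total = 0.0
    for features in term_features:
        combined = features["forward"] + features["decay"]  # step S5224
        total += combined * features["match"]               # step S5225
    return total


# Hypothetical values for a commodity matched against two target words.
example = final_weight([
    {"forward": 1.5, "decay": 0.5, "match": 0.5},
    {"forward": 1.0, "decay": 0.25, "match": 1.0},
])
```

Here the first word contributes 2.0 × 0.5 and the second 1.25 × 1.0, so `example` is 2.25.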
In this embodiment, calculating the weight value from multidimensional data (matching degree, search frequency and search time) captures the semantic association between the target words and the commodity information while dynamically adjusting the importance of each target word. This improves the accuracy and practicality of the weight value and provides high-quality data support for the subsequent hierarchical marking and fine ranking, thereby improving commodity recommendation accuracy and user satisfaction.
Referring to Fig. 9, a commodity searching apparatus provided in accordance with one of the objects of the present application is a functional implementation of the commodity searching method of the present application. The apparatus includes a commodity coarse recall module 5100, a hierarchical marking module 5200 and an ordered commodity list forming module 5300. The commodity coarse recall module 5100 is configured to perform a coarse recall operation on candidate commodities based on a target search sentence input by a user and commodity information of the candidate commodities, and to recall partial commodities whose commodity information matches the target search sentence. The hierarchical marking module 5200 is configured to input the target word segmentation of the target search sentence and the word segmentation corresponding to the commodity information of the partial commodities into a word weight model to obtain weight values corresponding to the partial commodities, and to perform hierarchical marking on the partial commodities based on the weight values. The ordered commodity list forming module 5300 is configured to, for the commodities in each level, input the user portrait feature labels of the user and the commodity feature labels of the commodities of the different levels into a click-through rate model, calculate the click probability of the commodities in each level, and sort the commodities in each level according to the click probability to form a final ordered commodity list that is fed back to the user.
In a further embodiment, the apparatus further includes, operating before the commodity coarse recall module 5100: a sample set acquisition sub-module configured to acquire a training sample set, where the training sample set includes a plurality of user historical behavior samples and their corresponding supervision labels, each user historical behavior sample includes a user portrait feature label of a user and a commodity feature label of a candidate commodity, and the supervision label characterizes whether the user clicked on the corresponding candidate commodity; and a model training sub-module configured to train a click-through rate model to a convergence state using the training sample set, so that the model learns the ability to predict the click probability of the user on a candidate commodity from the user portrait feature label and the commodity feature label of the candidate commodity.
In a further embodiment, the sample set acquisition sub-module includes: a behavior data acquisition sub-module configured to acquire, based on a unique user identifier, the behavior data of the user at each independent station of the e-commerce platform and to determine the corresponding user portrait data, where the behavior data includes purchase behavior data, purchasing behavior data and click behavior data; and a portrait feature label determination sub-module configured to input the user portrait data and a preset general label set into a pre-trained label definition model, determine the user portrait feature labels of the corresponding user, and store the user portrait feature labels in association with the unique user identifier, where the general label set includes a plurality of user portrait feature labels.
In a further embodiment, the commodity coarse recall module 5100 includes: a word segmentation processing sub-module configured to perform word segmentation on the target search sentence input by the user to obtain a plurality of target words that form a target word segmentation set; a commodity set determination sub-module configured to determine, based on the target word segmentation set and a preset inverted index, an initial commodity set whose entries include at least the target words, where the inverted index is mapping relation data, generated by performing word segmentation on the commodity titles, commodity descriptions and commodity feature labels of the candidate commodities, between each entry and the unique identifiers of the commodities that contain the entry; and a relevance score calculation sub-module configured to calculate a relevance score between each commodity and the target search sentence based on the commodity information of each commodity in the initial commodity set, and to screen the commodities in the initial commodity set according to the relevance score to obtain the partial commodities, completing the coarse recall operation.
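The inverted index used by the commodity set determination sub-module can be sketched as a mapping from each entry to the set of commodity IDs containing it. The sketch assumes the commodity text has already been segmented into words, and it takes the initial commodity set to be every commodity that contains at least one target word; that reading, like the function names, is an assumption.

```python
from collections import defaultdict


def build_inverted_index(item_terms):
    """Map each entry (word) to the set of commodity IDs whose text contains it.

    item_terms: {commodity_id: list of words from title, description and feature labels}
    """
    index = defaultdict(set)
    for item_id, terms in item_terms.items():
        for term in terms:
            index[term].add(item_id)
    return index


def initial_item_set(index, target_terms):
    """Collect every commodity that contains at least one of the target words."""
    result = set()
    for term in target_terms:
        result |= index.get(term, set())
    return result
```

Requiring all target words instead would replace the union with an intersection of the per-word sets.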
In a further embodiment, the commodity coarse recall module 5100 includes a similarity score calculation sub-module for inputting a target word segmentation set of the target search sentence, commodity titles, commodity descriptions and commodity feature labels of the candidate commodities into a semantic similarity model, calculating semantic similarity scores of the target search sentence and each candidate commodity, a comprehensive score calculation sub-module for calculating a comprehensive score of each candidate commodity by combining a historical click rate, purchase conversion rate, user evaluation score and a preset weight coefficient of the candidate commodity based on the semantic similarity scores, and a commodity coarse recall sub-module for screening out candidate commodities with the comprehensive scores higher than a preset threshold as the partial commodities, thereby completing coarse recall operation.
In a further embodiment, the hierarchical marking module 5200 includes a target word segment obtaining sub-module configured to obtain a target word segment of the target search statement, and count a search frequency and a last search time of the target word segment within a preset time range based on preset historical search log data, a weight value determining sub-module configured to input the target word segment, the search frequency, the search time and a word segment corresponding to commodity information of the partial commodity into a word weight model to obtain a weight value corresponding to the partial commodity, and a hierarchical marking sub-module configured to perform hierarchical marking on the partial commodity according to a matching relationship between the weight value and a hierarchical threshold of each preset hierarchy.
In a further embodiment, the weight value determining sub-module includes a matching degree score calculating sub-module, a forward weighted value determining sub-module, an attenuation weighted value determining sub-module, a comprehensive weighted value determining sub-module and a final weight value determining sub-module. The matching degree score calculating sub-module is configured to invoke a word weight model to calculate the matching degree between the target word and the word corresponding to the commodity information to obtain a corresponding matching degree score. The forward weighted value determining sub-module is configured to determine a forward weighted value of the matching degree score corresponding to the target word based on the search frequency and a preset first mapping relation. The attenuation weighted value determining sub-module is configured to determine an attenuation weighted value of the matching degree score corresponding to the target word based on the search time and a preset second mapping relation. The comprehensive weighted value determining sub-module is configured to add the forward weighted value and the attenuation weighted value to obtain a comprehensive weighted value of the target word. The final weight value determining sub-module is configured to multiply the comprehensive weighted value by the matching degree score of the corresponding target word, and to accumulate the weighted matching degree scores of all target words obtained by the multiplication to obtain the final weight value of the partial commodity.
In order to solve the above technical problems, an embodiment of the application also provides a computer device, whose internal structure is shown schematically in Fig. 10. The computer device includes a processor, a computer readable storage medium, a memory and a network interface connected by a system bus. The computer readable storage medium of the computer device stores an operating system, a database and computer readable instructions; the database can store a control information sequence, and the computer readable instructions, when executed by the processor, cause the processor to implement a commodity searching method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer readable instructions that, when executed by the processor, cause the processor to perform the commodity searching method of the application. The network interface of the computer device is used for connecting to and communicating with a terminal. It will be appreciated by those skilled in the art that the structure shown in Fig. 10 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution may be applied; a particular computer device may include more or fewer components than shown, may combine some of the components, or may have a different arrangement of components.
The processor in this embodiment is configured to execute the specific functions of each module and its sub-modules in Fig. 9, and the memory stores the program codes and various types of data required for executing those modules or sub-modules. The network interface is used for data transmission with a user terminal or a server. The memory in this embodiment stores the program codes and data necessary for executing all modules and sub-modules of the commodity searching apparatus of the present application, and the server can call these program codes and data to execute the functions of all sub-modules.
The present application also provides a storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the merchandise search method of any one of the embodiments of the present application.
Those skilled in the art will appreciate that all or part of the processes of the methods of the above embodiments of the present application may be implemented by a computer program instructing the relevant hardware, where the computer program may be stored on a computer readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a computer readable storage medium such as a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM) or a random access memory (Random Access Memory, RAM).
Those skilled in the art will appreciate that the various operations, methods, steps in flows, measures and schemes discussed in the present application may be alternated, altered, combined or deleted. Further, other steps, measures and schemes in the various operations, methods and flows discussed in the present application may also be alternated, altered, rearranged, decomposed, combined or deleted. Further, steps, measures and schemes in the prior art that have the various operations, methods and flows disclosed in the present application may also be alternated, altered, rearranged, decomposed, combined or deleted.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are intended to fall within the scope of the present application.

Claims (10)

Application Number: CN202510176006.2A
Priority Date: 2025-02-17
Filing Date: 2025-02-17
Title: Commodity search method and its device, equipment and medium
Status: Pending
Publication: CN120106939A (en)

Priority Applications (1)

Application Number: CN202510176006.2A
Publication: CN120106939A (en)
Priority Date: 2025-02-17
Filing Date: 2025-02-17
Title: Commodity search method and its device, equipment and medium

Applications Claiming Priority (1)

Application Number: CN202510176006.2A
Publication: CN120106939A (en)
Priority Date: 2025-02-17
Filing Date: 2025-02-17
Title: Commodity search method and its device, equipment and medium

Publications (1)

Publication Number: CN120106939A (en)
Publication Date: 2025-06-06

Family

ID=95873072

Family Applications (1)

Application Number: CN202510176006.2A
Status: Pending
Publication: CN120106939A (en)
Priority Date: 2025-02-17
Filing Date: 2025-02-17
Title: Commodity search method and its device, equipment and medium

Country Status (1)

Country: CN (1)
Link: CN120106939A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number: CN120471650A (en) *
Priority date: 2025-07-11
Publication date: 2025-08-12
Assignee: 吉林交通职业技术学院
Title: Electronic commerce data processing method and system

Legal Events

Code: PB01
Title: Publication

Code: SE01
Title: Entry into force of request for substantive examination
