
Model prompt word automatic optimization method, device, equipment and storage medium

Info

Publication number
CN119226476A
CN119226476A
Authority
CN
China
Prior art keywords
prompt
word
words
feedback
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411363585.3A
Other languages
Chinese (zh)
Inventor
凌天东
王健宗
程宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202411363585.3A
Publication of CN119226476A
Legal status: Pending

Abstract

The invention relates to an automatic model prompt word optimization method, which comprises the steps of obtaining an initial prompt word and a corresponding training data set, wherein the data set comprises input data and corresponding real labels. The initial prompt word and the input data are then input into the target model together to generate a corresponding predicted label. On this basis, the target model is guided by a feedback prompt word to generate language prompt feedback by comparing the predicted label with the real label. The feedback is used to identify potential defects in the initial prompt word, the initial prompt word is optimized based on those defects, and a target prompt word is finally generated. The invention not only reduces the tedious process of manually adjusting prompt words, but also achieves gradual improvement of the prompt words through a feedback mechanism, thereby ensuring high accuracy and consistency of model output.

Description

Automatic optimization method, device, equipment and storage medium for model prompt words
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for automatically optimizing a model prompt word.
Background
Large Language Models (LLMs) have demonstrated a powerful capability as generic agents in a variety of tasks and can handle complex tasks such as natural language understanding, generation, and reasoning. However, the performance of an LLM is largely dependent on manually written prompts, which often require extensive trial and error to find the best form. This laborious process is not only time consuming, but also limits the application of the model in a wider range of scenarios.
Some existing techniques attempt to improve the effect of prompts in a variety of ways. For example, the generation of prompts is automatically optimized using an auxiliary model or differentiable (soft) prompts, thereby reducing the need for manual intervention. However, these approaches generally assume that the internal state variables of the LLM are accessible, meaning that they require in-depth knowledge and control of the underlying mechanisms of the model. This requirement is difficult to meet in many practical applications, especially when using a pre-trained black-box model.
In addition, reinforcement learning or feedback mechanisms based on the LLM itself have also been studied to optimize prompts. These algorithms increase the effectiveness of the prompts by performing discrete operations on them. However, such methods typically require low-level access to the LLM and may generate unintelligible outputs. Furthermore, some algorithms rely on undirected Monte Carlo search, which, while able to explore a larger search space, may waste significant computing resources due to the lack of explicit directionality and may not necessarily find an optimal prompt.
Thus, current techniques face challenges in hint optimization processes such as reliance on model internal mechanisms, complexity in generating output, and computational inefficiency.
Disclosure of Invention
The invention mainly aims to provide a method, a device, equipment and a storage medium for automatically optimizing model prompt words, which aim to realize automatic optimization of prompt words and to improve the efficiency and accuracy of prompt word generation.
In order to achieve the above object, the present invention provides a method for automatically optimizing model hint words, comprising:
Acquiring an initial prompt word and a training data set, wherein the training data set comprises input data and corresponding real labels;
inputting the initial prompt word and the input data into a target model together to obtain a prediction tag;
Guiding the target model to generate language prompt feedback based on the prediction tag and the real tag by using a feedback prompt word, wherein the language prompt feedback is used for identifying defects in the initial prompt word;
And optimizing the initial prompt word based on the generated language prompt feedback to obtain a target prompt word.
In one embodiment, after optimizing the initial prompt word based on the generated language prompt feedback to obtain a target prompt word, the method further comprises:
generating a plurality of candidate prompt words based on the target prompt words through a prompt word expansion method, wherein the prompt word expansion method comprises a random sampling method and/or a genetic algorithm;
constructing a candidate prompting word set according to the generated plurality of candidate prompting words, and selecting and eliminating the candidate prompting words in the candidate prompting word set through multiple iterations;
Randomly sampling a candidate subset from the candidate set of the prompting words, calculating the score of each candidate prompting word in the candidate subset by using a metric function, and eliminating the candidate prompting word with the lowest score from the candidate set of the prompting words;
And until only the last candidate prompt word remains in the prompt word candidate set after multiple iterations, obtaining the optimized target prompt word.
In one embodiment, after optimizing the initial prompt word based on the generated language prompt feedback to obtain a target prompt word, the method further comprises:
setting a preset selection number and a preset iteration round number, setting a beam set and initializing, wherein the initialized beam set takes the target prompt word as an initial value;
Performing iteration of a preset iteration round number on the beam set, initializing an empty candidate prompt word set in each iteration round, expanding each prompt word of the beam set, generating a plurality of candidate prompt words, adding the candidate prompt words into the candidate prompt word set, scoring each candidate prompt word in the candidate prompt word set by using a metric function, and selecting the candidate prompt words with the highest scores and the preset selection number to be added into the beam set to form the beam set of the next iteration round;
And when the iteration of the preset iteration round number is completed, selecting the candidate prompt word with the highest score from the beam set of the last round to obtain the optimized target prompt word.
In one embodiment, expanding each cue word of the beam set, generating a plurality of candidate cue words and adding the candidate cue words to the candidate cue word set includes:
Randomly replacing keywords of each prompt word in the beam set from a preset vocabulary, and generating a plurality of diversified prompt words for each prompt word;
Randomly transforming sentence structures of diversified prompt words, including changing words or introducing clauses to obtain variant prompt words;
And adding or adjusting the background description of the variant prompt word according to the context information of the task corresponding to the target prompt word, generating a plurality of candidate prompt words, and adding the candidate prompt words into a candidate prompt word set.
In one embodiment, before using the feedback cue word to guide the target model to generate language cue feedback based on the predictive tag and the real tag, further comprising:
Collecting user demand data and historical performance data of the target model;
determining task requirements to be met by the target model, wherein the task requirements comprise specific output forms;
and generating the feedback prompt word according to the user demand data, the historical performance data and the task requirement of the target model.
In one embodiment, before using the feedback cue word to guide the target model to generate language cue feedback based on the predictive tag and the real tag, further comprising:
analyzing the semantics and the structure of the initial prompt word, and generating a preliminary feedback prompt word based on the initial prompt word in a semantic expansion and structure adjustment mode;
Introducing a group of guide sentences for the preliminary feedback prompt words, so that the preliminary feedback prompt words can guide a model to generate feedback information, and the guide feedback prompt words are obtained;
And carrying out multi-round test on the guided feedback prompt word by using the target model, collecting feedback information output by the target model, and adjusting the guided feedback prompt word according to the feedback information to obtain a final feedback prompt word.
In one embodiment, optimizing the initial prompt word based on the generated language prompt feedback to obtain a target prompt word includes:
identifying problems in the language hint feedback and determining specific optimization objectives including improving semantic accuracy, enhancing contextual relevance, or improving logical structure;
selecting a target optimization strategy from an optimization strategy library according to the specific optimization target, and optimizing the initial prompt words according to the target optimization strategy to obtain a plurality of undetermined prompt words;
and scoring the plurality of undetermined prompting words by using the target model and the measurement function, and selecting the undetermined prompting word with the highest score as the target prompting word.
Further, in order to achieve the above object, the present invention also provides an automatic optimizing device for model prompt words, including:
the data acquisition module acquires an initial prompt word and a training data set, wherein the training data set comprises input data and corresponding real labels;
The model prediction module inputs the initial prompt word and the input data into a target model together to obtain a prediction tag;
The feedback generation module is used for guiding the target model to generate language prompt feedback based on the prediction tag and the real tag by using a feedback prompt word, and the language prompt feedback is used for identifying defects in the initial prompt word;
and the prompt word optimization module optimizes the initial prompt word based on the generated language prompt feedback to obtain a target prompt word.
Further, in order to achieve the above object, the present invention also provides an automatic model hint word optimizing apparatus, which includes a memory, a processor, and an automatic model hint word optimizing program stored in the memory and executable on the processor, wherein the automatic model hint word optimizing program when executed by the processor implements the steps of the automatic model hint word optimizing method as described above.
Further, in order to achieve the above object, the present invention also provides a computer storage medium, on which a model prompt word automatic optimization program is stored, which when executed by a processor, implements the steps of the model prompt word automatic optimization method as described above.
The method has the beneficial effects described below. The method obtains an initial prompt word and a corresponding training data set, wherein the data set comprises input data and corresponding real labels. The initial prompt word and the input data are then input into the target model together to generate a corresponding predicted label. On this basis, the target model is guided by a feedback prompt word to generate language prompt feedback by comparing the predicted label with the real label. The feedback is used to identify potential defects in the initial prompt word, the initial prompt word is optimized based on those defects, and a target prompt word is finally generated. The invention not only reduces the tedious process of manually adjusting prompt words, but also achieves gradual improvement of the prompt words through a feedback mechanism, thereby ensuring high accuracy and consistency of model output.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a flow chart of an embodiment of an automatic optimization method for model hint words according to the present invention;
FIG. 2 is a schematic diagram of functional modules of a preferred embodiment of the automatic model prompt word optimizing apparatus of the present invention;
FIG. 3 is a schematic diagram of a hardware operating environment of a device according to an embodiment of the present invention.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It should be noted that Large Language Models (LLMs) have demonstrated a strong capability as generic agents across a variety of tasks and are capable of handling complex tasks such as natural language understanding, generation, and reasoning. However, the performance of an LLM is largely dependent on manually written prompts, which often require extensive trial and error to find the best form. This laborious process is not only time consuming, but also limits the application of the model in a wider range of scenarios.
Some existing techniques attempt to improve the effect of prompts in a variety of ways. For example, the generation of prompts is automatically optimized using an auxiliary model or differentiable (soft) prompts, thereby reducing the need for manual intervention. However, these approaches generally assume that the internal state variables of the LLM are accessible, meaning that they require in-depth knowledge and control of the underlying mechanisms of the model. This requirement is difficult to meet in many practical applications, especially when using a pre-trained black-box model.
In addition, reinforcement learning or feedback mechanisms based on the LLM itself have also been studied to optimize prompts. These algorithms increase the effectiveness of the prompts by performing discrete operations on them. However, such methods typically require low-level access to the LLM and may generate unintelligible outputs. Furthermore, some algorithms rely on undirected Monte Carlo search, which, while able to explore a larger search space, may waste significant computing resources due to the lack of explicit directionality and may not necessarily find an optimal prompt.
Thus, current techniques face challenges in hint optimization processes such as reliance on model internal mechanisms, complexity in generating output, and computational inefficiency.
Referring to fig. 1, fig. 1 is a flowchart illustrating an embodiment of a method for automatically optimizing model hint words according to the present invention. It should be noted that although a logical order is depicted in the flowchart, in some cases the steps depicted or described may be performed in a different order than presented herein.
As shown in FIG. 1, the automatic optimization method of the model prompt words provided by the invention comprises the following steps:
S10, acquiring an initial prompt word and a training data set, wherein the training data set comprises input data and corresponding real labels;
In this embodiment, the initial prompt is the starting point for the model prompt auto-optimization process. The initial prompt may be a standardized input provided by human design, automatically generated, or based on a pre-trained model. The initial prompt is intended to guide a Large Language Model (LLM) to generate a relevant output whose quality and accuracy directly affect the effect of the subsequent steps.
The training data set contains input data and corresponding real labels. The input data may be natural language text, structured data, or other types of data, with the real tag being the desired output corresponding to the input data. The quality of the training data set determines the learning effect of the model, and the learning effect is directly related to the optimizing effect of the prompt words.
In one embodiment, when the initial prompt is obtained, the system may call a library of pre-designed prompt templates, which cover common language task types such as question answering, translation, and summarization. A suitable prompt word is selected from the template library as the initial prompt word according to the specific requirements of the task.
The training dataset may consist of pairs of user inputs in the historical task and model-generated outputs. The real tags may be generated manually or automatically by a pre-trained model. The system will update the training data set periodically to ensure timeliness and coverage of the data.
In another embodiment, the initial prompt may be entered manually by the user, particularly in situations where a highly customized prompt is desired. The user writes a suitable initial prompt word according to the specific requirements of the current task, and the system directly uses this prompt word for subsequent optimization.
The training dataset may include cross-domain multimodal data, such as a text-to-image combined dataset. The real tags include not only natural language, but also descriptive or categorical tags for the image. Such a multimodal training dataset may enhance the generalization ability of the model.
In other embodiments, the system automatically generates the initial prompt word by analyzing the user's historical behavior. Through analysis of the user's past interaction data with the system, the language expression patterns preferred by the user are identified, so that an initial prompt word is automatically generated that better matches the user's habits.
Training data sets may be dynamically acquired from a number of data sources, such as from Internet public data sets, corporate internal data, and user generated content. The real labels are generated by combining multiple rounds of manual auditing and automatic machine labeling, so that the accuracy and the diversity of the data are ensured.
The initial prompt word and the training data set are automatically acquired, so that the development process is simplified, manual intervention is reduced, the efficiency and accuracy of prompt word optimization are improved, the learning capacity and adaptability of the model are enhanced, and the performance of the model and the stability of the system are improved.
S20, inputting the initial prompt word and the input data into a target model together to obtain a prediction tag;
In this embodiment, the initial prompt is a text for guiding the model to generate an output, which is combined with input data (e.g., a natural language sentence or context provided by the user) and submitted as input to the target model. The role of the initial prompt is to provide context or task description for the model, thereby helping the model to generate a more predictable output.
The object model is typically a Large Language Model (LLM), such as GPT, capable of generating predictive labels from input text data. Predictive labels are the results that a model generates based on an initial prompt and input data, typically an inference, answer, or classification label to the input data.
After receiving the initial prompt word and the input data, the model outputs one or more predictive labels through a language understanding and generating mechanism inside the model. These tags may be natural language sentences, classification results, or other types of model output, depending on the nature of the task.
In one embodiment, the initial prompt is a query description or question description provided by the user, and the input data is context information or text segments associated with the query. The target model combines the two to generate a prediction label, and the prediction label can be an answer to a user question or a classification result.
In another embodiment, the initial prompt is automatically generated by the system, containing instructions or targets of the task. The input data is real-time data from the user, such as input text content or speech transcription. The target model generates a predictive tag, such as a result of emotion analysis or a topic tag, based on the hint word and the input data.
In other embodiments, such as in the context of multimodal data, the initial prompt may be used in conjunction with text, image, or audio data. The input data contains multiple types of inputs (e.g., text and images) and the object model combines these input data to generate predictive labels, such as image descriptions or classification results.
By combining the initial prompt word with the input data and inputting the target model, the model can better understand task requirements and generate more accurate and relevant prediction labels. The process improves the accuracy and adaptability of model output, simplifies the interactive experience of users, and remarkably improves the performance of the whole system.
S30, guiding the target model to generate language prompt feedback based on the prediction tag and the real tag by using a feedback prompt word, wherein the language prompt feedback is used for identifying defects in the initial prompt word;
in this embodiment, the feedback prompt is an optimized or specifically designed input for guiding the target model to generate feedback information. These hints facilitate the generation of targeted feedback by directing the model to particular input and output relationships.
The target model generates a predictive tag after processing the initial prompt and the input data. By comparing these predictive labels with the corresponding real labels, the model generated linguistic prompt feedback may reveal possible defects or shortfalls in the initial prompt word. Such feedback is typically in the form of natural language, indicating portions of the model that may be misinterpreted or mishandled.
The goal of the language hint feedback is to help identify defects in the initial hint word that may lead to erroneous or inaccurate predictions. By analyzing the feedback, the system can find problems in the prompt words, such as ambiguous semantics, insufficient contexts or inconsistent logic, and the like, thereby providing basis for subsequent optimization.
In one embodiment, feedback cues are automatically generated by the system to guide the model to certain key input features or output patterns. The generated linguistic cue feedback will specify differences between the predicted tag and the actual tag and identify the cause of these differences in the initial cue word that may be caused.
In another embodiment, the user may manually write feedback cues to generate feedback based on experience or a specific demand guidance model. For example, in a text classification task, a user may use a prompt containing certain keywords to determine the impact of those keywords in the classification result. Feedback generated by the model will indicate from these hints which vocabularies or structures may lead to classification errors.
In other embodiments, the system may perform multiple iterations during the prompt word optimization process. Multiple iterations means that, in each cycle of prompt optimization, the system continually generates and uses different feedback prompt words to guide the model in generating feedback. In each iteration, the system uses different feedback prompt words to trigger the model to generate different language prompt feedback. By analyzing the feedback of each round, the system can identify a variety of problems in the initial prompt word. For example, a first iteration may use a prompt word focusing on contextual relevance, and a second iteration may use another prompt word focusing on semantic accuracy. Through multiple rounds of such iterations, the system gradually optimizes the initial prompt word and finally generates a more accurate target prompt word.
Or in each iteration of the alert word optimization, the system dynamically generates a feedback alert word. And according to the feedback information in the previous iteration, the system adjusts and generates a new feedback prompt word, and continuously guides the model to perform feedback generation of the next iteration. The goal of each iteration round is to gradually improve the prompt word by the feedback of the previous round, so that the prompt word is more and more close to the optimal state. This iterative process continues until the system generates a final, fully optimized target cue word.
By using the feedback prompt word guide model to generate language prompt feedback, the defects in the initial prompt word can be accurately identified, so that a clear direction is provided for prompt word optimization, the optimization efficiency is improved, and the interpretation and performance of the model are enhanced.
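As a concrete illustration of this step, the following Python sketch shows one way the feedback prompt word could be assembled and sent to the target model. The call_llm helper, the prompt wording, and the data layout are assumptions introduced for illustration, not the patented implementation.
```python
# A minimal sketch of step S30, assuming a generic call_llm(text) helper that
# wraps whatever LLM API is actually used; all names here are illustrative.
def generate_language_feedback(call_llm, initial_prompt, examples):
    """Ask the target model to critique the initial prompt.

    examples: list of (input_text, true_label, predicted_label) triples
    collected in step S20.
    """
    mistakes = [
        f"Input: {x}\nExpected: {y}\nModel output: {p}"
        for x, y, p in examples
        if p != y
    ]
    feedback_prompt = (
        "You are given a task prompt and cases where it produced wrong answers.\n"
        f"Task prompt:\n{initial_prompt}\n\n"
        "Failure cases:\n" + "\n---\n".join(mistakes) + "\n\n"
        "Describe, in natural language, which parts of the task prompt are "
        "ambiguous, missing context, or logically inconsistent."
    )
    return call_llm(feedback_prompt)  # natural-language critique of the prompt
```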
And S40, optimizing the initial prompt word based on the generated language prompt feedback to obtain a target prompt word.
In this embodiment, the language hint feedback is generated by the target model, based on the difference between the predicted tag and the real tag, guided by the feedback hint word. These feedback indicate problems in the initial prompt, such as semantic ambiguity, logical inconsistencies, or context inconsistencies.
And analyzing and identifying defects in the initial prompt words by the system according to the generated language prompt feedback, and further adjusting and improving the defects. The optimization process may involve modification of the semantics, structure, or content of the hint words to ensure that the hint words more effectively guide the model to generate accurate output.
And finally generating the target prompt word by the system through the optimized initial prompt word. The target prompt word is subjected to repeated iteration and feedback optimization, so that the original defects are removed, the requirement of the model can be well met, and the output quality of the model is improved.
In one embodiment, after generating the language prompt feedback, the system first analyzes the feedback content to determine the main problem in the initial prompt word. For example, the feedback may indicate that certain words of the prompt are not sufficiently accurate or are misleading. Based on these feedback, the system fine adjusts the initial cue words, possibly optimizing the cue words by replacing words, adjusting sentence structure, etc. The optimized cue words are further tested to ensure that the improved cue words can better guide the model to generate correct output.
In another embodiment, after feedback generation, the system introduces an automated optimization strategy, parses the feedback through natural language processing algorithms, and automatically generates a set of candidate optimization schemes. These schemes may include increasing context information of the cue words, refining semantic directionality of the cue words, or simplifying the expression structure of the cue words. The system finally selects the optimal scheme to generate the target prompt word through evaluating and screening the candidate schemes.
In other embodiments, the system gradually optimizes the initial prompt word in multiple iterations using previously generated language prompt feedback. After each round of optimization, the system inputs the newly generated prompt words into the model again for testing, and collects new feedback. Through the feedback-optimization process of round by round, the system continuously improves the quality of the prompt words, and the finally generated target prompt words can guide the model to generate high-quality output under various conditions.
Through the guidance of language prompt feedback, the system can accurately optimize the initial prompt word, improve the output quality of the model, and ensure that the generated target prompt word shows higher adaptability and accuracy in various tasks and contexts.
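A minimal sketch of how steps S20-S40 could be chained into an iterative optimization loop is given below. Here call_llm, generate_feedback and metric are placeholders for the target model call, the feedback-generation step and the metric function, and the acceptance rule (keep a revision only if it scores at least as well) is an assumption rather than the patented procedure.
```python
# Illustrative outer loop for S20-S40; all helpers are assumptions.
def optimize_prompt_loop(call_llm, generate_feedback, prompt, dataset, metric, rounds=3):
    for _ in range(rounds):
        # S20: run the target model with the current prompt on the training data
        preds = [(x, y, call_llm(f"{prompt}\n{x}")) for x, y in dataset]
        # S30: language feedback identifying defects in the current prompt
        feedback = generate_feedback(call_llm, prompt, preds)
        # S40: ask the model to rewrite the prompt according to the feedback
        revision_request = (
            f"Current prompt:\n{prompt}\n\nCritique:\n{feedback}\n\n"
            "Rewrite the prompt so that it fixes the problems above. "
            "Return only the new prompt."
        )
        candidate = call_llm(revision_request)
        if metric(candidate, dataset) >= metric(prompt, dataset):
            prompt = candidate                 # keep the revision only if it helps
    return prompt
```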
The invention relates to an automatic model prompt word optimization method, which comprises the steps of obtaining an initial prompt word and a corresponding training data set, wherein the data set comprises input data and corresponding real labels. The initial prompt word and the input data are then input into the target model together to generate a corresponding predicted label. On this basis, the target model is guided by a feedback prompt word to generate language prompt feedback by comparing the predicted label with the real label. The feedback is used to identify potential defects in the initial prompt word, the initial prompt word is optimized based on those defects, and a target prompt word is finally generated. The invention not only reduces the tedious process of manually adjusting prompt words, but also achieves gradual improvement of the prompt words through a feedback mechanism, thereby ensuring high accuracy and consistency of model output.
In one embodiment, after S40, the method further includes:
S401, generating a plurality of candidate prompt words based on the target prompt words through a prompt word expansion method, wherein the prompt word expansion method comprises a random sampling method and/or a genetic algorithm;
s402, constructing a candidate set of cue words according to the generated plurality of candidate cue words, and selecting and eliminating the candidate cue words in the candidate set of cue words through multiple iterations;
S403, randomly sampling a candidate subset from the candidate set of the cue words during each iteration, calculating the score of each candidate cue word in the candidate subset by using a metric function, and eliminating the candidate cue word with the lowest score from the candidate set of the cue words;
s404, until the candidate set of the prompt words only leaves the last candidate prompt word after a plurality of iterations, obtaining the optimized target prompt word.
In this embodiment, after optimizing the initial prompt word and generating the target prompt word, the system further generates a plurality of candidate prompt words by the prompt word expansion method. These expansion methods may include techniques such as random sampling and genetic algorithms. Random sampling can generate candidate prompt words with diversity and randomness, while the genetic algorithm gradually optimizes the quality of the candidate prompt words by simulating biological evolution.
Through the generated plurality of candidate prompt words, the system constructs a candidate set of prompt words. This candidate set contains a plurality of different versions of the cue words, each of which is likely to be the final optimization target cue word.
In each iteration, the system randomly samples a subset from the candidate set, evaluating each candidate prompt term therein. These cue words are scored using a metric function, and candidate cue words with the lowest scores are eliminated from the candidate set. The process is repeated for a plurality of rounds, and the optimal prompt words are screened step by step.
The metric function is an index or criterion for evaluating candidate prompt word performance, including, but not limited to, accuracy, recall, BLEU score, ROUGE score, and semantic similarity.
After multiple iterations, only the last prompt word in the candidate set is left, and the prompt word is regarded as an optimized target prompt word subjected to comprehensive optimization, so that the method has the highest quality and applicability.
In one embodiment, the system first generates a plurality of candidate prompt words with diversified features based on the target prompt word by a random sampling method. Then, after constructing a candidate set of prompt words, the system randomly extracts a portion of the candidate prompt words from the candidate set in each iteration and scores them using a preset metric function. The candidate prompt word with the lowest score is eliminated in each iteration until only one optimal prompt word is left.
In another embodiment, using a genetic algorithm, the system generates a plurality of candidate hints words based on the target hints word. The genetic algorithm simulates natural selection and generates new candidate prompt words through crossover and mutation. And (3) carrying out multiple iterations on the candidate prompting word set, wherein a metric function is used for evaluating and eliminating the candidate prompting word in each iteration, and finally, keeping the optimal candidate prompting word as a final target prompting word.
In other embodiments, the system combines two methods, random sampling and genetic algorithm, to generate candidate hint words and construct a candidate set. Each round is randomly sampled from the candidate set and evaluated through multiple rounds of iteration. The use of the metric function ensures that the worst cue words in each iteration are eliminated, and the cue words with the highest quality are finally reserved as final optimization targets.
According to the method and the system, through the expansion of the prompt words and multiple rounds of iterative selection, the prompt words can be effectively screened and optimized by the system, and the target prompt words with the highest quality can be generated. The method ensures the diversity and applicability of the prompt words and obviously improves the precision and stability of the model.
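The elimination procedure of S401-S404 can be pictured with the short sketch below; expand() and score() stand in for whatever expansion method (random sampling, genetic algorithm) and metric function are actually used, so the names and sizes are illustrative assumptions only.
```python
import random

# Sketch of S401-S404: expand the target prompt into candidates, then
# repeatedly score a random subset and drop the weakest candidate.
def select_by_elimination(target_prompt, expand, score, n_candidates=16, subset_size=4):
    candidates = expand(target_prompt, n_candidates)   # random sampling / genetic
    while len(candidates) > 1:
        subset = random.sample(candidates, min(subset_size, len(candidates)))
        worst = min(subset, key=score)                  # e.g. accuracy, BLEU, ROUGE
        candidates.remove(worst)
    return candidates[0]
```
Sampling a subset in each round keeps the number of metric evaluations bounded while still eliminating one weak candidate per iteration.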
In one embodiment, after S40, the method further includes:
s405, setting a preset selection number and a preset iteration round number, setting a beam set and initializing, wherein the initialized beam set takes the target prompt word as an initial value;
S406, performing iteration of a preset iteration round number on the beam set, initializing an empty candidate prompt word set in each iteration round, expanding each prompt word of the beam set, generating a plurality of candidate prompt words, adding the candidate prompt words into the candidate prompt word set, scoring each candidate prompt word in the candidate prompt word set by using a metric function, and selecting the candidate prompt words with the highest scores and the preset selection number to be added into the beam set to form the beam set of the next iteration round;
S407, when the iteration of the preset iteration round number has been executed, selecting the candidate prompt word with the highest score from the beam set of the last round to obtain the optimized target prompt word.
In this embodiment, the preset selection number refers to the number of the candidate hint words reserved from the candidate hint word set after each iteration. This number determines which hints will continue to be optimized for the next iteration. The preset iteration round number refers to the iteration number to be performed in the whole optimization process. This parameter determines the depth of optimization, and the cue words are progressively screened and optimized through multiple iterations.
A beam set is a set that retains a number of candidate hinting terms in each round of iteration. The initialized beam set takes the target prompt word as an initial value, which means that the optimization process starts from the prompt word which is optimized preliminarily.
In each iteration, the system initializes an empty candidate prompt word set and expands each prompt word in the beam set to generate a plurality of candidate prompt words. These candidate hint words will be added to the set of candidate hint words.
Each prompt word in the candidate prompt word set is scored using a metric function. The metric function may include accuracy, recall, F1 score, etc. These scores are used to determine the quality of the prompt words. The prompt words with the highest scores, up to the preset selection number, are added to the beam set and form the basis of the next iteration.
After the preset number of iteration rounds, the candidate prompt word with the highest score is finally selected from the beam set of the last round as the final optimized target prompt word.
In one embodiment, the system first sets a predetermined number of selections and iteration rounds. Assuming a number of 5 and a number of iteration rounds of 10, the system will keep 5 hint words in each round of iteration for a total of 10 rounds of iterative optimization.
The initial value of the beam set is the target cue word. In each iteration, the system performs semantic expansion on each prompt word, generates candidate prompt words, and scores the candidate prompt words by using the accuracy and recall as a measurement function.
After each iteration is finished, the system keeps 5 prompt words with the highest scores, and the next iteration is carried out. After the iteration is finished, the prompt word with the highest score is selected from the last round to be used as the final optimization target prompt word.
In another embodiment, the system may use different metric functions, such as F1 score or BLEU score, to capture the performance of the cue words in a particular task. In each iteration, besides semantic expansion, the system can also generate more diversified prompt words by combining a random sampling method.
In each iteration, the system further evaluates the generalization capability of the prompt words through cross-validation, so that the finally selected prompt words are ensured to be excellent in performance in the current task and can adapt to possible future scenes.
In other embodiments, the system may employ genetic algorithms in each iteration to generate a new round of candidate hints by selection, crossover, and mutation operations. Through genetic algorithm, the system can explore the possibility of more cue words, and gradually optimize the quality of the cue words by combining the scores of the metric function.
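For orientation, a bare-bones version of the beam-search procedure in S405-S407 might look like the following; expand_one() and score() are placeholders for the expansion step and the metric function, and the default beam width of 5 and 10 rounds simply mirror the example above.
```python
# Sketch of the beam-search variant (S405-S407). beam_width is the "preset
# selection number" and rounds the "preset iteration round number".
def beam_search_prompts(target_prompt, expand_one, score, beam_width=5, rounds=10):
    beam = [target_prompt]                     # initialized with the target prompt
    for _ in range(rounds):
        candidates = []
        for prompt in beam:
            candidates.extend(expand_one(prompt))   # keyword swaps, clauses, context
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]         # keep the highest-scoring prompts
    return max(beam, key=score)                # best prompt of the final beam
```
A wider beam or more rounds trades extra metric evaluations for broader coverage of the prompt space.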
In addition, the target prompt word can be further optimized to obtain an optimized prompt word based on the UCB Bandits algorithm:
A prompt word set is initialized, containing a plurality of candidate prompt words generated from the target prompt word through slight modification or expansion. Multiple rounds of iterative selection are then carried out; in each round, the UCB Bandits algorithm selects a plurality of prompt words from the prompt word set as the candidate prompt words of that round.
Each candidate prompt word is scored based on its performance on the test data set, such as accuracy, recall, or F1 score. According to the scoring result, the UCB Bandits algorithm preferentially selects the prompt words with higher scores for further testing and optimization in the next round.
And updating the prompt word set according to the feedback result of the UCB Bandits algorithm, reserving the prompt words with excellent performance, and generating new variant prompt words for further testing.
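The UCB-based selection described here could be sketched roughly as below; evaluate() is assumed to return a score on a sampled batch of test data, and the bookkeeping, budget, and exploration constant are illustrative assumptions rather than the patented method.
```python
import math

# Rough sketch of a UCB bandit rule for deciding which candidate prompt to
# evaluate next on a batch of test data.
def ucb_select(candidates, counts, totals, t, c=1.0):
    def ucb(p):
        if counts[p] == 0:
            return float("inf")                      # try each prompt at least once
        mean = totals[p] / counts[p]
        return mean + c * math.sqrt(math.log(t) / counts[p])
    return max(candidates, key=ucb)

def run_ucb(candidates, evaluate, budget=100):
    counts = {p: 0 for p in candidates}
    totals = {p: 0.0 for p in candidates}
    for t in range(1, budget + 1):
        p = ucb_select(candidates, counts, totals, t)
        r = evaluate(p)                              # score on a random test batch
        counts[p] += 1
        totals[p] += r
    return max(candidates, key=lambda p: totals[p] / max(counts[p], 1))
```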
Further optimization of the target prompt word to obtain the optimized prompt word can also be based on the Successive Rejects algorithm:
Initializing a candidate prompting word set, applying Successive Rejects algorithm to the preliminarily screened prompting word set, and carrying out more strict screening on each prompting word.
And after each iteration, scoring the candidate prompt words according to the performance of the target model on the test data set, and eliminating the prompt words with the worst performance.
The Successive Rejects algorithm gradually reduces the number of candidate prompt words through multiple iterations, and finally the optimal prompt words are reserved.
After the last iteration, the rest prompt words are finally optimized prompt words, and the prompt words are subjected to multiple rounds of screening and optimization, so that the method has higher adaptability and effectiveness.
According to the method and the system, the system can effectively screen and optimize the prompt words by setting the beam set and performing multi-round iterative optimization, and the generated target prompt words have higher quality and adaptability. According to the method, the precision of the prompt word is continuously improved through multiple iterations, and the fact that the final prompt word can better guide the model to accurately and stably output is ensured.
In one embodiment, in S406, expanding each hint word of the beam set to generate a plurality of candidate hint words and adding the candidate hint words to the candidate hint word set includes:
S4061, randomly replacing the key words of each prompting word in the beam set from a preset vocabulary, and generating a plurality of diversified prompting words for each prompting word;
S4062, randomly transforming sentence structures of diversified prompt words, including changing words or introducing clauses to obtain variant prompt words;
S4063, adding or adjusting the background description of the variant prompt word according to the context information of the task corresponding to the target prompt word, generating a plurality of candidate prompt words, and adding the candidate prompt words into a candidate prompt word set.
In this embodiment, the system semantically and structurally expands each of the hint words in the set of beams to generate a plurality of candidate hint words. These extensions can be achieved in several ways:
Random keyword replacement from a preset vocabulary: the system uses a preset vocabulary to randomly replace keywords in the prompt words of the beam set. In this way, the system can generate a plurality of prompt words with diverse characteristics. This step increases the diversity of candidate prompt words by introducing vocabulary variation, thereby expanding the search space of the prompt words.
Random transformation of the sentence structure: after generating the diversified prompt words, the system further randomly transforms their syntactic structure. This includes changing word order in the prompt words, introducing clauses, etc., so that the generated variant prompt words differ structurally from the original prompt words, thereby enhancing their semantic diversity.
Adjusting the background description: the system adds or adjusts the background description of the prompt word according to the context information of the task corresponding to the target prompt word. This step further enhances the applicability and accuracy of the prompt words by adding additional context information or adjusting the existing description, generating candidate prompt words that are more relevant to the task.
In one embodiment, in the beam set, the system first applies a replacement rule in a preset vocabulary to each prompt word, randomly selects and replaces keywords, and generates a set of diversified candidate prompt words. The system then performs syntactic transformations on these hint words, for example changing active sentences into passive sentences, or introducing more clauses. Finally, the system adjusts the background description of the prompt words according to the context of the specific task, so that the prompt words are more fit with task requirements.
In another embodiment, the system generates preliminary diversified prompt words and then syntactically adjusts them using natural language processing techniques. This may include automatically identifying and adjusting sentence structure using a dependency parsing tool, generating variant hint words that conform to specific grammar rules. The adjustment of the background description is completed through a context awareness model, so that the prompt word can reflect the specific requirements of the task more accurately.
According to the method and the system, the prompting words in the beam set are expanded, so that a large number of diversified candidate prompting words can be generated by the system. Through keyword replacement, syntactic structure transformation and background description adjustment, the system can explore a wider prompting word space, and the finally selected prompting words are ensured to have higher applicability and accuracy under various tasks and scenes. The method remarkably improves the flexibility of the prompt words and the stability of model output.
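One possible shape of the expansion step S4061-S4063 is sketched below; the synonym table, the structural transformation, and the context string are stand-ins introduced for illustration, since a real system might instead use an LLM or dependency-parsing tooling for these operations.
```python
import random

# Illustrative expand_one() for S4061-S4063: keyword replacement, structural
# variation, and background/context adjustment. All data here is assumed.
SYNONYMS = {"summarize": ["condense", "briefly restate"], "list": ["enumerate"]}

def expand_one(prompt, task_context="", n_variants=4):
    variants = []
    for _ in range(n_variants):
        text = prompt
        for word, alts in SYNONYMS.items():            # S4061: keyword replacement
            if word in text and random.random() < 0.5:
                text = text.replace(word, random.choice(alts))
        if random.random() < 0.5:                      # S4062: structural variation
            text = f"Before answering, note the constraints. {text}"
        if task_context:                               # S4063: background description
            text = f"{text}\nContext: {task_context}"
        variants.append(text)
    return variants
```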
In one embodiment, in S30, before the target model is guided to generate the language hint feedback based on the predictive tag and the real tag by using the feedback hint word, the method further includes:
s301, collecting user demand data and historical performance data of the target model;
s302, determining task requirements to be met by the target model, wherein the task requirements comprise specific output forms;
S303, generating the feedback prompt word according to the user demand data, the historical performance data and the task requirement of the target model.
In this embodiment, the system needs to collect various data related to the object model. Such data includes user demand and desire for the model (e.g., task type, output preferences), and model past performance data (e.g., model performance over different tasks, user feedback, etc.). These data are used to accurately understand the behavior of the model in different tasks and to help generate feedback cues appropriate for the current task.
Based on the collected user requirements and historical performance data, the system will analyze and determine the specific task requirements that the target model needs to meet in the current task. These requirements may include a particular output form (e.g., format of answer, language style) or a particular functional requirement (e.g., high accuracy, quick response, etc.). Determining these requirements helps to generate more accurate and demand-compliant feedback cues.
Based on the results collected and analyzed above, the system generates one or more feedback cue words. These hints are tailored to the specific needs and historical performance of the target model, and can more effectively guide the model to generate output content that meets expectations.
In one embodiment, the system first collects user demand data and historical performance data of the model via the interface. The user demand data may include user expectations for model output, such as simplicity, formatting requirements, and the like. The historical performance data may then include accuracy of the model in past tasks, user feedback scores, and the like.
After the system analyzes the data, the task requirements of the target model in the current task are identified, such as whether short answers need to be generated or specific terminology should be used. Based on these requirements, the system generates a set of feedback prompt words, such as "please answer with a short sentence" or "please describe using technical terms".
Finally, these feedback cues may be used to guide the target model, ensuring that the generated output meets the user's needs and best performance of the model.
In another embodiment, in a chat robot application, the system may periodically collect user interaction data with the robot, including the type of questions posed by the user, the language style used, and the past response time and accuracy of the robot. These data will help the system determine that the robot needs to respond quickly in the current task and use informal language.
The system generates feedback cues accordingly, such as "answer the following questions quickly" or "please talk with informal mood". These prompt words guide the model in subsequent processing to ensure that the generated answers can meet the immediate needs and mood preferences of the user.
According to the embodiment, the system can generate more accurate feedback prompt words by collecting and analyzing the user demand data and the historical performance data of the target model. These hints help guide the model to generate an expected output in a particular task, improving the response quality and user satisfaction of the model. The method ensures that the generated prompt words are not only suitable for the current task, but also can be optimized according to the past performance of the model, thereby improving the overall efficiency and user experience of the model.
In one embodiment, in S30, before the target model is guided to generate the language hint feedback based on the predictive tag and the real tag by using the feedback hint word, the method further includes:
s304, analyzing the semantics and the structure of the initial prompt word, and generating a preliminary feedback prompt word based on the initial prompt word in a semantic expansion and structure adjustment mode;
S305, introducing a group of guide sentences for the preliminary feedback prompt words, so that the preliminary feedback prompt words can guide a model to generate feedback information, and obtaining guide type feedback prompt words;
S306, carrying out multi-round testing on the guide type feedback prompt words by using the target model, collecting feedback information output by the target model, and adjusting the guide type feedback prompt words according to the feedback information to obtain final feedback prompt words.
In this embodiment, the system first analyzes the initial prompt word in detail, and understands its semantic content and syntax structure. This step aims at identifying the core meaning of the initial prompt word and the structural weakness or imperfection that may exist.
On the basis of analysis, the system performs semantic expansion and structure adjustment. Semantic expansion may include introducing synonyms or related concepts to enhance the expressive power of hints. Structural adjustments may involve sentence-based transformations, introducing clauses, or other syntactic adjustments to generate preliminary feedback cues.
The system introduces a set of guide sentences for the generated preliminary feedback prompt. The role of these guide sentences is to enable the feedback prompt to explicitly guide the model to generate the required feedback information. The design of the guide statement is based on the expected output form of the model to ensure accuracy and relevance of the feedback information.
The system uses the target model to carry out multiple rounds of testing on the generated guide type feedback prompt words. And through repeated testing, the system collects feedback information generated by the target model. Such feedback information is used to evaluate the validity of the feedback cue words.
And the system adjusts the guiding type feedback prompt words according to the feedback information collected in the multiple rounds of testing. The content of the adjustment may include semantic refinement, structural optimization or improvement of the guide statement, and finally feedback prompt words capable of effectively guiding the model are generated.
In one embodiment, in a natural language processing task, the system first analyzes the vocabulary and the syntactic structure of the initial prompt word to identify its core semantic units. The system then semantically expands the synonyms, related concepts, or expanded context information by introducing them, while syntactically adjusting. The generated preliminary feedback cue words may include multiple versions of different expressions or grammatical structures.
The system designs a set of guide sentences for these preliminary feedback cues, such as "please generate relevant feedback from the following cues" or "describe potential deficiencies of the cues" to ensure that the model is able to generate useful feedback information.
The target model is used for carrying out multiple rounds of tests on the guide type feedback prompt words, and useful feedback information is collected by comparing the quality and the relativity of the model output.
Finally, according to the collected feedback, the system optimizes the prompt words and generates final feedback prompt words. These hints can effectively guide the model, and perform better in actual tasks.
In another embodiment, during optimization of a chat robot, the system analyzes the semantics and structure of the initial prompt word and finds that it lacks sufficient contextual information. The system generates a group of preliminary feedback prompt words by expanding the background description of the prompt words and introducing clauses or additional information.
The system introduces guide statements such as "why" or "how" for each feedback prompt word, so that the model can provide a deeper analysis when generating feedback information.
The system gradually optimizes the prompt words by repeatedly using the target model to test the prompt words, so that the system can effectively trigger the model to generate useful feedback information and adjust the feedback information according to feedback.
Finally, the system generates a group of feedback prompt words capable of maximizing the output quality of the model, and the response quality of the chat robot is remarkably improved.
According to the embodiment, the semantics and the structure of the initial prompt word are analyzed, the feedback prompt word is generated by combining semantic expansion and structure adjustment, and the system can generate the prompt word with better guidance and pertinence. Through multiple rounds of testing and feedback optimization, the finally generated feedback prompt words not only improve the response quality of the model, but also effectively solve the defects of the initial prompt words and ensure the efficient performance of the model under different tasks and scenes.
In one embodiment, the step S40 includes:
S4001, identifying problems in the language prompt feedback and determining specific optimization targets, wherein the specific optimization targets comprise improving semantic accuracy, enhancing context relevance or improving logic structure;
S4002, selecting a target optimization strategy from an optimization strategy library according to the specific optimization target, and optimizing the initial prompt words according to the target optimization strategy to obtain a plurality of undetermined prompt words;
S4003, scoring the plurality of undetermined prompting words by using the target model and the measurement function, and selecting the undetermined prompting word with the highest score as the target prompting word.
In this embodiment, the system first analyzes the generated language prompt feedback to identify the problems of the initial prompt word reflected in the feedback. These problems may relate to semantic accuracy, contextual relevance, or logical structure. Based on the analysis results, the system determines a specific optimization target, such as improving semantic accuracy, enhancing context relevance, or improving the logical structure of the prompt word.
The system then selects an appropriate optimization strategy from a preset optimization strategy library according to the determined optimization target. The optimization strategy library contains multiple optimization methods for different types of problems, such as vocabulary replacement strategies for semantic problems and syntax adjustment strategies for context relevance problems.
The system applies the selected optimization strategy to adjust and optimize the initial prompt word and generates multiple versions of undetermined prompt words. These undetermined prompt words are produced by different optimization strategies to increase diversity and the selection space.
The system tests the generated undetermined prompt words using the target model and scores the performance of each prompt word with a metric function. The metric function may combine multiple indicators such as accuracy, relevance, and fluency. The system selects the undetermined prompt word with the highest score as the final target prompt word.
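The following Python sketch illustrates one possible realization of steps S4001 to S4003 under stated assumptions: the keyword-based diagnosis in diagnose, the strategy names and rewrite rules in STRATEGY_LIBRARY, and the averaged metric are illustrative placeholders rather than the disclosed implementation.

```python
# Illustrative sketch of steps S4001-S4003; strategy names, the keyword-based
# diagnosis, and the scoring scheme are assumptions, not the disclosed method.
from typing import Callable, Dict, List, Tuple

STRATEGY_LIBRARY: Dict[str, List[Callable[[str], str]]] = {
    "semantic_accuracy": [
        lambda p: p.replace("thing", "entity"),          # vocabulary replacement
        lambda p: p + " Use unambiguous terminology.",
    ],
    "context_relevance": [
        lambda p: "Given the task context, " + p,        # context / syntax adjustment
    ],
    "logical_structure": [
        lambda p: "Step by step: " + p,                  # syntactic reconstruction
    ],
}

def diagnose(feedback: str) -> str:
    """S4001: map language prompt feedback to an optimization target."""
    text = feedback.lower()
    if "ambiguous" in text or "vague" in text:
        return "semantic_accuracy"
    if "context" in text:
        return "context_relevance"
    return "logical_structure"

def optimize_prompt(
    initial_prompt: str,
    feedback: str,
    query_model: Callable[[str, str], str],    # (prompt, input_data) -> prediction
    metric: Callable[[str, str], float],       # (prediction, true_label) -> score
    dataset: List[Tuple[str, str]],            # (input_data, true_label) pairs
) -> str:
    target = diagnose(feedback)                                          # S4001
    candidates = [s(initial_prompt) for s in STRATEGY_LIBRARY[target]]   # S4002

    def avg_score(prompt: str) -> float:                                 # S4003
        return sum(metric(query_model(prompt, x), y) for x, y in dataset) / len(dataset)

    return max(candidates, key=avg_score)
```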
In one embodiment, when processing a specific task, the system recognizes, by analyzing the language prompt feedback, that the initial prompt word suffers from inaccurate semantics. The system takes improving semantic accuracy as the optimization target, selects a related vocabulary replacement strategy and a context enhancement strategy from the optimization strategy library, optimizes the initial prompt word, and generates a plurality of undetermined prompt words.
The system then tests these undetermined prompt words using the target model and scores them for accuracy and relevance. Finally, the system selects the prompt word with the highest score as the target prompt word.
In another embodiment, in a dialogue system, the system finds, by analyzing the language prompt feedback, that the logical structure of the initial prompt word is problematic. The system takes improving the logical structure as the optimization target and selects a syntactic reconstruction strategy from the optimization strategy library. The system reconstructs the initial prompt word and generates a plurality of undetermined prompt words with more rigorous logical structures.
The undetermined prompt words are evaluated by the target model, and the system finally selects the best-performing prompt word as the target prompt word according to the accuracy and fluency scores.
In this way, by identifying the problems reflected in the language prompt feedback and applying optimization strategies in a targeted manner, the initial prompt word can be effectively improved, and the finally generated target prompt word is more accurate and better meets the task requirements. Through multiple rounds of evaluation and optimization, the system can generate an optimal target prompt word, thereby significantly improving the overall performance and output quality of the model.
The invention also provides a model prompt word automatic optimizing device. Referring to fig. 2, fig. 2 is a schematic diagram of the functional modules of a preferred embodiment of the model prompt word automatic optimizing device. The model prompt word automatic optimizing device comprises:
the data acquisition module is used for acquiring an initial prompt word and a training data set, wherein the training data set comprises input data and corresponding real labels;
the model prediction module is used for inputting the initial prompt word and the input data into a target model together to obtain a prediction tag;
The feedback generation module is used for guiding the target model to generate language prompt feedback based on the prediction tag and the real tag by using a feedback prompt word, and the language prompt feedback is used for identifying defects in the initial prompt word;
And the prompt word optimization module is used for optimizing the initial prompt word based on the generated language prompt feedback to obtain a target prompt word.
The specific implementation manner of the automatic model prompt word optimizing device is basically the same as that of each embodiment of the automatic model prompt word optimizing method, and is not repeated here.
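For orientation, a minimal sketch of how the four modules above could be arranged around a black-box target model is given below; the class name PromptOptimizer, the method names, the toy sentiment-classification data, and the single-pass revision rule are assumptions made for illustration only and do not represent the actual implementation.

```python
# Illustrative module arrangement for the optimization apparatus; all names
# and the toy data below are assumptions chosen to mirror the four modules.
from typing import Callable, List, Tuple

class PromptOptimizer:
    def __init__(self, target_model: Callable[[str, str], str],
                 feedback_prompt: str, metric: Callable[[str, str], float]):
        self.target_model = target_model        # black-box target model
        self.feedback_prompt = feedback_prompt  # guides feedback generation
        self.metric = metric

    def acquire_data(self) -> Tuple[str, List[Tuple[str, str]]]:
        """Data acquisition module: initial prompt + (input, true label) pairs."""
        return "Classify the sentiment of the text.", [("great movie", "positive")]

    def predict(self, prompt: str, dataset: List[Tuple[str, str]]) -> List[str]:
        """Model prediction module: run the target model on each input."""
        return [self.target_model(prompt, x) for x, _ in dataset]

    def generate_feedback(self, prompt: str, predictions: List[str],
                          dataset: List[Tuple[str, str]]) -> str:
        """Feedback generation module: ask the model to critique the prompt."""
        errors = [(x, y, p) for (x, y), p in zip(dataset, predictions) if p != y]
        return self.target_model(self.feedback_prompt,
                                 f"prompt={prompt}; errors={errors}")

    def optimize(self) -> str:
        """Prompt word optimization module: one feedback-driven revision pass."""
        prompt, dataset = self.acquire_data()
        feedback = self.generate_feedback(prompt, self.predict(prompt, dataset), dataset)
        revised = prompt + " " + feedback.strip()  # placeholder revision rule
        old = sum(self.metric(p, y)
                  for p, (_, y) in zip(self.predict(prompt, dataset), dataset))
        new = sum(self.metric(p, y)
                  for p, (_, y) in zip(self.predict(revised, dataset), dataset))
        return revised if new >= old else prompt
```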
The present invention also provides a model prompt word automatic optimizing apparatus, which, as shown in fig. 3, may include a processor 1001 (such as a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a magnetic disk memory. The memory 1005 may optionally also be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the hardware configuration of the model prompt word automatic optimizing device shown in fig. 3 does not constitute a limitation of the model prompt word automatic optimizing device, which may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
As shown in fig. 3, the memory 1005, as one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a model prompt word automatic optimizing program. The operating system is a program for managing and controlling the hardware and software resources of the model prompt word automatic optimizing device and supports the operation of the network communication module, the user interface module, the model prompt word automatic optimizing program, and other programs or software; the network communication module is used for managing and controlling the network interface 1004, and the user interface module is used for managing and controlling the user interface 1003.
In the hardware structure of the model prompt word automatic optimizing device shown in fig. 3, the network interface 1004 is mainly used for connecting to a background server and performing data communication with the background server, the user interface 1003 is mainly used for connecting to a client and performing data communication with the client, and the processor 1001 may call the model prompt word automatic optimizing program stored in the memory 1005 and perform the operations of the model prompt word automatic optimizing method described above.
The specific implementation mode of the automatic model prompt word optimizing device is basically the same as the above-mentioned embodiments of the automatic model prompt word optimizing method, and will not be repeated here.
In addition, the embodiment of the invention also provides a computer storage medium, and the computer storage medium is stored with a model prompt word automatic optimizing program, so that the steps of the model prompt word automatic optimizing method are realized when the model prompt word automatic optimizing program is executed by a processor.
The specific implementation manner of the computer storage medium is basically the same as that of each embodiment of the automatic optimization method of the model prompt word, and is not repeated here.
While the embodiments of the present invention have been described above with reference to the drawings, the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive. Many modifications may be made by those of ordinary skill in the art without departing from the spirit of the present invention and the scope of the appended claims; any equivalent structure or equivalent process transformation made using the description and drawings of the present invention, or any direct or indirect application in other related technical fields, likewise falls within the protection scope of the present invention.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatuses, methods, and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or acts, or by a combination of special-purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist alone, or two or more modules may be integrated to form an independent part. If the functions are implemented in the form of software functional modules and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present application. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
It is noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a(n) ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above description covers merely specific embodiments of the present application, and the protection scope of the present application is not limited thereto. Any variation or substitution readily conceivable by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It should be noted that, where a software tool or component not belonging to the applicant appears in the embodiments of the present application, it is presented merely by way of example and does not represent actual use.

Claims (10)
