wherein the weight matrix of the hidden layer is W_h, the bias matrix is b_h, the output value is α_h, and the output value of the activation function is h_h. The output value z of the neural network is obtained by:

z = h_h W_o + b_o,

wherein the weight matrix of the output layer is W_o, the bias matrix is b_o, and the output value is z.
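The forward computation described by these formulas can be sketched in plain Python as follows. This is only an illustrative sketch: it assumes a single hidden layer that receives the input vector x directly and uses the ReLU activation named in the claims, and the helper names (`relu`, `affine`, `forward`) and all numeric values are hypothetical.

```python
def relu(values):
    """Element-wise ReLU: f_ReLU(t) = max(0, t)."""
    return [max(0.0, v) for v in values]


def affine(x, weights, bias):
    """Row vector x (length m) times weights (m x n), plus bias (length n)."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def forward(x, W_h, b_h, W_o, b_o):
    """Hidden layer followed by output layer (assumed single hidden layer)."""
    alpha_h = affine(x, W_h, b_h)   # pre-activation of the hidden layer
    h_h = relu(alpha_h)             # h_h = f_ReLU(alpha_h)
    return affine(h_h, W_o, b_o)    # z = h_h W_o + b_o


# Toy values chosen only for illustration (not from the patent).
x = [1.0, 2.0]
W_h = [[1.0, 0.0, -1.0], [0.5, 1.0, -1.0]]
b_h = [0.0, 1.0, 0.5]
W_o = [[1.0, -1.0], [2.0, 0.0], [0.0, 3.0]]
b_o = [0.1, 0.2]
z = forward(x, W_h, b_h, W_o, b_o)
```

In this toy case the third hidden unit has a negative pre-activation and is set to zero by ReLU, so it contributes nothing to z.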

Step 9 comprises: the construction of the semantic network for control instructions is based on frame semantics (reference: Antoniou G., Groth P., van Harmelen F., Hoekstra R. A Semantic Web Primer (Third Edition) [M]. Cambridge: The MIT Press, 2012: 1-288.), whose underlying theory is the case grammar of the American linguist Fillmore. Defining verb cases in the semantic network allows the relationship between an argument and a verb in a control instruction to be described at a finer grain. The cases generally include the agent case, the source case, the location case, the end-point case, and so on. Because the semantic relation between a control-intention verb and its different argument words is, in most cases, a one-way mapping in control instructions, the semantic network is constructed from experience, and it comprises control-intention verbs, argument words and semantic association triple knowledge;

The invention also comprises step 12, constructing a structured template:

The structured template comprises two parts: (1) basic control term phrases; (2) verb-argument-case triples. Because a control instruction may contain several verbs, the second part consists of the verb-argument relation triples, ordered by the appearance of the verbs. A structured template with these two parts is established, and the deconstructed control instruction information is filled into the template to form a structured instruction.

The method can be applied to the semantic understanding of control instructions in an air traffic control system. Because a control instruction issued by a controller in actual work may contain several verbs, determining which arguments are associated with which verb, and what the relationship between them is, matters for understanding the instruction accurately. The method parses the control instruction into a structured instruction, and supports the processing chain from speech recognition of the control instruction to movement trend prediction based on its content.

Advantageous effects: the invention has the following technical effects:

1. The computer can autonomously understand the semantics of a control instruction and determine the motion process of the aircraft.

2. Semantic role labeling is performed for each of the multiple verbs appearing in a control instruction.

3. Basic control term information appearing in the control instruction is extracted.

4. Unstructured control instructions are converted into structured control instructions.

Drawings

The foregoing and other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.

FIG. 1 shows the neural-network-based process for multi-verb role classification of control instructions and structured instruction extraction.

FIG. 2 is the main flowchart of the method for extracting a structured control instruction.

FIG. 3 shows the structure of the neural network.

FIG. 4 shows a generated structured control instruction template.

Detailed Description

The invention is described in further detail below with reference to the drawings and an example control instruction: CES1234, East Tower, taxi along taxiways d5-p4-a5 to the runway 18 Left waiting point, wait outside runway 18 Left.

For ease of illustration and description, the implementation steps are divided according to the main flow diagrams shown in FIG. 1 and FIG. 2.

Step 1: Part-of-speech analysis

This step comprises two processes: Chinese word segmentation and part-of-speech tagging. The control instruction is segmented and tagged with the jieba word segmentation package in Python. Adding the basic control terms to the segmentation dictionary improves the precision of the segmentation result.

Step 2: Extraction of the component words

In the previous step, the words of the sentence are tagged with their parts of speech: "East Tower" is tagged "sp", and "taxi to" and "wait" are tagged "v". Based on its part of speech, "East Tower" is identified as a basic control term and extracted. The remaining control instruction becomes: CES1234, taxi along taxiways d5-p4-a5 to the runway 18 Left waiting point, wait outside runway 18 Left, which is closer to natural language. The verbs "taxi to" and "wait" are obtained at the same time, and the other component words in the sentence are extracted as candidate arguments.
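The split described in this step can be sketched as a simple pass over (word, tag) pairs. The sketch below assumes the tagger output is already available as a Python list; the function name `split_components` is hypothetical, and the tags other than "sp" and "v" are assumed for illustration only.

```python
def split_components(tagged_tokens, basic_term_tags=("sp",), verb_tags=("v",)):
    """Split tagged words into basic control terms, verbs and candidate arguments."""
    basic_terms, verbs, candidates = [], [], []
    for word, tag in tagged_tokens:
        if tag in basic_term_tags:
            basic_terms.append(word)      # e.g. the called unit
        elif tag in verb_tags:
            verbs.append(word)            # control-intention verbs
        else:
            candidates.append(word)       # everything else is a candidate argument
    return basic_terms, verbs, candidates


# English rendering of the example instruction; tags besides "sp"/"v" are assumed.
tagged = [
    ("CES1234", "eng"), ("East Tower", "sp"), ("along", "p"),
    ("d5-p4-a5", "m"), ("taxi to", "v"), ("waiting point", "n"),
    ("outside", "f"), ("runway 18 Left", "n"), ("wait", "v"),
]
basic_terms, verbs, candidates = split_components(tagged)
```

The order of the verbs is preserved, which matters later because the structured template lists the triples in the order in which the verbs appear.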

Step 3: Relationship determination

This step is the core of the invention and is described in three parts:

3-1. Convert words into word vectors to obtain the input vector. Following the principle of word2vec, a neural network can train a low-dimensional, dense distributed vector for each word, and the relations among similar words can be read from the distribution of the word vectors. The word vector of each word can be computed from control instruction text data with the gensim package in Python. Note that the control instruction text data set must be large enough to contain the entire control instruction vocabulary (excluding the basic control terms).

3-2. Train the model parameters of the neural network. The corpus training set of control instructions is labeled. The input vector takes into account not only the relation between the target argument and the target verb in the sentence but also the adjacent words of the target argument, that is, the context information. If the target argument, in its context, is associated with the target verb, the label is defined to be the same as the label of the target argument. Note also that the numbers of positive and negative samples should be equal when preparing the training set. The fully connected neural network, whose structure is shown in FIG. 3, is written in Python, and its parameters are trained for multiple rounds on the training data. The more training data is available, the better the model generally performs.
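One way to assemble such an input vector is to concatenate the vectors of the target verb, the target argument and the argument's neighbours. The sketch below is an assumption about the layout, not the patent's stated construction: it takes w neighbours on each side (2w + 2 words in total, which is one possible reading of the (2w+2) dimension in the claims), pads missing neighbours with zeros, and uses hypothetical two-dimensional toy vectors in place of trained word2vec vectors.

```python
def build_input(tokens, arg_index, verb, vectors, dim, w=1):
    """Concatenate verb, argument and w context words on each side (assumed layout)."""
    zero = [0.0] * dim

    def vec(word):
        # Unknown words fall back to a zero vector in this sketch.
        return vectors.get(word, zero)

    parts = [vec(verb), vec(tokens[arg_index])]
    for offset in range(1, w + 1):
        left, right = arg_index - offset, arg_index + offset
        parts.append(vec(tokens[left]) if left >= 0 else zero)
        parts.append(vec(tokens[right]) if right < len(tokens) else zero)
    return [value for part in parts for value in part]


# Hypothetical toy vectors; real vectors would come from a trained word2vec model.
vectors = {
    "CQH1207": [0.9, 0.1],
    "along": [0.2, 0.7],
    "D5-P4": [0.4, 0.4],
    "taxi": [0.1, 0.8],
}
tokens = ["CQH1207", "along", "D5-P4", "taxi"]
x = build_input(tokens, arg_index=1, verb="taxi", vectors=vectors, dim=2, w=1)
```

With w = 1 and two-dimensional vectors the input has (2w + 2) × 2 = 8 components: the verb, the argument, the left neighbour and the right neighbour, in that order.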

3-3. Predict with the neural network. Two models are used in this step: a word vector model, which converts words into word vectors, and a fully connected neural network classification model, which determines whether the target argument is associated with the verb. This step yields verb-argument pairs, but the relationship between the two is not yet known.

Step 4: Semantic network construction

This step determines the relationship between each associated verb and argument by constructing a semantic network. The semantic network is built in the form of an ontology and can express the relationships between verbs and other entities in the control domain. The verb case types in the invention are: the agent case, the object case, the location case, the direction case, the source case and the end-point case. Because air traffic control is a specialized domain, the case types carried by a verb in a control instruction differ from those it carries in ordinary natural language. Take the verb "slide" (which in a control instruction means "taxi"). Two example sentences are given below:

1. CES1234, slide (taxi) to the end of the runway.

2. He slides along on a skateboard.

In the first sentence, a control instruction, "slide" has no instrument case; in the second sentence, "skateboard" is the instrument case of the verb "slide".

The definitions between verbs and arguments in the semantic network are fixed, whereas values such as flight number, altitude and speed in a control instruction are not. Another advantage of representing words as word vectors is that the similarity between similar words can be expressed, so this problem can be solved by setting templates and using word vector similarity.
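A minimal sketch of the similarity idea follows, assuming cosine similarity as the measure (the text does not name one) and hypothetical template names and toy vectors. An unseen value such as a new flight number is mapped to the template whose vector it most resembles.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_template(vector, templates):
    """Return the name of the template vector most similar to `vector`."""
    return max(templates, key=lambda name: cosine(vector, templates[name]))


# Hypothetical template vectors, e.g. averaged from known flight numbers and runways.
templates = {"<flight number>": [0.9, 0.1], "<runway>": [0.1, 0.9]}
unseen_flight_vector = [0.8, 0.2]   # toy vector for a flight number not in the network
slot = nearest_template(unseen_flight_vector, templates)
```

The semantic network then only needs relations for the template slots, not for every concrete flight number, altitude or speed.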

Step 5: Structured instruction

After the preceding steps, the deconstructed control instruction is obtained. Two verbs were identified, so the structured control instruction in the example is divided into three parts:

Basic control terms: East Tower.

Verb 1: taxi to - CES1234 - agent case; taxi to - d5-p4-a5; taxi to - waiting point - end-point case.

Verb 2: wait - CES1234 - agent case; wait - outside runway - azimuth (position) case.

The resulting structured template is shown in FIG. 4.

The result shows that flight CES1234 first performs a taxi action along taxiways d5-p4-a5, with the runway waiting point as the taxi end point, and then performs a second action, waiting. The taxi path and end point of flight CES1234 are thus obtained from the analysis.
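The two-part template can be sketched as a small data structure. The class name `StructuredInstruction`, its fields and the English case names are assumptions made for illustration; the case of the taxiway argument is left as `None` because the text does not state it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, Optional[str]]   # (verb, argument, case)


@dataclass
class StructuredInstruction:
    """Part 1: basic control terms. Part 2: verb-argument-case triples."""
    basic_terms: List[str] = field(default_factory=list)
    triples: List[Triple] = field(default_factory=list)

    def by_verb(self) -> Dict[str, List[Tuple[str, Optional[str]]]]:
        """Group triples by verb, keeping the order in which verbs first appear."""
        grouped: Dict[str, List[Tuple[str, Optional[str]]]] = {}
        for verb, argument, case in self.triples:
            grouped.setdefault(verb, []).append((argument, case))
        return grouped


instruction = StructuredInstruction(
    basic_terms=["East Tower"],
    triples=[
        ("taxi to", "CES1234", "agent case"),
        ("taxi to", "d5-p4-a5", None),            # case not stated in the text
        ("taxi to", "waiting point", "end-point case"),
        ("wait", "CES1234", "agent case"),
        ("wait", "outside runway", "azimuth case"),
    ],
)
grouped = instruction.by_verb()
```

Because Python dictionaries keep insertion order, `grouped` lists "taxi to" before "wait", matching the order of the verbs in the instruction.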

Examples

For convenience of illustration and description, the implementation steps are divided according to the main flow chart shown in FIG. 2 and are explained using actual control instructions. First, two example control instructions are given:

1. CQH1207, East Tower, taxi along D5-P4, wait outside runway 35L.

2. CES3984, East Tower, runway 35L, can take off.

Step 1, performing speech recognition on the control voice to obtain a control instruction in text format: the control voice is processed by a speech recognition device, resulting in the unstructured control instruction text shown above.

Step 2, performing part-of-speech analysis on the words contained in the control instruction in text format, for example:

1. CQH/eng, 1207/m, East Tower/sp, along/p, D5-P4/m, taxi/v, runway/n, 35L/m, outside/f, wait/v.

2. CES/eng, 3984/m, East Tower/sp, runway/n, 35L/m, can take off/v.

Part-of-speech tagging of the words in the control instruction gives the part of speech of each word, so that control-intention words can be distinguished from other words. At the same time, words are combined into phrases using a predefined dictionary of control instruction phrases, for example: CQH1207, CES3984, runway 35L.

Step 3, extracting the control intention based on the part-of-speech analysis result, and extracting the other words in the control instruction to form a candidate argument set. Words tagged as verbs are usually the control intention, for example: taxi, wait and take off. The other words, such as CQH1207, CES3984, East Tower, runway 35L, along, D5-P4 and outside, are all argument words.

Step 4, performing BIO labeling on the candidate argument set and training the parameters of the fully connected neural network with the labeled data, the goal being to obtain control-intention verb-argument pairs through the neural network. For example, control instruction 1 contains two verbs, denoted X = 0 and X = 1. CQH1207 is associated with both verbs and is labeled E; East Tower is associated with neither verb and is labeled O; "along" is associated only with "taxi" and is labeled B-0.
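The labeling rule in this example can be sketched as below. The function name `bio_label` is hypothetical, the sketch covers single-word arguments only (so the I-X label for words inside a longer segment is not produced), and the "E" label for an argument tied to more than one verb follows the example in the text rather than the standard BIO scheme.

```python
def bio_label(associated_verbs):
    """Label one candidate argument from the set of verb indices it relates to."""
    if not associated_verbs:
        return "O"                      # related to no verb
    if len(associated_verbs) == 1:
        (x,) = associated_verbs         # the single verb index X
        return f"B-{x}"
    return "E"                          # related to several verbs, as in the example


# Verb indices for control instruction 1: taxi is X = 0, wait is X = 1.
associations = {
    "CQH1207": {0, 1},
    "East Tower": set(),
    "along": {0},
}
labels = {word: bio_label(verbs) for word, verbs in associations.items()}
```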

Step 6, generating the input word vector: word2vec is used to obtain the numeric word vectors of the different words, and the word vectors of the words form the word vector of the sentence.

Step 8, determining the relationship: if the probability output value z shows a high probability for one category, the argument is determined to be related to the corresponding control intention and step 10 is executed; otherwise no processing is performed. For example, if "along" is associated only with "taxi" and not with "wait" in step 4, its probability for label B-0 will be much greater than for label B-1.
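The decision in this step can be sketched as follows. Two points are assumptions, since the text does not specify them: the raw network output is turned into probabilities with a softmax, and "a high probability" is read as the top category reaching a hypothetical threshold of 0.5.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def decide(z, labels, threshold=0.5):
    """Return (best label, its probability, whether the argument is related)."""
    probs = softmax(z)
    best = max(range(len(labels)), key=lambda i: probs[i])
    label, prob = labels[best], probs[best]
    related = label != "O" and prob >= threshold
    return label, prob, related


# Toy scores for the word "along": strongly B-0 (taxi), weakly B-1 (wait).
labels = ["O", "B-0", "B-1"]
label, prob, related = decide([0.2, 3.0, 0.1], labels)
```

When `related` is true the verb-argument pair is passed on to step 10; otherwise the argument is left unprocessed.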

Step 9, constructing a semantic network comprising control-intention verbs, argument words and semantic association triple knowledge. For example, the semantic relationship between the flight and the verb is that of the agent of the action; the relationship between runway 35L and "can take off" is the starting position of the action; and the relationship between D5-P4 and "taxi" is the path along which the action is carried out.
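As a rough sketch, the triple knowledge can be held in a lookup table keyed by verb and argument category, which is then queried in step 10. The table contents mirror the examples in this step; the `categorize` helper and its regular expressions are hypothetical stand-ins for whatever mechanism maps a concrete word to a category in the real system.

```python
import re

# (verb, argument category) -> semantic relation, following the examples in the text.
SEMANTIC_NETWORK = {
    ("taxi", "flight"): "agent",
    ("wait", "flight"): "agent",
    ("can take off", "flight"): "agent",
    ("can take off", "runway"): "action source point",
    ("taxi", "taxiway"): "action path",
}


def categorize(argument):
    """Map a concrete argument word to a category (hypothetical patterns)."""
    if re.fullmatch(r"[A-Z]{3}\d{3,4}", argument):
        return "flight"
    if argument.startswith("runway"):
        return "runway"
    if re.fullmatch(r"[A-Z]\d+(-[A-Z]\d+)*", argument):
        return "taxiway"
    return None


def relation(verb, argument):
    """Look up the relation between a verb and an argument, or None if unknown."""
    return SEMANTIC_NETWORK.get((verb, categorize(argument)))
```

For instance, `relation("taxi", "D5-P4")` yields the action path relation, while an argument with no matching entry, such as East Tower, yields `None`.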

Step 11, using the structured control instructions to directly detect whether a surface conflict is caused by the controller issuing an erroneous control instruction through error, forgetting or omission. For example, when the controller issues the two instructions above in sequence, the following structured instruction is obtained from instruction 1:

CQH1207, taxi: agent;

CQH1207, wait: agent;

D5-P4, taxi: action path;

outside runway 35L, wait: action location;

The end point of the taxi is therefore outside runway 35L, as follows from the meaning of "wait".

From instruction 2, the following structured instructions can be derived:

CES3984, can take off: agent;

runway 35L, can take off: action source point;

The paths represented by the two instructions are thus obtained, and the two control instructions are judged not to conflict.
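The text does not spell out the conflict rule, so the following is only a hypothetical illustration of how structured instructions could feed such a check. It assumes a deliberately simple rule: two instructions conflict if both place an aircraft on the same runway, and an argument that begins with "outside" does not count as entering the runway. The third instruction and the flight number CSN5678 are invented for the counter-example.

```python
# Relations assumed to mean the aircraft is on the runway itself (hypothetical rule).
ENTER_RELATIONS = {"action source point", "action end point"}


def runways_entered(triples):
    """Collect runways that an instruction places the aircraft on."""
    entered = set()
    for argument, verb, relation in triples:
        # "outside runway 35L" does not start with "runway", so it is skipped.
        if argument.startswith("runway") and relation in ENTER_RELATIONS:
            entered.add(argument)
    return entered


def conflicts(first, second):
    """Runways that both instructions place an aircraft on."""
    return runways_entered(first) & runways_entered(second)


instruction_1 = [
    ("CQH1207", "taxi", "agent"),
    ("CQH1207", "wait", "agent"),
    ("D5-P4", "taxi", "action path"),
    ("outside runway 35L", "wait", "action location"),
]
instruction_2 = [
    ("CES3984", "can take off", "agent"),
    ("runway 35L", "can take off", "action source point"),
]
# Invented counter-example: an aircraft taxiing onto the same runway.
instruction_3 = [
    ("CSN5678", "taxi", "agent"),
    ("runway 35L", "taxi", "action end point"),
]
```

Under this rule `conflicts(instruction_1, instruction_2)` is empty, in line with the judgment in the text, whereas `conflicts(instruction_3, instruction_2)` reports runway 35L.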

The invention can be used to analyze the intention of air traffic control instructions. A control instruction containing several different verbs can be processed to obtain as many structured instructions as there are verbs. The invention can provide auxiliary alerts for surface conflicts caused by controller errors, forgetting and omissions. Semantic case analysis yields the action intention of a control instruction and the action route of the aircraft, so that it can be determined whether different control instructions would cause a conflict between aircraft.

The present invention provides a semantic-network-based classification method for control instructions. There are many methods and approaches for implementing this technical scheme, and the foregoing is only a preferred embodiment of the invention. It should be noted that those skilled in the art may make improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also fall within the protection scope of the invention. All components not specified in this embodiment can be realized by the prior art.

Claims


1. A semantic-network-based classification method for control instructions, characterized by comprising the following steps:

Step 1, performing speech recognition on the control voice to obtain a control instruction in text format;

Step 2, performing part-of-speech analysis on the words contained in the control instruction in text format;

Step 3, extracting the control intention based on the part-of-speech analysis result, and extracting the other words in the control instruction to form a candidate argument set;

Step 4, performing BIO labeling on the candidate argument set, and training the parameters of a fully connected neural network with the labeled data, the goal being to obtain control-intention verb-argument pairs through the neural network;

Step 5, determining the number of control-intention verbs in the control instruction, and proceeding to step 6 if the number is greater than 1;

Step 6, generating the input word vector;

Step 7, classifying the input word vector with the neural network to obtain a probability output value z;

Step 8, determining the relationship: if the probability output value z shows a high probability for one category, determining that the argument is related to the corresponding control intention and executing step 10, otherwise performing no processing;

Step 9, constructing a semantic network comprising control-intention verbs, argument words and semantic association triple knowledge;

Step 10, substituting the control-intention verbs and argument words into the semantic network to obtain the specific semantic relationship between each verb and its arguments, forming a structured control instruction;

Step 11, using the structured control instruction to directly detect whether a surface conflict is caused by the controller issuing an erroneous control instruction through error, forgetting or omission;

Step 2 includes:

Step 2-1, performing Chinese word segmentation on the control instruction with jieba to obtain a word sequence;

Step 2-2, part-of-speech tagging: tagging each word in the word sequence with its part of speech to obtain the part-of-speech analysis result;

Step 3 includes: based on the part-of-speech analysis result, removing the words in the control instruction that are basic control terms; after the basic control terms are removed, a control instruction contains verbs and words of other parts of speech; according to the part-of-speech tagging result, the verbs are extracted to form a set of control-intention verbs, and the words of the other components are extracted to form a candidate argument set;

In step 4, the BIO labeling means labeling each element as B-X, I-X or O, where B-X indicates that the segment containing the element is of type X and the element is at the beginning of the segment, I-X indicates that the segment containing the element is of type X and the element is in the middle of the segment, and O indicates that the element does not belong to any type;

Step 4 includes:

Step 4-1, converting to word vectors: preprocessing the words with the word2vec method and generating the word vector representation of the input sentence as the input of the neural network;

Step 4-2, training the model parameters of the fully connected neural network: collecting control instructions from actual work scenarios to form a corpus training set, labeling the corpus training set, and training the weight parameters of the neurons in each layer of the fully connected neural network with the corpus training set to obtain a trained neural network;

Step 4-3, predicting with the neural network: using the trained neural network to determine whether a target argument in the candidate argument set is associated with a verb in the verb set, obtaining verb-argument pairs;

Step 6 includes: using the sentence word vector obtained in step 4-1 as the input of the neural network model;

In step 7, the neural network comprises an input layer, a hidden layer and an output layer. The neural network is defined to have 100×n inputs x. If the input layer has n neurons, the weight matrix is defined as W_{(2w+2)×n}, the bias matrix as b_{1×n}, the output value of the input layer as α_{1×n}, and the output value of the activation function as h_{1×n}. The ReLU activation function f_ReLU(t) is defined as:

f_ReLU(t) = max(0, t),

where t is the input value. The output value of the neurons in the hidden layer is given by:

α_h = h_h W_h + b_h,

h_h = f_ReLU(α_h),

where the weight matrix of the hidden layer is W_h, the bias matrix is b_h, the output value is α_h, and the output value of the activation function is h_h. The output value z of the neural network is obtained by:

z = h_h W_o + b_o,

where the weight matrix of the output layer is W_o and the bias matrix is b_o.