Summary of the invention
For addressing the above problem, the invention provides method and the system thereof of indoor positioning, can reduce indoor wireless location technology collecting training data cost.
The invention discloses a kind of method of indoor positioning, comprising:
Step 1, the training data of input tape position mark and without the training data of position mark, described training data forms training dataset;
Step 2, according to without the training data of position mark with the similarity relation between the training data of position mark, drawn without position mark corresponding to the training data of position mark by the position mark with the training data of position mark, make described training data concentrate whole training datas to have position mark;
Step 3 adopts the supervised learning method to be located by described training dataset and waits to set the goal.
Described step 2 further is,
Step 21, the adjacent map of the signal strength signal intensity vector similarity relation of all training datas of structure expression carries out Eigenvalues Decomposition to described in abutting connection with Laplacian matrix of graphs, characteristic vector composition characteristic vector space;
Step 22 makes up the grader on the described characteristic vector space, finds the solution the parameter of described grader according to the position mark of described training data with position mark, is drawn described without position mark corresponding to the training data of position mark by described grader.
Described step 21 further is,
Step 31 makes up described adjacent map;
Step 32, be calculated as follows described in abutting connection with Laplacian matrix of graphs,
L=D-W,
Wherein, L is described in abutting connection with Laplacian matrix of graphs, and D is the weights degree diagonal matrix of the node of described adjacent map, and W is the weight matrix on described adjacent map limit;
Step 33 is carried out Eigenvalues Decomposition to described Laplacian Matrix, obtains the characteristic vector of described Laplacian Matrix, described characteristic vector composition characteristic vector space.
Also comprise between described step 32 and the described step 33,
Step 41 is carried out normalization to described Laplacian Matrix.
Described step 22 further is,
Step 51 makes up described grader and is,
A whereiniBe the parameter of grader, φJiThe value that represents j element of i characteristic vector, fjBe the position mark of training data, during j≤l, fjBe the position mark with the training data of position mark, during j>l, fjBe the position mark without the training data of position mark, p is the quantity of the characteristic value used, and l is the quantity with the training data of position mark, and u is the quantity without the training data of position mark;
Step 52, the error of calculating the position mark of described training data with position mark is
By the send as an envoy to parameter of described grader of this error minimum of least square solution;
Step 53, the position mark that is drawn without the position mark training data by described grader is
Described step 31 further is,
Step 61, the training data of concentrating take described training data is as node, and the weights between computing node are
Wherein, RSSiBe the signal strength signal intensity of training data i, RSSjBe the signal strength signal intensity of training data j, d (RSSi, RSSj) expression training data the distance on signal strength space, σ is the window coefficient of gaussian kernel function;
Step 62 for each node, is arranged node by the weights order from big to small with described node, chooses a front k node as the neighbours of described node, and k is the integer greater than 0;
Step 63 with the connection of described node with described node neighbours, constructs described adjacent map.
Described step 31 further is,
Step 71, the training data of concentrating take described training data connects all nodes in twos as node, forms described adjacent map;
Step 72 determines that the weights on limit in the described adjacent map are
Wherein, RSSiBe the signal strength signal intensity of training data i, RSSjBe the signal strength signal intensity of training data j, d (RSSi, RSSj) expression training data the distance in signal strength space, σ is the window coefficient of Gaussian function.
Described training data in the distance of signal strength space is || RSSj-RSSi||, RSSiBe the signal strength signal intensity of training data i, RSSjSignal strength signal intensity for training data j.
The invention also discloses a kind of system of indoor positioning, comprising:
Training dataset forms module, is used for the training data of input tape position mark and without the training data of position mark, described training data forms training dataset;
Training data position mark module, be used for according to without the training data of position mark with the similarity relation between the training data of position mark, drawn without position mark corresponding to the training data of position mark by the position mark with the training data of position mark, make described training data concentrate whole training datas to have position mark;
The target localization module is used for adopting the supervised learning method to be located by described training dataset and waits to set the goal.
Described training data position mark module further comprises:
Characteristic vector space makes up module, is used for making up the adjacent map of the signal strength signal intensity vector similarity relation that represents all training datas, carries out Eigenvalues Decomposition to described in abutting connection with Laplacian matrix of graphs, characteristic vector composition characteristic vector space;
The position mark module, be used for making up the grader on the described characteristic vector space, position mark according to described training data with position mark is found the solution the parameter of described grader, is drawn described without position mark corresponding to the training data of position mark by described grader.
Described characteristic vector space makes up module and is further used for making up described adjacent map; By formula L=D-W calculating is described in abutting connection with Laplacian matrix of graphs, and wherein, L is described in abutting connection with Laplacian matrix of graphs, and D is the weights degree diagonal matrix of the node of described adjacent map, and W is the weight matrix on described adjacent map limit; Described Laplacian Matrix is carried out Eigenvalues Decomposition, obtain the characteristic vector of described Laplacian Matrix, described characteristic vector composition characteristic vector space.
Described characteristic vector space makes up module and also is used for described Laplacian Matrix is carried out normalization after having calculated Laplacian Matrix.
Described position mark module is further used for making up described grader
A whereiniBe the parameter of grader, φJiThe value that represents j element of i characteristic vector, fjBe the position mark of training data, during j≤l, fjBe the position mark with the training data of position mark, during j>l, fjBe the position mark without the training data of position mark, p is the quantity of the characteristic value used, and l is the quantity with the training data of position mark, and u is the quantity without the training data of position mark; The error of calculating the position mark of described training data with position mark is
By the send as an envoy to parameter of described grader of this error minimum of least square solution; The position mark that is drawn without the position mark training data by described grader is
The training data that described position mark module is further used for concentrating take described training data when making up described adjacent map is as node, and the weights between computing node are
Wherein, RSSiBe the signal strength signal intensity of training data i, RSSjBe the signal strength signal intensity of training data j, d (RSSi, RSSj) expression training data the distance on signal strength space, σ is the window coefficient of gaussian kernel function; For each node, by the weights order from big to small with described node node is arranged, choose a front k node as the neighbours of described node, k is the integer greater than 0; With the connection of described node with described node neighbours, construct described adjacent map.
The training data that described position mark module is further used for concentrating take described training data when making up described adjacent map connects all nodes in twos as node, forms described adjacent map; The weights of determining limit in the described adjacent map are
Wherein, RSSiBe the signal strength signal intensity of training data i, RSSjBe the signal strength signal intensity of training data j, d (RSSi, RSSj) expression training data the distance in signal strength space, σ is the window coefficient of Gaussian function.
Described training data in the distance of signal strength space is || RSSj-RSSi||, RSSiBe the signal strength signal intensity of training data i, RSSjSignal strength signal intensity for training data j.
Beneficial effect of the present invention is, utilization is without the training data of position mark with the potential relation between the training data of position mark, by the position mark with the training data of position mark the training data without position mark is carried out mark, can under the prerequisite that guarantees the location model positioning accuracy, reduce the collection cost of training dataset.
Embodiment
In order to make purpose of the present invention, technical scheme and advantage clearer, below in conjunction with drawings and Examples, the present invention is further elaborated.Should be appreciated that specific embodiment described herein only in order to explain the present invention, is not intended to limit the present invention.
The method of indoor positioning of the present invention comprises:
Step S100, the training data of input tape position mark and without the training data of position mark, described training data forms training dataset.
Step S200, according to without the training data of position mark with the similarity relation between the training data of position mark, drawn without position mark corresponding to the training data of position mark by the position mark with the training data of position mark, make described training data concentrate whole training datas to have position mark.
Step S300 adopts the supervised learning method to be located by described training dataset and waits to set the goal.
Better, described step S200 further is,
Step S210, the adjacent map of the signal strength signal intensity vector similarity relation of all training datas of structure expression carries out Eigenvalues Decomposition to described in abutting connection with Laplacian matrix of graphs, characteristic vector composition characteristic vector space;
Step S220 makes up the grader on the described characteristic vector space, finds the solution the parameter of grader according to the position mark with the training data of position mark, is drawn described without position mark corresponding to the training data of position mark by grader.
The embodiment of the inventive method is as described below.
The inventive method need to gather a small amount of training data with position mark, utilize simultaneously a large amount of relatively low nothings of cost that gather with the training data of position mark, utilize the relation of all training datas on the characteristic vector space of Laplacian Matrix, train the location model with certain generalization ability.By forming training dataset with the training data of position mark with without the training data of position mark; Carry out spectral factorization by the adjacent map that all training datas are consisted of, spectral factorization is for to carry out Eigenvalues Decomposition to the adjacency Laplacian matrix of graphs, obtain the expression of all training datasets on characteristic vector space, then according to the constant principle of the position mark of identical data node on different representation spaces, know the position mark of training data on characteristic vector space with position mark, according to the position mark with the training data of position mark the training data without position mark is carried out mark, thereby obtain one entirely with the training dataset of position mark.
The method may further comprise the steps:
Step S101, the training data of input tape position mark and without the training data of position mark, those training datas form training datasets.
Training dataset is S={ldata, udata}, wherein ldata={fLab, RSS} is the training data with position mark, fLabBe position mark corresponding to this training data, RSS be this training data corresponding signal strength signal intensity vector, udata={fUnlal, RSS} is the training data without position mark, RSS is signal strength signal intensity vector corresponding to this training data, fUnlabBe estimated position mark corresponding to this training data.
Step S201, the adjacent map of the signal strength signal intensity vector similarity relation of all training datas of structure expression.
Adjacent map is expressed as G (V, E), and V is node, and E is the limit.Wherein, set of node is all training datas, comprises l with training data and u the training data without position mark of position mark, and the limit collection E between the node is the nonnegative value matrix W of (l+u) * (l+u), wherein wIjRepresent the syntople between i training data and j the training data, this syntople is measured according to the similarity between the signal strength signal intensity vector of training data.
Make up the embodiment one of adjacent map
According to all training datas, comprise with the training data of position mark with without the training data of position mark, adopt the mode of k arest neighbors to make up adjacent map G (V, E).
It is described that the mode of employing k arest neighbors makes up being implemented as follows of adjacent map.
Be w with the weights of any two nodesIj>0.
According to RSSiWith RSSjBetween distance on signal strength space, adopt the mode of gaussian kernel function to compose power.
Gaussian kernel function is,
Wherein, RSSiBe the signal strength signal intensity of training data i, RSSjBe the signal strength signal intensity of training data j, d (RSSi, RSSj) expression training data the distance on signal strength space, σ is the window coefficient of gaussian kernel function.The distance of training data on signal strength space be || RSSj-RSSi||.
For each node, according to the weights of this node, other nodes except this node are pressed from big to small arranged sequentially, choose a front k node as the neighbours of this node, k is the integer greater than 0; Node connects with its neighbours, constructs adjacent map.
Make up the embodiment two of adjacent map
Adopt full method of attachment to make up adjacent map, adjacency all between any two training datas.
The distance of training data on signal strength space be || RSSj-RSSi||.
According to RSSiWith RSSjBetween distance on signal strength space, adopt the mode of gaussian kernel function to compose power
Wherein, d (RSSi, RSSj) distance of expression training data on signal strength space, σ is the window coefficient of gaussian kernel function.
Step S202 calculates in abutting connection with Laplacian matrix of graphs.
Calculating is in abutting connection with the embodiment one of Laplacian matrix of graphs
On the basis of adjacent map G (V, E), the calculating Laplacian Matrix is L=D-W, and wherein D is the weights degree diagonal matrix of node, and W is the weight matrix on adjacent map limit.
Weights degree diagonal matrix is D={dIi=∑jwIj, represent the diagonal matrix that all limit weights sums relevant with node consist of.
When k arest neighbors method, all limit weights sums that the element representation of weights diagonal matrix and node are adjacent; When adopting full method of attachment, degree of node in the element representation adjacent map of weights diagonal matrix.
Calculating is in abutting connection with the embodiment two of Laplacian matrix of graphs
Normalization Laplacian Matrix on the basis of embodiment one:
Wherein, I representation unit vector, L is Laplacian Matrix, and D is the weights degree diagonal matrix of node, and W is the weight matrix on adjacent map limit.
Step S203 carries out Eigenvalues Decomposition to Laplacian Matrix.
Obtain spectrum and the characteristic of correspondence vector of Laplacian Matrix by Eigenvalues Decomposition, thereby all training datas can be mapped on the orthogonal intersection space of being opened by p characteristic vector, p is the number of the characteristic value of the Laplacian Matrix of use.
After Laplacian Matrix carried out Eigenvalues Decomposition, characteristic value and the characteristic vector of Laplacian Matrix were respectively { λi, φi, λiThe representation feature value, φiRepresentation feature value λiCharacteristic of correspondence vector, and characteristic value arranges with the non-order that falls, then
The spectrum of Laplacian Matrix be the characteristic value collection of Laplacian Matrix, { λi.
The characteristic vector of Laplacian Matrix has embodied bunch information that original training data is concentrated.
For example, the chain figure with two connected components as shown in Figure 2, solid node represents the training data with position mark, and hollow node represents the training data without position mark, and the syntople of all nodes on the orthogonal intersection space that 2 characteristic vectors of this figure are opened as shown in Figure 3 among Fig. 2.Dotted line among Fig. 3 between two nodes has represented these two nodes adjacency on this orthogonal intersection space, from Fig. 2 and Fig. 3 as seen, the characteristic vector of Laplacian Matrix has implied bunch information of node in the original graph, has effectively characterized with the training data of position mark with without the annexation between the training data of position mark.
Step S204, the grader on the construction feature vector space is found the solution the parameter of grader according to the position mark with the training data of position mark, is drawn described without position mark corresponding to the training data of position mark by grader.
Grader is used for according to the expression of training data on characteristic vector space, determines the position mark of training data.
By minimizing the formula error function, make up a grader between the characteristic vector sky, obtain classifier parameters according to least square solution.
After all training datas are mapped to characteristic vector space, utilize this bunch information, because sample example in identical cluster (Cluster) has and larger may have identical position mark, and then realizes having the position mark of flag data to propagate in whole datagram.
Because all characteristic values of Laplacian Matrix satisfy 0=λ1≤ λ2λL+u, therefore, Laplacian Matrix is positive semidefinite (positive semi-definite), the characteristic vector of Laplacian Matrix forms Hilbert space L2(M) one group of orthogonal basis.For any function f (x) ∈ L2(M), can be expressed as the linear combination of orthogonal basis:
Wherein, φiBe characteristic vector, expression orthogonal basis, aiRepresent basic φiCorresponding linear coefficient.
Based on above-mentioned principle, step S204 embodiment is as follows.
The structure grader is
Wherein, aiBe the parameter of grader, φJiThe value that represents j element of i characteristic vector, fjBe the position mark of training data, during j≤l, fjBe the training data with position mark, during j>l, fjBe the training data without position mark, p is the quantity of the characteristic value of use, and l is the quantity with the training data of position mark, and u is the quantity without the training data of position mark.
In the error with the training data of position mark be
Optimum fitting parameter a=(a1, a2Ap)T,
Tried to achieve by least square solution, the parameter of grader is
Wherein, fLab=(f1, f2... fl)TBe the position mark with the training data of position mark,
ELabFor the training data with position mark projects to p dimension expression on the orthogonal intersection space that p characteristic vector generate.
By grader, the training data without position mark is carried out mark.Position mark without the training data of position mark is
(j=l+1,...l+u)
Step S301 according to complete all mark training dataset, adopts the localization method of supervised learning, such as maximum a posteriori (Maximum a Posteriori, MAP) method, treats localizing objects and positions.
The system of indoor positioning of the present invention as shown in Figure 4.
Trainingdataset forms module 100, is used for the training data of input tape position mark and without the training data of position mark, those training datas form training datasets.
Training dataposition mark module 200, be used for according to without the training data of position mark with the similarity relation between the training data of position mark, drawn without position mark corresponding to the training data of position mark by the position mark with the training data of position mark, make described training data concentrate whole training datas to have position mark;
Target localization module 300 is used for adopting the supervised learning method to be located by described training dataset and waits to set the goal.
Training dataposition mark module 200 further comprises:
Characteristic vector space makes upmodule 201, is used for making up the adjacent map of the signal strength signal intensity vector similarity relation that represents all training datas, carries out Eigenvalues Decomposition to described in abutting connection with Laplacian matrix of graphs, characteristic vector composition characteristic vector space.
Position mark module 202, be used for the grader on the construction feature vector space, position mark according to described training data with position mark is found the solution the parameter of described grader, is drawn described without position mark corresponding to the training data of position mark by described grader.
Better, characteristic vector space makes upmodule 201 and is further used for making up described adjacent map; By formula L=D-W calculating is described in abutting connection with Laplacian matrix of graphs,
Wherein, L is described in abutting connection with Laplacian matrix of graphs, and D is the weights degree diagonal matrix of the node of described adjacent map, and W is the weight matrix on described adjacent map limit;
Described Laplacian Matrix is carried out Eigenvalues Decomposition, obtain the characteristic vector of described Laplacian Matrix, described characteristic vector composition characteristic vector space.
Better, characteristic vector space makes upmodule 201 and also is used for described Laplacian Matrix is carried out normalization after having calculated Laplacian Matrix.
Better,position mark module 202 is further used for making up described grader and is
(j=1,...l+u,p≤l)
A whereiniBe the parameter of grader, φJiThe value that represents j element of i characteristic vector, fjBe the position mark of training data, during j≤l, fjBe the position mark with the training data of position mark, during j>l, fjBe the position mark without the training data of position mark, p is the quantity of the characteristic value used, and l is the quantity with the training data of position mark, and u is the quantity without the training data of position mark;
The error of calculating the position mark of described training data with position mark is
By the send as an envoy to parameter of described grader of this error minimum of least square solution; The position mark that is drawn without the position mark training data by described grader is
(j=l+1,...l+u)。
Better, the training data that positionmark module 202 is further used for concentrating take described training data when making up described adjacent map is as node, and the weights between computing node are
Wherein, RSSiBe the signal strength signal intensity of training data i, RSSjBe the signal strength signal intensity of training data j, d (RSSi, RSSj) expression training data the distance on signal strength space, σ is the window coefficient of gaussian kernel function; For each node, by the weights order from big to small with described node node is arranged, choose a front k node as the neighbours of described node, k is the integer greater than 0; With the connection of described node with described node neighbours, construct described adjacent map.
Better, the training data that positionmark module 202 is further used for concentrating take described training data when making up described adjacent map connects all nodes in twos as node, forms described adjacent map; The weights of determining limit in the described adjacent map are
Wherein, RSSiBe the signal strength signal intensity of training data i, RSSjBe the signal strength signal intensity of training data j, d (RSSi, RSSj) expression training data the distance in signal strength space, σ is the window coefficient of Gaussian function.
Described training data in the distance of signal strength space is || RSSj-RSSi||, RSSiBe the signal strength signal intensity of training data i, RSSjSignal strength signal intensity for training data j.
Beneficial effect of the present invention is as follows.
When Fig. 5 is training data with position mark for concentrate 17% training data at training data, to carry out the precision of mark without the training data of position mark.As can be seen from Figure 1, along with the increase of used characteristic vector number, the namely increase of above-mentioned p, the trend that the position mark precision slowly reduces after presenting and improving rapidly first.This be because, when p hour, the number of the characteristic vector that adopts is less, causes occurring not foot phenomenon of match when the solution node classifier parameters, thereby cause to the Generalization Capability that carries out mark without the position mark training data a little less than, so the position mark precision is lower.And when p increases to a certain degree, as reach haveflag data quantity 50% the time, the telltale mark precision begins to descend, this is because too much characteristic vector causes overfitting when finding the solution classifier parameters, has caused the lower result of positioning accuracy.
When Fig. 6 is training data with position mark forconcentrate 25% training data at training data, the training data without position mark is carried out the precision of mark and the relation between the characteristic vector number of adopting.Comparison diagram 5 and Fig. 6 can find out, when the characteristic vector number that adopts accounted for the 20%-25% left and right sides of all characteristic vector numbers, the position mark precision reached peak.
Fig. 7 is under the training data condition with position mark of different proportion, to carry out the Accuracy of mark without the training data of position mark.As can be seen from Figure 7, along with the increasing of the training data ratio of position mark, the label information that provides with the training data of position mark increases, and the precision of therefore training data without position mark being carried out mark increases.Simultaneously, when the training data ratio with position mark rises to a certain degree, such as 15%, the trend that increases progressively of the training data without position mark being carried out the precision of mark becomes slow, this be because, when the parameter of application class device is carried out mark to the training data without position mark, can obtain preferably fitting coefficient when reaching certain degree with the quantity 1 of the training data of position mark, the quantity that further increases with the training data of position mark can not obtain equal repayment.
In sum, for with the higher problem of the training dataset cost of position mark, based on two training samples close on signal strength space, their position mark should be similar, this patent has adopted the method for figure Laplce matrix spectra, by making up on a small quantity with the training data of position mark and propagating relation without the mark between the training data of position mark in a large number, unmarked training data is carried out mark, thereby reduced the collection cost of training dataset.Experimental result shows, only needs a small amount of flag data that has, about 20%, just can with higher precision, near 80%, realize the mark without the position mark training data.
Those skilled in the art can also carry out various modifications to above content under the condition that does not break away from the definite the spirit and scope of the present invention of claims.Therefore scope of the present invention is not limited in above explanation, but determined by the scope of claims.