Disclosure of Invention
In view of the above, the invention provides a method, medium and system for outputting a cell mass removal signal by a flow cytometer, which solve the technical problem that existing cell mass identification and removal methods rely on manually set thresholds and parameters and require the identification rules to be manually adjusted for different samples, so that automation cannot be achieved.
The invention is realized in the following way:
In a first aspect, the present invention provides a method for outputting a cell mass removal signal by a flow cytometer, comprising the steps of:
S10, acquiring the electric pulse signals collected by a photoelectric converter of a flow cytometer as an electric pulse signal set, wherein the electric pulse signal set comprises an electric pulse signal set generated by forward scattered light, an electric pulse signal set generated by side scattered light and an electric pulse signal set generated by fluorescent signals, which are respectively denoted as an FSC pulse set, an SSC pulse set and a fluorescence pulse set, and each electric pulse signal comprises time, length, width and area;
S20, respectively establishing an FSC matrix, an SSC pulse matrix and a fluorescence pulse matrix according to the obtained FSC pulse set, SSC pulse set and fluorescence pulse set;
S30, traversing each electric pulse signal in the FSC pulse set, and acquiring electric pulse signals with channel values larger than a first threshold value to form an FSC large pulse set;
S40, screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix;
S50, establishing an aggregation matrix from the obtained screening FSC matrix, the screening SSC pulse matrix and the screening fluorescence pulse matrix;
S60, analyzing the aggregation matrix by utilizing a pre-trained cell mass analysis model to obtain an electric pulse signal of a corresponding cell mass in the aggregation matrix;
S70, deleting the electric pulse signals corresponding to the cell clusters in the electric pulse signal set, and outputting the obtained electric pulse signal set to a data processing system of the flow cytometer.
The method for outputting the cell mass removal signal by the flow cytometer has the following technical effect. The FSC, SSC and fluorescent signals of each cell or cell mass are related to one another: a cell mass is relatively large and has a more complex internal structure, so its FSC and SSC signals are correlated, and because a cell mass carries relatively more fluorescent dye, its fluorescent signal produces channel values different from those of ordinary cells. With purely manual judgment, the selected threshold is not accurate enough because the data volume is too large. The invention therefore uses a multi-channel fusion method and performs multi-feature extraction and analysis on an aggregation matrix, so that cell mass signals can be determined more reliably.
Based on the technical scheme, the method for outputting the cell mass removal signal by the flow cytometer can be improved as follows:
The step of establishing the FSC matrix, the SSC pulse matrix and the fluorescence pulse matrix according to the obtained FSC pulse set, the SSC pulse set and the fluorescence pulse set respectively further comprises the step of eliminating noise points of the electric pulse signal.
The step of traversing each electric pulse signal in the FSC pulse set to obtain an electric pulse signal with a channel value larger than a first threshold value and forming an FSC large pulse set specifically comprises the following steps:
Traversing the FSC pulse set, and extracting pulse signals with FSC channel values larger than a first threshold value as a first pulse set;
calculating the inter-pulse distance of the first pulse set, and merging adjacent pulses;
Smoothing the first pulse set after combining the adjacent pulses;
performing dimensionality reduction on the smoothed first pulse set by the PCA method;
taking the dimension-reduced first pulse set as the FSC large pulse set.
By default, the first threshold is set to 3 times the FSC baseline noise.
The step of screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix specifically comprises the following steps:
traversing the FSC matrix, and reserving a row corresponding to the FSC large pulse;
screening corresponding rows from SSC and fluorescent matrix according to FSC large pulse;
smoothing the filtered matrix;
outputting the filtered and smoothed FSC, SSC and fluorescent matrix.
The step of establishing an aggregation matrix by using the obtained screening FSC matrix, the screening SSC pulse matrix and the screening fluorescent pulse matrix specifically comprises the following steps:
transversely splicing screening smooth matrixes of different channels to form an aggregation matrix;
Normalizing and standardizing the aggregation matrix;
and performing dimension reduction on the normalized and standardized aggregation matrix by adopting PCA to obtain a dimension reduction aggregation matrix fused with different channel information.
The cell mass analysis model building and training method specifically comprises the following steps of:
collecting a plurality of historical aggregation matrix samples, and marking cell mass data to serve as training samples;
constructing a machine learning classification model;
and training the machine learning classification model by using a training sample to obtain a cell mass analysis model.
Further, the model used for constructing the machine learning classification model is an SVM or random forest.
Wherein, the step of deleting the electric pulse signals corresponding to the cell mass in the electric pulse signal set specifically comprises the following steps:
Traversing an original electric pulse signal set;
Judging whether each pulse is in the identified cell mass collection;
If so, deleting the pulse from the original set;
Retaining the pulse of the non-cell mass to form a filtered output set;
returning the filtered set of electrical pulse signals.
A second aspect of the present invention provides a computer readable storage medium having stored therein program instructions which, when executed, carry out the method for outputting a cell mass removal signal by a flow cytometer as described above.
A third aspect of the present invention provides a system for outputting a cell mass removal signal by a flow cytometer, wherein the system comprises the computer readable storage medium.
Compared with the prior art, the method, medium and system for outputting the cell mass removal signal by the flow cytometer have the beneficial effects that:
1. Realizing the automatic identification and removal of cell clusters
According to the invention, by constructing the pre-trained machine learning model, the cell clusters in the aggregation matrix can be automatically identified without manually setting identification rules and threshold values. This avoids the complexity of parameter adjustment and achieves intelligent automatic identification of cell clusters.
In contrast, existing methods based on threshold determination require a professional to adjust parameters for different samples, are very empirical, and cannot implement automatic batch processing. The automatic identification method of the invention greatly simplifies the operation flow.
2. Improving the accuracy of cell mass identification
The invention comprehensively utilizes multi-channel information such as FSC and SSC to construct an aggregation matrix with strong expressive capacity, and adopts a pre-trained machine learning model, thereby ensuring the accuracy of cell mass identification.
Existing methods often produce false recognition or missed recognition because of improper threshold settings. The multi-channel aggregation matrix provides richer features, and combining it with a machine learning model can effectively improve recognition accuracy.
3. The operation complexity is reduced
Compared with cell mass identification based on image processing, the method directly models the pulse matrix, does not need to store and process a large number of images, and has higher calculation efficiency.
Meanwhile, compared with the method of traversing and judging the threshold value for many times, the machine learning model can be calculated in parallel, and the prediction speed is faster. The method is efficient in operation.
4. Improving robustness and the ability to adapt to different samples
The machine learning model can obtain better generalization capability through pre-training and adapt to different parameter changes.
The model is obtained based on the pre-training of various samples, can identify cell clusters of different sizes and types, and has strong robustness without adjusting parameters for each sample.
5. Lay a foundation for standardized automatic processing flow
The invention realizes standardization and automation of the step of cell mass identification. The flow cytometry analysis can establish a standard end-to-end flow, and automation from acquisition to analysis results is realized.
The user can obtain an accurate result after removing the cell clusters only by simple operation, and the detection efficiency is greatly improved.
6. The detection cost is reduced
The automation degree is improved, so that the workload of professional operators can be reduced. Meanwhile, the calculation efficiency is higher, and the instrument use time can be reduced.
The application of the invention can reduce the labor and equipment cost of flow cytometry detection and improve the detection efficiency.
In summary, the invention can solve the technical problems that the existing cell mass identification and removal method relies on manual setting of threshold and parameters, manual adjustment of identification rules is required for different samples, automation cannot be realized, and the storage and calculation amount of the image-based method is large.
Detailed Description
In order to make the purposes, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below.
Referring to FIG. 1, a flow chart of a method for outputting a cell mass removal signal by a flow cytometer according to a first aspect of the present invention is provided, the method comprising the steps of:
S10, acquiring the electric pulse signals collected by a photoelectric converter of a flow cytometer as an electric pulse signal set, wherein the electric pulse signal set comprises an electric pulse signal set generated by forward scattered light, an electric pulse signal set generated by side scattered light and an electric pulse signal set generated by fluorescent signals, which are respectively recorded as an FSC pulse set, an SSC pulse set and a fluorescence pulse set, and each electric pulse signal comprises time, length, width and area;
S20, respectively establishing an FSC matrix, an SSC pulse matrix and a fluorescence pulse matrix according to the obtained FSC pulse set, SSC pulse set and fluorescence pulse set;
S30, traversing each electric pulse signal in the FSC pulse set, and acquiring electric pulse signals with channel values larger than a first threshold value to form an FSC large pulse set;
S40, screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix;
S50, establishing an aggregation matrix from the obtained screening FSC matrix, the screening SSC pulse matrix and the screening fluorescence pulse matrix;
S60, analyzing the aggregation matrix by using a pre-trained cell mass analysis model to obtain electric pulse signals of corresponding cell masses in the aggregation matrix;
S70, deleting the electric pulse signals corresponding to the cell clusters in the electric pulse signal set, and outputting the obtained electric pulse signal set to a data processing system of the flow cytometer.
Specifically, the implementation steps of S10 are described as follows:
In this step we collect the raw electrical signal collected by the photoelectric converter of the flow cytometer. The photoelectric converter generally adopts a device such as a photodiode or a phototransistor. As the cell sample flows through the laser beam, forward scattered light, side scattered light, and fluorescent signals are generated. The optical signals are converted into current signals by the photodiodes and then into voltage signals by the operational amplifiers, namely original electric pulse signals.
The raw electrical pulse signal contains all the optical information about each cellular or non-cellular event. Different types of electrical pulse signals can be obtained according to the scattering and fluorescence characteristics of cellular or non-cellular events in different channels. In the present invention, we divide these electrical pulse signals into three categories:
(1) FSC pulse set: the set of electric pulse signals generated by forward scattered light (FSC). FSC mainly reflects cell size.
(2) SSC pulse set: the set of electric pulse signals generated by side scattered light (SSC). SSC mainly reflects the internal structure and complexity of the cell.
(3) Fluorescence pulse set: the set of electric pulse signals generated by the fluorescence detected in the different fluorescence channels. Different fluorescence channels correspond to different fluorescent labels.
The electric pulse signal contains the following information:
(1) Time: the time at which the pulse signal occurs.
(2) Length: the length of the pulse signal, i.e., its span in the X-axis direction. It reflects the amount of photon flux.
(3) Width: the width of the pulse signal, i.e., its height in the Y-axis direction. It is related to the pulse signal amplitude.
(4) Area: the area under the pulse signal curve. It is correlated with the number of photons detected.
In summary, the key of this step is to collect the original electrical pulse signals, and classify them into FSC pulse sets, SSC pulse sets and fluorescence pulse sets according to different optical channels, and these signals include the optical characteristic information of each cell/non-cell event, which lays a foundation for the subsequent identification of cell clusters.
The implementation of this step is described in detail below:
1. Configure the optical path system of the flow cytometer, including the laser, condenser lens and flow cell, so that forward scattered light, side scattered light and fluorescent signals are generated when the laser irradiates the sample.
2. Select photodiodes or phototransistors as photoelectric converters, in a number chosen according to actual needs, so that the forward scattered light, the side scattered light and each fluorescent channel can be detected.
3. Connect the output of each photodiode or phototransistor to an operational amplifier, and configure the amplification factor used to amplify the current signal.
4. Connect the digital acquisition card and configure the sampling frequency for acquiring the voltage signal output by the operational amplifier.
5. Run the flow cytometer with a cell sample carrying a predetermined fluorescent label. An optical signal is generated when the cell sample passes through the laser beam.
6. The photodiode converts the optical signal into a current signal, and the operational amplifier amplifies it and converts it into a voltage pulse signal.
7. The digital acquisition card samples the voltage pulse signal at the configured frequency to obtain the original electric pulse signal.
8. Determine from the position of the photodiode whether the electric pulse signal comes from the FSC channel, the SSC channel or a specific fluorescent channel.
9. Based on this determination, sort the acquired original electric pulse signals into an FSC pulse set, an SSC pulse set and a fluorescence pulse set.
10. Analyze the electric pulse signals and extract characteristic information such as time, length, width and area, which provides the basis for subsequent cell mass identification.
11. Store the processed electric pulse signals in the memory or on the hard disk of a computer for later use.
12. Repeat steps 5-11 until a sufficient number of cell samples have been collected, obtaining an original electric pulse signal set containing cell mass information.
Through the steps, the original electric pulse signals acquired by the photoelectric converter of the flow cytometer can be effectively obtained, the signals are classified according to different optical channels, characteristic information is extracted, a foundation is laid for subsequent cell mass identification, and the step S10 is completed. Places to be noted include:
(1) The optical path system must be configured properly so that scattered light and fluorescence are detected effectively.
(2) The photoelectric converter must be selected correctly and have a high response speed.
(3) The amplification factor must be appropriate to prevent waveform distortion.
(4) The sampling frequency must be set reasonably and satisfy the Nyquist sampling theorem.
(5) The extracted features must represent the signal characteristics effectively.
(6) The storage must guarantee sufficient access speed.
By paying attention to the points, the original electric pulse signal with good quality and rich information can be obtained, reliable basic data is provided for the subsequent cell mass identification, and the acquisition and processing of the step S10 are completed.
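The feature extraction in item 10 can be illustrated with a minimal sketch. It assumes the digitized trace is a plain list of voltage samples and that a pulse is any run of samples exceeding the baseline by more than a trigger level; the names `Pulse`, `extract_pulses`, `baseline` and `trigger` are hypothetical and not part of the described instrument. The field meanings follow the definitions of time, length, width and area given in this step.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Pulse:
    time: float    # onset time of the pulse, in seconds
    length: float  # span of the pulse along the time (X) axis, in seconds
    width: float   # height along the Y axis (peak above baseline), as defined in this step
    area: float    # area under the pulse curve (rectangle rule)


def extract_pulses(samples: Sequence[float], sample_rate_hz: float,
                   baseline: float, trigger: float) -> List[Pulse]:
    """Split a sampled voltage trace into pulses and measure each one.

    Assumption: a pulse is a maximal run of samples lying more than
    `trigger` above `baseline`.
    """
    dt = 1.0 / sample_rate_hz
    pulses: List[Pulse] = []
    start = None
    # A trailing baseline sample acts as a sentinel that closes an open pulse.
    for i, value in enumerate(list(samples) + [baseline]):
        above = (value - baseline) > trigger
        if above and start is None:
            start = i
        elif not above and start is not None:
            segment = [s - baseline for s in samples[start:i]]
            pulses.append(Pulse(time=start * dt,
                                length=len(segment) * dt,
                                width=max(segment),
                                area=sum(segment) * dt))
            start = None
    return pulses


# Illustrative trace only (not measured data): two pulses sampled at 1 kHz.
trace = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 2.0, 0.0]
fsc_pulse_set = extract_pulses(trace, sample_rate_hz=1000.0,
                               baseline=0.0, trigger=0.5)
```

Running the same routine on the traces of the SSC and fluorescence detectors would yield the SSC pulse set and the fluorescence pulse set.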
In the above technical solution, the step of establishing the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC pulse set, the SSC pulse set and the fluorescent pulse set obtained by the electrical pulse signal, respectively, further includes a step of eliminating noise points of the electrical pulse signal.
Specifically, the implementation steps of S20 are described as follows:
First, some variables and constants are defined:
FSC_pulse_set represents the FSC pulse set, comprising m pulse signals, denoted {psc1, psc2, ..., pscm};
SSC_pulse_set represents the SSC pulse set, comprising n pulse signals, denoted {pssc1, pssc2, ..., psscn};
FL_pulse_set represents the fluorescence pulse set, comprising l pulse signals, denoted {pfl1, pfl2, ..., pfll};
FSC_matrix represents the FSC matrix to be established, with dimension m×4;
SSC_matrix represents the SSC matrix to be established, with dimension n×4;
FL_matrix represents the fluorescence matrix to be established, with dimension l×4;
Each pulse signal p consists of 4 features, p = {t, w, a, h}, where t represents the pulse time, w represents the pulse width, a represents the pulse area, and h represents the pulse height;
Then, each pulse signal psci in FSC_pulse_set is traversed and written into the i-th row of FSC_matrix;
similarly, SSC_pulse_set and FL_pulse_set are traversed, and their pulse signals are written into the corresponding matrices;
at this point, the FSC matrix, the SSC matrix and the fluorescence matrix have been established from the pulse signal sets.
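As an illustration of the matrix construction, the following sketch writes each pulse into one row of a four-column array. It assumes NumPy is available; the `Pulse` tuple and the `build_pulse_matrix` helper are hypothetical names, and the numeric values are made up for demonstration.

```python
from typing import NamedTuple, Sequence

import numpy as np


class Pulse(NamedTuple):
    t: float  # pulse time
    w: float  # pulse width
    a: float  # pulse area
    h: float  # pulse height


def build_pulse_matrix(pulse_set: Sequence[Pulse]) -> np.ndarray:
    """Write pulse i into row i; the result has shape (len(pulse_set), 4)."""
    return np.array(pulse_set, dtype=float).reshape(-1, 4)


# Made-up pulse sets, for illustration only.
FSC_pulse_set = [Pulse(0.10, 0.02, 1.5, 90.0), Pulse(0.25, 0.03, 4.0, 160.0)]
SSC_pulse_set = [Pulse(0.10, 0.02, 0.8, 40.0)]
FL_pulse_set: list = []

FSC_matrix = build_pulse_matrix(FSC_pulse_set)  # m x 4
SSC_matrix = build_pulse_matrix(SSC_pulse_set)  # n x 4
FL_matrix = build_pulse_matrix(FL_pulse_set)    # l x 4 (here l = 0)
```

The `reshape(-1, 4)` call keeps the column count fixed at four even when a pulse set is empty, so downstream code can rely on the shape.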
Preferably, to further eliminate the influence of noise points, the matrix may be smoothed:
1) FSC matrix smoothing
For each pulse psci, the distance d(psci) to the next pulse psci+1 on the time axis is calculated.
If d(psci) < threshold, psci and psci+1 are combined into one pulse.
These steps are repeated until d(psci) is equal to or greater than threshold for every remaining pulse.
2) The SSC matrix and the fluorescence matrix are smoothed in the same way as the FSC matrix.
3) And carrying out mean filtering on the smoothed matrix to further eliminate random noise.
FSC_matrix=smooth(FSC_matrix);
SSC_matrix=smooth(SSC_matrix);
FL_matrix=smooth(FL_matrix);
Through the steps, the FSC matrix, the SSC matrix and the fluorescence matrix after the smoothing treatment are obtained, the matrixes reflect the pulse distribution condition of each channel, and a foundation is laid for subsequent recognition analysis.
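One possible reading of the `smooth` operation is sketched below, assuming NumPy and rows laid out as [t, w, a, h]. Several choices here are assumptions rather than part of the described method: the distance between two pulses is taken as the gap between the end of one pulse and the onset of the next, a merged pulse keeps the earlier onset, spans both pulses, sums the areas and keeps the taller height, and the mean filter is applied only to the feature columns so that pulse times are not altered.

```python
import numpy as np


def merge_close_pulses(matrix: np.ndarray, threshold: float) -> np.ndarray:
    """Merge time-adjacent rows [t, w, a, h] whose gap is below `threshold`."""
    if len(matrix) == 0:
        return matrix.astype(float)
    rows = matrix[np.argsort(matrix[:, 0])].astype(float)
    merged = [rows[0].copy()]
    for t, w, a, h in rows[1:]:
        last = merged[-1]
        gap = t - (last[0] + last[1])  # end of previous pulse to next onset
        if gap < threshold:
            last[1] = max(last[0] + last[1], t + w) - last[0]  # combined span
            last[2] += a                                        # areas add up
            last[3] = max(last[3], h)                           # taller peak
        else:
            merged.append(np.array([t, w, a, h]))
    return np.vstack(merged)


def mean_filter(matrix: np.ndarray, window: int = 3) -> np.ndarray:
    """Moving average over the feature columns (w, a, h); times stay as-is."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    out = matrix.astype(float)
    if len(out) == 0 or window == 1:
        return out
    k = window // 2
    # Repeat the edge rows so the output keeps the same number of rows.
    padded = np.pad(out[:, 1:], ((k, k), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    for j in range(padded.shape[1]):
        out[:, j + 1] = np.convolve(padded[:, j], kernel, mode="valid")
    return out


def smooth(matrix: np.ndarray, threshold: float, window: int = 3) -> np.ndarray:
    """Pulse merging followed by mean filtering."""
    return mean_filter(merge_close_pulses(matrix, threshold), window)
```

A median or Gaussian filter could be substituted for `mean_filter` without changing the surrounding flow.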
The matrix processing algorithm adopts a smoothing and filtering method, so that random errors caused by instrument noise can be effectively eliminated and the accuracy of subsequent identification is improved. The time complexity is O(m+n+l), and the space complexity is O(m+n+l). When there are many cell samples, the matrices become large, and dimension reduction by PCA or similar methods can be considered to reduce the amount of calculation. PCA (Principal Component Analysis) is a common data analysis method that uses an orthogonal transformation to convert data represented by linearly correlated variables into data represented by a small number of linearly uncorrelated variables, called principal components. The number of principal components is typically smaller than the number of original variables, so principal component analysis is often used for dimension reduction of high-dimensional data, extracting the principal characteristic components of the data.
Furthermore, the threshold for pulse combining needs to be empirically determined based on specific instrument parameters, typically taken as 10-20% of the pulse width. When the matrix is smoothed, mean filtering, median filtering or Gaussian filtering can be adopted, and filter parameters are required to be selected according to actual conditions.
In conclusion, the step establishes a pulse matrix by traversing the pulse signals, and performs smooth filtering treatment, so that a foundation is laid for subsequent identification and analysis, the algorithm thought is clear and direct, the calculated amount is moderate, and better pulse expression can be obtained. The method can be widely applied to pulse signal processing in the fields of flow cytometry and the like.
In the above technical solution, traversing each electric pulse signal in the FSC pulse set, obtaining an electric pulse signal with a channel value greater than a first threshold value, and forming a FSC large pulse set, which specifically includes:
Traversing the FSC pulse set, and extracting pulse signals with FSC channel values larger than a first threshold value as a first pulse set;
calculating the inter-pulse distance of the first pulse set, and merging adjacent pulses;
Smoothing the first pulse set after combining the adjacent pulses;
performing dimensionality reduction on the smoothed first pulse set by the PCA method;
taking the dimension-reduced first pulse set as the FSC large pulse set.
Specifically, the implementation steps of S30 are described as follows:
1. Defining variables and constants
threshold1: the first threshold of the FSC channel
FSC_large_set: the FSC large pulse set to be formed
2. Traversing each pulse signal psci in FSC_pulse_set, if psci.h>threshold1, adding psci to FSC_large_set;
3. Calculating the time distance between each pulse signal in the FSC_large_set, and if the distance is smaller than threshold2, combining the two pulse signals;
4. Smoothing filtering
The FSC_large_set is smoothed by applying mean filtering and median filtering.
5. Dimension reduction
A method such as PCA can be used to reduce the dimension of the FSC_large_set and thereby reduce redundant features.
Through the steps, large pulse signals are extracted from the FSC pulse set to form a FSC large pulse set FSC_large_set, and preparation is made for subsequent cell mass identification analysis.
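The selection and grouping in items 2 and 3 can be sketched as follows, assuming NumPy and an FSC matrix with rows [t, w, a, h]. The function names are hypothetical, and `baseline_noise` is treated as a given input because the way it is measured is instrument-specific. Rather than rewriting pulses, this sketch returns groups of row indices, so that each group stands for one merged large pulse and can still be traced back to the original rows.

```python
import numpy as np


def select_large_pulses(fsc_matrix: np.ndarray, baseline_noise: float,
                        factor: float = 3.0) -> np.ndarray:
    """Return row indices whose height (column 3) exceeds threshold1."""
    threshold1 = factor * baseline_noise  # default: 3 times the baseline noise
    return np.flatnonzero(fsc_matrix[:, 3] > threshold1)


def group_adjacent(fsc_matrix: np.ndarray, idx: np.ndarray,
                   threshold2: float) -> list:
    """Group selected rows whose consecutive pulse times differ by < threshold2."""
    order = idx[np.argsort(fsc_matrix[idx, 0])]
    groups: list = []
    for i in order:
        if groups and fsc_matrix[i, 0] - fsc_matrix[groups[-1][-1], 0] < threshold2:
            groups[-1].append(int(i))  # close to the previous pulse: same event
        else:
            groups.append([int(i)])    # far enough away: start a new event
    return groups


# Made-up FSC matrix, rows are [t, w, a, h].
FSC_matrix = np.array([[0.00, 0.1, 1.0, 5.0],
                       [0.50, 0.1, 1.0, 40.0],
                       [0.55, 0.1, 1.0, 50.0],
                       [2.00, 0.1, 1.0, 35.0],
                       [3.00, 0.1, 1.0, 8.0]])
large_idx = select_large_pulses(FSC_matrix, baseline_noise=10.0)
large_groups = group_adjacent(FSC_matrix, large_idx, threshold2=0.3)
```

Here `large_idx` plays the role of FSC_large_set, and `large_groups` records which of those pulses would be merged.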
By default, the first threshold threshold1 is set to 3 times the FSC baseline noise.
Further, the first threshold threshold1 is used to extract the pulse signals with larger FSC channel intensity from the FSC pulse set. Its determination needs to consider the following factors:
1. Instrument parameters
The baseline noise and sensitivity of the FSC channel will vary from flow cytometer to flow cytometer, requiring an empirical range to be predetermined based on the parameters of the instrument.
2. Sample type
Different cell sample size distributions result in different FSC intensity distribution ranges, requiring adjustment of the threshold according to the sample.
3. Data distribution
And visualizing or counting the data distribution condition of the FSC pulse set, and determining a threshold value capable of effectively distinguishing strong pulses from weak pulses.
4. Trial-and-error testing
Several candidate thresholds can be taken from the empirical range and their screening effect tested on samples, so that the optimal threshold is selected.
5. Adaptive threshold
A data-driven adaptive threshold algorithm, such as Otsu's method, may be designed to automatically determine the threshold based on the FSC pulse signal profile.
Taking the above factors into account, it is generally recommended that the first threshold1 be set in the range of 3-5 times the FSC baseline noise, with specific values being determined for the sample and instrument. Meanwhile, the threshold value can be optimized by a trial and error method, or an automatic threshold value selection algorithm is designed.
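For the adaptive option, a minimal sketch of Otsu's method applied to FSC pulse heights is given below, assuming NumPy. The function name, the bin count and the sample values are illustrative assumptions. The routine histograms the heights and picks the split that maximizes the between-class variance of the weak and strong groups.

```python
import numpy as np


def otsu_threshold(values, bins: int = 256) -> float:
    """Return the histogram split that maximizes between-class variance."""
    values = np.asarray(values, dtype=float)
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)             # pulses at or below each candidate split
    w1 = hist.sum() - w0             # pulses above it
    s0 = np.cumsum(hist * centers)   # running sum of heights in the low class
    valid = (w0 > 0) & (w1 > 0)      # both classes must be non-empty
    if not valid.any():
        raise ValueError("need at least two distinct values")
    mu0 = s0[valid] / w0[valid]
    mu1 = (s0[-1] - s0[valid]) / w1[valid]
    between = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    best_bin = np.flatnonzero(valid)[np.argmax(between)]
    return float(edges[best_bin + 1])  # upper edge of the last low-class bin


# Made-up heights: many weak pulses and a few strong ones.
weak = [0.8, 0.9, 1.0, 1.1, 1.2] * 10
strong = [9.0, 10.0, 11.0] * 5
threshold1 = otsu_threshold(weak + strong)
```

A threshold obtained this way could be compared against the 3-5 times baseline noise range as a sanity check before it is used.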
In step S30, threshold2 is the threshold used to judge whether two FSC large pulses are close enough in time to be combined into one pulse. By default, threshold2 is 3 times the pulse time length.
Alternatively, threshold2 may be determined by taking the following factors into account:
1. Instrument acquisition frequency
The acquisition frequency of the flow cytometer determines the minimum time interval between two successive pulse events. If threshold2 is below this interval, an erroneous merge may result.
2. Physical width of cells or cell clusters
Considering the physical width of a cell mass, threshold2 should not be too large, so that different cell masses are not merged.
3. Noise jitter effect
A threshold2 that is too large may also wrongly merge noise jitter into a pulse.
4. Statistical multi-sample distance distribution
The distribution of pulse distances in multiple samples can be observed and an appropriate interval is selected as threshold2.
5. Empirical parameter adjustment
Different parameter values are tested manually, and the threshold with the best recognition effect is selected.
Considering the above factors, threshold2 may be set to 3-5 times the sampling interval corresponding to the acquisition frequency; alternatively, it may be tested step by step starting from 0.5 ms, selecting the value that effectively merges the pulses of a cell mass without introducing excessive merging.
In the above technical solution, the steps of screening the FSC matrix, the SSC pulse matrix and the fluorescent pulse matrix according to the FSC large pulse set to obtain a screened FSC matrix, a screened SSC pulse matrix and a screened fluorescent pulse matrix specifically include:
traversing the FSC matrix, and reserving a row corresponding to the FSC large pulse;
screening corresponding rows from SSC and fluorescent matrix according to FSC large pulse;
smoothing the filtered matrix;
outputting the filtered and smoothed FSC, SSC and fluorescent matrix.
Specifically, the implementation steps of S40 are described as follows:
1. Defining variables and constants
FSC_large_set: the FSC large pulse set obtained in step S30;
FSC_matrix: the FSC matrix obtained in step S20;
SSC_matrix: the SSC matrix obtained in step S20;
FL_matrix: the fluorescence matrix obtained in step S20;
FSC_filter: the FSC matrix after screening;
SSC_filter: the SSC matrix after screening;
FL_filter: the fluorescence matrix after screening;
2. Screening the FSC matrix
The FSC matrix is traversed; if the pulse psci of the current row is present in FSC_large_set, the row is retained, otherwise the row is deleted.
3. Screening the SSC and fluorescence matrices according to the same principle
4. Smoothing filtering
The screened matrices are smoothed and filtered:
FSC_filter=smooth(FSC_filter);
SSC_filter=smooth(SSC_filter);
FL_filter=smooth(FL_filter);
Through the steps, other matrixes are screened by using the FSC large pulse set, and a smooth matrix after screening is obtained, so that preparation is made for the next step of identification.
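The screening can be sketched as a row mask, assuming NumPy. This sketch relies on a simplifying assumption that is not stated in the text: the three matrices are row-aligned, i.e. row i of each matrix describes the same event. If the channels have different row counts, rows would first have to be matched, for example by pulse time. The helper name and the sample data are hypothetical.

```python
import numpy as np


def screen_by_large_pulses(fsc_matrix, ssc_matrix, fl_matrix, large_idx):
    """Keep only the rows whose FSC pulse belongs to the large pulse set."""
    n = len(fsc_matrix)
    if not (len(ssc_matrix) == n and len(fl_matrix) == n):
        raise ValueError("matrices must be row-aligned (one row per event)")
    keep = np.zeros(n, dtype=bool)
    keep[np.asarray(large_idx, dtype=int)] = True
    # The same mask is applied to every channel so rows stay aligned.
    return fsc_matrix[keep], ssc_matrix[keep], fl_matrix[keep], keep


# Made-up, row-aligned matrices for four events.
FSC_matrix = np.arange(16, dtype=float).reshape(4, 4)
SSC_matrix = FSC_matrix + 100.0
FL_matrix = FSC_matrix + 200.0

FSC_filter, SSC_filter, FL_filter, keep = screen_by_large_pulses(
    FSC_matrix, SSC_matrix, FL_matrix, large_idx=[1, 3])
```

Returning the mask alongside the screened matrices makes it possible to map screened rows back to the original events later.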
In the above technical solution, the step of establishing an aggregation matrix by using the obtained screening FSC matrix, the screening SSC pulse matrix and the screening fluorescent pulse matrix specifically includes:
transversely splicing screening smooth matrixes of different channels to form an aggregation matrix;
Normalizing and standardizing the aggregation matrix;
and performing dimension reduction on the normalized and standardized aggregation matrix by adopting PCA to obtain a dimension reduction aggregation matrix fused with different channel information.
Specifically, the implementation steps of S50 are described as follows:
1. Definition of variables
fusion_matrix: the aggregation matrix to be built
2. Matrix aggregation
The three matrices are spliced transversely, row by row, to obtain fusion_matrix:
fusion_matrix=[FSC_filter, SSC_filter, FL_filter];
3. Normalization
Column-wise normalization is performed on fusion_matrix. Normalization eliminates the effect of different feature magnitudes.
4. Dimension reduction
The dimension of fusion_matrix can be reduced using PCA or a similar method.
So far, the feature matrixes of different channels are aggregated to form an aggregation matrix, and standardized and dimension-reduced processing is carried out to prepare for the next identification.
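The three operations of this step can be sketched together, assuming NumPy and row-aligned screened matrices. The normalization formula is not specified in the text, so z-score standardization per column is used here as one plausible choice; PCA is computed through a singular value decomposition of the standardized data. The function name and the random input are illustrative only.

```python
import numpy as np


def build_aggregation_matrix(fsc_filter, ssc_filter, fl_filter, n_components):
    """Splice, standardize and reduce the screened channel matrices."""
    # 1) Transverse splicing: columns of the three channels side by side.
    fusion = np.hstack([fsc_filter, ssc_filter, fl_filter])
    # 2) Column-wise z-score standardization (assumed normalization scheme).
    mean = fusion.mean(axis=0)
    std = fusion.std(axis=0)
    std[std == 0] = 1.0  # constant columns: avoid division by zero
    z = (fusion - mean) / std
    # 3) PCA via SVD: the rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    reduced = z @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)  # variance share per component
    return fusion, reduced, explained[:n_components]


# Random stand-in data (not cytometry measurements): 20 events, 4 features each.
rng = np.random.default_rng(0)
FSC_filter = rng.normal(size=(20, 4))
SSC_filter = rng.normal(size=(20, 4))
FL_filter = rng.normal(size=(20, 4))

fusion_matrix, reduced_matrix, explained_ratio = build_aggregation_matrix(
    FSC_filter, SSC_filter, FL_filter, n_components=3)
```

The explained variance ratios give a simple way to choose how many components to keep for a given data set.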
In the above technical solution, the steps of establishing and training the cell mass analysis model specifically include:
collecting a plurality of historical aggregation matrix samples, and marking cell mass data to serve as training samples;
constructing a machine learning classification model;
and training the machine learning classification model by using a training sample to obtain a cell mass analysis model.
The specific description is as follows:
1. Collecting and labeling a plurality of historical aggregation matrix samples
- collecting a plurality of aggregation matrix samples, produced in step S50, that contain single cells and cell clusters;
- labeling each sample by manual judgment, identifying the rows that correspond to single cells and those that correspond to cell clusters;
2. Constructing the classification model
- taking each matrix row as a sample input;
- taking the labeled class as the label;
- building a machine learning classification model, such as an SVM or a random forest;
3. Training the model
- splitting the data into a training set and a validation set;
- training the classification model and optimizing the model hyper-parameters;
- evaluating the model on the validation set;
4. Model evaluation (a further, optional step)
- evaluating classification performance on an independent test set;
- calculating indexes such as accuracy and recall.
Further, in the above technical solution, the model used for constructing the machine learning classification model is an SVM or a random forest.
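A training run along these lines might look like the sketch below, which assumes scikit-learn and NumPy are installed and uses a random forest. The data are synthetic stand-ins generated for illustration, not labeled cytometry samples, and the hyper-parameter values are arbitrary starting points rather than recommended settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labeled rows of the dimension-reduced aggregation
# matrix: label 0 = single cell, label 1 = cell mass.
rng = np.random.default_rng(0)
singles = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
clumps = rng.normal(loc=4.0, scale=1.0, size=(60, 3))
X = np.vstack([singles, clumps])
y = np.array([0] * 200 + [1] * 60)

# Split into a training set and a validation set, keeping class proportions.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# class_weight="balanced" compensates for cell masses being the rarer class.
model = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                               random_state=0)
model.fit(X_train, y_train)

# Evaluate on the validation set.
pred = model.predict(X_val)
val_accuracy = accuracy_score(y_val, pred)
val_recall = recall_score(y_val, pred)  # share of cell masses that were found
```

Swapping in `sklearn.svm.SVC` for the random forest would exercise the SVM option with the same training and evaluation flow.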
In the above technical solution, the step of deleting the electric pulse signal corresponding to the cell mass in the electric pulse signal set specifically includes:
Traversing an original electric pulse signal set;
Judging whether each pulse is in the identified cell mass collection;
If so, deleting the pulse from the original set;
Retaining the pulse of the non-cell mass to form a filtered output set;
returning the filtered set of electrical pulse signals.
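The deletion step amounts to a membership filter, sketched below with the standard library only. It assumes each pulse record carries an identifier linking it to an event; the `event_id` key, the record layout and the function name are hypothetical.

```python
from typing import Dict, Iterable, List


def remove_cell_mass_pulses(pulse_set: List[Dict],
                            cell_mass_ids: Iterable[int]) -> List[Dict]:
    """Return the pulses whose event was not identified as a cell mass."""
    ids = set(cell_mass_ids)  # set lookup keeps the traversal linear
    return [p for p in pulse_set if p["event_id"] not in ids]


# Made-up pulse records for illustration.
original_set = [
    {"event_id": 0, "channel": "FSC", "area": 1.2},
    {"event_id": 1, "channel": "FSC", "area": 6.8},
    {"event_id": 1, "channel": "SSC", "area": 5.1},
    {"event_id": 2, "channel": "FSC", "area": 1.4},
]
filtered_set = remove_cell_mass_pulses(original_set, cell_mass_ids=[1])
```

Because a new list is built, the original electric pulse signal set remains available for auditing which pulses were removed.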
A second aspect of the present invention provides a computer readable storage medium having stored therein program instructions which, when executed, carry out the method for outputting a cell mass removal signal by a flow cytometer as described above.
A third aspect of the present invention provides a system for outputting a cell mass removal signal by a flow cytometer, wherein the system comprises the computer readable storage medium.
Specifically, the principle of the invention is as follows:
1. multi-channel information fusion
The flow cytometer can acquire multiple parameters for each cell or cell mass simultaneously, such as FSC, SSC and fluorescence. The invention fuses the information matrices of the different channels to form an aggregation matrix that combines multiple features. The FSC, SSC and fluorescent signals of each cell or cell mass are related to one another: a cell mass is relatively large and has a more complex internal structure, so its FSC and SSC signals are correlated, and because a cell mass carries relatively more fluorescent dye, its fluorescent signal produces channel values different from those of ordinary cells. With purely manual judgment, the selected threshold may be inaccurate because the data volume is too large. The multi-channel fusion method therefore performs multi-feature extraction and analysis on the aggregation matrix, so that cell mass signals can be determined more reliably.
The aggregation matrix provides a richer feature describing each event than a single parameter, facilitating subsequent identification and classification. The fusion of multimodal information is an important approach to improve expression.
2. Matrix representation that encodes subtle variations
The invention organizes the description index of the pulse event, such as pulse shape, time and other information, in a matrix form.
Compared with simple statistics, the matrix saves small changes of the original pulse, especially relations among different channel signals, and provides richer features for cell mass identification.
3. Pre-trained machine learning model
The invention uses a pre-trained machine learning model to carry out classification analysis on the aggregation matrix, and identifies the cell clusters therein.
Compared with a fixed threshold and a manual judgment rule, the machine learning model can learn complex distribution of samples and has stronger generalization capability on unknown data.
4. End-to-end automation flow
From the original signal to the final output, the invention establishes a fully-automatic analysis flow without manual intervention and parameter setting.
Standardized end-to-end flow is critical to achieving automated detection.
5. Balancing accuracy and efficiency
The invention considers identification accuracy and operating efficiency at the same time: multi-channel aggregation first increases the amount of information, and algorithmic dimension reduction then ensures efficiency.
The balance of accuracy and efficiency is also an important indicator of an automated detection system.
The foregoing is merely illustrative of the present invention, and the protection scope of the present invention is not limited thereto; any variation or substitution that a person skilled in the art can readily conceive of shall fall within the protection scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.