
Tracking device, endoscope system, and tracking method

Info

Publication number: US11900615B2
Application number: US 17/179,903
Other versions: US20210183076A1 (en)
Authority: US (United States)
Prior art keywords: representative points, representative, tracking, outlier, representative point
Legal status: Active, expires (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Inventor: Makoto ISHIKAKE
Current assignee: Olympus Corp (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Olympus Corp
Events:
    • Application filed by Olympus Corp
    • Assigned to OLYMPUS CORPORATION: assignment of assignors interest (see document for details); assignor: ISHIKAKE, Makoto
    • Publication of US20210183076A1
    • Application granted
    • Publication of US11900615B2
    • Status: Active
    • Adjusted expiration

Abstract

A tracking device includes a processor including hardware, and the processor sets a start frame, extracts multiple representative points of a contour of a tracking target, tracks the extracted multiple representative points, performs outlier determination based on an interrelationship of the tracked multiple representative points, performs a process of removing an outlier representative point determined to be an outlier, and extracts new representative points based on multiple representative points after the process of removing the outlier representative point when a given condition is met.

Description

CROSS REFERENCE TO RELATED APPLICATION
This application is a continuation of International Patent Application No. PCT/JP2019/013606, having an international filing date of Mar. 28, 2019, which designated the United States, the entirety of which is incorporated herein by reference.
BACKGROUND
There are conventional methods for estimating a location of a designated target in each frame image included in a video. Such methods are hereinafter referred to as tracking, and the designated target is hereinafter referred to as a tracking target. The tracking can be considered as a method for tracking how the tracking target has moved over multiple frame images.
For example, Japanese Unexamined Patent Application Publication No. 2007-222533 discloses a method for tracking an organ in a medical image by using contour points of the organ.
SUMMARY
In accordance with one of some aspects, there is provided a tracking device comprising a processor including hardware,
the processor being configured to:
    • set a start frame to start tracking of a tracking target in a video including multiple frames;
    • extract multiple representative points of a contour of the tracking target in the start frame;
    • track the extracted multiple representative points in frames subsequent to the start frame;
    • perform outlier determination on the tracked multiple representative points based on an interrelationship of the multiple representative points;
    • perform a process of removing an outlier representative point that is a representative point determined to be an outlier; and
    • update the representative points by extracting new representative points based on multiple representative points after the process of removing the outlier representative point, in a case where any frame subsequent to the start frame meets a given condition.
In accordance with one of some aspects, there is provided an endoscope system comprising:
    • a memory that stores a trained model;
    • an endoscopic scope that captures a detection image; and
    • a processor that accepts the detection image as input, and performs a process of detecting a position of a given object from the detection image by using the trained model,
    • the trained model having been trained by machine learning based on training data in which annotation data is associated with a frame image in a video,
    • the annotation data being generated by:
    • acquiring the video including multiple frames;
    • setting a start frame to start tracking of a tracking target;
    • extracting multiple representative points of a contour of the tracking target in the start frame;
    • tracking the extracted multiple representative points in frames subsequent to the start frame;
    • performing outlier determination on the tracked multiple representative points based on an interrelationship of the multiple representative points;
    • performing a process of removing an outlier representative point that is a representative point determined to be an outlier;
    • updating the representative points by extracting new representative points based on multiple representative points after the process of removing the outlier representative point, in a case where any frame subsequent to the start frame meets a given condition; and
    • generating the annotation data in which an inside of a closed curve generated based on the tracked multiple representative points is defined as an annotation region for each frame subsequent to the start frame.
In accordance with one of some aspects, there is provided a tracking method comprising:
    • acquiring a video including multiple frames;
    • setting a start frame to start tracking of a tracking target;
    • extracting multiple representative points of a contour of the tracking target in the start frame;
    • tracking the extracted multiple representative points in frames subsequent to the start frame;
    • performing outlier determination on the tracked multiple representative points based on an interrelationship of the multiple representative points;
    • performing a process of removing an outlier representative point that is a representative point determined to be an outlier; and
    • updating the representative points by extracting new representative points based on multiple representative points after the process of removing the outlier representative point, in a case where any frame subsequent to the start frame meets a given condition.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a configuration example of a tracking device.
FIG. 2 is a flowchart illustrating processing procedures performed by the tracking device.
FIG. 3 is a diagram illustrating a process of extracting representative points based on a tag region.
FIG. 4 is a diagram illustrating a tracking process.
FIG. 5 is a diagram illustrating a process of removing an outlier representative point.
FIG. 6 is a diagram illustrating a process of updating the representative points.
FIG. 7 is a diagram illustrating a process of generating the tag region based on the representative points.
FIGS. 8A to 8C are examples of objects whose positions and shapes are not clearly displayed in images.
FIG. 9 is a flowchart illustrating a series of procedures from training data generation to object detection.
FIG. 10 is an example of annotation.
FIG. 11 is an example of training data generated by the annotation.
FIG. 12 is a diagram illustrating automatic tagging by tracking.
FIG. 13 is a configuration example of a learning device.
FIG. 14 is a flowchart illustrating learning procedures.
FIG. 15 is an example of a neural network.
FIG. 16 is a configuration example of an endoscope system including an information processing system.
FIGS. 17A and 17B are diagrams illustrating scattering of a region due to tracking errors.
DESCRIPTION OF EXEMPLARY EMBODIMENTS
The following disclosure provides many different embodiments, or examples, for implementing different features of the provided subject matter. These are, of course, merely examples and are not intended to be limiting. In addition, the disclosure may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. Further, when a first element is described as being “connected” or “coupled” to a second element, such description includes embodiments in which the first and second elements are directly connected or coupled to each other, and also includes embodiments in which the first and second elements are indirectly connected or coupled to each other with one or more other intervening elements in between.
Exemplary embodiments are described below. Note that the following exemplary embodiments do not in any way limit the scope of the content defined by the claims laid out herein. Note also that all of the elements described in the present embodiment should not necessarily be taken as essential elements.
1. Overview
Methods for tracking a tracking target in a video are conventionally widely used. For example, implementation of machine learning for recognizing an object in an image requires a large number of images attached with tags. The images attached with the tags are hereinafter referred to as tagged images. Generation of the tagged images needs to be done manually, and thus takes a lot of time. In a case of surgery using an endoscope, as will be described later referring to FIG. 16, the tagging needs to be done by a surgeon who is skilled in the surgery; tagging a large number of images is therefore not easy.
In order to reduce the load of generating the tagged images, there is a method in which a tag generated in a given frame is tracked, and the tracking result is used to tag a new frame. The tracking target assumed in the present embodiment is a region including a group of pixels in an image.
FIGS. 17A and 17B are schematic diagrams illustrating a conventional method of region-based tracking. FIG. 17A is a tagged image to which a tag is manually attached, for example. A region corresponding to the tag in the image is hereinafter referred to as a tag region. In the conventional method of region-based tracking, processing is performed for each pixel. In the example illustrated in FIG. 17A, tracking is performed for each of a plurality of pixels in the tag region. When the tracking is continued over multiple frames, tracking errors accumulate as the number of times of tracking increases, i.e., as time passes in the video. FIG. 17B illustrates a result after a predetermined number of times of tracking. As illustrated in FIG. 17B, the tag region, which is a single continuous region in the original image, may be scattered due to the influence of the tracking errors.
A tracking device 200 according to the present embodiment extracts a contour of the tag region and tracks the contour line of the extraction result as the target. The tracking device then performs mask processing on the inside of the contour line of the tracking result to determine the tag region in a new frame. The tracking device 200 may track all points of the contour line, or only some points of the contour line. As a result, scattering of the tag region is suppressed, so that the tag region can be appropriately tracked.
However, even when the contour line is tracked, tracking errors occur. In view of this, according to the present embodiment, outliers are removed to suppress the influence of the tracking errors. Moreover, according to the present embodiment, the number of representative points to be tracked decreases as outliers are removed, so the representative points used for tracking are re-extracted when a predetermined condition is met. With these methods, tracking accuracy can be further enhanced. The methods according to the present embodiment are described in detail below.
2. Tracking Device
FIG. 1 is a diagram illustrating a configuration example of the tracking device 200 according to the present embodiment. As illustrated in FIG. 1, the tracking device 200 includes a frame setting section 210, a representative point extracting section 221, a tracking section 222, an outlier removing section 223, a representative point updating section 224, and an annotation data generating section 260. However, the tracking device 200 is not limited to the configuration illustrated in FIG. 1, and can be implemented in various modified manners, for example, by omitting some of the components or adding other components. For example, the annotation data generating section 260 may be omitted.
The tracking device 200 according to the present embodiment includes hardware described below. The hardware may include at least one of a circuit that processes a digital signal and a circuit that processes an analog signal. For example, the hardware may include one or more circuit devices mounted on a circuit board, or one or more circuit elements. The one or more circuit devices are an integrated circuit (IC) or a field programmable gate array (FPGA), for example. The one or more circuit elements are a resistor or a capacitor, for example.
Furthermore, each section of the tracking device 200, including the frame setting section 210, the representative point extracting section 221, the tracking section 222, the outlier removing section 223, the representative point updating section 224, and the annotation data generating section 260, may be implemented by a processor described below. The tracking device 200 includes a memory that stores information, and a processor that operates based on the information stored in the memory. The information includes, for example, a program and various data. The processor includes hardware. The processor may be any one of various processors such as a central processing unit (CPU), a graphics processing unit (GPU), or a digital signal processor (DSP). The memory may be a semiconductor memory such as a static random access memory (SRAM) or a dynamic random access memory (DRAM), a register, a magnetic storage device such as a hard disk drive, or an optical storage device such as an optical disc device. For example, the memory stores a computer-readable instruction. A function of each of the sections of the tracking device 200 is implemented as a process when the processor executes the instruction. The instruction used here may be an instruction set that is included in a program, or may be an instruction that instructs a hardware circuit included in the processor to operate. Furthermore, all or part of the frame setting section 210, the representative point extracting section 221, the tracking section 222, the outlier removing section 223, the representative point updating section 224, and the annotation data generating section 260 may be implemented by cloud computing, so that a video is acquired through a network and the tracking process described later is performed on the cloud.
The frame setting section 210 acquires a video and sets a start frame to start tracking. The video used here is a tagged video including some tagged frames, for example.
The representative point extracting section 221 extracts representative points used for tracking from a contour of the tracking target in the start frame. The contour of the tracking target in the start frame can be obtained from the tagged region in the start frame.
The tracking section 222 tracks the representative points extracted by the representative point extracting section 221 in frames subsequent to the start frame. As will be described later, when the representative points are updated by the representative point updating section 224, the tracking section 222 tracks the updated representative points.
The outlier removing section 223 performs outlier determination mutually among the representative points tracked by the tracking section 222, and performs a process of removing an outlier representative point determined to be an outlier.
The representative point updating section 224 determines whether the representative points need to be updated. When the representative point updating section 224 determines that the representative points need to be updated, it extracts new representative points based on the state of the remaining representative points in the frame being processed after the outlier representative point is removed.
The annotation data generating section 260 performs a process of generating annotation data based on the tracking result for each frame subsequent to the start frame. The annotation data is data in which the inside of a closed curve connecting the tracked representative points is defined as an annotation region, and is metadata provided to the associated frame image. Data including a frame image and the annotation data provided to the frame image is used as training data for machine learning, for example.
FIG. 2 is a flowchart illustrating a process according to the present embodiment. When this process starts, the frame setting section 210 sets a tagged frame as the start frame in a step S101. The frame setting section 210 may automatically set the first frame of a video as the start frame for tracking.
Next, in a step S102, the representative point extracting section 221 extracts representative points to be tracked from a contour of the tracking target included in the start frame. FIG. 3 is a diagram illustrating the extraction process of the representative points. The tag according to the present embodiment is information input by an operator such as a surgeon, and is the annotation data provided to an image as metadata, as will be described later referring to FIGS. 10 and 11, for example. The annotation data is a mask image including a tag region set with a first pixel value and a region other than the tag region set with a second pixel value different from the first pixel value, for example.
The representative point extracting section 221 extracts a contour of the tag region. When information about the tag region is acquired as the mask image described above, the representative point extracting section 221 extracts pixels with the first pixel value that are adjacent to pixels with the second pixel value as the contour of the tag region, for example. However, the extraction process of the contour may be implemented in various modified manners, for example, by using a known edge extraction filter.
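The patent does not prescribe an implementation for this contour extraction, but the neighbour test described above maps directly onto array operations. The following is a minimal Python sketch under the assumption that the annotation mask uses 1 for the tag region and 0 elsewhere; the function name and the 4-neighbour criterion are illustrative choices, not part of the patent.

```python
import numpy as np

def extract_contour_pixels(mask: np.ndarray) -> np.ndarray:
    """Return (row, col) coordinates of tag-region pixels that touch the background.

    `mask` is a binary annotation mask: 1 inside the tag region, 0 outside.
    A pixel belongs to the contour if it lies inside the region and at least one
    of its 4-neighbours lies outside.
    """
    inside = mask.astype(bool)
    # Pad so that image-border pixels are treated as having background neighbours.
    padded = np.pad(inside, 1, mode="constant", constant_values=False)
    up    = padded[:-2, 1:-1]
    down  = padded[2:,  1:-1]
    left  = padded[1:-1, :-2]
    right = padded[1:-1, 2:]
    has_background_neighbour = ~(up & down & left & right)
    contour = inside & has_background_neighbour
    return np.argwhere(contour)  # shape (N, 2), each row is (y, x)
```

An equivalent result could also be obtained with a library routine such as OpenCV's findContours, which additionally returns the boundary pixels in order along the contour.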
The representative point extracting section 221 may select all pixels on the contour as the representative points used for tracking. Even in this case, the tracking target does not need to include pixels inside the tag region. Accordingly, scattering of the region can be suppressed, and the processing load can be reduced. The representative point extracting section 221 may also extract some of the pixels on the contour as the representative points. For example, the representative point extracting section 221 extracts the representative points from the pixels on the contour at a regular interval. For example, the representative point extracting section 221 extracts twelve representative points such that the intervals between adjacent representative points are the same (or approximately the same), as illustrated in FIG. 3. A number n (n is an integer of two or more) of representative points to be extracted may be set in advance, and the representative point extracting section 221 may divide the contour line into n parts to set n representative points. Alternatively, the interval between adjacent representative points may be set in advance, and the representative point extracting section 221 may set the representative points according to the interval. In this case, the number of representative points changes depending on the length of the contour line.
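As one possible reading of the equal-interval variant, the sketch below samples n points spaced evenly by arc length along an ordered contour obtained from the mask. It assumes OpenCV 4's findContours return signature and a binary mask with the tag region set to 1; the function name and the default n = 12 simply mirror the example in the text.

```python
import cv2
import numpy as np

def sample_representative_points(mask: np.ndarray, n: int = 12) -> np.ndarray:
    """Pick n points spaced approximately evenly along the tag-region contour."""
    # OpenCV 4 returns (contours, hierarchy); each contour is an ordered polygon.
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).reshape(-1, 2)  # (x, y) vertices

    # Cumulative arc length along the closed contour.
    diffs = np.diff(np.vstack([contour, contour[:1]]), axis=0)
    seg_len = np.hypot(diffs[:, 0], diffs[:, 1])
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])
    total = cum[-1]

    # Evenly spaced target arc-length positions, then the nearest contour vertex to each.
    targets = np.linspace(0.0, total, n, endpoint=False)
    idx = np.searchsorted(cum, targets, side="right") - 1
    idx = np.clip(idx, 0, len(contour) - 1)
    return contour[idx]
```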
As will be described later referring to FIG. 7, the tracking device 200 according to the present embodiment generates a closed curve connecting the representative points of the tracking result, and defines the inside of the closed curve as the tag region. Therefore, the representative points used for tracking need to be points that can reproduce the contour of the tracking target with rather high accuracy when the representative points are connected. With a contour of a simple shape, information about the contour is unlikely to be lost even when the number of representative points is small. On the contrary, with a contour of a complicated shape, the information about the contour may be lost unless many representative points are set.
The representative point extracting section 221 may set the representative points based on a curvature of the contour. For example, the representative point extracting section 221 divides the extracted contour into multiple curves, and obtains the curvature of each of the divided curves. For example, assuming that a curve is approximated by a circle, the radius of the circle is the curvature radius, and the reciprocal of the curvature radius is the curvature. The curvature of the contour may be obtained for each pixel. The representative point extracting section 221 extracts more representative points from a portion of the contour with a large curvature than from a portion with a small curvature. As a result, the density of the representative points can be regulated according to the shape of the contour, and thus the contour can be appropriately reproduced based on the representative points. That is, the region of the tracking target can be tracked with high accuracy.
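One way to realize this curvature-dependent density is to weight each contour segment by a discrete curvature measure and then sample evenly in the weighted parameter. The sketch below uses the turning angle at each vertex as a curvature proxy; both the proxy and the weighting factor alpha are assumptions made for illustration, not values given in the patent.

```python
import numpy as np

def curvature_weighted_points(contour: np.ndarray, n: int = 12,
                              alpha: float = 4.0) -> np.ndarray:
    """Sample n points from an ordered closed contour, denser where curvature is high.

    contour : (N, 2) array of ordered (x, y) vertices.
    alpha   : how strongly curvature increases sampling density (assumed knob).
    """
    prev = np.roll(contour, 1, axis=0)
    nxt = np.roll(contour, -1, axis=0)
    v1 = contour - prev
    v2 = nxt - contour
    # Turning angle at each vertex, wrapped to [0, pi]: a simple discrete curvature proxy.
    ang1 = np.arctan2(v1[:, 1], v1[:, 0])
    ang2 = np.arctan2(v2[:, 1], v2[:, 0])
    turn = np.abs(np.angle(np.exp(1j * (ang2 - ang1))))

    seg = np.hypot(v2[:, 0], v2[:, 1])
    norm_turn = turn / turn.max() if turn.max() > 0 else np.zeros_like(turn)
    # Stretch each segment's weight where the contour bends, then sample evenly in weight.
    weight = seg * (1.0 + alpha * norm_turn)
    cum = np.concatenate([[0.0], np.cumsum(weight)])
    targets = np.linspace(0.0, cum[-1], n, endpoint=False)
    idx = np.clip(np.searchsorted(cum, targets, side="right") - 1, 0, len(contour) - 1)
    return contour[idx]
```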
After the representative points are extracted in the start frame, the tracking section 222 tracks the extracted representative points in a step S103. Specifically, the tracking section 222 infers at which position in the image of the subsequent second frame a given representative point in the first frame is present.
FIG. 4 is a diagram illustrating the tracking process. The tracking is performed using the frame images of two frames. The tracking section 222 extracts a region in the vicinity of a given representative point P1 as a template image TI from a frame image F1 in the first frame. For example, the template image TI is a square image of a predetermined size having the representative point P1 as its center; however, the size and the shape may be implemented in various modified manners. The tracking section 222 performs template matching using the template image TI in a frame image F2 in the second frame, as illustrated in FIG. 4. Then, the tracking section 222 determines the position with the lowest difference degree, or the position with the highest matching degree, with respect to the template image TI as the point corresponding to the representative point P1. The detection range for the template matching may be the entire frame image F2 or part of the frame image F2. The tracking section 222 performs the process illustrated in FIG. 4 for each of the representative points in the first frame to track the representative points. The tracking method may be implemented in various modified manners, such as a tracking method based on luminance or contrast of the frame image at a representative point, or a tracking method using an optical flow.
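For a concrete picture of this step, the sketch below tracks a single representative point with normalized cross-correlation template matching via OpenCV. The template half-size and the whole-frame search range are assumptions; the patent leaves both open and also allows entirely different matching criteria.

```python
import cv2
import numpy as np

def track_point(prev_frame: np.ndarray, next_frame: np.ndarray,
                point: tuple, half: int = 15) -> tuple:
    """Track one representative point from prev_frame to next_frame by template matching.

    Returns (new_point, score), where score is the NCC peak used later as reliability.
    `half` is the half-size of the square template (an assumed value).
    """
    x, y = int(point[0]), int(point[1])
    h, w = prev_frame.shape[:2]
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    template = prev_frame[y0:y1, x0:x1]

    # Normalized cross-correlation over the whole next frame;
    # a smaller search window around the previous position could be used instead.
    result = cv2.matchTemplate(next_frame, template, cv2.TM_CCORR_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)

    # matchTemplate reports the top-left corner of the best match; recover the point
    # position by adding the point's offset inside the (possibly clipped) template.
    new_x = max_loc[0] + (x - x0)
    new_y = max_loc[1] + (y - y0)
    return (new_x, new_y), max_val
```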
Next, in a step S104, the outlier removing section 223 performs outlier removal on the points after tracking. The representative points according to the present embodiment are points representing the contour of the tracking target. A significant change in the shape of the tracking target in the image is unlikely to happen within one frame. The target to be imaged changes significantly when a scene change occurs, as will be described later referring to FIG. 12, for example; in such a case, continuation of the tracking is unlikely to be needed. That is, in a scene where the tracking is performed, the moving tendencies of the multiple representative points are similar to some extent. When a given representative point obviously moves differently from the other representative points, the tracking of that representative point is likely to be an error.
The outlier removing section 223 extracts a representative point that moves differently from the other representative points as an outlier representative point based on the interrelationship of the representative points. For example, the outlier removing section 223 determines that a given representative point is an outlier representative point when the difference between the moving distance of the given representative point and the moving distance of at least one adjacent representative point exceeds a predetermined value. Alternatively, the outlier removing section 223 determines that the given representative point is an outlier representative point when the distance between the given representative point and at least one adjacent representative point exceeds a predetermined value.
Alternatively, the outlier removing section 223 obtains the curvature of a curve formed by the given representative point and its adjacent representative points, and determines that the given representative point is an outlier representative point when the obtained curvature exceeds a predetermined value. The adjacent representative points used here are the two representative points adjacent to the given representative point in the direction along the contour line, i.e., the representative points on both sides of the given representative point. However, the adjacent representative points may be implemented in various modified manners, for example, by adding representative points other than the adjacent two. As a result, determination of the deviation degree of a representative point enables appropriate removal of the outlier representative point.
FIG. 5 is a diagram illustrating the outlier removal process. The moving distance of the representative point indicated by P2 in FIG. 5 is larger than the moving distances of the representative points P3 and P4 adjacent to the representative point P2. Alternatively, as for the curve passing through the representative points P2 to P4, the curvature of the curve is large. The curvature may be compared with a given fixed value, with the curvature in the first frame, or with the curvature of another representative point in the second frame. In any case, the curvature corresponding to the representative point P2 is determined to be large. Accordingly, the outlier removing section 223 removes the representative point P2.
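A minimal sketch of the moving-distance criterion is shown below. The patent allows several variants (distance to a neighbour, moving-distance difference, or local curvature); this sketch implements only the moving-distance variant and, as a conservative assumption, flags a point only when its motion disagrees with both of its contour neighbours. The threshold value is likewise an assumption.

```python
import numpy as np

def remove_outliers(prev_pts: np.ndarray, curr_pts: np.ndarray,
                    dist_thresh: float = 20.0) -> np.ndarray:
    """Return indices of tracked points to keep, dropping motion outliers.

    prev_pts, curr_pts : (N, 2) arrays of the same points before and after tracking,
                         ordered along the contour.
    dist_thresh        : assumed threshold, in pixels, on the motion difference.
    """
    motion = curr_pts - prev_pts                 # per-point displacement vectors
    left = np.roll(motion, 1, axis=0)            # displacement of the previous neighbour
    right = np.roll(motion, -1, axis=0)          # displacement of the next neighbour
    d_left = np.linalg.norm(motion - left, axis=1)
    d_right = np.linalg.norm(motion - right, axis=1)
    # A point is treated as an outlier when it disagrees with both neighbours.
    keep = ~((d_left > dist_thresh) & (d_right > dist_thresh))
    return np.where(keep)[0]
```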
As a result of the processing in the steps S103 and S104, representative points excluding the inappropriate representative point can be acquired with high accuracy in the second frame subsequent to the first frame of the tracking source. The tracking process illustrated in FIG. 4 is performed for each of the multiple representative points after the outlier removal in the second frame, so that the tracking can be continued in the third frame subsequent to the second frame and onward. The outlier removal process may be performed for each frame, or after every series of tracking over a predetermined number of frames.
However, according to the present embodiment, the representative point updating section 224 determines whether the representative points need to be updated in a step S105 in order to perform the tracking with accuracy. As described above, in the method according to the present embodiment, the representative point determined to be an outlier representative point is removed, and the number of representative points may therefore decrease. When the number of remaining representative points becomes excessively small, reproducing the contour of the tracking target with the remaining representative points is difficult. As a result, tracking accuracy is degraded. In view of this, the representative point updating section 224 determines that the representative points need to be updated when the number of representative points becomes smaller than a predetermined number.
FIG. 6 is a diagram illustrating the update process of the representative points. In a step S106, the representative point updating section 224 first connects the whole point group of the remaining representative points after the outlier removal to generate a closed curve. The representative point updating section 224 performs known spline interpolation to generate the closed curve, for example. However, there are various known methods for generating a closed curve from multiple points, and these methods are widely applicable to the present embodiment.
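As an illustration of the spline-interpolation option, the sketch below fits a periodic cubic spline through the remaining points with SciPy and evaluates it densely to obtain a closed curve. The number of evaluation samples is an assumed value, and any other closed-curve construction mentioned in the text would serve equally well.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def closed_curve(points: np.ndarray, samples: int = 400) -> np.ndarray:
    """Fit a closed periodic spline through the remaining representative points.

    points  : (N, 2) array of (x, y) points ordered along the contour
              (a handful of points at least, so the cubic fit is well posed).
    samples : number of points at which the closed curve is evaluated (assumed value).
    """
    pts = np.vstack([points, points[:1]])         # close the loop for the periodic fit
    # per=True makes the spline periodic, i.e. a closed curve; s=0 interpolates exactly.
    tck, _ = splprep([pts[:, 0], pts[:, 1]], s=0, per=True)
    u = np.linspace(0.0, 1.0, samples)
    cx, cy = splev(u, tck)
    return np.stack([cx, cy], axis=1)
```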
Next, in a step S108, the representative point updating section 224 re-extracts representative points from the generated closed curve. Meanwhile, the purpose of the update process of the representative points is to continue the tracking with accuracy. Thus, in the flowchart in FIG. 2, whether to terminate the tracking is first determined in a step S107, and re-extraction of the representative points is then performed when the tracking is not terminated.
The re-extraction process of the representative points is the same as the extraction process of the representative points from the contour in the start frame. That is, the representative point updating section 224 may extract the representative points from pixels on the closed curve at a regular interval, or may change the density of the representative points according to the curvature of the closed curve. At this time, the representative points to be newly extracted do not need to coincide with the original representative points. For example, when the closed curve is generated from eleven representative points and twelve representative points are re-extracted as illustrated in FIG. 6, it is not necessary to keep the original eleven representative points and add one representative point; all twelve representative points may be newly selected. This is because the method according to the present embodiment tracks the contour of the tracking target, and does not depend on the positions of individual representative points on the contour.
The representative point updating section 224 may determine that the representative points need to be updated when the reliability of the tracking result becomes lower than a predetermined value. The reliability of the tracking result is, for example, the lowest value of the difference degree or the highest value of the matching degree of the template matching. The difference degree is a sum of squared differences (SSD) or a sum of absolute differences (SAD), for example; the reliability is determined to be low when the lowest value is equal to or higher than a predetermined threshold value. The matching degree is a normalized cross correlation (NCC), for example; the reliability is determined to be low when the highest value is equal to or lower than a predetermined threshold value. Updating the representative points also updates the template image for the template matching. As a result, the update of the representative points can enhance tracking accuracy.
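Putting the update triggers discussed here and above together, a decision helper might look like the sketch below. All three thresholds (minimum point count, minimum NCC peak, maximum frame count) are assumed example values; the patent only states that such conditions exist.

```python
def needs_update(num_points: int, reliability: float, frames_since_update: int,
                 min_points: int = 8, min_reliability: float = 0.9,
                 max_frames: int = 30) -> bool:
    """Decide whether the representative points should be re-extracted.

    The three conditions mirror the text: too few points remaining after outlier
    removal, low template-matching reliability (e.g. the NCC peak), or a fixed
    number of frames since the last update. All thresholds are assumptions.
    """
    return (num_points <= min_points
            or reliability <= min_reliability
            or frames_since_update >= max_frames)
```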
Considering that the representative points should be refreshed when the tracking accuracy is degraded, the representative point updating section 224 may also determine that the representative points need to be updated when the tracking has been performed over a predetermined number of frames, i.e., when a given time has passed. When the tracking is continued over multiple frames, tracking errors are accumulated. By setting the passage of the given time as a determination condition, the representative points can be updated when the tracking accuracy is likely to be degraded.
When the representative points do not need to be updated (No in the step S105), or after the update of the representative points (after the processing in the step S108), the process returns to the step S103 and continues. The tracking section 222 performs the tracking for one frame based on the representative points in the latest frame. The processing after that is the same, i.e., the outlier removal process is performed, the update process of the representative points is performed as needed, and the tracking is performed in the subsequent frame based on the results of these processes.
When the tracking is determined to be terminated (Yes in the step S107), the tracking device 200 performs a generation process of the tag region in a step S109. FIG. 7 is a diagram illustrating the generation process of the tag region. Specifically, a closed curve connecting the representative points is generated in each frame, and a process of generating the annotation data in which the inside of the closed curve is defined as the tag region is performed. The generation process of the closed curve is the same as the process in the step S106. The representative points in each frame are the representative points after the removal process of the outlier representative points. As for a frame to which the update process of the representative points has been applied, the closed curve of the processing result in the step S106 may be used.
The tag region according to the present embodiment may be the metadata (annotation data) provided to the image. In this case, the process illustrated in FIG. 7 is performed by the annotation data generating section 260. The annotation data generated by the annotation data generating section 260 is a mask image for identifying the tag region, for example.
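To close the loop back to a mask image, the closed curve can be rasterized and its inside filled. The sketch below does this with OpenCV's fillPoly; the mask convention (1 inside the tag region, 0 outside) matches the one assumed in the earlier sketches and is not mandated by the patent.

```python
import cv2
import numpy as np

def curve_to_mask(curve: np.ndarray, image_shape: tuple) -> np.ndarray:
    """Rasterize the closed curve into a binary annotation mask (1 inside, 0 outside).

    curve       : (M, 2) array of (x, y) points on the closed curve.
    image_shape : shape of the frame; only the first two entries (height, width) are used.
    """
    mask = np.zeros(image_shape[:2], dtype=np.uint8)
    polygon = np.round(curve).astype(np.int32).reshape(-1, 1, 2)
    cv2.fillPoly(mask, [polygon], 1)   # fill the inside of the closed curve
    return mask
```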
As described above, the tracking device 200 according to the present embodiment includes the frame setting section 210, the representative point extracting section 221, the tracking section 222, the outlier removing section 223, and the representative point updating section 224. The frame setting section 210 sets the start frame to start the tracking of the tracking target in the video including the multiple frames. The representative point extracting section 221 extracts the multiple representative points of the contour of the tracking target in the start frame. The tracking section 222 tracks the extracted multiple representative points in the frames subsequent to the start frame. The outlier removing section 223 performs the outlier determination on the multiple representative points, tracked by the tracking section 222, based on the interrelationship of the multiple representative points, and removes the outlier representative point that is the representative point determined to be the outlier. The representative point updating section 224 updates the representative points by extracting new representative points based on the multiple representative points after the process of removing the outlier representative point when any frame subsequent to the start frame meets a given condition.
According to the method in the present embodiment, the representative points are extracted from the contour, and the tracking is performed based on the representative points. Tracking the contour can suppress occurrence of variation of pixels. As a result, the region-based tracking can be appropriately performed. Furthermore, since the tracking of the pixels inside the region can be omitted, high-speed processing can be implemented. At this time, the outlier determination is performed to remove the inappropriate representative point from the tracking result, so that the tracking accuracy can be enhanced. Since the representative points are all set on the contour, the outlier representative points can be appropriately detected using the interrelationship of the representative points. Furthermore, since the representative points are updated, the tracking can be implemented with accuracy even when the outlier representative point is removed. Specifically, updating the representative points enables implementation of an identifying process of the contour of the tracking target from the representative points with high accuracy.
Furthermore, the representative point extracting section 221 may set a tag region, which is a region tagged in the start frame, as the tracking target. The representative point extracting section 221 then extracts multiple representative points of the contour of the tag region. As a result, the tracking can be appropriately performed with the tag region as the target. The region to be tagged may be an object whose position and shape are not clearly captured in the image, as will be described later referring to FIGS. 8A to 8C. Tagging such an object is not easy unless it is performed by an expert; with tracking, however, the tagging can be performed efficiently.
Furthermore, the representative point extracting section 221 may extract the multiple representative points such that adjacent representative points are spaced apart at a given interval on the contour of the tracking target. As a result, the representative points can be efficiently set.
Furthermore, the representative point extracting section 221 may extract the multiple representative points such that, on the contour of the tracking target, the density of representative points at a portion with a large curvature of the contour is higher than the density of representative points at a portion with a small curvature of the contour. The density used here is the number of representative points set per unit length of the contour. Accordingly, the representative points can be set taking the shape of the contour into consideration. As a result, the contour of the tracking target can be appropriately reproduced based on the representative points.
Furthermore, the outlier removing section 223 may determine the deviation degree of a first representative point of the multiple representative points based on the first representative point and one or more adjacent representative points adjacent in the direction along the contour, so as to determine whether the first representative point is the outlier representative point. Specifically, the outlier removing section 223 determines the deviation degree of the first representative point based on relative distance information between the first representative point and the one or more adjacent representative points. The relative distance information may be information about the distance between the first representative point and at least one adjacent representative point. Alternatively, the relative distance information may be information about the relationship between the moving distance of the first representative point between frames and the moving distance of at least one adjacent representative point between the frames. Furthermore, the outlier removing section 223 may determine the deviation degree of the first representative point based on the curvature of a curve formed by the first representative point and multiple adjacent representative points. As a result, a representative point that is highly likely to be a tracking error can be removed as the outlier representative point based on the relative relationship between a given representative point and one or more surrounding representative points.
Furthermore, the representative point updating section 224 extracts new representative points based on the remaining multiple representative points after the process of removing the outlier representative point, when the number of representative points becomes equal to or smaller than a given number threshold value due to the process of removing the outlier representative point. Removing the outlier representative point enhances processing accuracy from the viewpoint of excluding an inappropriate representative point from the processing; however, it decreases the number of representative points. When the number of representative points decreases excessively, reproducing the contour of the tracking target based on the representative points is difficult. As a result, tracking accuracy is degraded. According to the method in the present embodiment, the representative points can be updated while enough representative points remain to reproduce the contour with sufficient accuracy. This can suppress degradation of accuracy due to the outlier removal. In other words, combined with the update process of the representative points, the outlier removal process can appropriately contribute to accuracy enhancement.
Furthermore, the representative point updating section 224 may extract the new representative points based on the multiple representative points after the process of removing the outlier representative point when the reliability of the tracking result is equal to or lower than a given reliability threshold value. Alternatively, the representative point updating section 224 may extract the new representative points based on the multiple representative points after the process of removing the outlier representative point at a given time interval. As a result, since the representative points are refreshed when the tracking accuracy may be degraded, the tracking accuracy can be enhanced.
Furthermore, the representative point updating section 224 may generate a closed curve based on the multiple representative points after the process of removing the outlier representative point, and extract the new representative points from the generated closed curve. With such a closed curve, the new representative points also become points corresponding to the contour of the tracking target. As a result, the region of the tracking target can be tracked appropriately even when the representative points are updated.
Furthermore, the tracking device 200 may include the annotation data generating section 260. The annotation data generating section 260 generates, for each frame subsequent to the start frame, the annotation data in which the inside of the closed curve generated based on the tracked multiple representative points is defined as the annotation region. More specifically, the annotation data generating section 260 generates the annotation data in which the inside of the closed curve generated based on the multiple representative points after the process of removing the outlier representative point is defined as the annotation region. As a result, the annotation data generating section 260 can provide metadata capable of identifying the region of the tracking target to each frame of the video. The annotation data is used as training data for machine learning, for example, as described later.
Furthermore, the processes performed by the tracking device 200 according to the present embodiment may be implemented as a tracking method. The tracking method includes steps of acquiring the video including the multiple frames, setting the start frame to start the tracking of the tracking target, extracting the multiple representative points of the contour of the tracking target in the start frame, tracking the extracted multiple representative points in the frames subsequent to the start frame, performing the outlier determination based on the interrelationship of the tracked multiple representative points, removing the outlier representative point that is the representative point determined to be the outlier, and updating the representative points by extracting the new representative points based on the multiple representative points after the process of removing the outlier representative point when any frame subsequent to the start frame meets the given condition.
3. Endoscope System, Learning Device, and Trained Model
Output of the tracking device 200 described above may be used for machine learning. For example, in an endoscopic surgery, an operator may have difficulty recognizing an object whose position and shape are not clearly displayed in an image. For example, the operator follows the procedures using a predetermined landmark as a guide in the endoscopic surgery; however, the position and shape of the landmark may not be clearly displayed in the image. In that case, an inexperienced surgeon may not be able to recognize the indistinct landmark. The term “position and shape” used here means a position and a shape.
FIGS. 8A to 8C illustrate examples of objects whose positions and shapes are not clearly displayed in images. The objects in FIGS. 8A, 8B, and 8C are a common bile duct, a cystic duct, and a Rouviere's sulcus, respectively. FIGS. 8A to 8C are schematic diagrams and do not show the accurate shapes of an actual organ or tissue. The same applies to FIG. 10 and the subsequent figures.
FIGS. 8A and 8B illustrate examples of a state where the object is covered with an organ or tissue. In this case, even when the object is within the angle of view, the object itself is not displayed in the image, or the position and shape of the object are not clear. FIG. 8C illustrates an example of a state where the object is exposed in the image and visually recognizable, but the boundary of the object is not distinct. As illustrated in FIG. 8C, in an endoscope image of laparoscopic cholecystectomy, the Rouviere's sulcus is visually recognizable and the start portion of the sulcus is comparatively distinct. However, the sulcus gradually disappears toward its end portion, and the boundary of the Rouviere's sulcus becomes indistinct.
The common bile duct, the cystic duct, the Rouviere's sulcus, and the S4 inferior border described later are the landmarks in the laparoscopic cholecystectomy. A landmark is a guide used for following the procedures of the surgery. According to the present embodiment, these landmarks are annotated as the objects to generate the training data, and the training data is used for machine learning.
FIG. 9 is a flowchart illustrating a series of procedures from generation of the training data to detection of the object according to the present embodiment.
Steps S1 and S2 are steps for generating the training data. In the step S1, an operator tags a predetermined frame image in a surgery video. The operator is a surgeon skilled in the target surgery, for example. As will be described later, the predetermined frame image is the first frame image after a scene change in the video. Next, in the step S2, the tracking device 200 tracks the tagged region to generate the training data. Details of the tracking method are as described above. Each frame image tagged in the steps S1 and S2 in the surgery video is a training image. Tagging an image is referred to as annotation.
A step S4 is a learning step. That is, a learning device performs machine learning using the training data generated in the steps S1 and S2. A trained model obtained by this machine learning is stored in a storage section 7 of an information processing system 10 described later.
A step S5 is a step of inference by the learned artificial intelligence (AI). That is, a processing section 4 of the information processing system 10 detects an object from a detection image based on the trained model stored in the storage section 7. The processing section 4 displays information about the detected object on the detection image.
Next, a method for generating the training data is described. In order to generate the training data, the annotation indicating the position and shape of the object is attached to a training image whose angle of view includes the object whose position and shape are not clearly displayed in the image. “Not clearly displayed in the image” means a state in which the position and shape of the object cannot be identified by a method of detecting the boundary based on luminance or contrast.
As for the landmarks described above, whose positions and shapes are not clearly displayed in the image, an operator identifies the positions and shapes in the image based on tacit knowledge and provides them as the annotation data. The operator who performs the annotation is a surgeon who has plenty of tacit knowledge of the laparoscopic cholecystectomy, for example.
FIG. 10 illustrates an example of the annotation. A training image before the annotation includes a liver KZ, a gallbladder TNN, and treatment tools TL1 and TL2. The angle of view of this training image includes a common bile duct, a cystic duct, a Rouviere's sulcus, and an S4 inferior border. In FIG. 10, the solid lines in the right lobe of the liver represent the start portion (a comparatively distinct portion) of the Rouviere's sulcus, and the broken lines represent the state where the Rouviere's sulcus gradually disappears toward its end portion. The broken line near the lower edge inside the left lobe of the liver represents the region of the S4 inferior border, which is an object visually recognizable in the image but having an indistinct boundary.
The operator performing the annotation identifies the common bile duct, cystic duct, Rouviere's sulcus, and S4 inferior border from the training image and tags each of them. The training image after the annotation is attached with a tag TGA representing the common bile duct, a tag TGB representing the cystic duct, a tag TGC representing the Rouviere's sulcus, and a tag TGD representing the S4 inferior border. For example, the operator specifies a region of the common bile duct and so on using a pointing device such as a mouse or a touch panel. The learning device tags the region specified by the operator in the training image.
FIG. 11 illustrates an example of the training data generated by the annotation. As illustrated in FIG. 11, flags are set to the pixels in the tagged regions. Map data including flagged pixels is hereinafter referred to as flag data (annotation data). The flag data is generated for each of the tags TGA to TGD. That is, the training data includes the training image and four layers of flag data generated by tagging the training image.
FIG. 12 is a diagram illustrating automatic tagging by tracking. FIG. 12 illustrates frame images of a video captured by an endoscopic scope 2. Each of the frame images is a training image. Predetermined frame images F1 and Fx+1 are selected from the video, where x is an integer of one or more. The predetermined frame images F1 and Fx+1 may be selected by the operator, or by the learning device through scene detection by image processing, for example. The operator tags the selected predetermined frame images F1 and Fx+1.
The predetermined frame images F1 and Fx+1 are frame images when a surgical procedure changes, when brightness of the video changes, when deviation between frames largely changes, or when an object to be imaged changes, for example.
Frame images subsequent to the tagged predetermined frame images are tagged by tracking. Assume that the operator tags the frame image F1 with a tag TGE1. Assuming that a scene change occurs between a frame image Fx and the frame image Fx+1, frame images F2 to Fx are targets to be tagged by tracking. For example, between the frame image F1 and the frame image F2, the tag TGE1 is tracked to acquire a tag TGE2 for the frame image F2. Specifically, as described above, the tag TGE2 is acquired by the respective processes such as extraction of the representative points, tracking of the representative points, outlier removal, generation of the closed curve, and generation of the tag region. Similarly, tags TGE3 to TGEx are generated for frame images F3 to Fx.
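Tying the pieces together, the sketch below shows one way the tag of a manually annotated first frame could be propagated through the frames of a single scene. The helper functions are the illustrative sketches introduced in the earlier sections (not functions named in the patent), and the thresholds passed to them are assumptions.

```python
import numpy as np

# Assumes the earlier sketches are available in the same module:
# sample_representative_points, track_point, remove_outliers,
# needs_update, closed_curve, curve_to_mask.

def propagate_tag(frames, initial_mask, n_points=12):
    """Yield one annotation mask per frame, starting from a manually tagged first frame."""
    points = sample_representative_points(initial_mask, n=n_points).astype(float)
    frames_since_update = 0
    yield initial_mask
    for prev_frame, next_frame in zip(frames[:-1], frames[1:]):
        tracked, scores = [], []
        for p in points:                                   # track each representative point
            q, score = track_point(prev_frame, next_frame, p)
            tracked.append(q)
            scores.append(score)
        tracked = np.array(tracked, dtype=float)
        keep = remove_outliers(points, tracked)            # drop outlier representative points
        points = tracked[keep]
        frames_since_update += 1
        curve = closed_curve(points)                       # closed curve through remaining points
        if needs_update(len(points), min(scores), frames_since_update,
                        min_points=n_points - 3):
            # Re-extract evenly spaced representative points from the regenerated curve.
            idx = np.linspace(0, len(curve), n_points, endpoint=False).astype(int)
            points = curve[idx]
            frames_since_update = 0
        yield curve_to_mask(curve, next_frame.shape)       # tag region for this frame
```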
Similarly, assume that the operator tags the frame image Fx+1 after the scene change with a tag TGF1. Similarly as above, tags TGF2 to TGFy are attached to frame images Fx+2 to Fx+y by tracking. y is an integer of one or more.
FIG. 13 is a configuration example of a learning device 50. The learning device 50 includes the tracking device 200, a processing section 51, a storage section 52, an operation section 53, and a display section 54. For example, the learning device 50 is an information processing device such as a personal computer (PC). The processing section 51 is a processor such as a CPU. The processing section 51 performs the machine learning of a training model to generate a trained model. The storage section 52 is a storage device such as a semiconductor memory or a hard disk drive. The operation section 53 includes various operation input devices such as a mouse, a touch panel, or a keyboard. The display section 54 is a display device such as a liquid crystal display. Although FIG. 13 illustrates an example in which the learning device 50 includes the tracking device 200, the learning device 50 and the tracking device 200 may be separate devices.
FIG. 14 is a flowchart illustrating the learning procedures. The annotation data generated by the tracking device 200 is associated with the training image and is stored in the storage section 52 as the training data.
The machine learning according to the present embodiment may use a neural network. FIG. 15 is a schematic diagram illustrating the neural network. The neural network includes an input layer that accepts input data, an intermediate layer that calculates based on output from the input layer, and an output layer that outputs data based on output from the intermediate layer. FIG. 15 illustrates an example of a network including two intermediate layers; however, the number of intermediate layers may be one, or three or more. In addition, the number of nodes (neurons) included in each layer is not limited to the number in the example illustrated in FIG. 15, and can be modified in various manners. In view of accuracy, it is preferable to perform deep-layered learning (deep learning) using a neural network including multiple layers in the present embodiment. The multiple layers used here mean four layers or more in a narrow sense.
As illustrated in FIG. 15, nodes included in a given layer are connected to nodes in the adjacent layer. Each connection is set with a weight. Each node multiplies the outputs from the previous nodes by the weights and obtains the total of the multiplication results. The node further adds a bias to the total, and applies an activation function to the addition result to obtain its output. This process is performed sequentially from the input layer to the output layer to obtain the output of the neural network. Learning by the neural network is a process of determining appropriate weights (biases included). There are various known learning methods such as the error backpropagation method, and these methods are widely applicable to the present embodiment.
More specifically, the neural network according to the present embodiment is a convolutional neural network (CNN) suitable for image recognition processing. The CNN includes a convolution layer that performs a convolution operation and a pooling layer. The convolution layer is a layer that performs filter processing. The pooling layer is a layer that performs a pooling operation for reducing sizes in a vertical direction and a lateral direction. An output layer of the CNN is a widely known softmax layer, for example. Specific configurations of the CNN may be implemented in various modified manners as to a number of convolution layers, a number of pooling layers, a mode of the output layer, or the like. The weight of the convolution layer of the CNN is a parameter of a filter. That is, learning by the CNN includes learning of a filter used for the convolution operation. The neural network including the CNN is a widely known method and any further detailed description is omitted. The machine learning according to the present embodiment is not limited to the method using the neural network. For example, as for the method of the machine learning according to the present embodiment, machine learning using various widely known methods, such as a support vector machine (SVM), is applicable. In addition, machine learning using methods that are improvements of these methods is also applicable.
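For readers who want a concrete picture, the following is a minimal PyTorch sketch of a CNN of the kind described here: convolution layers, pooling layers that reduce the vertical and lateral sizes, and an output with one flag map per landmark. The layer sizes, the encoder-decoder arrangement, and the class count of four are illustrative assumptions; the patent does not specify a particular architecture.

```python
import torch
import torch.nn as nn

class LandmarkSegmenter(nn.Module):
    """A deliberately small encoder-decoder CNN mapping an RGB frame to four flag maps.

    The real network is not specified beyond "a CNN with convolution and pooling
    layers", so this is only a sketch; input height and width are assumed to be
    divisible by 4 so the output matches the input resolution.
    """

    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # pooling halves height and width
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, num_classes, 2, stride=2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output shape: (batch, num_classes, H, W); apply a sigmoid for per-pixel flags.
        return self.decoder(self.encoder(x))
```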
In a step S11, the processing section 51 reads out the training data from the storage section 52. For example, one training image and the corresponding flag data are read out for one inference. However, a plurality of training images and the corresponding flag data may be used for one inference.
In a step S12, the processing section 51 infers the position and shape of the object, and outputs a result. That is, the processing section 51 inputs the training image to the neural network. The processing section 51 performs an inference process by the neural network to output flag data indicating the position and shape of the object.
In a step S13, the processing section 51 compares the inferred position and shape with the position and shape indicated by the annotation, and calculates an error based on the comparison result. That is, the processing section 51 calculates the error between the flag data output from the neural network and the flag data of the training data.
In a step S14, the processing section 51 adjusts the model parameters of the training model to reduce the error. That is, the processing section 51 adjusts the weight coefficients or the like between the nodes of the neural network based on the error obtained in the step S13.
In a step S15, the processing section 51 determines whether the parameter adjustment has been completed a predetermined number of times. When the parameter adjustment has not been completed the predetermined number of times, the processing section 51 performs the steps S11 to S15 again. When the parameter adjustment has been completed the predetermined number of times, the processing section 51 terminates the learning process as described in a step S16. Alternatively, the processing section 51 determines whether the error obtained in the step S13 has become equal to or lower than a predetermined value. When the error is not equal to or lower than the predetermined value, the processing section 51 performs the steps S11 to S15 again. When the error becomes equal to or lower than the predetermined value, the processing section 51 terminates the learning process as described in the step S16. As a result of the process described above, the trained model is output as the learning result.
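The steps S11 to S16 correspond to an ordinary supervised training loop. The sketch below shows how they might look with the PyTorch model from the previous sketch; the optimizer, loss function, batch size, and epoch count are all assumptions, since the patent only requires that the error between the inferred and annotated flag data be reduced iteratively.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train(model: nn.Module, dataset, epochs: int = 10, lr: float = 1e-3):
    """Training loop following steps S11-S16.

    `dataset` is assumed to yield (image, flag_data) pairs as float tensors:
    image of shape (3, H, W) and flag_data of shape (4, H, W).
    """
    loader = DataLoader(dataset, batch_size=4, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    # Per-pixel binary cross entropy between inferred and annotated flag maps.
    criterion = nn.BCEWithLogitsLoss()

    for epoch in range(epochs):                    # S15: repeat a predetermined number of times
        for image, flags in loader:                # S11: read out training data
            prediction = model(image)              # S12: infer position and shape
            loss = criterion(prediction, flags)    # S13: error against the annotation
            optimizer.zero_grad()
            loss.backward()                        # error backpropagation
            optimizer.step()                       # S14: adjust the model parameters
    return model                                   # S16: output the trained model
```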
FIG. 16 is a configuration example of the information processing system 10 and an endoscope system 100 including the information processing system 10. The information processing system 10 is an inference device that performs the inference process using the trained model. The endoscope system 100 includes a processor unit 1, the endoscopic scope 2, and a display section 3. The endoscope system 100 may further include an operation section 9.
The endoscopic scope 2 includes an imaging device on its distal end portion that is inserted into an abdominal cavity. The imaging device captures an image in the abdominal cavity, and captured image data is transmitted from the endoscopic scope 2 to the processor unit 1.
The processor unit 1 is a device that performs various processes in the endoscope system 100. For example, the processor unit 1 performs control of the endoscope system 100 and image processing. The processor unit 1 includes a captured image data receiving section 8 that receives the captured image data from the endoscopic scope 2, and the information processing system 10 that detects an object from the captured image data based on the trained model.
The captured image data receiving section 8 is a connector to which a cable of the endoscopic scope 2 is connected, or an interface circuit that receives the captured image data, for example.
The information processing system 10 includes the storage section 7 that stores the trained model, and the processing section 4 that detects the object from the image based on the trained model stored in the storage section 7.
The storage section 7 is a storage device such as a semiconductor memory, a hard disk drive, or an optical disk drive. The storage section 7 stores the trained model in advance. Alternatively, a trained model may be input to the information processing system 10 via a network from an external device such as a server so as to be stored in the storage section 7.
The processing section 4 includes a detection section 5 that detects the object from the image by the inference based on the trained model, and an output section 6 that superimposes information about the object on the image based on a detection result and causes the display section 3 to display the result. Various types of hardware can perform the inference based on the trained model. For example, the detection section 5 is a general-purpose processor such as a CPU. In this case, the storage section 7 stores, as the trained model, a program including an inference algorithm and a parameter used for the inference algorithm. Alternatively, the detection section 5 may be a dedicated processor in which the inference algorithm is implemented as hardware. In this case, the storage section 7 stores, as the trained model, the parameter used for the inference algorithm. The inference algorithm may use the neural network; in this case, the weight coefficients of the connections between the nodes in the neural network are the parameters.
The detection section 5 inputs the detection image captured by the endoscopic scope 2 to the trained model. The detection section 5 performs a detection process based on the trained model to detect the position and shape of the object in the detection image. That is, the detection result is output as detected flag data. The detected flag data is a flag map including pixels set with flags corresponding to the detected position and shape of the object. For example, similarly to the training data described referring to FIG. 11, four layers of the detected flag data corresponding to the respective objects are output.
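A sketch of how such detected flag data might be obtained from the model's output is shown below; the function name detect_flags and the 0.5 threshold are assumptions for illustration.

```python
# Sketch of the detection step: the trained model yields per-pixel class
# probabilities, which are thresholded into four binary flag layers, one per
# object. The threshold value is an illustrative assumption.
import torch

@torch.no_grad()
def detect_flags(model, image: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """image: (1, 3, H, W) detection image -> (num_classes, H, W) boolean flag maps."""
    probs = model(image)          # (1, num_classes, H, W) probabilities
    return probs[0] > threshold   # flags set where the class probability is high

# usage with the FlagMapCNN sketch above:
# flags = detect_flags(model, torch.randn(1, 3, 256, 256))
```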
The display section 3 is a monitor that displays the image output from the output section 6, and is a display device such as a liquid crystal display or an organic electroluminescence (EL) display.
The operation section 9 is a device used by the operator for operating the endoscope system 100. For example, the operation section 9 includes a button, a dial, a foot switch, or a touch panel. As will be described later, the output section 6 may change a display mode of the object based on input information from the operation section 9.
In the above description, the information processing system 10 is included in the processor unit 1; however, the information processing system 10 may be partially or entirely disposed outside the processor unit 1. For example, the storage section 7 and the detection section 5 may be implemented by an external processing device such as a PC or a server. In this case, the captured image data receiving section 8 transmits the captured image data to the external processing device via a network or the like. The external processing device transmits information about the detected object to the output section 6 via the network or the like. The output section 6 superimposes the received information on the image and causes the display section 3 to display the result.
The method according to the present embodiment is applicable to a trained model that causes a computer to accept the detection image as input, perform the process of detecting the position of the given object from the detection image, and output the detection result. The trained model has been trained by machine learning based on training data in which the annotation data generated by the tracking method described above is associated with frame images included in the video. The frame images associated with the annotation data may include all frames included in the video. However, the method according to the present embodiment is not limited to this, and the frame images associated with the annotation data may include only some frames in the video. In this case, the machine learning is performed using the frame images associated with the annotation data.
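As a rough illustration of how training data of this form might be assembled, the sketch below pairs frame images with flag masks produced by the tracking-based annotation. The directory layout, file naming, and helper name load_training_pairs are all assumptions.

```python
# Sketch: pairing video frame images with tracking-generated annotation masks
# to form (training image, flag data) samples. Paths, file naming, and the
# one-mask-per-class convention are illustrative assumptions.
from pathlib import Path
import numpy as np
import cv2  # OpenCV, used here only to read images

def load_training_pairs(frame_dir: str, mask_dir: str, num_classes: int = 4):
    pairs = []
    for frame_path in sorted(Path(frame_dir).glob("*.png")):
        image = cv2.imread(str(frame_path))  # H x W x 3 frame image
        flags = []
        for c in range(num_classes):
            mask_path = Path(mask_dir) / f"{frame_path.stem}_class{c}.png"
            if not mask_path.exists():
                break  # only some frames may carry annotation data
            flags.append(cv2.imread(str(mask_path), cv2.IMREAD_GRAYSCALE) > 0)
        if len(flags) == num_classes:
            pairs.append((image, np.stack(flags).astype(np.float32)))  # (H,W,3), (4,H,W)
    return pairs
```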
According to the tracking method in the present embodiment, the tag region attached to the object in the video is tracked accurately, so that highly accurate annotation data can be generated. As a result, performing the machine learning using this annotation data as the training data enables generation of a trained model that can implement a highly accurate detection process.
Furthermore, the method according to the present embodiment is applicable to the endoscope system 100 including the storage section 7 that stores the trained model described above, the endoscopic scope 2 that captures the detection image, and the processing section 4 that performs the process of detecting the position of the given object from the detection image based on the trained model.
As a result, a desired object can be accurately detected from the detection image. Specifically, since the machine learning is performed using training data including annotation data attached to an object whose position and shape are not clearly displayed in the image, detection of the object based on the tacit knowledge of a skilled surgeon or the like can also be implemented. At the same time, since the training data can be generated by tracking, the annotation workload on the surgeon or the like can be reduced.
Although the embodiments to which the present disclosure is applied and the modifications thereof have been described in detail above, the present disclosure is not limited to the embodiments and the modifications thereof, and various modifications and variations in components may be made in implementation without departing from the spirit and scope of the present disclosure. The plurality of elements disclosed in the embodiments and the modifications described above may be combined as appropriate to implement the present disclosure in various ways. For example, some of all the elements described in the embodiments and the modifications may be deleted. Furthermore, elements in different embodiments and modifications may be combined as appropriate. Thus, various modifications and applications can be made without departing from the spirit and scope of the present disclosure. Any term cited with a different term having a broader meaning or the same meaning at least once in the specification and the drawings can be replaced by the different term in any place in the specification and the drawings.

Claims (10)

What is claimed is:
1. A tracking device comprising a processor including hardware,
the processor being configured to:
set a start frame to start tracking of a tracking target in a video including multiple frames;
extract multiple representative points of a contour of the tracking target in the start frame;
track the extracted multiple representative points in frames subsequent to the start frame;
perform outlier determination on the tracked multiple representative points based on an interrelationship of the multiple representative points;
perform a process of removing an outlier representative point that is a representative point determined to be an outlier;
update the representative points by extracting new representative points based on multiple representative points after the process of removing the outlier representative point, in a case where any frame subsequent to the start frame meets a given condition; and
determine a deviation degree of a first representative point of the multiple representative points based on any one of relative distance information between the first representative point and one or more of the representative points adjacent in a direction along the contour, and a curvature of a curve formed by the first representative point and some of the representative points adjacent in the direction along the contour, so as to determine whether the first representative point is the outlier representative point.
2. The tracking device as defined in claim 1, wherein
the processor extracts multiple representative points of a contour of a tag region tagged, as the tracking target, to the start frame.
3. The tracking device as defined in claim 1, wherein
the processor extracts the multiple representative points from the contour of the tracking target such that adjacent representative points are spaced apart at a given interval.
4. The tracking device as defined in claim 1, wherein
the processor extracts the multiple representative points such that, on the contour of the tracking target, a density of representative points at a portion with a large curvature of the contour is higher than a density of representative points at a portion with a small curvature of the contour.
5. The tracking device as defined in claim 1, wherein
the processor extracts the new representative points based on the multiple representative points after the process of removing the outlier representative point, when a number of representative points becomes equal to or smaller than a given number threshold value due to the process of removing the outlier representative point, and generates a closed curve based on the multiple representative points after the process of removing the outlier representative point, and extracts the new representative points from the generated closed curve.
6. The tracking device as defined in claim 1, wherein
the processor extracts the new representative points based on the multiple representative points after the process of removing the outlier representative point, when reliability of a tracking result becomes equal to or lower than a given reliability threshold value.
7. The tracking device as defined in claim 1, wherein
the processor extracts the new representative points based on the multiple representative points after the process of removing the outlier representative point at a given time interval.
8. The tracking device as defined in claim 1, wherein
the processor generates annotation data in which an inside of a closed curve generated based on the tracked multiple representative points is defined as an annotation region for each frame subsequent to the start frame.
9. An endoscope system comprising:
a memory that stores a trained model;
an endoscopic scope that captures a detection image; and
a processor that accepts the detection image as input, and performs a process of detecting a position of a given object from the detection image by using the trained model,
the trained model having been trained by machine learning based on training data in which annotation data is associated with a frame image in a video,
the annotation data being generated by:
acquiring the video including multiple frames;
setting a start frame to start tracking of a tracking target;
extracting multiple representative points of a contour of the tracking target in the start frame;
tracking the extracted multiple representative points in frames subsequent to the start frame;
performing outlier determination on the tracked multiple representative points based on an interrelationship of the multiple representative points;
performing a process of removing an outlier representative point that is a representative point determined to be an outlier;
updating the representative points by extracting new representative points based on multiple representative points after the process of removing the outlier representative point, in a case where any frame subsequent to the start frame meets a given condition;
determining a deviation degree of a first representative point of the multiple representative points based on any one of relative distance information between the first representative point and one or more of the representative points adjacent in a direction along the contour, and a curvature of a curve formed by the first representative point and some of the representative points adjacent in the direction along the contour, so as to determine whether the first representative point is the outlier representative point; and
generating the annotation data in which an inside of a closed curve generated based on the tracked multiple representative points is defined as an annotation region for each frame subsequent to the start frame.
10. A tracking method comprising:
acquiring a video including multiple frames;
setting a start frame to start tracking of a tracking target;
extracting multiple representative points of a contour of the tracking target in the start frame;
tracking the extracted multiple representative points in frames subsequent to the start frame;
performing outlier determination on the tracked multiple representative points based on an interrelationship of the multiple representative points;
performing a process of removing an outlier representative point that is a representative point determined to be an outlier;
updating the representative points by extracting new representative points based on multiple representative points after the process of removing the outlier representative point, in a case where any frame subsequent to the start frame meets a given condition; and
determining a deviation degree of a first representative point of the multiple representative points based on any one of relative distance information between the first representative point and one or more of the representative points adjacent in a direction along the contour, and a curvature of a curve formed by the first representative point and some of the representative points adjacent in the direction along the contour, so as to determine whether the first representative point is the outlier representative point.
US17/179,903: Tracking device, endoscope system, and tracking method. Priority date: 2019-03-28. Filing date: 2021-02-19. Status: Active, adjusted expiration 2040-07-26. Granted as US11900615B2 (en).

Applications Claiming Priority (1)

PCT/JP2019/013606 (WO2020194663A1 (en)). Priority date: 2019-03-28. Filing date: 2019-03-28. Title: Tracking device, pretrained model, endoscope system, and tracking method.

Related Parent Applications (1)

PCT/JP2019/013606 (WO2020194663A1 (en)), of which the present application is a continuation. Priority date: 2019-03-28. Filing date: 2019-03-28. Title: Tracking device, pretrained model, endoscope system, and tracking method.

Publications (2)

US20210183076A1 (en): published 2021-06-17
US11900615B2 (en): published 2024-02-13

Family

ID=72610337

Family Applications (1)

US17/179,903 (US11900615B2 (en)): Tracking device, endoscope system, and tracking method. Status: Active, adjusted expiration 2040-07-26.

Country Status (3)

US: US11900615B2 (en)
JP: JP7105369B2 (en)
WO: WO2020194663A1 (en)



Also Published As

WO2020194663A1 (en): 2020-10-01
JP7105369B2 (en): 2022-07-22
JPWO2020194663A1 (en): 2021-10-14
US20210183076A1 (en): 2021-06-17


Legal Events

AS (Assignment): Owner name: OLYMPUS CORPORATION, JAPAN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: ISHIKAKE, MAKOTO; REEL/FRAME: 055335/0196. Effective date: 20210208.
FEPP (Fee payment procedure): ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY.
STPP (Patent application and granting procedure in general): APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED.
STPP (Patent application and granting procedure in general): DOCKETED NEW CASE - READY FOR EXAMINATION.
STPP (Patent application and granting procedure in general): NON FINAL ACTION MAILED.
STPP (Patent application and granting procedure in general): RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER.
STPP (Patent application and granting procedure in general): NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS.
STPP (Patent application and granting procedure in general): PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED.
STPP (Patent application and granting procedure in general): PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED.
STCF (Patent grant): PATENTED CASE.


[8]ページ先頭

©2009-2025 Movatter.jp