
What is Image Segmentation based on OpenCV

2025-04-21 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

What is image segmentation based on OpenCV? This article works through the question with a detailed analysis and a worked solution, in the hope of giving readers a simpler and more practical approach.

Data scientists and medical researchers can use this approach as a template for more complex image data sets (such as astronomical data) and even for some non-image data sets. An image is represented in the computer as a matrix of values, and here we have an expert-labeled data set to serve as ground truth. Throughout the process we will use Python packages for image processing and scientific computing, including OpenCV, scikit-image, and scikit-learn. We will also use numpy to ensure that values are stored consistently in memory.

Main content

Denoising

To remove noise, we use a simple median filter to eliminate outliers, although other noise-removal or artifact-removal methods can be used instead. The artifacts depend on the acquisition system (the microscopy technique) and may require complex algorithms to recover the missing data. Artifacts usually fall into two categories:

1. Blurry or out-of-focus areas

2. Imbalanced foreground and background (corrected by adjusting the histogram)

Segmentation

In this article, we smooth the image with a median filter, segment it with Otsu's method, and then validate the result. The same validation approach works for any segmentation algorithm as long as its output is binary. Such algorithms include, but are not limited to, various circular thresholding methods that consider different color spaces.

Some examples include:

1. Li thresholding

2. Adaptive thresholding that depends on local intensity

3. UNet and other deep learning algorithms commonly used in biomedical image segmentation

4. Deep learning methods for semantic segmentation of images
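As a rough illustration of the first two options, the sketch below applies Li thresholding and local (adaptive) thresholding from scikit-image to a synthetic image. The test image, the block size of 35, and the variable names are assumptions made for this example only; they are not taken from the article's data.

```python
import numpy as np
from skimage.filters import threshold_li, threshold_local

# Synthetic 64x64 test image (an assumption for illustration): a bright square
# on a dark background with mild Gaussian noise.
rng = np.random.default_rng(0)
image = rng.normal(40, 5, size=(64, 64))
image[16:48, 16:48] += 120
image = np.clip(image, 0, 255).astype(np.uint8)

# Li's method returns a single global threshold for the whole image.
li_mask = image > threshold_li(image)

# Local thresholding returns a per-pixel threshold image computed from each
# pixel's neighbourhood; block_size must be odd.
local_mask = image > threshold_local(image, block_size=35, offset=0)

print(li_mask.sum(), local_mask.sum())
```

Both results are boolean masks, so the binary validation described in this article should apply to either of them unchanged.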

Validation

We start with a data set that has already been manually segmented. To quantify the performance of a segmentation algorithm, we compare the ground truth binary segmentation with the predicted one, reporting accuracy alongside more informative metrics. Accuracy can be unusually high even when the classifier finds few true positives (TP) and misses foreground pixels as false negatives (FN). In such cases, the F1 score and MCC are better quantitative metrics for binary classification. We describe the advantages and disadvantages of these metrics in detail later.
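The effect is easy to reproduce on a toy example. The sketch below uses a made-up set of 1000 pixel labels, only 10 of them foreground, and a prediction that finds just one of them; the numbers are illustrative assumptions, not the article's data.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

# 1000 pixels, only 10 of which are foreground (class 1).
groundtruth = np.zeros(1000, dtype=int)
groundtruth[:10] = 1

# The prediction finds just one of the ten foreground pixels.
predicted = np.zeros(1000, dtype=int)
predicted[0] = 1

accuracy = accuracy_score(groundtruth, predicted)
f1 = f1_score(groundtruth, predicted)
mcc = matthews_corrcoef(groundtruth, predicted)

# Accuracy is dominated by the 990 correctly labelled background pixels,
# while F1 and MCC stay low because nine foreground pixels were missed.
print(accuracy, f1, mcc)
```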

For qualitative validation, we overlay the confusion matrix results, that is, the true positive, true negative, false positive, and false negative pixels, on the grayscale image at their exact locations. This validation can also be applied to binary segmentation results for color images, although the data used in this article is a grayscale image. Finally, we walk through the whole implementation. Now, let's look at the data and the tools used to process it.

Loading and visualizing data

We will use the following modules to load, visualize, and transform the data. They are useful for image processing and computer vision algorithms, and for both simple and complex array mathematics. Some are installed under a package name that differs from the import name: cv2 is installed as opencv-python, skimage as scikit-image, and sklearn as scikit-learn.

If you run into problems with the matplotlib backend while running the sample code, disable interactive mode by removing the plt.ion() call, and instead call plt.show() at the end of each section by uncommenting the suggested calls in the sample code. Either "Agg" or "TkAgg" will serve as the backend for image display. In this article, the plots are displayed inline.

Code import

import cv2
import matplotlib.pyplot as plt
import numpy as np
import scipy.misc
import scipy.ndimage
import skimage.filters
import sklearn.metrics

# Turn on interactive mode. Turn off with plt.ioff()
plt.ion()

In this section, we load and visualize the data. The data is an image of mouse brain tissue stained with India ink, acquired by Knife-Edge Scanning Microscopy (KESM). This 512 x 512 image is a subset of the full data set, referred to as a block. The full data set is 17480 x 8026 pixels with a depth of 799 slices and is 10 GB in size. We will therefore write the algorithm to process blocks of size 512 x 512, each of which is only 150 KB.

Each block can be mapped to run with multiprocessing or multithreading (that is, on distributed infrastructure), and the results can then be stitched together to obtain the complete segmented image. We do not cover specific stitching methods here. In short, stitching involves indexing the full matrix and reassembling the blocks according to that index. Numerical results can be combined with map-reduce, for example by summing the F1 scores of all blocks and then averaging them. It is enough to append each block's result to a list and then compute a statistical summary.
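The indexing idea can be sketched in a few lines of numpy. This is only a minimal single-process illustration under stated assumptions: the helper names split_into_blocks and stitch_blocks are hypothetical, the input is a synthetic matrix, and a fixed threshold stands in for the real per-block segmentation.

```python
import numpy as np

def split_into_blocks(image, block_size):
    """Hypothetical helper: return {(row, col): block} keyed by top-left index."""
    blocks = {}
    for r in range(0, image.shape[0], block_size):
        for c in range(0, image.shape[1], block_size):
            blocks[(r, c)] = image[r:r + block_size, c:c + block_size]
    return blocks

def stitch_blocks(blocks, shape, dtype):
    """Hypothetical helper: reassemble blocks into a full matrix by their index."""
    out = np.zeros(shape, dtype=dtype)
    for (r, c), block in blocks.items():
        out[r:r + block.shape[0], c:c + block.shape[1]] = block
    return out

# Synthetic 1024 x 1536 "slice" standing in for the real data.
image = (np.arange(1024 * 1536).reshape(1024, 1536) % 256).astype(np.uint8)

blocks = split_into_blocks(image, 512)

# "Map": process every block independently (a fixed threshold as a stand-in).
processed = {index: np.uint8(block > 127) * 255 for index, block in blocks.items()}

# "Reduce": collect one number per block and summarize it.
foreground_fractions = [float(np.mean(block == 255)) for block in processed.values()]
mean_fraction = sum(foreground_fractions) / len(foreground_fractions)

stitched = stitch_blocks(processed, image.shape, np.uint8)
```

Because each block carries its own index, the map step could be handed to a process pool without changing the stitching step.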

The dark oval structures on the left are blood vessels, and the rest is tissue. The two classes in this data set are therefore:

Foreground (vessels): labeled 255

Background (tissue): labeled 0

The last image, on the right, is the ground truth. It was obtained by a pathologist who manually traced the vessels by drawing their contours and filling them in. Examples like these, provided by experts, can be used to train deep learning networks and to validate them at a larger scale. We can also augment the data by giving these examples to other platforms and having them manually trace a different, larger set of images for validation and training.

# scipy.misc.imread was removed in SciPy 1.2, so the images are read with OpenCV here.
grayscale = cv2.imread('grayscale.png', cv2.IMREAD_GRAYSCALE)
grayscale = 255 - grayscale  # invert so that the vessels become bright
groundtruth = cv2.imread('groundtruth.png', cv2.IMREAD_GRAYSCALE)

plt.subplot(1, 3, 1)
plt.imshow(255 - grayscale, cmap='gray')
plt.title('grayscale')
plt.axis('off')

plt.subplot(1, 3, 2)
plt.imshow(grayscale, cmap='gray')
plt.title('inverted grayscale')
plt.axis('off')

plt.subplot(1, 3, 3)
plt.imshow(groundtruth, cmap='gray')
plt.title('groundtruth binary')
plt.axis('off')

Preprocessing

Before segmenting the data, we should inspect the data set to determine whether the imaging system has introduced any artifacts. In this example, we discuss only one image. Looking at it, we see no obvious artifacts that would interfere with segmentation. Even so, you can use a median filter to remove outlier noise and smooth the image. The median filter replaces each pixel with the median value within a kernel of a given size, which suppresses outliers.

Median filter with kernel size 3

median_filtered = scipy.ndimage.median_filter(grayscale, size=3)
plt.imshow(median_filtered, cmap='gray')
plt.axis('off')
plt.title("median filtered image")
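To see how the median filter treats an outlier, consider a tiny made-up patch (an assumption for illustration) in which one pixel is far brighter than its neighbours:

```python
import numpy as np
import scipy.ndimage

# A flat 5x5 patch with a single bright outlier in the centre.
patch = np.full((5, 5), 10, dtype=np.uint8)
patch[2, 2] = 255

# Every 3x3 window contains at most one outlier among nine values,
# so the median of each window is the background value.
filtered = scipy.ndimage.median_filter(patch, size=3)
print(filtered)
```

A mean filter of the same size would instead smear the outlier over its neighbours, which is why the median is the usual choice for this kind of speckle noise.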

To determine which thresholding technique is most suitable for segmentation, we can start by checking whether there is a distinct pixel intensity that separates the two classes. If there is, the image can be binarized using the intensity found by visual inspection. In the image we use, many pixels have an intensity below 50, and these pixels correspond to the background class in the inverted grayscale image.

Although the class distribution is not bimodal, there is still a distinction between foreground and background: the lower-intensity pixels reach a peak and then fall into a valley. The exact value of that valley can be obtained with various thresholding techniques. The segmentation section examines one such method in detail.

Visualize the histogram of pixel intensities

counts, vals = np.histogram(grayscale, bins=range(2 ** 8))
plt.plot(range(0, (2 ** 8) - 1), counts)
plt.title("Grayscale image histogram")
plt.xlabel("Pixel intensity")
plt.ylabel("Count")

Segmentation

After removing the noise, we can use the skimage filters module to try all of its thresholding methods and see which ones perform well. Sometimes the histogram of pixel intensities in an image is not bimodal. In that case another method may do better, such as an adaptive thresholding method that thresholds based on local pixel intensities within a kernel. The try_all_threshold() function in skimage makes it easy to compare the results of the different methods.

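Try all thresholds

# try_all_threshold is exposed in the public skimage.filters namespace;
# it returns the matplotlib figure and axes of the comparison plot.
result = skimage.filters.try_all_threshold(median_filtered)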

The simplest approach is to apply a manually chosen threshold to the image. An automatic thresholding method, however, computes the value more reliably than the human eye and is easy to reproduce. For the image in this example, the Otsu, Yen, and Triangle methods appear to work well.

In this article, we use Otsu thresholding to segment the image into a binary image. Otsu's method picks the threshold that maximizes the between-class variance (the variance between foreground and background), which is equivalent to minimizing the within-class variance (the variance within the foreground and within the background). It works well when the histogram is bimodal (has two distinct peaks) or when a single threshold separates the classes well.
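The criterion can be written out directly. The following is a plain numpy sketch of the between-class variance search, not the scikit-image implementation; the function name otsu_threshold and the synthetic two-cluster data are assumptions for illustration, and the input is assumed to hold integer values in 0..255.

```python
import numpy as np

def otsu_threshold(image):
    """Return the level t that maximizes the between-class variance,
    where class 0 is pixels <= t and class 1 is pixels > t."""
    counts = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256, dtype=np.float64)
    total = counts.sum()
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = counts[:t + 1].sum()          # number of pixels in class 0
        w1 = total - w0                    # number of pixels in class 1
        if w0 == 0 or w1 == 0:
            continue                       # one class is empty, skip
        mu0 = (counts[:t + 1] * levels[:t + 1]).sum() / w0
        mu1 = (counts[t + 1:] * levels[t + 1:]).sum() / w1
        between_var = (w0 / total) * (w1 / total) * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Two synthetic intensity clusters standing in for background and foreground.
rng = np.random.default_rng(0)
background = rng.normal(60, 10, size=5000)
foreground = rng.normal(190, 10, size=5000)
image = np.clip(np.concatenate([background, foreground]), 0, 255).astype(np.uint8)

threshold = otsu_threshold(image)
predicted = image > threshold
print(threshold, predicted.mean())
```

On well-separated clusters like these, the chosen threshold falls in the gap between the two peaks, which is the behaviour the bimodal-histogram condition describes.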

Otsu thresholding and visualization

threshold = skimage.filters.threshold_otsu(median_filtered)
print("Threshold value is {}".format(threshold))
predicted = np.uint8(median_filtered > threshold) * 255
plt.imshow(predicted, cmap='gray')
plt.axis('off')
plt.title("otsu predicted binary image")

If the simple techniques above are not sufficient for binary segmentation of the image, UNet, ResNet with FCN, or various other supervised deep learning techniques can be used instead. To remove small objects produced by segmented foreground noise, you can also try skimage.morphology.remove_small_objects().
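The idea behind small-object removal can be sketched with connected-component labeling. The block below deliberately uses scipy.ndimage.label rather than the scikit-image function, so it is a stand-in for the same idea and not that function's implementation; the helper name drop_small_components and the size cutoff of 10 pixels are assumptions for illustration.

```python
import numpy as np
import scipy.ndimage

def drop_small_components(binary, min_size):
    """Hypothetical helper: keep only connected foreground components
    that contain at least min_size pixels."""
    labels, num_components = scipy.ndimage.label(binary)
    sizes = np.bincount(labels.ravel())   # sizes[k] = pixel count of label k
    keep = sizes >= min_size
    keep[0] = False                        # label 0 is the background
    return keep[labels]

# A 10x10 "vessel" plus a one-pixel speck of segmentation noise.
mask = np.zeros((20, 20), dtype=bool)
mask[2:12, 2:12] = True
mask[15, 15] = True

cleaned = drop_small_components(mask, min_size=10)
print(mask.sum(), cleaned.sum())
```

The result is a boolean mask; multiplying np.uint8(cleaned) by 255 would bring it back to the 0/255 convention used for the predicted image.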

Validation

In general, we need someone with expertise in the image type to manually generate the ground truth, so that we can validate accuracy and other metrics and see how well the image has been segmented.

Confusion matrix

We use sklearn.metrics.confusion_matrix() to get the confusion matrix elements. Given input lists of binary elements, the scikit-learn confusion matrix function returns the four elements of the confusion matrix. In the extreme cases where everything is one binary value (all 0) or the other (all 1), sklearn returns only one element. We therefore wrap the sklearn confusion matrix function and handle these edge cases ourselves in get_confusion_matrix_elements().
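The single-element behaviour can be seen in isolation with a short experiment on a made-up list of labels. The sketch also shows the labels parameter of confusion_matrix, which forces a 2 x 2 result; that is an alternative to wrapping the function and is not what the article's code does.

```python
import warnings
from sklearn.metrics import confusion_matrix

all_foreground = [1, 1, 1, 1]

with warnings.catch_warnings():
    # Newer scikit-learn versions warn when only one label is present.
    warnings.simplefilter("ignore")
    collapsed = confusion_matrix(all_foreground, all_foreground)

# Only one class is present, so the matrix collapses to a single cell and
# cannot be unpacked into tn, fp, fn, tp.
print(collapsed.shape)

# Passing the labels explicitly keeps the full 2 x 2 layout.
full = confusion_matrix(all_foreground, all_foreground, labels=[0, 1])
tn, fp, fn, tp = full.ravel()
print(tn, fp, fn, tp)
```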

get_confusion_matrix_elements()

def get_confusion_matrix_elements(groundtruth_list, predicted_list):
    """Return confusion matrix elements, i.e. TN, FP, FN, TP, as floats.

    See example code for helper function definitions.
    """
    _assert_valid_lists(groundtruth_list, predicted_list)

    if _all_class_1_predicted_as_class_1(groundtruth_list, predicted_list) is True:
        tn, fp, fn, tp = 0, 0, 0, np.float64(len(groundtruth_list))

    elif _all_class_0_predicted_as_class_0(groundtruth_list, predicted_list) is True:
        tn, fp, fn, tp = np.float64(len(groundtruth_list)), 0, 0, 0

    else:
        tn, fp, fn, tp = sklearn.metrics.confusion_matrix(groundtruth_list, predicted_list).ravel()
        tn, fp, fn, tp = np.float64(tn), np.float64(fp), np.float64(fn), np.float64(tp)

    return tn, fp, fn, tp

Accuracy

For binary classification, accuracy is a common validation metric. It is calculated as

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP = true positive, TN = true negative, FP = false positive, FN = false negative.

get_accuracy()

def get_accuracy(groundtruth_list, predicted_list):
    tn, fp, fn, tp = get_confusion_matrix_elements(groundtruth_list, predicted_list)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    return accuracy

Accuracy ranges from 0 to 1, with 0 being the worst and 1 the best. If an algorithm labels everything as background, or everything as foreground, it can still achieve high accuracy when one class dominates. We therefore need a metric that accounts for the imbalance in class size, especially because the current image has far more background pixels (class 0) than foreground pixels (class 1).

F1 score

The F1 score ranges from 0 to 1 and is calculated as

$$F_1 = \frac{2\,TP}{2\,TP + FP + FN}$$

where 0 is the worst prediction and 1 is the best. The F1 score calculation below also handles the edge cases.

get_f1_score()

def get_f1_score(groundtruth_list, predicted_list):
    """Return F1 score covering edge cases"""
    tn, fp, fn, tp = get_confusion_matrix_elements(groundtruth_list, predicted_list)

    if _all_class_0_predicted_as_class_0(groundtruth_list, predicted_list) is True:
        f1_score = 1
    elif _all_class_1_predicted_as_class_1(groundtruth_list, predicted_list) is True:
        f1_score = 1
    else:
        # Parentheses fixed so that the whole sum forms the denominator.
        f1_score = (2 * tp) / ((2 * tp) + fp + fn)

    return f1_score

An F1 score above 0.8 is generally considered good and indicates that the prediction is performing well.

MCC

MCC stands for Matthews correlation coefficient, which is calculated as follows:

$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

It ranges from -1 to +1. A value of -1 means the ground truth and the prediction are perfectly anti-correlated, 0 corresponds to a random result in which only some predictions match, and +1 means the ground truth and the prediction match exactly, a perfect positive correlation. Because it uses all four elements of the confusion matrix, MCC is a better validation metric than accuracy when the classes are imbalanced.

In the MCC calculation, the numerator uses only the four inner cells of the confusion matrix (a cross product of its elements), while the denominator uses its four outer cells (the product of the row and column totals). When the denominator is 0, MCC is undefined (numpy.nan), which signals that the classifier is going in the wrong direction. However, to obtain valid values and to be able to average MCC over different images if necessary, we set MCC to -1 (the worst value in the range) in that case. The other edge cases are those in which all elements are correctly detected as foreground, or all as background; there, MCC and the F1 score are set to 1. When every element is misclassified, MCC is set to -1 and the F1 score is 0.

To learn more about MCC and its edge cases, and why MCC is better than accuracy or the F1 score, read the following articles:

https://lettier.github.io/posts/2016-08-05-matthews-correlation-coefficient.html

https://en.wikipedia.org/wiki/Matthews_correlation_coefficient#Advantages_of_MCC_over_accuracy_and_F1_score

get_mcc()

def get_mcc(groundtruth_list, predicted_list):
    """Return mcc covering edge cases"""
    tn, fp, fn, tp = get_confusion_matrix_elements(groundtruth_list, predicted_list)

    if _all_class_0_predicted_as_class_0(groundtruth_list, predicted_list) is True:
        mcc = 1
    elif _all_class_1_predicted_as_class_1(groundtruth_list, predicted_list) is True:
        mcc = 1
    elif _all_class_1_predicted_as_class_0(groundtruth_list, predicted_list) is True:
        mcc = -1
    elif _all_class_0_predicted_as_class_1(groundtruth_list, predicted_list) is True:
        mcc = -1

    elif _mcc_denominator_zero(tn, fp, fn, tp) is True:
        mcc = -1

    # Finally calculate MCC
    else:
        mcc = ((tp * tn) - (fp * fn)) / (
            np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))

    return mcc

Finally, we can compare the validation metrics side by side.

>>> validation_metrics = get_validation_metrics(groundtruth, predicted)

{'mcc': 0.8533910225863214, 'f1_score': 0.8493358633776091, 'tp': 5595.0, 'fn': 1863.0, 'fp': 122.0, 'accuracy': 0.9924278259277344, 'tn': 254564.0}

The accuracy is close to 1 because the example image contains many background pixels that are correctly detected as background (that is, the number of true negatives is naturally high). This shows why accuracy is not a good metric for binary classification when the classes are imbalanced.
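The three metrics can be recomputed by hand from the four counts reported for this image (TP = 5595, FN = 1863, FP = 122, TN = 254564), using the formulas given earlier. The short check below does only that arithmetic and assumes nothing beyond those counts.

```python
import math

# Confusion matrix counts reported for the 512 x 512 example image.
tp, fn, fp, tn = 5595.0, 1863.0, 122.0, 254564.0

total = tp + tn + fp + fn            # equals 512 * 512 pixels
accuracy = (tp + tn) / total
f1_score = (2 * tp) / (2 * tp + fp + fn)
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

# True negatives make up most of the image, which is what pushes the
# accuracy towards 1 while F1 and MCC stay near 0.85.
print(tn / total, accuracy, f1_score, mcc)
```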

The F1 score is about 0.85. In this case, therefore, we may not need a more complex thresholding algorithm for binary segmentation. If all the images in the stack have similar histogram distributions and noise, Otsu's method can be used and should give reasonably good predictions.

The high MCC also indicates a strong correlation between the ground truth and the predicted image, which can be seen clearly in the predicted binary image in the previous section.

Now, let's visualize how the confusion matrix elements TP, FP, FN, and TN are distributed over the image. This shows where the threshold picks up foreground (vessels) where there is none (FP), where real vessels go undetected (FN), and vice versa.

Validation visualization

To visualize the confusion matrix elements, we locate exactly where each element falls in the image. For example, the TP array (the pixels correctly detected as foreground) is found by taking the logical AND of the ground truth and predicted arrays. Similarly, we use logical boolean operations to find the FP, FN, and TN arrays.

get_confusion_matrix_intersection_mats()

def get_confusion_matrix_intersection_mats(groundtruth, predicted):
    """Returns dict of 4 boolean numpy arrays with True at TP, FP, FN, TN"""
    confusion_matrix_arrs = {}

    groundtruth_inverse = np.logical_not(groundtruth)
    predicted_inverse = np.logical_not(predicted)

    confusion_matrix_arrs['tp'] = np.logical_and(groundtruth, predicted)
    confusion_matrix_arrs['tn'] = np.logical_and(groundtruth_inverse, predicted_inverse)
    confusion_matrix_arrs['fp'] = np.logical_and(groundtruth_inverse, predicted)
    confusion_matrix_arrs['fn'] = np.logical_and(groundtruth, predicted_inverse)

    return confusion_matrix_arrs
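On a tiny made-up pair of 2 x 2 masks (an assumption for illustration), the same logical operations place each pixel in exactly one of the four arrays:

```python
import numpy as np

# Each pixel of this pair falls into a different confusion matrix cell.
groundtruth = np.array([[255, 0],
                        [255, 0]], dtype=np.uint8)
predicted = np.array([[255, 255],
                      [0, 0]], dtype=np.uint8)

tp = np.logical_and(groundtruth, predicted)                                   # top-left
fp = np.logical_and(np.logical_not(groundtruth), predicted)                   # top-right
fn = np.logical_and(groundtruth, np.logical_not(predicted))                   # bottom-left
tn = np.logical_and(np.logical_not(groundtruth), np.logical_not(predicted))   # bottom-right

# Every pixel belongs to exactly one of the four boolean arrays.
coverage = tp.astype(int) + fp.astype(int) + fn.astype(int) + tn.astype(int)
print(coverage)
```

Nonzero values such as 255 count as True in these operations, so the 0/255 masks can be passed in directly without rescaling.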

We can then map the pixels in each array to a different color. For the overlay in this article, we map TP, FP, FN, and TN to the CMYK (cyan, magenta, yellow, black) space. You could also map them to (green, red, red, green), which gives an image in which everything red indicates a wrong prediction. The CMYK mapping lets us tell TP and TN apart.

get_confusion_matrix_overlaid_mask()

def get_confusion_matrix_overlaid_mask(image, groundtruth, predicted, alpha, colors):
    """Return the 'image' overlaid with a color mask where TP, FP, FN, TN are
    each a color given by the 'colors' dictionary"""
    image = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB)
    masks = get_confusion_matrix_intersection_mats(groundtruth, predicted)
    color_mask = np.zeros_like(image)
    for label, mask in masks.items():
        color = colors[label]
        mask_rgb = np.zeros_like(image)
        mask_rgb[mask != 0] = color
        color_mask += mask_rgb
    return cv2.addWeighted(image, alpha, color_mask, 1 - alpha, 0)

alpha = 0.5
confusion_matrix_colors = {
    'tp': (0, 255, 255),  # cyan
    'fp': (255, 0, 255),  # magenta
    'fn': (255, 255, 0),  # yellow
    'tn': (0, 0, 0)       # black
}
validation_mask = get_confusion_matrix_overlaid_mask(255 - grayscale, groundtruth, predicted, alpha, confusion_matrix_colors)
print('Cyan - TP')
print('Magenta - FP')
print('Yellow - FN')
print('Black - TN')
plt.imshow(validation_mask)
plt.axis('off')
plt.title('confusion matrix overlay mask')

Here we use OpenCV to overlay this color mask as a transparent layer on the original (non-inverted) grayscale image. This is called alpha compositing.
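The weighted sum behind the blend, output = alpha * image + (1 - alpha) * mask, can be written out in plain numpy. The sketch below uses a made-up 2 x 2 image and is only an approximation of what cv2.addWeighted computes; rounding at exact half values may differ by one intensity level.

```python
import numpy as np

alpha = 0.5

# A uniform gray RGB image and a color mask with one cyan (TP) pixel.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
color_mask = np.zeros_like(image)
color_mask[0, 0] = (0, 255, 255)

# Blend in floating point, then round and clip back to the 8-bit range.
blended = alpha * image.astype(np.float64) + (1 - alpha) * color_mask.astype(np.float64)
blended = np.clip(np.rint(blended), 0, 255).astype(np.uint8)
print(blended[0, 0], blended[1, 1])
```

Pixels under a black (TN) mask are simply darkened by the factor alpha, which is why the background of the overlay looks dimmer than the original image.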
