Some Problematic Aspects of Coliform Bacteria Clustering on Medical Images in the Task of Identifying Possible Diseases

Medical image analysis is one source of additional information about the phenomena under investigation. We consider images of coliform bacteria. Analysis of these images makes it possible to assess the likelihood that certain diseases will develop. To do this, the set of bacteria must be clustered and the bacteria counted. The paper highlights the features of clustering for coliform bacteria and presents clustering results for real data.


Introduction
A medical image is an image of cells or tissue sections of various human organs, usually taken under a microscope. A medical image is thus an image of the microworld. Such images help visualize and understand processes that are hidden from the human eye (Karthick & Maniraj, 2019; Orobinskyi, Petrenko & Lyashenko, 2019), including processes in parts of the human body that are otherwise inaccessible. Medical images are data with their own specific properties and features of use (Orobinskyi, Petrenko & Lyashenko, 2019; Lyashenko, Sinelnikova, Zeleniy & Babker, 2020). They are therefore a source of information for decision-making in the diagnosis and treatment of various diseases. One example is images of coliform bacteria. Coliform bacteria are a group of bacteria often used as indicators of the sanitary quality of food and water (Jeon et al., 2019; Wang & Deng, 2019). Their presence also indicates the possible presence of other pathogenic organisms. It is important to know the quantitative characteristics of accumulations of such bacteria: the concentration of bacteria helps to determine the possibility of infection and to diagnose the suspected disease correctly.
Various image processing methods are used to analyze medical images (Lyashenko, Sinelnikova, Zeleniy & Babker, 2020; Rabotiahov, Kobylin, Dudar & Lyashenko, 2018). The choice of method depends on the problem to be solved in the course of the analysis. It is therefore important not only to select suitable methods for image analysis, but also to justify the sequence in which they are applied. Some of these issues are discussed in this paper.
M. I. Razzak, S. Naz and A. Zaib discuss various problematic aspects of medical imaging (Razzak, Naz & Zaib, 2018). They point out that each medical image is unique and requires an individual approach. The authors also note the importance of medical image analysis in diagnosing diseases and pay special attention to analysis based on neural networks.
The study by D. Shen, G. Wu, and H. I. Suk provides a general overview of image analysis methods in the field of medical imaging (Shen, Wu & Suk, 2017). Particular attention is paid to neural networks, which are widely used in medical image processing. Nevertheless, the authors emphasize that image analysis methods must preserve the original information.
A review of medical image processing methods is presented in (Goel, Yadav & Singh, 2016). The authors note that medical imaging is becoming an important procedure in various medical applications. It provides additional information that helps improve both the efficiency of diagnosis and the treatment process. The authors also highlight certain problematic aspects of medical image processing.
R. Merjulah and J. Chandra in their work pay special attention to segmentation as one of the procedures for processing medical images (Merjulah & Chandra, 2017). The authors emphasize that segmentation is one of the most common methods for processing medical images. In this case, the main purpose of segmentation is to highlight the area of interest for its subsequent processing and analysis. At the same time, it is necessary to pay attention to clustering, which allows us to analyze all the objects that interest us.
The work (Wang et al., 2020) considers the detection and classification of living bacteria. The authors use a sequence of several images and analyze them with deep learning neural networks. This work also shows that clustering the bacteria that can be distinguished in the image is an important aspect.
M. Hiremath discusses a procedure for segmentation and recognition of E. coli bacterial cells (Hiremath, 2019), using medical images taken with a microscope. The author notes the need for preliminary coloring of such images. Filtering out small particles is also an important element of the image processing.
The work (Cernicchiaro et al., 2019) considers the quantitative determination of bacteria in order to assess pollution and possible contamination of the environment. D. Połap and M. Woźniak solve the problem of classifying different forms of bacteria (Połap & Woźniak, 2019). The authors first separate different bacteria based on their descriptive characteristics and then use a convolutional neural network model to refine this preliminary separation. B. Kis, M. Unay, G. D. Ekimci, U. K. Ercan, and A. Akan use imaging techniques to count bacterial colonies (Kis et al., 2019), applying various methods of image analysis. One of the problematic issues there is counting the bacteria in each colony.
Thus, a number of problematic aspects can arise when processing methods are applied to the analysis of coliform bacteria images. Figure 1 shows various images of coliform bacteria (Orobinskyi, Deineko & Lyashenko, 2020). Separating clusters of bacteria into individual bacteria is difficult because the junction points between such bacteria are hard to determine (Figure 3). It has been shown that the solution of this problem should be divided into two subtasks: finding the junction points of two objects and determining the boundaries of the objects. Thus, one of the problematic aspects in clustering accumulations of bacteria is determining the boundary points. The solution is to find the optimal relationship between the individual boundary points of the images of bacteria, which can be done using algorithms for finding the optimal path on a graph.
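As an illustration of finding an optimal path between boundary points, the following minimal sketch applies Dijkstra's algorithm to a small graph. The source does not name a specific algorithm, so the choice of Dijkstra, the Euclidean edge cost, the function name `shortest_boundary_path`, and the toy coordinates are all assumptions made for this example.

```python
import heapq
import math


def shortest_boundary_path(points, edges, start, goal):
    """Dijkstra's algorithm over a graph of boundary points.

    points: dict mapping a node id to its (x, y) pixel coordinates.
    edges:  dict mapping a node id to a list of neighbouring node ids.
    Edge cost is the Euclidean distance between points (an assumption).
    Returns (total_cost, path), or (math.inf, []) if goal is unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for nxt in edges.get(node, ()):
            cost = d + math.dist(points[node], points[nxt])
            if cost < dist.get(nxt, math.inf):
                dist[nxt] = cost
                prev[nxt] = node
                heapq.heappush(heap, (cost, nxt))
    if goal not in dist:
        return math.inf, []
    # Walk the predecessor links back from the goal to rebuild the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[goal], path


# Hypothetical boundary points near a junction between two touching bacteria.
points = {"A": (0, 0), "B": (3, 0), "C": (3, 4), "D": (0, 5)}
edges = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
cost, path = shortest_boundary_path(points, edges, "A", "C")
print(cost, path)
```

In practice the edge cost would more likely encode image evidence (for example, intensity or curvature along the candidate boundary) rather than plain distance; that choice is left open here.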

Some examples of coliform bacteria images in the context of problematic aspects of their processing
Another problematic aspect of clustering bacterial clumps is improving the quality of the original image. The main concern here is that preprocessing must not degrade the quality of the original image (Orobinskyi, Deineko & Lyashenko, 2020). Examples of such low-quality processing are shown in Figure 4, which presents the results of filtering the original image (Figure 1b).
Figure 4. Examples of different ways to filter the original image (panels a, b, c)
Therefore, when clustering accumulations of bacteria in medical images, the filtering procedure for the original image must be chosen carefully. The geometric and morphological characteristics of bacterial clusters and their individual varieties must also be taken into account.
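The trade-off between removing noise and preserving boundaries can be sketched with a simple 3x3 median filter. The source does not state which filters were compared, so the median filter, the function name `median_filter3x3`, and the toy patch below are assumptions chosen only to show a filter that removes an isolated noisy pixel while keeping a sharp edge intact.

```python
from statistics import median


def median_filter3x3(image):
    """Apply a 3x3 median filter; border pixels are copied unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# Toy 5x5 grayscale patch: a bright vertical edge plus one noisy pixel.
patch = [
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 7, 9, 9, 9],  # the 7 is isolated noise next to the edge
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]
filtered = median_filter3x3(patch)
for row in filtered:
    print(row)
```

A linear averaging filter applied to the same patch would blur the edge between columns 1 and 2, which is the kind of degradation that makes junction points harder to locate.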

An example of clustering results for bacteria concentrations
First of all, we note that the problem of clustering a set of bacteria is treated as the problem of coloring each individual bacterium with its own color. This makes it possible to count the bacteria quickly and to assess the possible degree of disease. In the clustering result, every bacterium has its own unique color, which allows a simple and quick enumeration of all the bacteria.
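The idea of giving each bacterium its own color and then counting colors can be sketched as connected-component labeling on a binary mask. This is a simplified stand-in, not the method of the paper: it assumes the bacteria are already separated (touching bacteria would first need to be split at their junction points), and the function name `label_components` and the toy mask are hypothetical.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask.

    Returns (labels, count): labels has the same shape as mask, with 0 for
    background and 1..count identifying each separate region ("color").
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or labels[y][x]:
                continue
            # New unlabeled foreground pixel: start a new region.
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy mask with three separate objects.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
labels, count = label_components(mask)
print(count)
```

The number of distinct labels is the object count, and mapping each label to a distinct display color gives the kind of colored result described above.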
For clustering, we used the following input parameters: the possible minimum and maximum size of bacteria, the depth of separation of bacteria at the junction points, the size of the area of the local extremum, and the geometry of the connectivity points in the vicinity of the junction points.
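One possible way to group these input parameters is a small configuration object. The sketch below is illustrative only: the source lists the parameters but not their representation, units, or values, so every field name, type, and number here is hypothetical, and only the two size bounds are actually used in the example function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusteringParams:
    # All names, units, and values are hypothetical placeholders.
    min_size: int            # smallest plausible bacterium area, in pixels
    max_size: int            # largest plausible single-bacterium area, in pixels
    separation_depth: float  # depth of separation at a junction point
    extremum_area: int       # size of the local-extremum area, in pixels
    junction_radius: int     # neighbourhood radius around a junction point


def classify_region(area, params):
    """Classify a labeled region by its pixel area using the size bounds."""
    if area < params.min_size:
        return "debris"   # too small to be a bacterium
    if area > params.max_size:
        return "clump"    # probably several touching bacteria
    return "single"


params = ClusteringParams(min_size=20, max_size=120, separation_depth=2.5,
                          extremum_area=9, junction_radius=3)
areas = [5, 64, 300, 120]
kinds = [classify_region(a, params) for a in areas]
print(kinds)
```

Regions classified as "clump" would be the candidates for further splitting at junction points, where the remaining parameters would come into play.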