Automatically obtaining by methods of flow cytometry and cluster analysis simplified leukocyte formula

Authors

  • Andrey V. Orekhov St. Petersburg State University, 199034, St. Petersburg, Russian Federation
  • Viktor I. Shishkin St. Petersburg State University, 199034, St. Petersburg, Russian Federation
  • Galina V. Kudriavtseva St. Petersburg State University, 199034, St. Petersburg, Russian Federation
  • Galina V. Pavilaynen St. Petersburg State University, 199034, St. Petersburg, Russian Federation
  • Viktor V. Shishkin St. Petersburg State University, 199034, St. Petersburg, Russian Federation
  • Nikolay S. Lyudkevich St. Petersburg State University, 199034, St. Petersburg, Russian Federation

DOI:

https://doi.org/10.21638/11701/spbu10.2023.404

Abstract

The leukocyte formula is the percentage of each group of white blood cells in the total leukocyte count. By morphological features, leukocytes fall into three subpopulations: lymphocytes, monocytes, and granulocytes. Granulocytes are further divided into neutrophilic, eosinophilic, and basophilic cells. Automatic typing of white blood cells remains an unsolved problem: in current cytometric practice, the cells in each leukocyte subpopulation are in effect counted manually, which makes the result subjective and leads to large counting errors. Cluster analysis methods have repeatedly been tried as a solution. Computational experiments have shown that standard algorithms, such as agglomerative methods, the EM algorithm, and DBSCAN, do not produce the desired results. In recent years, many papers have described specialized clustering algorithms for detecting and identifying white blood cell populations, and some of these algorithms have found practical application. However, the problems caused by heavy noise and nonuniform data density when leukocytes are clustered from flow cytometry data remain open. The article considers an approach to the automatic identification of the main leukocyte subpopulations based on a modified agglomerative centroid clustering method and discusses the results of computational experiments. The proportion of lymphocytes obtained by manual counting is compared with the proportion obtained automatically by the modified centroid algorithm.
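
As a rough illustration of the two notions used in the abstract, the following Python sketch runs the plain (unmodified) agglomerative centroid method on synthetic two-dimensional events and converts the resulting cluster sizes into percentages. It is not the modified algorithm of the article: the number of clusters is fixed at three by hand, the data are invented, and the names `centroid_agglomerate`, `leukocyte_formula`, and `blob` are hypothetical.

```python
import math
import random


def centroid_agglomerate(points, n_clusters):
    """Merge the two clusters with the closest centroids until n_clusters remain."""
    # Each cluster is stored as (centroid, list of member indices).
    clusters = [(tuple(p), [i]) for i, p in enumerate(points)]
    merge_distances = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = math.dist(clusters[a][0], clusters[b][0])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        (ca, ma), (cb, mb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        # The centroid of the union is the size-weighted mean of both centroids.
        merged = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        clusters[a] = (merged, ma + mb)
        del clusters[b]
        merge_distances.append(d)
    return clusters, merge_distances


def leukocyte_formula(clusters):
    """Percentage of all events that falls into each cluster."""
    total = sum(len(members) for _, members in clusters)
    return [100.0 * len(members) / total for _, members in clusters]


def blob(center, n, spread=1.0):
    """Synthetic Gaussian cloud of n events around center (illustration only)."""
    return [
        (random.gauss(center[0], spread), random.gauss(center[1], spread))
        for _ in range(n)
    ]


# Invented, well-separated groups of events; not data from the article.
random.seed(0)
events = blob((30.0, 10.0), 15) + blob((55.0, 30.0), 5) + blob((70.0, 80.0), 30)

# The target of three clusters is an assumption made for this sketch.
clusters, merge_distances = centroid_agglomerate(events, n_clusters=3)
formula = sorted(leukocyte_formula(clusters))
print(formula)
```

The list `merge_distances` records the centroid distance at each merge. A sequence of this kind is what an automatic stopping rule could inspect instead of a hand-picked cluster count; the sketch does not attempt to reproduce such a rule.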

Keywords:

leukocyte formula, flow cytometry, cluster analysis, Markov moment, least squares method

Published

2023-12-29

How to Cite

Orekhov, A. V., Shishkin, V. I., Kudriavtseva, G. V., Pavilaynen, G. V., Shishkin, V. V., & Lyudkevich, N. S. (2023). Automatically obtaining by methods of flow cytometry and cluster analysis simplified leukocyte formula. Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes, 19(4), 469–483. https://doi.org/10.21638/11701/spbu10.2023.404

Section

Applied Mathematics