Cooperative game theory methods for text ranking
DOI:
https://doi.org/10.21638/11701/spbu10.2022.105
Abstract
A method is proposed for ranking the texts in the corpus of a news portal using graph centrality measures. Each text is assigned to a vertex of a graph whose structure is determined by the semantic connectivity of the texts. Centrality is measured by the Myerson value of a cooperative game on the graph, in which the characteristic function is the number of simple paths of a given length m in a subgraph. The ranking induced by the Myerson value differs for different values of m. To obtain the final ranking, a ranking procedure based on the tournament matrix is proposed. The ranking algorithm is illustrated by numerical examples for a specific news portal.
Keywords:
text corpus of news, graph, centrality measure, Myerson value, tournament matrix, ranking procedure
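The centrality measure described in the abstract can be illustrated on a toy graph. The sketch below is not the authors' implementation. It assumes an undirected, unweighted graph, counts each simple path with exactly m edges once, and takes that count in the induced subgraph as the characteristic function. Under these assumptions the Myerson value coincides with the Shapley value of the game, and each path contributes 1/(m + 1) to every vertex on it. The function names, the example graph, and the win-minus-loss aggregation in `final_ranking` are hypothetical stand-ins; the paper's own tournament-matrix procedure may differ.

```python
from itertools import combinations
from math import factorial


def simple_paths(adj, m):
    """All simple paths with exactly m edges; each undirected path is kept once."""
    paths = set()

    def extend(path):
        if len(path) == m + 1:
            # Canonical direction, so a path and its reverse count as one.
            paths.add(min(tuple(path), tuple(reversed(path))))
            return
        for nxt in adj[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for start in adj:
        extend([start])
    return paths


def characteristic(adj, coalition, m):
    """v(S): number of simple paths with m edges in the subgraph induced by S."""
    sub = {u: [w for w in adj[u] if w in coalition] for u in coalition}
    return len(simple_paths(sub, m))


def myerson_bruteforce(adj, m):
    """Shapley value of v straight from its definition (exponential time)."""
    nodes = sorted(adj)
    n = len(nodes)
    value = {}
    for i in nodes:
        others = [u for u in nodes if u != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                gain = characteristic(adj, s | {i}, m) - characteristic(adj, s, m)
                total += weight * gain
        value[i] = total
    return value


def myerson_by_paths(adj, m):
    """Shortcut: every path splits one unit equally among its m + 1 vertices."""
    value = {u: 0.0 for u in adj}
    for path in simple_paths(adj, m):
        for u in path:
            value[u] += 1.0 / (m + 1)
    return value


def tournament_matrix(scores_by_m):
    """T[i][j]: number of values of m for which i scores strictly above j."""
    nodes = sorted(next(iter(scores_by_m.values())))
    return {
        i: {j: sum(s[i] > s[j] for s in scores_by_m.values())
            for j in nodes if j != i}
        for i in nodes
    }


def final_ranking(scores_by_m):
    """Stand-in aggregation (assumption): order by pairwise wins minus losses."""
    t = tournament_matrix(scores_by_m)
    net = {i: sum(t[i][j] - t[j][i] for j in t[i]) for i in t}
    return sorted(net, key=lambda i: (-net[i], i))


# Hypothetical 5-vertex graph: a triangle 2-3-4 with pendant vertices 1 and 5.
adj = {1: [2], 2: [1, 3, 4], 3: [2, 4], 4: [2, 3, 5], 5: [4]}
scores_by_m = {m: myerson_by_paths(adj, m) for m in (1, 2, 3)}
print(final_ranking(scores_by_m))
```

The brute-force function is included only to check the shortcut on small graphs. For a real corpus, the path-counting form is the practical one, although enumerating simple paths still grows quickly with m.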
License
Articles in "Vestnik of Saint Petersburg University. Applied Mathematics. Computer Science. Control Processes" are open access and are distributed under the terms of the License Agreement with Saint Petersburg State University, which permits the authors unrestricted distribution and self-archiving free of charge.