The "Diversum" Project
In 2010, the Web Mining Lab of PJIIT obtained funding from the National Science Centre for a research project:
Title of the project: "Models and algorithms for supporting non-standard information processing on semantic knowledge graphs including search result diversification and adaptation to Polish language"
Project's goal: to study the processing of information represented as semantic knowledge graphs, with particular emphasis on applying the concept of result diversity to graphical entity summarisation
Team: dr Marcin Sydow (head); Mariusz Pikuła; Mateusz Chochół (admin); Grzegorz Sobczak; dr hab. Ralf Schenkel (MPII, Saarbruecken); dr Jakub Piskorski (IPI PAN); Prof. dr hab. Witold Kosiński; A.Wróblewska; T.Kuśmierczyk
The obtained results include:
- Design and implementation of novel algorithms for graphical entity summarisation including the diversification approach
- Successful user evaluation experiments proving the value of the novel algorithms and the application of the diversification approach
- Novel method for supporting automatic extraction of semantic knowledge graphs from web documents, adapted to the Polish language
- Preparation of a semantic knowledge graph extracted from Polish news articles
- Novel automatic visualisation methods tailored to graphical entity summarisation
- Formulating the Diversum problem as an optimisation problem with a new objective function, and showing its NP-hardness
- Experiments with a novel ant-colony optimisation algorithm with self-adaptation, applied to the above problem
- New effective greedy heuristic with an approximation factor guarantee for the above problem
- Design of a new quality measure for graphical result layouts, together with an NP-hardness result and a proposed algorithm for its computation
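To make the idea of a budgeted, diversity-aware greedy heuristic more concrete, the following is a minimal sketch only. It is not the project's algorithm or objective function: the `Edge` type, the `weight` score, the `marginal_gain` formula and the entity names are all assumptions introduced for illustration. The sketch picks up to `budget` edges around an entity, preferring edge labels that have not been used yet and using the weight only as a tie-breaker.

```python
from collections import Counter
from typing import List, NamedTuple


class Edge(NamedTuple):
    """One fact about the summarised entity (hypothetical representation)."""
    subject: str
    label: str
    obj: str
    weight: float  # assumed importance score, higher is better


def marginal_gain(edge: Edge, used_labels: Counter) -> float:
    # Illustrative objective: a label seen k times already contributes
    # 1 / (1 + k), so repeated labels are discounted; the weight is a
    # small tie-breaker among edges with the same novelty.
    novelty = 1.0 / (1 + used_labels[edge.label])
    return novelty + 0.01 * edge.weight


def greedy_summary(edges: List[Edge], budget: int) -> List[Edge]:
    """Greedily select at most `budget` edges with the highest marginal gain."""
    remaining = list(edges)
    chosen: List[Edge] = []
    used_labels: Counter = Counter()
    while remaining and len(chosen) < budget:
        best = max(remaining, key=lambda e: marginal_gain(e, used_labels))
        chosen.append(best)
        used_labels[best.label] += 1
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    # Made-up facts about a placeholder entity "e0".
    facts = [
        Edge("e0", "actedIn", "film_a", 0.9),
        Edge("e0", "actedIn", "film_b", 0.8),
        Edge("e0", "actedIn", "film_c", 0.7),
        Edge("e0", "directed", "film_d", 0.5),
        Edge("e0", "bornIn", "city_x", 0.4),
    ]
    for edge in greedy_summary(facts, budget=3):
        print(edge.label, edge.obj)
```

With a budget of three, a purely weight-based selection would return three `actedIn` edges, whereas this sketch returns one edge for each of the three labels. That is the kind of trade-off between importance and diversity that the results above refer to.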
Publications related to the project (13 publications, including 1 in a journal with "impact factor"):
- M.Sydow, M.Pikuła, R.Schenkel "The notion of diversity in graphical entity summarisation on semantic knowledge graphs" Journal of Intelligent Information Systems, Volume 41, Issue 2, pages 109-149, ISSN 0925-9902, http://dx.doi.org/10.1007/s10844-013-0239-6, Springer US (Open Access), 2013 [pdf] [bibtex]
- W.Kosiński, T.Kuśmierczyk, P.Rembelski, M.Sydow "Application of Ant-Colony Optimisation to Compute Diversified Entity Summarisation on Semantic Knowledge Graphs", Proc. of International IEEE AAIA 2013/FedCSIS Conference, Annals of Computer Science and Information Systems, Volume 1, pp. 69-76, ISSN 2300-5963, ISBN 978-1-4673-4471-5 (Web), 2013 [pdf]
- J. Piskorski and M. Ehrmann. On Named Entity Recognition in Targeted Twitter Streams in Polish. In Proceedings of the 4th Biennial Workshop on Balto-Slavic Natural Language Processing (BSNLP), collocated with ACL 2013, pages 84-89. Association for Computational Linguistics, 2013.
- M.Kacprzak, W.Kosiński, and K.Węgrzyn-Wolska. "Diversity of opinion evaluated by ordered fuzzy numbers". In L.Rutkowski, M.Korytkowski, R.Scherer, R.Tadeusiewicz, L.A. Zadeh, and Jacek M. Zurada, editors, Artificial Intelligence and Soft Computing, volume 7894 of Lecture Notes in Computer Science, pages 271-281. Springer Berlin Heidelberg, 2013.
- G.Sobczak, M.Pikuła, M.Sydow "AGNES: a Novel Algorithm for Visualising Diversified Graphical Entity Summarisations on Knowledge Graphs" Foundations of Intelligent Systems, Proc. of 20th International Symposium, ISMIS 2012, Macau, China, December 4-7, 2012, LNCS Vol.7661, pp. 182-191, ISBN 978-3-642-34623-1 LNCS/Springer, 2012
- A.Wróblewska, M.Sydow "DEBORA: dependency-based Method for Extracting Entity-relationship Triples from Open-domain Texts in Polish" Foundations of Intelligent Systems, Proc. of 20th International Symposium, ISMIS 2012, Macau, China, December 4-7, 2012, LNCS Vol.7661, pp. 155-161, ISBN 978-3-642-34623-1 LNCS/Springer, 2012
- A.Wróblewska, M.Sydow "Dependency-based Extraction of Entity-relationship Triples from Polish Open-domain Texts" (extended version) Proc. of Artificial Intelligence Studies, vol.7(30), pp. 61-70, ISBN 978-83-7051-687-1, Publ. House of Univ. of Natural Sciences and Humanities, Siedlce, Poland, 2012
- M.Sydow, M.Pikuła, R.Schenkel "To Diversify or not to Diversify Entity Summaries on RDF Knowledge Graphs?" 19th ISMIS 2011 Conference, Vol. 6804, pp. 490-500, ISBN 978-3-642-21915-3 LNAI/Springer, 2011 [bibtex]
- M.Sydow, M.Pikuła, R.Schenkel "Entity Summarisation with Limited Edge Budget on Undirected and Directed Knowledge Graphs (extended journal version)" Investigationes Linguisticae, Vol. 21, pp. 76-89, ISSN 1426-188X, 2011 [pdf] [bibtex]
- M.Sydow "Towards the Foundations of Diversity-Aware Node Summarisation on Knowledge Graphs" "Diversity in Document Retrieval" Workshop on ECIR 2011 Conference, 2011 [bibtex]
- M.Sydow, K.Ciesielski, J.Wajda "Introducing Diversity to Log-based Query Suggestions to Deal with Underspecified User Queries" Proc. of the SIIS 2011 Conference, Vol. 7053, pp. 251-264, ISBN 978-3-642-25260-0 LNCS/Springer
- M.Sydow, M.Pikuła, R.Schenkel, A.Siemion "Entity Summarisation with Limited Edge Budget on Knowledge Graphs" IMCSIT/CLA 2010 Conference, pp. 513-516, ISBN 978-83-60810-22-4, IEEE, 2010 [bibtex]
- M.Sydow, M.Pikuła, R.Schenkel "DIVERSUM: Towards Diversified Summarisation of Entities in Knowledge Graphs" Data Engineering Workshops (ICDEW), 2010 IEEE 26th ICDE Conference, pp. 221-226, ISBN 978-1-4244-6522-4, IEEE, 2010 [bibtex]