Semantic Integration and Annotation Techniques for Tourism Data Based on the Semantic Web
DOI:
https://doi.org/10.63313/CS.8018
Keywords:
Semantic Web, Tourism Knowledge Graph, Ontology Mapping, Semantic Annotation, Large Language Models, Data Integration
Abstract
As culture and tourism integration accelerates, the tourism industry is increasingly challenged by heterogeneous data sources, inconsistent structures, and semantic fragmentation. Semantic Web technologies, through ontology construction, RDF modeling, semantic mapping, and logical reasoning, offer a knowledge-driven approach to data integration and annotation. This paper comprehensively surveys key techniques for semantic tourism data integration and annotation, including data extraction, transformation, ontology alignment, and conflict resolution. It contrasts rule-based, machine learning, and deep learning annotation paradigms, and evaluates mainstream tools for structured and unstructured tourism data processing. The study further identifies technical bottlenecks in scalability, data quality, and semantic drift, proposing future directions such as LLM-assisted semantic annotation, incremental ontology evolution, and privacy-preserving reasoning. This research aims to provide a technical foundation and strategic reference for intelligent tourism systems and semantic data governance.
Copyright (c) 2025 by author(s) and Erytis Publishing Limited.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.







