WebMap -- Large Language Model-assisted Semantic Link Induction in the Web
Shiraj Pokharel, Georg P. Roßrucker, Mario M. Kubek
TL;DR
This paper addresses the inadequacy of conventional web search for research tasks by proposing WebMap extensions that fuse LLM-powered semantic induction with a peer-to-peer overlay of Cluster Files (TRCs). It introduces local term proximity graphs derived from contextual embeddings, enabling more nuanced document clustering and topic signaling than co-occurrence alone. A semantic signpost is built within clusters using an extended HITS framework to identify authorities and hubs (keywords and source topics), guiding directed connections between documents. Subcluster detection via density-based methods and a discussion of limitations (text-centric content, cross-cluster navigation, and distributed data integrity) accompany a roadmap for future multimodal integration and enterprise deployments. Together, these contributions aim to deliver more accurate, navigable, and semantically organized web-scale research support.
Abstract
Current web search engines support research tasks only inadequately, if they do not hinder them outright. This paper therefore proposes functional extensions of WebMap, a semantically induced overlay linking structure on the web that is designed to inherently facilitate research activities. These extensions support the dynamic determination and regrouping of document clusters, the creation of a semantic signpost in the web, and the interactive tracing of topics back to their origins.
