Unsupervised Estimation of Subjective Content Descriptions

Magnus Bender, Tanya Braun, Ralf Möller, Marcel Gehrke

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Abstract

An agent in pursuit of a task may work with a corpus containing text documents. One possible task of the agent is to retrieve documents of similar content and highlight relevant locations in retrieved documents. To perform information retrieval on the corpus, the agent may need additional data associated with the documents. Subjective Content Descriptions (SCDs) provide additional location-specific data for text documents. However, the agent needs SCDs referencing sentences of similar content across various documents in the corpus, and most text documents are not associated with SCDs. Therefore, this paper presents UESM, an unsupervised approach to estimate SCDs for text documents, i.e., to associate any corpus with SCDs. In an evaluation, we show that the performance of UESM is on par with latent Dirichlet allocation, while UESM provides SCDs referencing sentences of similar content.
Original language: English
Title: Proceedings - 17th IEEE International Conference on Semantic Computing, ICSC 2023
Number of pages: 8
Publisher: IEEE
Publication date: 20 Mar 2023
Pages: 266-273
ISBN (Print): 978-1-6654-8264-6
ISBN (Electronic): 978-1-6654-8263-9
Status: Published - 20 Mar 2023
Published externally: Yes
Event: 2023 IEEE 17th International Conference on Semantic Computing (ICSC)
Duration: 1 Feb 2023 to 3 Feb 2023
Conference number: 17

Conference

Conference: 2023 IEEE 17th International Conference on Semantic Computing (ICSC)
Number: 17
Period: 1 Feb 2023 to 3 Feb 2023

Keywords

  • "Embedding; Named Entity Recognition; Entailment"
  • "Semantics; Models; Recommender Systems"
