Data source: CROSBI

Cross-language information retrieval by reduced k-means (CROSBI ID 256412)

Journal contribution | original scientific paper | international peer review

Dobša, Jasminka ; Mladenić, Dunja ; Rupnik, Jan ; Radošević, Danijel ; Magdalenić, Ivan Cross-language information retrieval by reduced k-means // International journal of computer information systems and industrial management applications, 10 (2018), 1; 314-322

Responsibility data

Dobša, Jasminka ; Mladenić, Dunja ; Rupnik, Jan ; Radošević, Danijel ; Magdalenić, Ivan

English

Cross-language information retrieval by reduced k-means

Cross-language information retrieval aims at retrieving relevant documents in one language for a query posed in another language. Here we propose a new approach to cross-language information retrieval based on factorization of a term-document matrix by the iterative method of Reduced k-means clustering. The Reduced k-means method aims at simultaneous reduction of objects (documents) and variables (index terms). The proposed method is compared with two standard machine learning techniques for cross-language information retrieval: latent semantic indexing and canonical correlation analysis. The motivation for applying Reduced k-means to cross-language information retrieval comes from the observation that documents in a semantic space obtained by latent semantic indexing cluster primarily by language rather than by topic. Because Reduced k-means aims at preserving the clustering structure of the data, the idea is that the proposed method could address this problem.
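This record does not spell out the algorithm, so the following is only a minimal sketch of Reduced k-means under an assumption: the commonly cited alternating least-squares formulation, which minimises ||X - U F A^T||^2 over a cluster membership matrix U, reduced-space centroids F, and a loading matrix A with orthonormal columns. The function name `reduced_kmeans`, its parameters, and the convention that rows of X are documents and columns are index terms are illustrative choices, not taken from the paper.

```python
import numpy as np

def reduced_kmeans(X, k, q, n_iter=50, seed=0):
    """Sketch of Reduced k-means (hypothetical helper, not the authors' code).

    X : (n_docs, n_terms) array; k : number of clusters;
    q : dimension of the reduced space, with q <= min(k, n_terms).
    Returns (labels, F, A) with F of shape (k, q) and A of shape (n_terms, q).
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    assert q <= min(k, m), "q must not exceed min(k, number of variables)"
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)               # random initial partition
    M = X[rng.choice(n, size=k, replace=False)]       # fallback centroids (a copy)
    for _ in range(n_iter):
        sizes = np.bincount(labels, minlength=k)
        for c in range(k):
            if sizes[c] > 0:                          # empty clusters keep old centroid
                M[c] = X[labels == c].mean(axis=0)
        # For a fixed partition, the optimal A spans the top-q right singular
        # vectors of the centroid matrix weighted by sqrt(cluster size).
        _, _, Vt = np.linalg.svd(np.sqrt(sizes)[:, None] * M, full_matrices=False)
        A = Vt[:q].T
        F = M @ A                                     # centroids in the reduced space
        Y = X @ A                                     # documents in the reduced space
        d2 = ((Y[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)                # nearest reduced-space centroid
        if np.array_equal(new_labels, labels):
            break                                     # partition is stable
        labels = new_labels
    # If n_iter is exhausted, F and A correspond to the previous partition.
    return labels, F, A
```

In a cross-language setting one could, as a further assumption, build X from aligned documents whose term vectors in both languages are concatenated, and then compare queries and documents after projection with `A`.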

cross-language information retrieval, dimensionality reduction, latent semantic indexing, canonical correlation analysis, Reduced k-means

Publication data

Volume (issue): 10 (1)

Year: 2018

Pages: 314-322

Status: published

ISSN: 2150-7988

Work associations

Information and communication sciences