Effects of dataset characteristics on the performance of feature selection techniques (CROSBI ID 234375)
Prilog u časopisu | izvorni znanstveni rad | međunarodna recenzija
Podaci o odgovornosti
Oreški, Dijana ; Oreški, Stjepan ; Kliček, Božidar
engleski
Effects of dataset characteristics on the performance of feature selection techniques
While extensive research in data mining has been devoted to developing better feature selection techniques, none of this research has examined the intrinsic relationship between dataset characteristics and a feature selection technique’s performance. Thus, our research examines experimentally how dataset characteristics affect both the accuracy and the time complexity of feature selection. To evaluate the performance of various feature selection techniques on datasets of different characteristics, extensive experiments with five feature selection techniques, three types of classification algorithms, seven types of dataset characterization methods and all possible combinations of dataset characteristics are conducted on 128 publicly available datasets. We apply the decision tree method to evaluate the interdependencies between dataset characteristics and performance. The results of the study reveal the intrinsic relationship between dataset characteristics and feature selection techniques’ performance. Additionally, our study contributes to research in data mining by providing a roadmap for future research on feature selection and a significantly wider framework for comparative analysis.
dataset characteristics ; feature selection ; comparative analysis ; data sparsity ; feature noise
nije evidentirano
nije evidentirano
nije evidentirano
nije evidentirano
nije evidentirano
nije evidentirano
Podaci o izdanju
Povezanost rada
Informacijske i komunikacijske znanosti