Abstract

Impact of Similarity Measures on Causal Relation Based Feature Selection Method for Clustering Maritime Accident Reports

Santosh Tirunagari, Maria Hanninen, Guggilla Abhishek, Kaarle Stahlberg, and Pentti Kujala

Unsupervised document clustering is an automated process in which documents are grouped based on their similarity. In this paper, we propose a new feature selection method based on causal relations to classify maritime accident reports in an unsupervised manner. We also compare the impact of different similarity measures on the proposed feature selection method. Based on the analysis, we conclude that the proposed feature selection method performs better than the conventional method, a difference we attribute to the curse of dimensionality. The similarity measures also perform better when combined with the proposed feature selection method. In the analysis, we compared the Correlation, Cosine, Spearman, Bray-Curtis, Euclidean, City-block, Squared Euclidean, Standardized Euclidean, and Chebyshev similarity measures. Correlation and Cosine produced the best results, followed by Spearman and Bray-Curtis. The remaining measures did not produce good results with the maritime accident reports used in our analysis. Interestingly, Chi-Square gave good results with the proposed method in our analysis.
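To make the comparison of measures concrete, the following is a minimal sketch of several of the measures named above, applied to toy term-count vectors. It is not the authors' implementation: the function names and the three example vectors are hypothetical, the causal-relation feature selection step is not shown, and each measure is written in its common distance form (smaller means more similar).

```python
import math

def cosine_distance(u, v):
    # 1 - cos(angle between u and v); ignores vector length.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def correlation_distance(u, v):
    # Cosine distance computed on mean-centered vectors.
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_distance([a - mean_u for a in u],
                           [b - mean_v for b in v])

def bray_curtis_distance(u, v):
    # Sum of absolute differences over sum of absolute sums.
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(abs(a + b) for a, b in zip(u, v))
    return num / den

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cityblock_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def chebyshev_distance(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

# Hypothetical term counts for three reports over a 4-term vocabulary.
doc_a = [2, 0, 1, 0]
doc_b = [4, 0, 2, 0]   # same term proportions as doc_a, twice as long
doc_c = [0, 3, 0, 1]   # shares no terms with doc_a

for name, fn in [("cosine", cosine_distance),
                 ("correlation", correlation_distance),
                 ("bray-curtis", bray_curtis_distance),
                 ("euclidean", euclidean_distance),
                 ("city-block", cityblock_distance),
                 ("chebyshev", chebyshev_distance)]:
    print(f"{name:12s} a-b: {fn(doc_a, doc_b):.3f}  a-c: {fn(doc_a, doc_c):.3f}")
```

In this toy case the cosine and correlation distances treat doc_a and doc_b as identical because they depend only on the direction of the vectors, whereas the Euclidean, city-block, and Chebyshev distances grow with document length. This is one commonly cited reason the former tend to suit text data, although the sketch alone does not reproduce the paper's results.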

Indexed in

Google Scholar
Academic Journals Database
Open J Gate
Academic Keys
ResearchBible
CiteFactor
Elektronische Zeitschriftenbibliothek
RefSeek
Hamdard University
Scholar
International Innovative Journal Impact Factor (IIJIF)
International Institute of Organized Research (I2OR)
Cosmos
