%0 Journal Article
%T Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset
%J Journal of Kerman University of Medical Sciences
%I Kerman University of Medical Sciences
%Z 1023-9510
%A Raeisi shahraki, Hadi
%A Kalantari, Shiva
%A Nafar, Mohsen
%D 2019
%\ 05/01/2019
%V 26
%N 3
%P 207-213
%! Classification of Chronic Kidney Disease Patients via k-important Neighbors in High Dimensional Metabolomics Dataset
%K Chronic kidney disease
%K Classification
%K High dimensional data
%K KNN
%K SCAD
%R 10.22062/jkmu.2019.89501
%X Background: Chronic kidney disease (CKD), characterized by progressive loss of renal function, is a growing problem in the general population. New analytical technologies such as “omics”-based approaches, including metabolomics, provide a useful platform for biomarker discovery and improvement of CKD management. In metabolomics studies, prediction accuracy is not the only goal; variable importance is also critical, because the identified biomarkers reveal the pathogenic metabolic processes underlying the progression of CKD. We aimed to use k-important neighbors (KIN) to analyze a high dimensional metabolomics dataset and classify patients as having mild or advanced CKD. Methods: Urine samples were collected from CKD patients (n=73). The patients were classified, based on metabolite biomarkers, into two groups: mild CKD (glomerular filtration rate (GFR) > 60 mL/min per 1·73 m²) and advanced CKD (GFR < 60 mL/min per 1·73 m²). Accordingly, 48 and 25 patients were in the mild (class 1) and advanced (class 2) groups, respectively. KIN was recently proposed as a novel approach to high dimensional binary classification. By employing a hybrid dissimilarity measure, KIN incorporates information on both variables and distances simultaneously.
Results: The proposed KIN not only selected a small number of biomarkers but also reached a higher accuracy than traditional k-nearest neighbors (KNN; 61.2% versus 60.4%) and random forest (61.2% versus 58.5%), which are currently known as the best classifiers. Conclusion: Results on a real metabolomics dataset demonstrate the superiority of the proposed KIN over KNN in terms of both classification accuracy and variable importance.
%U https://jkmu.kmu.ac.ir/article_89501_c8a72792051ecc279653033c790bbde7.pdf