Document Type: Original Article
Student in Biostatistics, Department of Biostatistics and Epidemiology, Kerman University of Medical Sciences, Kerman, Iran
Professor of Biostatistics, Physiology Research Center, Institute of Basic and Clinical Physiology Sciences & Modeling in Health Research Center, Faculty of Health, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
Assistant Professor of Health Education and Health Promotion, Mother and Child Welfare Research Center, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
Professor of Biostatistics, Modeling in Health Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
Background: Dialysis is a process for removing excess fluid and uremic toxins from patients with chronic renal failure. The present study aimed to identify the variables that influence the survival of dialysis patients using a random survival forest model (RSFM) in low-dimensional data with few events per variable (EPV).
Methods: In this historical cohort study, data were collected from 252 dialysis patients in hospitals in Bandar Abbas, Iran. Survival time was calculated in years from the onset of dialysis to death or to the end of the study in 2016. RSFM was used because the EPV was low. The data were randomly divided into training and testing sets, and this process was repeated 100 times. The C-index and Brier score (BS) were used to assess the performance of the model in the test set.
Results: In this study, 35 deaths (13.9%) were observed. The mean C-index was 0.640 in the training sets and 0.687 in the testing sets, and the mean BS was 0.017 and 0.023, respectively. The RSFM identified BMI, education, occupation, dialysis duration, number of dialysis sessions, and age at dialysis onset as the most important factors.
Conclusion: RSFM can be used to model the survival of dialysis patients and to handle low-dimensional data with few events when the researcher prefers a nonparametric model.
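
The evaluation scheme described in the Methods (repeated random splits scored with the C-index) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 30% test fraction, and the use of a fixed, hypothetical risk score in place of a refitted RSFM are all assumptions made for illustration. In a real analysis the forest would be refitted on each training split and its predicted risk used in place of `risks`.

```python
import random


def harrell_c_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i died (events[i] is truthy)
    and did so strictly before time j. The pair is concordant when the
    subject who died earlier has the higher risk score; tied risk scores
    count as half. Tied survival times are ignored in this sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def repeated_split_c_index(times, events, risks,
                           n_repeats=100, test_fraction=0.3, seed=0):
    """Mean test-set C-index over repeated random splits.

    test_fraction is a hypothetical choice; the abstract does not state
    the split ratio. Splits without any comparable pair are skipped.
    """
    rng = random.Random(seed)
    indices = list(range(len(times)))
    n_test = max(1, int(round(test_fraction * len(indices))))
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test = indices[:n_test]
        try:
            scores.append(harrell_c_index(
                [times[k] for k in test],
                [events[k] for k in test],
                [risks[k] for k in test],
            ))
        except ValueError:
            continue
    if not scores:
        raise ValueError("no split produced a comparable pair")
    return sum(scores) / len(scores)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported means of 0.640 and 0.687 should be read.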