Using Naïve Bayes to Predict Diabetes
Introduction: Diabetes affects approximately 1.25 million American adults and children. Type II diabetes is caused by a rise in insulin resistance, which results in hyperglycemia (high blood sugar) and yields symptoms such as excessive thirst, frequent urination, fatigue, dizziness, headache, and nausea. The risk of type II diabetes has been highly correlated with obesity, and it has been hypothesized that specific markers such as age, and even gender, also affect that risk. Naïve Bayes has been successfully applied in medical diagnoses before, with high accuracy rates, but not for diabetes. With these developments, it seemed possible that a mathematical classification model such as Naïve Bayes could be used to 'predict' diabetes based on probability outcomes. This research aims to achieve just that.
Aim: To determine whether the Naïve Bayes mathematical model serves as a suitable predictor for diabetes, given specific patient characteristics.
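For reference, the standard Naïve Bayes rule for a two-class problem can be written as follows. The symbols are notation assumed here for illustration, not taken from the study: $D$ denotes "diabetic", $\neg D$ "not diabetic", and $x_1, \dots, x_n$ the patient characteristics, which the model treats as conditionally independent given the class.

```latex
P(D \mid x_1, \dots, x_n)
  = \frac{P(D) \prod_{i=1}^{n} P(x_i \mid D)}
         {\sum_{c \in \{D,\, \neg D\}} P(c) \prod_{i=1}^{n} P(x_i \mid c)}
```

A patient is classified as diabetic when this posterior exceeds that of $\neg D$.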
Method: The Naïve Bayes classifier was applied to patients described by three specific attributes (age, gender, and frame), drawn from an existing StatCrunch diabetes database. Code was then developed (see bit.ly/NaiveBayes) and run using (1) 10 randomly selected training records and (2) 20 training records.
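The general shape of such a classifier can be sketched in Python. This is a minimal illustration, not the study's code: the function names, the binning of age into "young"/"old", the add-one (Laplace) smoothing, and the six toy records are all assumptions made for the example and are not drawn from the StatCrunch database.

```python
from collections import Counter, defaultdict


def train(records, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # distinct values seen per attribute, used for add-one smoothing
    vocab = defaultdict(set)
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def posterior(record, model):
    """Return normalized class probabilities for one patient record."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(record):
            # smoothed likelihood P(value | class); smoothing is an assumption
            score *= (value_counts[(i, label)][value] + 1) / (n + len(vocab[i]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


def predict(record, model):
    """Pick the class with the highest posterior probability."""
    probs = posterior(record, model)
    return max(probs, key=probs.get)


# Hypothetical toy records: (age group, gender, frame). Illustrative only.
records = [
    ("old", "female", "large"),
    ("old", "male", "large"),
    ("old", "male", "medium"),
    ("young", "female", "small"),
    ("young", "male", "medium"),
    ("young", "female", "medium"),
]
labels = ["diabetic", "diabetic", "diabetic",
          "non-diabetic", "non-diabetic", "non-diabetic"]

model = train(records, labels)
print(predict(("old", "male", "large"), model))
print(posterior(("young", "female", "small"), model))
```

Accuracy on held-out patients would then be the fraction of records for which `predict` matches the recorded diagnosis.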
Results: Using only three of the 16 available patient attributes, initial trials with 10 training records showed a modest accuracy of 40%. However, upon increasing the training set to 20 records, the accuracy of the model's predictions increased to 80%.
Conclusion: If only three highly simplified categories could yield an 80% accuracy rate using only 20 training records, the Naïve Bayes model shows immense promise for future development. Although by no means can this machine-learning algorithm replace physicians and tests, it can serve as a guide for them, and possibly uncover previously unnoticed trends. Above all, these results show the power of mathematical computation across all realms, including healthcare.