Developing Machine Learning Models to Predict Dental Diseases | Teen Ink

March 8, 2023
By jesseoh3709 BRONZE, La Canada, California


Many dental diseases can be traced back to one of the best-known consequences of poor dental hygiene: cavities. Cavities are by far the most common dental disease; among adults aged 20 years and older, about 90% have had at least one cavity (1). Although most cavities are not severe, identifying them in their early stages improves treatment options. In extreme cases, cavities can lead to more severe tooth decay which, if left untreated, can spread under the gums and to other parts of the body and, in rare cases, prove fatal (2). Although easily treatable, early cavities often go unnoticed because, depending on their location, they can be tedious and difficult to identify through film reading and physical examination alone. Recently, however, machine learning has been used to quickly identify cavities and other dental diseases (3). The growing use of machine learning in healthcare-related fields suggests that it may benefit early cavity detection and improve treatment. It is therefore worth applying these computing techniques to improve the speed, efficiency, and possibly even the accuracy of diagnosis.

A common and relatively low-compute way to identify cavities with machine learning is image classification. In recent years, companies such as Google have released open-source machine learning libraries (e.g. TensorFlow, Microsoft CNTK, PyTorch, Keras), making it possible for practically anyone to build and train a deep neural network for image classification. Image classification software is seeing growing use in a variety of healthcare fields (e.g. tumor and cancer diagnosis, brain research) (4, 5). These programs have the potential to help dental professionals make decisions related to prevention, diagnosis, and treatment planning.

In this study, I propose a Convolutional Neural Network (CNN) model for automatic teeth detection and overall diagnosis of periapical X-rays. The model was trained on a varied dataset of 700 X-ray images from Vineland Dental Practice using Torchvision's ResNet18 architecture. Training accuracy reached 100%, but validation accuracy plateaued at about 80.6%, short of the roughly 99% that practical use would require. The approach is a form of supervised image classification based on object-based image analysis. Although this dataset only differentiated between teeth with cavities and teeth without cavities, the model could be adapted in the future to classify other dental diseases, such as ulcers or gingivitis, to improve the efficiency of dental diagnoses. By incorporating bounding boxes in the classification of X-rays, it is possible to produce labeled X-rays, allowing professionals to cross-check the machine's classification against their own diagnosis.

Materials and Methods


The dataset used in this experiment was an anonymized set of periapical dental X-rays from Vineland Dental Practice in Burbank, California (used with permission, with patient-identifying information removed). It included 700 patient X-rays from the Burbank area, covering ages 18 to 65. Images were selected at random, without regard to ethnic background or sex. The dataset was transferred from the practice's database to my computer and then uploaded to Google Colab as a .zip file.

Coding the program

The program was developed using Torchvision's CNN models (6), Pandas, Matplotlib for plotting, and Google Colab's editor. The code used for this program can be accessed here (7).

Training and Validation

To train the program, 700 X-ray images were presented, each labeled (using bounding boxes) with whether or not a cavity was present. For validation, the program then tested itself against the same dataset to obtain a percent accuracy. A relatively large image dataset can be imported into Google Colab using the unzip command. It is also essential to create a spreadsheet beforehand that classifies each image (cavity or no-cavity). Starting from a ResNet18 model pretrained on the CIFAR-10 dataset, I trained on the dental dataset for one hundred epochs and plotted the training and validation accuracy using matplotlib.pyplot.
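The label spreadsheet described above can be sketched as a simple two-column CSV file. The example below parses such a sheet and divides it into separate training and validation lists. Note that holding out a separate validation split is a common alternative swapped in here for illustration, not the procedure described in this article, and the file names, the column names `filename` and `label`, and the 20% validation fraction are all hypothetical.

```python
import csv
import io
import random


def read_labels(csv_text):
    """Parse 'filename,label' rows into a list of (filename, int label) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["filename"], int(row["label"])) for row in reader]


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Split samples into train/validation lists, keeping the class ratio similar."""
    rng = random.Random(seed)
    train, val = [], []
    for label in sorted({lbl for _, lbl in samples}):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


# Hypothetical label sheet: 1 = cavity, 0 = no cavity.
sheet = "filename,label\n" + "\n".join(
    f"xray_{i:03d}.png,{i % 2}" for i in range(10)
)
samples = read_labels(sheet)
train_set, val_set = stratified_split(samples)
print(len(train_set), len(val_set))
```

Splitting each class separately keeps the proportion of cavity and no-cavity images roughly equal in both lists, which matters when one class is much rarer than the other.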


In this study, my aim was to create a model that detects cavities quickly and efficiently, as this would be of great use to dentists. To that end, an anonymized dataset of periapical dental X-rays from Vineland Dental Practice was used to develop a program with Torchvision's CNN models (6). I then collected training and validation data and analyzed it using Matplotlib. I first measured training loss across epochs and found that it drops close to 0 almost immediately (Figure 1). This suggests that the program is not learning significantly from additional epochs, meaning that either the dataset is not varied enough or the model is overfitting (Figure 1). Validation loss showed a similar drop, which suggests that the model may not generalize well to additional data (Figure 1).
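One way to check loss curves for overfitting is to compare training and validation loss epoch by epoch. The sketch below computes the gap between the two and finds the epoch with the lowest validation loss. The loss values are invented to show a typical overfitting pattern (validation loss rising while training loss keeps falling); they are not the values behind Figure 1, and the helper names are hypothetical.

```python
def best_validation_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1


def generalization_gap(train_losses, val_losses):
    """Per-epoch difference between validation loss and training loss."""
    return [v - t for t, v in zip(train_losses, val_losses)]


# Made-up numbers for illustration only.
train_losses = [0.70, 0.20, 0.05, 0.02, 0.01]
val_losses = [0.72, 0.45, 0.40, 0.42, 0.47]

gaps = generalization_gap(train_losses, val_losses)
print(best_validation_epoch(val_losses))
print(gaps[-1] > gaps[0])
```

A gap that keeps widening after the best validation epoch is a sign that further training is fitting the training images rather than improving predictions on new ones.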

Next, I measured the accuracy of the model across epochs. Training and validation accuracy both increased over time, but validation accuracy plateaus after about 70 epochs (Figure 2). The final validation accuracy is about 80.576%, not nearly high enough for widespread use, which would call for accuracy close to 99% (Figure 2). Training accuracy reached 1.0 by 100 epochs, which suggests that training was complete (Figure 2). Because validation accuracy did not reach the same level as training accuracy, the training data is likely not varied enough for the model to detect cavities in dental X-rays reliably.
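The accuracy and plateau measurements described above can be sketched in a few lines. The example below computes accuracy as the fraction of correct predictions and reports the epoch after which accuracy stops improving by a meaningful amount. The `patience` and `min_delta` thresholds, the function names, and the sample accuracy values are assumptions chosen for illustration, not the settings or results of this study.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def plateau_epoch(accuracies, patience=3, min_delta=0.005):
    """Return the 1-based epoch of the last gain larger than `min_delta`,
    once `patience` epochs have passed without such a gain; otherwise None."""
    best = accuracies[0]
    best_epoch = 1
    for epoch, acc in enumerate(accuracies[1:], start=2):
        if acc > best + min_delta:
            best, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None


# Illustrative values only: 1 = cavity, 0 = no cavity.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))

val_accuracies = [0.60, 0.70, 0.78, 0.80, 0.801, 0.802, 0.803, 0.80]
print(plateau_epoch(val_accuracies))
```

Reporting the plateau epoch alongside the final accuracy makes it easier to see how many of the one hundred epochs actually contributed to the result.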


In this study, my goal was to create a model that efficiently detects cavities from dental X-rays. Training accuracy reached 1.0 by 100 epochs, which suggests that training was complete, and both training and validation accuracy increased over time. However, validation accuracy plateaus after about 70 epochs at about 80.576%, not nearly high enough for widespread use, which would call for accuracy close to 99% (Figure 2). This is likely because there are simply not enough images to reach a practical accuracy. A similar study (8) used 1250 images from Peking University School, and those images were transformed and standardized for more consistent classification. With additional images and standardization, the program could therefore improve, but it is not currently ready for applied use.
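As a sketch of what image standardization can look like, the example below rescales a grayscale image to zero mean and unit variance using NumPy. This is one common form of standardization, assumed here for illustration; it is not necessarily the preprocessing used in the cited study (8), and the function name and the tiny sample image are hypothetical.

```python
import numpy as np


def standardize_image(image, eps=1e-8):
    """Rescale a grayscale image to zero mean and (approximately) unit variance."""
    pixels = np.asarray(image, dtype=np.float64)
    # eps avoids division by zero for a completely uniform image.
    return (pixels - pixels.mean()) / (pixels.std() + eps)


# A tiny fake 8-bit "X-ray" stands in for a real image.
fake_xray = np.array([[0, 64], [128, 255]], dtype=np.uint8)
standardized = standardize_image(fake_xray)
print(standardized.mean(), standardized.std())
```

Applying the same rescaling to every image reduces differences in brightness and contrast between X-rays, so the model sees more consistent inputs.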

Although this program is not ready for applied use, it revealed important information about how to build a similar program that would function well in the future. In future studies, it could be modified to identify a wider variety of dental diseases and trained on a much larger dataset, which should substantially improve its validation accuracy. Moreover, although this program focused on dental X-rays, the application of machine learning in medical fields has enormous potential. The program could be adapted to identify other important X-ray findings, such as tumors and cysts, and machine learning is already used with tomography, echography, mammography, and MRI (9).


1. Cavities. 2022.
2. Periodontitis - Symptoms and causes. Mayo Clinic. 2023.
3. Bhattacharjee, N. Automated Dental Cavity Detection System Using Deep Learning and Explainable AI. AMIA Annual Symposium Proceedings, 2022.
4. Stephen, O., Sain, M., Maduh, U.J., Jeong, D.-U. An Efficient Deep Learning Approach to Pneumonia Classification in Healthcare. Journal of Healthcare Engineering, 2019.
5. Faes, L., Wagner, S.K., Fu, D.J., Liu, X., Korot, E., Ledsam, J.R., Back, T., Chopra, R., Pontikos, N., Kern, C., Moraes, G., Schmid, M.K., Sim, D., Balaskas, K., Bachmann, L.M., Denniston, A.K., Keane, P.A. Automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study. The Lancet Digital Health, 1: e232–e242, 2019.
6. Models and pre-trained weights. Torchvision main documentation.
7. Google Colaboratory.
8. Chen, H., Zhang, K., Lyu, P., Li, H., Zhang, L., Wu, J., Lee, C.-H. A deep learning approach to automatic teeth detection and numbering based on object detection in dental periapical films. Scientific Reports, 9: 1–11, 2019.
9. Tchapga, C.T., Mih, T.A., Kouanou, A.T., Fonzin, T., Fogang, P.K., Mezatio, B.A., Tchiotsop, D. Biomedical Image Classification in a Big Data Architecture Using Machine Learning Algorithms. Journal of Healthcare Engineering, 2021.

The author's comments:

My name is Jesse Oh, and I am a senior at La Canada High School. Back in 2020, I had the opportunity to explore the burgeoning field of Artificial Intelligence, which sparked my interest and curiosity. During the summer, I enrolled in several courses and was fascinated to discover the vast and practical applications of this rapidly-evolving technology.

As I delved deeper into the field, I began to develop a keen interest in healthcare. However, it wasn't until I explored the world of dentistry that I realized my true passion. Something about the intersection of AI and dentistry captured my imagination, and I knew that this was the field I wanted to pursue.

Overall, I am excited to continue exploring the fascinating field of AI, and I am eager to apply my knowledge and skills to the world of dentistry to make a meaningful impact in people's lives.
