Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/40758
Title: Unsupervised learning
Authors: VALKENBORG, Dirk 
ROUSSEAU, Axel-Jan 
GEUBBELMANS, Melvin 
BURZYKOWSKI, Tomasz 
Issue Date: 2023
Publisher: MOSBY-ELSEVIER
Source: AMERICAN JOURNAL OF ORTHODONTICS AND DENTOFACIAL ORTHOPEDICS, 163 (6), p. 877-882
Abstract: As mentioned in the previous article,1 unsupervised learning involves using datasets without clear notice of the dependent (response) variable. Unsupervised means that the machine or computer should learn patterns from the data without referring to any specific response. Unsupervised learning aims to explore the data structure and generate a hypothesis rather than to test any hypothesis by statistical methods or to construct prediction or classification models on the basis of a set of conditions and a specified response. Algorithms for unsupervised learning can be subdivided into 2 categories: (1) clustering algorithms and (2) informative data transformations.

To better illustrate the concepts, we will use the dataset of Konstantonis et al2 to investigate decisions about extraction and identification of treatment predictors in Class I malocclusions. The dataset comprises 542 randomly selected records of patients with a Class I relationship observed in a university graduate program and 5 private orthodontic offices. For each participant, several variables are observed: 26 cephalometric variables, 6 model measurements, 2 demographic variables (gender, age), and the type of treatment: nonextraction (397) or extraction of the 4 first premolars (145). More details about the dataset can be found in Konstantonis et al.2 The scope of this study is evident, as the authors want to predict the optimal treatment (response) given the set of explanatory (predictor) variables. Furthermore, they wanted to identify essential variables in predicting the treatment. The data can be presented in a tabular format that organizes all the information, as depicted in Table I.

Clustering

A clustering task can be best defined by an example. Consider the image in Figure 1. A task related to this image could be determining how many herds of animals of different genera are visible in this picture. On the basis of the physical characteristics of each animal, you could try to lump them into homogeneous clusters (groups). In this example, you could cluster (group) the animals with black and white stripe patterns and place the horned animals with brownish fur in another cluster. To execute this task, it is not necessary to be an expert in wildebeest or zebra, nor is it required to have these animals tagged by a label that explains the genus of the animal. Clustering algorithms can discover this structure in a dataset without any prior knowledge. Toward this aim, a clustering algorithm will compute a distance measure to quantify similarity or dissimilarity between different subjects in the dataset. On the basis of this measure, subjects will be clustered (grouped) or split from each other to yield clusters (groups) that have the highest similarity within the cluster and the largest differences between the clusters. Typically, a clustering method has 3 key elements: (1) a distance measure to quantify the similarity or dissimilarity between subjects; (2) an additional distance measure to quantify the difference between clusters or between a cluster and a subject (ie, linkage); and (3) a computer algorithm that maximizes the similarity within a cluster and the dissimilarity between the clusters. The variance is often used to measure the heterogeneity in a dataset. In this case, clustering will minimize the variance within the clusters and maximize the variance between the clusters. The distance is a number that tells us how far 2 subjects are separated by considering the difference for each observed variable.
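The 3 key elements above can be made concrete in code. The following is a minimal sketch, not the article's own analysis: the 6 FMIA/IMPA value pairs are hypothetical, and the choice of SciPy's Ward linkage (which at each step merges the 2 clusters whose union gives the smallest increase in within-cluster variance, matching the variance criterion described above) is an illustrative assumption.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical (FMIA, IMPA) measurements for 6 patients; these values are
# illustrative and are not taken from the Konstantonis et al. dataset.
X = np.array([
    [41.8, 113.0],
    [52.0, 114.5],
    [48.5, 110.2],
    [89.1, 76.0],
    [85.3, 80.4],
    [90.7, 78.1],
])

# Element (1): Euclidean distance between subjects.
# Element (2): Ward linkage between clusters (smallest variance increase).
# Element (3): the agglomerative algorithm that builds the cluster tree.
Z = linkage(X, method="ward", metric="euclidean")

# Cut the resulting tree to obtain 2 clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g., [1 1 1 2 2 2]: low-FMIA vs high-FMIA patients
```

Other linkage choices (eg, single or complete linkage) implement element (2) differently and can produce different groupings on the same data.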
In the next example, the Frankfort mandibular incisor angle (FMIA) and the incisor mandibular plane angle (IMPA) are examined for 3 patients in the dataset of Konstantonis et al.2 Two patients exhibit an FMIA and IMPA combination of (41.8/113.0) and (52.0/114.5). These patients seem very alike when considering these 2 covariates, especially when contrasting these observations with our third patient, who has an FMIA and IMPA of (89.1/76.0). Intuitively, the third patient appears markedly different from the first 2.
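As a concrete illustration of such a distance, the sketch below computes the Euclidean distance between the 3 patients quoted above. The Euclidean measure is one common choice, assumed here for illustration; the excerpt itself does not specify which measure the article uses.

```python
import math

# (FMIA, IMPA) pairs for the 3 patients quoted in the text.
patients = {
    "patient 1": (41.8, 113.0),
    "patient 2": (52.0, 114.5),
    "patient 3": (89.1, 76.0),
}

def euclidean(a, b):
    """Straight-line distance between two (FMIA, IMPA) measurement pairs."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean(patients["patient 1"], patients["patient 2"]))  # ~10.3
print(euclidean(patients["patient 1"], patients["patient 3"]))  # ~60.1
```

The first 2 patients lie about 10 units apart, whereas the third lies about 60 units from patient 1, matching the intuition that the third patient is the dissimilar one.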
Notes: Valkenborg, D (corresponding author), Hasselt Univ, Data Sci Inst, Agoralaan 1, Bldg D, B-3590 Diepenbeek, Belgium.
dirk.valkenborg@uhasselt.be
Keywords: Humans; Unsupervised Machine Learning
Document URI: http://hdl.handle.net/1942/40758
ISSN: 0889-5406
e-ISSN: 1097-6752
DOI: 10.1016/j.ajodo.2023.04.001
ISI #: 001015008200001
Rights: © 2023 by the American Association of Orthodontists. All rights reserved. https://doi.org/10.1016/j.ajodo.2023.04.001. Open access
Category: A2
Type: Journal Contribution
Appears in Collections:Research publications

Files in This Item:
Unsupervised learning.pdf (Published version, 1.16 MB, Adobe PDF)