Please use this identifier to cite or link to this item: http://hdl.handle.net/1942/41935
Title: Support vector machines
Authors: VALKENBORG, Dirk 
ROUSSEAU, Axel-Jan 
GEUBBELMANS, Melvin 
BURZYKOWSKI, Tomasz 
Issue Date: 2023
Publisher: MOSBY-ELSEVIER
Source: AMERICAN JOURNAL OF ORTHODONTICS AND DENTOFACIAL ORTHOPEDICS, 164 (5), p. 754-757
Abstract: A support vector machine (SVM) is a supervised machine learning (ML) method capable of learning from data and making decisions. The fundamental principles of the SVM were already introduced in the 1960s by Vapnik and Chervonenkis1 in a theory that was further developed over the following decades. However, it was only in the 1990s that SVMs attracted greater attention from the scientific community, and this was attributed to 2 significant improvements. The first extension is the kernel trick, which allows the SVM to classify highly nonlinear problems.2 The second permitted the extension of the SVM to solve problems in a regression framework,3 called the support vector regression machine. These improvements have resulted in a powerful general approximator that nowadays finds use in many applications. Typically, the mathematics and theory behind SVMs are complex and require a deep understanding of optimization theory, algebra, and learning theory. Nonetheless, the main idea can be explained intuitively, and this article will consider a classification problem to illustrate the concepts. In what follows, it can be noticed that SVMs differ from previously presented methods as they exploit geometries in the data and are not directly rooted in statistics (eg, generalized linear models). Instead, they originate from mathematics and engineering and are often compared with the logistic regression explained in the previous article.
The starting point of an SVM is straightforward: it tries to solve a binary classification problem with the simplest model possible, separating the subjects that belong to the 2 different classes by a classification boundary. In 2 dimensions, this classification boundary is a straight line. In 3 dimensions, it becomes a plane, a generalization of a line. For higher dimensions, this boundary is called a hyperplane, which can be thought of as a plane in more than 3 dimensions and is beyond our imagination. Again, the question is how well such a simple model classifies and how well the learned concepts generalize to previously unseen data.
Figure 1, A shows a separable classification problem. It is perfectly possible to separate the blue points from the red points by using a straight line as a classification boundary. However, as illustrated in the plot, multiple options are possible. Which line should we select as our boundary to minimize the risk of misclassifying a previously unseen subject? The solution to this question presents itself in Figure 1, B and is called a maximum-margin classifier. The basic idea is simple. To minimize misclassification risk, we want our classification boundary positioned as far as possible from the neighboring subjects belonging to the different classes. The margin between the classification boundary and the training data is maximized, allowing for a tolerance region when predicting a class label for new subjects. An important observation can be made from the figure. Data points far from the classification line do not influence its position. The only data points that determine the decision boundary are the 3 points in black in Figure 1, B. These points are called support points or support vectors. In other words, if we were to remove all the subjects from our training dataset apart from these 3 support vectors, the location of the decision boundary would remain unaltered.
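To make the maximum-margin idea concrete, the following minimal sketch (assuming scikit-learn and a small synthetic, linearly separable 2-D dataset, not data from the article) fits a linear SVM and inspects the resulting support vectors:

import numpy as np
from sklearn.svm import SVC

# Two well-separated clusters: class 0 near (0, 0), class 1 near (4, 4).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(4.0, 0.5, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)

# A very large C approximates the hard (maximum-margin) classifier.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# Only a handful of training points end up as support vectors; removing
# every other point would leave the decision boundary unchanged.
print(clf.support_vectors_)
print(clf.coef_, clf.intercept_)  # w and b of the separating hyperplane

A nonlinear kernel (eg, kernel="rbf") would invoke the kernel trick mentioned above, but the linear case suffices to illustrate the margin and the support vectors.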
This example indicates that support vectors strongly influence the decision boundary, and changes in the training data can dramatically impact it. Figure 1, C shows an additional subject, indicated by an arrow, added to the training dataset. Coincidentally, this subject lies close to the decision boundary and becomes an influential support vector that modifies the maximum-margin problem, resulting in a different classification boundary, as indicated by the green line. However, when eyeballing the classification problem, we can be quite satisfied with the previous classifier, indicated by the dashed line, which yields wider overall margins with respect to the neighboring training subjects.
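The sensitivity described above can be illustrated by continuing the earlier sketch (reusing the hypothetical X, y, and SVC setup): adding a single subject close to the boundary turns it into a support vector and shifts the hard-margin solution, whereas a smaller C (a softer margin, not covered in this abstract) retains a boundary closer to the original, wider-margin one:

# An assumed extra subject of class 1 lying close to the old boundary.
X_new = np.vstack([X, [[2.0, 1.6]]])
y_new = np.append(y, 1)

hard = SVC(kernel="linear", C=1e6).fit(X_new, y_new)
print(hard.coef_, hard.intercept_)   # noticeably different hyperplane

soft = SVC(kernel="linear", C=1.0).fit(X_new, y_new)
print(soft.coef_, soft.intercept_)   # closer to the original wide-margin line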
Notes: Burzykowski, T (corresponding author), Hasselt Univ, Data Sci Inst, Agoralaan 1,Bldg D, B-3590 Diepenbeek, Belgium.
tomasz.burzykowski@uhasselt.be
Keywords: Humans;Support Vector Machine;Algorithms
Document URI: http://hdl.handle.net/1942/41935
ISSN: 0889-5406
e-ISSN: 1097-6752
DOI: 10.1016/j.ajodo.2023.08.003
ISI #: 001105793300001
Rights: 2023
Category: A2
Type: Journal Contribution
Appears in Collections:Research publications

Files in This Item:
Support vector machines.pdf (Published version, 683.41 kB, Adobe PDF) - Restricted Access