Shulou (shulou.com), SLTechnology News & Howtos, updated 2025-01-16
Many readers don't know how to implement a linear support vector machine (SVM) in TensorFlow, so this article walks through both the theory and a working implementation. I hope it helps you solve this problem.
What we want to talk about today is the implementation of a support vector machine in the linearly separable case. For points in a plane, the purpose of the support vector machine is to find a straight line that separates the training samples such that the distance from the line to the closest samples on each side is equal and as large as possible. In a higher-dimensional space, the line becomes a hyperplane.
Let's take a brief look at the SVM principle in the linearly separable case. For a linear model

    y(x) = w^T x + b

the training samples are x_1, ..., x_N and the labels are t_n ∈ {+1, -1}. If

    y(x_n) > 0

then the sample is classified as positive, otherwise it is classified as negative.
So the goal of SVM is to find w (a vector) and b. Suppose we have found a line that separates the data; then the distance from a correctly classified point x_n to that line is:

    t_n * y(x_n) / ||w|| = t_n * (w^T x_n + b) / ||w||

We call the distance from the closest points on either side of the hyperplane to the hyperplane the margin. The optimization goal is to maximize the margin, so that the resulting hyperplane has good generalization ability (it can also correctly classify unseen data).
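The distance formula above can be checked numerically with a small sketch (the line w, offset b, and the point are made-up illustrative values, not from the article):

```python
import numpy as np

# illustrative values: a candidate separating line w . x + b = 0
w = np.array([2.0, 1.0])
b = -1.0
x_n = np.array([1.0, 1.0])  # a training point
t_n = 1                     # its label

# distance from the point to the line, positive when the point
# is on the correct side: t_n * (w . x_n + b) / ||w||
distance = t_n * (np.dot(w, x_n) + b) / np.linalg.norm(w)
print(round(distance, 4))  # 2 / sqrt(5) = 0.8944
```

Flipping the label to t_n = -1 would make the distance negative, signalling a misclassified point.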
After rescaling w and b so that the closest points satisfy t_n * (w^T x_n + b) = 1, the optimization objective of SVM is:

    min over w, b of (1/2) * ||w||^2

The condition is:

    t_n * (w^T x_n + b) >= 1,  for n = 1, ..., N

Note here that because t_n can take +1 and -1: when t_n = -1, both sides of the inequality are multiplied by -1, so the direction of the inequality sign flips, which is how the two label cases collapse into the single constraint above. To solve this optimization problem (a quadratic program), we can use the Lagrange multiplier method:

    L(w, b, a) = (1/2) * ||w||^2 - Σ_n a_n * [ t_n * (w^T x_n + b) - 1 ]

where the a_n >= 0 are the Lagrange multipliers.
If we take the derivatives of L with respect to w and b and set them to zero, we get:

    w = Σ_n a_n * t_n * x_n,    Σ_n a_n * t_n = 0

Substituting these results back into L above gives the dual form of L:

    L~(a) = Σ_n a_n - (1/2) * Σ_n Σ_m a_n * a_m * t_n * t_m * k(x_n, x_m)

The conditions of the dual form are:

    a_n >= 0,    Σ_n a_n * t_n = 0

where k(x_n, x_m) = x_n^T x_m in the linear case.
Then, replacing the parameter w in the original linear model with the expansion above (and writing the inner product as a kernel function), we get:

    y(x) = Σ_n a_n * t_n * k(x, x_n) + b
The dual form of L above is a simple quadratic programming problem, whose solution must satisfy the KKT conditions:

    a_n >= 0
    t_n * y(x_n) - 1 >= 0
    a_n * [ t_n * y(x_n) - 1 ] = 0

Bringing y(x_n) from above into the last equation: any point with a_n > 0 must satisfy t_n * y(x_n) = 1, and these points are the support vectors. Solving for b (and averaging over the support vectors for numerical stability) gives:

    b = (1 / N_S) * Σ_{n ∈ S} ( t_n - Σ_{m ∈ S} a_m * t_m * k(x_n, x_m) )

where S is the set of support vectors, N_S is their number, and k(x_n, x_m) is the kernel function.
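As a sanity check on the derivation, a hard-margin fit can be approximated with scikit-learn's SVC using a very large C; the toy data set below is made up for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# a tiny linearly separable toy set (illustrative values)
X = np.array([[1.0, 1.0], [2.0, 2.5], [3.0, 3.0],
              [1.0, -1.0], [2.0, -2.0], [3.0, -1.5]])
t = np.array([1, 1, 1, -1, -1, -1])

# a large C approximates the hard-margin solution
clf = SVC(kernel='linear', C=1e6).fit(X, t)
w, b = clf.coef_[0], clf.intercept_[0]

# only the support vectors have non-zero multipliers a_n
print(clf.support_)
# all margin constraints t_n * (w . x_n + b) >= 1 are satisfied
print(t * (X @ w + b) >= 1 - 1e-3)
```

The points reported in `clf.support_` are exactly those with a_n > 0, matching the KKT analysis above.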
Let's take a quick example of a kernel function acting on a point in a two-dimensional plane.
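For instance, for 2-D points the degree-2 polynomial kernel k(x, z) = (x^T z)^2 corresponds to the explicit feature map φ(x) = (x1^2, √2·x1·x2, x2^2); a quick sketch (with made-up input values) verifies that the kernel and the explicit map agree:

```python
import numpy as np

def poly_kernel(x, z):
    # homogeneous degree-2 polynomial kernel: k(x, z) = (x . z)^2
    return np.dot(x, z) ** 2

def phi(x):
    # explicit feature map for 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
print(poly_kernel(x, z))       # (1*3 + 2*0.5)^2 = 16.0
print(np.dot(phi(x), phi(z)))  # approximately 16.0 via the explicit map
```

The kernel computes the same inner product without ever forming the 3-D features, which is the point of the kernel trick.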
After more than two hours, I finally got the code working. Although it is not difficult, my level is limited, so there may still be problems in the implementation below.
import numpy as np
import tensorflow as tf
from sklearn import datasets

x_data = tf.placeholder(shape=[None, 2], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# get a batch of training data plus the held-out test set
def gen_data(batch_size):
    iris = datasets.load_iris()
    # use sepal length and petal width as the two features
    iris_X = np.array([[x[0], x[3]] for x in iris.data])
    # class 0 (setosa) is the positive class, everything else negative
    iris_y = np.array([1 if y == 0 else -1 for y in iris.target])
    train_indices = np.random.choice(
        len(iris_X), int(round(len(iris_X) * 0.8)), replace=False)
    train_x = iris_X[train_indices]
    train_y = iris_y[train_indices]
    rand_index = np.random.choice(len(train_x), size=batch_size)
    batch_train_x = train_x[rand_index]
    batch_train_y = np.transpose([train_y[rand_index]])
    test_indices = np.array(
        list(set(range(len(iris_X))) - set(train_indices)))
    test_x = iris_X[test_indices]
    test_y = iris_y[test_indices]
    return batch_train_x, batch_train_y, test_x, test_y
# define the model
def svm():
    A = tf.Variable(tf.random_normal(shape=[2, 1]))
    b = tf.Variable(tf.random_normal(shape=[1, 1]))
    model_output = tf.subtract(tf.matmul(x_data, A), b)
    l2_norm = tf.reduce_sum(tf.square(A))
    # regularization weight (the original value was lost; 0.01 assumed)
    alpha = tf.constant([0.01])
    # hinge loss: mean of max(0, 1 - t * y(x))
    classification_term = tf.reduce_mean(tf.maximum(
        0., tf.subtract(1., tf.multiply(model_output, y_target))))
    loss = tf.add(classification_term, tf.multiply(alpha, l2_norm))
    # learning rate (the original value was lost; 0.01 assumed)
    my_opt = tf.train.GradientDescentOptimizer(0.01)
    train_step = my_opt.minimize(loss)
    return model_output, loss, train_step
def train(sess, batch_size):
    print("# Training loop")
    # build the graph once, before the loop, so the variables are not
    # re-created and re-initialized on every iteration
    model_output, loss, train_step = svm()
    prediction = tf.sign(model_output)
    accuracy = tf.reduce_mean(tf.cast(
        tf.equal(prediction, y_target), tf.float32))
    init = tf.global_variables_initializer()
    sess.run(init)
    for i in range(100):
        x_vals_train, y_vals_train, \
            x_vals_test, y_vals_test = gen_data(batch_size)
        sess.run(train_step, feed_dict={
            x_data: x_vals_train,
            y_target: y_vals_train})
        train_loss = sess.run(loss, feed_dict={
            x_data: x_vals_train,
            y_target: y_vals_train})
        train_acc = sess.run(accuracy, feed_dict={
            x_data: x_vals_train,
            y_target: y_vals_train})
        test_acc = sess.run(accuracy, feed_dict={
            x_data: x_vals_test,
            y_target: np.transpose([y_vals_test])})
        if i % 10 == 1:
            print("train loss: {:.6f}, train accuracy: {:.6f}".format(
                train_loss[0], train_acc))
            print("test accuracy: {:.6f}".format(test_acc))
            print("- * -" * 15)

def main(_):
    with tf.Session() as sess:
        train(sess, batch_size=16)

if __name__ == "__main__":
    tf.app.run()
To sum up, the key points of SVM: first, understand that the purpose of SVM is to find a line or hyperplane; then express the distance from the points to the hyperplane; then transform maximizing this distance into a quadratic programming problem; then use the Lagrange method to solve this optimization problem; and finally, the kernel function method extends it beyond the linear case.