How to use k-nearest neighbor Classification algorithm in Ignite

2025-01-16 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

This article explains how to use the k-nearest neighbor (k-NN) classification algorithm in Apache Ignite, walking through the problem and a working solution step by step.

First, take the raw data and split it into training data (60%) and test data (40%). Scikit-learn handles this task again; the code from the previous article is modified as follows:

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    import pandas as pd

    # Load Iris dataset.
    iris_dataset = datasets.load_iris()
    x = iris_dataset.data
    y = iris_dataset.target

    # Split it into train and test subsets.
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4, random_state=23)

    # Save train set (no header row, label in the last column).
    train_ds = pd.DataFrame(x_train, columns=iris_dataset.feature_names)
    train_ds["TARGET"] = y_train
    train_ds.to_csv("iris-train.csv", index=False, header=None)

    # Save test set (same layout as the train set).
    test_ds = pd.DataFrame(x_test, columns=iris_dataset.feature_names)
    test_ds["TARGET"] = y_test
    test_ds.to_csv("iris-test.csv", index=False, header=None)
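
To make the 60/40 split concrete, the following is a minimal standard-library sketch of a shuffled hold-out split. It is not how train_test_split is implemented: the helper name split_rows and its parameters are hypothetical, and the resulting row order will differ from Scikit-learn's even with the same seed.

```python
import random


def split_rows(rows, test_fraction=0.4, seed=23):
    """Shuffle rows and split them into (train, test) lists.

    Hypothetical helper for illustration only; it does not reproduce
    the exact row order that scikit-learn's train_test_split yields.
    """
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# The Iris dataset has 150 rows, so a 40% test share gives 90 train and 60 test rows.
train, test = split_rows(range(150))
print(len(train), len(test))
```

Fixing the seed matters for the same reason random_state=23 does in the Scikit-learn call: repeated runs produce the same two files, so accuracy figures stay comparable.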

Once the training and test data are ready, you can write the application. It follows these steps:

Read training data and test data

Save training data and test data in Ignite

Fit the k-NN model using the training data

Apply the model to test data

Determine the accuracy of the model.

Read training data and test data

You need to read two CSV files, one containing the training data and the other the test data. Each file has five columns:

Sepal length (cm)

Sepal width (cm)

Petal length (cm)

Petal width (cm)

Species of flower (0: Iris Setosa, 1: Iris Versicolour, 2: Iris Virginica)
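
As a quick illustration of this layout, the sketch below parses one row into its four features and its label. The names SPECIES and parse_row are hypothetical, and the sample row is only an example of the format written by the export step.

```python
# Species codes as listed in the column description above.
SPECIES = {0: "Iris Setosa", 1: "Iris Versicolour", 2: "Iris Virginica"}


def parse_row(row):
    """Split one CSV line into (features, label).

    The first four cells are the measurements in cm; the last cell is
    the species code, kept as a float like the other columns.
    """
    cells = row.strip().split(",")
    features = [float(cell) for cell in cells[:-1]]
    label = float(cells[-1])
    return features, label


# Illustrative row: four measurements followed by the species code.
features, label = parse_row("5.1,3.5,1.4,0.2,0")
print(features, SPECIES[int(label)])
```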

You can read data from a CSV file with the following code:

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Scanner;

    private static void loadData(String fileName, IgniteCache<Integer, IrisObservation> cache)
            throws FileNotFoundException {
        Scanner scanner = new Scanner(new File(fileName));

        int cnt = 0;
        while (scanner.hasNextLine()) {
            String row = scanner.nextLine();
            String[] cells = row.split(",");

            // All cells except the last are features; the last is the flower class.
            double[] features = new double[cells.length - 1];
            for (int i = 0; i < cells.length - 1; i++)
                features[i] = Double.valueOf(cells[i]);
            double flowerClass = Double.valueOf(cells[cells.length - 1]);

            // The row counter serves as the cache key.
            cache.put(cnt++, new IrisObservation(features, flowerClass));
        }
    }

This code reads the data line by line. For each line, it splits the fields on the CSV delimiter, converts each field to a double, and stores the result in Ignite.

Save training data and test data in Ignite

The preceding code stores the data in Ignite. To use it, first create the Ignite caches, as follows:

    IgniteCache<Integer, IrisObservation> trainData = getCache(ignite, "IRIS_TRAIN");
    IgniteCache<Integer, IrisObservation> testData = getCache(ignite, "IRIS_TEST");

    loadData("src/main/resources/iris-train.csv", trainData);
    loadData("src/main/resources/iris-test.csv", testData);

getCache() is implemented as follows:

    private static IgniteCache<Integer, IrisObservation> getCache(Ignite ignite, String cacheName) {
        CacheConfiguration<Integer, IrisObservation> cacheConfiguration = new CacheConfiguration<>();
        cacheConfiguration.setName(cacheName);
        // Rendezvous affinity with 10 partitions.
        cacheConfiguration.setAffinity(new RendezvousAffinityFunction(false, 10));

        IgniteCache<Integer, IrisObservation> cache = ignite.createCache(cacheConfiguration);
        return cache;
    }

Fit the k-NN classification model using the training data

Once the data is stored, create the trainer as follows:

    KNNClassificationTrainer trainer = new KNNClassificationTrainer();

Then fit the model to the training data, as follows:

    KNNClassificationModel mdl = trainer.fit(
            ignite,
            trainData,
            (k, v) -> v.getFeatures(),     // Feature extractor.
            (k, v) -> v.getFlowerClass())  // Label extractor.
            .withK(3)
            .withDistanceMeasurement(new EuclideanDistance())
            .withStrategy(KNNStrategy.WEIGHTED);

Ignite stores data in key-value format, so the code above works on the value part: the label is the flower class, and the features are the remaining columns. The value of k, the number of nearest neighbors consulted, is set to 3, chosen here to match the three species. Several distance measures are available, such as Euclidean, Hamming, or Manhattan; this example uses Euclidean. Finally, you specify whether to use the SIMPLE or the WEIGHTED k-NN strategy; this example uses WEIGHTED.
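
The choices described above (k, the distance measure, and simple versus weighted voting) can be sketched in plain Python. This is not Ignite's implementation: the function names are hypothetical, the toy points are made up, and the inverse-distance weighting is an assumed scheme, since the article does not state which formula KNNStrategy.WEIGHTED uses.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    # Counts the positions where the two vectors differ.
    return sum(1 for x, y in zip(a, b) if x != y)


def knn_predict(train, query, k=3, distance=euclidean, weighted=True):
    """Classify query from a list of (features, label) pairs.

    weighted=True weights each neighbour's vote by 1 / distance
    (an assumed scheme); weighted=False is a plain majority vote.
    """
    neighbours = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        d = distance(features, query)
        # An exact match gets a very large weight instead of dividing by zero.
        votes[label] += (1.0 / d if d > 0 else 1e12) if weighted else 1.0
    return max(votes, key=votes.get)


# Toy points (made up): one close class-0 point, two farther class-1 points.
train = [([1.0, 1.0], 0.0), ([3.0, 1.0], 1.0), ([1.0, 3.0], 1.0), ([9.0, 9.0], 2.0)]
query = [1.5, 1.0]
print(knn_predict(train, query, weighted=False))  # plain majority vote
print(knn_predict(train, query, weighted=True))   # inverse-distance vote
```

On these toy points the two strategies disagree: the majority vote follows the two class-1 neighbours, while the weighted vote follows the single, much closer class-0 neighbour. That difference is the reason to prefer a weighted strategy when near neighbours should count for more.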

Apply the model to test data

Next, apply the trained classification model to the test data, as follows:

    int amountOfErrors = 0;
    int totalAmount = 0;

    // Scan every entry in the test cache and compare prediction with ground truth.
    try (QueryCursor<Cache.Entry<Integer, IrisObservation>> cursor = testData.query(new ScanQuery<>())) {
        for (Cache.Entry<Integer, IrisObservation> testEntry : cursor) {
            IrisObservation observation = testEntry.getValue();

            double groundTruth = observation.getFlowerClass();
            double prediction = mdl.apply(new DenseLocalOnHeapVector(observation.getFeatures()));

            totalAmount++;
            if (groundTruth != prediction)
                amountOfErrors++;

            System.out.printf("> | %.0f\t\t\t | %.0f\t\t\t |\n", prediction, groundTruth);
        }

        System.out.println("> -");

        System.out.println("\n> Absolute amount of errors " + amountOfErrors);
        System.out.printf("\n> Accuracy %.2f\n", (1 - amountOfErrors / (double) totalAmount));
    }

Determine the accuracy of the model

Next, determine the accuracy of the model by comparing the true classes in the test data with the classes the model predicts.

After the code runs, it prints the following summary:

    > Absolute amount of errors 2

    > Accuracy 0.97

As a result, Ignite was able to correctly classify 97% of the test data into three different categories.
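
The reported figure can be checked by hand. A 40% test share of the 150 Iris samples is 60 rows, and 2 errors give 1 - 2/60, which rounds to 0.97. The sketch below repeats that calculation; the function name accuracy is hypothetical, and the label lists are placeholders rather than the article's actual predictions.

```python
def accuracy(ground_truth, predictions):
    """Return the fraction of predictions that match the ground truth."""
    errors = sum(1 for truth, pred in zip(ground_truth, predictions) if truth != pred)
    return 1 - errors / len(ground_truth)


# 60 test rows with 2 misclassified rows (placeholder labels only).
ground_truth = [0.0] * 60
predictions = [0.0] * 58 + [1.0] * 2
print(f"Accuracy {accuracy(ground_truth, predictions):.2f}")
```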

That is how to use the k-nearest neighbor classification algorithm in Ignite. I hope the content above is helpful.
