What is the k-nearest neighbor algorithm in Python


2025-04-18 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

What is the k-nearest neighbor algorithm in Python? Many newcomers find this question difficult, so this article summarizes how the algorithm works and how to apply it, in the hope that it helps you solve the problem.

I. Principle of the k-nearest neighbor algorithm and its API

1. k-nearest neighbor algorithm

A sample belongs to a class if most of its k most similar (i.e. nearest) samples in the feature space belong to that class. (Similar samples should have similar feature values.)

How to find the distance between samples:
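A common choice, assumed here because the text does not name a specific metric, is the Euclidean distance: d(a, b) = sqrt((a1 - b1)^2 + ... + (an - bn)^2). The sketch below computes it in plain Python and uses it for a majority vote among the k nearest samples. The function names and the toy data are illustrative only.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Pair each training sample's distance to the query with its label,
    # then sort so the closest samples come first.
    distances = sorted(
        (euclidean_distance(p, query), label)
        for p, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest samples and pick the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two clusters in a 2-D feature space.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(euclidean_distance((0, 0), (3, 4)))            # 5.0
print(knn_predict(points, labels, (1.1, 1.0), k=3))  # A
print(knn_predict(points, labels, (5.1, 5.0), k=3))  # B
```

Other metrics (for example Manhattan distance) can be swapped in by replacing euclidean_distance.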

2. k-nearest neighbor algorithm API
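The text does not say which library provides the API. A widely used one, assumed here, is scikit-learn's KNeighborsClassifier, where n_neighbors is the k value. The toy data below is hypothetical and only illustrates the fit/predict/score calls.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: two features per sample, two classes.
x_train = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
y_train = [0, 0, 0, 1, 1, 1]

# n_neighbors is the k value; algorithm="auto" lets scikit-learn choose
# between brute force, KD tree and ball tree for the neighbor search.
knn = KNeighborsClassifier(n_neighbors=3, algorithm="auto")
knn.fit(x_train, y_train)  # fitting only stores and indexes the samples

predictions = knn.predict([[1.1, 1.0], [5.1, 5.0]])
print(predictions)  # [0 1]

accuracy = knn.score(x_train, y_train)  # mean accuracy on the given data
print(accuracy)
```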

3. K-Nearest Neighbor Algorithm Features

If k is too small, the prediction is easily affected by outliers.

If k is too large, the prediction is easily swayed by how many samples each category has, so the largest category tends to win the vote.
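Both effects can be seen on a small made-up data set. In this sketch, which assumes Euclidean distance and a plain majority vote, one mislabeled outlier sits inside the "A" cluster, and class "A" has more samples than class "B".

```python
import math
from collections import Counter

def knn_predict(samples, query, k):
    """Majority vote among the k samples closest to `query` (Euclidean distance)."""
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical data: five "A" samples, two "B" samples, and one "B" outlier
# placed in the middle of the "A" cluster.
samples = [
    ((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((0.8, 1.1), "A"),
    ((1.1, 1.2), "A"), ((0.9, 0.8), "A"),
    ((1.0, 1.05), "B"),                      # outlier
    ((4.0, 4.0), "B"), ((4.2, 3.9), "B"),
]

near_outlier = (1.0, 1.04)
in_b_cluster = (4.1, 4.0)

# k too small: the single nearest sample is the outlier.
print(knn_predict(samples, near_outlier, k=1))  # B
# A moderate k outvotes the outlier.
print(knn_predict(samples, near_outlier, k=3))  # A
print(knn_predict(samples, in_b_cluster, k=3))  # B
# k too large: every sample votes, so the bigger class always wins.
print(knn_predict(samples, in_b_cluster, k=8))  # A
```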

Advantages: simple, easy to understand and implement, no parameters to estimate, no training (no iteration required).

Disadvantages: it is a lazy algorithm, so classifying test samples requires a large amount of computation and memory.

To sum up, k must be specified when using this algorithm, and classification accuracy cannot be guaranteed if k is poorly chosen. In addition, the algorithm is slow when the number of samples is very large, so it is generally used on small data sets.
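One common way to pick k, offered here as a sketch and not as the method this article prescribes, is to try several candidates with cross-validation. The example assumes scikit-learn and uses its bundled iris data set as a small stand-in; the candidate list is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Small bundled data set, used only as a stand-in.
features, target = load_iris(return_X_y=True)

# Candidate k values to compare (an arbitrary choice for illustration).
candidate_ks = [1, 3, 5, 7, 9]

# For each candidate k, 5-fold cross-validation estimates the accuracy.
search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": candidate_ks},
    cv=5,
)
search.fit(features, target)

best_k = search.best_params_["n_neighbors"]
print("best k:", best_k)
print("mean cross-validated accuracy:", round(search.best_score_, 3))
```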

II. K-Nearest Neighbor Algorithm Case Study

Case Information Overview

kNN use case: predicting where guests will stay (assuming they stay at a hotel). The features in the prepared data are: hotel number (place_id), check-in record (row_id), guest x-coordinate (x), guest y-coordinate (y), timestamp (time), and guest location accuracy (accuracy).

In other words, our goal is to predict the number of the hotel where a guest will stay, which makes this a classification problem. Following the k-nearest neighbor idea, we would normally expect a guest to stay at the nearest hotel. However, the other information provided, such as check-in time and positioning accuracy, may also affect where the guest stays.

Therefore, the first step in processing the data is to take into account every factor we believe affects where a guest stays, such as guest coordinates, check-in time, and positioning accuracy.

The features are then processed: needed ones are added, unwanted ones are deleted, and some of the data is filtered out. Finally, the target value is taken out separately as y_train, and the algorithm can be trained using x_train and y_train.
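The processing steps above can be sketched with pandas. Everything specific here is an assumption made for illustration: the sample rows, treating time as a count of seconds, dropping row_id, and the minimum check-in threshold.

```python
import pandas as pd

# Hypothetical sample rows with the columns described in the case overview.
data = pd.DataFrame({
    "row_id":   [0, 1, 2, 3, 4, 5],
    "x":        [0.12, 0.15, 0.11, 0.20, 0.18, 0.13],
    "y":        [0.55, 0.60, 0.52, 0.70, 0.65, 0.58],
    "accuracy": [65, 10, 72, 30, 55, 48],
    "time":     [470702, 186555, 322648, 704587, 472130, 600000],
    "place_id": [101, 101, 101, 202, 202, 303],
})

# Add features: assume `time` counts seconds and derive day, hour, weekday.
timestamps = pd.to_datetime(data["time"], unit="s")
data["day"] = timestamps.dt.day
data["hour"] = timestamps.dt.hour
data["weekday"] = timestamps.dt.weekday

# Delete features that are not used (dropping row_id is an assumption).
data = data.drop(columns=["row_id", "time"])

# Filter: keep only hotels with at least `min_checkins` rows (assumed threshold).
min_checkins = 2
counts = data["place_id"].value_counts()
data = data[data["place_id"].isin(counts[counts >= min_checkins].index)]

# Take the target value out separately.
y_train = data["place_id"]
x_train = data.drop(columns=["place_id"])

print(x_train.columns.tolist())
print(y_train.tolist())
```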

Part 1: Processing Data

1. Data Reduction

Assuming the data has already been imported as data, the data volume is too large, so it is reduced to speed up the demonstration.

Code:

data.query('x>0.1 & x<… & y>0.5 & y<…')
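A range filter of this kind can be sketched with pandas DataFrame.query. Only the lower bounds 0.1 and 0.5 appear in the text; the upper bounds X_MAX and Y_MAX and the sample rows below are hypothetical.

```python
import pandas as pd

# Hypothetical rows standing in for the imported data.
data = pd.DataFrame({
    "x": [0.05, 0.12, 0.20, 0.30, 0.15],
    "y": [0.60, 0.55, 0.70, 0.65, 0.90],
    "place_id": [1, 2, 3, 4, 5],
})

# Hypothetical upper bounds; only the lower bounds 0.1 and 0.5 come from the text.
X_MAX, Y_MAX = 0.25, 0.75

# Keep only the rows whose coordinates fall inside the chosen rectangle.
reduced = data.query(f"x > 0.1 & x < {X_MAX} & y > 0.5 & y < {Y_MAX}")
print(len(data), "->", len(reduced))
```

Restricting x and y to a small window keeps only the check-ins from one region, which is what makes the later steps fast enough for a demonstration.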

