2025-02-28 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/01 Report
This article introduces how to implement a decision tree in Python. The content is detailed and easy to follow, the steps are simple and quick, and the material has practical reference value. I believe you will gain something from reading it. Let's take a look.
Background introduction
This is one of my favorite algorithms, and I use it often. It is a supervised learning algorithm, mainly used for classification problems, though it works for both categorical and continuous dependent variables. The algorithm splits the population into two or more homogeneous sets based on the most significant attributes (independent variables), so that the resulting groups are as different from each other as possible.
In the figure above, you can see that the population is split into four different groups based on multiple attributes, to identify whether or not each person will play. To split the population into heterogeneous groups, the algorithm uses various techniques, such as Gini impurity, information gain, chi-square, and entropy.
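As a concrete illustration of two of these splitting criteria, here is a minimal sketch of computing Gini impurity and entropy for a group of labels. The helper names `gini` and `entropy` are my own for this example; they are not part of scikit-learn.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy: -sum(p * log2(p)) over class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# A pure group has zero impurity; a 50/50 split is maximally impure.
print(gini(["play"] * 8))                   # 0.0
print(gini(["play"] * 4 + ["no"] * 4))      # 0.5
print(entropy(["play"] * 4 + ["no"] * 4))   # 1.0
```

A split is chosen to maximize the drop in impurity (for Gini) or the information gain (for entropy) from the parent node to the child nodes.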
The best way to understand how a decision tree works is to play Jezzball, a classic Microsoft game (shown in the figure below). Essentially, you have a room with moving balls, and you build walls so as to clear the maximum area free of balls.
Each time you split the room with a wall, you try to create two regions as different from each other as possible. A decision tree works in a very similar way, by dividing the population into groups that are as different as possible.
Let's look at a decision tree example in Python using Scikit-learn:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# read the train and test datasets
train_data = pd.read_csv('train-data.csv')
test_data = pd.read_csv('test-data.csv')

# shape of the datasets
print('Shape of training data:', train_data.shape)
print('Shape of testing data:', test_data.shape)

# separate the target variable from the features
train_x = train_data.drop(columns=['Survived'], axis=1)
train_y = train_data['Survived']
test_x = test_data.drop(columns=['Survived'], axis=1)
test_y = test_data['Survived']

# train the decision tree
model = DecisionTreeClassifier()
model.fit(train_x, train_y)

# depth of the fitted tree
print('Depth of the Decision Tree:', model.get_depth())

# predict and score on the train dataset
predict_train = model.predict(train_x)
print('Target on train data', predict_train)
accuracy_train = accuracy_score(train_y, predict_train)
print('accuracy_score on train dataset:', accuracy_train)

# predict and score on the test dataset
predict_test = model.predict(test_x)
print('Target on test data', predict_test)
accuracy_test = accuracy_score(test_y, predict_test)
print('accuracy_score on test dataset:', accuracy_test)
Running the above code produces output like the following (the full prediction arrays printed for the train and test data are omitted here):

Shape of training data: (712, 25)
Shape of testing data: (179, 25)
Depth of the Decision Tree: 19
accuracy_score on train dataset: 0.9859550561797753
accuracy_score on test dataset: 0.770949720670391

This is the end of the article on "how to implement a decision tree in Python". Thank you for reading! I believe you now have some understanding of how to implement a decision tree in Python. If you want to learn more, you are welcome to follow the industry information channel.