How to do machine learning without Python

2025-01-30 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article shows how to do machine learning without Python, using SQL and PL/SQL inside an Oracle database instead. The steps below are simple and easy to follow, so work through them in order.

Data preparation

As mentioned earlier, you create a table to hold the Iris dataset and then load the data into it. OML (Oracle Machine Learning) requires one column to serve as a row ID (filled from a sequence), so keep that in mind:

CREATE SEQUENCE seq_iris;

CREATE TABLE iris_data (
    iris_id      NUMBER DEFAULT seq_iris.NEXTVAL,
    sepal_length NUMBER,
    sepal_width  NUMBER,
    petal_length NUMBER,
    petal_width  NUMBER,
    species      VARCHAR2(16)
);

You can now download the data and load it into the table. When the modal window pops up, provide the path to the downloaded CSV and click Next a few times. SQL Developer should get everything right without any help.
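Before training, it can be worth confirming that the import did what you expect. The queries below are a minimal sketch, assuming the CSV was loaded into the iris_data table defined above; the column aliases (n_rows, n_incomplete) are illustrative names and not part of the original walkthrough.

```sql
-- Row count per class: a quick way to spot a truncated or duplicated import.
SELECT species, COUNT(*) AS n_rows
FROM iris_data
GROUP BY species
ORDER BY species;

-- Rows where any measurement or the label is missing after the import.
SELECT COUNT(*) AS n_incomplete
FROM iris_data
WHERE sepal_length IS NULL
   OR sepal_width  IS NULL
   OR petal_length IS NULL
   OR petal_width  IS NULL
   OR species      IS NULL;
```

If the second query reports anything other than zero, it is probably easier to fix the CSV and reload than to patch rows by hand.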

Model training

Now it's time for the fun part. Training a classification model breaks down into several steps (train/test split, model training, and model evaluation), and we start with the simplest.

Train/test split

In Oracle, this step is done with two views: one for the training data and one for the test data. You can create them with a bit of PL/SQL:

BEGIN
    EXECUTE IMMEDIATE
        'CREATE OR REPLACE VIEW iris_train_data AS
         SELECT * FROM iris_data SAMPLE (75) SEED (42)';
    EXECUTE IMMEDIATE
        'CREATE OR REPLACE VIEW iris_test_data AS
         SELECT * FROM iris_data
         MINUS
         SELECT * FROM iris_train_data';
END;
/

The script does two things:

Creates a training view: roughly 75% of the rows are sampled (SAMPLE (75)) using a fixed random seed (SEED (42)), so the split is reproducible.

Creates a test view: the set difference (MINUS) between the entire dataset and the training view.

The data now sits behind two views named iris_train_data and iris_test_data, and the names tell you which one holds what. A quick row count for each:

SELECT COUNT(*) FROM iris_train_data;
>>> 111

SELECT COUNT(*) FROM iris_test_data;
>>> 39
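Row counts alone do not show whether each species is represented in both views, or whether any row ended up on both sides. The following is a small sketch of two extra checks, assuming the two views created above; the aliases part, n_rows and n_overlap are made up for illustration.

```sql
-- Class distribution in each half of the split.
SELECT 'train' AS part, species, COUNT(*) AS n_rows
FROM iris_train_data
GROUP BY species
UNION ALL
SELECT 'test' AS part, species, COUNT(*) AS n_rows
FROM iris_test_data
GROUP BY species
ORDER BY 1, 2;

-- Rows present in both views. Because the test view is built with MINUS,
-- a non-zero count here would mean the split is leaking.
SELECT COUNT(*) AS n_overlap
FROM iris_train_data tr
JOIN iris_test_data te
  ON te.iris_id = tr.iris_id;
```

SAMPLE picks rows at random rather than per class, so the per-species counts will usually be uneven; that is normal for a split this small.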

Model training

The easiest way to train a model is through the DBMS_DATA_MINING package, with a single procedure call and no additional settings tables. A decision tree algorithm is used to train the model. Here's how:

DECLARE
    v_setlst DBMS_DATA_MINING.SETTING_LIST;
BEGIN
    v_setlst('PREP_AUTO') := 'ON';
    v_setlst('ALGO_NAME') := 'ALGO_DECISION_TREE';

    DBMS_DATA_MINING.CREATE_MODEL2(
        'iris_clf_model',
        'CLASSIFICATION',
        'SELECT * FROM iris_train_data',
        v_setlst,
        'iris_id',
        'species'
    );
END;
/

The CREATE_MODEL2 procedure accepts a number of parameters. Here is what each of the ones passed above means:

iris_clf_model - just the model name; it can be anything.

CLASSIFICATION - the type of machine learning task; it must be written in uppercase for some reason.

SELECT * FROM iris_train_data - the query that supplies the training data.

v_setlst - the list of model settings declared above.

iris_id - name of the row ID column (each value is unique).

species - name of the target variable (what you are trying to predict).

This block takes a second or two to run, and once it finishes you can move on to evaluation!
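To confirm that the model exists and was built with the settings you intended, you can look it up in the data dictionary. This is a hedged sketch that assumes the standard USER_MINING_MODELS and USER_MINING_MODEL_SETTINGS views are visible in your schema; the UPPER() comparison is there because the name may be stored in uppercase.

```sql
-- One row per model owned by the current user.
SELECT model_name, mining_function, algorithm, creation_date
FROM user_mining_models
WHERE UPPER(model_name) = 'IRIS_CLF_MODEL';

-- Settings the model was built with, including defaults filled in by Oracle.
SELECT setting_name, setting_value, setting_type
FROM user_mining_model_settings
WHERE UPPER(model_name) = 'IRIS_CLF_MODEL'
ORDER BY setting_name;
```

If the first query returns no rows, the CREATE_MODEL2 call did not complete, and rerunning it with server output enabled is a reasonable next step.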

Model evaluation

Use this script to evaluate the model:

BEGIN
    DBMS_DATA_MINING.APPLY(
        'iris_clf_model',
        'iris_test_data',
        'iris_id',
        'iris_apply_result'
    );
END;
/

It applies iris_clf_model to the unseen test data in iris_test_data and stores the predictions in the iris_apply_result table.
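To see what APPLY produced, you can peek at the first few rows of the result table. This is a minimal sketch; it relies only on the iris_id, prediction and probability columns that the evaluation script in this article also uses, and the row limit of 6 is an arbitrary choice.

```sql
-- First two test cases, with one row per candidate class,
-- most probable class listed first within each case.
SELECT iris_id, prediction, probability
FROM iris_apply_result
ORDER BY iris_id, probability DESC
FETCH FIRST 6 ROWS ONLY;
```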

The iris_apply_result table has more rows than the test set (39 × 3, one row per test case per class), but you get the idea. That's not intuitive enough, so here's a slightly different way to show the results:

DECLARE
    -- One entry per test case (the apply table holds several rows per ID).
    CURSOR iris_ids IS
        SELECT DISTINCT(iris_id) iris_id
        FROM iris_apply_result
        ORDER BY iris_id;
    curr_y      VARCHAR2(16);
    curr_yhat   VARCHAR2(16);
    num_correct INTEGER := 0;
    num_total   INTEGER := 0;
BEGIN
    FOR r_id IN iris_ids LOOP
        BEGIN
            -- Actual class of the current test case.
            EXECUTE IMMEDIATE
                'SELECT species FROM iris_test_data WHERE iris_id = ' || r_id.iris_id
                INTO curr_y;
            -- Predicted class: the one with the highest probability.
            EXECUTE IMMEDIATE
                'SELECT prediction FROM iris_apply_result WHERE iris_id = ' || r_id.iris_id ||
                ' AND probability = (SELECT MAX(probability) FROM iris_apply_result WHERE iris_id = ' ||
                r_id.iris_id || ')'
                INTO curr_yhat;
        END;

        num_total := num_total + 1;
        IF curr_y = curr_yhat THEN
            num_correct := num_correct + 1;
        END IF;
    END LOOP;

    DBMS_OUTPUT.PUT_LINE('Num. test cases: ' || num_total);
    DBMS_OUTPUT.PUT_LINE('Num. correct : ' || num_correct);
    DBMS_OUTPUT.PUT_LINE('Accuracy : ' || ROUND((num_correct / num_total), 2));
END;
/

It's a lot, but each part is straightforward. Here's the breakdown:

CURSOR - gets all the distinct iris_id values (the iris_apply_result table contains each ID more than once).

curr_y, curr_yhat, num_correct, num_total - variables that hold the actual and predicted class in each iteration, the number of correct classifications, and the total number of test cases.

For each unique iris_id, the script fetches the actual class (from iris_test_data, matching on ID) and the predicted class (the row with the highest probability in iris_apply_result).

It then checks whether the actual and predicted values are the same, which means the classification is correct.

The variables num_total and num_correct are updated in each iteration.

Finally, the model's performance is printed to the console.

The script reports the following:

The test set has 39 cases.

37 of the 39 samples were classified correctly.

The resulting accuracy is 95% (37/39 ≈ 0.949, rounded to two decimals).
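If you prefer plain SQL over a PL/SQL loop, the same three numbers can be computed in one set-based query. This is an alternative sketch, not the script used above: it assumes the same iris_test_data view and iris_apply_result table, and the names top_pred, rn, actual, predicted and n_cases are made up for illustration. Ties in probability are broken arbitrarily by ROW_NUMBER.

```sql
-- Overall accuracy: keep the most probable prediction per test case,
-- then compare it with the actual label.
WITH top_pred AS (
    SELECT iris_id,
           prediction,
           ROW_NUMBER() OVER (PARTITION BY iris_id
                              ORDER BY probability DESC) AS rn
    FROM iris_apply_result
)
SELECT COUNT(*) AS num_total,
       SUM(CASE WHEN t.species = p.prediction THEN 1 ELSE 0 END) AS num_correct,
       ROUND(AVG(CASE WHEN t.species = p.prediction THEN 1 ELSE 0 END), 2) AS accuracy
FROM iris_test_data t
JOIN top_pred p
  ON p.iris_id = t.iris_id
 AND p.rn = 1;

-- Confusion matrix: which classes get mistaken for which.
WITH top_pred AS (
    SELECT iris_id,
           prediction,
           ROW_NUMBER() OVER (PARTITION BY iris_id
                              ORDER BY probability DESC) AS rn
    FROM iris_apply_result
)
SELECT t.species    AS actual,
       p.prediction AS predicted,
       COUNT(*)     AS n_cases
FROM iris_test_data t
JOIN top_pred p
  ON p.iris_id = t.iris_id
 AND p.rn = 1
GROUP BY t.species, p.prediction
ORDER BY t.species, p.prediction;
```

The confusion matrix is the more informative of the two on a small test set, since a single accuracy figure hides which species the errors fall on.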

That wraps up this look at how to do machine learning without Python. Theory sticks best when paired with practice, so go and try it yourself!
