How to implement stratified sampling in R

2026-02-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

How do you implement stratified sampling in R? This article walks through one simple approach step by step: it inspects the built-in iris dataset, draws a stratified sample with the sampling package, and splits the data into a training set and a test set.

1. Inspect the dataset

head(iris)

Display the first six rows of the dataset. The output shows that the iris dataset has five columns.

dim(iris)

The iris dataset has 150 rows and 5 columns.

summary(iris)

Summarizing each variable shows that the first four (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) are quantitative, while the last (Species) is qualitative. Species is the variable we will stratify on.
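Before stratifying, it can help to see how many rows each stratum holds. The following is a minimal sketch using only base R; the variable name counts is chosen here for illustration and is not part of the original walkthrough.

```r
# Count the rows in each level of the stratification variable.
counts <- table(iris$Species)
print(counts)

# Share of the full dataset that each stratum represents.
print(prop.table(counts))
```

In iris each of the three species has 50 rows, so the strata are perfectly balanced and the same sample size can be used for each one.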

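library(sampling)

Load the sampling package, which provides the stratified sampling function used below. If it is not installed yet, run install.packages("sampling") first.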

n <- round(3/5 * nrow(iris) / 3)

Calculate the number of rows to sample from each category. Here we take 3/5 of the rows of each Species: 3/5 of the 150 rows is 90, spread evenly over 3 species, so n is 30.
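The formula above divides by 3 because the three species are equally sized. When strata differ in size, one possible variant is to apply the sampling fraction to each stratum's own count. The sketch below assumes the same 3/5 fraction; the names frac, stratum_sizes and n_per_stratum are illustrative only.

```r
frac <- 3 / 5                          # sampling fraction from the text
stratum_sizes <- table(iris$Species)   # rows available in each stratum

# Apply the fraction stratum by stratum instead of dividing a total by 3.
n_per_stratum <- round(frac * stratum_sizes)
print(n_per_stratum)

# For iris this agrees with the single value computed in the text.
n <- round(frac * nrow(iris) / 3)
print(all(n_per_stratum == n))
```

For iris both routes give 30 rows per species, so either can be passed on as the per-stratum sample size.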

sub_train <- strata(iris, stratanames = "Species", size = rep(n, 3), method = "srswor")
head(sub_train)

The stratanames parameter names the variable that defines the strata, and size gives the number of rows to draw from each stratum; here we repeat the n calculated in the previous step once for each of the three species. The method parameter sets the sampling method, and we choose "srswor", simple random sampling without replacement.
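Because the draw is random, running the call twice normally selects different rows. A common way to make it repeatable is to call set.seed() first. The sketch below assumes the sampling package is installed; the helper draw and the seed value 2024 are hypothetical and chosen only for illustration.

```r
library(sampling)

# Hypothetical helper: fix the random seed, then draw 30 rows per species.
draw <- function(seed) {
  set.seed(seed)
  strata(iris, stratanames = "Species", size = rep(30, 3), method = "srswor")
}

first  <- draw(2024)
second <- draw(2024)

# With the same seed, both draws should select the same row numbers.
print(identical(first$ID_unit, second$ID_unit))
```

Fixing the seed is useful when the training and test sets need to be regenerated later, for example to reproduce a model evaluation.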

data_train <- iris[sub_train$ID_unit, ]
data_test <- iris[-sub_train$ID_unit, ]

The sampled rows form the training set (data_train), and the remaining rows form the test set (data_test).

dim(data_train); dim(data_test)

Check the number of rows and columns in the training set and the test set. The training set has 90 rows and the test set has 60, each with 5 columns, which matches the sampling plan.
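For comparison, the same draw, split and size check can be sketched in base R without the sampling package. This is an alternative illustration, not the method used in this article: it relies on split() and sample(), and the seed 42 and the names rows_by_species and train_rows are assumptions made for the example.

```r
set.seed(42)  # arbitrary seed so the draw can be repeated

n <- 30  # rows to draw from each species, as computed in the text

# Split the row numbers by species, then draw n from each group
# without replacement (the default behaviour of sample()).
rows_by_species <- split(seq_len(nrow(iris)), iris$Species)
train_rows <- unlist(lapply(rows_by_species, sample, size = n))

data_train <- iris[train_rows, ]
data_test  <- iris[-train_rows, ]

# Per-species counts confirm that the split is stratified.
print(table(data_train$Species))
print(table(data_test$Species))
```

Checking the per-species counts, rather than only the overall dimensions, confirms that every species contributes the same share to both sets.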

head(data_train); head(data_test)

Display the first few rows of the training set and the test set.

data_train; data_test

Print the full sampling results. The output is too long to reproduce here.

write.csv(data_train, "C:/Users/cnrozh/Desktop/iris_data_train.csv")
write.csv(data_test, "C:/Users/cnrozh/Desktop/iris_data_test.csv")

Save both datasets as CSV files.
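The paths above are specific to one Windows user account. A more portable variant might build the path with file.path() and omit the row-number column that write.csv() adds by default. The sketch below writes to a temporary directory and uses a small stand-in data frame; both choices are assumptions made so that the example runs on its own.

```r
# Small stand-in data frame; in the article this would be data_train.
data_train <- head(iris, 10)

# Build the path from parts instead of hard-coding a user directory.
out_file <- file.path(tempdir(), "iris_data_train.csv")

# row.names = FALSE leaves out the extra column of row numbers.
write.csv(data_train, out_file, row.names = FALSE)

# Read the file back to confirm it has the same shape as what was written.
check <- read.csv(out_file)
print(dim(check))
```

Reading the file back is a quick way to confirm that the saved data has the expected number of rows and columns.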

That covers how to implement stratified sampling in R. I hope the steps above are of some help to you.
