How to Implement Plant Seedling Classification with ConvNeXt


2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Today I would like to share how to classify plant seedlings with ConvNeXt. The content is detailed and the logic is clear. Many readers may not be familiar with this topic yet, so I am sharing this article for reference. I hope you get something out of it. Let's take a look.

Preface

ConvNeXt is built entirely from standard ConvNet modules and competes with Transformers in accuracy and scalability. It achieves 87.8% ImageNet top-1 accuracy and outperforms Swin Transformers on COCO detection and ADE20K segmentation, while keeping the simplicity and efficiency of standard ConvNets.

Characteristics of ConvNeXt

Use 7 × 7 convolution kernels. Classical CNN models such as VGG and ResNet use small kernels, but ConvNeXt demonstrates the effectiveness of large kernels. The authors tried several kernel sizes, including 3, 5, 7, 9, and 11. Network accuracy improves from 79.9% (3 × 3) to 80.6% (7 × 7) while the FLOPs stay roughly the same, and the benefit of larger kernels saturates at 7 × 7.

Use GELU (Gaussian Error Linear Unit) as the activation function. GELU combines ideas from dropout, zoneout and ReLU: it multiplies the input by a 0/1 mask, and the mask is generated stochastically with a probability that depends on the input. In experiments it performs better than ReLU and ELU. The following figure shows the experimental data:
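For reference, the deterministic form used in practice is GELU(x) = x * Phi(x), where Phi is the standard normal CDF; this is the expected value of the stochastic 0/1 masking described above. A minimal pure-Python sketch follows. The helper names gelu and relu are illustrative and do not come from the article's code.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu(x: float) -> float:
    """ReLU, shown only for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    for v in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={v:+.1f}  relu={relu(v):.4f}  gelu={gelu(v):.4f}")
```

Unlike ReLU, GELU lets small negative inputs through with a small negative output instead of cutting them to zero.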

Use LayerNorm instead of BatchNorm.
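To make the difference concrete: LayerNorm normalizes each sample across its channels, so its statistics do not depend on the batch, whereas BatchNorm normalizes each channel across the batch. Below is a minimal pure-Python sketch of the LayerNorm computation for a single channel vector. The function name layer_norm is illustrative, and the learnable scale and shift are omitted (treated as 1 and 0).

```python
import math
from typing import List


def layer_norm(channels: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize one feature vector to zero mean and (almost) unit variance."""
    mean = sum(channels) / len(channels)
    var = sum((c - mean) ** 2 for c in channels) / len(channels)  # biased variance
    return [(c - mean) / math.sqrt(var + eps) for c in channels]


if __name__ == "__main__":
    print(layer_norm([1.0, 2.0, 3.0, 4.0]))
```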

Invert the bottleneck. Figures 3 (a) to (b) illustrate these configurations. Although the FLOPs of the depthwise convolution layer increase, this change reduces the FLOPs of the whole network to 4.6G, because the FLOPs of the shortcut 1 × 1 convolution layer in the downsampling residual blocks drop significantly. Accuracy improves slightly, from 80.5% to 80.6%. In the ResNet-200/Swin-B regime, this step brings a larger gain (81.9% to 82.6%), also with reduced FLOPs.
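As a rough illustration of the inverted bottleneck, the sketch below counts the parameters of one ConvNeXt block: a 7 × 7 depthwise convolution, a LayerNorm, a pointwise expansion to 4 × dim, a pointwise projection back to dim, and a per-channel layer scale. The layer shapes follow the Block class shown later in this article; the function name block_param_count is hypothetical.

```python
def block_param_count(dim: int, kernel_size: int = 7, expansion: int = 4) -> dict:
    """Count weights and biases per layer of one ConvNeXt block."""
    hidden = expansion * dim
    counts = {
        "dwconv": dim * kernel_size * kernel_size + dim,  # one k x k filter per channel, plus bias
        "norm": 2 * dim,                                  # LayerNorm weight and bias
        "pwconv1": dim * hidden + hidden,                 # expand: dim -> 4 * dim
        "pwconv2": hidden * dim + dim,                    # project: 4 * dim -> dim
        "gamma": dim,                                     # layer scale
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    for name, n in block_param_count(96).items():
        print(f"{name:8s} {n}")
```

With dim=96, the depthwise convolution holds only a small share of the parameters; most of them sit in the two pointwise layers, which is why the wide middle of the block is affordable.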

ConvNeXt residual module

The residual module is the core of the whole model. As shown below:

Code implementation:

import torch
import torch.nn as nn
from timm.models.layers import DropPath  # stochastic depth, as imported by the official convnext.py

# LayerNorm below is the custom class defined alongside Block in convnext.py
# (it supports channels_last and channels_first), not nn.LayerNorm.


class Block(nn.Module):
    r"""ConvNeXt Block. There are two equivalent implementations:
    (1) DwConv -> LayerNorm (channels_first) -> 1x1 Conv -> GELU -> 1x1 Conv; all in (N, C, H, W)
    (2) DwConv -> Permute to (N, H, W, C); LayerNorm (channels_last) -> Linear -> GELU -> Linear; Permute back
    We use (2) as we find it slightly faster in PyTorch

    Args:
        dim (int): Number of input channels.
        drop_path (float): Stochastic depth rate. Default: 0.0
        layer_scale_init_value (float): Init value for Layer Scale. Default: 1e-6.
    """

    def __init__(self, dim, drop_path=0., layer_scale_init_value=1e-6):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim)  # depthwise conv
        self.norm = LayerNorm(dim, eps=1e-6)
        self.pwconv1 = nn.Linear(dim, 4 * dim)  # pointwise/1x1 convs, implemented with linear layers
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)
        self.gamma = nn.Parameter(layer_scale_init_value * torch.ones((dim)),
                                  requires_grad=True) if layer_scale_init_value > 0 else None
        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()

    def forward(self, x):
        input = x
        x = self.dwconv(x)
        x = x.permute(0, 2, 3, 1)  # (N, C, H, W) -> (N, H, W, C)
        x = self.norm(x)
        x = self.pwconv1(x)
        x = self.act(x)
        x = self.pwconv2(x)
        if self.gamma is not None:
            x = self.gamma * x
        x = x.permute(0, 3, 1, 2)  # (N, H, W, C) -> (N, C, H, W)

        x = input + self.drop_path(x)
        return x

Data augmentation: Cutout and Mixup

ConvNeXt uses Cutout and Mixup, so I added these two augmentations to my code to improve performance. The official code uses timm for them; I used torchtoolbox instead. Installation command:

pip install torchtoolbox

Cutout is applied inside the transforms pipeline:

import torchvision.transforms as transforms
from torchtoolbox.transform import Cutout

# data preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    Cutout(),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
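Conceptually, Cutout blanks out a random patch of the image so the network cannot rely on any single region. The sketch below illustrates the idea with a square patch on a small 2D list of pixel values. It is not torchtoolbox's implementation; the function name cutout and its parameters are assumptions made for illustration.

```python
import random
from typing import List


def cutout(image: List[List[float]], size: int, rng: random.Random,
           fill: float = 0.0) -> List[List[float]]:
    """Return a copy of image with one size x size square set to fill."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)   # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    out = [row[:] for row in image]       # copy so the input is left untouched
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = fill
    return out


if __name__ == "__main__":
    img = [[1.0] * 6 for _ in range(6)]
    for row in cutout(img, size=3, rng=random.Random(0)):
        print(row)
```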

Mixup is applied in the train method. It requires the import: from torchtoolbox.tools import mixup_data, mixup_criterion

for batch_idx, (data, target) in enumerate(train_loader):
    data, target = data.to(device, non_blocking=True), target.to(device, non_blocking=True)
    data, labels_a, labels_b, lam = mixup_data(data, target, alpha)
    optimizer.zero_grad()
    output = model(data)
    loss = mixup_criterion(criterion, output, labels_a, labels_b, lam)
    loss.backward()
    optimizer.step()
    print_loss = loss.data.item()

Project structure

Use the tree command to print the project structure
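Where the tree command is not available, a small Python helper can print a comparable listing. This is a sketch using only the standard library; the function name tree_lines is hypothetical and the output format only approximates that of tree.

```python
import os


def tree_lines(root: str, prefix: str = "") -> list:
    """Return the lines of a simple, sorted tree listing of root."""
    lines = []
    entries = sorted(os.listdir(root))
    for i, name in enumerate(entries):
        last = i == len(entries) - 1
        lines.append(prefix + ("└── " if last else "├── ") + name)
        path = os.path.join(root, name)
        if os.path.isdir(path):
            # indent children; keep the vertical bar while siblings remain
            lines.extend(tree_lines(path, prefix + ("    " if last else "│   ")))
    return lines


if __name__ == "__main__":
    print(".")
    print("\n".join(tree_lines(".")))
```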

Data set

The dataset is a plant seedling classification dataset with 12 categories in total. The dataset link is as follows:

Link extraction code: syng

Create a new data folder in the root directory of the project. After obtaining the dataset, extract train and test and place them under the data folder, as shown below:

Import model file

Find the convnext.py file in the official link and place it in the Model folder. As shown in the figure:

Install and import the required libraries

The timm library is used in the model. If you have not installed it yet, run:

pip install timm

Create a new train_connext.py file and import the required packages:

import torch.optim as optim
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
from dataset.dataset import SeedlingData
from torch.autograd import Variable
from Model.convnext import convnext_tiny
from torchtoolbox.tools import mixup_data, mixup_criterion
from torchtoolbox.transform import Cutout

Set global parameters

Select the GPU if one is available, and set the learning rate, batch size, number of epochs and other parameters.

# set global parameters
modellr = 1e-4
BATCH_SIZE = 8
EPOCHS = 300
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Data preprocessing

Data preprocessing is kept simple, with nothing elaborate; interested readers can add more processing steps.

# data preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    Cutout(),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
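ToTensor scales pixel values to [0, 1], and Normalize with mean 0.5 and standard deviation 0.5 then maps them to [-1, 1] via (x - mean) / std. A small sketch of that arithmetic for a single 8-bit pixel value; the function name normalize is illustrative.

```python
def normalize(pixel: int, mean: float = 0.5, std: float = 0.5) -> float:
    """Apply ToTensor-style scaling followed by Normalize to one pixel value."""
    scaled = pixel / 255.0        # ToTensor: [0, 255] -> [0.0, 1.0]
    return (scaled - mean) / std  # Normalize: [0.0, 1.0] -> [-1.0, 1.0]


if __name__ == "__main__":
    for p in (0, 128, 255):
        print(p, round(normalize(p), 4))
```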

Data reading

Then create __init__.py and dataset.py under the dataset folder, and write the following code in dataset.py:

Here is the core logic of the code.

The first step is to set up a dictionary that maps each category to an ID, so that categories are represented by numbers.

The second step is to collect the image paths in __init__. The test set has only one directory level and is read directly, while the training set has one folder per category under the train folder: first get the category folders, then the image paths inside each. Then use sklearn's dataset splitting function to split the data into a training set and a validation set at a ratio of 7:3.

The third step is to define, in the __getitem__ method, how a single image and its category are read. Because the images have a bit depth of 32 bits, I convert them to RGB when reading.

The code is as follows:

# coding:utf8
import os
from PIL import Image
from torch.utils import data
from torchvision import transforms as T
from sklearn.model_selection import train_test_split

Labels = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3,
          'Common wheat': 4, 'Fat Hen': 5, 'Loose Silky-bent': 6, 'Maize': 7, 'Scentless Mayweed': 8,
          'Shepherds Purse': 9, 'Small-flowered Cranesbill': 10, 'Sugar beet': 11}


class SeedlingData(data.Dataset):

    def __init__(self, root, transforms=None, train=True, test=False):
        """
        Main goal: get the paths of all images and split the data
        into training, validation and test sets.
        """
        self.test = test
        self.transforms = transforms

        if self.test:
            # test set: images sit directly under root
            imgs = [os.path.join(root, img) for img in os.listdir(root)]
            self.imgs = imgs
        else:
            # training set: one sub-folder per category under root
            imgs_labels = [os.path.join(root, img) for img in os.listdir(root)]
            imgs = []
            for imglable in imgs_labels:
                for imgname in os.listdir(imglable):
                    imgpath = os.path.join(imglable, imgname)
                    imgs.append(imgpath)
            trainval_files, val_files = train_test_split(imgs, test_size=0.3, random_state=42)
            if train:
                self.imgs = trainval_files
            else:
                self.imgs = val_files

    def __getitem__(self, index):
        """
        Return the data of one image at a time.
        """
        img_path = self.imgs[index]
        img_path = img_path.replace("\\", '/')
        if self.test:
            label = -1
        else:
            # the parent folder name is the category name
            labelname = img_path.split('/')[-2]
            label = Labels[labelname]
        data = Image.open(img_path).convert('RGB')
        data = self.transforms(data)
        return data, label

    def __len__(self):
        return len(self.imgs)
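Because SeedlingData is instantiated twice (once with train=True, once with train=False) and each instance splits the file list itself, the two subsets are only complementary when both calls shuffle identically, which is what the fixed random_state=42 provides. The sketch below illustrates that property with the standard library only. It is not sklearn's algorithm, and split_paths is a hypothetical helper.

```python
import random
from typing import List, Tuple


def split_paths(paths: List[str], val_fraction: float = 0.3,
                seed: int = 42) -> Tuple[List[str], List[str]]:
    """Deterministically shuffle paths and split them into (train, val)."""
    shuffled = sorted(paths)               # sort first so the input order does not matter
    random.Random(seed).shuffle(shuffled)  # same seed -> same permutation on every call
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    files = [f"data/train/Maize/{i}.png" for i in range(10)]
    train_a, _ = split_paths(files)
    _, val_b = split_paths(files)          # a second, independent call
    print(len(train_a), len(val_b))
    print(set(train_a).isdisjoint(val_b))
```

Sorting before shuffling is a defensive choice in this sketch: os.listdir does not guarantee an order, so a split that depends on listing order could differ between machines.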

Then call SeedlingData in the training script to read the data, and remember to import it from the dataset.py we just wrote (from dataset.dataset import SeedlingData).

# read data
dataset_train = SeedlingData('data/train', transforms=transform, train=True)
dataset_test = SeedlingData("data/train", transforms=transform_test, train=False)
# load data
train_loader = torch.utils.data.DataLoader(dataset_train, batch_size=BATCH_SIZE, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=False)

Set up the model

Set the loss function to nn.CrossEntropyLoss ().

Set the model to convnext_tiny and change the output of the final fully connected layer to 12 (the number of categories in the dataset).

Set the optimizer to Adam.

Set the learning rate schedule to cosine annealing.
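For intuition, cosine annealing follows lr(t) = eta_min + (lr_0 - eta_min) * (1 + cos(pi * t / T_max)) / 2. The sketch below evaluates this closed form with the values used in this article (initial rate 1e-4, T_max=20, eta_min=1e-9). The function name cosine_lr is illustrative, and the PyTorch scheduler itself should be treated as the reference behaviour.

```python
import math


def cosine_lr(step: int, base_lr: float = 1e-4, t_max: int = 20,
              eta_min: float = 1e-9) -> float:
    """Closed-form cosine-annealed learning rate after a number of scheduler steps."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max)) / 2


if __name__ == "__main__":
    for step in (0, 5, 10, 15, 20, 40):
        print(step, f"{cosine_lr(step):.3e}")
```

The closed form is periodic with period 2 * T_max, so with T_max=20 a run of 300 epochs would see the rate fall and rise several times if the scheduler follows this curve, rather than decaying once.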

# instantiate the model and move it to the GPU
criterion = nn.CrossEntropyLoss()
# criterion = SoftTargetCrossEntropy()
model_ft = convnext_tiny(pretrained=True)
num_ftrs = model_ft.head.in_features
model_ft.head = nn.Linear(num_ftrs, 12)  # ConvNeXt's classifier layer is named head, not fc
model_ft.to(DEVICE)
# use a plain Adam optimizer with a low learning rate
optimizer = optim.Adam(model_ft.parameters(), lr=modellr)
cosine_schedule = optim.lr_scheduler.CosineAnnealingLR(optimizer=optimizer, T_max=20, eta_min=1e-9)

Define the training and validation functions

alpha=0.2 is the parameter required by Mixup.
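Mixup draws a mixing weight lam from a Beta(alpha, alpha) distribution, blends each sample with a randomly chosen partner as lam * x_a + (1 - lam) * x_b, and combines the two losses with the same weights. With alpha=0.2 most draws of lam fall near 0 or 1, so most mixed samples stay close to one of the originals. A standard-library sketch of the idea follows; it is not torchtoolbox's code, and mixup_pair and mixup_loss are hypothetical names.

```python
import random
from typing import List, Tuple


def mixup_pair(x_a: List[float], x_b: List[float], alpha: float,
               rng: random.Random) -> Tuple[List[float], float]:
    """Blend two samples with a weight drawn from Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha) if alpha > 0 else 1.0
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    return mixed, lam


def mixup_loss(loss_a: float, loss_b: float, lam: float) -> float:
    """Weight the losses against both labels the same way the inputs were mixed."""
    return lam * loss_a + (1.0 - lam) * loss_b


if __name__ == "__main__":
    rng = random.Random(0)
    mixed, lam = mixup_pair([1.0, 1.0], [0.0, 0.0], alpha=0.2, rng=rng)
    print(lam, mixed, mixup_loss(0.3, 1.2, lam))
```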

# define the training process
alpha = 0.2


def train(model, device, train_loader, optimizer, epoch):
    model.train()
    sum_loss = 0
    total_num = len(train_loader.dataset)
    print(total_num, len(train_loader))
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device, non_blocking=True), target.to(device, non_blocking=True)
        data, labels_a, labels_b, lam = mixup_data(data, target, alpha)
        optimizer.zero_grad()
        output = model(data)
        loss = mixup_criterion(criterion, output, labels_a, labels_b, lam)
        loss.backward()
        optimizer.step()
        print_loss = loss.data.item()
        sum_loss += print_loss
        if (batch_idx + 1) % 10 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                100. * (batch_idx + 1) / len(train_loader), loss.item()))
    ave_loss = sum_loss / len(train_loader)
    print('epoch: {}, loss: {}'.format(epoch, ave_loss))


ACC = 0


# validation process
def val(model, device, test_loader):
    global ACC
    model.eval()
    test_loss = 0
    correct = 0
    total_num = len(test_loader.dataset)
    print(total_num, len(test_loader))
    with torch.no_grad():
        for data, target in test_loader:
            data, target = Variable(data).to(device), Variable(target).to(device)
            output = model(data)
            loss = criterion(output, target)
            _, pred = torch.max(output.data, 1)
            correct += torch.sum(pred == target)
            print_loss = loss.data.item()
            test_loss += print_loss
        correct = correct.data.item()
        acc = correct / total_num
        avgloss = test_loss / len(test_loader)
        print('\nVal set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            avgloss, correct, len(test_loader.dataset), 100 * acc))
        if acc > ACC:
            # epoch is the module-level loop variable set by the training loop below
            torch.save(model_ft, 'model_' + str(epoch) + '_' + str(round(acc, 3)) + '.pth')
            ACC = acc


# training
for epoch in range(1, EPOCHS + 1):
    train(model_ft, DEVICE, train_loader, optimizer, epoch)
    cosine_schedule.step()
    val(model_ft, DEVICE, test_loader)

And then you can start training.

Training for 10 epochs already gives good results:

Testing: the first approach

The directory where the test set is stored is as follows:

The first step is to define the categories. Their order must match the order used during training, so do not change it!

classes = ('Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed',
           'Common wheat', 'Fat Hen', 'Loose Silky-bent',
           'Maize', 'Scentless Mayweed', 'Shepherds Purse', 'Small-flowered Cranesbill', 'Sugar beet')
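One way to keep this order from drifting is to derive the tuple from the Labels dictionary used during training, sorting the names by their numeric ID. A minimal sketch, assuming the Labels mapping from dataset.py (repeated here so the snippet is self-contained); classes_from_labels is a hypothetical helper.

```python
from typing import Dict, Tuple

Labels = {'Black-grass': 0, 'Charlock': 1, 'Cleavers': 2, 'Common Chickweed': 3,
          'Common wheat': 4, 'Fat Hen': 5, 'Loose Silky-bent': 6, 'Maize': 7,
          'Scentless Mayweed': 8, 'Shepherds Purse': 9,
          'Small-flowered Cranesbill': 10, 'Sugar beet': 11}


def classes_from_labels(labels: Dict[str, int]) -> Tuple[str, ...]:
    """Return class names ordered by ID, checking that the IDs are exactly 0..n-1."""
    names = tuple(sorted(labels, key=labels.get))
    assert [labels[n] for n in names] == list(range(len(names))), "IDs must be 0..n-1"
    return names


if __name__ == "__main__":
    classes = classes_from_labels(Labels)
    print(classes[7])  # index predicted by the model -> class name
```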

The second step is to define the transforms, which are the same as those of the validation set, without data augmentation.

transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])

The third step is to load the model and move it to DEVICE.

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model_8_0.971.pth")
model.eval()
model.to(DEVICE)

The fourth step is to read each image and predict its category. Note that the image is read with the Image class from the PIL library. Do not use cv2, because the transforms do not support it.

path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    img = Image.open(path + file)
    img = transform_test(img)
    img.unsqueeze_(0)
    img = Variable(img).to(DEVICE)
    out = model(img)
    # Predict
    _, pred = torch.max(out.data, 1)
    print('Image Name: {}, predict: {}'.format(file, classes[pred.data.item()]))

The complete test code:

import torch.utils.data.distributed
import torchvision.transforms as transforms
from PIL import Image
from torch.autograd import Variable
import os

classes = ('Black-grass', 'Charlock', 'Cleavers', 'Common Chickweed',
           'Common wheat', 'Fat Hen', 'Loose Silky-bent',
           'Maize', 'Scentless Mayweed', 'Shepherds Purse', 'Small-flowered Cranesbill', 'Sugar beet')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model_8_0.971.pth")
model.eval()
model.to(DEVICE)

path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    img = Image.open(path + file)
    img = transform_test(img)
    img.unsqueeze_(0)
    img = Variable(img).to(DEVICE)
    out = model(img)
    # Predict
    _, pred = torch.max(out.data, 1)
    print('Image Name: {}, predict: {}'.format(file, classes[pred.data.item()]))

Running result:

Testing: the second approach

The second approach reads the images with the custom Dataset. The first three steps are the same as above; the difference is mainly in the fourth step, where the data is read through the SeedlingData class.

# SeedlingData comes from the dataset.py written earlier
dataset_test = SeedlingData('data/test/', transform_test, test=True)
print(len(dataset_test))
# predict the label of each image
for index in range(len(dataset_test)):
    item = dataset_test[index]
    img, label = item
    img.unsqueeze_(0)
    data = Variable(img).to(DEVICE)
    output = model(data)
    _, pred = torch.max(output.data, 1)
    print('Image Name: {}, predict: {}'.format(dataset_test.imgs[index], classes[pred.data.item()]))

Running result:

That is all for "How to Implement Plant Seedling Classification with ConvNeXt". Thank you for reading!
