Weeds are a significant threat to agricultural production. Weed classification systems based on image analysis have offered innovative solutions to agricultural problems, with convolutional neural networks (CNNs) playing a pivotal role in this task. However, CNNs are limited in their ability to capture global relationships in images due to their localized convolutional operation. Vision Transformers (ViT) and Pyramid Vision Transformers (PVT) have emerged as viable solutions to overcome this limitation. Our study aims to determine the effectiveness of CNN, PVT, and ViT in classifying weeds in image datasets. We also examine if combining these methods in an ensemble can enhance classification performance. Our tests were conducted on significant agricultural datasets, including DeepWeeds and CottonWeedID15. The results indicate that a maximum of 3 methods in an ensemble, with only 15 epochs in training, can achieve high accuracy rates of up to 99.17%. This study demonstrates that high accuracies can be achieved with ease of implementation and only a few epochs.
Rozendo G.B., Roberto G.F., do Nascimento M.Z., Alves Neves L., Lumini A. (2024). Weeds Classification with Deep Learning: An Investigation Using CNN, Vision Transformers, Pyramid Vision Transformers, and Ensemble Strategy. Cham : Springer [10.1007/978-3-031-49018-7_17].
Weeds Classification with Deep Learning: An Investigation Using CNN, Vision Transformers, Pyramid Vision Transformers, and Ensemble Strategy
Lumini A.
2024
Abstract
Weeds are a significant threat to agricultural production. Weed classification systems based on image analysis have offered innovative solutions to agricultural problems, with convolutional neural networks (CNNs) playing a pivotal role in this task. However, CNNs are limited in their ability to capture global relationships in images due to their localized convolutional operation. Vision Transformers (ViT) and Pyramid Vision Transformers (PVT) have emerged as viable solutions to overcome this limitation. Our study aims to determine the effectiveness of CNN, PVT, and ViT in classifying weeds in image datasets. We also examine if combining these methods in an ensemble can enhance classification performance. Our tests were conducted on significant agricultural datasets, including DeepWeeds and CottonWeedID15. The results indicate that a maximum of 3 methods in an ensemble, with only 15 epochs in training, can achieve high accuracy rates of up to 99.17%. This study demonstrates that high accuracies can be achieved with ease of implementation and only a few epochs.File | Dimensione | Formato | |
---|---|---|---|
paper_weeds_final-2.pdf
embargo fino al 27/11/2024
Tipo:
Postprint
Licenza:
Licenza per accesso libero gratuito
Dimensione
9.24 MB
Formato
Adobe PDF
|
9.24 MB | Adobe PDF | Visualizza/Apri Contatta l'autore |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.