Intelligent Detection and Classification of Walnut Fungi Diseases Using Machine Learning

Abstract: Detecting and classifying diseases in agricultural crops is crucial for ensuring crop health and maximizing yields. In the case of walnut trees, timely identification of fungal diseases is essential to prevent their spread and minimize economic losses. Traditional methods of disease diagnosis rely on visual inspection by experts, which can be time-consuming and prone to human error. In this study, we propose an intelligent system for the detection and classification of walnut fungi diseases using machine learning techniques. The system leverages image processing and pattern recognition algorithms to automate the identification process, enabling rapid and accurate diagnosis. The system is trained using a dataset comprising high-resolution images of healthy walnut leaves and leaves affected by various fungal diseases. We extract relevant features from the images and employ machine learning algorithms, such as a back-propagation neural network (BPNN), to learn the complex patterns associated with each disease. To evaluate the performance of our system, we conducted extensive experiments on a real-world dataset, achieving high accuracy rates for disease detection and classification. The system successfully identifies common walnut fungi diseases, including walnut blight, anthracnose, and powdery mildew, with accuracy rates exceeding 90%. Moreover, it exhibits robustness against variations in lighting conditions and leaf orientations. Our results demonstrate the potential of machine learning techniques in revolutionizing disease diagnosis in the agricultural domain. The proposed system offers a cost-effective and scalable solution for farmers and agronomists to monitor the health of walnut trees, detect diseases at an early stage, and apply targeted treatments. By providing accurate and timely diagnoses, this technology can contribute to reducing crop losses and improving overall productivity in walnut cultivation.


Introduction
Agriculture is essential to a nation's economic and overall growth. This is particularly true in Ghana, where agriculture is one of the largest economic sectors, accounting for more than 18% of GDP and providing employment for 62% of the population, and where it is the primary source of income. Ghana's agricultural sector encompasses both subsistence farming and commercial farming. Small-scale farmers form a significant portion of the agricultural workforce, engaging in the production of staple crops such as maize, yam, cassava, millet, and sorghum. Additionally, cash crops like cocoa, oil palm, rubber, coffee, and fruits are cultivated for export and generate foreign exchange earnings. Livestock rearing, including poultry, cattle, sheep, and goats, is also an essential component of Ghana's agriculture sector [1]. The livestock industry provides a source of income, employment, and food security for many rural communities. The government of Ghana recognizes the importance of agriculture and has implemented various policies and initiatives to support its development. These include providing agricultural extension services, improving access to credit and markets, promoting mechanization, and investing in irrigation infrastructure. Efforts are also being made to enhance agricultural productivity through the adoption of modern farming techniques, improved seeds, and agrochemicals. Furthermore, there is a growing focus on sustainable agriculture practices, climate resilience, and value addition to agricultural products. Overall, Ghana's agriculture sector continues to play a vital role in the country's economy, providing livelihoods, ensuring food security, and contributing to rural development. The government's commitment to agricultural growth and development aims to further strengthen the sector and harness its potential for economic transformation and poverty reduction [2].
Currently, fungal diseases are a significant problem in agriculture worldwide, substantially decreasing the quantity and quality of agricultural products. Therefore, the fungal diseases that affect trees, particularly walnut trees, must be controlled. The Khyber Pakhtunkhwa province of Pakistan has 226 indigenous types of walnuts, which are mainly infected by fungal diseases [3]. Walnut production declined sharply in Pakistan from 2000 to 2017 because tree diseases, such as anthracnose, leaf blotch, and bacterial blight, affected the walnut trees.
Anthracnose is the most well-known leaf disease of walnut trees, and it is caused by the fungus Gnomonia leptostyla. More generally, anthracnose is a fungal disease that affects a wide range of plant species, including trees, shrubs, and crops, and is caused by various fungi, such as Colletotrichum spp. and Glomerella spp. [4]. Anthracnose can cause significant damage to plant foliage, stems, fruits, and flowers. The symptoms of anthracnose vary depending on the plant species affected, but common signs include the development of dark, sunken lesions on leaves, stems, or fruits. These lesions may have a concentric ring pattern and can enlarge over time. In severe cases, leaves may wilt, turn brown, and fall prematurely. Infected fruits may rot, become discolored, or develop raised bumps or lesions. Gnomonia leptostyla is spread by rain in the spring season. Anthracnose reduces the size, mass, and yield of nuts and causes premature leaf fall. The first symptom of anthracnose on a walnut leaf is circular brown lesions. Initially, this symptom is visible only on the leaf's underside, and it eventually spreads to both the upper and the lower leaf surfaces. Walnut anthracnose affects only walnuts and butternuts, which belong to the same genus, Juglans. There are also forms of anthracnose that can damage maples, oaks, shade trees, and other plants, such as tomatoes, beans, cucumbers, and squash, during the growing season [5].
The fungus Marssonina juglandis, whose severity changes from year to year, causes walnut leaf blotch. Walnut leaf blotch is a fungal disease that affects walnut trees. It is caused by the fungus Phyllosticta juglandis. This disease primarily targets the leaves of walnut trees and can cause significant damage if left untreated. Initially, small, round, brown spots a few millimeters in size appear on the leaf surface, and then these spots merge into larger blotches. The symptoms of anthracnose and leaf blotch on a walnut leaf are shown in Fig. 1. The majority of the current detection techniques for walnut disease rely only on visual inspection. The identification of diseases by the naked eye, however, takes time and is error-prone. Therefore, it is crucial to detect the disease in time. An automatic walnut disease detection system can be used to detect the disease at an early stage. In recent years, machine learning techniques have played a significant role in the agricultural field. To improve the accuracy and speed of diagnostic results, many machine learning-based algorithms have been used, such as k-means clustering, support vector machine (SVM), k-nearest neighbor, naive Bayes, and artificial neural networks (ANNs) [6,7]. This paper proposes an automatic detection method for walnut diseases, namely anthracnose and leaf blotch. The proposed method is implemented using a machine learning approach. The proposed model takes a walnut leaf image as an input and predicts and identifies the disease. The main contributions of this study can be summarized as follows: (1) a back-propagation neural network (BPNN) model is proposed to identify and classify walnut leaf diseases; (2) a leaf disease dataset, including the anthracnose and leaf blotch diseases, is designed to train and test the BPNN model; (3) the BPNN is compared with a multi-support vector machine (mSVM) in terms of accuracy, and the result shows that the BPNN model is comparatively better than the mSVM.

Related Work
Leaf disease identification has been a crucial problem and a significant concern in the agricultural sector for a long time. Machine learning algorithms have been widely used for disease detection in agriculture in recent years. Khalesi et al. [8] used an automatic system to classify the Kaghazi and Sangi genotypes of Iranian walnuts. The features were extracted using the fast Fourier transform (FFT), and principal component analysis (PCA) was applied to the extracted features. Finally, a multilayer feedforward neural network was used for classification. The classification accuracies for the Sangi and Kaghazi genotypes were 99.64% and 96.56%, respectively.
Chuanlei et al. [6] developed an automatic method to diagnose apple leaf diseases using image processing techniques and pattern recognition methods. The RGB image was first converted to the HSI gray image as a pre-processing step. The image of an infected leaf was segmented by the region growing algorithm (RGA). Then, features, including texture, shape, and color features, were extracted from the segmented area of the leaf image. Finally, the SVM classifier was used for classification and detection, achieving an accuracy of 90%. Tigadi et al. [9] proposed an automatic method for detecting banana plant diseases, such as Yellow Sigatoka, Black Sigatoka, Panama Wilt, Bunchy top, and Streak virus, by applying image processing techniques. Initially, a digital camera captured images of banana leaves with various diseases. Pre-processing techniques, such as image resizing, cropping, and color conversion, were used. Further, two types of features, the color features and the histogram of template (HOT), were extracted. Finally, the banana diseases were classified by the trained artificial neural network. Bhange et al. [10] used a web-based tool for the identification of pomegranate fruit disease. First, an image was resized, and then features, such as morphology, color, and concave-convex variation (CCV), were extracted. Next, the disease area was segmented by the k-means clustering algorithm. Finally, the SVM was used for classification and detection. The proposed system achieved an accuracy of 82% in pomegranate disease identification.
Waghmare et al. [11] proposed an automatic system for detecting major grape diseases, such as downy mildew and black rot, from a grape leaf image. Pre-processing steps were applied to the input image to make it suitable for further processing. Additionally, the background was removed from the image, and the RGB color space was converted to the HSV color space. Further, the affected area was segmented from the leaf image, and the texture, color, shape, and edge features were extracted by the first- and second-order statistical methods and the gray level co-occurrence matrix (GLCM). Finally, the extracted features were processed by the SVM classifier for classification. A detection accuracy of 96% was achieved. Awate et al. [12] proposed a fruit disease detection and diagnosis method based on image processing techniques. In this method, the k-means clustering algorithm was used for image segmentation, and the color, morphology, and texture features were extracted from the segmented image. Finally, an ANN-based classifier was used to identify and classify fruit diseases. Kusumandari et al. [13] presented a strawberry plant disease detection method. In their method, the input image quality was improved by pre-processing, and the RGB color space was converted into the HSV color space. After that, the regional method was used for the segmentation of the infected area of plant leaves. A detection accuracy of 85% was achieved.
Areni et al. [14] introduced an image processing-based method for the early detection of symptoms of pest attacks on cocoa fruits. First, a pre-processing step was conducted to enhance the input image quality and to convert the RGB model into the grayscale model. Further, image features were extracted using the Gabor kernel and stored in a database for comparison with the test sample. The results show 70% accuracy on the testing dataset.

Method
The flowchart of the proposed method is shown in Fig. 2. The proposed method uses a leaf image as an input and pre-processes it to enhance the image contrast. Next, it segments the leaf image using an image segmentation technique. The image features are then extracted from the segmented image and fed to the classifier to identify and categorize walnut leaf diseases. The proposed architecture consists of five modules. The details of each module are given in the following.

Image Pre-processing Module
This module aims to resize an RGB leaf image, improve its contrast, and transform the enhanced RGB image into the YUV color space. The proposed method requires images of the same size, whereas the raw input image size may vary. In addition, training is faster on smaller images. Therefore, all raw images are resized to 256 × 256 pixels, as shown in Fig. 3. After the image resizing process, two techniques, intensity adjustment and histogram equalization, are applied separately to enhance the image contrast. The results of the two enhancement techniques are shown in Fig. 4, where it can be seen that the result of histogram equalization is better than that of intensity adjustment. Therefore, histogram equalization is adopted to suppress noise and enhance the image features of the leaf's surface, such as lines, edges, and diseased spots. Next, the enhanced RGB color image is converted into the YUV color space using Eqs. (1)-(3). The results of the RGB-to-YUV color transformation of the walnut leaf images are presented in Fig. 5.
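The enhancement and color-conversion steps can be illustrated with a minimal pure-Python sketch. It is not the authors' MATLAB implementation: the function names are hypothetical, the equalization is the textbook cumulative-histogram mapping applied to a single channel, and the RGB-to-YUV coefficients are the widely used BT.601 values, which are assumed here because Eqs. (1)-(3) are not reproduced in this text. Resizing to 256 × 256 is omitted.

```python
def equalize_histogram(channel, levels=256):
    """Histogram-equalize a 2-D list of integer intensities in [0, levels - 1]."""
    flat = [p for row in channel for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of the intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in channel]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in channel]


def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV (BT.601 coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v
```

With these coefficients a gray pixel (R = G = B) maps to U = V = 0, so the V channel carries only the red-versus-green chromatic content that the later segmentation step thresholds.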

Image Segmentation Module
This module aims to segment the infected regions in the V channel of a YUV leaf image. Different methods, including k-means clustering, fuzzy algorithms, region-based methods, convolutional neural networks, the wavelet transform, and thresholding algorithms, have been used for image segmentation [15]. In the proposed method, the infected regions in a leaf image are segmented using the Otsu thresholding algorithm [16]. This algorithm returns a single intensity threshold that separates pixels into two classes: foreground and background [17,18]. The input image, that is, the V channel obtained from the YUV model, and its segmented image are shown in Fig. 6.
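As an illustration of the thresholding step, the following sketch implements the standard Otsu criterion (maximizing the between-class variance of the histogram) in pure Python. The function names are hypothetical, the channel is assumed to be quantized to integers in [0, 255], and treating pixels above the threshold as the infected (white) class is an assumption, since the text does not state the polarity.

```python
def otsu_threshold(channel, levels=256):
    """Return the intensity threshold that maximizes between-class variance."""
    flat = [p for row in channel for p in row]
    total = len(flat)
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]           # pixels with intensity <= t
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg       # pixels with intensity > t
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def segment(channel, threshold):
    """Binary mask: 1 (white) where the pixel exceeds the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in channel]
```

For example, `segment(v, otsu_threshold(v))` produces a binary mask comparable in spirit to the one shown in Fig. 6(b).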

Feature Extraction Module
In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction. The main goal of feature extraction is to obtain the most relevant information from the original data and represent it in a lower-dimensional space. The input data are transformed into a reduced set of features called the feature vector. In this process, relevant features are extracted from objects to form the feature vectors. Then, classifiers use the feature vectors to map the input unit to the target output unit. There are various features in the input images, and they are extracted to detect and classify leaf diseases. A feature often contains data related to color, shape, context, or texture. In this paper, the color, texture, and shape features are extracted from the segmented affected area of the leaf images.

Color Features
A color feature is one of the most widely used features in plant disease detection. The human visual system is more sensitive to color information than to gray levels. The color of an image can be represented through several color models, the most commonly used being RGB, HSV, and YUV. The color feature can be described by a color histogram [19], a color correlogram, or color moments [20]. In this paper, color moments are used to represent the color features of the V channel of the YUV model. The color moments include the mean, standard deviation, skewness, variance, kurtosis, and inverse difference moment (IDM). The extracted color features of the image presented in Fig. 6(b) are listed in Tab. 1. The features extracted from the input images are defined as follows.

Mean: The mean of an image denotes the average color in the image, which can be computed as follows:

$$\mu = \frac{1}{N} \sum_{m} \sum_{n} P_{mn}$$

where N denotes the total number of pixels in the image, P_{mn} denotes the value of the pixel located at (m, n) of the image, and μ represents the mean value.

Standard deviation: The deviation between the pixels of an input image is represented by the standard deviation, which is the square root of the variance of the color distribution and is calculated as follows:

$$\sigma = \sqrt{\frac{1}{N} \sum_{m} \sum_{n} \left(P_{mn} - \mu\right)^{2}}$$

where σ denotes the standard deviation.

Skewness: The skewness is a measure of the degree of asymmetry of the distribution about its mean, which provides information on the shape of the color distribution. In its standard normalized form, it can be computed as follows:

$$S = \frac{1}{N \sigma^{3}} \sum_{m} \sum_{n} \left(P_{mn} - \mu\right)^{3}$$

Variance: The variance is used as a measure of the gray level contrast to establish the relative component descriptors and is calculated as follows:

$$\sigma^{2} = \frac{1}{N} \sum_{m} \sum_{n} \left(P_{mn} - \mu\right)^{2}$$

Kurtosis: The kurtosis measures the peakedness of the distribution of a real-valued random variable and thus describes the shape of the probability distribution. In its standard normalized form, it can be calculated as follows:

$$K = \frac{1}{N \sigma^{4}} \sum_{m} \sum_{n} \left(P_{mn} - \mu\right)^{4}$$

IDM: The IDM is inversely related to the contrast measure; for similar pixel values, the IDM value is high. In its commonly used form, it is computed from the entries g_{ij} of the normalized gray level co-occurrence matrix (described under Texture Features) as follows:

$$\mathrm{IDM} = \sum_{i} \sum_{j} \frac{g_{ij}}{1 + (i - j)^{2}}$$
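The statistical moments above can be sketched in a few lines of pure Python. This is an illustrative sketch, not the authors' code: the function name is hypothetical, the population (1/N) normalization and the σ³/σ⁴ normalization of skewness and kurtosis are assumed standard forms, and the IDM is left out because it is defined on a co-occurrence matrix rather than directly on pixel values.

```python
import math


def color_moments(channel):
    """Mean, standard deviation, variance, skewness, and kurtosis of a 2-D channel."""
    flat = [float(p) for row in channel for p in row]
    n = len(flat)
    mean = sum(flat) / n
    variance = sum((p - mean) ** 2 for p in flat) / n
    std = math.sqrt(variance)
    if std == 0:
        # A constant channel has no asymmetry or peakedness to measure.
        skewness = kurtosis = 0.0
    else:
        skewness = sum((p - mean) ** 3 for p in flat) / (n * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in flat) / (n * std ** 4)
    return {
        "mean": mean,
        "std": std,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }
```

In practice the moments would be computed only over the segmented affected area of the V channel; restricting the pixel list to masked pixels is a small change to the first line of the function.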

Texture Features
Texture is an important low-level feature that divides an image into regions of interest (ROIs) for classification. It provides more details about a specific region in the image. Several methods can be used to describe the main texture properties, such as coarseness and regularity. The GLCM is one of the most important measures that can be used to describe the texture and to estimate the spatial dependency of the gray levels of an image [21]. In this paper, the texture features, including the contrast, correlation, energy, homogeneity, and entropy, are extracted from the walnut leaf through the GLCM. The GLCM features extracted from the image shown in Fig. 6(b) are given in Tab. 2. The formal definitions of the texture features used in this work are as follows.

Contrast: The contrast measures the intensity difference between a pixel and its neighboring pixel over the whole image, and it is zero for a constant image. It is also known as the variance or the moment of inertia and is computed as follows:

$$\mathrm{Contrast} = \sum_{i} \sum_{j} (i - j)^{2} \, g_{ij}$$

where g_{ij} represents the (i, j) entry of the normalized GLCM, and i and j represent the gray values indexing its rows and columns, respectively.

Correlation: The correlation estimates how correlated a pixel is with its neighboring pixel over the entire image and can be computed as follows:

$$\mathrm{Correlation} = \sum_{i} \sum_{j} \frac{(i - \mu_{i})(j - \mu_{j}) \, g_{ij}}{\sigma_{i} \sigma_{j}}$$

where μ_i and μ_j correspond to the means over rows i and columns j, respectively, and σ_i and σ_j correspond to the standard deviations over rows i and columns j, respectively.

Energy: The energy denotes the sum of squared elements in the GLCM, and it is one for a constant image. The energy is also known as the angular second moment, and it is calculated by

$$\mathrm{Energy} = \sum_{i} \sum_{j} g_{ij}^{2}$$

Homogeneity: The homogeneity is a measure of the closeness of the distribution of elements in the GLCM to the GLCM diagonal. It is calculated by

$$\mathrm{Homogeneity} = \sum_{i} \sum_{j} \frac{g_{ij}}{1 + |i - j|}$$

Entropy: The entropy is a measure of image complexity, and it measures the disorder of the GLCM. The value of the entropy is calculated by

$$\mathrm{Entropy} = -\sum_{i} \sum_{j} g_{ij} \log_{2} g_{ij}$$
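To make the texture descriptors concrete, the sketch below builds a normalized co-occurrence matrix and evaluates the five features in pure Python. The function names, the horizontal one-pixel offset, the non-symmetric counting, and the base-2 logarithm are all assumptions chosen for illustration; the text does not specify which offset or quantization the authors used.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of a 2-D list of gray levels in [0, levels - 1]."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(g):
    """Contrast, correlation, energy, homogeneity, and entropy of a normalized GLCM."""
    n = len(g)
    cells = [(i, j, g[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * p for i, _, p in cells)
    mu_j = sum(j * p for _, j, p in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p for i, _, p in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p for _, j, p in cells))
    if sd_i * sd_j > 0:
        correlation = sum((i - mu_i) * (j - mu_j) * p for i, j, p in cells) / (sd_i * sd_j)
    else:
        correlation = float("nan")  # undefined for a constant image
    return {
        "contrast": sum((i - j) ** 2 * p for i, j, p in cells),
        "correlation": correlation,
        "energy": sum(p * p for _, _, p in cells),
        "homogeneity": sum(p / (1 + abs(i - j)) for i, j, p in cells),
        "entropy": -sum(p * math.log2(p) for _, _, p in cells if p > 0),
    }
```

For a constant image this sketch returns a contrast of zero and an energy of one, which agrees with the properties stated in the definitions above.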

Shape Features
Shape features are significant because they describe an object in an image using its most important characteristics. The shape is one of the most important features for detecting infected walnut leaf images. In the walnut leaf disease images, it can be seen that the shapes of various types of diseases differ significantly. After the image segmentation process, the proposed model obtains the infected regions of the target disease, and then the areas of the infected regions are computed.
The percentage of the infected area of the walnut leaf image is the ratio of the area of the infected region in the leaf to that of the whole leaf. The following equation is used for computing the percentage of the infected region in a walnut leaf image:

$$\text{Infected area (\%)} = \frac{A_{1}}{A_{2}} \times 100$$

where A_1 and A_2 denote the total number of white pixels in the infected area of the leaf and the total number of pixels in the whole leaf image, respectively.
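The infected-area ratio reduces to counting white pixels in the binary mask produced by the segmentation step. A minimal sketch, assuming the mask is a 2-D list in which 1 marks an infected (white) pixel; the function name is hypothetical:

```python
def infected_percentage(mask):
    """Percentage of white (infected) pixels, A1 / A2 * 100, in a binary mask."""
    a1 = sum(sum(row) for row in mask)       # white pixels in the infected area
    a2 = sum(len(row) for row in mask)       # all pixels in the leaf image
    return 100.0 * a1 / a2
```

Following the definition of A2 in the text, the denominator counts every pixel of the image; if only leaf pixels should be counted, a separate leaf mask would be needed.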

BPNN Model
A BPNN is one of the most widely used neural network types for classification and prediction. The backpropagation algorithm searches for the minimum value of the error function in the weight space using gradient descent. The BPNN is a multilayer network consisting of an input layer, one or more hidden layers, and an output layer. Further, the BPNN plays an active role in agricultural disease recognition, and significant results have been achieved in this field [22,23]. In this work, the BPNN algorithm is used for walnut disease identification and classification.
The BPNN model is trained using a dataset comprising 70% of all images. Pre-processing, segmentation, and feature extraction are applied to all images, and the feature vector presented in Tab. 3 is obtained. The feature vector is fed to the network input layer. The input layer consists of 14 neurons because 14 features are used as the input data. After being processed by the input layer, the features are fed to the hidden layer, which consists of 50 neurons. The sigmoid activation function is used in the output layer. The structure of the BPNN is shown in Fig. 7.
In the training process, the maximum number of training epochs was set to 1000, the inertia (momentum) coefficient was set to 0.8, and the learning rate was set to 0.01. The weighted outputs X_i W_{ij} of the hidden-layer neurons i are summed and added to the bias value θ_j of neuron j in the output layer to obtain the net input I_j of neuron j in the output layer, which is expressed as

$$I_{j} = \sum_{i} X_{i} W_{ij} + \theta_{j}$$

where X_i represents the input data of the output layer, which is the output data of the hidden layer; W_{ij} represents the weight of the connection between neuron i in the hidden layer and neuron j in the output layer; and θ_j represents the bias value of neuron j.
The input I_j passes through the activation function f of the output layer to produce the network output Z_j. In the present work, the sigmoid function is used as the activation function, and it is expressed by

$$Z_{j} = f(I_{j}) = \frac{1}{1 + e^{-I_{j}}}$$

The proposed model was developed using the MATLAB® 2016 software environment. The trained network was stored in the form of a .net1 file in MATLAB® 2016. The trained network was tested using the test dataset.
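The forward computation described above, I_j = Σ_i X_i W_ij + θ_j followed by the sigmoid, can be sketched in pure Python for a 14-50-3 network. Several details are assumptions made only for illustration: the three output neurons (one per class), the sigmoid in the hidden layer, the uniform random initialization, the dummy feature vector, and reading the "inertia coefficient" as the momentum term of the classic gradient-descent update. All function names are hypothetical, and the backward pass is omitted.

```python
import math
import random


def sigmoid(x):
    """Logistic activation f(I) = 1 / (1 + exp(-I))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Z_j = f(sum_i X_i * W_ij + theta_j); weights[i][j] links input i to neuron j."""
    return [
        sigmoid(sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j])
        for j in range(len(biases))
    ]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (initialization scheme assumed)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


def momentum_step(weight, gradient, previous_delta, learning_rate=0.01, momentum=0.8):
    """One gradient-descent update with momentum; returns (new_weight, delta)."""
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta


rng = random.Random(0)
w_hidden, b_hidden = init_layer(14, 50, rng)   # 14 input features -> 50 hidden neurons
w_out, b_out = init_layer(50, 3, rng)          # 50 hidden neurons -> 3 classes (assumed)

features = [0.5] * 14                          # placeholder feature vector
hidden = layer_forward(features, w_hidden, b_hidden)
scores = layer_forward(hidden, w_out, b_out)
predicted_class = scores.index(max(scores)) + 1  # class labels 1, 2, 3
```

With untrained random weights the predicted class is meaningless; the sketch only shows how the feature vector flows through the layers and how the learning rate of 0.01 and the coefficient of 0.8 would enter a weight update.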

mSVM
The SVM is one of the most popular supervised machine learning algorithms for object detection and image classification [24]. The SVM divides the training dataset into two classes and forms the optimum separating hyperplane. The feature vectors of images in the first class lie on one side of the hyperplane, and the feature vectors of images in the second class lie on the opposite side. The number of hyperplanes in the SVM depends on the number of classes. In this paper, the mSVM with a linear kernel function is used for walnut leaf disease detection and classification. The number of iterations was set to 500 during the training process of the mSVM. The same process used for training the BPNN is applied to the training dataset to obtain the feature vector, which is presented in Tab. 3. The feature vector was used as an input to the mSVM. By applying the mSVM, the feature vector was used to classify images into three classes: infected images I, infected images II, and non-infected images. The training set of the proposed model included images of three classes. The first class included the images of walnut leaves infected by anthracnose, the second class included the images of walnut leaves infected by the leaf blotch disease, and the third class included images of healthy walnut leaves. The first, second, and third classes were labeled as 1, 2, and 3, respectively, as presented in Tab. 4. The training set contained 70% of all the images of walnut leaves.
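One common way to extend a two-class linear SVM to three classes is the one-vs-rest scheme, in which one hyperplane is trained per class and the class with the largest decision value wins. The text does not say which multiclass scheme was used, so the sketch below is an assumption; the function names and the example weights are hypothetical, and the training of the hyperplanes is not shown.

```python
def decision_value(weights, bias, x):
    """Signed score of feature vector x with respect to one hyperplane: w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict_one_vs_rest(models, x):
    """Return the label whose hyperplane gives the largest decision value.

    models maps a class label to a (weights, bias) pair.
    """
    return max(models, key=lambda label: decision_value(*models[label], x))


# Toy hyperplanes for the labels 1 (anthracnose), 2 (leaf blotch), 3 (healthy).
toy_models = {
    1: ([1.0, 0.0], 0.0),
    2: ([0.0, 1.0], 0.0),
    3: ([-1.0, -1.0], 0.0),
}
```

In the setting of this paper, each weight vector would have 14 components, one per extracted feature.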

Results and Discussions
The walnut leaf images were processed, and the image features were extracted using the GLCM and the color moments. The extracted features were fed to the classifier to predict the walnut leaf diseases, namely, the anthracnose and blotch diseases. The collection of images for model development was the foremost and most crucial task in this work. Images of infected walnut leaves were captured on a black background using a high-resolution camera with a 180-dpi resolution. The images were stored in JPG or PNG format. First, an infected leaf was placed on the black background under an appropriate light source. To improve the image's clarity and brightness, it was ensured that reflection was eliminated and that the light was uniformly dispersed. The leaf images were appropriately zoomed to ensure that each image included a leaf and the background. The total data consisted of 3670 images of walnut leaves with a resolution of 512 × 512 pixels; 70% of all images were used for training, and the remaining 30% were used for testing. The overall data included 2415 images of walnut leaves with the anthracnose disease, 740 images of walnut leaves with the blotch disease, and 515 images of healthy walnut leaves. After training with the training dataset, the model was tested using the test dataset to verify its ability to recognize walnut leaf diseases.
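The 70/30 partition of the 3670 images can be illustrated with a short sketch. The random shuffling, the fixed seed, and the function name are assumptions; the text does not state how the images were assigned to the two sets or whether the split was stratified by class.

```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle items reproducibly and split them into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Class sizes reported in the text.
class_counts = {"anthracnose": 2415, "leaf_blotch": 740, "healthy": 515}
labels = [name for name, count in class_counts.items() for _ in range(count)]
train, test = split_dataset(labels)
```

With these counts, a 70% split yields 2569 training images and 1101 test images.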

Classifier Performance Analysis
The detection accuracy was used to evaluate the trained network's performance on the test set, and it was calculated by:

$$\mathrm{Accuracy} = \frac{TP + TN}{\mathrm{Total}} \times 100\%$$

where TP denotes the number of true positives, TN denotes the number of true negatives, and Total is the total number of test images.
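For a three-class problem, the accuracy above amounts to the share of test images whose predicted label matches the true label. The sketch below computes this overall figure together with a per-class recognition rate; interpreting the per-class values in the result tables as the fraction of each class that was correctly recognized is an assumption, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Overall accuracy in percent: correctly classified images / all test images."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def recognition_rate(y_true, y_pred, label):
    """Percent of the images of one class that were assigned to that class."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t == label]
    return 100.0 * sum(t == p for t, p in pairs) / len(pairs)
```

Because the three classes have very different sizes (2415, 740, and 515 images), the overall accuracy is dominated by the anthracnose class, which is why the per-class rates are worth reporting alongside it.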
BPNN classifier: The accuracy of the BPNN is presented in Fig. 8, where it can be seen that the BPNN achieved the highest accuracy of 97.8% for the anthracnose disease, while the accuracy values for the other two image classes were lower. The overall accuracy of the BPNN in walnut leaf disease detection was 95.3%. The accuracy values of the BPNN for the three image classes are given in Tab. 5.
Multiclass SVM: The accuracy of the mSVM in walnut leaf disease detection is shown in Fig. 9. The overall accuracy of the mSVM was 91.1%. The accuracy values of the mSVM for the three image classes are given in Tab. 6.

Comparison of BPNN and mSVM
The accuracies of the two classifiers, expressed in percent, are given in Tab. 7. The recognition rates of the BPNN classifier for the anthracnose, leaf blotch, and healthy leaves were 97.8%, 95.6%, and 93.33%, respectively; the overall accuracy of the BPNN was 95.3%. The recognition rates of the mSVM for the anthracnose, leaf blotch, and healthy leaves were 95.6%, 91%, and 86.7%, respectively. The overall accuracy of the mSVM was 91.1%. The accuracy comparison of the two models is shown in Fig. 10, where it can be seen that the BPNN outperformed the mSVM. This could be because the BPNN adjusts its parameters in each iteration to reduce the loss function. In contrast, in the mSVM, the detection result was mostly based on a fixed value and could not be changed by calculating the error rate. Furthermore, BP neural networks have strong self-learning and self-adaptive abilities and fast calculation speeds for large samples, which makes them well suited to predicting walnut leaf diseases. For these reasons, the result of the BPNN model was comparatively better than that of the mSVM.

Conclusion
This study proposes a machine learning-based technique for the identification and classification of fungal infections, specifically anthracnose and leaf blotch, in walnut leaves using image analysis. The proposed method follows a series of steps to process the input image and extract relevant features for disease detection. First, the input RGB image is pre-processed, and then it is converted to the YUV color space. Subsequently, the image is segmented using the Otsu thresholding algorithm, which separates the image into distinct regions based on pixel intensities. From the segmented images, color and texture features, along with the affected area, are extracted. To evaluate the performance of the proposed model, a dataset consisting of walnut leaf images is divided into training and testing sets, with 70% of the images used for training and 30% for testing. The model's effectiveness is compared with that of the mSVM (multiclass support vector machine) model, which is another machine learning approach commonly used in similar applications. Experimental results demonstrate the superiority of the proposed back-propagation neural network (BPNN) model over the mSVM model in terms of walnut leaf disease identification and classification. The proposed BPNN model shows promising results and outperforms the mSVM model in accurately detecting and categorizing walnut diseases.
The findings of this research suggest that the proposed machine learning-based method can be valuable for farmers and agricultural practitioners in the detection of walnut diseases. Additionally, the study acknowledges the existence of deep learning-based techniques in the literature, and further investigations will compare the proposed BPNN model with various other approaches. Overall, this research contributes to the development of a robust and efficient tool for walnut disease detection, displaying the potential of machine learning and deep learning techniques in the agricultural domain.

Figure 1: Symptoms of anthracnose and leaf blotch on a walnut leaf. (a) Leaf image containing the symptom of anthracnose (b) Leaf image containing the symptom of blotch

Figure 2: Architecture of our proposed technique

Figure 6: (a) Input image and (b) Result of segmentation

Figure 7: Structure of the BPNN