Automated Fish Classification Using Convolutional Neural Networks
Abstract Underwater habitats are home to about 34,800 fish species. Understanding the various fish species and their therapeutic properties is crucial. Most bodies of water hold large numbers of fish, and fish can be found in almost all aquatic settings, from high alpine streams (such as char and gudgeon) to the abyssal and even hadal depths of the deepest oceans (such as cusk-eels and snailfish), although no species has yet been discovered in the deepest 25% of the ocean. In this research, a model for categorizing different fish species was devised; the species are categorized in order to ascertain their possible health advantages. Transfer learning is used to adapt pretrained models to our data, which is fed to convolutional neural networks (CNNs), the visual geometry group network (VGG), the residual neural network (RESNET), and the densely connected neural network (DENSENET) with a moderate amount of change in the output layer. The dataset is also input to a CNN model with five sequential convolutional layers that uses the Adam optimizer and the ReLU and softmax activation functions. The VGG model is a convolutional neural network with sixteen layers and softmax activation at the output. Transfer learning is applied to RESNET and DENSENET by changing their last layers: the DENSENET used here has 121 dense, forward-connected layers and is trained with the Adam optimizer, while the 50-layer RESNET employs skip connections with ReLU and softmax activation functions. The classic CNN obtains an accuracy of 98.11%, VGG achieves 99%, RESNET 99.56%, and DENSENET 98.78%.
1 Introduction
The value of the global fish industry is in the billions of dollars. Fishermen must be aware of the fish and their prices because fishing is a seasonal occupation, and the species composition and distribution of fish are important biological data for fisheries research. The Cambrian period saw the emergence of the first soft-bodied chordates, the earliest animals that can be categorized as fish. They had notochords, which allowed them to be nimbler than their invertebrate counterparts, even though they lacked a true spine. Throughout the Paleozoic era, fish kept evolving and branching into a vast range of forms, and many evolved exterior defenses against predators. The first fish with teeth emerged during the Silurian period, and many of them, including sharks, became ferocious marine predators rather than just prey for arthropods.
Advances in deep learning techniques, most notably convolutional neural networks (CNNs), have rendered previous state-of-the-art machine learning methods in the field of computer vision obsolete. CNNs have enabled deep learning to advance significantly in the processing of visual data and are used in a wide variety of applications. CNNs can be used to classify images and to solve computer vision-related problems; they can be trained to recognize a wide range of complex features, which enables us to categorize pictures in a timely and precise manner. Image recognition and detection is a classic machine learning problem, and it is a very challenging task to detect an object or to recognize an image from a digital image or a video [1]. An important aspect of a CNN is that increasingly abstract features are obtained as the input propagates toward the deeper layers. For example, in image classification, edges might be detected in the first layers, then simpler shapes in the subsequent layers, and then higher-level features such as faces [2]. Machine learning has changed significantly in recent years due to the growth of artificial neural networks (ANNs). These biologically inspired computational models can improve performance [3]. CNNs are a modification of the multilayer perceptron that takes a distinctive approach to regularization, which places them at the lower end of the connectivity and complexity spectrum. A residual neural network (RESNET) is an artificial neural network (ANN) built around skip connections. It is a gateless or open-gated variant of the Highway Net, an extremely deep feedforward neural network with hundreds of layers, and it was the first working neural network of such depth. Shortcuts or skip connections are used to bypass some layers (Highway Nets may also learn the skip weights themselves via an additional weight matrix for their gates). Typical RESNET models are built with batch normalization between double- or triple-layer skips that incorporate ReLU nonlinearities. DENSENET models have numerous parallel skips. When discussing residual neural networks, a non-residual network is referred to as a plain network.
The data is the “A Large-Scale Fish Dataset” found on the Kaggle Web site. Images of trout, shrimp, red mullet, horse mackerel, Black Sea sprat, gilt-head bream, and red sea bream are included in the dataset; in total, there are nine distinct species. The remaining part of the paper is organized into the following sections.
2 Literature Survey
The study of fish image identification has advanced significantly, and many researchers have recommended deep CNN models for classifying fish images. Table 1 summarizes the literature survey; it shows that the proposed models use different techniques to classify fish, and most of them address underwater classification of fish species through different deep learning techniques.
3 Methodology
This section explains the background information on data preparation, the CNN approaches chosen to build the prediction systems, and the layout of their evaluation. The aim of the paper is to implement classification models for 9 species of fish. Dataset: the dataset is “A Large-Scale Fish Dataset” from the Kaggle Web site. It includes image samples of gilt-head bream, red sea bream, sea bass, red mullet, horse mackerel, Black Sea sprat, striped red mullet, trout, and shrimp.
Here, all of the images were resized to 256 × 256 pixels in order to fit them into the CNN model, and each pixel’s value was rescaled from the range 0–255 to the range 0–1 [13]. As there are 1000 samples of each species of fish, the final dataset consists of 9000 image samples [14]. There is no need for any augmentation because the collection already contains well-augmented images [14]. Figure 1 shows the procedure followed for all the models. The dataset is first divided into training, validation, and test sets. The convolutional neural network layers receive the training and validation sets, which are used to train the model. After the model has been trained, it is tested to see how accurate it is, and the final step is to classify the fish species in the test dataset.
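As a concrete illustration of this pipeline, the following is a minimal sketch assuming TensorFlow/Keras and a directory layout with one sub-folder of images per species; the folder path and the 80/20 validation split are assumptions, not details given in the paper:

```python
import tensorflow as tf

IMG_SIZE = (256, 256)   # images are resized to 256 x 256 pixels
BATCH_SIZE = 100        # batch size used for the CNN experiments

# Rescale pixel values from 0-255 to 0-1 and hold out part of the data for validation.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1.0 / 255, validation_split=0.2)

train_gen = datagen.flow_from_directory(
    "fish_dataset", target_size=IMG_SIZE, batch_size=BATCH_SIZE,
    class_mode="categorical", subset="training")
val_gen = datagen.flow_from_directory(
    "fish_dataset", target_size=IMG_SIZE, batch_size=BATCH_SIZE,
    class_mode="categorical", subset="validation")
```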
Table 1 (continued)

Author and year | Aim | Methodology | Conclusion and gaps
Rathi et al. [10] | They introduced a framework based on deep learning and image processing methods | Convolutional neural networks, deep learning, and image processing are used in this method as a novel approach | Background noise and other water bodies made it difficult to properly classify numerous images
Saitoh et al. [11] | To accept a photo of a fish against the complex background of a rocky outcrop | 1. Normalization; 2. Bag-of-visual-words model; 3. Texture features; 4. Recognition method: random forest (RF) | They studied efficient features for fish detection and evaluated the proposed method with a large dataset
Cao et al. [12] | They introduced a method to classify marine animals using combined CNN and hand-designed image features | 1. DeCAF; 2. Hand-designed features; 3. Feature selection; 4. Feature combination | They used a combination of CNN and hand-designed image features for new feature creation in classification
Convolutional neural networks (CNNs) [15, 16] are a common type of artificial neural network used for object and image recognition and classification; deep learning therefore uses a CNN to classify objects in an image. CNNs can analyze complicated objects and patterns because they have an input layer, an output layer, multiple hidden layers, and hundreds of thousands of parameters. An activation function is applied after the convolutional and pooling layers. The hidden layers are only partially connected, and the final fully connected layer is the output layer. The output shape is comparable to the size of the input image. Typically, the first layer extracts fundamental features such as edges that run horizontally or diagonally. The following layer receives this output and detects more complex features such as corners or multiple edges, and the network can then recognize increasingly complex elements, including whole objects. Figure 2 shows the CNN model used in this study, which comprises five sequential convolutional layers with ReLU as the activation function, each followed by a maxpool layer. Because there are nine classes for classification, the softmax activation function is used at the output layer, and the model is trained with the Adam optimizer for 20 epochs with a batch size of 100.
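A minimal Keras sketch of such a five-block CNN is given below. The filter counts and dense-layer width are assumptions, while the five Conv+ReLU+MaxPool blocks, the 9-way softmax output, the Adam optimizer, the batch size of 100, and the 20 epochs follow the description above:

```python
from tensorflow.keras import layers, models

def build_cnn(input_shape=(256, 256, 3), num_classes=9):
    model = models.Sequential([layers.Input(shape=input_shape)])
    for filters in (32, 64, 128, 128, 256):   # five sequential convolutional blocks
        model.add(layers.Conv2D(filters, (3, 3), padding="same", activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))   # nine fish classes
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

cnn = build_cnn()
# cnn.fit(train_gen, validation_data=val_gen, epochs=20)  # 20 epochs, batch size 100
```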
RESNET, short for residual network, is a common neural network that serves as the foundation for many computer vision applications. This model was the winner of the 2015 ImageNet challenge. RESNET represented a major development because it effectively enabled the training of very deep neural networks with more than 150 layers. Before RESNET, training very deep networks was challenging because of a significant drawback of convolutional neural networks, the vanishing gradient problem: the gradient value shrinks greatly during backpropagation, so the weights scarcely change at all. RESNET, which popularized the skip connection, is employed to get around this. Figure 3 shows ResNet-50, a convolutional neural network with 50 layers. The pretrained network, which was trained on more than a million images from the ImageNet database, is loaded, and using this pretrained network (transfer learning) [17], nine different categories of fish can be recognized in images. As a result, the network has comprehensive feature representations for a wide range of photos. Images of up to 256 × 256 pixels can be fed to the network.
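The sketch below illustrates this transfer-learning setup with Keras. Freezing the backbone and using global average pooling before the new 9-way softmax head are assumptions about the implementation, and the same helper can be reused for the VGG16 and DenseNet121 models discussed in the following paragraphs by swapping the backbone class:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50, VGG16, DenseNet121

def build_transfer_model(backbone_cls, input_shape=(256, 256, 3), num_classes=9):
    # Load ImageNet weights without the original 1000-way classification head.
    base = backbone_cls(weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False                         # keep pretrained features frozen
    x = layers.GlobalAveragePooling2D()(base.output)
    outputs = layers.Dense(num_classes, activation="softmax")(x)   # new 9-class head
    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

resnet_model = build_transfer_model(ResNet50)
# vgg_model = build_transfer_model(VGG16)
# densenet_model = build_transfer_model(DenseNet121)
```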
VGG16 is a CNN architecture that was a top performer in the 2014 ILSVRC (ImageNet) competition and is regarded as one of the best vision model architectures created to date. The most distinctive characteristic of VGG16 is that it consistently uses convolution layers with 3 × 3 filters, a stride of 1, and same padding, together with maxpool layers of 2 × 2 filters with a stride of 2. Convolution and maxpool layers are arranged in the same manner throughout the whole architecture, and it concludes with fully connected (FC) layers and a softmax output. The 16 in VGG16 denotes the fact that there are 16 layers with weights. A model of the network pretrained on more than a million images from the ImageNet database is used; with this pretrained network, nine different classes of fish can be recognized in images. As a result, the network has rich feature representations for a variety of photos. Images of up to 256 × 256 pixels can be fed to the network. Figure 4 shows the VGG [18] architecture, which consists of 16 layers with the final layer using softmax activation. For the test dataset, the accuracy rate is 99%.
DENSENET is a convolutional neural network [20] that links every layer to every layer below it: for instance, the first layer is connected to the second, third, fourth, and so on, and the second layer is connected to the third, fourth, fifth, and so on. The TensorFlow framework and large pretrained weight databases such as ImageNet were used to generate the pretrained model, and with this pretrained model we can classify the 9 species of fish (Fig. 5). The DENSENET architecture consists of 121 dense layers, and the softmax activation function of the output layer is used to divide the images into nine classes. A 98.78% accuracy rate was attained.
4 Results
This section outlines our findings from the multiple CNN methods discussed above. With the traditional CNN [21, 22] and some optimizations, the accuracy rate obtained is 98.11%. Figure 6 shows sample predictions: the actual image is a sea bass, and the predicted sea bass confidence is 100%; the actual image is a gilt-head bream, and the predicted gilt-head bream confidence is 99.9%; the actual image is a red mullet, and the predicted red mullet confidence is 100%; the actual image is a shrimp, and the predicted shrimp confidence is 99.9%; the actual image is a striped red mullet, and the predicted striped red mullet confidence is 100%; and the actual image is a horse mackerel, and the predicted horse mackerel confidence is 99.99%.
Figure 7 plots the training and validation accuracy and the training and validation loss, respectively.
Table 2 lists the individual species’ precision, recall, and F1-scores for the traditional CNN.
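The per-species precision, recall, and F1-scores in Tables 2–5 can be reproduced with scikit-learn along the lines of the sketch below; the test generator `test_gen` is a hypothetical object created like the training generators above but with shuffling disabled:

```python
import numpy as np
from sklearn.metrics import classification_report

# test_gen must be created with shuffle=False so predictions line up with labels.
probs = cnn.predict(test_gen)                    # softmax probabilities per image
y_pred = np.argmax(probs, axis=1)                # predicted class index
y_true = test_gen.classes                        # ground-truth class indices
print(classification_report(y_true, y_pred,
                            target_names=list(test_gen.class_indices.keys())))
```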
With RESNET50 and some optimizations, the achieved accuracy is 99.56%. Figure 8 shows sample predictions: the actual image is a Black Sea sprat, and the predicted Black Sea sprat confidence is 100%; the actual image is a gilt-head bream, and the predicted gilt-head bream confidence is 99.75%; the actual image is a red mullet, and the predicted red mullet confidence is 99.9%; the actual image is a shrimp, and the predicted shrimp confidence is 100%; the actual image is a sea bass, and the predicted sea bass confidence is 99.99%; and the actual image is a striped red mullet, and the predicted striped red mullet confidence is 87.39%.
Table 3 lists the individual species’ precision, recall, and F1-scores for RESNET50.
With VGG16 and some optimizations, the achieved accuracy is 99%. Figure 9 shows sample predictions: the actual image is a sea bass, and the predicted sea bass confidence is 100%; the actual image is a Black Sea sprat, and the predicted Black Sea sprat confidence is 100%; the actual image is a red mullet, and the predicted red mullet confidence is 100%; the actual image is a shrimp, and the predicted shrimp confidence is 100%; and the actual image is a striped red mullet, and the predicted striped red mullet confidence is 100%.
Table 4 lists the individual species’ precision, recall, and F1-scores for VGG.
With DENSENET and some optimizations, the achieved accuracy is 98.78%. Figure 10 plots the training and validation accuracy and the training and validation loss, respectively.
Figure 11 shows sample predictions: the actual image is a sea bass, and the predicted sea bass confidence is 92.64%; the actual image is a Black Sea sprat, and the predicted Black Sea sprat confidence is 100%; the actual image is a red mullet, and the predicted red mullet confidence is 79.83%; the actual image is a horse mackerel, and the predicted horse mackerel confidence is 99.47%; the actual image is a striped red mullet, and the predicted striped red mullet confidence is 79.63%; the actual image is a gilt-head bream, and the predicted gilt-head bream confidence is 98.99%; the actual image is a red sea bream, and the predicted red sea bream confidence is 100%; and the actual image is a trout, and the predicted trout confidence is 100%.
Table 5 lists the individual species’ precision, recall, and F1-scores for DENSENET.
Fig. 11 Actual and predicted images using densely connected convolutional networks
5 Discussion
Using VGG (99% accuracy) and RESNET (99.56% accuracy), we achieved better accuracy than the traditional convolutional neural network (98.11% accuracy) and DENSENET (98.78% accuracy) for fish classification. In terms of overall performance, RESNET achieved the highest accuracy, followed by VGG, while DENSENET achieved higher accuracy than the traditional CNN. Where DENSENET concatenates all the preceding feature maps, RESNET uses summation to connect them.
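The contrast can be made concrete with a small Keras sketch (illustrative only, not the exact blocks used in ResNet-50 or DenseNet-121); note that the residual addition requires the block output to have the same number of channels as its input:

```python
from tensorflow.keras import layers

def residual_block(x, filters):
    # RESNET: the transformed features are ADDED to the block input (skip connection).
    y = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, (3, 3), padding="same")(y)
    return layers.Activation("relu")(layers.Add()([x, y]))  # filters must match x's channels

def dense_layer(x, growth_rate):
    # DENSENET: the new feature maps are CONCATENATED with all preceding ones.
    y = layers.Conv2D(growth_rate, (3, 3), padding="same", activation="relu")(x)
    return layers.Concatenate()([x, y])
```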
DENSENET has one large benefit over traditional deep CNNs: the information is no longer washed out or lost by the time it reaches the network’s end, thanks to a straightforward connectivity scheme. Understanding how layers in an ordinary CNN are connected is important to appreciate this. The representational capacity of RESNET is restricted by the identity shortcut that stabilizes training, whereas DENSENET has greater capacity because of its multi-layer feature concatenation. Dense concatenation, however, creates a new problem by requiring expensive GPU memory and extra training time. With the introduction of VGG, accuracy and speed both improved drastically, mostly because of the increased model depth and the inclusion of pretrained models; nonlinearity grows with the number of layers with smaller kernels, which is always a good thing in deep learning. The benefit of RESNET in this setting, as mentioned, is that not every neuron in the RESNET design needs to fire at once, which drastically cuts down on training time and increases accuracy. After learning a feature once, the network does not attempt to learn it again; instead, it concentrates on learning additional features.
The accuracy achieved by DENSENET is lower compared with VGG and RESNET because the excessive connections not only decrease the networks’ computational and parameter efficiency but also make the networks more prone to overfitting. Compared with RESNET, DENSENET uses much more memory because the tensors from different layers are concatenated together. Like RESNET, DENSENET also relies strongly on batch normalization layers, and it can be difficult to add skip connections because the dimensionality of the various layers has to be taken into account. The accuracy of DENSENET is nevertheless higher than that of the traditional CNN because wide layers are not as necessary in DENSENET: the learned features have little redundancy thanks to the dense connectivity among the layers. All layers in a dense block share a common knowledge base, and the growth rate controls how much new information each layer contributes to the overall state. The fact that every 3 × 3 convolution can be enhanced with a bottleneck accounts for the second reason DENSENET has few parameters despite concatenating feature maps.
Table 6 Comparison of accuracies of different models

Sl No. | Model name | Accuracy (%)
1 | CNN | 98.11
2 | RESNET | 99.56
3 | VGG | 99
4 | DENSENET | 98.78
6 Conclusion
References
1. Chauhan R, Ghanshala KK, Joshi RC (2018) Convolutional neural network (CNN) for image
detection and recognition. In: First international conference on secure cyber computing and
communication (ICSCCC) 2018, pp 278–282. https://doi.org/10.1109/ICSCCC.2018.8703316
2. Albawi S, Mohammed TA, Al-Zawi S (2017) Understanding of a convolutional neural network.
In: International conference on engineering and technology (ICET), pp 1–6. https://doi.org/10.
1109/ICEngTechnol.2017.8308186
3. Samudre P, Shende P, Jaiswal V (2019) Optimizing performance of convolutional neural
network using computing technique. In: 5th international conference for convergence in
technology (I2CT), pp 1–4. https://doi.org/10.1109/I2CT45611.2019.9033876
4. Vo AT, Tran HS, Le TH (2017) Advertisement image classification using convolutional neural
network. In: 9th international conference on knowledge and systems engineering (KSE). https://
doi.org/10.1109/KSE.2017.8119458
5. Deep BV, Dash R (2019) Underwater fish species recognition using deep learning techniques.
In: 6th international conference on signal processing and integrated networks (SPIN). https://
doi.org/10.1109/SPIN.2019.8711657
6. Jin L, Liang H (2017) Deep learning for underwater image recognition in small sample size
situations. OCEANS 2017—Aberdeen. https://doi.org/10.1109/OCEANSE.2017.8084645
7. Ding G, Song Y, Guo J, Feng C, Li G, He B, Yan T (2017) Fish recognition using convolutional
neural network. OCEANS 2017—Anchorage
8. Shammi SA, Das S, Hasan M, Noori SRH (2021) FishNet: fish classification using convo-
lutional neural network. In: 12th international conference on computing communication and
networking technologies (ICCCNT). https://doi.org/10.1109/ICCCNT51525.2021.9579550
9. Akdemir KÜ, Alaybeyoğlu E (2021) Classification of red mullet, bluefish and haddock caught
in the black sea by “single shot multibox detection”. In: International conference on innovations
in intelligent systems and applications (INISTA). https://doi.org/10.1109/INISTA52262.2021.
9548488
10. Rathi D, Jain S, Indu S (2017) Underwater fish species classification using convolutional
neural network and deep learning. In: Ninth international conference on advances in pattern
recognition (ICAPR). https://doi.org/10.1109/ICAPR.2017.8593044
11. Saitoh T, Shibata T, Miyazono T (2015) Image-based fish recognition. In: 7th international
conference of soft computing and pattern recognition (SoCPaR). https://doi.org/10.1109/SOC
PAR.2015.7492817
12. Cao Z, Principe JC, Ouyang B, Dalgleish F, Vuorenkoski A (2015) Marine animal classifica-
tion using combined CNN and hand-designed image features. OCEANS 2015—MTS/IEEE
Washington. https://doi.org/10.23919/OCEANS.2015.7404375
13. Almero VJD, Concepcion II RS, Sybingco E, Dadios EP (2020) An image classifier for under-
water fish detection using classification tree-artificial neural network hybrid. In: RIVF interna-
tional conference on computing and communication technologies (RIVF). https://doi.org/10.
1109/RIVF48685.2020.9140795
14. Dey K, Hassan MM, Rana MM, Hena MH (2021) Bangladeshi indigenous fish classification
using convolutional neural networks. In: International conference on information technology
(ICIT), pp 899–904. https://doi.org/10.1109/ICIT52682.2021.9491681
15. Raihan A, Monju MZ, Hasan MM, Habib MT, Jabiullah MI, Uddin MS (2021) CNN modeling
for recognizing local fish. In: 24th international conference on computer and information
technology (ICCIT), pp 1–5. https://doi.org/10.1109/ICCIT54785.2021.9689898
16. Chi Z, Li Y, Chen C (2019) Deep convolutional neural network combined with concate-
nated spectrogram for environmental sound classification. In: 7th international conference
on computer science and network technology (ICCSNT), pp 251–254. https://doi.org/10.1109/
ICCSNT47585.2019.8962462
17. Pelletier S, Montacir A, Zakari H, Akhloufi M (2018) Deep learning for marine resources
classification in non-structured scenarios: training vs. transfer learning. In: Canadian confer-
ence on electrical & computer engineering (CCECE), pp 1–4. https://doi.org/10.1109/CCECE.
2018.8447682
18. Zhou J, Xiao D, Zhang M (2019) Feature correlation loss in convolutional neural networks
for image classification. In: 3rd information technology, networking, electronic and automation
control conference (ITNEC), pp 219–223. https://doi.org/10.1109/ITNEC.2019.8729534
19. Zheng Z, Guo C, Zheng X, Yu Z, Wang W, Zheng H, Fu M, Zheng B (2018) Fish recogni-
tion from a vessel camera using deep convolutional neural network and data augmentation.
In: OCEANS—MTS/IEEE Kobe techno-oceans (OTO), pp. 1–5. https://doi.org/10.1109/OCE
ANSKOBE.2018.8559314
20. Malik S, Kumar T, Sahoo AK (2017) Image processing techniques for identification of fish
disease. In: 2nd international conference on signal and image processing (ICSIP), pp 55–59.
https://doi.org/10.1109/SIPROCESS.2017.8124505
21. Han SH, Lee KY (2019) Implemetation of image classification CNN using multi thread GPU.
In: international SoC design conference (ISOCC), pp 296–297. https://doi.org/10.1109/ISOCC.
2017.8368904
22. Malfante M, Mohammed O, Gervaise C, Dalla Mura M, Mars JI (2018) Use of deep features
for the automatic classification of fish sounds. In: 2018 OCEANS—MTS/IEEE Kobe techno-
oceans (OTO), pp 1–5. https://doi.org/10.1109/OCEANSKOBE.2018.8559276