Summer Intern Report
A Project Report Submitted in Complete Fulfillment of the Requirements for the Award
of the Degree of
Bachelor of Technology
in
Computer Science and Engineering
By
M. Vedavyas (N200988)
B. Gopi (N201028)
J. Venkata Manoj (N200682)
RAJIV GANDHI UNIVERSITY OF KNOWLEDGE TECHNOLOGIES
(A.P. Government Act 18 of 2008) RGUKT-NUZVID, Krishna Dist - 521202
Tel: 08656-235557 / 235150
CERTIFICATE OF COMPLETION
This is to certify that the work entitled “Indian Bird Species Image Classification” is
the bona fide work of students M. Vedavyas (N200988), B. Gopi (N201028), and
J. Venkata Manoj (N200682), carried out under supervision during the Summer Internship
at SkillDzire, as part of the Bachelor of Technology programme in the Department of Computer
Science and Engineering under RGUKT-IIIT Nuzvid.
The internship was successfully completed during the period May 2025 – July 2025
(8 weeks), involving comprehensive research and development work on:
Data collection, exploration, and visualization of bird species images
Preprocessing using resizing, normalization, and data augmentation techniques
Implementation of Convolutional Neural Network (CNN) models
Evaluation using accuracy, precision, recall, F1-score, and confusion matrices
Model comparison and hyperparameter tuning using Keras/TensorFlow callbacks
The work demonstrates an excellent understanding of image preprocessing, model evaluation
metrics, and the importance of balancing accuracy and class-specific performance
in multi-class image classification.
CERTIFICATE OF EXAMINATION
This is to certify that the work entitled, “Bird Species Image Classification Using
Convolutional Neural Networks” is the bona fide work of M. Vedavyas (N200988),
B. Gopi (N201028), and J. Venkata Manoj (N200682). The report has been examined
and approved as a study conducted and presented in a manner suitable for acceptance
as the major deliverable of the Summer Internship Programme (May 2025 – July 2025)
at SkillDzire. It is hereby recognized as a work satisfactorily performed in fulfillment of
the requirements for the award of the Bachelor of Technology degree.
This approval does not necessarily endorse or accept every statement made, opinion
expressed, or conclusion drawn, as recorded in this report. It solely signifies the acceptance
of this work for the purpose for which it has been submitted.
DECLARATION
We hereby declare that this internship work is a result of our independent effort and
has not been copied or reproduced from any other source. Where reference has been made
to external material, appropriate citations are given in the references section. The results
and content embodied in this report have not been submitted to any other university or
institute for the award of any degree or diploma.
Date:
Place:
LIST OF CONTENTS
1. Abstract
2. Introduction
2.1 Objective
2.2 Motivation for the Work
2.3 Real-world Applications
6. Implementation
6.1 Technologies/Libraries Used
6.2 System Hardware
6.3 Comparison with Existing Systems
7. Conclusions
8. Future Scope/Directions
9. References
LIST OF FIGURES
Figure 2: Output
1. ABSTRACT
2. INTRODUCTION
2.1 Objective
The main objective of this project is to develop a machine learning-based system capable
of accurately classifying Indian bird species from images. Specifically, the goals are:
To build a machine learning model that can distinguish between 25 different bird
species using image data.
To analyze the dataset for class imbalances and image variations, such as pose,
background, and lighting conditions.
To implement a robust preprocessing and augmentation pipeline to improve model
generalization.
To fine-tune a pretrained AlexNet model to achieve high classification accuracy.
To provide a practical tool for ornithologists, bird watchers, and wildlife researchers
for species identification.
This project leverages transfer learning and data augmentation to overcome challenges
like small sample sizes for rare species, diverse backgrounds, and intra-species variation.
2.3 Real-world Applications
The AI-powered bird classification system can be applied in multiple real-world scenarios:
2. Educational Tools: Help students and enthusiasts learn bird species quickly and
accurately through image recognition apps.
Example: Mobile apps, interactive guides, online learning platforms.
4. Agriculture and Pest Management: Identify bird species that impact crops
positively (pest predators) or negatively (fruit eaters), helping farmers make informed
decisions.
Example: Monitoring pest-controlling bird populations on farms.
6. AI-assisted Bird Alerts: Integrate with smart devices to alert users when specific
rare or endangered birds are nearby, supporting birdwatchers and researchers.
Example: Mobile notifications, smart binoculars, camera traps with AI.
3. RELATED/EXISTING WORKS
Feature-based methods: SIFT, HOG, and color histograms used with classifiers
like SVM and Random Forest.
Limitations: Require manual feature extraction, perform poorly with high intra-
class variation and complex backgrounds.
Example: Early bird identification systems using SVM achieved moderate accuracy
(60–75%) on small datasets.
CNN-based models: AlexNet, VGG, ResNet fine-tuned for bird species datasets.
Limitations: Model size and latency restrict practical usage; low robustness under
varying lighting, background, or bird poses.
3.2 Research Gaps Identified
Despite advances in deep learning for image classification, several research gaps remain
in bird species identification:
Limited Dataset Diversity: Most existing datasets are biased toward common
species, leaving rare or endangered birds underrepresented. This affects model
generalization in real-world scenarios.
Imbalanced Class Distribution: Many species have very few images compared
to others, causing models to be biased toward majority classes despite augmentation
techniques.
Lack of Real-time Usability: Most research focuses on accuracy but does not
address inference speed, which is critical for field applications like mobile bird
identification apps.
Explainability Issues: Current models act as black boxes, providing little
interpretability for species identification decisions, which is important for scientific
validation.
4. PROPOSED METHOD
4.1 Flowchart
The flowchart illustrates the step-by-step pipeline of the bird species classification system,
from dataset input to final prediction.
4.2 Explanation of Flowchart Components
The workflow for bird species classification consists of six major stages, each contributing
to building a reliable and accurate model:
Dataset Collection: Images of 25 Indian bird species are collected from Kaggle
and open-source repositories. Using multiple sources ensures diversity in lighting,
pose, and background, which improves the robustness of the dataset.
Data Preprocessing: All images are resized to 224 × 224 pixels (standard input
for AlexNet), normalized, and augmented using rotation, flipping, zooming, and
RandAugment. These steps help handle class imbalance, improve data variety, and
reduce overfitting.
Model Evaluation: Final evaluation is performed on a separate test set using
metrics such as accuracy, precision, recall, F1-score, and confusion matrix. Misclassified
samples are analyzed to highlight challenging classes and identify areas for further
improvement.
5. ALGORITHM AND WORKING
Class imbalance: Certain species, such as the Indian Peafowl, have a larger number
of images due to their popularity, whereas others such as the Indian Pitta are
underrepresented. This imbalance risks biasing the model towards majority classes.
Intra-class variation: Birds of the same species may appear in different
environments, seasons, and life stages. For example, juvenile birds often have distinct
plumage compared to adults.
Inter-class similarity: Many species share similar color palettes and feather
structures (e.g., different species of kingfishers), making fine-grained classification
challenging.
Understanding these dataset characteristics forms the baseline for effective preprocessing
and guides model selection and augmentation decisions.
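One common way to act on the class imbalance described above is to weight each class inversely to its frequency, so that errors on rare species count more during training. The sketch below is illustrative only: the image counts are hypothetical, the function name is our own, and the report does not state that class weighting was part of the pipeline.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return a weight per class, proportional to 1 / class frequency.

    Weights are scaled so that a perfectly balanced dataset gives 1.0
    for every class: weight = n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Hypothetical counts: one majority species and one minority species.
labels = ["Indian Peafowl"] * 300 + ["Indian Pitta"] * 100
weights = inverse_frequency_weights(labels)
```

In a PyTorch setting, weights of this kind could be supplied through the `weight` argument of `torch.nn.CrossEntropyLoss`, ordered by class index.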
5.2 Preprocessing Techniques
Data preprocessing is a crucial step that transforms raw image inputs into a structured
form suitable for training deep learning models. Without preprocessing, inconsistencies
in size, scale, and distribution would hinder training efficiency.
Resizing and Normalization: All images are resized to 224 × 224 pixels to match
AlexNet’s input layer requirements. Pixel intensities are normalized to a [0, 1] range by
dividing by 255. This ensures consistent input values and faster gradient convergence.
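The normalization step can be illustrated with a minimal, dependency-free sketch. It assumes 8-bit pixel values stored as nested lists; the function name and the tiny sample image are hypothetical and stand in for a resized 224 × 224 photo.

```python
def normalize_pixels(image):
    """Scale 8-bit pixel values (0-255) to floats in [0, 1].

    `image` is a nested list shaped [height][width][channels].
    """
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in image]

# A tiny 1 x 2 RGB "image" used only for illustration.
image = [[[0, 128, 255], [255, 255, 255]]]
normalized = normalize_pixels(image)
```

In a torchvision pipeline, `transforms.ToTensor()` applies the same scaling when converting an 8-bit image to a tensor.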
Data Augmentation: Given the class imbalance and dataset size, augmentation is
essential to artificially increase variability. Techniques applied include:
5.3 Model Development
Deep learning has transformed computer vision, particularly with Convolutional Neural
Networks (CNNs), which extract hierarchical spatial features from images. In this work,
transfer learning is adopted using AlexNet as the backbone.
AlexNet, proposed by Krizhevsky et al. (2012) [1], was a breakthrough architecture for
ImageNet classification. It consists of five convolutional layers and three fully connected
layers, and employs ReLU activations and dropout for regularization. The model’s pretrained
weights, trained on about 1.2 million ImageNet images, capture generic features such as edges,
textures, and shapes.
Why AlexNet?
Computationally less intensive than deeper networks (ResNet, DenseNet).
The pretrained AlexNet is adapted to this classification task. The final fully connected
layer is replaced with a dense layer containing 25 nodes, one for each species, with a
softmax activation:
$$P(y = j \mid x) = \frac{e^{z_j}}{\sum_{k=1}^{25} e^{z_k}}$$
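As a small worked illustration of the softmax expression, the sketch below converts 25 raw scores (logits) into class probabilities. The logit values are hypothetical, and subtracting the maximum before exponentiating is a standard numerical-stability step that does not change the result.

```python
import math

def softmax(logits):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    # Subtract the maximum logit first; this leaves the result unchanged
    # but avoids overflow in exp() for large scores.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for 25 classes: class index 3 has the highest score.
logits = [0.0] * 25
logits[3] = 2.0
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

The predicted species is the index with the largest probability, which is also the index of the largest logit.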
Hyperparameter Tuning:
Optimizer: Adam with a learning rate of 1e-4.
Lower convolutional layers are frozen to retain generic feature extraction, while higher
layers are fine-tuned to specialize in bird-specific features. This combination of transfer
learning and fine-tuning enables efficient training while maintaining high accuracy.
Confusion Matrix: A confusion matrix highlights which bird species are frequently
misclassified. For instance, visually similar species such as parakeets and bee-eaters
show higher confusion rates, suggesting the need for additional data or specialized
augmentations.
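To make the link between the confusion matrix and the per-class metrics concrete, the following sketch computes both from scratch. The labels are hypothetical (three classes rather than 25), and the helper names are our own; the report itself relies on Scikit-learn for these computations.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def per_class_metrics(matrix):
    """Return a (precision, recall, f1) tuple for each class."""
    n = len(matrix)
    results = []
    for c in range(n):
        tp = matrix[c][c]
        predicted_c = sum(matrix[r][c] for r in range(n))  # column sum
        actual_c = sum(matrix[c])                          # row sum
        precision = tp / predicted_c if predicted_c else 0.0
        recall = tp / actual_c if actual_c else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results

# Hypothetical labels: classes 1 and 2 are sometimes confused with each other.
y_true = [0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 2, 1, 2, 1, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
metrics = per_class_metrics(cm)
```

Off-diagonal entries of `cm` show which pairs of classes are confused, which is how pairs such as parakeets and bee-eaters would stand out.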
Training and Validation Trends: Plots of accuracy and loss across epochs are analyzed
to check convergence and detect overfitting. Early stopping ensures training halts
once validation accuracy stabilizes.
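The early-stopping rule can be sketched as a small helper that tracks the best validation accuracy seen so far. This is an assumption-laden illustration: the class name, the `patience` and `min_delta` parameters, and the accuracy values are hypothetical, since the report does not give the exact stopping criterion.

```python
class EarlyStopping:
    """Signal a stop when validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.stale_epochs = 0

    def step(self, val_accuracy):
        """Record one epoch's validation accuracy; return True to stop training."""
        if val_accuracy > self.best + self.min_delta:
            self.best = val_accuracy
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience

# Hypothetical validation accuracies, one per epoch.
history = [0.61, 0.72, 0.78, 0.78, 0.77, 0.78, 0.79]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, acc in enumerate(history, start=1):
    if stopper.step(acc):
        stopped_at = epoch
        break
```

With these values, training stops after three consecutive epochs without improvement over the best accuracy, so the final epoch in the list is never reached.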
Sample Code:
Output:
Figure 2: Output
Confusion-Matrix:
Figure 4: Accuracy
6. IMPLEMENTATION
6.1 Technologies/Libraries Used
PyTorch: Deep learning framework used for model development, training, and
fine-tuning of AlexNet.
Torchvision: Provides pretrained AlexNet model, dataset handling utilities, and
data augmentation transforms.
NumPy & Pandas: Used for numerical operations, dataset management, and
preprocessing tasks.
Matplotlib & Seaborn: Visualization libraries for plotting class distribution,
accuracy/loss curves, and confusion matrices.
Scikit-learn: Used for evaluation metrics such as precision, recall, F1-score, and
for generating confusion matrices.
6.2 System Hardware
Graphics Processing Unit (GPU): NVIDIA GeForce RTX 3060 with 12 GB
VRAM
Memory (RAM): 32 GB DDR4
6.3 Comparison with Existing Systems
The proposed bird classification system is compared with existing methods in terms of
dataset size, model architecture, accuracy, and preprocessing strategies. The summary
is shown in Table 1.
Table 1: Comparison with Existing Systems
7. CONCLUSION
In this work, a deep learning-based system for the classification of 25 Indian bird species
was developed using transfer learning with the AlexNet architecture. The methodology
involved systematic dataset collection, preprocessing with augmentation techniques,
fine-tuning of the pretrained model, and rigorous evaluation using multiple performance
metrics. The experimental results demonstrated that the proposed approach achieved
high accuracy while maintaining generalization capability, highlighting the effectiveness
of transfer learning for biodiversity-related image classification tasks. The analysis of
misclassified samples revealed challenges such as inter-class similarities, background clutter,
and limited samples for certain bird categories. These observations suggest that
future improvements may be achieved by leveraging larger and more balanced datasets,
advanced augmentation strategies, or the integration of more recent deep architectures
such as ResNet, DenseNet, or EfficientNet. Overall, the proposed system demonstrates
the potential of deep learning in assisting ornithological studies, wildlife monitoring, and
conservation initiatives by providing an automated and scalable solution for bird species
identification. This research also lays the groundwork for future applications in ecological
informatics, where AI-driven approaches can significantly contribute to biodiversity
preservation and environmental monitoring.
8. FUTURE SCOPE
Although the proposed system achieved promising results, there remain several opportunities
for improvement and expansion:
Larger and More Diverse Datasets: Expanding the dataset with additional
bird species and images captured in varied environments would enhance the model’s
robustness and improve its ability to generalize.
By addressing these areas, the system can evolve into a more accurate, scalable, and
practical solution, contributing meaningfully to ornithology, ecological monitoring, and
biodiversity conservation.
9. REFERENCES
[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep
convolutional neural networks,” Advances in Neural Information Processing Systems
(NeurIPS), pp. 1097–1105, 2012.
[2] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale
hierarchical image database,” IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 248–255, 2009.
[3] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, “The Caltech-UCSD
Birds-200-2011 Dataset,” Technical Report CNS-TR-2011-001, California Institute
of Technology, 2011.
[4] Kaggle, “Indian Bird Species Dataset,” Available: https://www.kaggle.com/ [Accessed:
Aug. 2025].
[5] C. Shorten and T. M. Khoshgoftaar, “A survey on Image Data Augmentation for
Deep Learning,” Journal of Big Data, vol. 6, no. 60, pp. 1–48, 2019.
[6] PyTorch Foundation, “PyTorch: An open source machine learning framework,”
Available: https://pytorch.org/ [Accessed: Aug. 2025].