SNS COLLEGE OF TECHNOLOGY
(An Autonomous Institution)
DEEP LEARNING ASSIGNMENT PHASE - II
NEWS CATEGORY CLASSIFIER USING GEN AI
NAME: VASANTH.A
DEPT: III AIML B
REG NO. : 713522AM114
ABSTRACT
This project presents a deep learning-powered News Category Classifier
enhanced with Generative AI capabilities for improved content understanding
and user interaction. Traditional classifiers often label articles without offering
deeper context. To address this, we utilize the distilbert-base-uncased
transformer model fine-tuned on a labeled news dataset, enabling accurate
categorization across domains such as sports, politics, business, and technology.
Integrated with an interactive Streamlit UI, users can input or paste any news
article, and the app processes the text through the model to predict its category.
To further enhance interpretability, the app optionally generates human-like
summaries or reasoning using a lightweight LLM accessed via the Hugging Face
API or Ollama CLI. This hybrid approach not only ensures high classification
accuracy but also bridges the explainability gap, making it ideal for media
houses, content aggregators, and academic research platforms.
INTRODUCTION
In an era where information is generated at an unprecedented pace, organizing
and categorizing news content efficiently has become a critical challenge for
media platforms and content aggregators. While traditional machine learning
models can classify articles into categories like politics, sports, or business, they
often fail to provide insights into why a certain classification was made. This
lack of explainability reduces trust and limits the system’s usability for editorial
teams and end-users.
This project addresses that gap using Generative AI. By fine-tuning a
transformer-based language model (DistilBERT) on labeled news datasets, and
optionally integrating a large language model (e.g., LLaMA 2 or Mistral via
Hugging Face), we enable the system not only to classify articles accurately but
also to justify its decisions in natural language. This approach enhances
transparency, improves user engagement, and makes the classifier suitable for
real-world deployment in journalism, academic research, and content
moderation.
PROCESS FOR IDENTIFICATION OF PROBLEM
STATEMENT
The following steps were followed:
❖ Domain Analysis:
➢ We examined existing news classification systems and observed
that while many models can label articles accurately, they lack
transparency and offer no rationale behind their decisions,
reducing user trust and engagement.
❖ Dataset Acquisition:
➢ We used a labeled dataset containing news headlines and article
content tagged with categories such as politics, sports, technology,
business, and more. This dataset enabled us to train and evaluate
the classifier on diverse real-world examples.
❖ Problem Framing:
➢ The key question became:
“Can we develop a news classifier that not only categorizes articles
correctly but also explains the reasoning behind each classification
using human-like language?”
❖ Model Choice Justification:
➢ DistilBERT, a lightweight and efficient transformer model,
was selected for its strong performance on text classification
tasks. It was further enhanced using a generative model to
generate natural language explanations.
ARCHITECTURE
STAGES OF DEVELOPMENT
Stage 1: Data Preprocessing
• Collected and cleaned news category dataset (e.g., AG
News, BBC News, etc.)
• Transformed each record into natural language prompts
(e.g., “Classify the following news headline: ‘Stocks fall
amid inflation fears.’”)
• Split the dataset into training and testing sets for model
evaluation
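The Stage 1 steps above can be sketched as follows. This is a minimal illustration using only the standard library; the helper names (make_prompt, split_records) and the sample records are ours, not part of the actual dataset.

```python
# Sketch of Stage 1: turning raw (text, label) records into natural
# language classification prompts, then splitting into train/test sets.
import random

def make_prompt(headline: str) -> str:
    """Wrap a raw headline in the classification instruction."""
    return f"Classify the following news headline: '{headline}'"

def split_records(records, test_ratio=0.2, seed=42):
    """Shuffle a copy of the records and split into train/test lists."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

records = [
    ("Stocks fall amid inflation fears.", "Business"),
    ("Local team wins the championship.", "Sports"),
    ("Parliament passes new data privacy bill.", "Politics"),
    ("Chipmaker unveils next-gen AI processor.", "Technology"),
]
prompts = [(make_prompt(text), label) for text, label in records]
train, test = split_records(prompts)
```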
Stage 2: Fine-Tuning the LLaMA or Mistral Model
• Utilized Hugging Face’s transformers and datasets
libraries
• Fine-tuned with LoRA (Low-Rank Adaptation) for
efficient training
• Model options:
meta-llama/Llama-2-7b-chat-hf or
mistralai/Mistral-7B-Instruct
Instruction tuning format used:
{
  "prompt": "Classify this news: 'The Prime Minister addressed the nation regarding economic reforms...'",
  "response": "Politics. This is related to government and public administration."
}
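The prompt–response records above can be serialized into a JSONL training file, which is the shape most instruction-tuning scripts expect. A standard-library sketch; the CATEGORY_NOTES mapping and record contents are illustrative.

```python
# Sketch: serializing labeled articles into the instruction-tuning
# format shown above, one JSON object per line (JSONL).
import json

CATEGORY_NOTES = {
    "Politics": "This is related to government and public administration.",
    "Sports": "This is related to games, teams, and athletic events.",
}

def to_instruction_record(text: str, category: str) -> dict:
    """Build one prompt/response pair in the fine-tuning format."""
    return {
        "prompt": f"Classify this news: '{text}'",
        "response": f"{category}. {CATEGORY_NOTES.get(category, '')}".strip(),
    }

records = [
    ("The Prime Minister addressed the nation regarding economic reforms...", "Politics"),
    ("The striker scored twice in the final.", "Sports"),
]
lines = [json.dumps(to_instruction_record(t, c)) for t, c in records]
# with open("train.jsonl", "w") as f:
#     f.write("\n".join(lines))
```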
Stage 3: Frontend UI Development
• Designed using Streamlit
• Users input headlines/articles via st.text_input() or
st.text_area()
• Prompts are dynamically constructed and passed to the
model backend
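A minimal sketch of the Stage 3 interface. The build_prompt() helper and the exact prompt wording are illustrative; the Streamlit calls run when the script is launched via `streamlit run`, and the import is guarded so the helper stays usable without Streamlit installed.

```python
# Sketch of Stage 3: Streamlit front end with a pure prompt-builder.
def build_prompt(article: str) -> str:
    """Construct the classification prompt sent to the model backend."""
    return f"Classify the following news article and explain why:\n\n{article}"

try:
    import streamlit as st
except ImportError:  # lets build_prompt be imported without Streamlit
    st = None

if st is not None:
    st.title("News Category Classifier")
    article = st.text_area("Paste a news headline or article:")
    if st.button("Classify") and article.strip():
        with st.spinner("Classifying..."):
            prompt = build_prompt(article)
            # hand `prompt` to the model backend (see Stage 4)
```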
Stage 4: Model Invocation
• Prompt sent to Ollama CLI using:
ollama run news-classifier-model
• Model generates prediction + explanation in human-readable format
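The invocation above can be wrapped in a small helper that shells out to the Ollama CLI. A sketch, not the project's exact backend code: the model name is taken from the command shown, and classify() only succeeds on a machine with Ollama and that model installed.

```python
# Sketch of Stage 4: sending a prompt to a local Ollama model.
import subprocess

MODEL = "news-classifier-model"

def build_command(prompt: str) -> list:
    """Assemble the `ollama run <model> <prompt>` invocation."""
    return ["ollama", "run", MODEL, prompt]

def classify(prompt: str) -> str:
    """Run the model locally; stdout holds the prediction + explanation."""
    result = subprocess.run(
        build_command(prompt), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()

cmd = build_command("Classify this news: 'Stocks fall amid inflation fears.'")
```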
Stage 5: Integration & Optimization
• Added @st.cache_resource to optimize model loading
• Integrated loading indicators, clean interface layout, and
category-wise color tags for better UX
PLATFORMS AND TOOLS INCORPORATED
1. Python
● Role: Backbone of the entire project
● Why Used: Python is the preferred language for AI/ML due to its
simplicity, extensive libraries, and vibrant ecosystem.
● Usage:
○ Data preprocessing (pandas, numpy)
○ Model training and prompt engineering
○ Streamlit app development
○ Interfacing with Ollama CLI
2. Streamlit
● Role: Frontend user interface
● Why Used: Streamlit is a fast and easy way to build web apps for
machine learning and data science projects.
● Features Used:
○ st.text_input() and st.text_area() for collecting the news
headline or article
○ st.chat_message() to simulate a ChatGPT-like interaction
○ st.spinner() to show loading status during inference
● Outcome: Created a clean, interactive UI where users paste a news
headline or article and receive the predicted category along with a
conversational explanation from the AI model.
3. Hugging Face Transformers
● Role: Model fine-tuning and inference pipeline
● Why Used: Hugging Face provides tools for loading, training, and
deploying large language models like LLaMA.
● Usage:
○ Downloading the base meta-llama/Llama-2-7b-chat-hf
model
○ Converting structured data into instruction-format prompts
○ Fine-tuning the model with domain-specific data
○ Creating inference pipelines to integrate with the UI
4. Ollama CLI
● Role: Lightweight runtime to serve the LLaMA model locally
● Why Used: Traditional deployment of LLaMA models requires heavy
GPU servers. Ollama simplifies this by offering containerized, fast local
inference.
● Usage:
○ Hosting the fine-tuned LLaMA model using a Modelfile
○ Responding to prompts sent from the Streamlit app
Command Example:
ollama run llama-custom-model
5. CUDA / GPU (Optional but Recommended)
● Role: Hardware acceleration for training large models
● Why Used: Fine-tuning a model like LLaMA-2-7B requires high memory
and compute, which CPUs can’t handle efficiently.
● Toolkits: NVIDIA CUDA, cuDNN
● Outcome: Enabled faster training during the model fine-tuning phase
6. Pandas
● Role: Data analysis and preprocessing
● Why Used: Easy handling of structured data in tabular format (CSV
files).
● Usage:
○ Read the labeled news dataset (CSV format)
○ Cleaned text fields and normalized category labels
○ Merged headline and article text into training prompts
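The pandas preprocessing described above can be sketched as follows, using a small inline DataFrame in place of the real CSV; the column names and sample rows are illustrative.

```python
# Sketch: cleaning a news DataFrame and merging fields into prompts.
import pandas as pd

df = pd.DataFrame({
    "headline": ["Stocks fall amid inflation fears.", "Team clinches title.", None],
    "category": ["business", "sports", "politics"],
})

# Drop rows with missing text and normalize category labels
df = df.dropna(subset=["headline"]).copy()
df["category"] = df["category"].str.strip().str.title()

# Merge the cleaned text into a training prompt column
df["prompt"] = "Classify the following news headline: '" + df["headline"] + "'"
```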
7. Scikit-learn
● Role: Data preparation and utility functions
● Why Used:
○ Used for splitting the dataset into training and testing sets
(train_test_split)
○ Label encoding for categorical features (if needed)
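The two scikit-learn utilities above can be shown in a few lines. The sample texts and labels are placeholders, not project data.

```python
# Sketch: splitting data and encoding category labels with scikit-learn.
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

texts = ["stocks fall", "team wins", "bill passed",
         "new chip", "match drawn", "budget vote"]
labels = ["business", "sports", "politics",
          "technology", "sports", "politics"]

# LabelEncoder maps category names to integer ids (sorted alphabetically)
encoder = LabelEncoder()
y = encoder.fit_transform(labels)

# train_test_split keeps text/label pairs aligned across the split
X_train, X_test, y_train, y_test = train_test_split(
    texts, y, test_size=0.33, random_state=42
)
```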
8. LoRA (Low-Rank Adaptation)
● Role: Parameter-efficient fine-tuning
● Why Used: Fine-tuning large models like LLaMA from scratch is
memory-intensive. LoRA reduces the number of trainable parameters.
● Library: peft from Hugging Face ecosystem
● Outcome: Reduced training cost and made fine-tuning feasible on a
mid-tier GPU
9. Hugging Face Datasets
● Role: Efficient data handling for model training
● Why Used:
○ Compatible with Hugging Face training pipeline
○ Fast I/O and in-memory caching for better training performance
● Usage:
○ Loaded prompt-response pairs in a format suitable for training the
LLaMA model
10. Modelfile (Ollama Specific)
● Role: Configuration file for serving LLaMA models using Ollama
● Why Used: Specifies base model, adapter files, and prompt formatting
Example:
FROM llama2
ADAPTER my-fine-tuned-model.bin
SYSTEM "You are a helpful news category classifier"
OUTCOME
INPUT
OUTPUT
SOURCE CODE