
Problem Solution and Tech Stack

This document outlines a strategic and technical guide for developing a solution for the HackRx 6.0 challenge, focusing on creating an intelligent decision-support tool using Large Language Models (LLMs) for unstructured documents. It details the necessary components, architectural patterns, and technology stack, emphasizing the importance of rapid development and a robust backend using Python, FastAPI, and LangChain. The proposed architecture, termed 'Evaluative RAG,' enhances traditional methods by incorporating logical evaluation and structured output to meet the challenge's requirements effectively.

Uploaded by

prakashwature

A Strategic and Technical Guide to Building the HackRx 6.0 Solution

Part I: Strategic Blueprint for a Winning Hackathon Entry

This report provides a comprehensive architectural and implementation blueprint for
tackling the HackRx 6.0 challenge. The objective is to guide the development of a
sophisticated system that not only meets but exceeds the problem statement's
requirements, positioning the final product as a standout entry. The approach
outlined here prioritizes rapid, robust development—a critical factor in a
time-constrained hackathon environment—by leveraging modern tools and proven
architectural patterns. The focus is on building a functional, impressive, and
demonstrable prototype that effectively showcases advanced AI capabilities.

1.1 Deconstructing the HackRx Challenge: A Systems-Level View

The problem statement for HackRx 6.0 calls for a system that leverages Large
Language Models (LLMs) to function as an intelligent decision-support tool for
unstructured documents like insurance policies, contracts, and legal agreements.1 A
thorough deconstruction of the requirements reveals five core functional components
that must be engineered:
1.​ Ingestion: The system must be capable of processing and understanding various
unstructured document formats, including PDFs, Word files, and emails.
2.​ Query Parsing: It must interpret natural language queries, which may be vague
or use informal shorthand (e.g., "46M" for "46-year-old male"), and extract key
structured details like age, procedure, location, and policy duration.
3.​ Retrieval: The system must perform a semantic search to find relevant clauses or
rules from the ingested documents. The problem explicitly states this should be
based on "semantic understanding rather than simple keyword matching," a
crucial distinction that mandates the use of vector search technologies.
4.​ Evaluation and Reasoning: This is the most complex and differentiating
requirement. The system must not simply return the retrieved information; it must
evaluate it based on the logic defined within the clauses to arrive at a "correct
decision," such as an approval status or a payout amount.
5.​ Structured Output: The final output must be a machine-readable, structured
JSON object containing the decision, any applicable amount, and a clear
justification that maps the decision back to the specific source clauses used.

While many systems can perform information retrieval, the true challenge and
opportunity for innovation lie in the "Evaluation and Reasoning" step. The problem
demands more than a question-answering bot; it requires a primitive reasoning
engine that can apply logical rules found in text to a given set of facts (the user's
query). A successful solution will demonstrate this capability clearly, moving beyond
simple information retrieval to automated, evidence-based decision-making. This
elevates the project from a standard RAG implementation to a more sophisticated
system, which will be a significant factor in judging.
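As a rough illustration of the Query Parsing requirement, the sketch below turns the shorthand from the example above into a structured record. The problem statement expects an LLM to do this work; the regular expressions here are only a deterministic stand-in that makes the target structure visible. The ParsedQuery field names and the positional guess for procedure and location are assumptions, not part of the challenge specification.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedQuery:
    """Hypothetical target structure for a parsed claim query."""
    age: Optional[int] = None
    gender: Optional[str] = None
    procedure: Optional[str] = None
    location: Optional[str] = None
    policy_duration_months: Optional[int] = None


# "46M" or "46 F": an age immediately followed by a gender initial.
AGE_GENDER = re.compile(r"^(\d{1,3})\s*([MF])$", re.IGNORECASE)
# "3-month policy", "3 month", "12months".
DURATION = re.compile(r"(\d+)\s*-?\s*month", re.IGNORECASE)


def parse_query(raw: str) -> ParsedQuery:
    parsed = ParsedQuery()
    free_text = []
    for token in (part.strip() for part in raw.split(",")):
        if not token:
            continue
        age_gender = AGE_GENDER.match(token)
        duration = DURATION.search(token)
        if age_gender:
            parsed.age = int(age_gender.group(1))
            parsed.gender = "male" if age_gender.group(2).upper() == "M" else "female"
        elif duration:
            parsed.policy_duration_months = int(duration.group(1))
        else:
            free_text.append(token)
    # Assumption: the first leftover token is the procedure, the second the location.
    if free_text:
        parsed.procedure = free_text[0]
    if len(free_text) > 1:
        parsed.location = free_text[1]
    return parsed


example = parse_query("46M, knee surgery, Pune, 3-month policy")
```

A rule-based parser like this breaks as soon as the word order changes, which is the reason the guide delegates the step to an LLM; the dataclass, however, is a reasonable shape for whatever the LLM returns.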

1.2 The Core Architectural Pattern: Advanced Retrieval-Augmented Generation (RAG)

The most suitable architectural pattern for this challenge is Retrieval-Augmented
Generation (RAG). The standard RAG pipeline consists of two main phases: an offline
Indexing phase and an online Retrieval and Generation phase.2
●​ Indexing: This involves loading documents, splitting them into manageable
chunks, converting these chunks into numerical representations (embeddings)
that capture their semantic meaning, and storing these embeddings in a
specialized database (a vector store).
●​ Retrieval and Generation: When a user submits a query, the system converts
the query into an embedding and uses it to retrieve the most semantically similar
chunks from the vector store. These retrieved chunks, along with the original
query, are then fed into an LLM, which generates a final, coherent answer.

To address the specific requirements of the hackathon, this standard pattern must be
extended. The proposed architecture can be termed "Evaluative RAG" or
"Decision-RAG." This model enhances the standard RAG flow by inserting explicit
steps for query parsing and, most importantly, by engineering the final generation
prompt to command the LLM to perform logical evaluation rather than simple
summarization.

The high-level flow of the Evaluative RAG system will be as follows:


1.​ A user submits a natural language query.
2.​ An initial LLM call parses the query into a structured data object.
3.​ This structured data is used to formulate a semantic search query.
4.​ The retriever fetches relevant clauses from the vector store.
5.​ A final, carefully constructed prompt containing the parsed query, the retrieved
clauses, and a directive to "evaluate and decide" is sent to the LLM.
6.​ The LLM processes this information and generates the required structured JSON
output, including the decision and its justification.

This architecture directly addresses the need for semantic search and logical
evaluation, setting it apart from more common RAG-based Q&A bots.3
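The six-step flow above can be sketched as a single orchestration function. This is a minimal outline, assuming each stage is supplied as a plain callable; the names evaluative_rag, Clause, and the three fake_* stubs are hypothetical, and the stubs return hard-coded values purely so the wiring can be run without an LLM or a vector store.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Clause:
    clause_id: str
    text: str


def evaluative_rag(
    raw_query: str,
    parse_query: Callable[[str], dict],
    retrieve: Callable[[str, int], List[Clause]],
    evaluate: Callable[[dict, List[Clause]], dict],
    top_k: int = 3,
) -> dict:
    # Step 2: turn the free-text query into structured fields.
    structured = parse_query(raw_query)
    # Step 3: build the semantic search text from the non-empty fields.
    search_text = " ".join(str(v) for v in structured.values() if v is not None)
    # Step 4: fetch the most relevant clauses.
    clauses = retrieve(search_text, top_k)
    # Steps 5-6: ask the evaluator for a decision grounded in those clauses.
    return evaluate(structured, clauses)


# Hard-coded stubs standing in for the LLM calls and the vector store.
def fake_parse(raw: str) -> dict:
    return {"age": 46, "procedure": "knee surgery"}


def fake_retrieve(text: str, k: int) -> List[Clause]:
    return [Clause("C4.2", "Knee surgery is covered after a 24-month waiting period.")][:k]


def fake_evaluate(structured: dict, clauses: List[Clause]) -> dict:
    return {
        "Decision": "rejected",
        "Amount": None,
        "Justification": [{"clause_id": c.clause_id, "text": c.text} for c in clauses],
    }


result = evaluative_rag("46M, knee surgery", fake_parse, fake_retrieve, fake_evaluate)
```

Passing the stages in as callables keeps each one independently testable, which helps when several team members work on parsing, retrieval, and evaluation in parallel.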

1.3 The Recommended Full-Stack Architecture: A Unified Monorepo Approach

For a hackathon project involving both a backend API and a frontend interface, a
monorepo structure is the most efficient approach. A monorepo is a single version
control repository that houses multiple distinct projects, such as the Python backend
and the React frontend.4 This unified structure offers several advantages that are
particularly valuable under tight deadlines:
●​ Simplified Setup: A single git clone and a unified setup script can initialize the
entire development environment.
● Streamlined Dependency Management: Tools like pnpm or yarn workspaces
manage the JavaScript dependencies and expose root-level scripts that can drive
both projects; the Python backend's dependencies stay in its requirements.txt.5
●​ Atomic Commits: Changes that affect both the frontend and backend (e.g., an
API modification) can be committed together, ensuring the repository is always in
a consistent state.
●​ Enhanced Collaboration and Code Sharing: It is easier to share code, types,
and configurations between the projects.6

A recommended directory structure for this project would be:


/hackrx-solution
├── .github/              # CI/CD workflows
├── .husky/               # Git hooks for code quality
├── backend/              # Python/FastAPI application
│   ├── app/
│   └── ...
├── frontend/             # React/Vite application
│   ├── src/
│   └── ...
├── package.json          # Root package file for monorepo scripts
└── pnpm-workspace.yaml   # pnpm workspace configuration

This structure, inspired by proven full-stack templates, provides a clean separation of
concerns while enabling a cohesive development workflow.7

A critical, second-order benefit of the monorepo approach is the ability to easily
implement end-to-end type safety. The FastAPI backend automatically generates an
OpenAPI specification, which defines the API's endpoints, request bodies, and
response schemas. Within the monorepo, a script can be configured to use a tool like
openapi-typescript to read this schema and automatically generate corresponding
TypeScript type definitions for the frontend API client.7 This creates a powerful
feedback loop: if a developer changes a Pydantic model in the backend, the
frontend's TypeScript types can be regenerated, and the TypeScript compiler will
immediately flag any part of the frontend code that is now inconsistent with the new
API contract. This automated, compile-time checking eliminates a vast category of
common runtime errors, dramatically accelerating development and debugging—a
decisive advantage in a hackathon.

Part II: The Intelligence Core: Building the Python & FastAPI Backend

The backend is the heart of this system, responsible for all the heavy lifting, from
document ingestion to the final AI-driven evaluation. The technology choices for this
layer must prioritize development speed, performance, and, most importantly, access
to a mature AI ecosystem.

2.1 Technology Stack Justification: Python, FastAPI, and LangChain

The recommended backend stack is a combination of Python, the FastAPI framework,
and the LangChain library. This stack is purpose-built for creating high-performance,
AI-powered applications quickly.
●​ Programming Language: Python. While Node.js is a capable backend
technology, particularly for real-time, I/O-bound applications 9, Python is the
undisputed leader in the AI and machine learning domain. Its ecosystem is
unparalleled, providing direct access to foundational libraries like Hugging Face
Transformers, PyTorch, and sentence-transformers, which are essential for tasks
like embedding generation and model interaction.11 Choosing Node.js would
necessitate complex and brittle workarounds, such as creating a separate Python
microservice for the AI components and bridging the two, adding unnecessary
complexity and latency.11 For any serious AI application, Python is the most direct
and powerful path.
●​ Web Framework: FastAPI. FastAPI is a modern, high-performance Python web
framework ideal for building APIs. Its key advantages for this project include:
○​ Asynchronous Support: FastAPI is built on ASGI (Asynchronous Server
Gateway Interface), allowing it to handle long-running I/O operations, like calls
to external LLM APIs, concurrently without blocking the server. This is critical
for building a responsive application.13
○​ Performance: It is one of the fastest Python frameworks available,
approaching the performance of Go and Node.js.10
○​ Automatic Documentation: It automatically generates interactive API
documentation (via Swagger UI and ReDoc) based on the code and Pydantic
schemas. This is invaluable for development, testing, and frontend
integration.7
○​ Type Hinting and Validation: Its deep integration with Pydantic ensures that
all incoming requests and outgoing responses are validated against defined
schemas, reducing bugs and improving code reliability.
●​ AI Orchestration Framework: LangChain. LangChain is a comprehensive
framework designed to simplify the development of applications powered by
LLMs. It provides a rich set of abstractions and pre-built components for every
stage of the RAG pipeline, including document loaders, text splitters, embedding
models, vector stores, and prompt templates.2 Using LangChain dramatically
reduces the amount of boilerplate code required, allowing the team to focus on
the core logic of the application rather than on low-level integrations.2
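The value of asynchronous support is easiest to see in isolation. The sketch below uses only Python's asyncio rather than FastAPI itself, and fake_llm_call is a hypothetical stand-in for an awaitable HTTP request to an LLM provider. While one call is waiting, the event loop is free to progress the others, which is the same mechanism an async def FastAPI endpoint relies on.

```python
import asyncio
from typing import List


async def fake_llm_call(prompt: str, delay: float = 0.05) -> str:
    # Stand-in for network latency; awaiting yields control to the event loop.
    await asyncio.sleep(delay)
    return f"answer to: {prompt}"


async def answer_all(prompts: List[str]) -> List[str]:
    # gather() starts every call concurrently and returns results in input order.
    return await asyncio.gather(*(fake_llm_call(p) for p in prompts))


answers = asyncio.run(answer_all(["claim A", "claim B", "claim C"]))
```

Because the three simulated calls overlap, the total wait is close to one delay rather than three.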

The following table provides a clear justification for these choices compared to
common alternatives.

Table 1: Backend Technology Stack Comparison

| Technology | Use Case | Key Strengths for HackRx | Key Weaknesses for HackRx |
|---|---|---|---|
| Python | General-purpose programming, AI/ML | Unmatched AI/ML ecosystem (Transformers, PyTorch). Direct access to all necessary libraries. Large talent pool. 11 | Slower than compiled languages for CPU-bound tasks, but this is not the bottleneck for this I/O-bound application. 19 |
| Node.js | Backend web services, real-time apps | Excellent for I/O-bound tasks and real-time communication. Unified JavaScript/TypeScript stack with frontend. 9 | Very limited native AI/ML ecosystem. Would require bridging to a Python service, adding complexity and latency. 11 |
| FastAPI | Modern Python API development | High performance, native async support, automatic API documentation, Pydantic-based data validation. Ideal for LLM-based services. 13 | Newer than other frameworks, so the community, while large, is not as tenured as Django's. |
| Flask/Django | General Python web development | Mature, stable, and very large communities. Django includes many "batteries-included" features. | Flask is minimalist and requires more boilerplate. Django is large and opinionated, which can slow down initial development. Neither has native async support as robust as FastAPI's. |
| LangChain | LLM application orchestration | Dramatically accelerates development with pre-built components for RAG. Simplifies complex prompt chains and agentic workflows. 2 | The high level of abstraction can sometimes make debugging complex chains more difficult. The library is evolving rapidly. |

2.2 Project Scaffolding: Structuring the Monorepo for Rapid Development

A well-organized project structure is essential for maintaining clarity and development
velocity. The following structure is recommended for the backend portion of the
monorepo, based on established best practices.7

/hackrx-solution
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── api/              # FastAPI routers and endpoints
│   │   │   ├── __init__.py
│   │   │   └── endpoints.py
│   │   ├── core/             # Configuration, settings, logging
│   │   │   ├── __init__.py
│   │   │   └── config.py
│   │   ├── services/         # Core business logic (RAG pipeline)
│   │   │   ├── __init__.py
│   │   │   └── rag_service.py
│   │   └── schemas/          # Pydantic models for API validation
│   │       ├── __init__.py
│   │       └── models.py
│   ├── data/                 # Directory for source documents (e.g., PDFs)
│   ├── vector_store/         # Directory for ChromaDB persistent storage
│   ├── main.py               # FastAPI application entrypoint
│   ├── requirements.txt
│   └── .env
└── ... (frontend and monorepo config)

To set up the backend environment, one would navigate to the backend directory,
create a Python virtual environment, activate it, and install the dependencies listed in
requirements.txt using pip install -r requirements.txt.4

2.3 The Ingestion Pipeline: Transforming Unstructured Policies into Searchable Knowledge

The ingestion pipeline is a critical offline process that prepares the source documents
for querying. It consists of three main steps: loading, splitting, and storing.

2.3.1 Universal Document Loading

The problem statement requires handling PDFs, Word files, and emails.1 To build a
flexible system that can handle these and potentially other formats, the
Unstructured library is the ideal choice. LangChain provides an
UnstructuredFileLoader that leverages this library to parse a wide variety of document
types with a single, consistent interface.17 This avoids the need to write separate
loading logic for each file type. The implementation would involve pointing the loader
to a directory (e.g., backend/data/) and iterating through the files to load them into
LangChain Document objects.
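The directory walk described above might look like the following sketch. It deliberately avoids importing LangChain so that it runs anywhere: parse_file is a hypothetical callable standing in for whatever loader does the real parsing, LoadedDocument is an illustrative substitute for LangChain's Document, and the suffix list is an assumption based on the three formats named in the problem statement.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, List

# Assumption: the formats named in the problem statement map to these suffixes.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".eml"}


@dataclass
class LoadedDocument:
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def load_directory(data_dir: Path, parse_file: Callable[[Path], str]) -> List[LoadedDocument]:
    documents = []
    # Sort so that ingestion order is reproducible between runs.
    for path in sorted(data_dir.rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            # Keep the file name so a decision can be traced back to its source.
            documents.append(LoadedDocument(parse_file(path), {"source": path.name}))
    return documents


# Demonstration with throwaway files and a trivial text reader as the parser.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "policy.pdf").write_text("Clause text")
    (root / "notes.txt").write_text("ignored")
    (root / "claim.eml").write_text("Subject: claim")
    docs = load_directory(root, parse_file=lambda p: p.read_text())
```

Recording the source file in the metadata at load time is what later allows the justification to point back to a specific document.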

2.3.2 The Art of Strategic Text Chunking

LLMs have a finite context window, meaning they can only process a limited amount of
text at once. Furthermore, for retrieval to be effective, the system needs to find small,
specific, and highly relevant passages of text, not entire documents. Therefore, the
loaded documents must be broken down into smaller chunks.2

The RecursiveCharacterTextSplitter from LangChain is a robust, general-purpose tool
for this task.2 It attempts to split text based on a prioritized list of separators (e.g.,
newlines, sentences, words) to keep semantically related text together. The two key
parameters to configure are chunk_size and chunk_overlap.


●​ chunk_size: The maximum size of each chunk (in characters). A common starting
point is between 500 and 1000 characters.20
●​ chunk_overlap: The number of characters to overlap between consecutive
chunks. This helps ensure that semantic context is not lost at the boundary
between two chunks. A typical value is 10-20% of the chunk size (e.g., 100-200
characters).20

These values are hyperparameters that can be tuned to optimize retrieval
performance for the specific dataset.
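To make the interaction between chunk_size and chunk_overlap concrete, the sketch below implements a simplified sliding-window splitter. It is not LangChain's RecursiveCharacterTextSplitter and does not reproduce its recursion; it only borrows the idea of preferring higher-priority separators when choosing a cut point. The function name split_text and the separator list are assumptions made for illustration.

```python
from typing import List, Sequence


def split_text(
    text: str,
    chunk_size: int = 500,
    chunk_overlap: int = 100,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    chunks: List[str] = []
    start, n = 0, len(text)
    while start < n:
        end = min(start + chunk_size, n)
        if end < n:
            window = text[start:end]
            # Prefer the highest-priority separator, cutting at its last occurrence.
            for sep in separators:
                cut = window.rfind(sep)
                # Only accept cuts past the overlap region so every step makes progress.
                if cut > chunk_overlap:
                    end = start + cut + len(sep)
                    break
        chunks.append(text[start:end])
        if end == n:
            break
        # Step back so the next chunk repeats the tail of this one.
        start = end - chunk_overlap
    return chunks


sample = "Knee surgery is covered after a 24-month waiting period. " * 30
chunks = split_text(sample, chunk_size=200, chunk_overlap=40)
# Dropping each chunk's overlapping prefix should give back the original text.
rebuilt = chunks[0] + "".join(c[40:] for c in chunks[1:])
```

Each chunk after the first begins with the last 40 characters of its predecessor, so a clause that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.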

2.3.3 Vectorization and In-Process Storage with ChromaDB

Once the documents are chunked, each chunk must be converted into a numerical
vector (an embedding) that captures its semantic meaning. This is done using an
embedding model. For a hackathon, a high-quality, efficient, and locally runnable
open-source model is preferable to relying on a paid API. The all-MiniLM-L6-v2 model
from the sentence-transformers library is an excellent choice, offering a strong
balance of performance and computational efficiency.18

These embeddings must be stored in a vector database to enable efficient similarity
search. While the market offers many powerful, production-grade vector databases
like Pinecone (managed) or Milvus (open-source, scalable), they introduce significant
setup and operational overhead.3 In a hackathon, development velocity is the most
critical resource. The "best" technology is the one that allows the team to build and
iterate the fastest.

For this reason, ChromaDB is the strongly recommended choice.20 It can be installed
with a simple pip install and run either in-memory or in a persistent local directory with zero
configuration. This eliminates the need to manage external services or Docker
containers, allowing the team to focus entirely on the application logic. While it may
not scale to billions of vectors like Milvus, it is more than sufficient for the dataset
sizes typical of a hackathon.
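What a vector store does at query time can be shown with a small in-memory stand-in. The sketch below is not ChromaDB and does not use all-MiniLM-L6-v2: toy_embed is a hypothetical bag-of-words embedding chosen only so the example runs without downloading a model, which means it matches on shared words rather than on meaning. The ranking step, cosine similarity followed by a top-k cut, is the part that carries over to a real store.

```python
import math
from collections import Counter
from typing import List, Tuple


def toy_embed(text: str) -> Counter:
    # Stand-in embedding: word counts. A real system would call an embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    def __init__(self) -> None:
        self._items: List[Tuple[str, str, Counter]] = []

    def add(self, chunk_id: str, text: str) -> None:
        # Embed once at indexing time and keep the vector next to the text.
        self._items.append((chunk_id, text, toy_embed(text)))

    def query(self, text: str, k: int = 3) -> List[Tuple[str, str]]:
        # Embed the query the same way, then rank stored chunks by similarity.
        q = toy_embed(text)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[2]), reverse=True)
        return [(chunk_id, chunk) for chunk_id, chunk, _ in ranked[:k]]


store = InMemoryVectorStore()
store.add("C1", "knee surgery is covered after 24 months")
store.add("C2", "dental treatment is excluded")
store.add("C3", "maternity benefits have a waiting period of 9 months")
results = store.query("is knee surgery covered", k=2)
```

Swapping toy_embed for a real embedding model and the list for a persistent collection is the step that turns this into the semantic search the problem statement asks for.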

Table 2: Vector Database Selection Analysis for Hackathons

| Database | Type | Key Strengths | Hackathon Suitability | Rationale |
|---|---|---|---|---|
| ChromaDB | Open-Source, Embeddable | Extremely easy to set up (pip install). Can run in-process. No external services needed. Perfect for rapid prototyping. 20 | High | Prioritizes development speed and simplicity, which are paramount in a hackathon. Zero setup overhead. |
| pgvector | Open-Source Extension | Integrates vector search into a standard PostgreSQL database, leveraging a familiar tool. 22 | Medium | Requires setting up and managing a PostgreSQL instance. More setup than ChromaDB, but a good option if the team is already expert in Postgres. |
| Pinecone | Managed Service (Closed-Source) | Fully managed, highly scalable, excellent performance, simple API. No infrastructure to manage. 21 | Medium | Requires network access, API keys, and reliance on a third-party service. Can be fast to start but creates an external dependency. |
| Milvus | Open-Source, Scalable | Designed for massive-scale vector search. Highly performant and flexible. Strong community support. 21 | Low | Requires a complex deployment, typically via Docker Compose with multiple dependent services. The setup time is prohibitive for a hackathon. |
| Weaviate | Open-Source, Scalable | Cloud-native, supports GraphQL, flexible schema. Good for complex AI applications. 21 | Low | Similar to Milvus, the setup and learning curve are too steep for a short-term project. Better suited for production systems. |

2.4 The RAG Chain: Orchestrating Retrieval and Generation

The RAG chain is the online component that responds to user queries. It will be
implemented as a service function in rag_service.py and exposed via a FastAPI
endpoint. The process, orchestrated with LangChain, involves several steps.2
1.​ Query Parsing: The raw user query (e.g., "46M, knee surgery, Pune, 3-month
policy") is first sent to an LLM. A prompt will instruct the model to parse this
string and extract entities into a structured format, ideally a Pydantic model
defined in schemas/models.py. This ensures the system is working with clean,
structured data for the subsequent steps.
2.​ High-Fidelity Semantic Retrieval: The parsed query, or a natural language
version of it, is embedded using the same all-MiniLM-L6-v2 model. This
embedding is then used to perform a similarity search on the ChromaDB vector
store. The search will return the top-k most relevant document chunks (e.g., the
top 3-5 clauses).
3.​ Prompt Engineering for Logical Evaluation: This is the core reasoning step. A
detailed prompt template is constructed. This prompt will provide the LLM with all
the necessary information and clear instructions on how to behave. It will include:
○​ Role-playing instruction: "You are an expert insurance claims adjustor. Your
task is to evaluate the following claim based only on the provided policy
clauses."
○​ The User's Claim Data: The structured data parsed in Step 1.
○​ The Retrieved Evidence: The content of the relevant clauses retrieved in
Step 2, clearly demarcated as "Context" or "Policy Clauses."
○​ The Task and Output Format: A clear directive to evaluate the claim against
the clauses and return a decision in a specific JSON format, including the
justification.

This multi-part prompt transforms the LLM from a simple text generator into a guided
reasoning engine, ensuring it performs the evaluation task as required by the problem
statement.
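The four parts of the evaluation prompt can be assembled with ordinary string formatting. The sketch below is one possible layout, not a prescribed template: the role instruction is quoted from the list above, while PROMPT_TEMPLATE, build_evaluation_prompt, and the bracketed clause-ID convention are assumptions introduced for illustration.

```python
import json
from typing import Dict, List

PROMPT_TEMPLATE = """You are an expert insurance claims adjustor. Your task is to evaluate the following claim based only on the provided policy clauses.

Claim data:
{claim}

Policy Clauses:
{clauses}

Evaluate the claim against the clauses and respond with a JSON object containing "Decision", "Amount", and "Justification". Do not provide any text before or after the JSON object."""


def build_evaluation_prompt(claim: Dict[str, object], clauses: List[Dict[str, str]]) -> str:
    # Prefix each clause with its ID so the model can cite it in the justification.
    clause_block = "\n".join(f"[{c['clause_id']}] {c['text']}" for c in clauses)
    return PROMPT_TEMPLATE.format(claim=json.dumps(claim, indent=2), clauses=clause_block)


prompt = build_evaluation_prompt(
    {"age": 46, "procedure": "knee surgery", "policy_duration_months": 3},
    [{"clause_id": "C4.2", "text": "Knee surgery is covered after a 24-month waiting period."}],
)
```

Keeping the claim data and the clauses in clearly separated, labeled sections makes it harder for the model to confuse facts about the claimant with rules from the policy.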

Part III: Mastering the Final Mile: Guaranteeing Structured JSON Output

A common failure mode for LLM applications is inconsistent output formatting. Since
the HackRx problem requires a specific, machine-readable JSON response, ensuring
the reliability of this output is a mission-critical task.1 Relying on hope is not a
strategy; a robust, multi-layered approach is necessary.

3.1 The Critical Challenge of Reliable Structured Data from LLMs


LLMs are fundamentally text-to-text models, trained to generate human-like,
conversational prose. When asked to produce structured data like JSON, they can
often fail in subtle ways.24 Common errors include:
●​ Extraneous Text: Wrapping the JSON in conversational text like "Sure, here is the
JSON you requested:...".
● Formatting Artifacts: Wrapping the output in Markdown code fences (three
backticks followed by json, then three closing backticks).
●​ Syntax Errors: Missing commas, mismatched brackets, or incorrect quoting,
resulting in invalid JSON.
●​ Schema Deviations: "Hallucinating" extra fields or omitting required ones.

For an application that needs to programmatically parse this response, any of these
errors will cause the entire workflow to fail. Therefore, enforcing the output structure
is not an optional refinement; it is a core requirement.
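The first two failure modes, extraneous text and code fences, can often be recovered from rather than treated as fatal. The sketch below scans the raw reply for the first position at which a complete JSON object can be decoded, using the standard library's JSONDecoder.raw_decode. The function name extract_json_object is hypothetical, and the approach does nothing for genuine syntax errors or schema deviations, which still need validation and a retry.

```python
import json


def extract_json_object(raw: str) -> dict:
    decoder = json.JSONDecoder()
    for index, char in enumerate(raw):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at index and ignores what follows.
            value, _ = decoder.raw_decode(raw, index)
        except json.JSONDecodeError:
            continue
        if isinstance(value, dict):
            return value
    raise ValueError("no JSON object found in model output")


fence = "`" * 3  # built programmatically to keep this example easy to embed
chatty_reply = (
    "Sure, here is the JSON you requested:\n"
    f"{fence}json\n"
    '{"Decision": "approved", "Amount": 50000}\n'
    f"{fence}"
)
recovered = extract_json_object(chatty_reply)

try:
    extract_json_object('{"Decision": "approved" "Amount": 1}')  # missing comma
    syntax_error_detected = False
except ValueError:
    syntax_error_detected = True
```

Tolerant extraction of this kind is a safety net, not a substitute for the stricter enforcement discussed in the following sections.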

3.2 Advanced Prompting Techniques for Bulletproof JSON Generation

The first line of defense is meticulous prompt engineering. By providing the LLM with
extremely clear and explicit instructions, the probability of receiving a valid JSON
response can be significantly increased. The following table summarizes key
strategies derived from best practices.24

Table 3: Prompt Engineering Strategies for Structured JSON

| Technique | Description | Example Prompt Snippet |
|---|---|---|
| Be Explicit | Directly command the model to output only valid JSON. | Respond with valid JSON only. Do not include any explanation, commentary, or markdown formatting. |
| Provide a Schema | Describe the desired JSON structure, including field names and data types. | The JSON object must contain the following keys: "Decision" (string, either "approved" or "rejected"), "Amount" (number, or null if not applicable), and "Justification" (object). |
| Provide a Few-Shot Example | Show the model exactly what a valid output looks like. Models are excellent at pattern matching. | Format your response as a JSON object matching this example: {"Decision": "approved", "Amount": 50000, "Justification": {"clause_id": "C4.2", "text": "..."}} |
| Use System Prompts | If the API supports it (like OpenAI's), use the "system" role to set the model's persona and behavior for the entire conversation. | System: You are a helpful API that always responds in perfectly formatted JSON. |
| Request No Explanations | Explicitly forbid the model from adding conversational filler. | Do not provide any text before or after the JSON object. |

3.3 Implementing Pydantic and LangChain Output Parsers for Validation

While strong prompting is essential, it is not foolproof. The most robust solution is a
hybrid approach that combines prompt guidance with programmatic enforcement.
This moves the requirement from a request to a rule.

This is achieved by defining the desired output structure in the application code using
a Pydantic model and then using a LangChain output parser to enforce it.28
1. Define a Pydantic Schema: In the backend/app/schemas/models.py file, a
Pydantic model is created that exactly matches the required JSON output
structure from the problem statement.
Python
    from typing import List, Optional

    from pydantic import BaseModel, Field


    # Named ClauseJustification rather than Justification: a field called
    # "Justification" annotated with a class of the same name shadows the class
    # inside the model body and fails when the model is defined.
    class ClauseJustification(BaseModel):
        clause_id: str = Field(description="The specific ID of the clause from the document.")
        text: str = Field(description="The exact text of the clause referenced.")


    class DecisionResponse(BaseModel):
        Decision: str = Field(description="The final decision, must be 'approved' or 'rejected'.")
        # default=None keeps the field optional under Pydantic v2 as well as v1.
        Amount: Optional[float] = Field(default=None, description="The approved payout amount, if applicable.")
        Justification: List[ClauseJustification] = Field(description="A list of clauses that justify the decision.")


2. Use a LangChain Output Parser: LangChain's PydanticOutputParser is
designed for this exact purpose. It takes the Pydantic model (DecisionResponse)
as input and does two things:
○​ It generates detailed formatting instructions that are automatically appended
to the prompt sent to the LLM. These instructions tell the model precisely how
to structure its JSON output to match the Pydantic schema.
○​ After the LLM responds, the parser attempts to parse the model's text output
into an instance of the DecisionResponse Pydantic model. If the parsing is
successful, it returns the structured Python object. If the output is not valid
JSON or does not conform to the schema, it will raise an error, which can be
caught and handled with a retry mechanism.28

This two-layer system is exceptionally robust. The prompt guides the LLM toward the
correct format, and the parser acts as a strict gatekeeper, guaranteeing that any data
passed to the rest of the application is valid and correctly structured. This approach
eliminates a major source of unreliability in LLM-powered systems.
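The gatekeeper-plus-retry behavior can be illustrated without either library. The sketch below is a standard-library stand-in, not PydanticOutputParser: validate_decision hand-checks the same three fields, and decide_with_retries feeds the validation errors back into the next attempt. Both function names, the call_llm callable, and the wording of the feedback message are assumptions made for illustration.

```python
import json
from typing import Callable, List


def validate_decision(payload: object) -> List[str]:
    """Return a list of schema problems; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["top-level value must be an object"]
    problems = []
    if payload.get("Decision") not in ("approved", "rejected"):
        problems.append("Decision must be 'approved' or 'rejected'")
    amount = payload.get("Amount")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if amount is not None and (isinstance(amount, bool) or not isinstance(amount, (int, float))):
        problems.append("Amount must be a number or null")
    justification = payload.get("Justification")
    if not isinstance(justification, list) or not all(
        isinstance(j, dict) and isinstance(j.get("clause_id"), str) and isinstance(j.get("text"), str)
        for j in justification
    ):
        problems.append("Justification must be a list of {clause_id, text} objects")
    return problems


def decide_with_retries(call_llm: Callable[[str], str], prompt: str, max_attempts: int = 3) -> dict:
    feedback = ""
    for _ in range(max_attempts):
        raw = call_llm(prompt + feedback)
        try:
            payload = json.loads(raw)
            problems = validate_decision(payload)
        except json.JSONDecodeError as exc:
            problems = [f"invalid JSON: {exc.msg}"]
        if not problems:
            return payload
        # Tell the model what was wrong so the next attempt can correct it.
        feedback = "\n\nYour previous reply was rejected: " + "; ".join(problems)
    raise ValueError(f"no valid response after {max_attempts} attempts")


# Fake model: the first reply is chatty and incomplete, the second is valid.
replies = iter([
    'Sure! {"Decision": "approved"}',
    '{"Decision": "approved", "Amount": 50000, '
    '"Justification": [{"clause_id": "C4.2", "text": "..."}]}',
])
prompts_seen: List[str] = []


def fake_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(replies)


decision = decide_with_retries(fake_llm, "Evaluate the claim.")
```

Capping the number of attempts matters in a live demo: a bounded failure with a clear error message is easier to recover from on stage than an endpoint that keeps retrying.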

Part IV: The User Interface: A Rapidly-Developed React Frontend

While the core intelligence resides in the backend, a clean, functional user interface is
essential for demonstrating the system's capabilities effectively. The UI should be
simple, intuitive, and quick to build, allowing the majority of the hackathon time to be
spent on the more complex backend logic.

4.1 Technology Stack Justification: React, Vite, and Chakra UI


The recommended frontend stack is designed for maximum development velocity and
a polished end result with minimal effort.
●​ JavaScript Framework: React. Both React and Vue are excellent choices for
building modern single-page applications.29 However, React boasts a larger
community, a more extensive ecosystem of third-party libraries, and greater
demand in the job market, making it a slightly more "standard" and
well-supported option.30 Its component-based architecture is intuitive and highly
flexible.32
●​ Build Tool: Vite. Traditional React projects often use Create React App, but Vite
has emerged as a superior alternative for rapid development.16 Vite leverages
native ES modules in the browser to provide a lightning-fast development server
with near-instantaneous Hot Module Replacement (HMR). This means changes in
the code are reflected in the browser almost instantly, creating a much smoother
and faster development feedback loop, which is a significant quality-of-life
improvement during a hackathon.16
●​ Component Library: Chakra UI. Writing custom CSS is time-consuming. A
component library provides a set of pre-built, reusable, and professionally
designed UI components. Chakra UI is an excellent choice because it is
comprehensive, highly accessible, and easy to customize.16 It allows developers to
build a visually appealing and consistent interface very quickly by composing
components like Box, Flex, Input, and Button, rather than styling primitives from scratch.16

4.2 Building the Interface: Core Components

The UI can be broken down into a few key React components, following the structure
of similar full-stack tutorials.16
●​ QueryForm.tsx: This component will contain the main user input element. It will
feature a simple HTML form with a <textarea> for the user to type their natural
language query and a "Submit" button. It will manage the state of the input text.
●​ ResultsDisplay.tsx: This component is responsible for rendering the JSON
response from the backend. It will be conditionally rendered, appearing only after
a response has been received. It should have clearly labeled sections for
"Decision," "Amount," and "Justification" to present the structured data in a
human-readable format.
●​ JustificationViewer.tsx: As a child of ResultsDisplay.tsx, this component will
specifically handle the rendering of the Justification array. It will map over the
array of justification objects and display each source clause clearly, perhaps
within a styled card, showing both the clause_id and the full text. This visual link
between the decision and its evidence is crucial for the demo.

4.3 State Management and API Integration

For an application of this scale, complex state management libraries like Redux are
unnecessary. React's built-in Hooks are more than sufficient.16
●​ State Management: The main App.tsx component will use the useState hook to
manage the application's global state, including:
○​ query: The current user query string.
○​ results: The JSON response object from the API (or null).
○​ isLoading: A boolean to track whether an API request is in flight. This can be
used to show a loading spinner and disable the submit button.
○​ error: An error message string if the API call fails.​
These state variables can be passed down to child components as props or
via React's useContext hook for cleaner state propagation.16
●​ API Integration: The frontend will communicate with the FastAPI backend by
making an asynchronous POST request to the /chat/start endpoint. This can be
done using the browser's native fetch API or a lightweight library like axios, which
offers a slightly more convenient syntax.33 The function responsible for this API
call will:
1.​ Set isLoading to true.
2.​ Send the user's query in the request body.
3.​ await the response from the server.
4.​ If the response is successful, parse the JSON and update the results state.
5.​ If the response fails, update the error state.
6.​ Finally, set isLoading back to false.

This client-side logic will be encapsulated within a handler function that is triggered
when the user submits the query form.

Part V: From Code to Demo: Deployment and Final Recommendations

The final phase of a hackathon is about successfully demonstrating the work that has
been done. This requires a simple, reliable deployment strategy and a well-rehearsed
presentation that highlights the project's most impressive features.

5.1 A Minimalist Deployment Strategy for a Live Demo

The goal for a hackathon is not a production-ready, scalable deployment, but rather a
simple and reliable way to make the application accessible for a live demo. A
"serverless" or Platform-as-a-Service (PaaS) approach is ideal.
●​ Backend Deployment: The FastAPI application can be quickly deployed to a
service like Render.33 Render offers a free tier, can connect directly to a GitHub
repository, and automatically deploys changes on push. It can run the Python
application using a uvicorn command specified in the setup. As an even simpler, temporary
alternative, ngrok can be used to create a secure public URL that tunnels directly
to the local development server running on a team member's machine.34 This
requires no code changes but relies on the local machine remaining online.
●​ Frontend Deployment: Since the React application (built with Vite) compiles
down to a set of static HTML, CSS, and JavaScript files, it can be deployed with
extreme ease to a static hosting provider like Vercel or Netlify.16 These platforms
also connect to GitHub and offer free, continuous deployment, often deploying a
new version in under a minute.

To make this work, two configurations are essential:

1. CORS Configuration: The CORSMiddleware in the FastAPI backend must be
configured to allow requests from the frontend's deployment origin (e.g.,
https://your-app-name.vercel.app, written without a trailing slash or path).16
2.​ Environment Variables: The React application's code should not have the
backend URL hardcoded. Instead, it should read the API endpoint from an
environment variable (e.g., VITE_API_URL). This variable will be set to
http://localhost:8000 for local development and to the deployed backend's URL
(e.g., https://your-backend.onrender.com) in the Vercel/Netlify deployment
settings.16
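
The environment-variable lookup in item 2 might look like the sketch below. The VITE_API_URL name and the http://localhost:8000 fallback come from the text. The resolveApiUrl helper and its trailing-slash handling are assumptions added for illustration. Vite only exposes variables prefixed with VITE_ to client code, and their values are substituted at build time, so the variable has to be set in the hosting provider's settings before the build runs.

```typescript
// Local-development fallback taken from the text above.
const DEFAULT_API_URL = "http://localhost:8000";

// Hypothetical helper: accepts any env-like object so it can be called with
// Vite's `import.meta.env` in the app and with a plain object in tests.
function resolveApiUrl(env: Record<string, unknown>): string {
  const raw = env["VITE_API_URL"];
  const url =
    typeof raw === "string" && raw.trim() !== "" ? raw.trim() : DEFAULT_API_URL;
  // Drop trailing slashes so `${url}/chat/start` contains exactly one separator.
  return url.replace(/\/+$/, "");
}
```

In the application, this would be called once as resolveApiUrl(import.meta.env) and the result reused for every request, so no backend URL is hardcoded in component code.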

5.2 Key Pitfalls to Avoid and Pro-Tips for a Compelling Presentation

The final presentation is as important as the code itself. A clear, compelling narrative
can make the difference between a good project and a winning one.

Key Pitfalls to Avoid:

●​ Don't Over-Engineer: Avoid spending time on features not explicitly required,
such as complex user authentication, multi-user support, or intricate UI
animations. The focus must remain on the core AI functionality.
●​ Don't Aim for a Perfect UI: The UI should be clean and functional, but not a work
of art. Use a component library like Chakra UI to make it look good quickly and
move on.
●​ Don't Neglect the "Why": Don't just show what the app does; explain why it's
important and innovative.

Pro-Tips for a Compelling Demo Presentation:

The presentation should tell a story that highlights the system's unique capabilities. A
recommended flow is:
1.​ Set the Stage (The Problem): Begin by explaining the problem: manually
reviewing dense policy documents is slow, error-prone, and requires expertise.
Show a sample query and demonstrate how a simple keyword search would fail to
find the relevant information.
2.​ Introduce Semantic Search (The Magic): Run the same query in your
application. Show how the system retrieves the correct, semantically related
clause even though the keywords don't match. Explain that this is powered by
vector embeddings and semantic search.
3.​ Reveal the Core Innovation (The Reasoning): This is the "wow" moment. State
clearly that the system is not just finding text. Explain that the LLM is acting as a
reasoning engine, evaluating the retrieved policy clauses against the user's
specific situation (age, location, etc.) to arrive at a logical decision.
4.​ Emphasize Trust and Auditability (The Business Value): Display the final
JSON output. Point directly to the Justification field and show how it contains the
exact source clauses that led to the decision. Explain that this makes the system's
decisions fully transparent, auditable, and trustworthy—critical features for
real-world applications in insurance, legal, and compliance.

By following this narrative, the presentation will effectively communicate not just the
technical implementation but also the significance and potential impact of the
solution, leaving a lasting and positive impression on the judges.

Works cited

1. HackRx 6, accessed on July 19, 2025, https://hackrx.in/#problem-statement
2. Build a Retrieval Augmented Generation (RAG) App: Part 1 ..., accessed on July 19, 2025, https://python.langchain.com/docs/tutorials/rag/
3. Tech Stack for LLM Application Development – Complete Guide - Prismetric, accessed on July 19, 2025, https://www.prismetric.com/tech-stack-for-llm-application-development/
4. Python monorepos - Graphite, accessed on July 19, 2025, https://graphite.dev/guides/python-monorepos
5. Best Practices for Structuring Your React Monorepo - DhiWise, accessed on July 19, 2025, https://www.dhiwise.com/post/best-practices-for-structuring-your-react-monorepo
6. Setting Up a Monorepo for Your React Projects with TypeScript - Medium, accessed on July 19, 2025, https://medium.com/@aalam-info-solutions-llp/setting-up-a-monorepo-for-your-react-projects-with-typescript-29ba3ec15065
7. DivineDemon/react-kit: React + Python Monorepo - GitHub, accessed on July 19, 2025, https://github.com/DivineDemon/react-kit
8. FastAPI + React - Full stack - Reddit, accessed on July 19, 2025, https://www.reddit.com/r/FastAPI/comments/1h0kcd6/fastapi_react_full_stack/
9. Python or Node.js for Web Development? A Comprehensive Guide - TriState Technology, accessed on July 19, 2025, https://www.tristatetechnology.com/blog/python-vs-node-js
10. Node.js vs Python: What's being used now - SaM Solutions, accessed on July 19, 2025, https://sam-solutions.com/blog/node-js-vs-python/
11. Python vs. Node.js for AI-Powered Development Tools: The Definitive Guide, accessed on July 19, 2025, https://servicesground.com/blog/python-vs-node-js-for-ai-powered-development-tools-the-definitive-guide/
12. Python vs Node.js for AI Development: Which is Best for Your AI App? - YouTube, accessed on July 19, 2025, https://www.youtube.com/watch?v=DfZxrZE0ZMY
13. RAG in Production - LangChain & FastAPI - YouTube, accessed on July 19, 2025, https://www.youtube.com/watch?v=Arf7UwWjGyc
14. Create a RAG Chatbot with FastAPI & LangChain, accessed on July 19, 2025, https://blog.futuresmart.ai/building-a-production-ready-rag-chatbot-with-fastapi-and-langchain
15. The Best Tech Stacks for AI-Powered Applications in 2025 - DEV Community, accessed on July 19, 2025, https://dev.to/elliot_brenya/the-best-tech-stacks-for-ai-powered-applications-in-2025-efe
16. Developing a Single Page App with FastAPI and React | TestDriven.io, accessed on July 19, 2025, https://testdriven.io/blog/fastapi-react/
17. Building an Ideal Tech Stack for LLM Applications From Scratch - athina.ai, accessed on July 19, 2025, https://blog.athina.ai/building-an-ideal-tech-stack-for-llm-applications-from-scratch
18. What is the current best embedding model for semantic search? : r/LangChain - Reddit, accessed on July 19, 2025, https://www.reddit.com/r/LangChain/comments/1blfg7i/what_is_the_current_best_embedding_model_for/
19. NodeJS vs Python Which Framework You Should Choose and Why? - Medium, accessed on July 19, 2025, https://medium.com/@mobileappandgameappdevelopment/nodejs-vs-python-which-framework-you-should-choose-and-why-0c83308acbe8
20. Building a Basic RAG App with LangGraph, FastAPI & ChromaDB ..., accessed on July 19, 2025, https://prashant1879.medium.com/building-a-basic-rag-app-with-langgraph-fastapi-chromadb-668c7454d035
21. Best 17 Vector Databases for 2025 [Top Picks] - lakeFS, accessed on July 19, 2025, https://lakefs.io/blog/12-vector-databases-2023/
22. Top 5 Vector Databases in 2025 - CloudRaft, accessed on July 19, 2025, https://www.cloudraft.io/blog/top-5-vector-databases
23. Building a RAG Pipeline with FastAPI, Haystack, and ChromaDB for URLs in Python, accessed on July 19, 2025, https://www.aihello.com/resources/blog/building-a-rag-pipeline-with-fastapi-haystack-and-chromadb-for-urls-in-python/
24. How To Write AI Prompts That Output Valid JSON Data | Build5Nines, accessed on July 19, 2025, https://build5nines.com/how-to-write-ai-prompts-that-output-valid-json-data/
25. For Best Results with LLMs, Use JSON Prompt Outputs | HackerNoon, accessed on July 19, 2025, https://hackernoon.com/for-best-results-with-llms-use-json-prompt-outputs
26. Enhance AI Models Prompt Engineering with JSON Output | by Novita AI - Medium, accessed on July 19, 2025, https://medium.com/@marketing_novita.ai/enhance-ai-models-prompt-engineering-with-json-output-ca450f62159a
27. A Guide to JSON output with LLM prompts - YouTube, accessed on July 19, 2025, https://www.youtube.com/watch?v=a9Lhm2-TgQ8
28. How can I get LLM to only respond in JSON strings? - Stack Overflow, accessed on July 19, 2025, https://stackoverflow.com/questions/77407632/how-can-i-get-llm-to-only-respond-in-json-strings
29. Should I learn React.js or Vue.js? - HackerEarth, accessed on July 19, 2025, https://www.hackerearth.com/blog/reactjs-vs-vuejs
30. React Vs. Vue: Who Takes The Prize | HackerNoon, accessed on July 19, 2025, https://hackernoon.com/react-vs-vue-who-takes-the-prize-oq1633op
31. What does React do better than Vue innately (excluding things like ecosystem)? - Reddit, accessed on July 19, 2025, https://www.reddit.com/r/reactjs/comments/10u17c7/what_does_react_do_better_than_vue_innately/
32. Vue JS vs React: What is Trending? [Detailed Guide] - Flatlogic Blog, accessed on July 19, 2025, https://flatlogic.com/blog/vue-vs-react-what-is-easier-what-is-trending-detailed-guide-with-all-2021/
33. Half-a-stack: Integrating React app with FastAPI (Part 1/2) | by Alysha | Medium, accessed on July 19, 2025, https://medium.com/@alyshapm10/half-a-stack-integrating-react-app-with-fastapi-part-1-2-81cff4cbd7bf
34. How To Build a FastAPI & React Full Stack App | Clerk, Databases, LLMs & More - YouTube, accessed on July 19, 2025, https://www.youtube.com/watch?v=13tMEW8r6C0
