Generative AI Testing Intern – Assignment
PART 1: GenAI Tool Testing & Observation Log (1.5 hours)
Below is a sample Testing & Observation Log for ChatGPT, covering four distinct prompt
types. You can expand it with screenshots or additional rows as you continue your
1.5-hour session.
| Prompt Type | Prompt Used | Output Summary | Accuracy | Latency / Performance Notes | Observed Bugs / Glitches | Rating (out of 10) | Final Notes / Improvement Ideas |
|---|---|---|---|---|---|---|---|
| Factual Question | “What is the capital of Burkina Faso, and can you list its three largest exports by value?” | Correctly identified capital as Ouagadougou and listed gold, cotton, and livestock products as top exports with approximate annual values. | High | ~1.2 s to respond | None | 9 | Very precise; could include more recent trade figures (e.g. 2023 data). |
| Creative Writing | “Write a 200-word fairy tale in which a cloud learns to swim, exploring themes of freedom and belonging.” | Engaging, imaginative short story; strong metaphors but occasional minor repetition (“soft cloud” appears twice). | Medium | ~1.8 s; slight pause before finishing | Minor repetition | 7 | Encourage more varied vocabulary and tighter editing. Maybe add a “style” slider for concision vs. flourish. |
| Long / Complex Input (300+ words) | (Excerpt from a fictional 350-word research abstract on “Quantum-Inspired Optimization Algorithms”) “Quantum-inspired algorithms leverage quantum mechanics concepts to solve NP-hard problems. Here we propose a hybrid GA-SA approach…” (continues to describe methodology, pseudo-code, and three performance charts in prose.) | Summarized the key contributions (hybrid GA-SA), methodology steps, and reported a high-level performance comparison. Missed one of the three chart details. | Medium | ~2.4 s; handled long input well but truncated some finer points | Dropped last chart’s metrics details | 6 | Would benefit from a “depth” toggle to preserve all named figures/details. Consider streaming long-input processing to show progress. |
| Multilingual / Code-Based Prompt | “Por favor, escribe en español un pequeño script en Python que lea un archivo CSV y calcule la media de la columna ‘ventas’.” (English: “Please write, in Spanish, a short Python script that reads a CSV file and computes the mean of the ‘ventas’ column.”) | Delivered a clean, runnable Python snippet with pandas: reads CSV, handles missing data, computes df['ventas'].mean(), and prints the result, with all comments in Spanish. | High | ~1.5 s | None | 8 | Excellent multilingual support. Could auto-detect column data type issues or suggest error handling for non-numeric values. |
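The script ChatGPT returned for the code-based prompt is not reproduced in this log. As a rough illustration of the error handling suggested in the last row's improvement notes, the sketch below computes the mean of a ‘ventas’ column while skipping blank or non-numeric cells. It uses Python's standard csv module rather than pandas so that it runs without extra dependencies; the function name mean_of_column and the sample data are assumptions made for illustration only.

```python
import csv
import io


def mean_of_column(csv_file, column="ventas"):
    """Return (mean, skipped) for a numeric column in an open CSV file.

    Blank or non-numeric cells are skipped and counted rather than raising.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in CSV header")

    values = []
    skipped = 0
    for row in reader:
        # Short rows yield None for missing cells, so normalize to "".
        cell = (row[column] or "").strip()
        try:
            values.append(float(cell))
        except ValueError:
            skipped += 1  # blank or non-numeric value

    if not values:
        raise ValueError(f"column {column!r} has no numeric values")
    return sum(values) / len(values), skipped


# Hypothetical sample data standing in for a real sales file.
sample = io.StringIO("producto,ventas\nA,100\nB,\nC,abc\nD,50.5\n")
mean, skipped = mean_of_column(sample)
print(f"mean={mean}, skipped={skipped}")
```

Returning the skipped count alongside the mean makes silent data-quality problems visible, which is the gap the improvement note points at.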
PART 2: Mini Prompt Library (1 hour)
Here’s a mini prompt library for the Content Generator use case, with six prompts (3
beginner-level, 3 advanced), each categorized and with adaptation notes. At the end is
a bonus sample log showing how you might compare outputs from ChatGPT vs.
Gemini.
| Level | Prompt | Category | Adaptation Notes |
|---|---|---|---|
| Beginner | 1. “Write a 50-word introduction for a blog post about healthy meal-prep ideas.” | Creative | Tone: change “healthy” to “fun” for a playful vibe. Length: bump to 200 words for a deeper intro. |
| Beginner | 2. “Generate three catchy tweets promoting our new productivity app.” | Creative | Tone: use emojis & slang for a casual tone; formal business style for B2B audiences. Length: request “3–5 tweets” or “1 thread.” |
| Beginner | 3. “List five key features of our eco-friendly water bottle, in bullet points.” | Analytical | Tone: neutral for spec sheet; humorous (“bottle that saves the planet one sip at a time”). Length: expand bullets into full sentences. |
| Advanced | 4. “Create a 4-week content calendar (titles + publication dates) for a fintech blog targeting Gen Z readers.” | Technical | Tone: swap “Gen Z” for “C-suite execs” to shift register. Length: scale to 3 months, include social-media cross-posts. |
| Advanced | 5. “Draft a 1,200-word SEO-optimized article on ‘blockchain in supply chain,’ including at least 5 internal & external links.” | Analytical | Tone: more academic for white-paper style; more casual for newsletter. Length: shorten to 800 words or expand to 2,000 if needed. |
| Advanced | 6. “Define brand voice guidelines (voice, tone, do’s & don’ts) for a luxury skincare line.” | Technical | Tone: shift from “luxury” to “eco-friendly” brand positioning. Length: give a 1-page summary vs. a 5-page handbook outline. |
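The tone and length adaptations in the table can be treated as parameters of a reusable template. The following is a minimal sketch of that idea, not part of the original assignment; the PromptTemplate class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptTemplate:
    """A reusable prompt with slots that the adaptation notes vary."""

    level: str      # "Beginner" or "Advanced"
    category: str   # "Creative", "Analytical", or "Technical"
    template: str   # prompt text containing {placeholders}

    def render(self, **slots):
        # str.format raises KeyError if a placeholder is left unfilled.
        return self.template.format(**slots)


# Prompt 1 from the table, with its length and tone made adjustable.
blog_intro = PromptTemplate(
    level="Beginner",
    category="Creative",
    template=(
        "Write a {words}-word introduction for a blog post "
        "about {tone} meal-prep ideas."
    ),
)

print(blog_intro.render(words=50, tone="healthy"))  # original prompt
print(blog_intro.render(words=200, tone="fun"))     # adapted variant
```

Keeping the level and category on the template makes it straightforward to filter the library later, for example to list only the Advanced prompts.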
Bonus: Sample Cross-Tool Comparison Log
| Prompt No. | Tool | Output Highlights | Notes on Differences |
|---|---|---|---|
| 1 | ChatGPT | Crisp 50-word intro, strong call-to-action, light use of “you.” | Very direct; less flourish. |
| 1 | Gemini | More evocative imagery (“rainbow of veggies”), slightly longer (~65 words). | Tends to embellish; exceeded target length. |
| 4 | ChatGPT | Clean calendar table with titles & dates; suggestions aligned to fintech events (e.g., “Crypto Q&A Week”). | Spot-on structure, a bit formulaic. |
| 4 | Gemini | Included short descriptions alongside each title; added social posts per week. | More “value-add” but less concise. |
| 6 | ChatGPT | Structured four sections: voice, tone, do’s, don’ts. Precise language, formal style. | Professional but dry. |
| 6 | Gemini | Added brand personality archetypes (e.g., “The Curator”), used more creative examples. | More creative suggestions, though occasional buzzword overuse. |
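Length observations such as "exceeded target length" can be checked mechanically instead of by eye. The sketch below is one possible approach; the length_report helper, the whitespace-based word count, and the 10% tolerance are all assumptions, and the sample text is a placeholder rather than real tool output.

```python
def length_report(text, target_words, tolerance=0.10):
    """Compare a response's word count against a target.

    Words are counted by splitting on whitespace, which is a simplification.
    """
    count = len(text.split())
    deviation = (count - target_words) / target_words
    return {
        "words": count,
        "deviation": deviation,
        "within_tolerance": abs(deviation) <= tolerance,
    }


# Placeholder text of 65 words, not actual tool output.
response = " ".join(["word"] * 65)
report = length_report(response, target_words=50)
print(report)
```

Running the same check on each tool's response to a prompt gives a comparable number for the "Notes on Differences" column.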
PART 3: UX Quick Review (30 minutes)
Here’s a concise UX review of Perplexity.ai (May 2025):
1. First-Use Experience & Onboarding
• Sign-up flow: You can immediately start asking questions without an account,
which is excellent for lowering barriers. Signing up (via Google, GitHub, or email)
unlocks history and personalization.
• Guidance: A brief “Getting Started” tooltip highlights the input box and shows
example queries, but it disappears quickly—there’s no “tour replay” option.
2. Prompt Interface Friendliness
• Simplicity: The main page is dominated by a single search bar and a carousel of
example prompts (“Explain X,” “Compare A vs. B,” etc.). Very approachable for
beginners.
• Feedback aids: As you type, it suggests related topics and auto-completes
queries, reducing friction. However, there’s no inline syntax help for more
advanced operators (e.g. site:, source:).
3. Clarity & Interactivity of Outputs
• Layout: Results appear in a clean, two-column layout—AI response on the left,
source citations on the right. Citations are clickable, opening in a side-pane
preview (no full tab jump).
• Interactivity: You can upvote/downvote answers, ask follow-ups directly
(threaded), or regenerate. Code snippets are nicely formatted, but there’s no
“copy all” button for multi-block outputs.
4. Suggested Improvements
• Persistent Tour: Offer a “Show me around” button to replay onboarding tips.
• Advanced Prompt Builder: Add a simple dropdown of operators (e.g. filter by
date, domain) for power users.
• Enhanced Copy Tools: Provide a one-click “Copy All” for responses and for
citations separately.
• Theme & Accessibility: A true dark mode toggle (not just auto-detect) and text-
size controls would help users with visual needs.
Overall, Perplexity nails immediacy and clarity, but a few small tweaks around
onboarding and power-user features would make it even stronger.