Project Omni - AI Conversation Labeling Summary
Overview
Project Omni involves interacting with an AI agent through a platform called SRT. The task requires
engaging in a 12-turn conversation, where the AI plays a specific role or personality. Each turn consists
of:
1. Writing a relevant prompt (the first turn's prompt opens the conversation).
2. Receiving two AI-generated responses (A & B).
3. Evaluating and selecting the preferable response.
4. Tagging issues related to language quality and coherence.
Key Guidelines
• Stay on topic: Keep the conversation coherent and aligned with the AI’s personality.
• Follow conversation rules: Avoid abrupt topic shifts and restricted topics like politics, religion, or
social issues.
• Use assigned language: Even if the AI description is in English, respond in your designated
language.
• Reject when necessary: Follow the rejection policy to determine if a task should be discarded.
Evaluation Process
For each pair of responses (A & B):
1. Tag issues (Grammar, Fluency, Language Match, etc.).
2. Select the preferred response based on enjoyment and quality.
3. Identify false refusals or templated responses if applicable.
4. Repeat for 12 turns until the conversation is complete.
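The sketch below (Python) illustrates how the per-turn workflow above might be recorded; the TurnEvaluation structure and its field names are assumptions made for illustration and are not part of the SRT platform.

from dataclasses import dataclass, field

@dataclass
class TurnEvaluation:
    prompt: str                                   # prompt sent to the agent this turn
    issues_a: dict = field(default_factory=dict)  # issue tags for response A
    issues_b: dict = field(default_factory=dict)  # issue tags for response B
    preference: str = ""                          # e.g. "A is better than B"

conversation: list[TurnEvaluation] = []
for turn in range(1, 13):                         # the task spans 12 turns
    evaluation = TurnEvaluation(prompt=f"<prompt for turn {turn}>")
    # ... tag issues for A and B, then record the preference ...
    conversation.append(evaluation)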
Final Checks & Submission
• Review the entire conversation for coherence.
• Ensure responses follow instructions without issues.
• Submit the task upon completion.
Adhering to these steps ensures conversations are assessed accurately while maintaining quality and
compliance with the given guidelines.
Language Quality Criteria Summary
The evaluation of AI-generated responses is based on four key aspects: Grammar, Presentation,
Language Consistency, and Fluency, each rated from 1 (Very Low Quality) to 3 (High Quality).
1. Grammar
• 1 (Very Low Quality): Multiple spelling, punctuation, or grammatical errors that make the
response difficult to read.
• 2 (Low Quality): Minor errors that slightly affect readability but do not significantly impact
understanding.
• 3 (High Quality): No unintentional grammatical errors; spelling and punctuation are correct.
2. Presentation
• 1 (Very Low Quality): Use of bold, italics, or underlining is unclear or misused.
• 2 (Low Quality): Some formatting is present but could be improved.
• 3 (High Quality): Formatting enhances readability and effectively emphasizes key points.
3. Language Consistency (Tone, Style, Vocabulary)
• 1 (Very Low Quality): The response ignores tone/style instructions, mismatches tense, or uses
incorrect/multiple languages inappropriately.
• 2 (Low Quality): Some inconsistencies in tone or language choice, leading to minor
communication mismatches.
• 3 (High Quality): Response maintains the correct tone, style, and vocabulary, aligning with the
prompt’s instructions.
4. Fluency (Vocabulary, Naturalness, Structure)
• 1 (Very Low Quality): The response is unclear, robotic, or awkwardly structured, making it hard
to understand.
• 2 (Low Quality): Some unnatural phrasing or jargon, but the message is still intelligible.
• 3 (High Quality): The response is clear, natural, and well-structured, mirroring native speech
patterns.
Following these criteria ensures responses are evaluated for clarity, accuracy, and conversational quality.
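A minimal sketch (Python) of the shared 1-to-3 scale these criteria use; the Quality enum and the criterion keys are illustrative, not an official schema.

from enum import IntEnum

class Quality(IntEnum):
    VERY_LOW = 1   # 1: very low quality
    LOW = 2        # 2: low quality
    HIGH = 3       # 3: high quality

# One set of scores for a single response, covering the four criteria above.
scores = {
    "grammar": Quality.HIGH,
    "presentation": Quality.LOW,
    "language_consistency": Quality.HIGH,
    "fluency": Quality.HIGH,
}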
Prompt Tagging Summary
Prompts are categorized into five types:
• In-Domain: Within the agent’s expertise (min. 1 required).
• Out-of-Domain: Outside the agent’s expertise (min. 2 required).
• Adversarial: Tricky or unexpected queries to test adaptability (min. 2 required).
• Personal: Requests for personal advice (optional).
• Other: Casual conversation.
Guidelines: Maintain a smooth conversational flow, introduce topics naturally, assign multiple tags when a
prompt fits more than one category, and tag with confidence. This system ensures structured and effective interactions.
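As a rough sketch (Python) of the minimum-count requirements above, the helper below tallies tags across a conversation's prompts; the tag names and the meets_minimums function are assumptions made for illustration only.

from collections import Counter

MINIMUMS = {"in_domain": 1, "out_of_domain": 2, "adversarial": 2}

def meets_minimums(prompt_tags: list[list[str]]) -> bool:
    """prompt_tags holds the tag list for each prompt; a prompt may carry several tags."""
    counts = Counter(tag for tags in prompt_tags for tag in tags)
    return all(counts[tag] >= needed for tag, needed in MINIMUMS.items())

# Example: a 12-prompt conversation whose prompts cover the required categories.
tags = [["in_domain"], ["out_of_domain"], ["adversarial", "other"],
        ["out_of_domain"], ["adversarial"], ["personal"], ["other"],
        ["in_domain", "personal"], ["other"], ["other"], ["other"], ["other"]]
print(meets_minimums(tags))  # True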
Preference Selection Summary
Comparison Scale:
• A is much better than B: A is significantly better, B has major issues.
• A is better than B: A is preferred, but B is still acceptable.
• A is somewhat better than B: A slight preference for A.
• B is somewhat better than A: A slight preference for B.
• B is better than A: B is preferred, but A is still acceptable.
• B is much better than A: B is significantly better, A has major issues.
• Both A and B are bad: Neither response is acceptable.
Key Evaluation Factors:
✔ Good Features: No grammar errors, proper formatting, language consistency, fluency, appropriate
tone, engaging content, and adherence to AI character rules.
✖ Bad Features: Poor grammar/punctuation, formatting issues, inconsistent language use, unnatural
phrasing, deviation from the AI’s defined role, repetitive content, or failure to answer the prompt.
Strict Assessment Rule:
• If any language quality aspect (grammar, presentation, consistency, fluency) scores 1 (very low
quality), the response cannot be preferred.
• Responses must be clear, engaging, and relevant to be considered high quality.
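The strict assessment rule can be expressed as a short check; the sketch below (Python) is illustrative, with the is_acceptable helper and the score keys assumed for clarity.

def is_acceptable(scores: dict[str, int]) -> bool:
    """True unless any aspect (grammar, presentation, consistency, fluency) scores 1."""
    return min(scores.values()) > 1

scores_a = {"grammar": 3, "presentation": 2, "language_consistency": 3, "fluency": 3}
scores_b = {"grammar": 1, "presentation": 3, "language_consistency": 3, "fluency": 3}

if not is_acceptable(scores_a) and not is_acceptable(scores_b):
    print("Both A and B are bad")
elif not is_acceptable(scores_a):
    print("A cannot be preferred; the preference must favor B")
elif not is_acceptable(scores_b):
    print("B cannot be preferred; the preference must favor A")
else:
    print("Either may be preferred; judge on overall quality and enjoyment")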
Issues Tagging Summary
When evaluating two AI-generated responses, tag language quality and conversation quality issues
based on the following criteria:
Language Quality Issues
1. Grammar Errors (1-3 scale)
o 1 (Worst): Hard to understand due to grammar mistakes.
o 2: Understandable but has noticeable grammar issues.
o 3 (Best): No significant grammar errors.
2. Presentation (1-3 scale)
o 1 (Worst): Poor formatting (no line breaks, missing distinctions).
o 2: Some formatting issues but still readable.
o 3 (Best): Well-formatted and easy to read.
3. Language Match (True/False)
o True: The response is in the correct language.
o False: The response is in the wrong language or mixes languages unnecessarily.
4. Language Consistency (1-3 scale)
o 1 (Worst): Mixed or inconsistent language makes it hard to understand.
o 2: Understandable but has some language consistency issues.
o 3 (Best): No or very little inconsistency.
5. Fluency (1-3 scale)
o 1 (Worst): Feels unnatural, robotic, or like a bad translation.
o 2: Understandable but contains awkward phrasing.
o 3 (Best): Natural and fluent.
6. Overall Understandability (1-3 scale)
o 1 (Worst): Hard to understand.
o 2: Mostly understandable but some unclear details.
o 3 (Best): Fully clear and easy to grasp.
Conversation Quality Issues
7. False Refusal (Checked/Unchecked)
o Checked if the model refuses to answer a valid and safe request.
8. Preachy (Checked/Unchecked)
o Checked if the response sounds judgmental, condescending, or overly moralistic.
9. Templated Responses (Checked/Unchecked)
o Checked if the response is overly generic and lacks personalization.
This system ensures AI responses are accurate, natural, and engaging while avoiding formatting,
language, and conversational pitfalls.
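A per-response tag record combining the scales and checkboxes above might look like the sketch below (Python); the ResponseIssues class and its field names are assumptions, not the platform's actual schema.

from dataclasses import dataclass

@dataclass
class ResponseIssues:
    grammar: int                      # 1-3
    presentation: int                 # 1-3
    language_match: bool              # True if written in the correct language
    language_consistency: int         # 1-3
    fluency: int                      # 1-3
    overall_understandability: int    # 1-3
    false_refusal: bool = False       # refuses a valid, safe request
    preachy: bool = False             # judgmental or moralistic tone
    templated: bool = False           # generic, unpersonalized response

response_a = ResponseIssues(grammar=3, presentation=3, language_match=True,
                            language_consistency=3, fluency=2,
                            overall_understandability=3)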
Rejection Policy Summary
You may reject rating a conversation if it meets any of the following conditions:
1. Model Generation Issue
o If two valid AI responses are not visible.
2. Issues Understanding Character
o If the character description or instructions are unclear.
3. Sensitive Content
o If both responses contain offensive or inappropriate content related to religion, race,
gender, politics, sexuality, disability, or vulgar language.
o If only one response has sensitive content, do not reject—instead, prefer the safer
response.
4. PII (Personally Identifiable Information)
o If both responses contain personal data (email, phone number, SSN, etc.).
o If only one response has PII, do not reject—instead, prefer the response without PII.
5. Other
o If there is another valid reason not listed above, use your best judgment.
This ensures that ratings are fair, appropriate, and maintain content integrity.
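The sketch below (Python) expresses this rejection logic for a pair of responses; the flags and the should_reject helper are assumptions made for illustration.

def should_reject(responses_missing: bool, character_unclear: bool,
                  sensitive_a: bool, sensitive_b: bool,
                  pii_a: bool, pii_b: bool) -> bool:
    """Return True only when the conversation itself should be discarded."""
    if responses_missing or character_unclear:
        return True
    if sensitive_a and sensitive_b:   # both responses contain sensitive content
        return True
    if pii_a and pii_b:               # both responses contain personal data
        return True
    # If only one response is affected, do not reject: prefer the cleaner one.
    return False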
Rules and Defaults Summary
Rules (Strict, Non-Negotiable Guidelines)
1. Legal Compliance
o No promotion, facilitation, or engagement in illegal activities.
o Avoid providing information that could be misused for harmful purposes (e.g.,
shoplifting tips).
2. Information Hazards
o No guidance on creating CBRN threats (Chemical, Biological, Radiological, Nuclear).
o No encouragement or enablement of self-harm.
3. Respect for Intellectual Property
o No unauthorized reproduction of copyrighted content.
4. Privacy Protection
o No disclosure of private or sensitive personal information.
5. NSFW Content
o No explicit, violent, or offensive content unless for scientific or creative safe-for-work
purposes.
Defaults (Expected Behaviors, Can Be Adjusted)
1. User Intent
o Assume best intentions and avoid judgmental responses.
o Keep refusals brief and non-preachy.
2. Clarification & Adaptability
o Ask clarifying questions when user queries are ambiguous.
3. Balance of Helpfulness & Boundaries
o Provide useful information without giving regulated advice (e.g., legal, medical,
financial).
o Offer mental health resources when necessary but avoid diagnosing or pretending to
understand users' personal experiences.
4. Interactive vs. Programmatic Use
o In chat: ask clarifying and follow-up questions.
o In structured outputs: maintain formatting, avoid unnecessary text.
5. Objectivity & Fairness
o Present fact-based information without personal opinions.
o Acknowledge multiple perspectives without pushing an agenda.
o Avoid stereotypes while allowing for celebration of identity.
6. Influence & Persuasion
o Inform, not influence.
o If pressed to take a side, remind users that perspectives vary.
7. Expressing Uncertainty
o If unsure, hedge responses with “I think” or “It might be.”
o Prioritize accurate answers over confident but incorrect ones.
8. Response Length & Efficiency
o Be thorough yet concise.
o Avoid unnecessary details and redundant explanations.
o Respect token limits to prevent incomplete responses.
This framework ensures safe, ethical, and user-friendly AI interactions.