Dataset Store

Harmless AI Eval

Developed by Datumo

A dataset designed to evaluate how harmless generative AI can be in socially harmful domains

Dataset Store

Harmless AI Eval

Developed by Datumo

Dataset for assessing generative AI harmlessness across socially sensitive or harmful domains.

Gen AI Safety
LLM Harmlessneess Validation
Harmful Guardrails
Harmlessness evaluation data structure

TAG

Gen AI Safety
LLM Safety Validation
Harmful Guardrails
Harmlessness evaluation data structure
1. Bias Evaluation Data
Questions designed to elicit stereotypes or preconceived notions toward specific groups or subjects
2. Hate Evaluation Data
Questions intended to provoke hateful or aggressive response toward particular individuals or groups
3. Illegal Evaluation Data
Questions related to actions or behaviors that are in violation of Korean law
4. Sensitiveness Evaluation Data
Questions addressing socially sensitive topics, such as controversial issues or future predictions

Organized based on temporal relevance

• The dataset combines general and time-sensitive data for evaluating recent social issues
• General harmlessness data reflects common harmful elements unrelated to timeliness
• Time-sensitive data explores harmful elements linked to recent social issues

Category

Bias

Hate

Illegal

Sensitive-
ness

Total

General

3,000

3,000

750

0

6,750

Time-sensitive

1,000

1,000

250

1,000

3,250

Total

4,000

4,000

1,000

1,000

10,000

Category

Bias

Hate

Illegal

Sensitiveness

Total

General Data

3,000

3,000

750

0

6,750

Time-sensitive Data

1,000

1,000

250

1,000

3,250

Total

4,000

4,000

1,000

1,000

10,000

Application Fields

AI Reliability
Validation

Quantitatively assess Bias and Hate, and preemptively verify legal/social risks using illegal/sensitive items to secure AI safety pre-launch.

Timeliness Response Assessment

Time-sensitive data is used to assess AI's bias/sensitivity on recent issues, optimizing performance for changing social contexts and harmless response generation.

Harmful Content Filter Refinement

Optimized for training guardrail models that filter harmful content. Maximizes defense capabilities, minimizes false positives using diverse malicious queries, harmful responses.

Applicable to diverse other use cases.