Misinformation & Hallucination
False or fabricated outputs that can mislead users or the public.
- Fake citations
- Fabricated facts
- Synthetic news
SAIFE’s working taxonomy. These categories anchor detection, dashboards, and simulators.
False or fabricated outputs that can mislead users or the public.
Exposing personal or sensitive data in prompts, outputs, or logs.
Unfair treatment or stereotyping across protected attributes.
Outputs that enable or encourage harm to self or others.
Model/system exploits, jailbreaking, or unauthorized access.
Infringing content or non-transformative reproduction.
Deceptive identity or financial abuse via AI systems.
Any material or behavior that endangers minors.
Unqualified guidance in high-stakes domains.
Coordinated influence or civic process interference.
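As a rough illustration of how a fixed category list like this could anchor detection labels and dashboard counts, the sketch below encodes the ten descriptions above as a small lookup table. The codes `C01` to `C10`, the `RiskCategory` type, and the `tally` helper are hypothetical names introduced only for this example and are not part of SAIFE. Only the first category's name appears in the source, so the others are left unnamed.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RiskCategory:
    code: str                   # hypothetical identifier, not from the source
    description: str            # wording copied from the list above
    name: Optional[str] = None  # filled only where the source names the category


TAXONOMY = (
    RiskCategory("C01", "False or fabricated outputs that can mislead users or the public.",
                 "Misinformation & Hallucination"),
    RiskCategory("C02", "Exposing personal or sensitive data in prompts, outputs, or logs."),
    RiskCategory("C03", "Unfair treatment or stereotyping across protected attributes."),
    RiskCategory("C04", "Outputs that enable or encourage harm to self or others."),
    RiskCategory("C05", "Model/system exploits, jailbreaking, or unauthorized access."),
    RiskCategory("C06", "Infringing content or non-transformative reproduction."),
    RiskCategory("C07", "Deceptive identity or financial abuse via AI systems."),
    RiskCategory("C08", "Any material or behavior that endangers minors."),
    RiskCategory("C09", "Unqualified guidance in high-stakes domains."),
    RiskCategory("C10", "Coordinated influence or civic process interference."),
)

BY_CODE = {category.code: category for category in TAXONOMY}


def tally(labels: Iterable[str]) -> dict:
    """Count labelled incidents per category code, rejecting unknown codes."""
    labels = list(labels)
    unknown = set(labels) - set(BY_CODE)
    if unknown:
        raise ValueError(f"unknown category codes: {sorted(unknown)}")
    counts = Counter(labels)
    # Report every category, including those with zero incidents,
    # so a dashboard always shows the full taxonomy.
    return {category.code: counts.get(category.code, 0) for category in TAXONOMY}
```

Calling `tally(["C01", "C01", "C03"])` would return a count for all ten codes, with zeros for categories that received no labels. Rejecting unknown codes keeps detectors, dashboards, and simulators tied to the same closed set of categories.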