SAIFE
CitizenPreview
Secure AI For Everyone

AI Risk Families SAIFE Helps Detect and Prevent

SAIFE is built to monitor real AI risk across consumer, enterprise, and government environments. This page explains the major AI risk families SAIFE is designed to detect, surface, and help prevent — from deepfakes and fraud to privacy leakage, policy drift, high-stakes advice abuse, and manipulative automation.

Risk families covered
18
Public-facing AI risk categories
Families with blocking posture
15
Categories that can require hard intervention
Families with escalation posture
14
Categories that may require routed review or case handling
Common surfaces monitored
60
Examples of where these risks show up

Why this page exists

AI risk is broader than one issue like hallucinations, copyright, or prompt injection. Real-world AI defense requires coverage across identity deception, manipulation, privacy, security, public harm, operational drift, rights-impacting automation, and evidence integrity. SAIFE organizes that reality into clear families so people can understand what is being watched for and why it matters.

What you will see below
• every major AI risk family SAIFE covers publicly
• plain-language descriptions of each category
• examples of what each category includes
• common surfaces where the risk appears
• the kinds of protections SAIFE can support

Public AI risk family index

Each card below represents a major AI risk family SAIFE is built to recognize in the real world. These categories are written in public language, but they reflect the deeper runtime coverage model we built into SAIFE’s global risk foundation.

18 categories shown
Deepfakes & Synthetic Impersonation
Deepfakes

AI-generated or manipulated identity signals that make a person, voice, image, or institution appear real when it is not.

Examples
Deepfake face swapsVoice spoofingSynthetic identity claims
Hallucinated Outputs & Fabricated Claims
Hallucinations

Outputs that sound confident but are false, unsupported, invented, or disconnected from trustworthy evidence.

Examples
Fake citationsFabricated factsUnsupported authority claims
Misinformation & Manipulated Narratives
Misinformation

AI content that distorts meaning, manufactures false consensus, amplifies deceptive framing, or manipulates public understanding.

Examples
Narrative distortionSynthetic consensusCoordinated influence patterns
Doomscroll & Compulsion Amplification
Compulsion

AI-driven patterns that optimize for compulsive engagement, emotional hijacking, fear escalation, outrage loops, or unhealthy retention behavior.

Examples
Compulsion loopsFear amplificationRagebait optimization
Bias & Discrimination
Bias

AI behavior that produces unfair treatment, exclusion, stereotyping, proxy discrimination, or disparate outcomes across protected groups.

Examples
Protected-attribute biasStereotype reinforcementUnequal recommendations
Unsafe & Harmful Outputs
Unsafe outputs

AI outputs that facilitate self-harm, violence, illegal activity, weaponization, or other severe real-world harm.

Examples
Violence enablementSelf-harm guidanceIllegal activity assistance
Harmful or Deceptive Automation
Deceptive automation

AI systems automating deception, manipulation, fraud, scalable social engineering, or rights-impacting actions without appropriate oversight.

Examples
Deceptive automation flowsHigh-volume social engineeringRights-impacting automation without review
Policy Drift
Policy drift

A gap between what an organization says its AI systems should do and what live behavior, controls, or enforcement actually show.

Examples
Runtime behavior drifting from policyMissing policy controlsScope mismatch
Compliance Drift
Compliance drift

A breakdown between required obligations and actual monitoring, retention, escalation, notification, or human review practices.

Examples
Obligation gapsRetention-control driftNotification-route drift
Prohibited Model or Tool Use
Unauthorized AI use

Use of unapproved models, providers, tools, endpoints, plugins, or workflows outside authorized scope.

Examples
Unauthorized model usageUnauthorized tool invocationShadow AI provider usage
Provenance Failure
Provenance

Missing or broken origin, attribution, chain-of-custody, authenticity, or evidence lineage for AI outputs and artifacts.

Examples
Missing provenanceBroken chain of custodyAuthenticity marker failure
Rights-Impacting Automation Without Review
No-review decisions

AI materially shaping or making high-stakes decisions without the required human review, approval, routing, or justification.

Examples
Human review absentAutomated decisions without approvalHigh-stakes routing failure
Privacy & Data Leakage
Privacy

Sensitive data exposure through prompts, outputs, memory, logs, model behavior, or tenant boundary failures.

Examples
PII leakageTraining data exposurePrompt exfiltration attempts
Security, Abuse & Prompt Injection
Security

Attempts to hijack AI instructions, abuse tools, extract credentials, override agent behavior, or take over sessions and workflows.

Examples
Prompt injection attacksTool-chain hijacksInstruction overrides
Fraud & Financial Deception
Fraud

AI-generated scams, phishing, payment deception, invoice fraud, trust-based manipulation, or other financially harmful abuse.

Examples
Synthetic financial fraudPhishing generationPayment redirection deception
Child Safety
Child safety

AI content or behavior that targets, exploits, sexualizes, manipulates, or endangers minors.

Examples
Child exploitation riskGrooming patternsUnsafe dialogue targeting minors
High-Stakes Advice Abuse
High-stakes advice

Unsafe or unauthorized guidance in medical, legal, safety-critical, or other high-consequence domains.

Examples
Unsafe medical guidanceUnauthorized legal guidanceHigh-stakes instructions without review
Political Manipulation & Civic Interference
Political manipulation

AI used to suppress participation, target civic behavior, simulate false grassroots support, or manipulate democratic processes.

Examples
Voter suppression contentTargeted political manipulationAstroturf amplification

Deepfakes & Synthetic Impersonation

Deepfakes

AI-generated or manipulated identity signals that make a person, voice, image, or institution appear real when it is not.

This is one of the fastest-growing AI harms because it can target trust directly — who spoke, who approved something, who sent it, or whether a media artifact is authentic.

SAIFE posture
WarnBlockEscalate
Examples
Deepfake face swapsVoice spoofingSynthetic identity claimsExecutive impersonationCross-modal identity mismatch
Common surfaces
BrowserMessagingUploadsMedia flowsAPIs
Typical protections
WarnBlockEscalate

Hallucinated Outputs & Fabricated Claims

Hallucinations

Outputs that sound confident but are false, unsupported, invented, or disconnected from trustworthy evidence.

AI can appear authoritative even when it is wrong. That becomes dangerous in research, policy, support, legal, medical, and operational workflows.

SAIFE posture
WarnRequire reviewBlock in high-stakes use
Examples
Fake citationsFabricated factsUnsupported authority claimsConfidence without evidenceUngrounded policy answers
Common surfaces
AssistantsSearch overlaysPolicy workflowsSupport toolsResearch tools
Typical protections
WarnRequire reviewBlock in high-stakes use

Misinformation & Manipulated Narratives

Misinformation

AI content that distorts meaning, manufactures false consensus, amplifies deceptive framing, or manipulates public understanding.

The danger is not only false statements. It is also narrative shaping, synthetic amplification, and coordinated influence that can mislead whole communities.

SAIFE posture
WarnBlockEscalateEvidence pack creation
Examples
Narrative distortionSynthetic consensusCoordinated influence patternsCivic manipulation contentDeceptive rewrites or summaries
Common surfaces
BrowserFeedsMessagingResearch assistantsPublic information surfaces
Typical protections
WarnBlockEscalateEvidence pack creation

Doomscroll & Compulsion Amplification

Compulsion

AI-driven patterns that optimize for compulsive engagement, emotional hijacking, fear escalation, outrage loops, or unhealthy retention behavior.

Not every AI harm looks like fraud or hacking. Some harms come from systems trained to keep people hooked, distressed, or behaviorally trapped.

SAIFE posture
WarnDampenRestrictEscalate
Examples
Compulsion loopsFear amplificationRagebait optimizationSession-retention dark patternsExploitation of vulnerable users
Common surfaces
BrowserMobile appsFeedsChat systemsNotifications
Typical protections
WarnDampenRestrictEscalate

Bias & Discrimination

Bias

AI behavior that produces unfair treatment, exclusion, stereotyping, proxy discrimination, or disparate outcomes across protected groups.

Bias becomes especially serious when AI influences rights, opportunities, benefits, employment, education, healthcare, housing, or justice.

SAIFE posture
WarnRequire reviewBlock in high-stakes contexts
Examples
Protected-attribute biasStereotype reinforcementUnequal recommendationsRights-impacting biasProxy discrimination
Common surfaces
Decision support toolsFormsAssistantsSearchHR/legal/benefits systems
Typical protections
WarnRequire reviewBlock in high-stakes contexts

Unsafe & Harmful Outputs

Unsafe outputs

AI outputs that facilitate self-harm, violence, illegal activity, weaponization, or other severe real-world harm.

These are the moments where AI risk becomes urgent and immediate. Systems must be able to intervene, not simply observe.

SAIFE posture
WarnBlockEscalate urgently
Examples
Violence enablementSelf-harm guidanceIllegal activity assistanceWeaponization instructionsExtreme harm escalation
Common surfaces
ChatBrowserSupport toolsAssistantsAPIs
Typical protections
WarnBlockEscalate urgently

Harmful or Deceptive Automation

Deceptive automation

AI systems automating deception, manipulation, fraud, scalable social engineering, or rights-impacting actions without appropriate oversight.

The risk is not just a bad answer. It is an AI system taking or scaling harmful action in ways that feel legitimate or human-approved.

SAIFE posture
WarnBlockRequire reviewEscalate
Examples
Deceptive automation flowsHigh-volume social engineeringRights-impacting automation without reviewAutonomous overreachCovert persuasion at scale
Common surfaces
AgentsWorkflow enginesBrowserMessagingOutreach systems
Typical protections
WarnBlockRequire reviewEscalate

Policy Drift

Policy drift

A gap between what an organization says its AI systems should do and what live behavior, controls, or enforcement actually show.

Policies do not protect anyone if they are not connected to real controls, monitored surfaces, and runtime behavior.

SAIFE posture
WarnEscalateRequire reviewCreate case
Examples
Runtime behavior drifting from policyMissing policy controlsScope mismatchSilent exception expansion
Common surfaces
RuntimeAssignmentsPoliciesScopesEnforcement
Typical protections
WarnEscalateRequire reviewCreate case

Compliance Drift

Compliance drift

A breakdown between required obligations and actual monitoring, retention, escalation, notification, or human review practices.

Compliance risk often grows quietly. SAIFE treats missing review paths, weak evidence handling, and degraded notification routes as detectable risk conditions.

SAIFE posture
WarnEscalateCreate caseRequire remediation
Examples
Obligation gapsRetention-control driftNotification-route driftReview workflow drift
Common surfaces
EvidenceRetentionNotificationsGovernance controlsApproval workflows
Typical protections
WarnEscalateCreate caseRequire remediation

Prohibited Model or Tool Use

Unauthorized AI use

Use of unapproved models, providers, tools, endpoints, plugins, or workflows outside authorized scope.

A major real-world AI risk is shadow AI — usage that bypasses approved controls, oversight, or security boundaries.

SAIFE posture
WarnBlockEscalate
Examples
Unauthorized model usageUnauthorized tool invocationShadow AI provider usageEndpoint scope violations
Common surfaces
BrowserNetworkSDKsAPI gatewaysRuntime
Typical protections
WarnBlockEscalate

Provenance Failure

Provenance

Missing or broken origin, attribution, chain-of-custody, authenticity, or evidence lineage for AI outputs and artifacts.

When no one can answer where content came from, what touched it, or whether it is authentic, trust collapses.

SAIFE posture
WarnRequire reviewBlock in evidence-sensitive flows
Examples
Missing provenanceBroken chain of custodyAuthenticity marker failureMissing source attribution
Common surfaces
EvidenceUploadsArtifactsReportingDashboards
Typical protections
WarnRequire reviewBlock in evidence-sensitive flows

Rights-Impacting Automation Without Review

No-review decisions

AI materially shaping or making high-stakes decisions without the required human review, approval, routing, or justification.

This is one of the most important governance risks because it affects due process, fairness, and accountability in high-stakes settings.

SAIFE posture
Require reviewBlockEscalate
Examples
Human review absentAutomated decisions without approvalHigh-stakes routing failureOverride without justification
Common surfaces
Government workflowsCourtsProvider systemsPolicy systemsHR/benefits/legal/health flows
Typical protections
Require reviewBlockEscalate

Privacy & Data Leakage

Privacy

Sensitive data exposure through prompts, outputs, memory, logs, model behavior, or tenant boundary failures.

Privacy risk can come from what an AI reveals, remembers, routes, or exposes across users, organizations, or workflows.

SAIFE posture
WarnRedactBlockEscalate
Examples
PII leakageTraining data exposurePrompt exfiltration attemptsCross-tenant leaksSensitive memory retention
Common surfaces
BrowserChatStorageLogsMulti-tenant systems
Typical protections
WarnRedactBlockEscalate

Security, Abuse & Prompt Injection

Security

Attempts to hijack AI instructions, abuse tools, extract credentials, override agent behavior, or take over sessions and workflows.

Prompt injection and tool abuse are now core AI security risks. They can turn helpful systems into unsafe ones.

SAIFE posture
WarnBlockIsolateEscalate
Examples
Prompt injection attacksTool-chain hijacksInstruction overridesCredential exfiltrationSession takeover assistance
Common surfaces
AgentsBrowserAPIsToolingRuntime
Typical protections
WarnBlockIsolateEscalate

Fraud & Financial Deception

Fraud

AI-generated scams, phishing, payment deception, invoice fraud, trust-based manipulation, or other financially harmful abuse.

Financial harm often arrives disguised as trust — a voice, a message, an invoice, or a convincing relationship.

SAIFE posture
WarnBlockEscalate
Examples
Synthetic financial fraudPhishing generationPayment redirection deceptionInvoice or identity scamsRomance or trust fraud
Common surfaces
EmailMessagingPaymentsVoiceConsumer workflows
Typical protections
WarnBlockEscalate

Child Safety

Child safety

AI content or behavior that targets, exploits, sexualizes, manipulates, or endangers minors.

This category must always remain first-class and hard-protected. It cannot depend on optional governance maturity.

SAIFE posture
BlockEscalate urgently
Examples
Child exploitation riskGrooming patternsUnsafe dialogue targeting minorsSexualized minor content
Common surfaces
ChatMessagingUploadsMediaConsumer apps
Typical protections
BlockEscalate urgently

High-Stakes Advice Abuse

High-stakes advice

Unsafe or unauthorized guidance in medical, legal, safety-critical, or other high-consequence domains.

AI should not present itself as qualified authority where the consequences of being wrong are serious.

SAIFE posture
WarnRequire reviewBlock
Examples
Unsafe medical guidanceUnauthorized legal guidanceHigh-stakes instructions without reviewFalse expertise claims
Common surfaces
AssistantsHealthcare workflowsLegal workflowsSafety-sensitive systems
Typical protections
WarnRequire reviewBlock

Political Manipulation & Civic Interference

Political manipulation

AI used to suppress participation, target civic behavior, simulate false grassroots support, or manipulate democratic processes.

AI systems can be used to distort civic trust, public discourse, and democratic participation at speed and scale.

SAIFE posture
WarnBlockEscalate
Examples
Voter suppression contentTargeted political manipulationAstroturf amplificationSynthetic civic campaigns
Common surfaces
FeedsMessagingAdsCampaign contentPublic discourse
Typical protections
WarnBlockEscalate