Scan Document

Scan extracted document text for the same risk signals as the prompt scanner: PII, cultural knowledge, education data, re-identification, and prompt injection. The endpoint takes text rather than a file, so extract the text first. Use it for file uploads, pasted documents, or any other large text payload before it reaches an LLM.

Plan requirement: Document scanning requires the documentScanning feature, available on Professional plans and above.

POST /v1/scan/document

Scan document content for risk signals. Requires the documentScanning plan feature.

Request Body

Parameter	Type	Required	Description
`content`	string	Yes	The extracted document text to scan (max 500,000 characters)
`filename`	string	No	Original filename for audit context (e.g. "report.pdf")
`mime_type`	string	No	MIME type of the source document (e.g. "application/pdf")
`metadata`	object	No	Arbitrary key-value metadata attached to the scan
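
The request fields above can be sketched as a TypeScript type with a client-side length check, so oversized payloads fail before a network call is made. The helper name `buildScanDocumentRequest` and the constant `MAX_CONTENT_CHARS` are hypothetical and not part of the SDK. The sketch also assumes the 500,000-character limit can be approximated with JavaScript's `string.length`, which counts UTF-16 code units; the API may count characters differently.

```typescript
// Shape of the request body, taken from the table above.
interface ScanDocumentRequest {
  content: string;
  filename?: string;
  mime_type?: string;
  metadata?: Record<string, unknown>;
}

// Documented maximum for `content`. Assumption: comparing against
// `string.length` (UTF-16 code units) is a close enough approximation.
const MAX_CONTENT_CHARS = 500000;

// Hypothetical helper: builds a request body and rejects oversized content
// locally instead of waiting for the API to reject it.
function buildScanDocumentRequest(
  content: string,
  options: Omit<ScanDocumentRequest, "content"> = {},
): ScanDocumentRequest {
  if (content.length > MAX_CONTENT_CHARS) {
    throw new Error(
      `content is ${content.length} characters; the limit is ${MAX_CONTENT_CHARS}`,
    );
  }
  return { content, ...options };
}
```

A document longer than the limit would need to be split or truncated before scanning; how to do that is left to the caller.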

Response

200 OK

{
  "tier": "red",
  "action": "block",
  "risk_score": 82,
  "signals": [
    {
      "domain": "pii",
      "type": "ssn",
      "confidence": 0.95,
      "weight": 60
    },
    {
      "domain": "cultural",
      "type": "ceremony_reference",
      "confidence": 0.80,
      "weight": 35
    }
  ],
  "explanation": "Detected SSN and indigenous ceremony reference...",
  "sanitized_prompt": "[REDACTED SSN] ... [CULTURAL_REFERENCE]",
  "token_id": null,
  "active_domains": ["pii", "education", "cultural", "reidentification", "injection"],
  "latency_ms": 14,
  "document_metadata": {
    "filename": "intake-form.pdf",
    "mime_type": "application/pdf"
  }
}

Response Fields

Field	Type	Always present	Description
`tier`	"green" | "yellow" | "red"	Yes	Risk classification tier
`action`	"allow" | "flag" | "block"	Yes	Recommended action
`risk_score`	number	Yes	Aggregate risk score (0-100)
`signals`	Signal[]	Yes	Array of detected risk signals
`explanation`	string	Yes	Human-readable explanation of findings
`sanitized_prompt`	string | null	Yes	Content with sensitive data redacted; populated for yellow and red tiers
`token_id`	string | null	Yes	Confirmation token, issued for yellow tier scans
`active_domains`	string[]	Yes	Detection domains that were active
`latency_ms`	number	Yes	Processing time in milliseconds
`document_metadata`	object	Yes	Echo of filename and mime_type from the request
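
For callers who want static typing, the response fields above could be modeled roughly as follows. The `Signal` shape is inferred from the sample response rather than from a documented schema, and the optional fields inside `document_metadata` are an assumption. `textToForward` is a hypothetical policy helper, not part of the SDK: it shows one possible way to act on `action`, forwarding the redacted text for flagged documents and nothing for blocked ones.

```typescript
type Tier = "green" | "yellow" | "red";
type Action = "allow" | "flag" | "block";

// Inferred from the sample response; the real schema may carry more fields.
interface Signal {
  domain: string;
  type: string;
  confidence: number;
  weight: number;
}

interface ScanDocumentResponse {
  tier: Tier;
  action: Action;
  risk_score: number;
  signals: Signal[];
  explanation: string;
  sanitized_prompt: string | null;
  token_id: string | null;
  active_domains: string[];
  latency_ms: number;
  // Assumption: fields omitted from the request may be absent in the echo.
  document_metadata: { filename?: string; mime_type?: string };
}

// Hypothetical policy: decide which text, if any, may go on to the LLM.
// Returns null when nothing should be forwarded.
function textToForward(
  original: string,
  result: ScanDocumentResponse,
): string | null {
  switch (result.action) {
    case "allow":
      return original;
    case "flag":
      // Forward only the redacted version; null if none was returned.
      return result.sanitized_prompt;
    case "block":
      return null;
  }
}
```

An application that uses the yellow-tier confirmation token would likely replace the `"flag"` branch with its own confirmation flow.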

Examples

cURL

curl -X POST https://indigiarmor.com/v1/scan/document \
  -H "Authorization: Bearer ia_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Student transcript for Jane Doe, SSN 123-45-6789...",
    "filename": "transcript.pdf",
    "mime_type": "application/pdf"
  }'
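
The same request can be sketched in TypeScript without the SDK. The function name `scanDocumentRaw` and the `FetchLike` type are hypothetical; the fetch implementation is passed in so the sketch can be exercised with a stub. Throwing on any non-2xx status is an assumption, since only the 200 response is documented here.

```typescript
// Minimal slice of the `fetch` interface that this sketch relies on.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical helper mirroring the cURL call above.
async function scanDocumentRaw(
  fetchImpl: FetchLike,
  apiKey: string,
  body: { content: string; filename?: string; mime_type?: string },
): Promise<unknown> {
  const response = await fetchImpl("https://indigiarmor.com/v1/scan/document", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  // Assumption: treat every non-2xx status as a failure.
  if (!response.ok) {
    throw new Error(`Scan request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

In a runtime that provides a global `fetch`, it can be passed as the first argument, for example `scanDocumentRaw(fetch, apiKey, { content: extractedText })`.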

SDK

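// Assumes `armor` is an initialized SDK client and `extractedText` holds
// the text already extracted from the source file.
const result = await armor.scanDocument({
  content: extractedText,
  filename: 'transcript.pdf',
  mime_type: 'application/pdf',
});

// Stop before the text reaches the LLM when the recommended action is "block".
if (result.action === 'block') {
  console.log('Blocked:', result.explanation);
}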