Security guardrails for AI agents — input filtering, prompt injection detection, and output validation