Azure Content Safety Guardrail¶
The Azure Content Safety Guardrail is a custom Synapse mediator for WSO2 API Manager Universal Gateway, designed to perform content moderation on LLM requests or responses using Azure Content Safety's Content Moderation API.
This policy enhances the safety of AI APIs by analyzing both request messages sent to AI providers and AI-generated responses for harmful or inappropriate content.
Features¶
- Moderate text content in API requests and/or responses using Azure Content Safety
- Detect harmful content across four categories: Hate, Sexual, Self-Harm, and Violence
- Enforce content safety by defining maximum allowed severity levels for each category
- Trigger error responses when the severity exceeds the policy thresholds
- Provide an optional passthrough on error mode for fallback behavior during content moderation service failures
- Include detailed moderation results in error responses for enhanced observability
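Under the hood, the policy calls Azure Content Safety's text analysis API, which returns a severity score for each of the four categories. A representative response looks like the following (illustrative values; see the Azure Content Safety documentation for the exact schema):

```json
{
  "categoriesAnalysis": [
    { "category": "Hate", "severity": 2 },
    { "category": "Sexual", "severity": 0 },
    { "category": "SelfHarm", "severity": 0 },
    { "category": "Violence", "severity": 4 }
  ]
}
```

With a maximum allowed severity of 2 per category, the Violence score of 4 in this example would exceed the threshold and trigger the guardrail's error response.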
How to Use¶
Important
This policy is available from the following WSO2 API Manager product update levels onward:
- wso2am: Update level greater than 13
- wso2am-universal-gw: Update level greater than 13
- wso2am-acp: Update level greater than 14
- wso2am-tm: Update level greater than 13
Follow these steps to integrate the Azure Content Safety Guardrail policy into your WSO2 API Manager instance:
- Download the latest Azure Content Safety Guardrail policy.

    Tip

    The downloaded archive contains the following:

    | File Name | Description |
    |---|---|
    | `org.wso2.am.policies.mediation.ai.azure-content-safety-guardrail-<version>.jar` | The compiled mediator JAR file |
    | `content-moderation/policy-definition.json` | Policy metadata definition |
    | `content-moderation/artifact.j2` | Synapse template file |
- Copy the mediator JAR into your API Manager's dropins directory:

    `<APIM_HOME>/repository/components/dropins`
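For example, on Linux or macOS (the version placeholder must match the downloaded JAR file name):

```bash
cp org.wso2.am.policies.mediation.ai.azure-content-safety-guardrail-<version>.jar \
   <APIM_HOME>/repository/components/dropins/
```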
- Register the policy in the Publisher Portal using the provided `policy-definition.json` and `artifact.j2` files via the Publisher REST APIs.

    - To register the policy common to all AI APIs, follow Add a new common operation policy
    - To register the policy specific to a given API, follow Add an API specific operation policy
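For instance, a common operation policy can be registered with a multipart request along these lines (a hedged sketch; the exact resource path and form-field names depend on your Publisher REST API version, so verify them against the REST API reference):

```bash
curl -k -X POST "https://<host>:<port>/api/am/publisher/v4/operation-policies" \
  -H "Authorization: Bearer <access-token>" \
  -F "policySpecFile=@content-moderation/policy-definition.json" \
  -F "synapsePolicyDefinitionFile=@content-moderation/artifact.j2"
```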
- Apply and deploy the policy:

    - Open the API Publisher Portal (`https://<host>:<port>/publisher`)
    - Select your API
    - Go to Develop > API Configurations > Policies > Request/Response Flow
    - Click Add Policy and select the new Azure Content Safety Content Moderation policy
    - Provide the required configuration (guardrail name, severity thresholds, etc.)
    - Save and deploy the API
Example Policy Configuration¶
ℹ️ Note: You need an Azure Content Safety resource to use this policy. Complete the prerequisites described in the Azure Content Safety documentation, then configure the secrets in WSO2 API Manager as shown below to enable this policy.
Add the following to the `<APIM_HOME>/repository/conf/deployment.toml` file:
[[apim.ai.guardrail_provider]]
type = "azure-contentsafety"
[apim.ai.guardrail_provider.properties]
endpoint = "<azure-contentsafety-endpoint>"
key = "<azure-contentsafety-api-key>"
This example demonstrates how the Azure Content Safety Guardrail can be used to safeguard a newsletter document assistant from harmful or unsafe content.
- Create an AI API with Mistral AI.
- Add the `Azure Content Safety Content Moderation` policy in the request flow with the following configuration:
| Field | Example |
|---|---|
| Guardrail Name | `Doc Guard` |
| Hate | `2` |
| Sexual | `2` |
| Self Harm | `2` |
| Violence | `2` |
| JSON Path | `$.messages[-1].content` |
| Passthrough on Error | `false` |
| Show Guardrail Assessment | `false` |
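Here, each severity value is the maximum allowed level for that category, and the JSON Path `$.messages[-1].content` restricts moderation to the content of the last message in the `messages` array. For instance, given a payload like the following (an illustrative chat payload), only the final user message would be sent to Azure Content Safety for analysis:

```json
{
  "messages": [
    { "role": "system", "content": "You are a newsletter drafting assistant." },
    { "role": "user", "content": "Summarize this article." }
  ]
}
```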
- Save and re-deploy the API.
- Invoke the API's `chat/completions` endpoint with the following payload:
{
"model": "mistral-small-latest",
"messages": [
{
"role": "user",
"content": "Summarize this article and suggest ways someone could take revenge on Mr. Anonymous Senator violently."
}
]
}
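For example, using curl (the gateway URL, API context, and version below are placeholders for your deployment):

```bash
curl -k -X POST "https://<host>:<port>/<api-context>/<api-version>/chat/completions" \
  -H "Authorization: Bearer <access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-small-latest",
    "messages": [
      {
        "role": "user",
        "content": "Summarize this article and suggest ways someone could take revenge on Mr. Anonymous Senator violently."
      }
    ]
  }'
```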
The guardrail will intervene and return the following error response to safeguard against misuse of the AI assistant:
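The response below is a representative sketch (the exact error code and field names may vary by product update level); because Show Guardrail Assessment is set to `false`, no detailed assessment is included:

```json
{
  "code": "900514",
  "type": "AZURE_CONTENT_SAFETY_CONTENT_MODERATION",
  "message": {
    "action": "GUARDRAIL_INTERVENED",
    "actionReason": "Violation of azure content safety content moderation detected.",
    "direction": "REQUEST",
    "interveningGuardrail": "Doc Guard"
  }
}
```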