Azure Content Safety Guardrail¶
The Azure Content Safety Guardrail is a custom Synapse mediator for WSO2 API Manager Universal Gateway, designed to perform content moderation on LLM requests or responses using Azure Content Safety's Content Moderation API.
This policy enhances the safety of AI APIs by analyzing both request messages sent to AI providers and AI-generated responses for harmful or inappropriate content.
Features¶
- Moderate text content in API requests and/or responses using Azure Content Safety
- Detect harmful content across four categories: Hate, Sexual, Self-Harm, and Violence
- Enforce content safety by defining maximum allowed severity levels for each category
- Trigger error responses when the severity exceeds the policy thresholds
- Provide an optional passthrough on error mode for fallback behavior during content moderation service failures
- Include detailed moderation results in error responses for enhanced observability
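Under the hood, the policy calls Azure Content Safety's text analysis API, which returns a severity score for each of the four categories. A representative response looks like the following (illustrative values; see the Azure Content Safety documentation for the exact schema):

```json
{
  "categoriesAnalysis": [
    { "category": "Hate", "severity": 2 },
    { "category": "Sexual", "severity": 0 },
    { "category": "SelfHarm", "severity": 0 },
    { "category": "Violence", "severity": 4 }
  ]
}
```

With a maximum allowed severity of 2 per category, the Violence score of 4 in this example would exceed the threshold and trigger the guardrail's error response.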
How to Use¶
Important
This policy is available from the following WSO2 API Manager product update levels onward:
- wso2am: Update level greater than 13
- wso2am-universal-gw: Update level greater than 13
- wso2am-acp: Update level greater than 14
- wso2am-tm: Update level greater than 13
Follow these steps to integrate the Azure Content Safety Guardrail policy into your WSO2 API Manager instance:
- Download the latest Azure Content Safety Guardrail policy.

    Tip

    The downloaded archive contains the following:

    | File Name | Description |
    |---|---|
    | `org.wso2.am.policies.mediation.ai.azure-content-safety-guardrail-<version>.jar` | The compiled mediator JAR file |
    | `content-moderation/policy-definition.json` | Policy metadata definition |
    | `content-moderation/artifact.j2` | Synapse template file |
- Copy the mediator JAR into your API Manager's dropins directory:

    `<APIM_HOME>/repository/components/dropins`
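For example, on Linux or macOS (the version placeholder must match the downloaded JAR file name):

```bash
cp org.wso2.am.policies.mediation.ai.azure-content-safety-guardrail-<version>.jar \
   <APIM_HOME>/repository/components/dropins/
```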
- Register the policy in the Publisher Portal using the provided `policy-definition.json` and `artifact.j2` files via the Publisher REST APIs.

    - To register the policy common to all AI APIs, follow Add a new common operation policy
    - To register the policy specific to a given API, follow Add an API specific operation policy
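For instance, a common operation policy can be registered with a multipart request along these lines (a hedged sketch; the exact resource path and form-field names depend on your Publisher REST API version, so verify them against the REST API reference):

```bash
curl -k -X POST "https://<host>:<port>/api/am/publisher/v4/operation-policies" \
  -H "Authorization: Bearer <access-token>" \
  -F "policySpecFile=@content-moderation/policy-definition.json" \
  -F "synapsePolicyDefinitionFile=@content-moderation/artifact.j2"
```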
- Apply and deploy the policy:

    - Open the API Publisher Portal (`https://<host>:<port>/publisher`)
    - Select your API
    - Go to Develop > API Configurations > Policies > Request/Response Flow
    - Click Add Policy and select the new Azure Content Safety Content Moderation policy
    - Provide the required configuration (guardrail name, severity thresholds, etc.)
    - Save and deploy the API
Example Policy Configuration¶
ℹ️ Note: You need an Azure Content Safety resource to use this policy. Complete the prerequisites described in the Azure Content Safety documentation, then configure the secrets in WSO2 API Manager as shown below to enable this policy.
Add the following to the `<APIM_HOME>/repository/conf/deployment.toml` file:
[[apim.ai.guardrail_provider]]
type = "azure-contentsafety"
[apim.ai.guardrail_provider.properties]
endpoint = "<azure-contentsafety-endpoint>"
key = "<azure-contentsafety-api-key>"
This example demonstrates how the Azure Content Safety Guardrail can be used to safeguard a newsletter document assistant from harmful or unsafe content.
- Create an AI API with Mistral AI.
- Add the `Azure Content Safety Content Moderation` policy in the request flow with the following configuration:
| Field | Example |
|---|---|
| Guardrail Name | `Doc Guard` |
| Hate | `2` |
| Sexual | `2` |
| Self Harm | `2` |
| Violence | `2` |
| JSON Path | `$.messages[-1].content` |
| Passthrough on Error | `false` |
| Show Guardrail Assessment | `false` |
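Here, each severity value is the maximum allowed level for that category, and the JSON Path `$.messages[-1].content` restricts moderation to the content of the last message in the `messages` array. For instance, given a payload like the following (an illustrative chat payload), only the final user message would be sent to Azure Content Safety for analysis:

```json
{
  "messages": [
    { "role": "system", "content": "You are a newsletter drafting assistant." },
    { "role": "user", "content": "Summarize this article." }
  ]
}
```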
- Save and re-deploy the API.
- Invoke the API's `chat/completions` endpoint with the following payload:
{
"model": "mistral-small-latest",
"messages": [
{
"role": "user",
"content": "Summarize this article and suggest ways someone could take revenge on Mr. Anonymous Senator violently."
}
]
}
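For example, using curl (the gateway URL, API context, and version below are placeholders for your deployment):

```bash
curl -k -X POST "https://<host>:<port>/<api-context>/<api-version>/chat/completions" \
  -H "Authorization: Bearer <access-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-small-latest",
    "messages": [
      {
        "role": "user",
        "content": "Summarize this article and suggest ways someone could take revenge on Mr. Anonymous Senator violently."
      }
    ]
  }'
```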
The guardrail will intervene and return the following error response to safeguard against misuse of the AI assistant:
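The response below is a representative sketch (the exact error code and field names may vary by product update level); because Show Guardrail Assessment is set to `false`, no detailed assessment is included:

```json
{
  "code": "900514",
  "type": "AZURE_CONTENT_SAFETY_CONTENT_MODERATION",
  "message": {
    "action": "GUARDRAIL_INTERVENED",
    "actionReason": "Violation of azure content safety content moderation detected.",
    "direction": "REQUEST",
    "interveningGuardrail": "Doc Guard"
  }
}
```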