Guardrails
Creating an AI app is easy, but making sure it works well in the real world is hard. Fortunately, the solution is fairly simple: use LLM guardrails and monitoring. Here we’ll go over two example tools that integrate easily into your AI apps to keep them working properly: Guardrails AI and Iudex AI.
Starting LLM agent
Let’s say we want to make a joke writing service.
from openai import OpenAI

client = OpenAI()

def tell_joke(subject):
    res = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Tell me a joke about {subject}",
        }],
        temperature=1,
    )
    # subject="space" => "Why did the sun go to school? To get a little brighter!"
    return res.choices[0].message.content
Running tell_joke with “space” gets the result: “Why did the sun go to school? To get a little brighter!”
Great, we’re done!
But wait a minute, we don’t want our jokes to be too long or else they won’t be funny.
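Before reaching for a library, we could hand-roll a length check around tell_joke ourselves. Here’s a minimal sketch; the 50-character limit, the retry count, and the tell_short_joke helper are illustrative assumptions, not part of any library:

MAX_JOKE_LENGTH = 50  # illustrative limit

def tell_short_joke(subject, retries=3):
    # Naive approach: keep asking until the model produces a short enough joke.
    for _ in range(retries):
        joke = tell_joke(subject)
        if joke and len(joke) <= MAX_JOKE_LENGTH:
            return joke
    raise ValueError("Couldn't get a short enough joke")

This works, but every new rule means more ad hoc code around every call site.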
Guarded LLM agent
Our friends over at Guardrails AI have built an amazing library that acts like a supercharged moderation endpoint. Let’s use it to make sure our jokes are always under 50 characters.
Guardrails works by wrapping your LLM calls with sets of tests that the inputs or outputs of the LLM must pass. For our joke bot, Guardrails requires that we create a guard to enforce output length and then replace the openai client with the guard.
from guardrails import Guard
from guardrails.hub import ValidLength

def tell_joke(subject):
    guard = Guard().use_many(
        ValidLength(min=0, max=50),
    )
    res = guard(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Tell me a joke about {subject}",
        }],
        temperature=1,
    )
    # subject="space" => "Why did the sun go to school? To get a little brighter!"
    return res.validated_output
Now that the request can fail, how will we know when it does?
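Locally, we can inspect the outcome object the guard returns. A minimal sketch, assuming the result exposes validation_passed and raw_llm_output (as in recent Guardrails releases):

from guardrails import Guard
from guardrails.hub import ValidLength

guard = Guard().use_many(ValidLength(min=0, max=50))
res = guard(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a joke about space"}],
    temperature=1,
)
if res.validation_passed:
    print(res.validated_output)
else:
    # The joke failed the length check; decide how to handle it here.
    print("Validation failed, raw output was:", res.raw_llm_output)

Checking by hand only covers a single call, though; in production we want this visibility across every request.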
Monitored LLM agent
We can easily set up Iudex to capture Guardrails logs. Iudex works with any server framework you run in production. In this example, we’ll use a FastAPI server. We only need to make two changes:
- Import from iudex import instrument and call instrument above the fastapi import.
- Import from guardrails.telemetry import default_otlp_tracer, add default_otlp_tracer('joke_guard'), and add name="joke_guard" in Guard().
from iudex import instrument
instrument(
    service_name="joke-bot",
    env="prod",
    iudex_api_key="your_write_only_api_key",
)

from fastapi import FastAPI, APIRouter
from guardrails import Guard
from guardrails.hub import ValidLength
from guardrails.telemetry import default_otlp_tracer

app = FastAPI()
default_otlp_tracer('joke_guard')

@app.get("/joke/{subject}")
def tell_joke(subject):
    if not subject:
        raise ValueError("I can't tell a joke without a subject!")
    guard = Guard(name="joke_guard").use_many(
        ValidLength(min=0, max=100),
    )
    res = guard(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Tell me a joke about {subject}",
        }],
        temperature=1,
    )
    # subject="space" => "Why did the sun go to school? To get a little brighter!"
    return res.validated_output
Voila! Now we can see whenever someone asks for a joke from our service.
And we can see whether the LLM outputs were valid.
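To see it end to end, start the server and request a joke. A quick usage sketch, assuming the code above is saved as main.py and served locally on the default port:

# Start the server first, e.g. with: uvicorn main:app --port 8000
import requests

res = requests.get("http://localhost:8000/joke/space")
print(res.status_code, res.json())
# e.g. 200 "Why did the sun go to school? To get a little brighter!"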
Check out Iudex AI and Guardrails AI for more information on how to get started and the benefits of enterprise-grade support!