OpenAI provides access to GPT models including GPT-5 and other cutting-edge language models. Braintrust integrates seamlessly with OpenAI through direct API access, wrapOpenAI wrapper functions for automatic tracing, and proxy support.
This guide covers manual instrumentation. For quicker setup, use auto-instrumentation.
For the GPT-5 family of models, the temperature parameter is not configurable. It is handled automatically by the model and is disabled in the Braintrust UI.

Setup

To use OpenAI models in the Braintrust playground, API, and gateway, connect OpenAI as a provider in your organization or project AI providers.
  1. Go to Settings > AI providers.
  2. Click Organization provider or Project provider, depending on whether you want the provider to be available across every project in the organization or just the current project.
  3. Under Model providers, click OpenAI.
  4. Choose your authentication method:
    • API key: Visit OpenAI’s API platform, create a new API key, and paste it into the Secret field.
      API keys are stored as one-way cryptographic hashes, never in plaintext.
    • Workload identity federation: Exchange a Braintrust-signed OIDC token for an OpenAI access token, instead of storing a long-lived OpenAI API key in Braintrust.
      Workload identity federation is available only for organization-level providers on Braintrust-hosted organizations with the Braintrust gateway enabled. Project-level providers and self-hosted deployments must use API key authentication.
  5. If you chose Workload identity federation, use the setup values shown in Braintrust to configure OpenAI:
    1. Create a workload identity provider in OpenAI. Use the Issuer URL and Audience shown in Braintrust, set JWKS source to OIDC discovery, use the Subject pattern, and add the Required claims shown in Braintrust.
    2. Map the workload identity provider to the OpenAI service account Braintrust should use.
    3. Paste the OpenAI IDs back into Braintrust:
      • Identity provider ID: The workload identity provider ID configured for Braintrust.
      • Service account ID: The OpenAI service account ID Braintrust should use.
      • Subject suffix: A stable suffix for this OpenAI connection. It must match the final part of the subject pattern used in OpenAI.
    For general OpenAI concepts and dashboard details, see OpenAI’s workload identity federation docs.
  6. Click Save.
To call OpenAI directly from your application code rather than through the Braintrust gateway, set your OpenAI API key and Braintrust API key as environment variables:
.env
OPENAI_API_KEY=<your-openai-api-key>
BRAINTRUST_API_KEY=<your-braintrust-api-key>

# For organizations on the EU data plane, use https://api-eu.braintrust.dev
# For self-hosted deployments, use your data plane URL
# BRAINTRUST_API_URL=<your-braintrust-api-url>
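A missing key usually surfaces later as an authentication error, so it can help to check the environment at startup. The following is a minimal sketch, not part of the Braintrust or OpenAI SDKs; the helper name `missingEnvVars` is hypothetical.

```typescript
// Hypothetical helper (not part of the Braintrust or OpenAI SDKs).
// Returns the names of required variables that are unset or blank.
type Env = Record<string, string | undefined>;

const REQUIRED_VARS = ["OPENAI_API_KEY", "BRAINTRUST_API_KEY"];

function missingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js application you might call `missingEnvVars(process.env)` before creating any clients and stop with a clear message if the result is non-empty.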
Install the braintrust and openai packages.
# pnpm
pnpm add braintrust openai
# npm
npm install braintrust openai

Trace with OpenAI

Trace your OpenAI LLM calls for observability and monitoring. Using the OpenAI Agents SDK? See the OpenAI Agents SDK framework docs.

Trace automatically

Braintrust provides automatic tracing for OpenAI API calls, handling streaming, metrics collection, and other details.
  • TypeScript & Python: Use wrapOpenAI / wrap_openai wrapper functions
  • Go: Use the tracing middleware with the OpenAI client
  • Ruby: Use Braintrust::Trace::OpenAI.wrap to wrap the OpenAI client
  • Java: Use the tracing interceptor with the OpenAI client
  • C#: Use BraintrustOpenAI.WrapOpenAI to wrap the OpenAI client
For more control over tracing, learn how to customize traces.
import { initLogger, wrapOpenAI } from "braintrust";
import OpenAI from "openai";

// Initialize the Braintrust logger
const logger = initLogger({
  projectName: "My Project", // Your project name
  apiKey: process.env.BRAINTRUST_API_KEY,
});

// Wrap the OpenAI client with wrapOpenAI
const client = wrapOpenAI(
  new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
  }),
);

// All API calls are automatically logged
const result = await client.chat.completions.create({
  model: "gpt-5-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is machine learning?" },
  ],
});

Stream OpenAI responses

wrap_openai/wrapOpenAI can automatically log metrics like prompt_tokens, completion_tokens, and tokens for streaming LLM calls if the LLM API returns them. Set include_usage to true in the stream_options parameter to receive these metrics from OpenAI.
const result = await client.chat.completions.create({
  model: "gpt-5-mini",
  messages: [{ role: "user", content: "Count to 10" }],
  stream: true,
  stream_options: {
    include_usage: true, // Required for token metrics
  },
});

for await (const chunk of result) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
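When include_usage is set, OpenAI sends the token counts in a final chunk whose choices array is empty, so a loop that reads only chunk.choices[0] never sees them. The wrapper logs these metrics for you; the sketch below is only for reading them in your own code. The `Chunk` and `Usage` types are simplified local stand-ins for the SDK types, and `collectStream` is a hypothetical helper.

```typescript
// Simplified local shapes covering only the fields this sketch reads;
// the real OpenAI SDK chunk type has more fields.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface Chunk {
  choices: { delta?: { content?: string | null } }[];
  usage?: Usage | null;
}

// Hypothetical helper: concatenates streamed text and keeps the usage
// object from whichever chunk carries it (normally the last one).
async function collectStream(
  stream: AsyncIterable<Chunk>,
): Promise<{ text: string; usage: Usage | null }> {
  let text = "";
  let usage: Usage | null = null;
  for await (const chunk of stream) {
    text += chunk.choices[0]?.delta?.content ?? "";
    if (chunk.usage) {
      usage = chunk.usage;
    }
  }
  return { text, usage };
}
```

If `usage` comes back as null, the stream was most likely created without include_usage.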

Evaluate with OpenAI

Evaluations turn the non-deterministic outputs of OpenAI models into a feedback loop that helps you ship more reliable, higher-quality products. A Braintrust Eval is a function composed of a dataset of user inputs, a task, and a set of scorers. To learn more about evaluations, see the Experiments guide.

Basic OpenAI eval setup

Evaluate the outputs of OpenAI models with Braintrust.
import { Eval } from "braintrust";
import { OpenAI } from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

Eval("OpenAI Evaluation", {
  // An array of user inputs and expected outputs
  data: () => [
    { input: "What is 2+2?", expected: "4" },
    { input: "What is the capital of France?", expected: "Paris" },
  ],
  task: async (input) => {
    // Your OpenAI LLM call
    const response = await client.chat.completions.create({
      model: "gpt-5-mini",
      messages: [{ role: "user", content: input }],
    });
    return response.choices[0].message.content;
  },
  scores: [
    {
      name: "accuracy",
      // A simple scorer that returns 1 if the output matches the expected output, 0 otherwise
      scorer: (args) => (args.output === args.expected ? 1 : 0),
    },
  ],
});
Learn more about eval data and scorers.
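Exact string comparison is brittle for model output: a reply of "Paris." would score 0 against an expected value of "Paris". One possible refinement, sketched here under the assumption that outputs are short strings, is to normalize both sides before comparing. The names `normalize` and `normalizedMatch` are hypothetical and not part of Braintrust or autoevals.

```typescript
// Hypothetical scorer helper; reads the same output/expected fields as the
// accuracy scorer above.
interface ScorerArgs {
  output: string | null;
  expected?: string;
}

// Trim whitespace, lowercase, and drop trailing sentence punctuation.
function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/[.!?]+$/, "");
}

// Returns 1 when the normalized strings are equal, 0 otherwise
// (including when either value is missing).
function normalizedMatch({ output, expected }: ScorerArgs): number {
  if (output === null || expected === undefined) {
    return 0;
  }
  return normalize(output) === normalize(expected) ? 1 : 0;
}
```

This still scores a full-sentence answer such as "The capital of France is Paris." as 0, so for free-form answers an LLM judge is usually a better fit.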

Use OpenAI as an LLM judge

You can use OpenAI models to score the outputs of other AI systems. This example uses the LLMClassifierFromSpec scorer from the autoevals package to score the relevance of an AI system's outputs. Install autoevals first.
# pnpm
pnpm add autoevals
# npm
npm install autoevals
Create a relevance scorer with LLMClassifierFromSpec. You can then include relevanceScorer as a scorer in your Eval function (see above).
import { LLMClassifierFromSpec } from "autoevals";

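const relevanceScorer = LLMClassifierFromSpec("Relevance", {
  // Example prompt (an assumption, adjust for your use case); {{input}} and
  // {{output}} are filled in from each eval case
  prompt:
    "Question: {{input}}\nAnswer: {{output}}\nIs the answer relevant to the question? Choose Relevant or Irrelevant.",
  choice_scores: { Relevant: 1, Irrelevant: 0 },
  model: "gpt-5-mini",
  use_cot: true,
});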

Additional features

Structured outputs

OpenAI’s structured outputs are supported with the wrapper functions.
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

// Define a Zod schema for the response
const ResponseSchema = z.object({
  name: z.string(),
  age: z.number(),
});

const completion = await client.beta.chat.completions.parse({
  model: "gpt-5-mini",
  messages: [
    { role: "system", content: "Extract the person's name and age." },
    { role: "user", content: "My name is John and I'm 30 years old." },
  ],
  // zodResponseFormat converts the Zod schema into a json_schema response format
  response_format: zodResponseFormat(ResponseSchema, "person"),
});

Function calling and tools

Braintrust supports OpenAI function calling for building AI agents with tools.
const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Get current weather for a location",
      parameters: {
        type: "object",
        properties: {
          location: { type: "string" },
        },
        required: ["location"],
      },
    },
  },
];

const response = await client.chat.completions.create({
  model: "gpt-5-mini",
  messages: [{ role: "user", content: "What's the weather in San Francisco?" }],
  tools,
});
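When the model decides to use a tool, the response carries the request in response.choices[0].message.tool_calls, with the arguments encoded as a JSON string. Your code runs the tool and sends the result back as a "tool" role message. The sketch below shows that dispatch step in isolation; `ToolCall` is a simplified local stand-in for the SDK type, and `getWeather` and `runToolCalls` are hypothetical.

```typescript
// Simplified local shape of one entry in message.tool_calls.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Hypothetical local implementation; a real app would call a weather API.
function getWeather(args: { location: string }): string {
  return `Sunny in ${args.location}`;
}

// Maps tool names (as declared in the tools array) to local handlers.
const handlers = new Map<string, (args: any) => string>([
  ["get_weather", getWeather],
]);

// Runs each requested tool and returns "tool" role messages to append to
// the conversation before calling the model again.
function runToolCalls(toolCalls: ToolCall[]) {
  return toolCalls.map((call) => {
    const handler = handlers.get(call.function.name);
    const content = handler
      ? handler(JSON.parse(call.function.arguments))
      : `Unknown tool: ${call.function.name}`;
    return { role: "tool" as const, tool_call_id: call.id, content };
  });
}
```

A follow-up request would then include the original messages, the assistant message containing tool_calls, and the messages returned by `runToolCalls`.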

Streaming audio transcriptions

Braintrust traces streaming audio transcription calls for sync and async OpenAI clients. Each span captures the audio file as an attachment and the final transcript as the span output.
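from braintrust import init_logger, wrap_openai
from openai import OpenAI

# Initialize the Braintrust logger and wrap the OpenAI client
init_logger(project="My Project")
client = wrap_openai(OpenAI())

with open("audio.m4a", "rb") as f:
    stream = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=f,
        stream=True,
    )
    # Print each transcript delta as it arrives
    for event in stream:
        if event.type == "transcript.text.delta":
            print(event.delta, end="", flush=True)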

Multimodal content, attachments, errors, and masking sensitive data

To learn more about these topics, check out the customize traces guide.

Use OpenAI with Braintrust gateway

You can also access OpenAI models through the Braintrust gateway, which provides a unified interface for multiple providers. Use any supported provider’s SDK to call OpenAI models.
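import OpenAI from "openai";

// Point the OpenAI SDK at the Braintrust gateway and authenticate with your Braintrust API key
const client = new OpenAI({
  baseURL: "https://gateway.braintrust.dev/v1",
  apiKey: process.env.BRAINTRUST_API_KEY,
});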

const response = await client.chat.completions.create({
  model: "gpt-5-mini",
  messages: [{ role: "user", content: "What is a proxy?" }],
  seed: 1, // A seed activates the proxy's cache
});
