Getting started¶
This notebook is a practical guide to using the structured-logprobs library with OpenAI's API to generate structured responses enriched with token-level log probabilities.
Install the library¶
structured-logprobs is available on PyPI and can be installed with pip.
!pip install structured-logprobs~=0.1
Let's import the required libraries.
import getpass
import json
import math
from openai import OpenAI
from openai.types import ResponseFormatJSONSchema
from rich import print, print_json
from structured_logprobs.main import add_logprobs, add_logprobs_inline
Setting Up the OpenAI API Client¶
An OpenAI API key is required to authenticate with OpenAI's API. The key is used to initialize the OpenAI Python client, which lets you send requests to the API and receive responses.
In this notebook, you will be prompted to enter your OPENAI_API_KEY securely using Python's getpass module. This ensures that your key is not hardcoded, reducing the risk of accidental exposure.
api_key = getpass.getpass(prompt="Enter your OPENAI_API_KEY: ")
Let's initialize the OpenAI client.
client = OpenAI(api_key=api_key)
Create a chat completion request¶
The first step is to define the JSON schema that will be passed in the chat request to OpenAI. This schema tells OpenAI exactly how the response should be formatted and organized.
Below is the example JSON schema used in this notebook. To learn more about JSON Schema, refer to this overview.
schema_content = {
    "type": "json_schema",
    "json_schema": {
        "name": "answers",
        "description": "Response to questions in JSON format",
        "schema": {
            "type": "object",
            "properties": {
                "capital_of_France": {"type": "string"},
                "the_two_nicest_colors": {
                    "type": "array",
                    "items": {"type": "string", "enum": ["red", "blue", "green", "yellow", "purple"]},
                },
                "die_shows": {"type": "integer"},
            },
            "required": ["capital_of_France", "the_two_nicest_colors", "die_shows"],
            "additionalProperties": False,
        },
        "strict": True,
    },
}
The schema must be validated before being used as a parameter in the request to OpenAI.
response_schema = ResponseFormatJSONSchema.model_validate(schema_content)
To create the chat completion, you then set up the model, the input messages, and other parameters such as logprobs and response_format.
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": (
                "I have three questions. The first question is: What is the capital of France? "
                "The second question is: Which are the two nicest colors? "
                "The third question is: Can you roll a die and tell me which number comes up?"
            ),
        }
    ],
    logprobs=True,
    response_format=response_schema.model_dump(by_alias=True),
)
If you print the response, you can observe how OpenAI organizes the logprobs. These logprobs are associated with individual tokens, which may not be convenient if you are looking for the log probability of the full value extracted for each requested field.
ChatCompletion(
id='chatcmpl-ApHuoaVGaxOoPUX6syvQt9XkfSkCe',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=ChoiceLogprobs(
content=[
ChatCompletionTokenLogprob(
token='{"',
bytes=[123, 34],
logprob=-1.50940705e-05
),
ChatCompletionTokenLogprob(
token='capital',
bytes=[99, 97, 112, 105, 116, 97, 108],
logprob=-7.226629e-06
),
#...
],
refusal=None
),
message=ChatCompletionMessage(
content='{"capital_of_France":"Paris","the_two_nicest_colors":["blue","green"],"die_shows":4}',
refusal=None,
role='assistant',
audio=None,
function_call=None,
tool_calls=None
)
)
],
created=1736786958,
model='gpt-4o-2024-08-06',
object='chat.completion',
service_tier='default',
system_fingerprint='fp_703d4ff298',
usage=CompletionUsage(
completion_tokens=27,
prompt_tokens=133,
total_tokens=160,
completion_tokens_details=CompletionTokensDetails(
accepted_prediction_tokens=0,
audio_tokens=0,
reasoning_tokens=0,
rejected_prediction_tokens=0
),
prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)
)
)
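To see this token-level granularity directly, you can iterate over the returned logprobs yourself. Here is a minimal sketch using the completion object created above:
# Print the first few tokens of the response together with their log probabilities
for token_logprob in completion.choices[0].logprobs.content[:5]:
    print(f"{token_logprob.token!r}: {token_logprob.logprob}")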
Enhance the chat completion result with log probabilities¶
The strategy for aggregating log probabilities maps each character of the generated message's content to its corresponding token. Instead of reporting individual token probabilities, the log probabilities of all tokens that form a given value are summed, which yields a meaningful probability for each extracted JSON element.
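As a toy illustration of why summing works (with hypothetical token log probabilities, not values taken from the response above): if "Paris" were split into the tokens "Par" and "is", the probabilities of consecutive tokens multiply, so their log probabilities simply add up to the log probability of the full value.
# Hypothetical token log probabilities, for illustration only
logprob_par = -0.01  # token "Par"
logprob_is = -0.02  # token "is"
# Summing log probabilities is equivalent to multiplying probabilities
logprob_paris = logprob_par + logprob_is
assert math.isclose(math.exp(logprob_paris), math.exp(logprob_par) * math.exp(logprob_is))
print(round(math.exp(logprob_paris), 2))  # 0.97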
chat_completion = add_logprobs(completion)
If you print the response now, you can see that it is a new Python object containing the original OpenAI response under the value field, plus a log_probs field where the message values are replaced with their respective log probabilities.
ChatCompletionWithLogProbs(
value=ChatCompletion(
id='chatcmpl-ApHuoaVGaxOoPUX6syvQt9XkfSkCe',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=ChoiceLogprobs(
content=[
ChatCompletionTokenLogprob(
token='{"',
bytes=[123, 34],
logprob=-1.50940705e-05,
top_logprobs=[]
),
#...
],
refusal=None
),
message=ChatCompletionMessage(
content='{"capital_of_France":"Paris","the_two_nicest_colors":["blue","green"],"die_shows":4}',
refusal=None,
role='assistant',
audio=None,
function_call=None,
tool_calls=None
)
)
],
created=1736786958,
model='gpt-4o-2024-08-06',
object='chat.completion',
service_tier='default',
system_fingerprint='fp_703d4ff298',
usage=CompletionUsage(
completion_tokens=27,
prompt_tokens=133,
total_tokens=160,
completion_tokens_details=CompletionTokensDetails(
accepted_prediction_tokens=0,
audio_tokens=0,
reasoning_tokens=0,
rejected_prediction_tokens=0
),
prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)
)
),
log_probs=[
{
'capital_of_France': -1.22165105e-06,
'the_two_nicest_colors': [-0.00276869551265, -0.00539924761265],
'die_shows': -0.44008404
}
]
)
print_json(chat_completion.value.choices[0].message.content)
{ "capital_of_France": "Paris", "the_two_nicest_colors": [ "blue", "green" ], "die_shows": 4 }
print(chat_completion.log_probs[0])
{ 'capital_of_France': -1.22165105e-06, 'the_two_nicest_colors': [-0.00276869551265, -0.00539924761265], 'die_shows': -0.44008404 }
By applying the exponential function to logprobs, you can easily convert them to probabilities.
data = chat_completion.log_probs[0]
transformed_data = {
    key + "_prob": [round(math.exp(log_prob), 2) for log_prob in value]
    if isinstance(value, list)
    else round(math.exp(value), 2)
    for key, value in data.items()
}
print(transformed_data)
{'capital_of_France_prob': 1.0, 'the_two_nicest_colors_prob': [1.0, 0.99], 'die_shows_prob': 0.64}
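These probabilities make it easy to flag low-confidence answers. Below is a minimal sketch; the low_confidence_fields helper and the 0.9 cutoff are illustrative choices, not part of the library.
# Flag fields whose aggregated probability falls below a chosen threshold.
# The 0.9 cutoff is arbitrary; tune it to your application.
def low_confidence_fields(log_probs: dict, threshold: float = 0.9) -> list[str]:
    flagged = []
    for key, value in log_probs.items():
        probs = [math.exp(v) for v in value] if isinstance(value, list) else [math.exp(value)]
        if min(probs) < threshold:
            flagged.append(key)
    return flagged

print(low_confidence_fields(chat_completion.log_probs[0]))  # ['die_shows']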
Enhance the chat completion result with in-line log probabilities¶
With the add_logprobs_inline function you can embed log probabilities directly within the content of the message. Instead of returning log probabilities as a separate field, this function integrates them into the content of the chat completion response itself, so that atomic values are accompanied by their respective log probabilities.
chat_completion_inline = add_logprobs_inline(completion)
If you print the response now, you can see that the message content has been replaced with a dictionary that also includes inline log probabilities for the atomic values.
ChatCompletion(
id='chatcmpl-ApIDdbCuAJ8EHM6RDNgGR3mEQZTBH',
choices=[
Choice(
finish_reason='stop',
index=0,
logprobs=ChoiceLogprobs(
content=[
ChatCompletionTokenLogprob(
token='{"',
bytes=[123, 34],
logprob=-2.3795938e-05,
top_logprobs=[]
),
#...
],
refusal=None
),
message=ChatCompletionMessage(
content='{"capital_of_France": "Paris", "capital_of_France_logprob": -7.448363e-07,
"the_two_nicest_colors": ["blue", "green"], "die_shows": 4.0, "die_shows_logprob": -0.46062052}',
refusal=None,
role='assistant',
audio=None,
function_call=None,
tool_calls=None
)
)
],
created=1736788125,
model='gpt-4o-2024-08-06',
object='chat.completion',
service_tier='default',
system_fingerprint='fp_703d4ff298',
usage=CompletionUsage(
completion_tokens=27,
prompt_tokens=133,
total_tokens=160,
completion_tokens_details=CompletionTokensDetails(
accepted_prediction_tokens=0,
audio_tokens=0,
reasoning_tokens=0,
rejected_prediction_tokens=0
),
prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)
)
)
print_json(chat_completion_inline.choices[0].message.content)
{ "capital_of_France": "Paris", "capital_of_France_logprob": -7.448363e-07, "the_two_nicest_colors": [ "blue", "green" ], "die_shows": 4.0, "die_shows_logprob": -0.46062052 }
The probability can easily be obtained by exponentiating the log probability.
data = json.loads(chat_completion_inline.choices[0].message.content)
transformed_data = {
    (key[:-8] + "_prob" if key.endswith("_logprob") else key): (
        round(math.exp(value), 2) if key.endswith("_logprob") else value
    )
    for key, value in data.items()
}
print(transformed_data)
{ 'capital_of_France': 'Paris', 'capital_of_France_prob': 1.0, 'the_two_nicest_colors': ['blue', 'green'], 'die_shows': 4.0, 'die_shows_prob': 0.63 }