Build Your Own ChatGPT: Structured Outputs with LLMs
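A. Introduction
As mentioned in 〖Build Your Own ChatGPT: OpenAI Chat Completion API Parameters Explained〗, an LLM's language ability makes it easy to extract specific key information (structured data) from a conversation (unstructured data). This is very useful when we need to keep track of particular details over the course of a dialogue.
With OpenAI's Structured Outputs feature, we can ensure that the model's response conforms to a JSON Schema defined in advance. When calling the REST API, the JSON Schema is passed to OpenAI directly in the request body; the SDK follows the same approach.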
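As a rough illustration of passing a JSON Schema straight to the REST API, the sketch below only builds the HTTP request with the standard library and does not send it. The helper name build_request, the placeholder API key, and the tiny EXAMPLE_SCHEMA are assumptions made for this example.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(api_key: str, schema: dict) -> urllib.request.Request:
    """Build (but do not send) a Chat Completions request that carries a JSON Schema."""
    payload = {
        "model": "gpt-4o-2024-08-06",
        "messages": [
            {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."}
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "CalendarEvent", "schema": schema, "strict": True},
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )


# A tiny hand-written schema, used only for illustration.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
    "additionalProperties": False,
}

request = build_request("sk-placeholder", EXAMPLE_SCHEMA)
```

Sending it would be a matter of calling urllib.request.urlopen(request) with a real key; the SDK examples in the next section wrap the same payload.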
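Models that support Structured Outputs
- gpt-4o-mini-2024-07-18 and later
- gpt-4o-2024-08-06 and later
Older models can fall back to JSON mode ({"type": "json_object"}). The official documentation recommends Structured Outputs whenever it is available, and in practice Structured Outputs handles complex structures better than JSON mode.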
B. How to Use Structured Outputs
OpenAI's Structured Outputs follows JSON Schema rules. To make the Chat Completion API return data in a specific format, specify the output shape through the response_format parameter:
from openai import OpenAI
import json

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "CalendarEvent",
            "description": "Extract the event name, date, and participants",
            "schema": {
                "properties": {
                    "name": {"title": "Name", "type": "string"},
                    "date": {"title": "Date", "type": "string"},
                    "participants": {
                        "items": {"type": "string"},
                        "title": "Participants",
                        "type": "array",
                    },
                },
                "required": ["name", "date", "participants"],
                "title": "CalendarEvent",
                "type": "object",
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
)

# The message content is a JSON string that follows the schema above
event = json.loads(completion.choices[0].message.content)
print(event["name"], event["date"], event["participants"])
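In this example response_format is set to json_schema. Its basic structure and constraints are:
{
    "type": "json_schema",
    "json_schema": {
        "name": required; characters a-z, A-Z, 0-9, _ and -; at most 64 characters
        "description": helps the model understand how to use this structure
        "schema": ***** the JSON Schema goes here *****
        "strict": boolean or null; whether to follow the definition strictly. When True, only a subset of JSON Schema is supported. Defaults to False
    }
}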
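The constraints listed above can be checked locally before a request is sent. The following is a minimal sketch, not part of the OpenAI SDK; the function name check_json_schema_format and its messages are made up for illustration.

```python
import re

# Allowed characters and maximum length for "name", as listed above.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def check_json_schema_format(response_format: dict) -> list[str]:
    """Return a list of problems found in a response_format dict (empty if none)."""
    problems = []
    if response_format.get("type") != "json_schema":
        problems.append('"type" must be "json_schema"')
    inner = response_format.get("json_schema")
    if not isinstance(inner, dict):
        problems.append('"json_schema" must be an object')
        return problems
    name = inner.get("name")
    if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name):
        problems.append('"name" is required: a-z, A-Z, 0-9, _ or -, at most 64 characters')
    if not isinstance(inner.get("schema"), dict):
        problems.append('"schema" should hold the JSON Schema object')
    strict = inner.get("strict")
    if strict is not None and not isinstance(strict, bool):
        problems.append('"strict" must be a boolean or null')
    return problems
```

An empty list means the wrapper looks well-formed; it says nothing about whether the schema itself is accepted in strict mode.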
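Our JSON Schema goes into "schema". As the earlier example shows, just three fields (name, date, participants) already require a lot of boilerplate. For that reason OpenAI provides extra support in its SDKs; the Python SDK does this through the pydantic module: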
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

    class Config:
        # Emits "additionalProperties": false in the generated JSON Schema
        extra = "forbid"


completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    response_format=CalendarEvent,
)

# parsed is a CalendarEvent instance
event = completion.choices[0].message.parsed
print(event.name, event.date, event.participants)
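The code is clearly more concise, and the extracted fields (name, date, participants) are now read as attributes of the event object.
Note that we have so far called the model with client.chat.completions.create(...), whereas here we use client.beta.chat.completions.parse(), which at the time of writing is still in beta rather than a stable release.
If we want to keep calling the API through client.chat.completions.create(...) and avoid both the instability of a beta interface and the risk of it changing later, we can rely on pydantic instead. BaseModel defines the class method model_json_schema, so once CalendarEvent inherits from BaseModel, calling CalendarEvent.model_json_schema() returns the complete JSON Schema. The code becomes: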
from openai import (
    OpenAI,
    APIConnectionError,
    APIResponseValidationError,
    APIStatusError,
    LengthFinishReasonError,
    ContentFilterFinishReasonError,
)
from pydantic import BaseModel, ValidationError
from typing import List


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: List[str]

    class Config:
        extra = "forbid"


client = OpenAI()

try:
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Extract the event information."},
            {
                "role": "user",
                "content": "Alice and Bob are going to a science fair on Friday.",
            },
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "CalendarEvent",
                "description": "Extract the event name, date, and participants",
                "schema": CalendarEvent.model_json_schema(),
                "strict": True,
            },
        },
    )
    message = completion.choices[0].message
    if message.refusal:
        # With json_schema, OpenAI may refuse the request; the reason is in the refusal field
        print(message.refusal)
    else:
        print(message.content)
        # Validate the returned JSON against the pydantic model
        event: CalendarEvent = CalendarEvent.model_validate_json(message.content)
        print(event.name, event.date, event.participants)
except APIConnectionError as e:
    # Connection error or Request timed out.
    print(e.message)
except APIResponseValidationError as e:
    # Data returned by API invalid for expected schema.
    print(e.code, e.type, e.message)
    print(e.status_code)
except APIStatusError as e:
    # Errors with a 4xx or 5xx status code
    print(e.code, e.type, e.message)
    print(e.status_code)
    print(e.request_id)
except LengthFinishReasonError as e:
    # Could not parse response content as the length limit was reached.
    # As far as we can tell the SDK raises this from its parse() helper; it is kept here for completeness.
    print(e)
except ContentFilterFinishReasonError as e:
    # Could not parse response content as the request was rejected by the content filter.
    print(e)
except ValidationError as e:
    # The model output failed pydantic validation
    print(e.json())
With the BaseModel class methods model_json_schema and model_validate_json, the code stays tidy, the output is validated a second time to guard against unexpected results, and the fields are easy to read afterwards.
Besides validating the model output with pydantic (ValidationError), the example also catches the error classes defined by the SDK around client.chat.completions.create. They are:
- APIConnectionError: Connection error or Request timed out.
- APIResponseValidationError: the data returned by OpenAI could not be validated (Data returned by API invalid for expected schema).
- APIStatusError: errors with a 4xx or 5xx status code.
- LengthFinishReasonError: Could not parse response content as the length limit was reached.
- ContentFilterFinishReasonError: Could not parse response content as the request was rejected by the content filter.
Each error exposes different information; see the example code for details.
When using Structured Outputs, OpenAI may refuse a user's request for safety reasons (Refusals with Structured Outputs). In that case the API returns a refusal field whose value is a string describing the refusal; it can be read from completion.choices[0].message.refusal.
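The refusal check can be isolated in a small helper. This is a sketch under the assumption that the message exposes content and refusal attributes as described above; read_structured_message is a hypothetical name, and SimpleNamespace objects stand in for completion.choices[0].message so the example runs without calling the API.

```python
import json
from types import SimpleNamespace


def read_structured_message(message):
    """Return ("refusal", text) or ("data", parsed_dict) for one assistant message."""
    refusal = getattr(message, "refusal", None)
    if refusal:
        # On a refusal there is no JSON payload to parse
        return "refusal", refusal
    return "data", json.loads(message.content)


# Stand-ins for completion.choices[0].message (illustrative values only)
ok_message = SimpleNamespace(
    content='{"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]}',
    refusal=None,
)
refused_message = SimpleNamespace(content=None, refusal="I'm sorry, I can't help with that.")
```

Checking refusal before touching content avoids passing None to the JSON parser.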
C. Use Cases
1. Chain of thought (step-by-step math solution)
from pydantic import BaseModel
from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str

    class Config:
        extra = "forbid"


class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

    class Config:
        extra = "forbid"


client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful math tutor. Guide the user through the solution step by step.",
        },
        {"role": "user", "content": "how can I solve 8x + 7 = -23"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "MathReasoning",
            "description": "Math reasoning",
            "schema": MathReasoning.model_json_schema(),
            "strict": True,
        },
    },
)

math_reasoning: MathReasoning = MathReasoning.model_validate_json(
    completion.choices[0].message.content
)

# Print each reasoning step and its intermediate result
for step in math_reasoning.steps:
    print(step.explanation)
    print(step.output)
2. Unstructured data extraction
from pydantic import BaseModel
from openai import OpenAI


class ResearchPaperExtraction(BaseModel):
    title: str
    authors: list[str]
    abstract: str
    keywords: list[str]

    class Config:
        extra = "forbid"


client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are an expert at structured data extraction. You will be given unstructured text from a research paper and should convert it into the given structure.",
        },
        {"role": "user", "content": "..."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "ResearchPaperExtraction",
            "schema": ResearchPaperExtraction.model_json_schema(),
            "strict": True,
        },
    },
)

research_paper = ResearchPaperExtraction.model_validate_json(
    completion.choices[0].message.content
)
print(research_paper)
3. UI generation
from enum import Enum
from typing import List
from pydantic import BaseModel
from openai import OpenAI


class UIType(str, Enum):
    div = "div"
    button = "button"
    header = "header"
    section = "section"
    field = "field"
    form = "form"


class Attribute(BaseModel):
    name: str
    value: str

    class Config:
        extra = "forbid"


class UI(BaseModel):
    type: UIType
    label: str
    children: List["UI"]
    attributes: List[Attribute]

    class Config:
        extra = "forbid"


UI.model_rebuild()  # This is required to enable recursive types


class Response(BaseModel):
    ui: UI

    class Config:
        extra = "forbid"


client = OpenAI()

# Uses create() with an explicit JSON Schema, consistent with the other examples
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are a UI generator AI. Convert the user input into a UI.",
        },
        {"role": "user", "content": "Make a User Profile Form"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "Response",
            "description": "Generate a UI",
            "schema": Response.model_json_schema(),
            "strict": True,
        },
    },
)

ui = Response.model_validate_json(completion.choices[0].message.content)
print(ui)
4. Moderation
from enum import Enum
from typing import Optional
from pydantic import BaseModel
from openai import OpenAI


class Category(str, Enum):
    violence = "violence"
    sexual = "sexual"
    self_harm = "self_harm"


class ContentCompliance(BaseModel):
    is_violating: bool
    # Optional fields stay required in the schema but may be null
    category: Optional[Category]
    explanation_if_violating: Optional[str]

    class Config:
        extra = "forbid"


client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "Determine if the user input violates specific guidelines and explain if they do.",
        },
        {"role": "user", "content": "How do I prepare for a job interview?"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "ContentCompliance",
            "description": "Content Compliance",
            "schema": ContentCompliance.model_json_schema(),
            "strict": True,
        },
    },
)

compliance = ContentCompliance.model_validate_json(
    completion.choices[0].message.content
)
print(compliance)
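D. Things to Note
1. Make sure the input (messages) is relevant to the defined structure
If the input (messages) has too little to do with the defined structure, the model may produce made-up content. You can add wording to the prompt that tells the model to return empty parameters or a specific sentence when it detects that the input does not fit the task.
3. Keep the JSON structure consistent
To prevent the JSON Schema from drifting away from the corresponding types in your programming language, it is strongly recommended to use the native Pydantic (Python) or zod (JavaScript) support.
If you prefer to write the JSON Schema directly, you can add a CI rule that flags when either the JSON Schema or the underlying data objects are edited, or add a CI step that generates the JSON Schema from the type definitions automatically (or vice versa).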
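One way such a CI step could look is sketched below: the schema generated from the type definitions is compared with a copy committed to the repository. The function names and the file layout are assumptions for this example; in a real project the generated argument would typically come from something like CalendarEvent.model_json_schema().

```python
import json
from pathlib import Path


def write_schema(generated: dict, committed_file: Path) -> None:
    """Write the generated schema to the file that is committed to the repository."""
    text = json.dumps(generated, indent=2, sort_keys=True) + "\n"
    committed_file.write_text(text, encoding="utf-8")


def schema_has_drifted(generated: dict, committed_file: Path) -> bool:
    """Return True when the committed schema no longer matches the generated one."""
    committed = json.loads(committed_file.read_text(encoding="utf-8"))
    # Dict comparison ignores key order; compare serialized text if order matters to you
    return committed != generated
```

A CI job could call schema_has_drifted and fail the build when it returns True, prompting the author to regenerate the file with write_schema.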
4. JSON Schema types supported by Structured Outputs
- String
- Number
- Boolean
- Integer
- Object
- Array
- Enum
- anyOf
5. The root object cannot use anyOf
The root of the schema must be an object and cannot use anyOf.
6. All fields must be required
In the examples, the JSON Schema produced by CalendarEvent.model_json_schema() already lists every field under required:
{
    'additionalProperties': False,
    'properties': {
        'name': {'title': 'Name', 'type': 'string'},
        'date': {'title': 'Date', 'type': 'string'},
        'participants': {'items': {'type': 'string'}, 'title': 'Participants', 'type': 'array'}
    },
    'required': ['name', 'date', 'participants'],
    'title': 'CalendarEvent',
    'type': 'object'
}
If you write the JSON Schema without such a package, you need to take care of this yourself.
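For hand-written schemas, a small local check can catch fields missing from required, as well as objects that do not set additionalProperties to false. The walker below is only a sketch with hypothetical function names; it covers properties, items, anyOf and $defs, and does not follow $ref.

```python
def iter_object_schemas(schema, path="$"):
    """Yield (path, node) for every object-typed schema node, including those in $defs."""
    if not isinstance(schema, dict):
        return
    if schema.get("type") == "object":
        yield path, schema
    for name, sub in schema.get("properties", {}).items():
        yield from iter_object_schemas(sub, f"{path}.{name}")
    for name, sub in schema.get("$defs", {}).items():
        yield from iter_object_schemas(sub, f"{path}.$defs.{name}")
    if "items" in schema:
        yield from iter_object_schemas(schema["items"], f"{path}[]")
    for i, sub in enumerate(schema.get("anyOf", [])):
        yield from iter_object_schemas(sub, f"{path}.anyOf[{i}]")


def strict_mode_problems(schema):
    """List objects whose fields are not all required or that allow extra properties."""
    problems = []
    for path, node in iter_object_schemas(schema):
        missing = set(node.get("properties", {})) - set(node.get("required", []))
        if missing:
            problems.append(f"{path}: not required: {sorted(missing)}")
        if node.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
    return problems
```

An empty result means every object lists all of its properties under required and forbids extras; the API remains the final authority on whether a schema is accepted.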
To express that a field may be empty, use Union in Python:
from typing import List, Union
from pydantic import BaseModel


class CalendarEvent(BaseModel):
    name: Union[str, None]  # may be null, but is still listed under "required"
    date: str
    participants: List[str]

    class Config:
        extra = "forbid"
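For readers who write the schema by hand, the same nullable field can be expressed with anyOf and a null type while staying in required. The nullable helper below is a hypothetical convenience for this example, not something provided by OpenAI or pydantic.

```python
def nullable(schema: dict) -> dict:
    """Wrap a schema so that the value may also be null."""
    return {"anyOf": [schema, {"type": "null"}]}


calendar_event_schema = {
    "type": "object",
    "properties": {
        "name": nullable({"type": "string"}),
        "date": {"type": "string"},
        "participants": {"type": "array", "items": {"type": "string"}},
    },
    # Every field stays required; "name" is simply allowed to be null
    "required": ["name", "date", "participants"],
    "additionalProperties": False,
}
```

This keeps the field mandatory in the output while letting the model signal that no value was found.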
7. Structural limits
A schema may have at most 100 object properties, with at most 5 levels of nesting.
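A schema can be measured against these limits before it is sent. The sketch below counts properties and nested object levels in a plain dict; how the API itself counts them is not spelled out here, so the counting rules (a flat object is depth 1, $ref is not followed) are assumptions.

```python
def count_properties(schema) -> int:
    """Count object properties across the whole schema, including nested ones."""
    if not isinstance(schema, dict):
        return 0
    props = schema.get("properties", {})
    total = len(props)
    for sub in props.values():
        total += count_properties(sub)
    total += count_properties(schema.get("items"))
    for sub in schema.get("anyOf", []):
        total += count_properties(sub)
    for sub in schema.get("$defs", {}).values():
        total += count_properties(sub)
    return total


def nesting_depth(schema) -> int:
    """Depth of nested objects; a flat object counts as 1 (assumed convention)."""
    if not isinstance(schema, dict):
        return 0
    children = list(schema.get("properties", {}).values())
    if "items" in schema:
        children.append(schema["items"])
    children.extend(schema.get("anyOf", []))
    deepest = max((nesting_depth(child) for child in children), default=0)
    return deepest + (1 if schema.get("type") == "object" else 0)
```

Comparing the two results with 100 and 5 gives an early warning, not a guarantee that the API will accept the schema.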
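8. Schema size limits
- Within a schema, the total string length of all property names, definition names, enum values, and const values must not exceed 15,000 characters.
- A schema may contain at most 500 enum values across all enums; when a single enum has more than 250 values, the total string length of its values must not exceed 7,500 characters.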
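To get a rough feel for how close a schema is to these limits, the strings it contains can be tallied. This sketch assumes the limits are measured as plain string length, which may differ from how the API measures them; schema_string_budget is a hypothetical helper.

```python
def schema_string_budget(schema) -> dict:
    """Tally the length of names, enum values and const values, plus the enum value count."""
    totals = {"chars": 0, "enum_values": 0}

    def visit(node):
        if not isinstance(node, dict):
            return
        # Property names and definition names count toward the total
        for key in ("properties", "$defs"):
            for name, sub in node.get(key, {}).items():
                totals["chars"] += len(name)
                visit(sub)
        for value in node.get("enum", []):
            totals["enum_values"] += 1
            totals["chars"] += len(str(value))
        if "const" in node:
            totals["chars"] += len(str(node["const"]))
        visit(node.get("items"))
        for sub in node.get("anyOf", []):
            visit(sub)

    visit(schema)
    return totals
```

The returned numbers can then be compared with the limits listed above.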
9. additionalProperties
Must always be false.
With pydantic in Python, this is set through class Config (extra = "forbid").
10. Key order
With Structured Outputs, the output is produced in the same order as the keys in the schema.
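Because Python's json module and dicts preserve insertion order, the ordering can be observed directly on the returned string. The check below is a small sketch for top-level keys only; the function name is made up for this example.

```python
import json


def keys_follow_schema_order(schema: dict, raw_json: str) -> bool:
    """Return True when the top-level keys appear in the same order as in the schema."""
    expected = list(schema["properties"])
    actual = list(json.loads(raw_json))
    return actual == expected
```

One practical consequence, offered as a suggestion rather than a rule: fields that hold intermediate reasoning can be declared before the final answer, as steps precedes final_answer in the math example, so that they are generated first.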
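Some type-specific keywords are not yet supported:
- For strings: minLength, maxLength, pattern, format
- For numbers: minimum, maximum, multipleOf
- For objects: patternProperties, unevaluatedProperties, propertyNames, minProperties, maxProperties
- For arrays: unevaluatedItems, contains, minContains, maxContains, minItems, maxItems, uniqueItems
If "strict": True is set and the schema uses unsupported JSON Schema features, the API returns an error for the request. In the SDK this surfaces as an exception rather than as a refusal message.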
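A schema can be scanned for these keywords before it is sent with strict mode enabled. The sketch below uses one flat set copied from the list above, which may change as the API evolves; find_unsupported is a hypothetical helper, and it only inspects keyword positions, so a property that happens to be named format is not flagged.

```python
UNSUPPORTED_KEYWORDS = {
    "minLength", "maxLength", "pattern", "format",
    "minimum", "maximum", "multipleOf",
    "patternProperties", "unevaluatedProperties", "propertyNames",
    "minProperties", "maxProperties",
    "unevaluatedItems", "contains", "minContains", "maxContains",
    "minItems", "maxItems", "uniqueItems",
}


def find_unsupported(schema, path="$"):
    """Return "path: keyword" entries for every unsupported keyword in the schema."""
    if not isinstance(schema, dict):
        return []
    found = [f"{path}: {key}" for key in schema if key in UNSUPPORTED_KEYWORDS]
    # Property and definition names are not keywords, so only their sub-schemas are visited
    for key in ("properties", "$defs"):
        for name, sub in schema.get(key, {}).items():
            found += find_unsupported(sub, f"{path}.{name}")
    found += find_unsupported(schema.get("items"), f"{path}[]")
    for i, sub in enumerate(schema.get("anyOf", [])):
        found += find_unsupported(sub, f"{path}.anyOf[{i}]")
    return found
```

This is useful with pydantic as well, since constraints such as a maximum length on a field end up as keywords in the generated schema.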
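E. JSON Schema in Tool Calling
The Chat Completion API has a tools parameter for passing in tools, so that OpenAI can decide whether calling a tool would give a more accurate answer. The parameters each tool needs are also defined with JSON Schema. While direct Pydantic support in the Python SDK is still in beta, you can use the approach from this article: define the structure with Pydantic, then output the JSON Schema through the class method model_json_schema.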
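For Tool Calling usage, see 〖Build Your Own ChatGPT: Deep Resource Integration with OpenAI Tool Calling〗. Once you are familiar with Tool Calling, you will notice that it can also do what response_format={"type": "json_schema", ...} does. So how do you choose? The official documentation suggests:
- If you are connecting the model to tools, functions, data, etc. in your system, then you should use function calling.
- If you want to structure the model’s output when it responds to the user, then you should use a structured response_format.
In short, if the extracted data will be used to call other tools (including but not limited to external data or internal databases), use Tool Calling (originally called function calling); if you only want to pull out specific information, use response_format. Combining the two makes a program smarter and more flexible across a wide range of situations.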
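F. Summary
This article covered how to produce Structured Outputs with OpenAI's API, focusing on turning unstructured conversational data into specific structured information.
It also walked through examples for several use cases, such as step-by-step math solutions, unstructured data extraction, and UI generation, together with working source code and things to watch out for.
Although OpenAI's Python SDK can take pydantic models directly in place of a hand-written JSON Schema, that direct usage is still in beta; the stable path is to pass the JSON Schema yourself. If writing it by hand feels too tedious, you can follow the approach in this article: keep using pydantic to handle the JSON Schema, and use its class methods to generate the schema and validate what the model returns.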