Build Your Own ChatGPT: Structured Outputs with LLMs

2024-11-11

AI GPT chain of thought, json schema, LLM, moderation, Structured Outputs

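A. Introduction

As mentioned in "Build Your Own ChatGPT: OpenAI Chat Completion API Parameters Explained", the language ability of an LLM makes it easy to extract specific key information (structured data) from a conversation (unstructured data). This is very useful when we need to keep track of particular details over the course of a dialogue.

In OpenAI, the Structured Outputs feature ensures that the model's response conforms to a JSON Schema defined in advance. When calling the REST API we pass the JSON Schema directly to OpenAI, and the SDK follows the same approach.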

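Models that support Structured Outputs

gpt-4o-mini-2024-07-18 and later

gpt-4o-2024-08-06 and later

Older models can fall back to JSON mode ({"type": "json_object"}).

OpenAI recommends using Structured Outputs whenever possible. In practice, Structured Outputs also handles complex structures better than JSON mode.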
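The choice between the two modes can be made in one place. The sketch below is only an illustration, not SDK code: `STRUCTURED_OUTPUT_MODELS` and `build_response_format` are hypothetical names, and the set contains only the two snapshots named above, so later snapshots would have to be added by hand.

```python
# Hypothetical helper: pick a response_format based on the model name.
# The set below is an assumption that lists only the snapshots named above.
STRUCTURED_OUTPUT_MODELS = {"gpt-4o-mini-2024-07-18", "gpt-4o-2024-08-06"}


def build_response_format(model: str, name: str, schema: dict) -> dict:
    """Use json_schema when the model is known to support Structured Outputs,
    otherwise fall back to JSON mode."""
    if model in STRUCTURED_OUTPUT_MODELS:
        return {
            "type": "json_schema",
            "json_schema": {"name": name, "schema": schema, "strict": True},
        }
    # JSON mode only asks for syntactically valid JSON, not a specific schema.
    return {"type": "json_object"}


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
    "additionalProperties": False,
}
print(build_response_format("gpt-4o-2024-08-06", "Event", schema)["type"])  # json_schema
print(build_response_format("gpt-3.5-turbo", "Event", schema)["type"])      # json_object
```

When the fallback is used, the prompt itself still has to ask for JSON and describe the expected fields, because JSON mode does not know about the schema.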

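B. How to Use Structured Outputs

Structured Outputs follows the rules of JSON Schema. When we want the data returned by the Chat Completion API to match a particular format, we specify the output form through the response_format parameter: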

from openai import OpenAI
import json

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "CalendarEvent",
            "description": "Extract the event name, date, and participants",
            "schema": {
                "properties": {
                    "name": {"title": "Name", "type": "string"},
                    "date": {"title": "Date", "type": "string"},
                    "participants": {
                        "items": {"type": "string"},
                        "title": "Participants",
                        "type": "array",
                    },
                },
                "required": ["name", "date", "participants"],
                "title": "CalendarEvent",
                "type": "object",
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
)

event = json.loads(completion.choices[0].message.content)
print(event["name"], event["date"], event["participants"])

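In this example response_format is set to json_schema. Its basic structure and constraints are:

{
    "type": "json_schema",
    "json_schema": {
        "name": Required; characters a-zA-Z0-9_-; maximum length 64
        "description": Helps the model understand how to use this structure
        "schema": *****The JSON Schema goes here*****
        "strict": boolean or null; whether to follow the definition strictly. When True, only a subset of JSON Schema is supported. Defaults to False
    }
}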
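The name constraint is easy to check before sending a request. This is a minimal sketch that encodes only the rule listed above (letters, digits, "_" and "-", at most 64 characters); `is_valid_schema_name` is a hypothetical helper and not part of the SDK.

```python
import re

# Pattern built from the constraint above: a-z, A-Z, 0-9, "_" and "-",
# with a length of 1 to 64 characters.
_NAME_PATTERN = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def is_valid_schema_name(name: str) -> bool:
    """Return True when the name satisfies the json_schema "name" rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_schema_name("CalendarEvent"))   # True
print(is_valid_schema_name("calendar event"))  # False (contains a space)
print(is_valid_schema_name("x" * 65))          # False (too long)
```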

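Our JSON Schema goes into "schema". As the earlier example shows, just three fields (name, date, participants) already require a lot of text. For that reason OpenAI provides extra support: in the Python SDK it is implemented through the pydantic module, as follows: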

from pydantic import BaseModel

from openai import OpenAI

client = OpenAI()


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

    class Config:
        extra = "forbid"


completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed
print(event.name, event.date, event.participants)

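The code is clearly more concise, and the final output of the fields (name, date, participants) becomes attribute access on the event object.

Note that we used to call the model with client.chat.completions.create(...), but here the call is client.beta.chat.completions.parse(), which is still in beta rather than a stable release.

Suppose we want to keep calling the API with client.chat.completions.create(...) and avoid both the instability of a beta interface and the risk that it changes later. pydantic's BaseModel defines the class method model_json_schema, so once CalendarEvent inherits from BaseModel, CalendarEvent.model_json_schema() gives us the complete JSON Schema. The code changes as follows: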

from openai import (
    OpenAI,
    APIConnectionError,
    APIResponseValidationError,
    APIStatusError,
    LengthFinishReasonError,
    ContentFilterFinishReasonError,
)

from pydantic import BaseModel, ValidationError
from typing import List


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: List[str]

    class Config:
        extra = "forbid"


client = OpenAI()

try:
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Extract the event information."},
            {
                "role": "user",
                "content": "Alice and Bob are going to a science fair on Friday.",
            },
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "CalendarEvent",
                "description": "Extract the event name, date, and participants",
                "schema": CalendarEvent.model_json_schema(),
                "strict": True,
            },
        },
    )
    print(completion.choices[0].message.content)

    if completion.choices[0].message.refusal:
        # With json_schema, OpenAI may refuse the request; the refusal field holds the reason
        print(completion.choices[0].message.refusal)
    else:
        event: CalendarEvent = CalendarEvent.model_validate_json(
            completion.choices[0].message.content
        )
        print(event.name, event.date, event.participants)
except APIConnectionError as e:
    # Connection error or Request timed out.
    print(e.message)
except APIResponseValidationError as e:
    # The data returned by OpenAI failed validation: Data returned by API invalid for expected schema.
    print(e.code, e.type, e.message)
    print(e.status_code)
except APIStatusError as e:
    # 4xx and 5xx errors
    print(e.code, e.type, e.message)
    print(e.status_code)
    print(e.request_id)
except LengthFinishReasonError as e:
    # Could not parse response content as the length limit was reached.
    print(e)
except ContentFilterFinishReasonError as e:
    # Could not parse response content as the request was rejected by the content filter.
    print(e)
except ValidationError as e:
    # pydantic rejected the model output during validation
    print(e.json())

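Through BaseModel's class methods model_json_schema and model_validate_json, we keep the code tidy, validate the output a second time to guard against unexpected results, and make later field access convenient.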
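To see what this second validation catches, the sketch below runs model_validate_json on two hand-written JSON strings. They are made-up samples, not real API responses, and the model uses the Pydantic v2 `model_config` spelling, which should behave like `class Config` with `extra = "forbid"`.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class CalendarEvent(BaseModel):
    # Pydantic v2 spelling of `class Config: extra = "forbid"`.
    model_config = ConfigDict(extra="forbid")

    name: str
    date: str
    participants: list[str]


# Hand-written samples standing in for message.content (assumptions).
good = '{"name": "Science Fair", "date": "Friday", "participants": ["Alice", "Bob"]}'
bad = '{"name": "Science Fair", "participants": "Alice"}'

event = CalendarEvent.model_validate_json(good)
print(event.participants)

try:
    CalendarEvent.model_validate_json(bad)
except ValidationError as e:
    # Map each failing field to the kind of error pydantic reported.
    problems = {err["loc"][0]: err["type"] for err in e.errors()}
    print(problems)
```

The second payload fails on two counts: date is missing and participants is a string rather than a list, so both fields show up in the error report.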

Besides validating the model output with pydantic (ValidationError) to avoid unexpected results, the call to client.chat.completions.create also catches errors through the error classes defined by the SDK. The error types are:

APIConnectionError: Connection error or Request timed out.
APIResponseValidationError: The data returned by OpenAI could not be validated (Data returned by API invalid for expected schema).
APIStatusError: 4xx and 5xx errors.
LengthFinishReasonError: Could not parse response content as the length limit was reached.
ContentFilterFinishReasonError: Could not parse response content as the request was rejected by the content filter.

Each error exposes different information; see the sample code.

When using Structured Outputs, OpenAI may refuse the user's request for safety reasons (Refusals with Structured Outputs). In that case the API returns a refusal field whose value is a string describing the refusal, and the message is available through completion.choices[0].message.refusal.
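The refusal check can be isolated in a small function so that callers never parse a refused message by accident. This is a sketch under assumptions: it works on a plain dict shaped like choices[0].message in the JSON response, and `ModelRefusal` and `extract_payload` are hypothetical names. The two sample messages are hand-written.

```python
import json


class ModelRefusal(Exception):
    """Hypothetical exception raised when the model declines to answer."""


def extract_payload(message: dict) -> dict:
    """Return the parsed JSON content, or raise if the message is a refusal."""
    refusal = message.get("refusal")
    if refusal:
        raise ModelRefusal(refusal)
    return json.loads(message["content"])


# Hand-written sample messages (assumptions, not captured API output).
ok = {"role": "assistant", "content": '{"name": "Science Fair"}', "refusal": None}
refused = {"role": "assistant", "content": None, "refusal": "I'm sorry, I can't help with that."}

print(extract_payload(ok))
try:
    extract_payload(refused)
except ModelRefusal as e:
    print("refused:", e)
```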

C. Use Cases

1. Chain of thought (step-by-step math solutions)

from pydantic import BaseModel
from openai import OpenAI

class Step(BaseModel):
    explanation: str
    output: str

    class Config:
        extra = "forbid"


class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

    class Config:
        extra = "forbid"

client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful math tutor. Guide the user through the solution step by step.",
        },
        {"role": "user", "content": "how can I solve 8x + 7 = -23"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "MathReasoning",
            "description": "Math reasoning",
            "schema": MathReasoning.model_json_schema(),
            "strict": True,
        },
    },
)

math_reasoning: MathReasoning = MathReasoning.model_validate_json(
    completion.choices[0].message.content
)
for step in math_reasoning.steps:
    print(step.explanation)
    print(step.output)

2. Extracting data from unstructured text

from pydantic import BaseModel
from openai import OpenAI


class ResearchPaperExtraction(BaseModel):
    title: str
    authors: list[str]
    abstract: str
    keywords: list[str]

    class Config:
        extra = "forbid"


client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are an expert at structured data extraction. You will be given unstructured text from a research paper and should convert it into the given structure.",
        },
        {"role": "user", "content": "..."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "ResearchPaperExtraction",
            "schema": ResearchPaperExtraction.model_json_schema(),
            "strict": True,
        },
    },
)

research_paper = ResearchPaperExtraction.model_validate_json(
    completion.choices[0].message.content
)
print(research_paper)

3. UI generation

from enum import Enum
from typing import List
from pydantic import BaseModel
from openai import OpenAI


class UIType(str, Enum):
    div = "div"
    button = "button"
    header = "header"
    section = "section"
    field = "field"
    form = "form"


class Attribute(BaseModel):
    name: str
    value: str

    class Config:
        extra = "forbid"


class UI(BaseModel):
    type: UIType
    label: str
    children: List["UI"]
    attributes: List[Attribute]

    class Config:
        extra = "forbid"


UI.model_rebuild()  # This is required to enable recursive types


class Response(BaseModel):
    ui: UI

    class Config:
        extra = "forbid"


client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "You are a UI generator AI. Convert the user input into a UI.",
        },
        {"role": "user", "content": "Make a User Profile Form"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "Response",
            "description": "Generate UI",
            "schema": Response.model_json_schema(),
            "strict": True,
        },
    },
)

ui = Response.model_validate_json(completion.choices[0].message.content)
print(ui)

4. Moderation

from enum import Enum
from typing import Optional
from pydantic import BaseModel
from openai import OpenAI


class Category(str, Enum):
    violence = "violence"
    sexual = "sexual"
    self_harm = "self_harm"


class ContentCompliance(BaseModel):
    is_violating: bool
    category: Optional[Category]
    explanation_if_violating: Optional[str]

    class Config:
        extra = "forbid"


client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": "Determine if the user input violates specific guidelines and explain if they do.",
        },
        {"role": "user", "content": "How do I prepare for a job interview?"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "ContentCompliance",
            "description": "Content Compliance",
            "schema": ContentCompliance.model_json_schema(),
            "strict": True,
        },
    },
)

compliance = ContentCompliance.model_validate_json(
    completion.choices[0].message.content
)
print(compliance)

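D. Notes

1. Make sure the input (`messages`) is relevant to the defined structure

If the input (messages) has too little to do with the defined structure, the model may produce fabricated content. You can add a sentence to the prompt that tells the model to return empty parameters or a specific sentence when it detects that the input does not match the task.

2. Error handling and prompt engineering

Structured output can still contain mistakes. If you find errors, try the following adjustments:

Adjust the instructions
Provide examples in the system instructions
Split the task into simpler subtasks

For details, see OpenAI's prompt engineering guide.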

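3. Keep the JSON structure consistent

To keep the JSON Schema from drifting away from the corresponding types in your programming language, using the native Pydantic (Python) or zod (JavaScript) packages is strongly recommended.

If you prefer to specify the JSON Schema directly, you can add a CI rule that flags edits to either the JSON Schema or the underlying data objects, or add a CI step that generates the JSON Schema from the type definitions (or vice versa).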
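One way such a CI check could look is a comparison between the schema generated from the type definition and a stored copy. This is a rough sketch, not an established recipe: `schema_has_drifted` is a hypothetical helper, and the stored schema is kept in a string here where a real pipeline would read a checked-in file.

```python
import json

from pydantic import BaseModel


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


def schema_has_drifted(model: type[BaseModel], stored_schema: str) -> bool:
    """Compare the schema generated from the model with a stored JSON copy."""
    return model.model_json_schema() != json.loads(stored_schema)


# Stand-in for the contents of a checked-in schema file (assumption).
stored = json.dumps(CalendarEvent.model_json_schema())
print(schema_has_drifted(CalendarEvent, stored))  # False

# Simulate someone editing the stored schema by hand.
edited = json.loads(stored)
edited["required"].remove("date")
print(schema_has_drifted(CalendarEvent, json.dumps(edited)))  # True
```

A CI job could run this comparison and fail the build when it returns True, which forces the schema file and the model class to be updated together.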

4. JSON Schema types supported by Structured Outputs

String
Number
Boolean
Integer
Object
Array
Enum
anyOf

5. The root object cannot use `anyOf`

The root of the schema must be an object and cannot use anyOf.
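When the natural shape of the answer is a union, a common workaround is to move the union one level down under a single property. The sketch below illustrates that idea on plain dicts; `root_is_valid`, `wrap_in_object`, and the default key "result" are hypothetical.

```python
def root_is_valid(schema: dict) -> bool:
    """True when the top level is a plain object schema without anyOf."""
    return schema.get("type") == "object" and "anyOf" not in schema


def wrap_in_object(schema: dict, key: str = "result") -> dict:
    """Place a non-object root (for example an anyOf) under one required property."""
    return {
        "type": "object",
        "properties": {key: schema},
        "required": [key],
        "additionalProperties": False,
    }


union_root = {"anyOf": [{"type": "string"}, {"type": "number"}]}
print(root_is_valid(union_root))                  # False
print(root_is_valid(wrap_in_object(union_root)))  # True
```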

6. All fields must be required

In the example, the JSON Schema produced by CalendarEvent.model_json_schema() already lists every field in required:

{
    'additionalProperties': False, 
    'properties': {
        'name': {'title': 'Name', 'type': 'string'}, 
        'date': {'title': 'Date', 'type': 'string'}, 
        'participants': {'items': {'type': 'string'}, 'title': 'Participants', 'type': 'array'}
    }, 
    'required': ['name', 'date', 'participants'], 
    'title': 'CalendarEvent', 
    'type': 'object'
}

If you handle the JSON Schema without a package, pay extra attention to this.

To express that a field may be empty, use Union in Python:

from typing import List, Union

from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: Union[str, None]
    date: str
    participants: List[str]

    class Config:
        extra = "forbid"
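It can help to look at the schema this produces. The sketch below assumes Pydantic v2 and uses the `model_config` spelling of the same setting; the nullable field is expected to stay in required while its type becomes an anyOf that also accepts null.

```python
from typing import List, Union

from pydantic import BaseModel, ConfigDict


class CalendarEvent(BaseModel):
    # Pydantic v2 spelling of `class Config: extra = "forbid"`.
    model_config = ConfigDict(extra="forbid")

    name: Union[str, None]  # may be null, but the key must still be present
    date: str
    participants: List[str]


schema = CalendarEvent.model_json_schema()
print(schema["required"])            # ['name', 'date', 'participants']
print(schema["properties"]["name"])  # inspect the anyOf that allows null
```

The field has no default value, which is why it remains required: the model must always emit the key and may set it to null when the information is absent.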

7. Structural limits

A schema may have at most 100 object properties and at most 5 levels of nesting.
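Both limits can be estimated locally before a request is sent. The following is a rough sketch: how the API counts nesting is not spelled out here, so the counting rule (a flat object counts as depth 1, and only properties, items, and anyOf are followed, without resolving $ref) is an assumption, and all function names are hypothetical.

```python
def _children(schema: dict) -> list:
    """Sub-schemas reachable through properties, items and anyOf."""
    found = list(schema.get("properties", {}).values())
    if isinstance(schema.get("items"), dict):
        found.append(schema["items"])
    found.extend(schema.get("anyOf", []))
    return found


def count_properties(schema: dict) -> int:
    """Total number of object properties at every level."""
    own = len(schema.get("properties", {}))
    return own + sum(count_properties(c) for c in _children(schema))


def nesting_depth(schema: dict) -> int:
    """Object nesting depth; a flat object counts as 1 (assumed rule)."""
    deepest = max((nesting_depth(c) for c in _children(schema)), default=0)
    return deepest + (1 if schema.get("type") == "object" else 0)


def within_limits(schema: dict, max_properties: int = 100, max_depth: int = 5) -> bool:
    return count_properties(schema) <= max_properties and nesting_depth(schema) <= max_depth


event = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "venue": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "room": {"type": "string"}},
        },
    },
}
print(count_properties(event), nesting_depth(event), within_limits(event))  # 4 2 True

# Build an object nested six levels deep to exceed the depth limit.
deep = {"type": "string"}
for _ in range(6):
    deep = {"type": "object", "properties": {"child": deep}}
print(nesting_depth(deep), within_limits(deep))  # 6 False
```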

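8. Schema size limits

Within a schema, the total string length of all property names, definition names, enum values, and const values must not exceed 15,000 characters.
A schema may contain at most 500 enum values across all enum properties; for a single enum with more than 250 values, the total length of those values must not exceed 7,500 characters.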

9. `additionalProperties` must always be `false`

With pydantic in Python, set this through class Config.
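For a hand-written JSON Schema there is no class Config to lean on, so the flag has to be set on every object level. The sketch below shows one way to do that on a plain dict; `forbid_additional_properties` is a hypothetical helper that returns a modified copy and leaves the input untouched.

```python
import copy


def forbid_additional_properties(schema: dict) -> dict:
    """Return a copy in which every object schema sets additionalProperties to False."""
    result = copy.deepcopy(schema)

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object":
                node["additionalProperties"] = False
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(result)
    return result


raw = {
    "type": "object",
    "properties": {
        "venue": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}
fixed = forbid_additional_properties(raw)
print(fixed["additionalProperties"])                         # False
print(fixed["properties"]["venue"]["additionalProperties"])  # False
print("additionalProperties" in raw)                         # False (original unchanged)
```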

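10. Key order

With Structured Outputs, the output is produced in the same order as the keys in the schema.

Some type-specific keywords are not yet supported:

For strings: minLength, maxLength, pattern, format
For numbers: minimum, maximum, multipleOf
For objects: patternProperties, unevaluatedProperties, propertyNames, minProperties, maxProperties
For arrays: unevaluatedItems, contains, minContains, maxContains, minItems, maxItems, uniqueItems

If "strict": True is set and the JSON Schema uses an unsupported feature, the API returns an error. This is a request error, separate from the refusal field described earlier.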
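A schema can be scanned for these keywords before it is sent. The sketch below hard-codes the keyword list from this section, which may change over time, and `find_unsupported` is a hypothetical helper that follows only properties, items, and anyOf.

```python
# Keyword list copied from the section above; treat it as a snapshot.
UNSUPPORTED_KEYWORDS = {
    "minLength", "maxLength", "pattern", "format",
    "minimum", "maximum", "multipleOf",
    "patternProperties", "unevaluatedProperties", "propertyNames",
    "minProperties", "maxProperties",
    "unevaluatedItems", "contains", "minContains", "maxContains",
    "minItems", "maxItems", "uniqueItems",
}


def find_unsupported(schema: dict, path: str = "$") -> list:
    """Return (path, keyword) pairs for every unsupported keyword found."""
    found = [(path, key) for key in schema if key in UNSUPPORTED_KEYWORDS]
    # Property names are data, not keywords, so only their schemas are visited.
    for name, sub in schema.get("properties", {}).items():
        found += find_unsupported(sub, f"{path}.{name}")
    if isinstance(schema.get("items"), dict):
        found += find_unsupported(schema["items"], f"{path}[]")
    for i, sub in enumerate(schema.get("anyOf", [])):
        found += find_unsupported(sub, f"{path}.anyOf[{i}]")
    return found


schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "format": "email"},
        "tags": {
            "type": "array",
            "items": {"type": "string", "maxLength": 10},
            "minItems": 1,
        },
    },
}
for location, keyword in find_unsupported(schema):
    print(location, keyword)
```

For this sample the scan reports format on $.email, minItems on $.tags, and maxLength on $.tags[].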

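E. JSON Schema in Tool Calling

The Chat Completion API has a tools parameter for passing in tools, so that OpenAI can decide whether a tool call is needed to answer the question more precisely. The parameters a tool requires are also defined with JSON Schema. While direct support for Pydantic in the Python SDK is still in beta, you can follow the approach in this article: define the structure with Pydantic, then output the JSON Schema through the class method model_json_schema.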
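As a sketch of that approach, the code below builds one entry for the tools list from a Pydantic model. `GetWeather`, `to_tool`, and the tool name are hypothetical and exist only for illustration; the resulting dict follows the function-tool shape of the Chat Completion API.

```python
from pydantic import BaseModel, ConfigDict, Field


class GetWeather(BaseModel):
    """Hypothetical tool arguments, used only for illustration."""

    model_config = ConfigDict(extra="forbid")

    city: str = Field(description="City name, e.g. Taipei")
    unit: str = Field(description="celsius or fahrenheit")


def to_tool(model: type[BaseModel], name: str, description: str) -> dict:
    """Build one entry for the `tools` parameter from a Pydantic model."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": model.model_json_schema(),
            "strict": True,
        },
    }


tool = to_tool(GetWeather, "get_weather", "Look up the current weather for a city.")
print(tool["function"]["parameters"]["required"])  # ['city', 'unit']
```

The resulting dict would be passed as tools=[tool] to client.chat.completions.create, and the arguments returned by the model could then be checked with GetWeather.model_validate_json.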

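For how to use Tool Calling, see "Build Your Own ChatGPT: Deep Resource Integration with OpenAI Tool Calling". Once you understand Tool Calling, you will notice that it can also do what response_format={"type": "json_schema", ...} does. So how should you choose? The official documentation recommends:

If you are connecting the model to tools, functions, data, etc. in your system, then you should use function calling.
If you want to structure the model’s output when it responds to the user, then you should use a structured response_format.

In short: if the extracted data is then used to call other tools (including but not limited to external data or internal databases), use Tool Calling (formerly called function calling); if you just want to pull out specific information, use response_format. Combining the two makes a program smarter and more flexible, and better able to handle a variety of situations.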

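F. Summary

This article discussed how to produce Structured Outputs with OpenAI's ChatGPT, with an emphasis on turning unstructured data in a conversation into specific structured information.

It also presented examples for different use cases, such as step-by-step math solutions, unstructured data extraction, and UI generation, along with working source code and notes.

Although OpenAI's Python SDK supports using pydantic directly to write the JSON Schema, that direct usage is still in beta; the stable interface is to write the JSON Schema by hand. If that feels too cumbersome, you can follow the approach in this article: still use pydantic to handle the JSON Schema, and rely on pydantic's class methods to output the schema or validate what the model returns.

G. Related Topics

Build Your Own ChatGPT: Deep Resource Integration with OpenAI Tool Calling

Build Your Own ChatGPT: OpenAI Chat Completion API Parameters Explained (updated 2024-11-11)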
