How to Call OpenAI's Multimodal Model (GPT-4o) Through the OpenAI API

蛐蛐蛐 · Published 2024-05-19 20:57:32

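Anyone with API credit can now call OpenAI's multimodal models, such as GPT-4o and GPT-4 Turbo. I wrote up some notes on using the OpenAI API more than a year ago, and the API has changed slightly since then. This post mainly follows the official vision guide: https://platform.openai.com/docs/guides/vision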

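It is actually fairly simple: a local image has to be Base64-encoded first and is then sent along with the request. Here is an example that should make everything clear at a glance (the images are stored in the Processed folder):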

from openai import OpenAI
import os
import base64

# Replace the placeholder with your own API key
client = OpenAI(
    api_key="Your_API_Key"
)

# Read an image file and return its contents as a Base64 string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Folder that holds the images to send
fig_path = 'Processed'

for filename in os.listdir(fig_path):
    # Only PNG files are handled, to match the image/png data URL below
    if filename.endswith('.png'):
        image_path = os.path.join(fig_path, filename)
        print(image_path)
        base64_image = encode_image(image_path)
        # One user message with a text part and an image part
        messages = [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What's in this image?"},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{base64_image}"
                        }
                    }
                ]
            }
        ]
        completion = client.chat.completions.create(
            model="gpt-4o",
            messages=messages
        )
        # The reply text is in the first choice
        answer = completion.choices[0].message.content
        print(f'ChatGPT: {answer}')
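
The example above only handles .png files, because the data URL hard-codes image/png. If the folder also holds other image types, the MIME type in the data URL should probably match each file. Below is a minimal sketch of one way to do that with the standard-library mimetypes module. The helper name image_to_data_url is hypothetical and is not part of the OpenAI SDK; which image formats the API accepts is described in the vision guide and is not checked here.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(image_path):
    """Return a data URL (data:<mime>;base64,<payload>) for a local image.

    Hypothetical helper: the MIME type is guessed from the file extension,
    so the file contents themselves are not inspected.
    """
    mime_type, _ = mimetypes.guess_type(str(image_path))
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognised image file: {image_path}")
    # Same Base64 step as encode_image in the example above
    payload = base64.b64encode(Path(image_path).read_bytes()).decode("utf-8")
    return f"data:{mime_type};base64,{payload}"
```

The returned string can be used as the "url" value of the image_url part in place of the hard-coded f-string.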

Of course, keep an eye on the cost when you use it; at the moment it still feels a bit expensive.
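
To keep track of spending, one option is to request a lower image detail level and to log the token counts of each response. The sketch below assumes the "detail" field of the image_url part ("low", "high" or "auto") as described in the vision guide, and the usage attribute of the returned completion object. The function names build_vision_messages and summarize_usage are hypothetical, and how much "low" detail actually saves depends on the model and the current pricing.

```python
def build_vision_messages(prompt, data_url, detail="low"):
    """Build the messages list for one text prompt plus one image.

    Hypothetical helper: `detail` is passed through to the image_url part.
    """
    if detail not in ("low", "high", "auto"):
        raise ValueError(f"Unsupported detail level: {detail}")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": data_url, "detail": detail},
                },
            ],
        }
    ]


def summarize_usage(completion):
    """Return the token counts reported on a chat completion response."""
    usage = completion.usage
    return {
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }
```

With a client created as in the example, the messages can be passed to client.chat.completions.create(model="gpt-4o", messages=...), and summarize_usage(completion) can be printed after each call to see how many tokens each image consumed.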