Guide to Calling Multimodal Models on a Local Ollama Deployment

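This article covers two ways to call the Ollama multimodal model API: the basic generate API and the chat API. Both can handle requests that include images, with the image data sent as Base64. Method 1 uses the /api/generate endpoint for direct inference; Method 2 uses /api/chat for multi-turn conversation. The article provides detailed Python examples with error handling, parameter tuning, and explanatory notes, and recommends best practices such as image batching and timeout settings. The two methods suit, respectively, simple inference and cases that need session…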

SuperCreators · Published 2025-08-01 11:37:20

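This article presents two ways to call an Ollama multimodal model, with the code cleaned up and annotated. Ollama exposes an HTTP API for interacting with vision-language models such as qwen2.5vl, and requests can include images.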

Method 1: Basic generate API call

import json
import requests
import base64

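def invoke_llm(model='qwen2.5vl:latest', host='127.0.0.1', image_path=None):
    """
    Call the Ollama generate API for multimodal inference.
    Args:
        model: model name
        host: address of the Ollama server
        image_path: path to the image file
    Returns:
        The parsed JSON response, or None if the request fails
    """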
    url = f'http://{host}:11434/api/generate'
    
    # Encode the image as Base64
    encoded_image = None
    if image_path:
        with open(image_path, "rb") as image_file:
            encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
    
    data = {
        "model": model,
        "prompt": 'Describe the contents of this image',  # a clearer prompt
        "images": [encoded_image] if encoded_image else [],
        "stream": False
    }
    
    try:
        response = requests.post(url, json=data)
        response.raise_for_status()  # raise on HTTP error status codes
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None

# Example call
res = invoke_llm(image_path="example.jpg")
print(res)

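Notes:

The core logic is wrapped in a function, which makes it reusable
Request errors are caught and reported instead of crashing the script
The image path is an optional parameter, which adds flexibility
The URL is built with an f-string
A docstring documents the function's purpose, parameters, and return value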
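
With "stream": False, the generate endpoint returns a single JSON object, and the generated text sits in its response field. The sketch below shows one way to pull that text out of the value returned by invoke_llm. The name extract_generate_text is a hypothetical helper, not part of Ollama or of the code above, and the sample dictionary is hand-written for illustration rather than real model output.

```python
def extract_generate_text(result):
    """Return the generated text from a non-streaming /api/generate result.

    Returns None when the request failed (result is None) or when the
    expected "response" field is absent.
    """
    if not isinstance(result, dict):
        return None
    return result.get("response")


# Hand-written stand-in for a real response (assumption, not actual model output).
sample = {"model": "qwen2.5vl:latest", "response": "A cat on a sofa.", "done": True}

print(extract_generate_text(sample))  # A cat on a sofa.
print(extract_generate_text(None))    # None
```

In practice the call would look like extract_generate_text(invoke_llm(image_path="example.jpg")), which keeps the None check in one place.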

Method 2: Chat API call

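def chat_with_image(model='qwen2.5vl:latest', host='127.0.0.1', image_path=None, prompt="Describe this image:"):
    """
    Interact with a multimodal model through the chat API.
    Args:
        model: model name
        host: address of the Ollama server
        image_path: path to the image file
        prompt: the user's prompt
    Returns:
        The parsed JSON response, or None if the request fails
    """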
    url = f'http://{host}:11434/api/chat'
    headers = {"Content-Type": "application/json"}
    
    # Encode the image as Base64
    encoded_image = None
    if image_path:
        with open(image_path, "rb") as image_file:
            encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
    
    data = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": prompt,
                "images": [encoded_image] if encoded_image else []
            }
        ],
        "stream": False
    }
    
    try:
        response = requests.post(url, json=data, headers=headers)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None

# Example call
result = chat_with_image(image_path="example.jpg", prompt="How many people are in the image?")
print(result)

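Improvements:

The image-encoding logic can be factored out into a shared helper (see the utility function at the end of this article) so it is not duplicated across both methods
The prompt is passed in as a parameter, which adds flexibility
The Content-Type header is set explicitly; requests already sets it when json= is used, so this is redundant but harmless
Error handling matches Method 1: HTTP and connection errors are caught and reported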
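
The chat endpoint becomes useful for multi-turn conversation once earlier turns are sent back in the messages list. The sketch below shows one possible way to keep that history. It assumes a non-streaming /api/chat result carries the reply under message.content; the helper names (extract_chat_text, add_user_turn, add_assistant_turn) are hypothetical, and fake_result is a hand-written stand-in rather than real model output.

```python
import json


def extract_chat_text(result):
    """Return the assistant text from a non-streaming /api/chat result, or None."""
    if not isinstance(result, dict):
        return None
    message = result.get("message") or {}
    return message.get("content")


def add_user_turn(messages, text, encoded_images=None):
    """Append a user message; attach Base64 images only when some are given."""
    message = {"role": "user", "content": text}
    if encoded_images:
        message["images"] = list(encoded_images)
    messages.append(message)
    return messages


def add_assistant_turn(messages, result):
    """Append the model's reply so the next request carries the history."""
    text = extract_chat_text(result)
    if text is not None:
        messages.append({"role": "assistant", "content": text})
    return messages


# Hand-written stand-in for a real /api/chat response (assumption).
fake_result = {"message": {"role": "assistant", "content": "Two people."}, "done": True}

history = []
add_user_turn(history, "How many people are in the image?", ["<base64 data>"])
add_assistant_turn(history, fake_result)
add_user_turn(history, "What are they doing?")

# This dictionary is what would be posted to /api/chat for the follow-up question.
payload = {"model": "qwen2.5vl:latest", "messages": history, "stream": False}
print(json.dumps(payload, indent=2))
```

Keeping the history in a plain list means the same list can be reused for every request in a session, and a failed request (result of None) leaves it unchanged.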

General utility function

def image_to_base64(image_path):
    """Convert an image file to a Base64-encoded string; return None if it cannot be read."""
    try:
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode("utf-8")
    except IOError as e:
        print(f"Unable to read image file: {e}")
        return None
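
The helper above is not yet used by either method. As a rough sketch of how it could be, the example below builds a chat request body from several image paths and skips any file that cannot be read. The helper is repeated so the block runs on its own; build_chat_payload is a hypothetical name, and the file contents are placeholder bytes rather than a real image.

```python
import base64
import os
import tempfile


def image_to_base64(image_path):
    """Same helper as above, repeated so this sketch is self-contained."""
    try:
        with open(image_path, "rb") as image_file:
            return base64.b64encode(image_file.read()).decode("utf-8")
    except IOError as e:
        print(f"Unable to read image file: {e}")
        return None


def build_chat_payload(model, prompt, image_paths=()):
    """Hypothetical helper: build an /api/chat body, skipping unreadable images."""
    encoded = [image_to_base64(path) for path in image_paths]
    encoded = [item for item in encoded if item is not None]
    message = {"role": "user", "content": prompt}
    if encoded:
        message["images"] = encoded
    return {"model": model, "messages": [message], "stream": False}


# Demonstrate with one readable placeholder file and one missing file.
with tempfile.TemporaryDirectory() as tmp_dir:
    real_path = os.path.join(tmp_dir, "real.jpg")
    missing_path = os.path.join(tmp_dir, "missing.jpg")
    with open(real_path, "wb") as f:
        f.write(b"placeholder bytes")
    payload = build_chat_payload(
        "qwen2.5vl:latest", "Describe this image:", [real_path, missing_path]
    )

print(len(payload["messages"][0]["images"]))  # 1 (the missing file was skipped)
```

Separating payload construction from the HTTP call also makes the request body easy to inspect or test without a running Ollama server.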