Integrating the gh_mirrors/ll/llama Project with a Database: A Storage Scheme for Inference Results

余怡桔Solomon · Published 2025-10-06 06:14:29

Project repository: llama (Inference code for LLaMA models), https://gitcode.com/gh_mirrors/ll/llama

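Introduction

When generating text with the gh_mirrors/ll/llama project, inference results are normally just printed to the console, which does not support later analysis, auditing, or reuse. This article describes how to store the results of llama model inference in a database so that they persist.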

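Overview

This scheme extends the llama project's generation code so that results are automatically stored in a SQLite database once text generation finishes. It involves the following steps:

Define the database schema
Modify the generation code to add database storage logic
Implement the data storage interface
Test and verify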

Database design

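ER diagram

[Mermaid ER diagram not preserved: one completion row relates to many chat rows through chat.completion_id.]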

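Table structure

Table	Column	Type	Description
completion	id	INTEGER	Primary key
	type	TEXT	Type (text/chat)
	prompt	TEXT	Input prompt
	generation	TEXT	Generated result
	created_at	DATETIME	Creation time
	temperature	REAL	Temperature parameter
	top_p	REAL	top_p parameter
	max_gen_len	INTEGER	Maximum generation length
chat	id	INTEGER	Primary key
	completion_id	INTEGER	Foreign key referencing the completion table
	role	TEXT	Role (system/user/assistant)
	content	TEXT	Message content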
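
To make the relationship between the two tables concrete, the following minimal sketch builds the same schema in an in-memory SQLite database and reads a dialog back with a join. The sample row values and the `fetch_dialog` helper are illustrative assumptions, not part of the llama project.

```python
import sqlite3

SCHEMA = """
CREATE TABLE completion (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    type TEXT NOT NULL,
    prompt TEXT NOT NULL,
    generation TEXT NOT NULL,
    created_at DATETIME NOT NULL,
    temperature REAL NOT NULL,
    top_p REAL NOT NULL,
    max_gen_len INTEGER NOT NULL
);
CREATE TABLE chat (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    completion_id INTEGER NOT NULL,
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    FOREIGN KEY (completion_id) REFERENCES completion (id)
);
"""


def fetch_dialog(conn, completion_id):
    """Hypothetical helper: return (role, content) pairs for one completion."""
    rows = conn.execute(
        """
        SELECT chat.role, chat.content
        FROM chat
        JOIN completion ON completion.id = chat.completion_id
        WHERE completion.id = ?
        ORDER BY chat.id
        """,
        (completion_id,),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Insert one made-up chat completion and its two messages.
cur = conn.execute(
    "INSERT INTO completion (type, prompt, generation, created_at, temperature, top_p, max_gen_len) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("chat", "hello", "hi there", "2025-01-01T00:00:00", 0.6, 0.9, 64),
)
completion_id = cur.lastrowid
conn.executemany(
    "INSERT INTO chat (completion_id, role, content) VALUES (?, ?, ?)",
    [(completion_id, "user", "hello"), (completion_id, "assistant", "hi there")],
)
conn.commit()

dialog = fetch_dialog(conn, completion_id)
print(dialog)
```

Ordering by `chat.id` relies on messages being inserted in conversation order, which is how the article's storage code writes them.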

Implementation steps

1. Create the database module

First, create a new database module file, database.py, in the project root:

import datetime
import json
import sqlite3
from contextlib import closing
from typing import Any, Dict, List, Optional

class CompletionDB:
    def __init__(self, db_path: str = "llama_completions.db"):
        """Remember the database path and create the tables if needed."""
        self.db_path = db_path
        self._create_tables()

    def _connect(self) -> sqlite3.Connection:
        """Open a connection with foreign-key enforcement enabled."""
        conn = sqlite3.connect(self.db_path)
        # SQLite only enforces FOREIGN KEY clauses when this pragma is set
        conn.execute("PRAGMA foreign_keys = ON")
        return conn

    def _create_tables(self):
        """Create the tables."""
        # closing() closes the connection; the inner `conn` context commits or rolls back
        with closing(self._connect()) as conn, conn:
            # Create the completion table
            conn.execute('''
            CREATE TABLE IF NOT EXISTS completion (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                type TEXT NOT NULL,
                prompt TEXT NOT NULL,
                generation TEXT NOT NULL,
                created_at DATETIME NOT NULL,
                temperature REAL NOT NULL,
                top_p REAL NOT NULL,
                max_gen_len INTEGER NOT NULL
            )
            ''')

            # Create the chat table
            conn.execute('''
            CREATE TABLE IF NOT EXISTS chat (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                completion_id INTEGER NOT NULL,
                role TEXT NOT NULL,
                content TEXT NOT NULL,
                FOREIGN KEY (completion_id) REFERENCES completion (id)
            )
            ''')

    def save_text_completion(self, prompt: str, generation: str,
                             temperature: float, top_p: float, max_gen_len: int) -> int:
        """Save a text completion result and return its row id."""
        created_at = datetime.datetime.now().isoformat()

        with closing(self._connect()) as conn, conn:
            cursor = conn.execute('''
            INSERT INTO completion (type, prompt, generation, created_at, temperature, top_p, max_gen_len)
            VALUES (?, ?, ?, ?, ?, ?, ?)
            ''', ("text", prompt, generation, created_at, temperature, top_p, max_gen_len))
            return cursor.lastrowid

    def save_chat_completion(self, dialog: List[Dict[str, str]], generation: Dict[str, str],
                             temperature: float, top_p: float, max_gen_len: int) -> int:
        """Save a chat completion result and return its row id."""
        created_at = datetime.datetime.now().isoformat()
        # Serialize the dialog as JSON (not str()) so the prompt column can be parsed back
        prompt = json.dumps(dialog, ensure_ascii=False)
        generation_text = generation["content"]

        with closing(self._connect()) as conn, conn:
            cursor = conn.execute('''
            INSERT INTO completion (type, prompt, generation, created_at, temperature, top_p, max_gen_len)
            VALUES (?, ?, ?, ?, ?, ?, ?)
            ''', ("chat", prompt, generation_text, created_at, temperature, top_p, max_gen_len))
            completion_id = cursor.lastrowid

            # Save the dialog history, followed by the generated reply
            messages = list(dialog) + [generation]
            conn.executemany('''
            INSERT INTO chat (completion_id, role, content)
            VALUES (?, ?, ?)
            ''', [(completion_id, msg["role"], msg["content"]) for msg in messages])

        return completion_id

    def get_completion(self, completion_id: int) -> Optional[Dict[str, Any]]:
        """Return the details of one completion, or None if the id is unknown."""
        columns = ["id", "type", "prompt", "generation", "created_at",
                   "temperature", "top_p", "max_gen_len"]

        with closing(self._connect()) as conn:
            # Name the columns explicitly so the mapping does not depend on table order
            row = conn.execute(
                f"SELECT {', '.join(columns)} FROM completion WHERE id = ?",
                (completion_id,),
            ).fetchone()

            if row is None:
                return None

            result = dict(zip(columns, row))

            # For chat completions, also fetch the dialog history in insertion order
            if result["type"] == "chat":
                chat_rows = conn.execute('''
                SELECT role, content FROM chat WHERE completion_id = ? ORDER BY id
                ''', (completion_id,)).fetchall()
                result["chat"] = [{"role": r[0], "content": r[1]} for r in chat_rows]

        return result
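
The chat table declares a foreign key to completion, but SQLite enforces such constraints only when `PRAGMA foreign_keys = ON` has been set on the connection (unless the library was compiled with a different default). The sketch below illustrates the difference on a trimmed-down version of the schema; the `try_orphan_insert` helper is a hypothetical name introduced for this example.

```python
import sqlite3


def try_orphan_insert(enforce: bool) -> bool:
    """Return True if a chat row pointing at a missing completion is accepted."""
    conn = sqlite3.connect(":memory:")
    if enforce:
        # Must be set per connection, before any rows are written.
        conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE completion (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE chat ("
        " id INTEGER PRIMARY KEY,"
        " completion_id INTEGER NOT NULL,"
        " FOREIGN KEY (completion_id) REFERENCES completion (id))"
    )
    try:
        # No completion with id 999 exists, so this row is an orphan.
        conn.execute("INSERT INTO chat (completion_id) VALUES (999)")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


accepted_default = try_orphan_insert(enforce=False)
accepted_enforced = try_orphan_insert(enforce=True)
print(accepted_default, accepted_enforced)
```

With enforcement on, the orphan insert raises `sqlite3.IntegrityError`, which protects the chat table from rows whose completion was never written.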

Modify the generation code

Modify the text completion code

Edit example_text_completion.py to add database storage:

# Import the database module at the top of the file
from database import CompletionDB

# At the end of main(), extend the existing print loop with the save logic
# Initialize the database
db = CompletionDB()

# Print and save the results
for prompt, result in zip(prompts, results):
    print(prompt)
    print(f"> {result['generation']}")
    print("\n==================================\n")
    
    # Save to the database
    db.save_text_completion(
        prompt=prompt,
        generation=result["generation"],
        temperature=temperature,
        top_p=top_p,
        max_gen_len=max_gen_len
    )

Modify the chat completion code

Edit example_chat_completion.py to add database storage:

# Import the database module at the top of the file
from database import CompletionDB

# At the end of main(), extend the existing print loop with the save logic
# Initialize the database
db = CompletionDB()

# Print and save the results
for dialog, result in zip(dialogs, results):
    for msg in dialog:
        print(f"{msg['role'].capitalize()}: {msg['content']}\n")
    print(
        f"> {result['generation']['role'].capitalize()}: {result['generation']['content']}"
    )
    print("\n==================================\n")
    
    # Save to the database
    db.save_chat_completion(
        dialog=dialog,
        generation=result["generation"],
        temperature=temperature,
        top_p=top_p,
        max_gen_len=max_gen_len
    )

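Extend the generation module

To make database storage more general, the Llama class in llama/generation.py can be extended directly with storage support:

# Import the database module at the top of the file
# (assumes the project root, where database.py lives, is on sys.path)
from database import CompletionDB

# Add database support to the Llama class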
class Llama:
    # ... existing code ...
    
    def __init__(self, model: Transformer, tokenizer: Tokenizer, db_path: str = "llama_completions.db"):
        self.model = model
        self.tokenizer = tokenizer
        self.db = CompletionDB(db_path)  # Initialize the database (creates the tables if missing)
    
    def text_completion(
        self,
        prompts: List[str],
        temperature: float = 0.6,
        top_p: float = 0.9,
        max_gen_len: Optional[int] = None,
        logprobs: bool = False,
        echo: bool = False,
        save_to_db: bool = True,  # New option: save results to the database
    ) -> List[CompletionPrediction]:
        """
        Perform text completion for a list of prompts using the language generation model.
        
        Args:
            # ... existing parameters ...
            save_to_db (bool, optional): Whether to save results to database. Defaults to True.
        """
        # ... existing code ...
        
        # Collect the results in a variable so they can be saved before returning
        completions = [{"generation": self.tokenizer.decode(t)} for t in generation_tokens]
        
        # Save the results
        if save_to_db:
            for prompt, completion in zip(prompts, completions):
                self.db.save_text_completion(
                    prompt=prompt,
                    generation=completion["generation"],
                    temperature=temperature,
                    top_p=top_p,
                    max_gen_len=max_gen_len or self.model.params.max_seq_len - 1
                )
        
        return completions
    
    def chat_completion(
        self,
        dialogs: List[Dialog],
        temperature: float = 0.6,
        top_p: float = 0.9,
        max_gen_len: Optional[int] = None,
        logprobs: bool = False,
        save_to_db: bool = True,  # New option: save results to the database
    ) -> List[ChatPrediction]:
        """
        Generate assistant responses for a list of conversational dialogs.
        
        Args:
            # ... existing parameters ...
            save_to_db (bool, optional): Whether to save results to database. Defaults to True.
        """
        # ... existing code; collect the results in a list named predictions ...
        
        # Save the results
        if save_to_db:
            for dialog, prediction in zip(dialogs, predictions):
                self.db.save_chat_completion(
                    dialog=dialog,
                    generation=prediction["generation"],
                    temperature=temperature,
                    top_p=top_p,
                    max_gen_len=max_gen_len or self.model.params.max_seq_len - 1
                )
        
        return predictions
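
If modifying llama/generation.py is undesirable, a similar effect can be had from outside the class. The sketch below is one possible alternative, not the method described in this article: a hypothetical `completion_with_storage` function wraps any object exposing `text_completion` and passes each result to a caller-supplied `save` callback. `FakeGenerator` is a stand-in for a real `Llama` instance so that the example runs without model weights.

```python
from typing import Callable, Dict, List


class FakeGenerator:
    """Stand-in for a Llama instance; echoes the prompt instead of running a model."""

    def text_completion(self, prompts: List[str], temperature: float = 0.6,
                        top_p: float = 0.9, max_gen_len: int = 64) -> List[Dict[str, str]]:
        return [{"generation": f"<completion of: {p}>"} for p in prompts]


def completion_with_storage(generator, prompts: List[str], save: Callable[..., object],
                            temperature: float = 0.6, top_p: float = 0.9,
                            max_gen_len: int = 64) -> List[Dict[str, str]]:
    """Hypothetical wrapper: run text completion, then hand each result to `save`."""
    results = generator.text_completion(
        prompts, temperature=temperature, top_p=top_p, max_gen_len=max_gen_len
    )
    for prompt, result in zip(prompts, results):
        save(
            prompt=prompt,
            generation=result["generation"],
            temperature=temperature,
            top_p=top_p,
            max_gen_len=max_gen_len,
        )
    return results


saved = []  # collects what would otherwise be written to the database
results = completion_with_storage(
    FakeGenerator(), ["Hello"], save=lambda **row: saved.append(row), temperature=0.5
)
print(saved)
```

Because the keyword names match, passing `save=db.save_text_completion` with the `CompletionDB` class from this article should plug in directly, and the generation module itself stays untouched.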

Usage examples

Text completion with storage

# Use the modified text completion
prompts = [
    "I believe the meaning of life is",
    "Simply put, the theory of relativity states that ",
]

results = generator.text_completion(
    prompts,
    max_gen_len=64,
    temperature=0.6,
    top_p=0.9,
    save_to_db=True  # Enable database storage
)

Chat completion with storage

# Use the modified chat completion
dialogs = [
    [{"role": "user", "content": "what is the recipe of mayonnaise?"}],
    [
        {"role": "user", "content": "I am going to Paris, what should I see?"},
        {
            "role": "assistant",
            "content": "Paris, the capital of France, is known for its stunning architecture...",
        },
        {"role": "user", "content": "What is so great about #1?"},
    ],
]

results = generator.chat_completion(
    dialogs,
    max_gen_len=128,
    temperature=0.7,
    top_p=0.9,
    save_to_db=True  # Enable database storage
)
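
Once results are stored, they can be inspected with ordinary SQL. The sketch below assumes the completion table defined earlier and uses an in-memory database with made-up rows so that it runs on its own; against real data, the connection would point at llama_completions.db instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
conn.execute(
    """
    CREATE TABLE completion (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        type TEXT NOT NULL,
        prompt TEXT NOT NULL,
        generation TEXT NOT NULL,
        created_at DATETIME NOT NULL,
        temperature REAL NOT NULL,
        top_p REAL NOT NULL,
        max_gen_len INTEGER NOT NULL
    )
    """
)
# Made-up sample data standing in for stored inference results.
sample_rows = [
    ("text", "prompt A", "output A", "2025-01-01T10:00:00", 0.6, 0.9, 64),
    ("chat", "prompt B", "output B", "2025-01-02T10:00:00", 0.7, 0.9, 128),
    ("text", "prompt C", "output C", "2025-01-03T10:00:00", 0.6, 0.9, 64),
]
conn.executemany(
    "INSERT INTO completion (type, prompt, generation, created_at, temperature, top_p, max_gen_len) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    sample_rows,
)

# Most recent text completions first; ISO-8601 strings sort chronologically.
recent_text = conn.execute(
    "SELECT id, prompt, generation FROM completion "
    "WHERE type = ? ORDER BY created_at DESC LIMIT ?",
    ("text", 10),
).fetchall()

# Number of completions per (type, temperature) setting.
counts = {
    (row["type"], row["temperature"]): row["n"]
    for row in conn.execute(
        "SELECT type, temperature, COUNT(*) AS n FROM completion GROUP BY type, temperature"
    )
}
print([row["prompt"] for row in recent_text], counts)
```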

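Summary

With the scheme described here, the inference results of the llama model are stored in a SQLite database, which provides:

Structured storage of inference results for later querying and analysis
The full dialog history, so context can be traced back
The generation parameters, to help reproduce experiments

The scheme can be extended as needed, for example:

Use a more capable database such as PostgreSQL or MySQL
Add indexes to improve query performance
Implement backup and synchronization
Build a simple web interface for managing and querying the data

This makes the llama project more practical and better suited to production environments that require data persistence.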
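
Two of the extensions listed above, indexing and backup, can be sketched with the standard sqlite3 module alone. The example below is a minimal illustration under stated assumptions: the schema is trimmed to the columns the queries need, the index names are hypothetical, and in-memory databases stand in for real files.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.executescript(
    """
    CREATE TABLE completion (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        type TEXT NOT NULL,
        created_at DATETIME NOT NULL
    );
    CREATE TABLE chat (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        completion_id INTEGER NOT NULL REFERENCES completion (id),
        role TEXT NOT NULL,
        content TEXT NOT NULL
    );
    -- Hypothetical index names; choose columns that match the actual query patterns.
    CREATE INDEX IF NOT EXISTS idx_completion_created_at ON completion (created_at);
    CREATE INDEX IF NOT EXISTS idx_chat_completion_id ON chat (completion_id);
    """
)
source.execute(
    "INSERT INTO completion (type, created_at) VALUES ('text', '2025-01-01T00:00:00')"
)
source.commit()

# Ask SQLite how it would run the per-completion chat lookup.
plan = source.execute(
    "EXPLAIN QUERY PLAN SELECT role, content FROM chat WHERE completion_id = ?", (1,)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

# Copy the whole database into a second connection (a file-backed one works the same way).
backup = sqlite3.connect(":memory:")
source.backup(backup)
backed_up_rows = backup.execute("SELECT COUNT(*) FROM completion").fetchone()[0]
print(plan_text, backed_up_rows)
```

The query plan should mention the index on `chat.completion_id`, which is the column used when loading a dialog history. `Connection.backup` is available from Python 3.7 onward.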
