[ggerganov/llama.cpp] Converting LoRA to GGML to GGUF

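Hi everybody,

I have a Huggingface model (https://huggingface.co/andreabac3/Fauno-Italian-LLM-13B) that I would like to convert to gguf.

It is a LoRA model, and I was able to convert it to ggml using convert-lora-to-ggml.py.

Now, when I try to convert it to gguf with convert-llama-ggml-to-gguf.py, the ggml file generated by the first conversion has a magic number (b'algg') that is not recognized.

What exactly am I doing wrong?

Thanks, Luca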

xcottos

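You don't need to convert a LoRA from GGML to GGUF. I think what you may be doing wrong is trying to load the LoRA with --model or -m? The way LoRA works is that you load the base model and apply the LoRA on top of it. So, in addition to what you linked, you also need the base model in GGUF format to apply the LoRA to. You would then do this when actually loading the model: -m base_model.gguf --lora your_lora.bin

Edit: I think this should work as the base model: https://huggingface.co/TheBloke/LLaMA-13b-GGUF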
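
The point above about applying the LoRA on top of a base model can be illustrated with a toy example. This is only a sketch of the standard LoRA formulation (merged weight = base + (alpha / rank) * B @ A), not llama.cpp's actual code; the function names and all numbers are made up for illustration. It shows why an adapter file on its own is not a loadable model: it only holds the small A and B matrices.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def apply_lora(base, lora_a, lora_b, alpha, rank):
    """Return base + (alpha / rank) * (lora_b @ lora_a) (hypothetical helper)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy 2x2 base weight with a rank-1 adapter (all values are invented).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # shape (2, rank)
lora_a = [[0.5, -0.5]]    # shape (rank, 2)
merged = apply_lora(base, lora_a, lora_b, alpha=2, rank=1)
print(merged)
```

Without `base`, there is nothing for the low-rank update to be added to, which is why both -m and --lora are needed.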

KerfuffleV2

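Thanks Kerfuffle, let me process your answer (I'm new to LLMs), and I'll get back to you once I make some progress.

Thanks again for the explanation, Luca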

xcottos

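Hello, I am facing the same issue.

I have tried different LoRA adapters, but for now I downloaded the two models mentioned in the conversation above. I placed TheBloke/LLaMA-13b-GGUF in the llama.cpp/models directory and andreabac3/Fauno-Italian-LLM-13B in the llama.cpp/models/loras directory. After that, I ran the main command as follows: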

./main -m models/llama-13b.Q8_0.gguf --lora models/loras/adapter_model.bin --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n 256 -p "The conversation between human and AI assistant.\n[|Human|] Qual'è il significato della vita?\n[|AI|] "

However, the result is as follows (earlier output omitted for brevity):

....................................................................................................
llama_new_context_with_model: n_ctx      = 4096
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size  = 3200.00 MB
llama_build_graph: non-view tensors processed: 924/924
llama_new_context_with_model: compute buffer total size = 364.63 MB
llama_apply_lora_from_file_internal: applying lora adapter from 'models/loras/adapter_model.bin' - please wait ...
llama_apply_lora_from_file_internal: unsupported file version
llama_init_from_gpt_params: error: failed to apply lora adapter
main: error: unable to load model

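I am running the latest code, in a Docker container based on the Ubuntu:22.04 image.

/# make --version | head -1
GNU Make 4.3
/# g++ --version | head -1
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

I apologize if I have missed some documentation and am not using this correctly. Being able to use LoRA adapters successfully in llama.cpp would have a significant impact on my project. I am grateful for this repository and the support provided. Any help would be greatly appreciated.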

yuki-tomita-127

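@yuki-tomita-127

> Hello, I am facing the same issue.

Have you converted the LoRA using convert-llama-ggml-to-gguf.py? I think your problem is different from the other person's, because it sounds like you missed that step.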

KerfuffleV2

@KerfuffleV2

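Thank you for your reply.

I apologize for the lack of detail in my previous post. I had already tried using convert-llama-ggml-to-gguf.py. Below are the steps I took; I ran into an error when converting the LoRA from GGML to GGUF.

I used convert-lora-to-ggml.py to convert the original LoRA adapter.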

python convert-lora-to-ggml.py models/loras

Output:

<Output omitted>
Converted models/loras/adapter_config.json and models/loras/adapter_model.bin to models/loras/ggml-adapter-model.bin

This appears to have succeeded.

Then I used convert-llama-ggml-to-gguf.py to convert the LoRA adapter from ggml to gguf.

python convert-llama-ggml-to-gguf.py --input models/loras/ggml-adapter-model.bin --output models/loras/ggml-adapter-model.gguf

Output:


* Using config: Namespace(input=PosixPath('models/loras/ggml-adapter-model.bin'), output=PosixPath('models/loras/ggml-adapter-model.gguf'), name=None, desc=None, gqa=1, eps='5.0e-06', context_length=2048, model_metadata_dir=None, vocab_dir=None, vocabtype='spm')

=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===

- Note: If converting LLaMA2, specifying "--eps 1e-5" is required. 70B models also need "--gqa 8".
* Scanning GGML input file
Traceback (most recent call last):
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 453, in <module>
    main()
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 430, in main
    offset = model.load(data, 0)
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 190, in load
    offset += self.validate_header(data, offset)
  File "/workspaces/llama.cpp/convert-llama-ggml-to-gguf.py", line 175, in validate_header
    raise ValueError(f"Unexpected file magic {magic!r}! This doesn't look like a GGML format file.")
ValueError: Unexpected file magic b'algg'! This doesn't look like a GGML format file.

This is the error I get when I do this, and I believe it is the same result that @xcottos experienced.
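
The magic reported in the traceback can be checked directly by reading the first four bytes of the file. The sketch below is not part of llama.cpp; the table of magics is my understanding of the llama.cpp sources from around the time of this thread and should be treated as an assumption, and `describe_magic` / `describe_file` are hypothetical helper names. As far as I can tell, b'algg' is simply 'ggla' (ggml LoRA adapter) stored in little-endian byte order, which is why a converter meant for full GGML models rejects it.

```python
# Magic bytes as they appear at the start of the file on disk.
# Assumption: taken from llama.cpp sources of that period, may be incomplete.
KNOWN_MAGICS = {
    b"GGUF": "GGUF model",
    b"lmgg": "legacy GGML model",
    b"fmgg": "legacy GGMF model",
    b"tjgg": "legacy GGJT model",
    b"algg": "LoRA adapter written by convert-lora-to-ggml.py",
}


def describe_magic(magic: bytes) -> str:
    """Map the first four bytes of a file to a human-readable description."""
    return KNOWN_MAGICS.get(magic[:4], f"unknown magic {magic[:4]!r}")


def describe_file(path: str) -> str:
    """Read the first four bytes of `path` and describe them."""
    with open(path, "rb") as f:
        return describe_magic(f.read(4))


print(describe_magic(b"algg"))
```

Calling `describe_file("models/loras/ggml-adapter-model.bin")` on the file from the steps above should then report a LoRA adapter rather than a full model.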

yuki-tomita-127

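@yuki-tomita-127

Oh, I'm very sorry. I meant to write convert-lora-to-ggml.py there. My mistake. convert-llama-ggml-to-gguf.py is for converting actual models from GGML to GGUF.

So, to be clear, you would use convert-lora-to-ggml.py to convert the LoRA from the original HuggingFace format (or whatever) to the correct format. After that, you don't need any further conversion steps (such as from GGML to GGUF). You can load the output of convert-lora-to-ggml.py using --lora with the main example, etc.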
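
The two-step workflow described above can be sketched as a small helper that builds the two command lines without running them. The script name, the -m and --lora flags, and the ggml-adapter-model.bin output name all come from this thread; the function name `lora_commands` and the idea of wrapping the commands in Python are purely illustrative.

```python
from pathlib import PurePosixPath


def lora_commands(base_model: str, lora_dir: str):
    """Return (convert_cmd, run_cmd) as argv lists; nothing is executed."""
    # convert-lora-to-ggml.py writes ggml-adapter-model.bin into the same
    # directory, according to the conversion output quoted in this thread.
    adapter = PurePosixPath(lora_dir) / "ggml-adapter-model.bin"
    convert_cmd = ["python", "convert-lora-to-ggml.py", lora_dir]
    run_cmd = ["./main", "-m", base_model, "--lora", str(adapter)]
    return convert_cmd, run_cmd


convert_cmd, run_cmd = lora_commands("models/llama-13b.Q8_0.gguf", "models/loras")
print(" ".join(convert_cmd))
print(" ".join(run_cmd))
```

Note that the path passed to --lora is the converted ggml-adapter-model.bin, not the original adapter_model.bin, and that no GGUF conversion step appears for the adapter.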

KerfuffleV2

@yuki-tomita-127

> ./main -m models/llama-13b.Q8_0.gguf --lora models/loras/adapter_model.bin --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n 256 -p "The conversation between human and AI assistant.\n[|Human|] Qual'è il significato della vita?\n[|AI|] "

In your launch command, shouldn't you change models/loras/adapter_model.bin to models/loras/ggml-adapter-model.bin?

Galunid

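> In your launch command, shouldn't you change

Their problem was that I accidentally pointed them to the wrong script, so they were not able to generate a converted adapter at all.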

KerfuffleV2

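@KerfuffleV2 @Galunid

Indeed, passing the output of convert-lora-to-ggml.py to the --lora option of the main command worked! I had misunderstood the file format conversion. I really appreciate the series of answers you provided!

Also, I would like to apologize if I took over this thread a bit; @xcottos is the one who started this discussion. Sorry about that.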

yuki-tomita-127
