Acceleration¶
LLaMA-Factory supports multiple acceleration techniques, including FlashAttention, Unsloth, and Liger Kernel.
FlashAttention¶
FlashAttention speeds up the attention computation while reducing memory usage.

To use FlashAttention, add the following parameter to the training configuration file before starting training:

```yaml
flash_attn: fa2
```

Unsloth¶
The Unsloth framework supports large language models such as Llama, Mistral, Phi-3, Gemma, Yi, DeepSeek, and Qwen, and provides 4-bit and 16-bit QLoRA/LoRA fine-tuning. It improves computation speed while reducing memory usage.

To use Unsloth, add the following parameter to the training configuration file before starting training:

```yaml
use_unsloth: True
```

Liger Kernel¶
Liger Kernel is a performance optimization framework for large language model training that can effectively improve throughput and reduce memory usage.
To use Liger Kernel, add the following parameter to the training configuration file before starting training:

```yaml
enable_liger_kernel: True
```
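Putting this together, a minimal fine-tuning configuration enabling two of these accelerations might look like the sketch below. The model name, dataset, and output directory here are illustrative placeholders, not values prescribed by this page; only the acceleration flags come from the sections above.

```yaml
### model (placeholder model identifier, shown for illustration)
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
finetuning_type: lora

### acceleration flags from this page
flash_attn: fa2             # use FlashAttention-2 for attention computation
enable_liger_kernel: True   # enable Liger Kernel optimized ops
# use_unsloth: True         # alternatively, enable Unsloth for LoRA/QLoRA training

### dataset and output (placeholders)
dataset: identity
output_dir: saves/llama3-8b/lora/sft
```

Training is then launched by passing this file to the CLI, e.g. `llamafactory-cli train config.yaml`.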