vllm (Large Model Applications and Knowledge Enhancement Projects | Artificial Intelligence)

Project description: A high-throughput and memory-efficient inference and serving engine for LLMs