Installation
Install RTP-LLM
Releases
RTP-LLM 0.2.0
Basic Usage
Sending Requests
OpenAI APIs - Completions
OpenAI APIs - Vision
OpenAI APIs - Embedding
RTP-LLM Native APIs
Backend Tutorials
deepseek
qwen-moe
kimi
Advanced Backend Configuration
ServerArgs
Sampling Configuration
Attention Backend
Supported Models
Large Language Models
Multimodal Language Models
Embedding Models
How to Support New Models
Advanced Features
Speculative Decoding
ReuseCache
Tool and Function Calling
Quantization
LoRA Serving
PD Disaggregation
LogitsProcessor
RTP-LLM Router
FlexLB (Flexible Load Balancer) - Master Role
Benchmark
RTP-LLM Performance Benchmark Tool
References
General Guidance
Developer Reference
CLI Usage Guide
Documentation Manual