Introduction
This article shows how to deploy a DeepSeek-R1 service with a minimal set of commands. It assumes Anaconda is already installed and uses the vLLM framework together with ModelScope, which makes the model easy to download from within China.
Deployment
- Create a virtual environment
conda create -n vllm python=3.11 -y
- Activate the virtual environment
conda activate vllm
- Install the PyTorch framework
pip install torch torchvision torchaudio
- Install vllm and modelscope
pip install vllm
pip install modelscope
- Tell vLLM to download models from ModelScope
export VLLM_USE_MODELSCOPE=True
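If you launch the server from a Python script rather than a shell, the same switch can be set through os.environ; setting it before vLLM is imported is the safe ordering. A minimal sketch:

```python
import os

# Equivalent of `export VLLM_USE_MODELSCOPE=True`; set this before importing vllm.
os.environ["VLLM_USE_MODELSCOPE"] = "True"
print(os.environ["VLLM_USE_MODELSCOPE"])  # True
```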
- Start the service. You can change the model name as needed; see the DeepSeek-R1 model page on ModelScope for the available distilled variants. The tensor-parallel-size parameter specifies the number of GPUs to use.
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --tensor-parallel-size 1 --max-model-len 32768 --enforce-eager
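Note that not every GPU count is a valid tensor-parallel-size: vLLM requires the model's number of attention heads to be evenly divisible by it. A small self-contained sketch of that check (the real head count is in the model's config.json; the value used below is only an illustrative assumption):

```python
def valid_tp_sizes(num_attention_heads: int, max_gpus: int = 8) -> list[int]:
    """Return the tensor-parallel sizes that evenly divide the head count.

    vLLM requires num_attention_heads % tensor_parallel_size == 0.
    """
    return [tp for tp in range(1, max_gpus + 1) if num_attention_heads % tp == 0]

# Illustrative only: look up the actual head count in your model's config.json.
print(valid_tp_sizes(12))  # [1, 2, 3, 4, 6]
```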
Invocation
To call the service using Python:
from openai import OpenAI
client = OpenAI(
    base_url="http://192.168.0.12:8000/v1",  # vLLM serves an OpenAI-compatible API on port 8000 by default
    api_key="key",  # any placeholder works unless the server was started with --api-key
)
messages = [{"role": "user", "content": "你好"}]
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    messages=messages,
    stream=True,
)
for chunk in response:
    delta_content = chunk.choices[0].delta.content
    if delta_content is not None:
        print(delta_content, end="")
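The streaming loop simply concatenates the content fragments carried by each chunk's delta. A self-contained sketch of that accumulation logic, using stand-in objects shaped like the OpenAI streaming chunks instead of a live server (SimpleNamespace is only a mock here):

```python
from types import SimpleNamespace


def make_chunk(content):
    """Build a stand-in object shaped like an OpenAI streaming chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Streams often begin with a role-only delta and end with an empty one (content=None).
stream = [make_chunk(None), make_chunk("你"), make_chunk("好"), make_chunk("!"), make_chunk(None)]

reply = ""
for chunk in stream:
    delta_content = chunk.choices[0].delta.content
    if delta_content is not None:  # skip deltas that carry no text
        reply += delta_content

print(reply)  # 你好!
```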