Introduction

This article shows how to deploy the DeepSeek-R1 service with the simplest possible commands. It assumes Anaconda is already installed and uses the vLLM framework; because the model weights can be downloaded through ModelScope, the whole setup works well from within China.

Deployment

  1. Create a virtual environment
conda create -n vllm python=3.11 -y
  2. Activate the virtual environment
conda activate vllm
  3. Install the PyTorch framework
pip install torch torchvision torchaudio
  4. Install vLLM and ModelScope
pip install vllm
pip install modelscope
  5. Tell vLLM to download the model through ModelScope (an optional pre-download sketch follows this list)
export VLLM_USE_MODELSCOPE=True
  6. Start the service. You can substitute a different model as needed (see the DeepSeek-R1 model page on ModelScope). The --tensor-parallel-size parameter specifies the number of GPUs to use; a quick smoke test of the running server follows this list.
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --tensor-parallel-size 1 --max-model-len 32768 --enforce-eager
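Optionally, before starting the service you can pre-download the model weights, which helps on machines with slow or unstable connections. A minimal sketch using ModelScope's snapshot_download; the weights land in the local ModelScope cache, where vllm serve will pick them up instead of downloading at startup:

from modelscope import snapshot_download

# Download the weights into the local ModelScope cache so that
# vllm serve can reuse them rather than fetching them at startup.
model_dir = snapshot_download("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
print(model_dir)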
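Once the service is running, a quick smoke test is to list the models it serves. This assumes the server is on vLLM's default port 8000 on the local machine; vLLM does not validate the API key, but the client requires a non-empty value:

from openai import OpenAI

# vllm serve exposes an OpenAI-compatible API on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="key")
for model in client.models.list():
    print(model.id)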

Invocation

To call the service from Python, use the OpenAI-compatible client:

from openai import OpenAI

# vllm serve listens on port 8000 by default (11434 is Ollama's port);
# replace the host with your server's address. vLLM does not check the
# API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://192.168.0.12:8000/v1",
                api_key="key")

messages = [{"role": "user", "content": "你好"}]  # "Hello"
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    messages=messages,
    stream=True
)

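# Print the streamed reply token by token as chunks arrive.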
for chunk in response:
    delta = chunk.choices[0].delta
    delta_content = delta.content
    if delta_content is not None:
        print(delta_content, end='')
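If you do not need token-by-token output, the same call works without streaming; reusing the client and messages defined above, the full reply is then available directly on the response object:

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    messages=messages,
    stream=False
)
print(response.choices[0].message.content)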