Install the command-line tool
All of the operations below are performed with the dbgpt command. To use it, first install the DB-GPT project, for example with the following command:
$ pip install -e ".[default]"
It can also be invoked in script mode:
$ python dbgpt/cli/cli_scripts.py
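After installation you can check that the dbgpt command is available by printing its version (the top-level --version option is listed in the help output further below):
$ dbgpt --version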
Start the Model Controller
$ dbgpt start controller
By default, the Model Controller starts on port 8000.
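Before registering any workers, you can confirm that the controller is reachable by querying it with the model client and passing the controller address explicitly (the --address option is described under dbgpt model --help further below); at this point the returned list should still be empty:
$ dbgpt model list --address http://127.0.0.1:8000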
Start Model Workers
:::color2 Start a Worker for the chatglm2-6b model
:::
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--port 8001 \
--controller_addr http://127.0.0.1:8000
:::color2 Start a Worker for the vicuna-13b-v1.5 model
:::
dbgpt start worker --model_name vicuna-13b-v1.5 \
--model_path /app/models/vicuna-13b-v1.5 \
--port 8002 \
--controller_addr http://127.0.0.1:8000
:::danger ⚠️ Note: make sure to use your own model name and model path.
:::
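If GPU memory is tight, the worker accepts quantization and background-run options (see the full list under dbgpt start worker --help further below). As a sketch, here is a variant of the chatglm2-6b command above that loads the model with 8-bit quantization and runs the worker as a background process, reusing the same example model path:
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--load_8bit \
--daemon \
--port 8001 \
--controller_addr http://127.0.0.1:8000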
Start the Embedding model service
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
:::danger
⚠️ Note: make sure to use your own model name and model path.
:::
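When a worker runs on a different machine (or inside a container) than the controller, the address it registers can be set explicitly with --worker_register_host, and --controller_addr must point at the controller's real address. A sketch assuming a hypothetical controller at 192.168.1.2 and a worker host reachable at 192.168.1.10; the same applies to the LLM workers above:
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--worker_register_host 192.168.1.10 \
--controller_addr http://192.168.1.2:8000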
:::success View and check the deployed models
:::
$ dbgpt model list
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| Model Name | Model Type | Host | Port | Healthy | Enabled | Prompt Template | Last Heartbeat |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
| chatglm2-6b | llm | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.287654 |
| WorkerManager | service | 172.17.0.2 | 8001 | True | True | | 2023-09-12T23:04:31.286668 |
| WorkerManager | service | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.845617 |
| WorkerManager | service | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.598439 |
| text2vec | text2vec | 172.17.0.2 | 8003 | True | True | | 2023-09-12T23:04:29.844796 |
| vicuna-13b-v1.5 | llm | 172.17.0.2 | 8002 | True | True | | 2023-09-12T23:04:24.597775 |
+-----------------+------------+------------+------+---------+---------+-----------------+----------------------------+
Use the model services
The model services deployed above can be used through dbgpt_server. First, modify the .env configuration file to change the model service address it connects to:
LLM_MODEL=vicuna-13b-v1.5
# The current default MODEL_SERVER address is the address of the Model Controller
MODEL_SERVER=http://127.0.0.1:8000
Start the Webserver
dbgpt start webserver --light
--light means the embedded model service is not started.
Alternatively, you can specify the model directly on the command line:
LLM_MODEL=chatglm2-6b dbgpt start webserver --light
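Because --log-level is a top-level option of the dbgpt command (see dbgpt --help below), it can be combined with any subcommand. For example, assuming standard Python logging level names, a more verbose start would look like:
LLM_MODEL=chatglm2-6b dbgpt --log-level DEBUG start webserver --light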
Usage example
The following is a complete command-line example of deploying an Embedding model in cluster mode and using it.
- Step 1: Start the Controller Server
dbgpt start controller
- Step 2: Start the embedding model Worker
# Step 2: start the embedding model worker
dbgpt start worker --model_name text2vec \
--model_path /app/models/text2vec-large-chinese \
--worker_type text2vec \
--port 8003 \
--controller_addr http://127.0.0.1:8000
- Step 3: Start the apiserver
dbgpt start apiserver --controller_addr http://127.0.0.1:8000 --api_keys EMPTY
- Step 4: Test the service (first with curl, then with an equivalent Python sketch)
curl http://127.0.0.1:8100/api/v1/embeddings \
-H "Authorization: Bearer EMPTY" \
-H "Content-Type: application/json" \
-d '{
"model": "text2vec",
"input": "Hello world!"
}'
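The same request can also be sent from Python; the snippet below is simply the curl call above translated one-to-one with the requests library (same endpoint, API key and payload):
import requests

# Same endpoint and payload as the curl example above
resp = requests.post(
    "http://127.0.0.1:8100/api/v1/embeddings",
    headers={"Authorization": "Bearer EMPTY"},
    json={"model": "text2vec", "input": "Hello world!"},
)
print(resp.json())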
- Use it from code via the Python package
from dbgpt.rag.embedding import OpenAPIEmbeddings

# Point the client at the embedding endpoint exposed by the apiserver started above
openai_embeddings = OpenAPIEmbeddings(
    api_url="http://localhost:8100/api/v1/embeddings",
    api_key="EMPTY",
    model_name="text2vec",
)
texts = ["Hello, world!", "How are you?"]
# Returns one embedding vector (a list of floats) per input text
embeddings = openai_embeddings.embed_documents(texts)
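Continuing from the snippet above, the returned vectors can be compared directly. As a quick sanity check, here is a cosine-similarity calculation with numpy (numpy is an extra dependency assumed here, not something this API requires):
import numpy as np

# embeddings comes from embed_documents() above: one vector per input text
a, b = np.array(embeddings[0]), np.array(embeddings[1])
cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"Cosine similarity between the two texts: {cosine:.4f}")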
Command-line usage
For more command-line usage, consult the built-in help. The following are some reference examples.
:::color1 View the dbgpt help: dbgpt --help
:::
dbgpt --help
Already connect 'dbgpt'
Usage: dbgpt [OPTIONS] COMMAND [ARGS]...
Options:
--log-level TEXT Log level
--version Show the version and exit.
--help Show this message and exit.
Commands:
install Install dependencies, plugins, etc.
knowledge Knowledge command line tool
model Clients that manage model serving
start Start specific server.
stop Stop specific server.
trace Analyze and visualize trace spans.
:::color1 View the dbgpt start commands: dbgpt start --help
:::
dbgpt start --help
Already connect 'dbgpt'
Usage: dbgpt start [OPTIONS] COMMAND [ARGS]...
Start specific server.
Options:
--help Show this message and exit.
Commands:
apiserver Start apiserver
controller Start model controller
webserver Start webserver(dbgpt_server.py)
worker Start model worker
:::color1 View the help for starting a model worker: dbgpt start worker --help
:::
dbgpt start worker --help
Already connect 'dbgpt'
Usage: dbgpt start worker [OPTIONS]
Start model worker
Options:
--model_name TEXT Model name [required]
--model_path TEXT Model path [required]
--worker_type TEXT Worker type
--worker_class TEXT Model worker class,
pilot.model.cluster.DefaultModelWorker
--model_type TEXT Model type: huggingface, llama.cpp, proxy
and vllm [default: huggingface]
--host TEXT Model worker deploy host [default: 0.0.0.0]
--port INTEGER Model worker deploy port [default: 8001]
--daemon Run Model Worker in background
--limit_model_concurrency INTEGER
Model concurrency limit [default: 5]
--standalone Standalone mode. If True, embedded Run
ModelController
--register Register current worker to model controller
[default: True]
--worker_register_host TEXT The ip address of current worker to register
to ModelController. If None, the address is
automatically determined
--controller_addr TEXT The Model controller address to register
--send_heartbeat Send heartbeat to model controller
[default: True]
--heartbeat_interval INTEGER The interval for sending heartbeats
(seconds) [default: 20]
--log_level TEXT Logging level
--log_file TEXT The filename to store log [default:
dbgpt_model_worker_manager.log]
--tracer_file TEXT The filename to store tracer span records
[default:
dbgpt_model_worker_manager_tracer.jsonl]
--tracer_storage_cls TEXT The storage class to storage tracer span
records
--device TEXT Device to run model. If None, the device is
automatically determined
--prompt_template TEXT Prompt template. If None, the prompt
template is automatically determined from
model path, supported template: zero_shot,
vicuna_v1.1, llama-2, codellama, alpaca,
baichuan-chat, internlm-chat
--max_context_size INTEGER Maximum context size [default: 4096]
--num_gpus INTEGER The number of gpus you expect to use, if it
is empty, use all of them as much as
possible
--max_gpu_memory TEXT The maximum memory limit of each GPU, only
valid in multi-GPU configuration
--cpu_offloading CPU offloading
--load_8bit 8-bit quantization
--load_4bit 4-bit quantization
--quant_type TEXT Quantization datatypes, `fp4` (four bit
float) and `nf4` (normal four bit float),
only valid when load_4bit=True [default:
nf4]
--use_double_quant Nested quantization, only valid when
load_4bit=True [default: True]
--compute_dtype TEXT Model compute type
--trust_remote_code Trust remote code [default: True]
--verbose Show verbose output.
--help Show this message and exit.
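As the option list above shows, a worker can also run with an embedded controller via --standalone, which is handy for a quick single-machine test without running dbgpt start controller separately. A minimal sketch, reusing the example chatglm2-6b path from earlier:
dbgpt start worker --model_name chatglm2-6b \
--model_path /app/models/chatglm2-6b \
--standalone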
:::color1 View the model-serving commands: dbgpt model --help
:::
dbgpt model --help
Already connect 'dbgpt'
Usage: dbgpt model [OPTIONS] COMMAND [ARGS]...
Clients that manage model serving
Options:
--address TEXT Address of the Model Controller to connect to. Just support
light deploy model, If the environment variable
CONTROLLER_ADDRESS is configured, read from the environment
variable
--help Show this message and exit.
Commands:
chat Interact with your bot from the command line
list List model instances
restart Restart model instances
start Start model instances
stop Stop model instances
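As the --address description notes, the model client can also read the controller address from the CONTROLLER_ADDRESS environment variable instead of passing --address every time:
CONTROLLER_ADDRESS=http://127.0.0.1:8000 dbgpt model list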