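How to Build a Docker Image That Provides NLP Services
This article describes how to build a Docker image that provides NLP services. The image offers the following:
- NLP processing methods built on HanLP 2.x and JioNLP, such as named entity extraction, extraction of QQ numbers, WeChat IDs and ID card numbers, and text similarity comparison
- A FastAPI layer that exposes these methods as services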
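To give a feel for the kind of helper the service wraps, here is a minimal sketch that uses only the Python standard library. It is a stand-in, not the HanLP or JioNLP API: the function names `extract_qq` and `text_similarity`, the QQ pattern, and the use of `difflib` for similarity are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical pattern: 5 to 11 digits, no leading zero,
# and not embedded in a longer run of digits.
QQ_PATTERN = re.compile(r"(?<!\d)[1-9]\d{4,10}(?!\d)")


def extract_qq(text: str) -> list[str]:
    """Return QQ-number-like digit runs found in the text."""
    return QQ_PATTERN.findall(text)


def text_similarity(a: str, b: str) -> float:
    """Return a character-level similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()
```

In the real image these helpers would presumably delegate to JioNLP and HanLP rather than to `re` and `difflib`; the sketch only shows the shape of the functions that the FastAPI layer would call.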
Dockerfile contents
# syntax=docker/dockerfile:1
# use build args in the docker build command with --build-arg="BUILDARG=true"
# Override at your own risk - non-root configurations are untested
ARG UID=0
ARG GID=0
FROM python:3.11-slim-bookworm AS base
# Use args
ARG UID
ARG GID
## Basis ##
ENV ENV=prod \
PORT=10068
WORKDIR /app/backend
ENV HOME=/root
# Create user and group if not root
RUN if [ $UID -ne 0 ];then \
if [ $GID -ne 0 ];then \
addgroup --gid $GID app; \
fi; \
adduser --uid $UID --gid $GID --home $HOME --disabled-password --no-create-home app; \
fi
#Make sure the user has access to the app and root directory
RUN chown -R $UID:$GID /app $HOME
RUN apt-get update && \
# Install curl, jq and procps (curl and jq are used by the health check)
apt-get install -y --no-install-recommends curl jq procps && \
# Clean up the apt package lists
rm -rf /var/lib/apt/lists/*
# install python dependencies
COPY --chown=$UID:$GID ./backend/requirements.txt ./requirements.txt
RUN pip install numpy==1.26.4 --no-cache-dir && \
pip3 install torch --index-url https://download.pytorch.org/whl/cpu --no-cache-dir
RUN pip3 install uv && \
uv pip install --system -r requirements.txt --no-cache-dir && \
chown -R $UID:$GID /app/backend
#copy backend files
COPY --chown=$UID:$GID ./backend .
EXPOSE 10068
HEALTHCHECK CMD curl --silent --fail http://localhost:${PORT:-10068}/health | jq -ne 'input.status == true' || exit 1
USER $UID:$GID
CMD ["bash","start.sh"]
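The HEALTHCHECK instruction above succeeds only when `/health` answers without an HTTP error and returns JSON whose `status` field is exactly `true`. As a rough illustration of that contract, the following sketch performs the same check in Python. The function names `payload_reports_healthy` and `is_healthy` and the 3-second timeout are assumptions, not part of the original image.

```python
import json
import urllib.error
import urllib.request


def payload_reports_healthy(body: bytes) -> bool:
    """Mirror jq's `input.status == true`: status must be the JSON value true."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Invalid JSON (or undecodable bytes) counts as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") is True


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only for a successful response whose body reports status true."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return payload_reports_healthy(response.read())
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts and HTTP 4xx/5xx all count as unhealthy,
        # which is the role curl --fail plays in the HEALTHCHECK.
        return False
```

With the container running and the port published, `is_healthy("http://localhost:10068/health")` would be the equivalent call from the host.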
requirements.txt contents
#fastapi
fastapi==0.111.0
uvicorn[standard]==0.30.1
pydantic==2.8.2
python-multipart==0.0.9
requests==2.32.3
aiohttp==3.10.2
#config
Jinja2==3.1.4
alembic==1.13.2
#hanlp
hanlp==2.1.0b60
#jionlp
jiojio==1.2.5
jionlp==1.5.15
zipfile36==0.1.3
start.sh contents
#!/usr/bin/env bash
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
cd "$SCRIPT_DIR" || exit
PORT="${PORT:-10068}"
HOST="${HOST:-0.0.0.0}"
exec uvicorn main:app --host "$HOST" --port "$PORT" --forwarded-allow-ips '*'
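start.sh launches `main:app`, so the backend directory needs a `main.py` that exposes an ASGI application named `app`, and the health check expects its `/health` route to return `{"status": true}`. The article does not show `main.py`; the actual service would presumably define its routes with FastAPI. The sketch below is a dependency-free stand-in, written as a bare ASGI callable so that it runs with the standard library alone. Everything except the names `app` and `/health` is an assumption.

```python
import json


async def app(scope, receive, send):
    """Bare ASGI callable; uvicorn can serve it as main:app."""
    if scope["type"] != "http":
        # This sketch only handles plain HTTP requests.
        return

    if scope["path"] == "/health":
        status, payload = 200, {"status": True}
    else:
        status, payload = 404, {"detail": "Not Found"}

    body = json.dumps(payload).encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode("ascii")),
        ],
    })
    await send({"type": "http.response.body", "body": body})
```

Saved as `main.py` next to start.sh, this would be enough for the container to start and pass its health check; the NLP endpoints would then be added on top.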