Medical question answering robot project deployment
This is a project I worked on in August last year, while attending the Zhuogong Class training of the 16th Hospital of China Southern Airlines. I built an intelligent medical system that included this medical question-answering assistant. Owing to limited time and knowledge back then, the assistant was fairly simple: essentially a rule-based question-answering system, which can hardly be called intelligent in the true sense. So at the beginning of this year I revisited the project and applied some NLP algorithms to make it more practical.
1. Pull the TensorFlow image
Pull the TensorFlow image from Docker Hub, then create and enter a container based on it.
# Pull the image
$ docker pull tensorflow/tensorflow:1.14.0-py3
# Create the container
$ docker run -dit --name diagnosis -p 5002:5002 -p 7474:7474 -p 7473:7473 -p 7687:7687 -p 60061:60061 -p 60062:60062 tensorflow/tensorflow:1.14.0-py3
# Enter the container
$ docker exec -it diagnosis bash
Port 5002 is the project's port; 7473, 7474, and 7687 are Neo4j's ports; 60061 and 60062 are used by two other services.
To check the container's TensorFlow version and whether a GPU is available, enter the Python terminal and run the following commands. You can see that the TensorFlow version is 1.14.0.
>>> import tensorflow as tf
>>> tf.__version__
'1.14.0'
>>> tf.test.is_gpu_available()
False
2. Configure the system environment
Check the Ubuntu version; it is 18.04.2.
root@322e47635519:/workspace/Diagnosis-Chatbot# cat /etc/issue
Ubuntu 18.04.2 LTS \n \l
2.1 Replace the software source
First back up the original software source; the command is as follows.
cp /etc/apt/sources.list /etc/apt/sources.list.bak
Because vim is not installed in the image, the contents of /etc/apt/sources.list can only be changed with echo commands. The Aliyun mirror entries are as follows.
echo "">/etc/apt/sources.list
echo "deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse">>/etc/apt/sources.list
echo "deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse">>/etc/apt/sources.list
echo "deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse">>/etc/apt/sources.list
echo "deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse">>/etc/apt/sources.list
echo "deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse">>/etc/apt/sources.list
echo "deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse">>/etc/apt/sources.list
echo "deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse">>/etc/apt/sources.list
echo "deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse">>/etc/apt/sources.list
echo "deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse">>/etc/apt/sources.list
echo "deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse">>/etc/apt/sources.list
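If the long run of echo commands feels error-prone, the same file can be written in one shot with a heredoc. The sketch below is an assumption-laden demo: it writes only two sample Aliyun entries to a temporary path instead of /etc/apt/sources.list, so it is safe to try before touching the real file.

```shell
# Write the sources in one heredoc instead of repeated echo lines.
# Demo path used here; in the container it would be /etc/apt/sources.list.
SOURCES=/tmp/sources.list.demo
cat > "$SOURCES" <<'EOF'
deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
EOF
# Count the deb entries we just wrote
grep -c '^deb ' "$SOURCES"
```

The quoted 'EOF' delimiter prevents any shell expansion inside the heredoc, so the lines land in the file exactly as written.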
Update the package index and upgrade the installed packages.
apt-get update
apt-get upgrade
When changing the software source, I first tried the Tsinghua mirror, but when installing vim I was told that some of its dependencies could not be downloaded. After searching online, it appeared to be a problem with that mirror, so I switched to the Aliyun mirror.
2.2 Install vim
vim is needed later to edit files, so install it.
apt-get install vim -y
After installation completes, check the vim version with the following command.
vim --version
2.3 Solve the problem of Chinese garbled characters in vim
Modify /etc/vim/vimrc
the content and add the following content at the end:
set fileencodings=utf-8,ucs-bom,gb18030,gbk,gb2312,cp936
set termencoding=utf-8
set encoding=utf-8
After setting, the Chinese in the file can be displayed normally.
2.4 Install Neo4J graph database
For detailed steps, please see my other blog post, Installing Neo4j graph database under Linux system.
2.5 Install Network Toolkit
apt-get install inetutils-ping
apt-get install net-tools
3. Run the project
3.1 Copy the project to the container
First create a workspace directory in the container and put the project code into it.
root@322e47635519:/# mkdir workspace
Copy the project code file on the local machine to the working directory of the container.
$ docker cp "<path to the project on the host>" diagnosis:/workspace/
The above command copies the project into the /workspace/ directory of the diagnosis container.
3.2 Toolkits needed to install the project
First, upgrade pip; the command is as follows.
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade pip
Use pip to download each toolkit; the -i flag specifies the Tsinghua mirror, and the last argument is the package name.
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple packageName
In this container, the packages I need to install are as follows:
# Import data into the Neo4j database
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple py2neo==2021.2.3
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pandas==1.1.5
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tqdm==4.62.3
# Start the Q&A assistant service
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple numpy==1.19.5
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple flask==1.1.4
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple flask_cors==3.0.10
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple scikit-learn==0.24.1
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple requests==2.26.0
# BiLSTM algorithm
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pyahocorasick==1.4.2
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple gevent==1.5.0
# Intent recognition
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple bert4keras==0.10.8
# Speech recognition
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple huggingface_hub==0.0.6
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple hyperpyyaml==0.0.1
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple joblib==0.14.1
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pre-commit==2.3.0
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple sentencepiece==0.1.91
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple SoundFile==0.10.2
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch==1.8.0
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torchaudio==0.8.0
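The installs above can also be collected into a requirements file and installed in one command with pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt. The file below simply mirrors the list above (the file name requirements.txt is a convention, not something shipped with the project):

```
py2neo==2021.2.3
pandas==1.1.5
tqdm==4.62.3
numpy==1.19.5
flask==1.1.4
flask_cors==3.0.10
scikit-learn==0.24.1
requests==2.26.0
pyahocorasick==1.4.2
gevent==1.5.0
bert4keras==0.10.8
huggingface_hub==0.0.6
hyperpyyaml==0.0.1
joblib==0.14.1
pre-commit==2.3.0
sentencepiece==0.1.91
SoundFile==0.10.2
torch==1.8.0
torchaudio==0.8.0
```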
When starting the service, an error was reported: OSError: sndfile library not found. The cause is that libsndfile is missing and needs to be installed. The installation command is as follows.
$ apt-get install libsndfile1
3.3 Import data
First start the Neo4j service in the container.
neo4j start
The project contains a build_kg folder; enter it and run the build_kg_utils.py program to import the data into the Neo4j database.
$ python build_kg_utils.py
This process can take several hours.
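To give a feel for what an import script like this does, here is a hedged sketch. The Disease label and its fields are hypothetical, not the project's actual schema, and only the Cypher statement-building part runs without a Neo4j server; the py2neo calls are shown in a comment.

```python
# Illustrative sketch of the kind of import build_kg_utils.py performs.
# The Disease label and fields are hypothetical examples.

def disease_merge_cypher(row):
    """Build a MERGE statement for one disease record (escapes quotes)."""
    name = row["name"].replace('"', '\\"')
    desc = row.get("desc", "").replace('"', '\\"')
    return f'MERGE (d:Disease {{name: "{name}"}}) SET d.desc = "{desc}"'

rows = [
    {"name": "感冒", "desc": "common cold"},
    {"name": "高血压", "desc": "hypertension"},
]
statements = [disease_merge_cypher(r) for r in rows]
for s in statements:
    print(s)

# With a running Neo4j instance, the statements would be executed roughly as:
# from py2neo import Graph
# graph = Graph("bolt://localhost:7687", auth=("neo4j", "password"))
# for s in statements:
#     graph.run(s)
```

In practice the real script iterates over tens of thousands of records, which is why the import can take hours.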
3.4 Open entity extraction service
The BiLSTM algorithm code is stored under knowledge_extraction\bilstm in the project root, and its service needs to be started.
$ python app.py
Because the project reuses algorithm code written by others, version-incompatibility errors appear at startup. The original author used TensorFlow 1.x, whose API differs in many places, so the fixes are recorded here.
First, modify the following code in app.py:
| old code | new code |
| --- | --- |
| config = tf.ConfigProto() | config = tf.compat.v1.ConfigProto() |
| sess = tf.Session(config=config) | sess = tf.compat.v1.Session(config=config) |
| graph = tf.get_default_graph() | graph = tf.compat.v1.get_default_graph() |
3.5 Open the intent recognition service
The intent recognition code, which uses a BERT model, is stored under nlu\intent_recg_bert in the project root, and its service needs to be started.
$ python app.py
As in the previous section, version-incompatibility errors appeared at startup because the original author used TensorFlow 1.x; the fixes are recorded here.
First, modify the following code in app.py:
| old code | new code |
| --- | --- |
| config = tf.ConfigProto() | config = tf.compat.v1.ConfigProto() |
| sess = tf.Session(config=config) | sess = tf.compat.v1.Session(config=config) |
| graph = tf.get_default_graph() | graph = tf.compat.v1.get_default_graph() |
3.6 Open the Q&A Assistant Service
In app.py, check the host and port. The host must be set to 0.0.0.0, otherwise the project cannot be opened from the host machine, and the port must match the one mapped when creating the container (here I used 5002). Also turn off debug mode.
app.run(host='0.0.0.0', port=5002, debug=False, threaded=True)
After completing the above operations, enter the following command in the terminal to start the project:
$ python app.py
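For orientation, the skeleton of such a Flask service looks roughly like this. The /qa route and its echo-style answer are purely illustrative assumptions, not the project's actual code; the real app.py routes the question through entity extraction, intent recognition, and the knowledge graph.

```python
# Minimal Flask service skeleton (illustrative; route name /qa is assumed).
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/qa", methods=["POST"])
def qa():
    question = request.get_json(force=True).get("question", "")
    # Placeholder answer; the real service queries the knowledge graph here.
    return jsonify({"answer": f"received: {question}"})

if __name__ == "__main__":
    # Host 0.0.0.0 so the mapped port is reachable from outside the container.
    app.run(host="0.0.0.0", port=5002, debug=False, threaded=True)
```

The host/port line at the bottom is exactly the setting discussed above: binding to 0.0.0.0 rather than 127.0.0.1 is what makes the container's mapped port reachable from the host.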
3.7 Effect display
netstat can list all listening TCP and UDP ports, including the services using them and the socket status.
$ netstat -tunlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:7687 0.0.0.0:* LISTEN 1241/java
tcp 0 0 0.0.0.0:5002 0.0.0.0:* LISTEN 1727/python
tcp 0 0 0.0.0.0:7474 0.0.0.0:* LISTEN 1241/java
tcp 0 0 127.0.0.1:60061 0.0.0.0:* LISTEN 1753/python
tcp 0 0 127.0.0.1:60062 0.0.0.0:* LISTEN 1779/python
- -t: display TCP ports
- -u: display UDP ports
- -n: show numeric addresses instead of hostnames
- -l: show only listening ports
- -p: display the PID and name of the process
Now enter localhost:5002 in the browser on the host machine to open the project page successfully!
![](https://media.giphy.com/media/jCOvxQUnFOrMfktO6y/giphy.gif)
4. Build project image
Now package the container into an image to make deployment on other systems easy. Here I cover two ways to build the project image: docker commit and a Dockerfile.
4.1 Docker commit build
In Docker, an image is layered storage: each layer is a modification on top of the one below it. A container is also layered storage; it uses the image as its base layers and adds a writable storage layer on top for changes made at runtime.
In this project, we created the diagnosis container from the tensorflow image and then modified it inside the container. The specific changes can be seen with the docker diff command.
$ docker diff CONTAINER
The docker commit command saves the container's storage layer as an image; in other words, the container's layer is added on top of the original image to form a new image. The syntax of docker commit is:
$ docker commit [OPTIONS] <container ID or name> [<repository>[:<tag>]]
In this project, I use the following command to build the project image:
$ docker commit --author "xxxx" --message "Diagnosis Chatbot Project" diagnosis username/image:tag
Here --author specifies the author and --message describes the change, similar to git version control; both can be omitted. Note that the repository name must be lowercase.
Use the docker image ls command to view the newly created image.
Use the docker run command to create a project container from this image. The environment is already configured in it, so the services can be started directly.
4.2 Dockerfile build
In the future, I plan to use a Dockerfile to build this project's image.
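As a rough, untested sketch of what that Dockerfile might look like: the requirements.txt file, the working-directory name, and the single app.py entry point below are all assumptions based on the layout described in this article, not files confirmed to exist in the project.

```dockerfile
# Hypothetical Dockerfile sketch (untested); paths and file names are
# assumptions based on the layout described in this article.
FROM tensorflow/tensorflow:1.14.0-py3
WORKDIR /workspace/Diagnosis-Chatbot
COPY . .
RUN apt-get update && \
    apt-get install -y vim libsndfile1 && \
    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt
EXPOSE 5002 7473 7474 7687 60061 60062
CMD ["python", "app.py"]
```

Unlike docker commit, a Dockerfile records every build step explicitly, so the image can be rebuilt reproducibly instead of snapshotting a hand-configured container.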
5. Publish project image
Push the image built with docker commit to a remote repository; the command is as follows:
$ docker push username/image:tag