The detailed process of building Whisper, OpenAI's free open source speech recognition project | How to build the OpenAI open source speech recognition project Whisper on Linux

The original text is from my personal blog.

1. Preconditions

The server is a GPU server. Click here to jump to the GPU server I use. I built Whisper on an NVIDIA A100 graphics card with 40 GB of video memory.

The Python version should be between 3.8 and 3.11.

Enter the following command to check the Python version used.

python3 -V
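The same 3.8–3.11 constraint can be checked from inside Python as well. This is just an illustrative sketch of the version rule stated above, not part of Whisper itself:

```python
import sys

# Illustrative check of the 3.8-3.11 requirement mentioned above.
supported = (3, 8) <= sys.version_info[:2] <= (3, 11)
print(f"Python {sys.version_info.major}.{sys.version_info.minor}",
      "is supported" if supported else "is outside the 3.8-3.11 range")
```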

2. Install Anaconda

Why install Anaconda?

In order to reduce version conflicts of libraries used by different projects, we can use Anaconda to create a virtual Python environment.

Download the Anaconda installation script

Find the installer that corresponds to your system.


After the download is complete we can run the script directly.

bash script.sh

You can also run the script in the following way.

chmod +x script.sh
./script.sh

After the installation is complete, you need to reconnect to SSH.

To verify whether the installation is successful, you can use the following command.

conda -V

3. Install FFmpeg

apt install ffmpeg

After entering ffmpeg and pressing Enter, you will see a prompt message, indicating that the installation was successful.
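Whisper shells out to ffmpeg when decoding audio and video, so it must be on the PATH. A small illustrative check, using only the standard library:

```python
import shutil

# Illustrative check that ffmpeg is on PATH (Whisper invokes it to
# decode audio/video files before transcription).
ffmpeg_path = shutil.which("ffmpeg")
if ffmpeg_path:
    print("ffmpeg found at", ffmpeg_path)
else:
    print("ffmpeg not found; install it before running Whisper")
```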

4. Install the graphics card driver

First enter nvidia-smi to view the graphics card information. If information is printed, the graphics card driver is already installed.

If the graphics card driver has not been installed, here are two installation methods.

4.1. Method 1

List the graphics drivers that can be installed.

ubuntu-drivers devices

Install the recommended graphics driver.

apt install nvidia-driver-530

View the graphics card information.

nvidia-smi


4.2. Method 2

Go to the official NVIDIA driver download website to download the driver for your graphics card.

Click here to download .

For details, please refer to this article .

5. Install CUDA

Download CUDA

The CUDA version you download must be less than or equal to the CUDA version shown by nvidia-smi; do not pick a version arbitrarily.
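The rule above can be stated precisely: compare the toolkit version against the driver's supported version numerically, component by component. A small hypothetical helper (not part of any NVIDIA tool) to illustrate:

```python
# Illustrative helper encoding the rule above: the CUDA toolkit you install
# must not be newer than the version nvidia-smi reports for the driver.
def toolkit_ok(driver_cuda: str, toolkit_cuda: str) -> bool:
    def as_tuple(v: str):
        return tuple(int(x) for x in v.split("."))
    return as_tuple(toolkit_cuda) <= as_tuple(driver_cuda)

print(toolkit_ok("12.1", "12.1"))  # True  (equal is fine)
print(toolkit_ok("12.1", "12.2"))  # False (toolkit newer than driver)
```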

Install it by following the official instructions.

Edit ~/.bashrc, add the following command at the end.

export PATH=/usr/local/cuda-12.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-12.1/lib64

Note: Change cuda-12.1 above to the CUDA version you actually installed.

Reload the shell configuration.

source ~/.bashrc
sudo ldconfig

Check whether CUDA is installed.

nvcc -V

If no error was reported during installation but this command does not output version information, your environment variables are missing or misconfigured.

6. Install cuDNN (optional)

To download cuDNN, you must register an NVIDIA account and agree to join their community, otherwise the download is blocked. The download also requires prior authentication, so you cannot fetch it directly on the server; you would only get a web page. Download it on your local computer first, then upload it to the server with rz or scp.

cuDNN download



After the download is complete, extract it to the CUDA directory.

tar -xvf filename
cd folder
sudo cp include/* /usr/local/cuda-12.1/include
sudo cp lib/libcudnn* /usr/local/cuda-12.1/lib64
sudo chmod a+r /usr/local/cuda-12.1/include/cudnn*
sudo chmod a+r /usr/local/cuda-12.1/lib64/libcudnn*
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
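The final command above greps the version macros out of cudnn_version.h. For illustration, the same numbers can be recovered with a short Python sketch; HEADER below holds sample values, not necessarily the ones on your server:

```python
import re

# Illustrative parse of cudnn_version.h: recovers the same numbers the
# grep command above prints. HEADER holds sample values.
HEADER = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 9
#define CUDNN_PATCHLEVEL 1
"""

def cudnn_version(text: str) -> str:
    fields = dict(re.findall(r"#define CUDNN_(MAJOR|MINOR|PATCHLEVEL) (\d+)", text))
    return "{}.{}.{}".format(fields["MAJOR"], fields["MINOR"], fields["PATCHLEVEL"])

print(cudnn_version(HEADER))  # 8.9.1
```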

7. Install PyTorch

Click here to download PyTorch

image-20230512162536942

Note: The PyTorch build you install must match your CUDA version.

When installing, just copy the official command directly.

Then we can use the following command to verify whether the installation was successful.

python
import torch
torch.__version__
torch.cuda.is_available()

The last command is the key: Whisper can use the graphics card to transcribe only if it returns True; otherwise it falls back to the CPU. If it returns False, the CUDA version your PyTorch build targets probably does not match the CUDA version installed on your server.
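In PyTorch, torch.version.cuda reports the CUDA version the wheel was built against (None for CPU-only builds). A hypothetical helper, not a PyTorch API, sketching the mismatch check described above using plain version strings:

```python
# Illustrative heuristic: compare the CUDA version a PyTorch wheel was built
# against (torch.version.cuda; None for CPU-only builds) with the toolkit
# installed on the server. A major-version mismatch is a common cause of
# torch.cuda.is_available() returning False.
def cuda_build_matches(torch_cuda, system_cuda):
    if torch_cuda is None:          # CPU-only wheel: the GPU can never be used
        return False
    return torch_cuda.split(".")[0] == system_cuda.split(".")[0]

print(cuda_build_matches("12.1", "12.1"))  # True
print(cuda_build_matches(None, "12.1"))    # False (CPU-only build)
print(cuda_build_matches("11.8", "12.1"))  # False (major version mismatch)
```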

8. Install Whisper

Before installing, you need to use conda to create a virtual environment.

conda create -n whisper python=3.10

Activate the virtual environment.

conda activate whisper

Exit the virtual environment.

conda deactivate

Check out the virtual environment.

conda env list

Delete the virtual environment.

conda remove -n whisper --all

Activate the virtual environment first, and then enter the following command to install.

pip install -U openai-whisper

If there is no error, then we enter the following command, and when we see the information output, it means that the installation is successful.

whisper -h

9. Use of Whisper

The first run is relatively slow because the model must be downloaded. The larger the model, the slower the transcription and the higher its accuracy. Whisper's recognition accuracy is highest for Spanish, followed by Italian and then English, while Mandarin ranks around the middle.

Here is a brief description of the usage of Whisper.

whisper your-audio-or-video-file --model large --language Chinese

Run whisper -h to see more usage.
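If you drive the CLI from a script, the invocation above can be assembled as an argument list. A hypothetical helper for illustration; only the flags used in this article are included:

```python
import shlex

# Hypothetical helper assembling the whisper CLI call shown above.
def build_whisper_cmd(media, model="large", language="Chinese"):
    return ["whisper", media, "--model", model, "--language", language]

print(shlex.join(build_whisper_cmd("lecture.mp4")))
# whisper lecture.mp4 --model large --language Chinese
```

Passing the list to subprocess.run avoids shell-quoting issues with file names containing spaces.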


Origin blog.csdn.net/qq_43907505/article/details/130667674