ChatPDF implemented with ChatGPT: a simple, practical application that turns your document into a smart assistant, so you can quickly learn the content of the document through dialogue

Table of contents

First look at the effect

Realization principle

Environment installation

Application Scenario


First look at the effect

First, find a paper. I simply picked one in PDF format here.

I then have the model act as a smart assistant for a research paper. Of course, you can customize your own prompt.

Start asking questions.

As you can see, the results are very good.

Realization principle

  1. Extract the text from the PDF for subsequent processing.
  2. The OpenAI API limits the number of tokens per request, so split the PDF text into segments smaller than that limit.
  3. Use OpenAI's Embedding API to generate a vector for each segment and save the vectors to a database (Postgres).
  4. The user asks a question.
  5. Convert the user's question into a vector.
  6. Use cosine similarity to compare the question vector with the vectors in the database and find the text segments most similar to the question.
  7. Feed those text segments to ChatGPT and have it answer the user's question based on them.
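
The splitting and retrieval steps above can be sketched in plain Python. This is only a minimal illustration, not the code from this post: `toy_embed` is a hypothetical bag-of-words stand-in for OpenAI's Embedding API, the word-count limit is a rough proxy for a token limit, and an in-memory list takes the place of Postgres. All function names are made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List


def split_into_chunks(text: str, max_words: int = 50) -> List[str]:
    """Split text into chunks of at most max_words words (rough proxy for tokens)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for the Embedding API: a sparse word-count vector."""
    cleaned = (w.strip(".,?!").lower() for w in text.split())
    counts = Counter(w for w in cleaned if w)
    return {word: float(n) for word, n in counts.items()}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks whose vectors are most similar to the question vector."""
    q_vec = toy_embed(question)
    scored = [(cosine_similarity(q_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble the text that would be sent to the chat model (wording is illustrative)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

In a real setup, `toy_embed` would be replaced by a call to the Embedding API, and the vectors would be computed once and stored rather than recomputed for every question.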

I have put the code on Baidu Netdisk; you can download it yourself.

Link: https://pan.baidu.com/s/1Os_DR8lC9gBtc2ONNN5YJg?pwd=6666 
Extraction code: 6666 

Environment installation

Python 3.7 or later is required; I use 3.8.

pip install -r requirements.txt

If an SSL error occurs at runtime, you can downgrade urllib3:

pip install urllib3==1.25.11

The code to run is as follows.

You also need network access that can reach OpenAI (for example, through a proxy), because the tool ultimately calls the OpenAI API.

Before using it, we need to feed our corpus to OpenAI. This only has to be done once; if the corpus changes, it must be fed again.

After feeding, you can comment out the feeding step for subsequent runs.
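
As an alternative to commenting the feeding step in and out by hand, the "feed once, feed again when the corpus changes" rule can be automated. The sketch below is an assumption-laden illustration, not the code from this post: it stores vectors in a local JSON file rather than Postgres, and `load_or_feed`, `corpus_fingerprint`, and the `embed_fn` parameter are hypothetical names. `embed_fn` stands for whatever function calls the Embedding API.

```python
import hashlib
import json
import os
from typing import Callable, List


def corpus_fingerprint(chunks: List[str]) -> str:
    """Hash all chunks so that any change to the corpus changes the fingerprint."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk.encode("utf-8"))
        digest.update(b"\x00")  # separator so ["ab", "c"] differs from ["a", "bc"]
    return digest.hexdigest()


def load_or_feed(
    chunks: List[str],
    embed_fn: Callable[[str], List[float]],
    cache_path: str,
) -> List[List[float]]:
    """Return cached vectors if the corpus is unchanged; otherwise embed and cache."""
    fingerprint = corpus_fingerprint(chunks)
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as f:
            cached = json.load(f)
        if cached.get("fingerprint") == fingerprint:
            return cached["vectors"]  # corpus unchanged: skip feeding
    vectors = [embed_fn(chunk) for chunk in chunks]  # feed the corpus (again)
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump({"fingerprint": fingerprint, "vectors": vectors}, f)
    return vectors
```

With this guard, the embedding function is called only on the first run or after the document text changes.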

In addition, replace the key in the code with your own OpenAI API key before running.
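
If you would rather not edit the key into the source file, one common option is to read it from an environment variable. This is a small sketch under that assumption, not how the code from this post is written: `load_api_key` is a hypothetical helper, and `OPENAI_API_KEY` is the conventional variable name.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Read the API key from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable to your own API key.")
    return key
```

The returned string can then be passed to whichever place in the code currently holds the hard-coded key.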

Application Scenario

You can use this file-upload approach to work around OpenAI's token limit and turn your documents into an assistant that helps you learn. Of course, you can also explore for yourself other ways this idea could be used to start a business.


Origin blog.csdn.net/m0_55868614/article/details/129639067