Enhancing with LlamaParse
In the previous example, we asked our document a very basic question about the total amount of the budget. Let's instead ask about a more complicated, specific fact in the document:
# Load and index the document as in the previous example
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query(
    "How much exactly was allocated to a tax credit to promote investment in green technologies in the 2023 Canadian federal budget?"
)
print(response)
Unfortunately, we get an unhelpful answer:
The budget allocated funds to a new green investments tax credit, but the exact amount was not specified in the provided context information.
This is bad, because we happen to know the exact number is in the document! But the PDF is complicated, with tables and multi-column layouts, and the LLM missed the answer. Luckily, we can use LlamaParse to help us out.
First, you need a LlamaCloud API key. You can get one for free by signing up for LlamaCloud. Then put it in your .env file just like your OpenAI key:
LLAMA_CLOUD_API_KEY=llx-xxxxx
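If you are loading your keys with python-dotenv, as in the earlier examples (an assumption; any way of setting the environment variable works), the same call picks up the new key:

from dotenv import load_dotenv

# Reads LLAMA_CLOUD_API_KEY (and OPENAI_API_KEY) from the .env file into
# environment variables that LlamaParse and the LLM client can see.
load_dotenv()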
Now you're ready to use LlamaParse in your code. Let's bring it in as an import:
from llama_parse import LlamaParse
Then let's make a second attempt at parsing and querying the file (note that this uses documents2, index2, etc.) and see if we get a better answer:
# Use LlamaParse to convert the PDF to markdown before indexing
documents2 = LlamaParse(result_type="markdown").load_data(
    "./data/2023_canadian_budget.pdf"
)
index2 = VectorStoreIndex.from_documents(documents2)
query_engine2 = index2.as_query_engine()
response2 = query_engine2.query(
    "How much exactly was allocated to a tax credit to promote investment in green technologies in the 2023 Canadian federal budget?"
)
print(response2)
We got it!
$20 billion was allocated to a tax credit to promote investment in green technologies in the 2023 Canadian federal budget.
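If you're curious why this attempt succeeded, you can inspect what LlamaParse produced. It returns a list of Document objects whose text is a markdown rendering of the PDF, so tables and multi-column sections survive as readable text (a quick, illustrative check; the slice length is arbitrary):

# Peek at the first parsed document to see the markdown LlamaParse generated
print(documents2[0].text[:1000])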
As always, you can check out the repo to see what this code looks like.
As you can see, parsing quality makes a big difference to what the LLM can understand, even for relatively simple questions. Next, let's see how memory can help us answer more complex questions.