손 시리즈! Zilliz Cloud 및 AWS Bedrock을 사용하여 RAG 애플리케이션 구축

오픈소스 중국 커뮤니티 팀이 공유라는 이름으로 오픈소스 중국 커뮤니티의 뒷이야기를 전하는 첫 생방송을 진행했습니다."

RAG(검색 증강 생성)는 정보 검색과 자연어 처리(NLP) 기능을 결합하여 텍스트 생성을 향상시키는 AI 프레임워크입니다. 특히, RAG 시스템의 언어 모델은 최신 정보를 생성된 응답에 통합하는 검색 메커니즘을 통해 지식 기반 또는 외부 데이터베이스를 쿼리하고 검색하여 최종 출력을 더 정확하고 더 많은 컨텍스트를 포함합니다.

Zilliz Cloud( https://zilliz.com.cn/cloud)는 Milvus( https://milvus.io/) 벡터 데이터베이스를 기반으로 구축되었으며 대규모 벡터화된 데이터를 저장하고 처리하기 위한 솔루션을 제공합니다. 효율적인 관리 및 분석, 데이터 검색이 가능합니다. 개발자는 Zilliz Cloud의 벡터 데이터베이스 기능을 사용하여 대규모 임베딩 벡터를 저장 및 검색하여 RAG 애플리케이션의 검색 모듈 기능을 더욱 향상시킬 수 있습니다.

AWS Bedrock 클라우드 서비스( https://aws.amazon.com/cn/bedrock/)는 NLP 솔루션을 배포하고 확장하는 데 사용할 수 있는 사전 훈련된 다양한 기본 모델을 제공합니다 . 개발자는 AWS Bedrock을 통해 언어 생성, 이해 및 번역 모델을 AI 애플리케이션에 통합할 수 있습니다. 또한 AWS Bedrock은 텍스트에 대해 관련성이 높고 상황에 맞는 응답을 생성하여 RAG 애플리케이션의 기능을 더욱 향상시킬 수 있습니다.

01. Zilliz Cloud 및 AWS Bedrock을 사용하여 RAG 애플리케이션 구축

AWS Bedrock과 함께 Zilliz Cloud를 사용하여 RAG 애플리케이션을 구축하는 방법을 시연 해 보겠습니다 . 기본 프로세스는 그림 1에 나와 있습니다.

그림 1. Zilliz Cloud 및 AWS Bedrock을 사용하여 RAG 애플리케이션을 구축하는 기본 프로세스

#download the packages then import them
! pip install --upgrade --quiet  langchain langchain-core langchain-text-splitters langchain-community langchain-aws bs4 boto3

# For example
import bs4
import boto3

AWS Bedrock 및 Zilliz Cloud에 연결

다음으로 AWS 및 Zilliz Cloud 서비스에 연결하는 데 필요한 환경 변수를 설정합니다. AWS Bedrock 및 Zilliz Cloud 서비스에 연결하려면 AWS 서비스 지역, 액세스 키, Zilliz Cloud의 엔드포인트 URI 및 API 키를 제공해야 합니다.

# Set the AWS region and access key environment variables
REGION_NAME = "us-east-1"
AWS_ACCESS_KEY_ID = os.getenv("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.getenv("AWS_SECRET_ACCESS_KEY")

# Set ZILLIZ cloud environment variables
ZILLIZ_CLOUD_URI = os.getenv("ZILLIZ_CLOUD_URI")
ZILLIZ_CLOUD_API_KEY = os.getenv("ZILLIZ_CLOUD_API_KEY")

위에 제공된 액세스 자격 증명을 사용하여 AWS Bedrock Runtime 서비스 에 연결하고 AWS Bedrock 언어 모델을 통합하기 위한 boto3 클라이언트(https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) 를 생성했습니다. . 다음으로 ChatBedrock 인스턴스( https://python.langchain.com/v0.1/docs/integrations/chat/bedrock/)를 초기화하고 클라이언트에 연결한 후 사용할 언어 모델을 지정합니다. 이 튜토리얼에서 사용하는 anthropic.claude-3-sonnet-20240229-v1:0 모델입니다 . 이 단계는 텍스트 응답을 생성하기 위한 인프라를 설정하는 데 도움이 되며, 생성된 응답의 다양성을 제어하기 위해 모델의 온도 매개변수도 구성합니다. BedrockEmbeddings 인스턴스는 텍스트와 같은 구조화되지 않은 데이터를 변환하는 데 사용할 수 있습니다( https://zilliz.com.cn/glossary/%E9%9D%9E%E7%BB%93%E6%9E%84%E5%8C%96% E6 %95%B0%E6%8D%AE)를 벡터로 변환합니다.

# Create a boto3 client with the specified credentials
client = boto3.client(
    "bedrock-runtime",
    region_name=REGION_NAME,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)

# Initialize the ChatBedrock instance for language model operations
llm = ChatBedrock(
    client=client,
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    region_name=REGION_NAME,
    model_kwargs={"temperature": 0.1},
)

# Initialize the BedrockEmbeddings instance for handling text embeddings
embeddings = BedrockEmbeddings(client=client, region_name=REGION_NAME)

정보 수집 및 처리

임베딩 모델이 성공적으로 초기화된 후 다음 단계는 외부 소스에서 데이터를 로드하는 것입니다. WebBaseLoader 인스턴스( https://python.langchain.com/v0.1/docs/integrations/document_loaders/web_base/)를 생성하여 지정된 웹 소스의 콘텐츠를 크롤링합니다.

이 튜토리얼에서는 AI 에이전트 관련 기사의 콘텐츠를 로드합니다. 로더는 BeautifulSoup(https://www.crummy.com/software/BeautifulSoup/bs4/doc/)의 SoupStrainer를 사용하여 웹 페이지의 특정 부분(예: "포스트 콘텐츠", "포스트 제목" 및 ")을 구문 분석합니다. 관련 콘텐츠만 검색되도록 하려면 "-header" 섹션을 게시하세요. 그런 다음 로더는 지정된 네트워크 소스에서 문서를 검색하여 후속 처리를 위한 관련 콘텐츠 목록을 제공합니다. 다음으로 RecursiveCharacterTextSplitter 인스턴스( https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/recursive_text_splitter/)를 사용하여 검색된 문서를 더 작은 텍스트 덩어리로 분할합니다. 이렇게 하면 콘텐츠를 보다 쉽게 관리할 수 있으며 이러한 텍스트 블록을 텍스트 포함 또는 언어 생성 모듈과 같은 다른 구성 요소에 전달할 수도 있습니다.

# Create a WebBaseLoader instance to load documents from web sources
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
# Load documents from web sources using the loader
documents = loader.load()
# Initialize a RecursiveCharacterTextSplitter for splitting text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)


# Split the documents into chunks using the text_splitter
docs = text_splitter.split_documents(documents)

응답 생성

프롬프트 템플릿은 각 응답의 구조를 미리 정의하여 AI가 가능한 경우 통계와 숫자를 사용하고 관련 지식이 부족할 때 답변을 작성하지 않도록 안내할 수 있습니다.

# Define the prompt template for generating AI responses
PROMPT_TEMPLATE = """
Human: You are a financial advisor AI system, and provides answers to questions by using fact based and statistical information when possible.
Use the following pieces of information to provide a concise answer to the question enclosed in <question> tags.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
<context>
{context}
</context>

<question>
{question}
</question>

The response should be specific and use statistics or numbers when possible.

Assistant:"""
# Create a PromptTemplate instance with the defined template and input variables
prompt = PromptTemplate(
    template=PROMPT_TEMPLATE, input_variables=["context", "question"]
)

Zilliz 벡터 스토어를 초기화하고 Zilliz Cloud 플랫폼에 연결하세요. 벡터 저장소는 문서를 빠르고 효율적으로 검색할 수 있도록 문서를 벡터로 변환하는 역할을 합니다. 검색된 문서는 일관된 텍스트로 형식화되고 정리되며, AI는 관련 정보를 응답에 통합하여 궁극적으로 매우 정확하고 관련성이 높은 답변을 제공합니다.

# Initialize Zilliz vector store from the loaded documents and embeddings
vectorstore = Zilliz.from_documents(
    documents=docs,
    embedding=embeddings,
    connection_args={
        "uri": ZILLIZ_CLOUD_URI,
        "token": ZILLIZ_CLOUD_API_KEY,
        "secure": True,
    },
    auto_id=True,
    drop_old=True,
)

# Create a retriever for document retrieval and generation
retriever = vectorstore.as_retriever()

# Define a function to format the retrieved documents
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

마지막으로 AI 응답 생성을 위한 완전한 RAG 링크를 생성합니다. 이 링크는 먼저 벡터 저장소에서 사용자 쿼리와 관련된 문서를 검색하고 검색하고 형식을 지정한 다음 프롬프트 템플릿( https://python.langchain.com/v0.1/docs/modules/model_io/ 프롬프트) 에 전달합니다. /) 응답 구조를 생성합니다. 그런 다음 이 구조화된 입력은 언어 모델로 전달되어 일관된 응답을 생성합니다. 이 응답은 궁극적으로 문자열 형식으로 구문 분석되어 사용자에게 제공되어 정확하고 상황에 맞는 답변을 제공합니다.

# Define the RAG (Retrieval-Augmented Generation) chain for AI response generation
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# rag_chain.get_graph().print_ascii()

# Invoke the RAG chain with a specific question and retrieve the response
res = rag_chain.invoke("What is self-reflection of an AI Agent?")
print(res)

다음은 응답 결과의 예입니다.

Self-reflection is a vital capability that allows autonomous AI agents to improve iteratively by analyzing and refining their past actions, decisions, and mistakes. Some key aspects of self-reflection for AI agents include:
1. Evaluating the efficiency and effectiveness of past reasoning trajectories and action sequences to identify potential issues like inefficient planning or hallucinations (generating consecutive identical actions without progress).
2. Synthesizing observations and memories from past experiences into higher-level inferences or summaries to guide future behavior.

02. Zilliz Cloud와 AWS Bedrock 사용의 장점

표 1에 표시된 것처럼 Zilliz Cloud는 AWS Bedrock과 원활하게 통합되어 RAG 애플리케이션의 효율성, 확장성 및 정확성을 향상시킬 수 있습니다. 개발자는 이 두 서비스를 사용하여 대규모 데이터 세트를 처리하고 RAG 애플리케이션 프로세스를 단순화하며 RAG 생성 응답의 정확성을 향상시키는 포괄적인 솔루션을 개발할 수 있습니다.

표 1. Zilliz Cloud 및 AWS Bedrock 사용의 이점

03. 요약

이 기사에서는 주로 Zilliz Cloud와 AWS Bedrock을 사용하여 RAG 애플리케이션을 구축하는 방법을 소개합니다.

Milvus를 기반으로 구축된 벡터 데이터베이스인 Zilliz Cloud는 임베딩 벡터를 위한 확장 가능한 저장 및 검색 솔루션을 제공하는 반면, AWS Bedrock은 언어 생성을 위한 강력한 사전 훈련 모델을 제공합니다. 샘플 코드를 통해 Zilliz Cloud 및 AWS Bedrock에 연결하고, 외부 소스에서 데이터를 로드하고, 데이터를 처리 및 분할하고, 최종적으로 완전한 RAG 링크를 구축하는 방법을 보여줍니다. 이 기사에 구축된 RAG 애플리케이션은 LLM이 환각을 일으키고 부정확한 응답을 제공할 확률을 최소화하여 최신 NLP 모델과 벡터 데이터베이스 간의 시너지 효과를 최대한 발휘할 수 있습니다. 우리는 이 튜토리얼이 다른 사람들이 RAG 애플리케이션을 구축할 때 유사한 기술을 사용하도록 영감을 주기를 바랍니다.