Hello Langchain (39)

You only need to know Python to work with LangChain.

It is a framework that works with existing LLMs and takes a somewhat different approach to using them.

😀 Hello Langchain

It comes preinstalled on Colab.

import langchain
langchain.__version__

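LangChain is a library that provides tools for building applications on top of natural language processing (NLP) and language models.

It supports a range of features, including text generation, search, and building conversational interfaces.

The official API reference is at the link below.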

https://python.langchain.com/api_reference/


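langchain_community module
- An extension module for the LangChain library
- Contains a variety of extensions developed by the LangChain community
- Bundles components and utilities that are commonly used in the community
- Includes tools from open-source contributors that help you get more out of LangChain
- Not preinstalled on Colab (as of December 2024)

Official API reference: https://python.langchain.com/api_reference/community/index.html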


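langchain_openai module
- An extension module that makes it easy to use OpenAI's language models from LangChain
- Simplified integration with the OpenAI API: you can call OpenAI models and receive responses with little code, for tasks such as text generation, summarization, translation, and question answering
- Chain and tool integration: LangChain automates complex NLP tasks by combining "chains" and "tools". The langchain-openai module makes it easy to connect OpenAI models to other LangChain components, so you can build a variety of language-processing workflows.

- ★ To use it, the OPENAI_API_KEY environment variable must be set (unless the key is passed in explicitly)

Official API reference: https://python.langchain.com/api_reference/openai/index.html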


😀 Installing LangChain

!pip install -U langchain langchain_community langchain_openai

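😀 Setting environment variables in Colab

The OPENAI_API_KEY environment variable must be set.
However, it is not a good idea to expose a key like this in a Colab notebook,
especially one tied to a payment method or other sensitive information.

import os
from dotenv import load_dotenv  # uses the dotenv package

So I keep the API key in a separate file and load it from there into an environment variable.

If the key ever leaks, delete it right away!

ENV_BASE_PATH = r'/content/drive/MyDrive/#LLM 챗봇서비스/ENV/OPENAI_API'

# os.environ['OPENAI_API_KEY']  # raises KeyError at this point

That is because the environment variable does not exist yet.

To set it, run the code below.

load_dotenv(dotenv_path=os.path.join(ENV_BASE_PATH, 'openai_api_key.txt'))

If this returns True, the file was found and the variable was loaded.

os.environ['OPENAI_API_KEY']  # the environment variable is now registered
None

Reading the environment variable again

now returns my API key.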
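As a small sketch of the check described above, the helper below reads a key from the environment without raising KeyError and without echoing the full secret. The helper name `masked_env` and the demo variable `DEMO_API_KEY` are hypothetical and exist only for illustration; nothing beyond the standard library is assumed.

```python
import os


def masked_env(name: str, visible: int = 4) -> str:
    """Return a masked form of an environment variable, or a notice if unset."""
    value = os.environ.get(name)  # .get() returns None instead of raising KeyError
    if not value:
        return f"{name} is not set"
    # Show only the first few characters so the secret never appears in full.
    return value[:visible] + "*" * max(len(value) - visible, 0)


# Demo with a throwaway variable (hypothetical name, not a real key).
os.environ["DEMO_API_KEY"] = "sk-demo-1234567890"
print(masked_env("DEMO_API_KEY"))
print(masked_env("SOME_UNSET_VARIABLE_FOR_DEMO"))
```

Checking the key this way confirms it was loaded while keeping it out of the notebook output.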

😀 LLM & Chat Model

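from langchain_openai import OpenAI, ChatOpenAI  # from the langchain_openai package installed above

llm = OpenAI()  # raises an error if the OPENAI_API_KEY environment variable is not set

chat = ChatOpenAI()  # raises an error if the OPENAI_API_KEY environment variable is not set

result = llm.predict("How many planets are there?")  # returns a str
print('Answer:', result)

<ipython-input-21-24ad80461de5>:1: LangChainDeprecationWarning: The method `BaseLLM.predict` was deprecated in langchain-core 0.1.7 and will be removed in 1.0. Use :meth:`~invoke` instead.
  result = llm.predict("How many planets are there?")  # returns a str
Answer: 

As of 2021, there are eight recognized planets in our solar system: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.

As the warning says, predict is deprecated and invoke is the recommended replacement.

result = llm.predict("태양계에는 얼마나 많은 행성들이 있죠?")  # "How many planets are there in the solar system?" in Korean; returns a str
print('Answer:', result)

Answer: 

현재까지 발견된 행성은 약 4000개로 알려져 있습니다. 그러나 아직 탐지되지 않은 더 많은 행성들이 존재할 가능성도 있습니다.

(In English: "About 4,000 planets are known to have been discovered so far. However, there may be many more that have not been detected yet.")

The answers involve some randomness, so each run returns a slightly different answer.

Asking a more specific question can produce a more detailed answer.

Keep in mind that every call above consumes API usage, which you can check on the usage page.

# You can also pass the API key directly as a parameter, as shown below,
# but this is not recommended. Use the environment variable instead.
#
# llm = OpenAI(openai_api_key="sk-")
# chat = ChatOpenAI(openai_api_key="sk-")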

😀 Predict Messages

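A chat model can hold a conversation, not just answer a single question.

A conversation here means a sequence of messages, and the model can answer in a way that fits the context of what was said before.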

# https://python.langchain.com/docs/integrations/chat/openai/#instantiation
# Reference: https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html
"""
chat = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
        # Controls how varied the model's responses are.
        # This is a parameter of OpenAI's GPT models that adjusts
        # the creativity and probabilistic diversity (randomness) of the generated text.
        #
    max_tokens=None,  # maximum number of tokens the model returns in its result
    timeout=None,
    max_retries=2,
    # api_key="...",  # if you prefer to pass api key in directly instead of using env vars
    # base_url="...",
    # organization="...",
    # other params...
)
"""
None

The model parameter accepts a model name such as gpt-4o, and temperature adjusts the randomness of the output.

chat = ChatOpenAI(temperature=0.1)

Here temperature is set to 0.1, so the output varies only slightly between runs.
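To give a feel for what temperature does, here is a minimal sketch of temperature-scaled softmax in plain Python. It is a textbook illustration rather than OpenAI's actual sampling code, and the logits are made-up numbers.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
low = softmax_with_temperature(logits, 0.1)
high = softmax_with_temperature(logits, 1.5)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

With a low temperature almost all of the probability goes to the top-scoring token, so answers vary little. With a higher temperature the probabilities spread out and the output becomes more varied.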

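# Message
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

Next, let's work with messages.

There are three message types, listed below.

# HumanMessage : a message sent from a person to the AI
# SystemMessage : a message that provides settings (instructions) to the LLM
# AIMessage : a message sent by the AI

The system message works like a prompt that frames how the question should be answered.

In other words, it is where you assign the model a role.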

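messages = [
    SystemMessage(
        content = "You are a geography expert. And you only reply in Korean",
    ),
    AIMessage(
        content = "안녕, 내 이름은 둘리 야",  # "Hi, my name is Dooly"
    ),
    HumanMessage(
        content = """What is the distance between Mexico and Thailand.
          Also, what is your name?""",
    )
]

The system message tells the model to reply only in Korean.

The AI message is something the AI has already said (it introduces itself as Dooly),

and the human message asks two questions.

The system and AI messages are passed to the model as context along with the human message.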

result = chat.predict_messages(messages)

print(type(result))

result


<ipython-input-26-e7c4d4ee60d4>:1: LangChainDeprecationWarning: The method `BaseChatModel.predict_messages` was deprecated in langchain-core 0.1.7 and will be removed in 1.0. Use :meth:`~invoke` instead.
  result = chat.predict_messages(messages)
<class 'langchain_core.messages.ai.AIMessage'>
AIMessage(content='멕시코와 태국 사이의 거리는 약 16,000km입니다. 제 이름은 둘리입니다.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 36, 'prompt_tokens': 58, 'total_tokens': 94, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-941c7698-73ec-4f47-8113-cb776a5e7c41-0')

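The result is an AIMessage whose content says, in Korean, that the distance between Mexico and Thailand is about 16,000 km and that its name is Dooly. The model answered both questions, took its name from the earlier AI message, and followed the system message by replying in Korean.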
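The response_metadata printed above also records how many tokens the call used. A minimal sketch of reading it follows, using a plain dict trimmed from that output rather than a live response. The helper name `summarize_usage` is hypothetical.

```python
# Trimmed copy of the response_metadata printed above (not a live API response).
response_metadata = {
    "token_usage": {
        "completion_tokens": 36,
        "prompt_tokens": 58,
        "total_tokens": 94,
    },
    "model_name": "gpt-3.5-turbo",
    "finish_reason": "stop",
}


def summarize_usage(metadata: dict) -> str:
    """Build a one-line summary of token usage from a metadata dict."""
    usage = metadata.get("token_usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", prompt + completion)
    model = metadata.get("model_name", "unknown")
    return f"{model}: {prompt} prompt + {completion} completion = {total} tokens"


print(summarize_usage(response_metadata))
```

On a real result the same dict should be available as result.response_metadata, as the repr above suggests.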

😀 Prompt Templates

The messages above are also called a prompt.

In an LLM (large language model), a prompt is the text or data given to the model as input.

It instructs the model to perform a task or provides context for the text the model will generate. In short, it is how you communicate with the LLM.

The prompt plays an important role in determining the model's output.

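The role of a prompt

1. Instructing the model: the prompt tells the model what it is supposed to do.

For example, a prompt is needed when the user asks the model a question or requests text in a particular style.

2. Providing context: the prompt supplies the background information or context the model needs to produce an appropriate response.

For example, when asking about a topic, providing relevant background lets the model give a more accurate answer.

3. Steering the output: the prompt guides the model's output and strongly influences the style, content, and format of the generated text.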

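For example:

1. Question answering:
Prompt: "What is the capital of France?"
Output: "The capital of France is Paris."

2. Creative writing:
Prompt: "Write a short story about a dragon and a knight."
Output: the model creatively generates a story about a dragon and a knight.

3. Translation:
Prompt: "Translate the following sentence to Spanish: 'Hello, how are you?'"
Output: "Hola, ¿cómo estás?"

Types of prompts

Prompts come in three broad kinds: simple questions, instructions, and structured input.

1. Simple question: the user simply asks about something they are curious about.

2. Instruction: the prompt directs the model to perform a specific task.

3. Structured input: input with a specific format or structure (for example text summarization, translation, or code writing).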

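Why prompt design matters

Prompt design is critical to getting accurate results. If a prompt is ambiguous or incomplete, the model will struggle to produce the output you want. It is therefore important to experiment with different prompts, observe how the model responds, and find the one that fits best.

Some tips:

Clear and specific instructions: state exactly what you want. For example, "Explain quantum mechanics in simple terms for a high school student" is better than "Explain quantum mechanics".

Appropriate context: providing the necessary background information or context lets the model generate a more accurate answer.

Experimentation: change the prompt a little at a time and test it to steer the model toward the best response.

Conclusion

A prompt is the key input that instructs a large language model and determines the direction of the task it performs.

Designing prompts well goes a long way toward using an LLM effectively.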

😀 Customizing messages

Now let's customize messages using LangChain.

from langchain_core.prompts import PromptTemplate, ChatPromptTemplate

PromptTemplate creates a template from a string,

and ChatPromptTemplate creates a template from messages.

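template = PromptTemplate.from_template(
    # uses {...} placeholders
    "What is the distance between {country_a} and {country_b}"
)

Checking type(template) shows

that it is a LangChain PromptTemplate class. LangChain provides many such tools and templates to work with.

Before the template can be used, values must be supplied for its keys:

# pass values for the keys
prompt = template.format(country_a="Mexico", country_b="Korea")
prompt  # str

The values are passed by key because PromptTemplate fills them in dynamically,

which makes prompt generation flexible.

For example, you can conveniently produce the same question again and again with only the country names changed.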
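The idea of filling placeholders can be sketched with plain Python string formatting. This is only an analogy for what PromptTemplate.format does with {country_a} and {country_b}, not LangChain's implementation, and the country pairs are arbitrary examples.

```python
TEMPLATE = "What is the distance between {country_a} and {country_b}"

# Arbitrary example pairs; only the placeholder values change between prompts.
pairs = [("Mexico", "Korea"), ("France", "Iran")]

prompts = [TEMPLATE.format(country_a=a, country_b=b) for a, b in pairs]
for p in prompts:
    print(p)
```

One template yields as many prompts as there are value sets, which is the reuse the paragraph above describes.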

chat.predict(prompt)

Result ↓
<ipython-input-41-469b1dacb882>:1: LangChainDeprecationWarning: The method `BaseChatModel.predict` was deprecated in langchain-core 0.1.7 and will be removed in 1.0. Use :meth:`~invoke` instead.
  chat.predict(prompt)
The distance between Mexico and South Korea is approximately 8,900 miles (14,300 kilometers) 
when measured in a straight line.

The model generated its answer from the content of prompt (What is the distance between ...),

and that answer is the output shown above.

😀 Defining a template with LangChain's ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    # Each message is passed as a (role, text) tuple

    # SystemMessage tuple
    ("system", "You are a geography expert. And you only reply in {language}"),

    # AIMessage tuple
    ("ai", "안녕 내 이름은 {name}이야."),  # "Hi, my name is {name}."

    # HumanMessage tuple
    ("human", """What is the distance between {country_a} and {country_b}.
    Also, what is your name?""")
])

template

Evaluating template displays the template object:

ChatPromptTemplate(input_variables=['country_a', 'country_b', 'language', 'name'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['language'], input_types={}, partial_variables={}, template='You are a geography expert. And you only reply in {language}'), additional_kwargs={}), AIMessagePromptTemplate(prompt=PromptTemplate(input_variables=['name'], input_types={}, partial_variables={}, template='안녕 내 이름은 {name}이야.'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['country_a', 'country_b'], input_types={}, partial_variables={}, template='What is the distance between {country_a} and {country_b}.\n Also, what is your name?'), additional_kwargs={})])

Now let's fill in the prompt with specific values:

prompt = template.format_messages(
    language="Korean",
    name="뽀로로",
    country_a = "France",
    country_b = "Iran"
)

# this is a list of messages, not a str!
prompt


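Result ↓
[SystemMessage(content='You are a geography expert. And you only reply in Korean', additional_kwargs={}, response_metadata={}),
 AIMessage(content='안녕 내 이름은 뽀로로이야.', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='What is the distance between France and Iran.\n    Also, what is your name?', additional_kwargs={}, response_metadata={})]

The result is a list of three message objects.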
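Conceptually, format_messages fills each (role, template) tuple and returns one message per tuple. A rough plain-Python sketch of that behaviour follows. The function `fill_messages` is hypothetical and returns simple (role, text) tuples rather than LangChain message objects.

```python
def fill_messages(message_templates, **values):
    """Fill every (role, template) pair with the given placeholder values."""
    return [(role, text.format(**values)) for role, text in message_templates]


message_templates = [
    ("system", "You are a geography expert. And you only reply in {language}"),
    ("ai", "안녕 내 이름은 {name}이야."),
    ("human", "What is the distance between {country_a} and {country_b}?"),
]

filled = fill_messages(
    message_templates,
    language="Korean",
    name="뽀로로",
    country_a="France",
    country_b="Iran",
)
for role, text in filled:
    print(role, "->", text)
```

Each template only consumes the placeholders it contains, which is why a single set of keyword arguments can serve all three messages.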

Now let's pass this prompt to the model:

chat.predict_messages(prompt)

Result ↓
AIMessage(content='프랑스와 이란 사이의 거리는 대략 4,000km입니다. 제 이름은 뽀로로입니다.', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 37, 'prompt_tokens': 59, 'total_tokens': 96, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-004c2d36-9379-4300-b84d-f0238d9961fd-0')

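Seeing a chatbot take shape like this makes the idea much more tangible.

The model itself is complex, yet this example shows first-hand how much of the work LangChain takes on and how simple it makes the implementation,

so it was a very instructive exercise.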

😀 OutputParser & LCEL

Output Parser

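An output parser is a utility that processes the output generated by an LLM (large language model) and converts it into the format you want. It lets you turn the model's text into structured data or extract data according to specific rules.

1. Structuring output

- Parses the model's text response into structured data that a program can use, such as JSON, a dictionary, or a list.

2. Validating output

- If the model returns unexpected output, the parser can raise an appropriate error message or return a default value.

3. Standardizing output

- Ensures that the language model's output is always delivered in a consistent format.

In this section we build an OutputParser that converts the LLM's output (answer) into a list.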
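The three roles above (structuring, validation, standardization) can be sketched in plain Python before involving LangChain. The function `parse_list` and its fallback behaviour are assumptions made for illustration and are not part of LangChain.

```python
import re


def parse_list(text: str, default=None) -> list:
    """Parse a comma-separated or numbered/bulleted list into a list of strings."""
    default = [] if default is None else default
    if not text or not text.strip():
        return default  # validation: empty output falls back to the default
    # Structuring: split on commas or line breaks, whichever the model used.
    parts = re.split(r"[,\n]", text)
    items = []
    for part in parts:
        # Standardizing: drop leading markers such as "1." or "-" and extra spaces.
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", part).strip()
        if cleaned:
            items.append(cleaned)
    return items or default


print(parse_list("red, blue,  green"))
print(parse_list("1. Mercury\n2. Venus\n3. Earth"))
print(parse_list("   ", default=["unknown"]))
```

A tolerant parser like this is a design choice: it accepts more than one output shape, at the cost of splitting items that legitimately contain commas.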

from langchain_core.output_parsers import BaseOutputParser

We create our own OutputParser by subclassing BaseOutputParser.

class CommaOutputParser(BaseOutputParser):

  # The parse() method must be implemented.
  #    text = the input text
  def parse(self, text):
    items = text.strip().split(",")
    return list(map(str.strip, items))

Note that parse() is the one method a subclass must implement.

Its text argument is the input text.

Let's instantiate the class as p.

p = CommaOutputParser()

p.parse("Hello,    how,   are,    you")
result -> ['Hello', 'how', 'are', 'you']

parse splits the text on commas and strips the whitespace around each item.

The point is to turn a comma-separated string returned by the model into structured data (a list).

template = ChatPromptTemplate.from_messages([
    ("system", """You are a list generating machine.
    Everything you are asked will be answered with a list of max {max_items}.
    """),
    ("human", "{question}")
])

The code above defines the template. The system message tells the model what situation it is in and assigns it a role.

prompt = template.format_messages(
    max_items = 10,
    question = "What are the planets?",

)
prompt

Formatting the template fills in the system and human messages.

result = chat.predict_messages(prompt)
result

# The answer contains line breaks, and because the request was not specific enough it does not come back in the format we want.

Looking at the result:
AIMessage(content='1. Mercury\n2. Venus\n3. Earth\n4. Mars\n5. Jupiter\n6. Saturn\n7. Uranus\n8. Neptune\n9. Pluto (dwarf planet)\n10. Eris (dwarf planet)', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 51, 'prompt_tokens': 40, 'total_tokens': 91, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-dca44db8-57bc-4b96-a1c4-16a434067cdb-0')

Because the system prompt asked only for a 'list', the model answered with a numbered list as shown above.

If we want the items separated by commas, we have to ask more explicitly.

template = ChatPromptTemplate.from_messages([
    ("system", """You are a list generating machine.
    Everything you are asked will be answered with a comma separated list of max {max_items}.
    Do NOT reply with anything else.
    """),
    ("human", "{question}")
])

prompt = template.format_messages(
    max_items = 10,
    question = "What are the planets?",

)

result = chat.predict_messages(prompt)
result

The result is

AIMessage(content='Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune',

This time only the planet names come back, cleanly separated by commas.

Let's try the same template with another question.

template = ChatPromptTemplate.from_messages([
    ("system", """You are a list generating machine.
    Everything you are asked will be answered with a comma separated list of max {max_items}.
    Do NOT reply with anything else.
    """),
    ("human", "{question}")
])

prompt = template.format_messages(
    max_items = 10,
    question = "What are the colors?",

)

result = chat.predict_messages(prompt)
result

#AIMessage(content='red, blue, green, yellow, orange, purple, pink, black, white, brown',

This time the question asks for colors.

The model lists exactly 10 colors, from red to brown.

template = ChatPromptTemplate.from_messages([
    ("system", """You are a list generating machine.
    Everything you are asked will be answered
    with a comma separated list of max {max_items} in lowercase.
    Do NOT reply with anything else.
    """),
    ("human", "{question}")
])

prompt = template.format_messages(
    max_items = 10,
    question = "What are the colors?",
)

result = chat.predict_messages(prompt)
result

This time I added "in lowercase" to the prompt to check that the model answers only in lowercase.

# It answers in lowercase only!
# AIMessage(content='red, blue, green, yellow, orange, purple, pink, black, white, brown'
# The answer is in content.

Every item in the answer came back in lowercase.

Since the answer lives in the content attribute, we can pull out just that part with

result.content

which, as expected, gives

red, blue, green, yellow, orange, purple, pink, black, white, brown

as the output.

Next, let's receive the answer as a list.

template = ChatPromptTemplate.from_messages([
    ("system", """You are a list generating machine.
    Everything you are asked will be answered
    with a comma separated list of max {max_items} in lowercase.
    Do NOT reply with anything else.
    """),
    ("human", "{question}")
])

prompt = template.format_messages(
    max_items = 10,
    question = "What are the colors?",
)

result = chat.predict_messages(prompt)


p = CommaOutputParser()
p.parse(result.content)
# The AI's answer is now in the form we wanted: a list.
# ['red',
#  'blue',
#  'green',
#  'yellow',
#  'orange',
#  'purple',
#  'pink',
#  'black',
#  'white',
#  'brown']

With p.parse(result.content), the answer is now stored in a list.

So far...

template -> format -> predict -> parse: that was quite a few steps, wasn't it?

With LCEL, most of these steps can be left out. That is exactly what a chain is for.

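😀 What are Chain and LCEL?

LCEL (LangChain Expression Language)

- LCEL lets you compose LangChain components into expressions, which makes interaction with the model more powerful and flexible.

It greatly reduces the amount of code and lets you call a variety of templates and LLMs.

It also lets you use different responses together.

A 'chain' combines all of these elements, and the combined elements run as a single chain.

Creating a chain: use the '|' operator.

chain = template | chat | CommaOutputParser()

Checking the chain's type with

type(chain)  # a RunnableSequence object

shows that it is a RunnableSequence object.

Running the chain

# run the chain with .invoke({...})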

chain.invoke({
    "max_items": 5,
    "question": "What are the pokemons?"
})

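Inside the chain, the equivalent of the following calls happens in order:

.format_messages() on the template

.predict() on the chat model

.parse() on the output parser

A single chain.invoke() call takes care of this whole sequence.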
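How a '|' expression can turn into one invoke call is easier to see with a toy version. The sketch below is a stand-in written for illustration, not LangChain's RunnableSequence. The `Step` class and the fake model are hypothetical, and no API call is made.

```python
class Step:
    """A tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new Step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for the template, the model and the parser.
template_step = Step(lambda d: f"List max {d['max_items']} items: {d['question']}")
fake_model = Step(lambda prompt: "pikachu, bulbasaur, charmander")
parser_step = Step(lambda text: [item.strip() for item in text.split(",")])

toy_chain = template_step | fake_model | parser_step
print(toy_chain.invoke({"max_items": 5, "question": "What are the pokemons?"}))
```

Because every step exposes the same invoke method, the combined object can itself be piped into further steps.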

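Chains can also be combined with other chains,

as the section on connecting chains shows.

Chaining

[Official docs]
https://python.langchain.com/docs/concepts/lcel/
https://python.langchain.com/docs/how_to/#langchain-expression-language-lcel
https://python.langchain.com/docs/how_to/lcel_cheatsheet/

The docs show that LCEL keeps the API as consistent as possible across components.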

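LCEL input/output

LangChain Expression Language (LCEL) is a language for combining LLMs and tools and controlling the flow of data between them using a variety of input types. The input types used when passing data to an LLM are described below.

1. Plain Text

Description: Simple text input. This is the most basic form of input and lets the LLM process free-form natural language.

What is the capital of France?

Characteristics:

Suited to text analysis, generation, and conversational tasks.

Passed as plain text with no additional structure or metadata.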

2. Structured Input

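Description: Input in JSON, dictionary, or another structured format. The data fields are defined explicitly, and the model processes the data according to that structure.

{
    "question": "What is the capital of France?",
    "context": "France is a country in Europe."
}

Characteristics:

Explicit data fields let the LLM extract and use the information it needs more accurately.

Useful for tasks that require complex data analysis or handling of multiple fields.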

3. Prompt Templates

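Description: Uses a user-defined prompt template as input. Variable values are filled into the template before it is passed to the model.

template = "Translate the following text to French: {text}"
input = template.format(text="Hello, how are you?")

Characteristics:

Variable-based input makes the template highly reusable.

Suited to generating and controlling user-defined input.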

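4. Key-Value Pairs

Description: Input in key-value pair form, which provides information as an explicit query.

{
    "name": "John",
    "age": 30,
    "location": "New York"
}

Characteristics:

Provides well-structured data so the LLM can analyze and process it more efficiently.

Useful when specific information fields are clearly required.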

5. Multi-modal Inputs

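Description: An input format that combines several data types, such as text, images, and audio.

{
    "text": "Describe the image.",
    "image": "<image_data>"
}

Characteristics:
Can handle a variety of input formats when integrated with a multimodal model.
Used for tasks such as image captioning and audio-to-text conversion.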

6. Serialized Inputs

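Description: Input data that has been serialized into a specific format. For example, the data is passed as a JSON string.

input = '{"question": "What is the capital of France?", "context": "France is in Europe."}'

Characteristics:

Useful when the data is exchanged with an external system or API.

Highly flexible with respect to data format.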
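A short sketch of the serialization round trip described above, using only the standard json module. The payload reuses the example values and is not tied to any particular LangChain API.

```python
import json

payload = {
    "question": "What is the capital of France?",
    "context": "France is in Europe.",
}

serialized = json.dumps(payload)   # dict -> JSON string, ready to send elsewhere
restored = json.loads(serialized)  # JSON string -> dict on the receiving side

print(type(serialized).__name__, type(restored).__name__)
print(restored == payload)
```

The string form is what travels between systems, and the dict form is what a program works with on either side.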

7. Chat Messages

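Description: Input in chat message form, where the user defines a role and content for each message and passes the information to the LLM as a conversation.

[
    {"role": "system", "content": "You are an assistant."},
    {"role": "user", "content": "What is the weather today?"}
]

Characteristics:

Suited to conversational models such as ChatGPT.

Can maintain the context of a conversation and handle multi-turn input.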

8. Custom Input Types

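Description: A custom input format that the user defines to match the application's requirements.

class CustomInput:
    def __init__(self, field1, field2):
        self.field1 = field1
        self.field2 = field2

Characteristics:

Data is handled in a form that fits the specific application logic exactly.

Useful for complex structures that are hard to express with the standard input types.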
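One way such a custom type could feed a prompt is sketched below with a dataclass. The field types, the `to_prompt_vars` method, and the prompt text are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class CustomInput:
    field1: str  # assumed here to hold the question
    field2: int  # assumed here to hold the maximum number of items

    def to_prompt_vars(self) -> dict:
        """Convert to a plain dict, e.g. for filling a prompt template."""
        return asdict(self)


item = CustomInput(field1="What are the planets?", field2=10)
prompt_text = "Answer with at most {field2} items: {field1}".format(**item.to_prompt_vars())
print(prompt_text)
```

Converting the object to a dict at the boundary keeps the application's own type while still producing the key-value input a template expects.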

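In summary

LCEL input types range from plain text to structured data and multimodal input, and each type is designed for a particular purpose. Designing the input data carefully and choosing the right format helps you get the best performance from the model.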

😀 Connecting chains

# first chain

chef_prompt = ChatPromptTemplate.from_messages([
    ("system","""
      You are a world-class international chef.
      You create easy to follow recipes for any type of cuisines
      with easy to find ingredients.
    """),
    # note the comma after "human"; without it the two strings are concatenated into one
    ("human", """
      I want to cook {cuisine} food.
    """),
])


chef_chain = chef_prompt | chat

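Next, we take the recipe from this chef and adapt it so that it uses only vegetarian ingredients.

There are two jobs:

1. A chef who provides the recipe

2. A chef who adapts it for vegetarians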

# second chain
veg_chef_prompt = ChatPromptTemplate.from_messages([
    ("system","""
      You are a vegetarian chef specialized in
      making traditional recipes vegetarian.
      You find alternative ingredients and explain their preparation.
      You don't radically modify the recipe.
      If there is no alternative for a food just say
      you don't know how to replace it.
    """),
    ("human","""
      {recipe}
    """),
])

veg_chain = veg_chef_prompt | chat

Note the '|' operator that connects the prompt to the model.

# a chain that connects the two chains above

# final_chain = chef_chain | veg_chain
# (not enough on its own: veg_chef_prompt expects its input under the key "recipe")

final_chain = {"recipe": chef_chain} | veg_chain

result = final_chain.invoke({
    "cuisine": "indian",  # passed to {cuisine} in chef_chain
})

result

The raw result is not very readable,

so let's call print(result.content) instead.

For a vegetarian version of Indian Chicken Curry, you can easily replace the chicken with a plant-based alternative such as tofu or seitan. Here's how you can adapt the recipe:

Ingredients:
- 1 lb tofu or seitan, cut into bite-sized pieces
- 1 onion, finely chopped
- 2 tomatoes, chopped
- 2 cloves of garlic, minced
- 1-inch piece of ginger, grated
- 1 tsp cumin powder
- 1 tsp coriander powder
- 1/2 tsp turmeric powder
- 1/2 tsp chili powder (adjust to taste)
- 1/2 cup plain vegan yogurt (made from coconut milk or almond milk)
- 2 tbsp oil
- Salt to taste
- Fresh cilantro leaves for garnish

Instructions:
1. Heat oil in a large pan over medium heat. Add the chopped onions and sauté until they turn translucent.
2. Add the minced garlic and grated ginger. Cook for another minute until fragrant.
3. Add the chopped tomatoes and cook until they soften and break down.
4. Add the cumin powder, coriander powder, turmeric powder, and chili powder. Mix well and cook for a couple of minutes.
5. Add the tofu or seitan pieces to the pan and coat them well with the spice mixture.
6. Stir in the plain vegan yogurt and season with salt. Cover the pan and let the mixture simmer for about 15-20 minutes, stirring occasionally.
7. Once the tofu or seitan is cooked through and the curry has thickened, garnish with fresh cilantro leaves.
8. Serve hot with steamed rice or naan bread.

Enjoy your vegetarian Indian Curry with tofu or seitan! Feel free to adjust the spices to suit your taste preferences.

The output comes out clean and well organized.
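The {"recipe": chef_chain} | veg_chain composition used above can be pictured with plain functions: the first step's output is stored under the key "recipe" and that dict becomes the second step's input. The functions below are stand-ins that return canned text instead of calling a model, and their names are hypothetical.

```python
def chef(inputs: dict) -> str:
    """Stand-in for chef_chain: returns a canned recipe instead of calling a model."""
    return f"Recipe for {inputs['cuisine']} food: chicken, onion, spices"


def veg_chef(inputs: dict) -> str:
    """Stand-in for veg_chain: expects a dict with a 'recipe' key."""
    return inputs["recipe"].replace("chicken", "tofu")


def run_final(inputs: dict) -> str:
    # Equivalent in spirit to {"recipe": chef_chain} | veg_chain:
    # run the first step, store its output under "recipe", then run the second.
    mapped = {"recipe": chef(inputs)}
    return veg_chef(mapped)


print(run_final({"cuisine": "indian"}))
```

The dict is what renames the first output to the placeholder name the second prompt expects.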

😀 Streaming

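The chain above has to call the AI twice, so you have to wait a while for the result.

The output can instead be shown as it is generated, using streaming=.

streaming= on a chat model

streaming lets you watch the LLM's response as it is being generated,

close to real time.

callbacks=[StreamingStdOutCallbackHandler()]

prints each piece of text (token) as soon as it becomes available.

callbacks can also detect a variety of other events,

such as the LLM starting or finishing its work,

generating a token, or raising an error.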
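The callback idea can be sketched without any API call. In the toy below, a fake model emits a sentence word by word and notifies a handler for each event. The method names mirror LangChain's callback hooks but with simplified signatures, and both the handler class and the fake model are hypothetical stand-ins.

```python
import time


class CollectingHandler:
    """Toy callback handler that records events and prints tokens as they arrive."""

    def __init__(self):
        self.events = []
        self.tokens = []

    def on_llm_start(self):
        self.events.append("start")

    def on_llm_new_token(self, token: str):
        self.tokens.append(token)
        print(token, end="", flush=True)  # print each piece as soon as it arrives

    def on_llm_end(self):
        self.events.append("end")
        print()


def fake_streaming_model(text: str, handler, delay: float = 0.0):
    """Pretend to generate `text` piece by piece, notifying the handler each time."""
    handler.on_llm_start()
    for token in text.split(" "):
        time.sleep(delay)  # stands in for generation latency
        handler.on_llm_new_token(token + " ")
    handler.on_llm_end()


handler = CollectingHandler()
fake_streaming_model("Paneer is a good substitute", handler)
```

Raising delay to something like 0.2 makes the word-by-word printing visible in a notebook.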

Let's try it out directly.

from langchain_core.callbacks import StreamingStdOutCallbackHandler

chat = ChatOpenAI(
    temperature=0.1,
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()]
)

# rebuild the chains with the new chat model and run again
chef_chain = chef_prompt | chat
veg_chain = veg_chef_prompt | chat
final_chain = {"recipe": chef_chain} | veg_chain

final_chain.invoke({
    "cuisine": "indian",
})

The captured output below is the vegetarian chef's response, first as streamed text and then as the returned AIMessage.

That sounds like a delicious recipe for Chicken Tikka Masala! As a vegetarian chef, I can help you make a vegetarian version of this dish by replacing the chicken with a suitable alternative.

For this recipe, you can substitute the chicken with paneer, which is a popular Indian cheese that is firm and holds its shape well when cooked. Here's how you can prepare the paneer as a replacement for the chicken:

Ingredients:
- 1 lb paneer, cut into bite-sized pieces

Instructions:
1. Instead of marinating chicken, you can marinate the paneer in the yogurt and spice mixture as mentioned in the recipe. Paneer absorbs flavors well, so marinating it will help enhance its taste.

2. You can follow the same steps for baking the marinated paneer on skewers in the oven until slightly charred. Keep an eye on it as paneer cooks faster than chicken.

3. In the sauce, you can add the cooked paneer pieces at the same stage where you would add the chicken tikka. Let the paneer simmer in the sauce for a few minutes to absorb the flavors.

4. Continue with the recipe as instructed, adjusting the cooking time as needed since paneer doesn't require as much time to cook as chicken.

By following these steps and using paneer as a substitute for chicken, you can enjoy a delicious vegetarian version of Chicken Tikka Masala. Serve it with rice or naan bread for a satisfying meal. Enjoy your cooking!
AIMessage(content="That sounds like a delicious recipe for Chicken Tikka Masala! As a vegetarian chef, I can help you make a vegetarian version of this dish by replacing the chicken with a suitable alternative. \n\nFor this recipe, you can substitute the chicken with paneer, which is a popular Indian cheese that is firm and holds its shape well when cooked. Here's how you can prepare the paneer as a replacement for the chicken:\n\nIngredients:\n- 1 lb paneer, cut into bite-sized pieces\n\nInstructions:\n1. Instead of marinating chicken, you can marinate the paneer in the yogurt and spice mixture as mentioned in the recipe. Paneer absorbs flavors well, so marinating it will help enhance its taste.\n\n2. You can follow the same steps for baking the marinated paneer on skewers in the oven until slightly charred. Keep an eye on it as paneer cooks faster than chicken.\n\n3. In the sauce, you can add the cooked paneer pieces at the same stage where you would add the chicken tikka. Let the paneer simmer in the sauce for a few minutes to absorb the flavors.\n\n4. Continue with the recipe as instructed, adjusting the cooking time as needed since paneer doesn't require as much time to cook as chicken.\n\nBy following these steps and using paneer as a substitute for chicken, you can enjoy a delicious vegetarian version of Chicken Tikka Masala. Serve it with rice or naan bread for a satisfying meal. Enjoy your cooking!", additional_kwargs={}, response_metadata={'finish_reason': 'stop'}, id='run-feaf3d8d-f35d-4151-be2e-503e5159e194-0')


윤슬의 바다
