Getting Started with GPT: The LangChain Chain Module

GPT-4o  2024-05-12  491

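In plain terms, a Chain is a link. LangChain itself means "language chain", and the Chain module is the linking structure inside it: an intermediate layer that combines the other layers. You can think of it as the glue layer that binds the individual modules together to deliver a given piece of functionality, and it also serves as the entry point that application code calls.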

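Like the Model module, the Chain module has a base class, Chain, which is the common entry point for every chain object. It covers interaction with the user's program, user input, input from other modules, memory integration, and callbacks. A chain uses string values (its input and output keys) to control which inputs it accepts and the format of the output it returns.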
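To make the idea of a base class with string-keyed inputs and outputs concrete, here is a minimal sketch in plain Python. It does not import LangChain; MiniChain and UpperCaseChain are hypothetical names invented for this illustration, and the real Chain class does considerably more (memory, callbacks, and so on).

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class MiniChain(ABC):
    """Toy stand-in for a chain base class (hypothetical, not LangChain's code)."""

    @property
    @abstractmethod
    def input_keys(self) -> List[str]:
        """Names of the inputs this chain requires."""

    @property
    @abstractmethod
    def output_keys(self) -> List[str]:
        """Names of the outputs this chain produces."""

    @abstractmethod
    def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
        """Do the actual work; subclasses implement this."""

    def __call__(self, inputs: Dict[str, str]) -> Dict[str, str]:
        # Reject the call early if a declared input key is missing.
        missing = [key for key in self.input_keys if key not in inputs]
        if missing:
            raise ValueError(f"Missing input keys: {missing}")
        outputs = self._call(inputs)
        # Return the inputs and the outputs together in one dictionary.
        return {**inputs, **outputs}


class UpperCaseChain(MiniChain):
    """Example subclass: takes 'text' and produces 'shouted'."""

    @property
    def input_keys(self) -> List[str]:
        return ["text"]

    @property
    def output_keys(self) -> List[str]:
        return ["shouted"]

    def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
        return {"shouted": inputs["text"].upper()}


print(UpperCaseChain()({"text": "hi"}))  # {'text': 'hi', 'shouted': 'HI'}
```

The point of the sketch is only that the string keys act as a contract: the base class can validate inputs before any subclass logic runs.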

The classes that currently inherit from Chain are:

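As the figure above shows, most Chain subclasses are concrete implementations dedicated to one specialized task. LLMChain, as its name suggests, is the Chain implementation built for large language models, and the other chains follow the same pattern. In practice you pick the chain that matches your business need: SQLDatabaseChain handles database operations, for example, and VectorDBQA answers questions over a vector store.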

LLMChain

LLMChain is the chain aimed at large language models. It is usually combined with other chains, and it is the chain used most often during development.

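# Import paths follow the legacy langchain 0.0.x package layout.
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

template = """Write a {adjective} poem about {subject}."""
prompt = PromptTemplate(template=template, input_variables=["adjective", "subject"])
llm_chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0), verbose=True)
llm_chain.predict(adjective="sad", subject="ducks")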
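What an LLM chain does can be reduced to two steps: fill the prompt template, then send the result to the model. The sketch below shows that flow in plain Python without calling any real model. The names make_llm_chain and echo_llm are hypothetical, and echo_llm is only a stand-in that echoes the prompt back.

```python
from typing import Callable


def make_llm_chain(template: str, llm: Callable[[str], str]) -> Callable[..., str]:
    """Return a function that fills the template and sends the result to llm."""

    def predict(**variables: str) -> str:
        # Step 1: substitute the named variables into the template.
        prompt = template.format(**variables)
        # Step 2: hand the finished prompt to the model.
        return llm(prompt)

    return predict


def echo_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model: echoes the prompt back."""
    return f"[model reply to: {prompt}]"


predict = make_llm_chain("Write a {adjective} poem about {subject}.", echo_llm)
result = predict(adjective="sad", subject="ducks")
print(result)  # [model reply to: Write a sad poem about ducks.]
```

Swapping echo_llm for a function that calls a real model leaves the rest of the sketch unchanged, which is the kind of separation the chain is meant to provide.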

SequentialChain

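Besides LLMChain, the other commonly used kind is the sequential chain, which runs chains in order. It suits work that is carried out step by step with a strict order between the steps. For example, we must cook or order takeout before we can eat; when that order has to be guaranteed, use a sequential chain. There are two variants: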

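SimpleSequentialChain: the output of one step becomes the argument (input) of the next step;
SequentialChain: this class allows multiple inputs and outputs. For example, if we have a big meal of three dishes and one soup, the three dishes and the soup are our inputs.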
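The difference between the two variants can be sketched in plain Python. This is an illustration of the data flow only, not LangChain's implementation; run_simple_sequential, run_sequential, and the step functions are hypothetical names.

```python
from typing import Callable, Dict, List

Step = Callable[[Dict[str, str]], Dict[str, str]]


def run_simple_sequential(steps: List[Callable[[str], str]], first_input: str) -> str:
    """Single input, single output: each step receives the previous step's result."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


def run_sequential(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    """Multiple named inputs and outputs: each step can read every value known so far."""
    known = dict(inputs)
    for step in steps:
        known.update(step(known))
    return known


# Simple variant: cook first, then eat what was cooked.
simple_result = run_simple_sequential(
    [lambda dish: f"cooked {dish}", lambda meal: f"ate {meal}"],
    "noodles",
)


# General variant: two named inputs feed the first step.
def cook_meal(values: Dict[str, str]) -> Dict[str, str]:
    return {"meal": f"{values['dishes']} with {values['soup']}"}


def eat_meal(values: Dict[str, str]) -> Dict[str, str]:
    return {"review": f"enjoyed {values['meal']}"}


full_result = run_sequential(
    [cook_meal, eat_meal],
    {"dishes": "three dishes", "soup": "one soup"},
)

print(simple_result)          # ate cooked noodles
print(full_result["review"])  # enjoyed three dishes with one soup
```

In both variants the order of the list is the order of execution, which is what guarantees that cooking happens before eating.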

Loading a Chain

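Since LangChain provides so many kinds of Chain, it loads them from configuration: it first reads the configuration from a file and then obtains the chain's settings from that configuration. The code is as follows: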

def load_chain(path: Union[str, Path], **kwargs: Any) -> Chain:
    """Unified method for loading a chain from LangChainHub or local fs."""
    if hub_result := try_load_from_hub(
        path, _load_chain_from_file, "chains", {"json", "yaml"}, **kwargs
    ):
        return hub_result
    else:
        return _load_chain_from_file(path, **kwargs)


def _load_chain_from_file(file: Union[str, Path], **kwargs: Any) -> Chain:
    """Load chain from file."""
    # Convert file to Path object.
    if isinstance(file, str):
        file_path = Path(file)
    else:
        file_path = file
    # Load from either json or yaml.
    if file_path.suffix == ".json":
        with open(file_path) as f:
            config = json.load(f)
    elif file_path.suffix == ".yaml":
        with open(file_path, "r") as f:
            config = yaml.safe_load(f)
    else:
        raise ValueError("File type must be json or yaml")

    # Override default 'verbose' and 'memory' for the chain
    if "verbose" in kwargs:
        config["verbose"] = kwargs.pop("verbose")
    if "memory" in kwargs:
        config["memory"] = kwargs.pop("memory")

    # Load the chain from the config now.
    return load_chain_from_config(config, **kwargs)

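The code above is fairly simple: it parses the supported file formats (JSON or YAML), reads the configuration from the file, and then hands that configuration to load_chain_from_config, which builds the chain from it.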
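The file-reading and override steps can be tried in isolation with a small sketch. It handles JSON only, so that it needs nothing beyond the standard library. The function name read_chain_config and the file contents are assumptions made for this example; a real chain configuration would carry more fields than "_type" and "verbose".

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def read_chain_config(file: Path, **overrides: Any) -> Dict[str, Any]:
    """Read a JSON config file and let keyword arguments override two fields."""
    if file.suffix != ".json":
        raise ValueError("This sketch only reads .json files")
    with open(file) as f:
        config = json.load(f)
    # Values passed by the caller win over what the file says.
    for key in ("verbose", "memory"):
        if key in overrides:
            config[key] = overrides[key]
    return config


# Write a tiny, made-up config to a temporary file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_chain.json"
    path.write_text(json.dumps({"_type": "llm_chain", "verbose": False}))
    config = read_chain_config(path, verbose=True)

print(config)  # {'_type': 'llm_chain', 'verbose': True}
```

The file said verbose was False, but the caller's keyword argument replaced it, which mirrors the override behavior described for 'verbose' and 'memory'.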

type_to_loader_dict = {
    "api_chain": _load_api_chain,
    "hyde_chain": _load_hyde_chain,
    "llm_chain": _load_llm_chain,
    "llm_bash_chain": _load_llm_bash_chain,
    "llm_checker_chain": _load_llm_checker_chain,
    "llm_math_chain": _load_llm_math_chain,
    "llm_requests_chain": _load_llm_requests_chain,
    "pal_chain": _load_pal_chain,
    "qa_with_sources_chain": _load_qa_with_sources_chain,
    "stuff_documents_chain": _load_stuff_documents_chain,
    "map_reduce_documents_chain": _load_map_reduce_documents_chain,
    "map_rerank_documents_chain": _load_map_rerank_documents_chain,
    "refine_documents_chain": _load_refine_documents_chain,
    "sql_database_chain": _load_sql_database_chain,
    "vector_db_qa_with_sources_chain": _load_vector_db_qa_with_sources_chain,
    "vector_db_qa": _load_vector_db_qa,
}


def load_chain_from_config(config: dict, **kwargs: Any) -> Chain:
    """Load chain from Config Dict."""
    if "_type" not in config:
        raise ValueError("Must specify a chain Type in config")
    config_type = config.pop("_type")

    if config_type not in type_to_loader_dict:
        raise ValueError(f"Loading {config_type} chain not supported")

    chain_loader = type_to_loader_dict[config_type]
    return chain_loader(config, **kwargs)
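The pattern here is a registry: a dictionary maps the "_type" string to a loader function, and the dispatcher looks the loader up and calls it. Below is a minimal, self-contained sketch of that pattern. The loaders and chain type names are hypothetical and return strings rather than real chain objects.

```python
from typing import Any, Callable, Dict


def _load_echo_chain(config: dict, **kwargs: Any) -> str:
    """Hypothetical loader; a real one would build and return a chain object."""
    return f"EchoChain(prefix={config.get('prefix', '')!r})"


def _load_reverse_chain(config: dict, **kwargs: Any) -> str:
    """Second hypothetical loader, to show that dispatch picks by type name."""
    return "ReverseChain()"


# Registry: chain type name -> function that knows how to build that chain.
TYPE_TO_LOADER: Dict[str, Callable[..., str]] = {
    "echo_chain": _load_echo_chain,
    "reverse_chain": _load_reverse_chain,
}


def load_from_config(config: dict, **kwargs: Any) -> str:
    """Look up the loader named by config['_type'] and call it."""
    if "_type" not in config:
        raise ValueError("Must specify a chain type in config")
    # Copy first so that popping '_type' does not modify the caller's dict.
    config = dict(config)
    config_type = config.pop("_type")
    if config_type not in TYPE_TO_LOADER:
        raise ValueError(f"Loading {config_type} chain not supported")
    return TYPE_TO_LOADER[config_type](config, **kwargs)


print(load_from_config({"_type": "echo_chain", "prefix": "hi"}))  # EchoChain(prefix='hi')
```

One deliberate difference from the code above: the sketch copies the dictionary before popping "_type", so the caller's config is left untouched. Supporting a new chain type then only requires adding one entry to the registry.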