NexusRaven-V2-13B

Categories: ai, llama, PyTorch
Open-source repository: https://modelscope.cn/models/AI-ModelScope/NexusRaven-V2-13B
License: other

Model Details

NexusRaven-V2-13B: Surpassing GPT-4 for Zero-shot Function Calling

Nexusflow HF - Nexusflow Discord - NexusRaven-V2 blog post - Prompting Notebook CoLab - Leaderboard - Real-World Demo - NexusRaven-V2-13B Github


Introducing NexusRaven-V2-13B

NexusRaven is an open-source and commercially viable function calling LLM that surpasses the state-of-the-art in function calling capabilities.

Versatile Function Calling Capability: NexusRaven-V2 is capable of generating single function calls, nested calls, and parallel calls in many challenging cases.

Fully Explainable: NexusRaven-V2 is capable of generating very detailed explanations for the function calls it generates. This behavior can be turned off to save tokens during inference.

Performance Highlights: NexusRaven-V2 surpasses GPT-4 by 7% in function calling success rates in human-generated use cases involving nested and composite functions.

Generalization to the Unseen: NexusRaven-V2 has never been trained on the functions used in evaluation.

Commercially Permissive: The training of NexusRaven-V2 does not involve any data generated by proprietary LLMs such as GPT-4. You have full control of the model when deployed in commercial applications.

Please check out the links above!

NexusRaven-V2 model usage

NexusRaven-V2 accepts a list of Python functions. These functions can do anything (including sending GET/POST requests to external APIs!). The only two requirements are the Python function signature and an appropriate docstring from which to generate the function call.
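For example, a function handed to Raven might look like the sketch below (the function name and endpoint URL are hypothetical placeholders):

import requests

def get_current_weather(city: str, unit: str = "celsius"):
    """
    Returns the current weather for the given city.

    Args:
       city: Name of the city to look up, e.g. "Seattle".
       unit: Temperature unit, either "celsius" or "fahrenheit".
    """
    # The body can do anything, including calling an external API.
    response = requests.get(
        "https://api.example.com/weather",  # placeholder endpoint
        params={"city": city, "unit": unit},
    )
    return response.json()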

NexusRaven-V2's Capabilities

NexusRaven-V2 is capable of generating deeply nested function calls, parallel function calls, and simple single calls. It can also justify the function calls it generates. If you would like to generate the call only, please set a stopping criterion of "<bot_end>". Otherwise, please allow NexusRaven-V2 to run until its stop token (i.e. "</s>").
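For intuition, with hypothetical functions get_weather and get_user_location, the three call shapes look like this (illustrative Python call expressions, not necessarily Raven's exact output serialization):

get_weather(city="Seattle")                                # simple single call
get_weather(city=get_user_location(user_id=42))            # nested call
get_weather(city="Seattle"); get_weather(city="Portland")  # parallel calls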

Quick Start Prompting Guide

Please refer to our notebook, How-To-Prompt.ipynb, for more advanced tutorials on using NexusRaven-V2!

  1. When giving docstrings to Raven, please provide well-indented, detailed, and well-written docstrings, as this can help accuracy.
  2. Raven does better when every function provided to it has arguments, either required or optional (i.e. func(dummy_arg) is preferred over func()), as this can help accuracy.
  3. We strongly recommend setting sampling to False when prompting NexusRaven-V2.
  4. We strongly recommend a very low temperature (~0.001).
  5. We strongly recommend following the prompting style below.

When handling irrelevant user queries, users have found that specifying a "no-op" function with arguments works best. For example, something like this might work:

def no_relevant_function(user_query: str):
  """
  Call this when no other provided function can be called to answer the user query.

  Args:
     user_query: The user_query that cannot be answered by any other function calls.
  """

Please be sure to provide an argument to this function, as Raven works best on functions with arguments.

Quickstart

You can run the model on a GPU using the following code.

from modelscope import AutoTokenizer, Model
from modelscope import snapshot_download
import torch
from typing import List, Tuple

BOS_TOKEN = '<s>'
EOS_TOKEN = '</s>'

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

SYSTEM_PROMPT = """You are a friendly chatbot"""

def chat_multiturn_seq_format(
    message: str,
    history: List[Tuple[str, str]] = [], 
):
    """
    ```
        <bos>[INST] B_SYS SystemPrompt E_SYS Prompt [/INST] Answer <eos>
        <bos>[INST] Prompt [/INST] Answer <eos>
        <bos>[INST] Prompt [/INST]
    ```
    Because this format inserts <bos> itself, disable special-token insertion when tokenizing (e.g. tokenizer(..., add_special_tokens=False)).
    Inputs:
      message: the current prompt
      history: list of (message, response) tuples from previous turns, e.g. [(message1, response1), (message2, response2)]
    Outputs:
      full_prompt: the prompt that should go into the chat model

    e.g:
      full_prompt = chat_multiturn_seq_format("Hello world")
      output = model.generate(tokenizer.encode(full_prompt, add_special_tokens=False), ...)
    """
    text = ''
    for i, (prompt, res) in enumerate(history):
        if i == 0:
            # The first turn carries the system prompt.
            text += f"{BOS_TOKEN}{B_INST} {B_SYS} {SYSTEM_PROMPT} {E_SYS} {prompt} {E_INST}"
        else:
            text += f"{BOS_TOKEN}{B_INST} {prompt} {E_INST}"
        if res is not None:
            text += f" {res} {EOS_TOKEN} "
    if len(history) == 0 or text.strip() == '':
        text = f"{BOS_TOKEN}{B_INST} {B_SYS} {SYSTEM_PROMPT} {E_SYS} {message} {E_INST}"
    else:
        text += f"{BOS_TOKEN}{B_INST} {message} {E_INST}"
    return text


local_dir = snapshot_download("AI-ModelScope/NexusRaven-V2-13B", revision='master')

model = Model.from_pretrained(local_dir, revision='master', device_map='auto', torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(local_dir, revision='master')

full_prompt = chat_multiturn_seq_format("What's the weather like in Seattle right now?")
# The prompt already contains <s>, so disable special-token insertion here.
inputs = tokenizer(full_prompt, add_special_tokens=False, return_tensors="pt")
# Generate greedily (do_sample=False); the sampling arguments are then ignored.
generate_ids = model.generate(inputs.input_ids.to(model.device), max_length=512, do_sample=False, temperature=0.001, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])
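
To continue the conversation, a follow-up turn can reuse the helper with the previous exchange passed as history (a minimal sketch building on the variables above):

# Slice off the prompt tokens to keep only the newly generated text.
first_response = tokenizer.batch_decode(
    generate_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
history = [("What's the weather like in Seattle right now?", first_response)]
full_prompt = chat_multiturn_seq_format("And how about in Portland?", history=history)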

If you would like to prevent the generation of the explanation of the function call (for example, to save on inference tokens), please set a stopping criterion of "<bot_end>".
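
One way to implement this, assuming the ModelScope Model wrapper exposes a Hugging Face-style generate() that accepts stopping_criteria (an assumption worth verifying against your installed version), is a small substring-based criterion:

import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnSubstring(StoppingCriteria):
    """Stops generation once stop_string appears in the newly generated text."""
    def __init__(self, tokenizer, stop_string: str, prompt_length: int):
        self.tokenizer = tokenizer
        self.stop_string = stop_string
        self.prompt_length = prompt_length

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Decode only the tokens generated after the prompt.
        generated = self.tokenizer.decode(input_ids[0, self.prompt_length:])
        return self.stop_string in generated

stopping = StoppingCriteriaList([StopOnSubstring(tokenizer, "<bot_end>", inputs.input_ids.shape[1])])
generate_ids = model.generate(inputs.input_ids.to(model.device), max_length=512, do_sample=False, stopping_criteria=stopping)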

Please follow this prompting template to maximize the performance of NexusRaven-V2.

Using with OpenAI FC Schematics

If you currently have a workflow that is built around OpenAI's function calling and you want to try NexusRaven-V2, we have a package that helps you drop in NexusRaven-V2.
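
The package's own API is not reproduced here; as a rough, hypothetical sketch of the underlying idea, an OpenAI-style function definition can be rendered into the Python signature-plus-docstring form that Raven consumes:

def openai_fc_to_raven(spec: dict) -> str:
    """Render an OpenAI-style function spec as signature + docstring text.

    Illustrative sketch only, not the actual package's implementation.
    """
    params = spec.get("parameters", {}).get("properties", {})
    args = ", ".join(params)  # argument names only, for brevity
    arg_docs = "\n".join(
        f"       {name}: {info.get('description', '')}"
        for name, info in params.items()
    )
    return (
        f"def {spec['name']}({args}):\n"
        '    """\n'
        f"    {spec.get('description', '')}\n\n"
        f"    Args:\n{arg_docs}\n"
        '    """\n'
    )

spec = {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name, e.g. Seattle."},
        },
    },
}
print(openai_fc_to_raven(spec))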

Evaluation


For a deeper dive into the results, please see our Github README.

Limitations

  1. The model works best when connected to a retriever when there are a multitude of functions, as a large number of functions will saturate its context window (a minimal retrieval sketch follows this list).
  2. The model can be prone to generating incorrect calls. Please ensure proper guardrails are in place to capture errant behavior.
  3. The explanations generated by NexusRaven-V2 might be incorrect. Please ensure proper guardrails are in place to capture errant behavior.
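
As a minimal, hypothetical sketch of such a pre-filter (plain word overlap; a real deployment would more likely use embedding similarity), one might select the top-k candidate functions before building the prompt:

def select_functions(query: str, function_docs: dict, k: int = 5) -> list:
    """Keep the k function names whose docstrings overlap most with the query.

    Hypothetical retriever sketch; production systems would use embeddings.
    """
    query_words = set(query.lower().split())
    scored = sorted(
        function_docs.items(),
        key=lambda item: len(query_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]

docs = {
    "get_weather": "Return current weather for a city.",
    "send_email": "Send an email to a recipient with a subject and body.",
}
print(select_functions("What's the weather in Seattle?", docs, k=1))  # ['get_weather']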

License

This model was trained on commercially viable data and is licensed under the Nexusflow community license.

References

We thank the CodeLlama team for their amazing models!

@misc{rozière2023code,
      title={Code Llama: Open Foundation Models for Code}, 
      author={Baptiste Rozière and Jonas Gehring and Fabian Gloeckle and Sten Sootla and Itai Gat and Xiaoqing Ellen Tan and Yossi Adi and Jingyu Liu and Tal Remez and Jérémy Rapin and Artyom Kozhevnikov and Ivan Evtimov and Joanna Bitton and Manish Bhatt and Cristian Canton Ferrer and Aaron Grattafiori and Wenhan Xiong and Alexandre Défossez and Jade Copet and Faisal Azhar and Hugo Touvron and Louis Martin and Nicolas Usunier and Thomas Scialom and Gabriel Synnaeve},
      year={2023},
      eprint={2308.12950},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Citation

@misc{nexusraven,
      title={NexusRaven-V2: Surpassing GPT-4 for Zero-shot Function Calling},
      author={Nexusflow.ai team},
      year={2023},
      url={https://nexusflow.ai/blogs/ravenv2}
}

Contact

Please join our Discord Channel to reach out for any issues and comments!
