NexusRaven-V2-13B

Categories: ai, llama, PyTorch
Open-source repository: https://modelscope.cn/models/AI-ModelScope/NexusRaven-V2-13B
License: other

Model Details

NexusRaven-V2-13B: Surpassing GPT-4 for Zero-shot Function Calling

Nexusflow HF - Nexusflow Discord - NexusRaven-V2 blog post - Prompting Notebook CoLab - Leaderboard - Real-World Demo - NexusRaven-V2-13B Github


Introducing NexusRaven-V2-13B

NexusRaven is an open-source and commercially viable function calling LLM that surpasses the state-of-the-art in function calling capabilities.

Versatile Function Calling Capability: NexusRaven-V2 is capable of generating single function calls, nested calls, and parallel calls in many challenging cases.

Fully Explainable: NexusRaven-V2 is capable of generating very detailed explanations for the function calls it generates. This behavior can be turned off to save tokens during inference.

Performance Highlights: NexusRaven-V2 surpasses GPT-4 by 7% in function calling success rates in human-generated use cases involving nested and composite functions.

Generalization to the Unseen: NexusRaven-V2 has never been trained on the functions used in evaluation.

Commercially Permissive: The training of NexusRaven-V2 does not involve any data generated by proprietary LLMs such as GPT-4. You have full control of the model when deployed in commercial applications.

Please check out the links above!

NexusRaven-V2 model usage

NexusRaven-V2 accepts a list of Python functions. These functions can do anything (including sending GET/POST requests to external APIs!). The only two requirements are the Python function signature and an appropriate docstring from which to generate the function call.
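For example, a function handed to Raven might look like the sketch below (the function name and endpoint URL are hypothetical placeholders):

import requests

def get_current_weather(city: str, unit: str = "celsius"):
    """
    Returns the current weather for the given city.

    Args:
       city: Name of the city to look up, e.g. "Seattle".
       unit: Temperature unit, either "celsius" or "fahrenheit".
    """
    # The body can do anything, including calling an external API.
    response = requests.get(
        "https://api.example.com/weather",  # placeholder endpoint
        params={"city": city, "unit": unit},
    )
    return response.json()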

NexusRaven-V2's Capabilities

NexusRaven-V2 is capable of generating deeply nested function calls, parallel function calls, and simple single calls. It can also justify the function calls it generates. If you would like to generate the call only, please set a stopping criterion of "<bot_end>". Otherwise, please allow NexusRaven-V2 to run until its stop token (i.e. "</s>").
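For intuition, with hypothetical functions get_weather and get_user_location, the three call shapes look like this (illustrative Python call expressions, not necessarily Raven's exact output serialization):

get_weather(city="Seattle")                                # simple single call
get_weather(city=get_user_location(user_id=42))            # nested call
get_weather(city="Seattle"); get_weather(city="Portland")  # parallel calls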

Quick Start Prompting Guide

Please refer to our notebook, How-To-Prompt.ipynb, for more advanced tutorials on using NexusRaven-V2!

  1. When giving docstrings to Raven, please provide well-indented, detailed, and well-written docstrings, as this can help accuracy.
  2. Raven does better when every function provided to it has arguments, either required or optional (i.e. func(dummy_arg) is preferred over func()), as this can help accuracy.
  3. We strongly recommend setting sampling to False when prompting NexusRaven-V2.
  4. We strongly recommend a very low temperature (~0.001).
  5. We strongly recommend following the prompting style below.

When handling irrelevant user queries, users have found that specifying a "no-op" function with arguments works best. For example, something like this might work:

def no_relevant_function(user_query: str):
  """
  Call this when no other provided function can be called to answer the user query.

  Args:
     user_query: The user_query that cannot be answered by any other function calls.
  """

Please be sure to provide an argument to this function, as Raven works best on functions with arguments.

Quickstart

You can run the model on a GPU using the following code.

from modelscope import AutoTokenizer, Model
from modelscope import snapshot_download
import torch
from typing import List, Tuple

BOS_TOKEN = '<s>'
EOS_TOKEN = '</s>'

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

SYSTEM_PROMPT = """You are a friendly chatbot"""

def chat_multiturn_seq_format(
    message: str,
    history: List[Tuple[str, str]] = [], 
):
    """
    ```
        <bos>[INST] B_SYS SystemPrompt E_SYS Prompt [/INST] Answer <eos>
        <bos>[INST] Prompt [/INST] Answer <eos>
        <bos>[INST] Prompt [/INST]
    ```
    Because this format inserts <bos> itself, disable special-token insertion when tokenizing (e.g. tokenizer(..., add_special_tokens=False)).
    Inputs:
      message: the current prompt
      history: list of (message, response) tuples from previous turns, e.g. [(message1, response1), (message2, response2)]
    Outputs:
      full_prompt: the prompt that should go into the chat model

    e.g:
      full_prompt = chat_multiturn_seq_format("Hello world")
      output = model.generate(tokenizer.encode(full_prompt, add_special_tokens=False), ...)
    """
    text = ''
    for i, (prompt, res) in enumerate(history):
        if i == 0:
            # The first turn carries the system prompt.
            text += f"{BOS_TOKEN}{B_INST} {B_SYS} {SYSTEM_PROMPT} {E_SYS} {prompt} {E_INST}"
        else:
            text += f"{BOS_TOKEN}{B_INST} {prompt} {E_INST}"
        if res is not None:
            text += f" {res} {EOS_TOKEN} "
    if len(history) == 0 or text.strip() == '':
        text = f"{BOS_TOKEN}{B_INST} {B_SYS} {SYSTEM_PROMPT} {E_SYS} {message} {E_INST}"
    else:
        text += f"{BOS_TOKEN}{B_INST} {message} {E_INST}"
    return text


local_dir = snapshot_download("AI-ModelScope/NexusRaven-V2-13B", revision='master')

model = Model.from_pretrained(local_dir, revision='master', device_map='auto', torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(local_dir, revision='master')

full_prompt = chat_multiturn_seq_format("What's the weather like in Seattle right now?")
# The prompt already contains <s>, so disable special-token insertion here.
inputs = tokenizer(full_prompt, add_special_tokens=False, return_tensors="pt")
# Generate greedily (do_sample=False); the sampling arguments are then ignored.
generate_ids = model.generate(inputs.input_ids.to(model.device), max_length=512, do_sample=False, temperature=0.001, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])
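
To continue the conversation, a follow-up turn can reuse the helper with the previous exchange passed as history (a minimal sketch building on the variables above):

# Slice off the prompt tokens to keep only the newly generated text.
first_response = tokenizer.batch_decode(
    generate_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
history = [("What's the weather like in Seattle right now?", first_response)]
full_prompt = chat_multiturn_seq_format("And how about in Portland?", history=history)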

If you would like to prevent the generation of the explanation of the function call (for example, to save on inference tokens), please set a stopping criterion of "<bot_end>".
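
One way to implement this, assuming the ModelScope Model wrapper exposes a Hugging Face-style generate() that accepts stopping_criteria (an assumption worth verifying against your installed version), is a small substring-based criterion:

import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnSubstring(StoppingCriteria):
    """Stops generation once stop_string appears in the newly generated text."""
    def __init__(self, tokenizer, stop_string: str, prompt_length: int):
        self.tokenizer = tokenizer
        self.stop_string = stop_string
        self.prompt_length = prompt_length

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        # Decode only the tokens generated after the prompt.
        generated = self.tokenizer.decode(input_ids[0, self.prompt_length:])
        return self.stop_string in generated

stopping = StoppingCriteriaList([StopOnSubstring(tokenizer, "<bot_end>", inputs.input_ids.shape[1])])
generate_ids = model.generate(inputs.input_ids.to(model.device), max_length=512, do_sample=False, stopping_criteria=stopping)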

Please follow this prompting template to maximize the performance of NexusRaven-V2.

Using with OpenAI FC Schematics

If you currently have a workflow that is built around OpenAI's function calling and you want to try NexusRaven-V2, we have a package that helps you drop in NexusRaven-V2.
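
The package's own API is not reproduced here; as a rough, hypothetical sketch of the underlying idea, an OpenAI-style function definition can be rendered into the Python signature-plus-docstring form that Raven consumes:

def openai_fc_to_raven(spec: dict) -> str:
    """Render an OpenAI-style function spec as signature + docstring text.

    Illustrative sketch only, not the actual package's implementation.
    """
    params = spec.get("parameters", {}).get("properties", {})
    args = ", ".join(params)  # argument names only, for brevity
    arg_docs = "\n".join(
        f"       {name}: {info.get('description', '')}"
        for name, info in params.items()
    )
    return (
        f"def {spec['name']}({args}):\n"
        '    """\n'
        f"    {spec.get('description', '')}\n\n"
        f"    Args:\n{arg_docs}\n"
        '    """\n'
    )

spec = {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name, e.g. Seattle."},
        },
    },
}
print(openai_fc_to_raven(spec))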

Evaluation


For a deeper dive into the results, please see our Github README.

Limitations

  1. The model works best when connected to a retriever when there are a multitude of functions, as a large number of functions will saturate its context window (a minimal retrieval sketch follows this list).
  2. The model can be prone to generating incorrect calls. Please ensure proper guardrails are in place to capture errant behavior.
  3. The explanations generated by NexusRaven-V2 might be incorrect. Please ensure proper guardrails are in place to capture errant behavior.
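
As a minimal, hypothetical sketch of such a pre-filter (plain word overlap; a real deployment would more likely use embedding similarity), one might select the top-k candidate functions before building the prompt:

def select_functions(query: str, function_docs: dict, k: int = 5) -> list:
    """Keep the k function names whose docstrings overlap most with the query.

    Hypothetical retriever sketch; production systems would use embeddings.
    """
    query_words = set(query.lower().split())
    scored = sorted(
        function_docs.items(),
        key=lambda item: len(query_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]

docs = {
    "get_weather": "Return current weather for a city.",
    "send_email": "Send an email to a recipient with a subject and body.",
}
print(select_functions("What's the weather in Seattle?", docs, k=1))  # ['get_weather']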

License

This model was trained on commercially viable data and is licensed under the Nexusflow community license.

References

We thank the CodeLlama team for their amazing models!

@misc{rozière2023code,
      title={Code Llama: Open Foundation Models for Code}, 
      author={Baptiste Rozière and Jonas Gehring and Fabian Gloeckle and Sten Sootla and Itai Gat and Xiaoqing Ellen Tan and Yossi Adi and Jingyu Liu and Tal Remez and Jérémy Rapin and Artyom Kozhevnikov and Ivan Evtimov and Joanna Bitton and Manish Bhatt and Cristian Canton Ferrer and Aaron Grattafiori and Wenhan Xiong and Alexandre Défossez and Jade Copet and Faisal Azhar and Hugo Touvron and Louis Martin and Nicolas Usunier and Thomas Scialom and Gabriel Synnaeve},
      year={2023},
      eprint={2308.12950},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Citation

@misc{nexusraven,
      title={NexusRaven-V2: Surpassing GPT-4 for Zero-shot Function Calling},
      author={Nexusflow.ai team},
      year={2023},
      url={https://nexusflow.ai/blogs/ravenv2}
}

Contact

Please join our Discord Channel to reach out for any issues and comments!
