starcoder-15b-taco

Framework: PyTorch
Category: AI
Source: https://modelscope.cn/models/zhujihuai/starcoder-15b-taco
License: Apache License 2.0

Details

Starcoder-15B-TACO

Model Description

Starcoder-15B-TACO is Starcoder-15B fine-tuned (full-parameter) on the TACO dataset. The model is specialized for solving competition-level programming tasks.

Training data

The model is trained on the Topics in Algorithmic Code Generation (TACO) dataset. The dataset focuses on algorithmic code generation, aiming to provide a more challenging training set and evaluation benchmark for the code generation field. It includes 25,443 problems in the training set and 1,000 problems in the test set, making it the largest code generation dataset currently available. Each TACO problem is paired with a diverse set of solution answers, with answers reaching sizes up to 1.55M, to ensure that models trained on the dataset are robust and not prone to overfitting. Furthermore, TACO includes fine-grained labels such as task topics, algorithms, skills, and difficulty levels, offering more precise guidance for both training and evaluating code generation models. This model is fine-tuned on the train split of TACO.
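The fine-grained labels make it possible to carve out targeted training or evaluation subsets, e.g. by difficulty. A minimal sketch of such filtering, using toy in-memory records (the field names `question`, `difficulty`, and `skills` are illustrative stand-ins for TACO's actual schema, not a documented API):

```python
# Sketch: selecting TACO-style problems by difficulty label.
# The records below are toy stand-ins; real TACO rows also carry
# the full problem statement, reference solutions, and test cases.
problems = [
    {"question": "Sum two numbers.", "difficulty": "EASY", "skills": ["math"]},
    {"question": "Shortest path in a graph.", "difficulty": "HARD", "skills": ["graphs"]},
    {"question": "Longest increasing subsequence.", "difficulty": "MEDIUM", "skills": ["dp"]},
]

def filter_by_difficulty(records, levels):
    """Keep only problems whose difficulty label is in `levels`."""
    return [r for r in records if r["difficulty"] in levels]

easy_and_medium = filter_by_difficulty(problems, {"EASY", "MEDIUM"})
print(len(easy_and_medium))  # 2
```

The same pattern applies to the other label types (topics, algorithms, skills), which is how the fine-grained annotations can guide curriculum-style training or per-skill evaluation.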

Training procedure

The training script used to train this model can be found here.

Training details can be found in our paper.

Intended Use and Limitations

The model is fine-tuned to solve programming problems given a text description and optional starter code.
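A minimal sketch of assembling such an input from a description plus optional starter code, following the `ANSWER:` convention shown in the usage example in this card (the exact prompt format is an assumption based on that example, not an official specification):

```python
def build_prompt(description, starter_code=None):
    """Compose a model prompt from a task description and optional starter code.

    The layout mirrors the usage example in this card: description,
    then any starter code, then an "ANSWER:" marker for the model
    to continue from.
    """
    parts = [description.strip()]
    if starter_code:
        parts.append(starter_code.strip())
    parts.append("ANSWER:")
    return "\n".join(parts) + "\n"

prompt = build_prompt(
    "A function to greet user. Given a user name it should say hello",
    starter_code="def greet(name):",
)
print(prompt)
```

Omitting `starter_code` simply drops that middle block, so the same helper covers both problem styles.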

How to use

You can use this model directly for text generation with the transformers library. Because sampling is enabled, this example generates a different sequence each time it is run:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained("flagopen/starcoder-15b-taco").to(device)
tokenizer = AutoTokenizer.from_pretrained("flagopen/starcoder-15b-taco")

prompt = """
A function to greet user. Given a user name it should say hello
def greet(name):
ANSWER:
"""

input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
start = input_ids.size(1)
out = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    num_beams=2,
    early_stopping=True,
    eos_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][start:]))
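Competition-level problems like those in TACO are typically judged by running a generated program against input/output test cases rather than by string comparison. A minimal, model-free sketch of such a check (this harness is illustrative and is not the TACO evaluation code):

```python
import subprocess
import sys

def passes_tests(candidate_source, io_cases):
    """Run candidate code as a script, feeding stdin and comparing stdout.

    `io_cases` is a list of (stdin_text, expected_stdout) pairs; the
    candidate passes only if every case's output matches after stripping
    surrounding whitespace.
    """
    for stdin_text, expected_stdout in io_cases:
        result = subprocess.run(
            [sys.executable, "-c", candidate_source],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=10,
        )
        if result.stdout.strip() != expected_stdout.strip():
            return False
    return True

# A hand-written stand-in for model output: read two ints, print their sum.
solution = "a, b = map(int, input().split())\nprint(a + b)"
print(passes_tests(solution, [("1 2", "3"), ("10 5", "15")]))  # True
```

In practice a harness like this would also sandbox the candidate code and enforce memory limits; the sketch keeps only the stdin/stdout contract.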

Limitations and Biases

The model is intended to be used for research purposes only and comes with no guarantee of the quality of the generated code.

Eval results

Coming soon…
