legalinstrumenttrain
This model is a fine-tuned version of ./Llama-2-7b-hf on the criminal1_10k dataset. It achieves the following results on the evaluation set:
- Loss: 0.3369
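If the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language model fine-tuning, assumed here rather than stated in this card), it can be converted to perplexity as a rough sanity check:

```python
import math

# Final evaluation loss reported above.
eval_loss = 0.3369

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")
```

Under that assumption the evaluation perplexity is about 1.40.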
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- num_epochs: 3.0
- mixed_precision_training: Native AMP
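How these values relate to each other can be sketched in plain Python. The total train batch size is the per-device batch size times the gradient accumulation steps times the number of devices; a single device is inferred here from 2 × 2 = 4 and is not stated in the card. The `lr_at` helper is a hypothetical illustration of the usual linear-warmup-then-cosine-decay shape, and `total_steps` is an assumption taken from the last step logged in the results table, since the card does not give the exact total.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 5e-05
train_batch_size = 2
gradient_accumulation_steps = 2
warmup_steps = 20

# Assumption: one device, inferred from 2 * 2 = 4.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Assumption: the exact number of optimizer steps is not stated in the card;
# 2000 is the last step logged in the results table.
total_steps = 2000


def lr_at(step: int) -> float:
    """Linear warmup, then cosine decay to zero (illustrative sketch)."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 10, 20, 1010, 2000):
    print(step, f"{lr_at(step):.2e}")
```

With these assumptions the learning rate climbs to its 5e-05 peak at step 20, falls to half of that midway through the decay, and reaches zero at the final step.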
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.6239 | 0.15 | 100 | 0.6105 |
0.4555 | 0.3 | 200 | 0.4880 |
0.4614 | 0.44 | 300 | 0.4438 |
0.3975 | 0.59 | 400 | 0.4208 |
0.3747 | 0.74 | 500 | 0.3984 |
0.4318 | 0.89 | 600 | 0.3888 |
0.3629 | 1.04 | 700 | 0.3766 |
0.3729 | 1.19 | 800 | 0.3685 |
0.3675 | 1.33 | 900 | 0.3632 |
0.4056 | 1.48 | 1000 | 0.3570 |
0.3222 | 1.63 | 1100 | 0.3522 |
0.2821 | 1.78 | 1200 | 0.3489 |
0.3431 | 1.93 | 1300 | 0.3448 |
0.2885 | 2.07 | 1400 | 0.3429 |
0.262 | 2.22 | 1500 | 0.3413 |
0.3168 | 2.37 | 1600 | 0.3394 |
0.3183 | 2.52 | 1700 | 0.3380 |
0.3021 | 2.67 | 1800 | 0.3372 |
0.2748 | 2.81 | 1900 | 0.3369 |
0.3175 | 2.96 | 2000 | 0.3369 |
Framework versions
- PEFT 0.10.0
- Transformers 4.39.2
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
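Because PEFT appears in the framework list, the checkpoint is presumably a PEFT adapter applied on top of the base model rather than a full set of weights, although the card does not confirm this. Under that assumption, loading might look like the sketch below. The `load_model` helper and the `adapter_path` default are hypothetical placeholders; `./Llama-2-7b-hf` is the base model path given above.

```python
def load_model(base_path: str = "./Llama-2-7b-hf",
               adapter_path: str = "path/to/adapter"):
    """Load the base model and attach a PEFT adapter (illustrative sketch)."""
    # Imported lazily so the helper can be defined without the libraries present.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_path)
    base_model = AutoModelForCausalLM.from_pretrained(base_path)
    # Assumption: adapter_path is a directory holding the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_path)
    model.eval()
    return tokenizer, model
```

Installing the library versions listed above is the safest way to reproduce the environment the adapter was trained in.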