Replies: 5 comments 3 replies
-
I tried a few of the smaller encoder-style models, but finetuned all layers every time.
All models:
Iterations/second: calculated from training iterations (batch size = 12) on Google Colab with a T4 GPU. (*) Colab crashed on DeBERTa after one epoch... I might need to stop being cheap and pay the 10 bucks if I want to keep playing around :)
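For reference, a minimal sketch of what "finetuning all layers" looks like in plain PyTorch — the toy encoder classifier below is a stand-in for the actual models, and the sizes, names, and hyperparameters are placeholders, not the setup used in the table:

```python
import torch
import torch.nn as nn

# Toy stand-in for a small encoder-style model with a classification head.
class TinyEncoderClassifier(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x):
        h = self.encoder(self.embed(x))
        return self.head(h.mean(dim=1))  # mean-pool over the token dimension

model = TinyEncoderClassifier()

# "Full finetuning": every parameter stays trainable.
assert all(p.requires_grad for p in model.parameters())

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
x = torch.randint(0, 1000, (12, 16))   # batch size 12, as in the table above
y = torch.randint(0, 2, (12,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
```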
-
@rasbt, I would be curious to know which model gave you the 94.9% while only finetuning the last few layers. On a different note: when working with the Hugging Face ecosystem, what would be the pros and cons of using Hugging Face's Trainer class vs. the Lightning Trainer? Thanks again for all the good content!
-
Updating my table here with
** Ran on an A100 instead of a T4, but without mixed precision (I was not sure whether it was compatible). The comparison would have been more satisfying had I used siebert's version of RoBERTa... but nice to see that LoRA seems to deliver some value even outside of super-large language models! 😃
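For anyone curious what LoRA looks like mechanically, here is a minimal from-scratch sketch of a low-rank adapter wrapped around one frozen linear layer — the rank, alpha, and layer sizes are illustrative, not the configuration used in the experiments above:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update: W·x + (alpha/r)·B·A·x."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / {total}")  # only the low-rank A and B matrices train
```

Because `B` is zero-initialized, the wrapped layer starts out exactly equal to the frozen base layer, which is part of why LoRA finetuning tends to be stable.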
-
Sharing results from some of the experiments that I did.
@rasbt I couldn't understand why the full-layer training of
-
Only training the last two layers:
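A minimal sketch of the "last two layers only" setup — freeze everything, then unfreeze the final blocks. The model below is a generic placeholder, not the actual architecture from the results:

```python
import torch
import torch.nn as nn

# Toy stand-in: a stack of layers ending in a classification head.
model = nn.Sequential(
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 32), nn.ReLU(),
    nn.Linear(32, 32),   # second-to-last layer: trainable
    nn.Linear(32, 2),    # classification head: trainable
)

# Freeze everything first, then unfreeze the last two submodules.
for p in model.parameters():
    p.requires_grad = False
for layer in list(model.children())[-2:]:
    for p in layer.parameters():
        p.requires_grad = True

# Only hand the still-trainable parameters to the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the last two Linear layers' weights and biases
```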
-
My best model achieved 94.9% test accuracy (I don't want to spoil which one yet) 😊