This repository contains the code for our project in the course IFT6760B: Neural Scaling Laws and Foundation Models, taught in the Winter 2022 semester at Mila/University of Montreal.
The slides for our presentation, titled "On Layer Normalization for Vision Transformers", can be found here.
The goal of our project was to study the effect of the PreNorm and PostNorm variants of the Vision Transformer (ViT). While the literature contains several studies comparing Pre- and Post-Norm versions of the vanilla transformer on language data, a comparable analysis for vision data using ViTs is lacking. We trained ViT models from scratch on four datasets: CIFAR10, CIFAR100, Imagenette, and Imagewoof.
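For reference, the two variants differ only in where LayerNorm sits relative to the residual connection in each transformer block. Below is a minimal PyTorch sketch of the two block types; it is illustrative only and not necessarily the exact implementation in this repository (the module names and hyperparameters here are our own for the example).

```python
import torch
import torch.nn as nn

class PreNormBlock(nn.Module):
    """PreNorm: LayerNorm is applied *before* each sub-layer,
    so the residual path stays un-normalized."""
    def __init__(self, dim, num_heads, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x

class PostNormBlock(nn.Module):
    """PostNorm: LayerNorm is applied *after* the residual addition,
    as in the original "Attention Is All You Need" transformer."""
    def __init__(self, dim, num_heads, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x):
        x = self.norm1(x + self.attn(x, x, x, need_weights=False)[0])
        x = self.norm2(x + self.mlp(x))
        return x
```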
The idea for this project was primarily inspired by this paper.