naga-karthik/nsl-project

Neural Scaling Laws Project

This repository contains the code for our project in the course IFT6760B: Neural Scaling Laws and Foundation Models, taught in the Winter 2022 semester at Mila/University of Montreal.

The slides for our presentation, titled "On Layer Normalization for Vision Transformers", can be found here.

The goal of our project was to understand how the Pre-Norm and Post-Norm placements of layer normalization affect the Vision Transformer (ViT). While the literature contains some studies of the Pre- and Post-Norm versions of the vanilla transformer applied to language data, a similar analysis for vision data using ViTs is lacking. We used four datasets (CIFAR10, CIFAR100, Imagenette, and Imagewoof) and trained the models on each of them from scratch.
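
For readers unfamiliar with the two variants, the difference lies in where layer normalization sits relative to the residual connection. The following is a minimal, dependency-free sketch of the commonly used formulation, not the code from this repository: Post-Norm computes `LayerNorm(x + sublayer(x))`, while Pre-Norm computes `x + sublayer(LayerNorm(x))`. The names `layer_norm`, `post_norm`, `pre_norm`, and `toy_sublayer` are hypothetical, and the toy sublayer stands in for the attention or MLP block of a real ViT. The learned scale and shift of layer normalization are omitted for brevity.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a feature vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def post_norm(x, sublayer):
    """Post-Norm residual step: LayerNorm(x + sublayer(x))."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def pre_norm(x, sublayer):
    """Pre-Norm residual step: x + sublayer(LayerNorm(x))."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

def toy_sublayer(x):
    # Hypothetical stand-in for attention or the MLP: it only scales its input.
    return [0.5 * v for v in x]

x = [1.0, 2.0, 3.0, 4.0]
post_out = post_norm(x, toy_sublayer)  # output is normalized
pre_out = pre_norm(x, toy_sublayer)    # residual path is left unnormalized
```

In the Post-Norm step the output itself is normalized, so the residual stream is rescaled at every block. In the Pre-Norm step only the sublayer input is normalized, so the original `x` passes through the skip connection unchanged.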

The idea for this project was primarily inspired by this paper.
