SeekDeeper: Minimal Implementations of Popular AI Models

[📖 Chinese README]

Motivation

Official code repositories often include many engineering details that can overwhelm beginners. Most tutorials, in turn, cover only the model itself and leave out data loading and training, so beginners struggle to apply what they have learned. This repository implements a range of models in PyTorch with as little code as possible and with a complete workflow, so that learners can understand the models and reproduce the results more easily.

Models

| Model | Paper | Official or Reference Repository |
|---|---|---|
| Transformer | Attention Is All You Need | https://github.com/hyunwoongko/transformer |
| GPT | Improving Language Understanding by Generative Pre-Training | https://github.com/openai/finetune-transformer-lm, https://github.com/openai/gpt-2, https://github.com/karpathy/nanoGPT, https://github.com/karpathy/minGPT |
| GPT-2 | Language Models are Unsupervised Multitask Learners | (shares the repositories listed for GPT) |
| ViT | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Huggingface ViT implementation |
| GAN | Generative Adversarial Networks | https://github.com/goodfeli/adversarial |
| DCGAN | Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks | https://github.com/Newmu/dcgan_code |
| WGAN-GP | Improved Training of Wasserstein GANs | https://github.com/igul222/improved_wgan_training |

Directory Structure

For each model, the typical directory structure is as follows:

<model name>/
├── checkpoints/
├── modules/
├── datasets/
├── images/
├── README.md
├── data.py
├── config.py
├── train.ipynb
└── inference.ipynb
  • checkpoints/: Contains pre-trained model weights for direct use in inference.ipynb. Sometimes, pre-trained parameters from official repositories are loaded directly.
  • modules/: Contains modules necessary for model implementation.
  • datasets/: Contains the datasets required for training or for validating inference; in some cases the code downloads them into this directory.
  • images/: Contains the images used in this model's README.md.
  • README.md: Introduces the implemented task and describes the implementation details.
  • data.py: Defines the Dataset, DataLoader, or data preprocessing.
  • config.py: Defines hyperparameters needed for the experiment.
  • train.ipynb: Walks through the full process: data loading, preprocessing, training, and evaluation.
  • inference.ipynb: Loads model parameters from the checkpoints/ directory for inference.
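
To make the role of config.py more concrete, here is a minimal sketch of what such a file might look like. It is not taken from any model in this repository: the `Config` class, its field names, the default values, and the `checkpoint_path` helper are all hypothetical, chosen only to show hyperparameters and the `datasets/` and `checkpoints/` directories gathered in one place.

```python
# config.py (illustrative sketch; every name and value below is an assumption)
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class Config:
    # Hyperparameters (example values, not the repository's settings).
    batch_size: int = 64
    learning_rate: float = 3e-4
    num_epochs: int = 10
    seed: int = 42

    # Directories from the layout above, relative to the model folder.
    dataset_dir: Path = Path("datasets")
    checkpoint_dir: Path = Path("checkpoints")

    def checkpoint_path(self, epoch: int) -> Path:
        """Return the file where weights for the given epoch would be stored."""
        return self.checkpoint_dir / f"epoch_{epoch:03d}.pt"


# A single shared instance that notebooks could import.
config = Config()

if __name__ == "__main__":
    # Print the settings and one example checkpoint location.
    print(asdict(config))
    print(config.checkpoint_path(3))
```

Under these assumptions, train.ipynb and inference.ipynb could both run `from config import config` and read the same values. Freezing the dataclass keeps a notebook cell from silently changing a hyperparameter halfway through an experiment.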

License

This project uses the MIT License.