NeurIPS

Quantization

  • Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples
  • Post-Training Quantization for Vision Transformer
  • Post-Training Sparsity-Aware Quantization
  • BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer

Pruning

  • Pruning Randomly Initialized Neural Networks with Iterative Randomization
  • Rethinking the Pruning Criteria for Convolutional Neural Network

CVPR

  • QPP: Real-Time Quantization Parameter Prediction for Deep Neural Networks
  • Automated Log-Scale Quantization for Low-Cost Deep Neural Networks
  • AQD: Towards Accurate Quantized Object Detection
  • Diversifying Sample Generation for Accurate Data-Free Quantization
  • Learnable Companding Quantization for Accurate Low-bit Neural Networks
  • Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks
  • Convolutional Neural Network Pruning with Structural Redundancy Reduction
  • Joint-DetNAS: Upgrade Your Detector with NAS, Pruning and Dynamic Distillation
  • Manifold Regularized Dynamic Network Pruning
  • Network Pruning via Performance Maximization
  • Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework
  • Towards Compact CNNs via Collaborative Compression
  • Content-Aware GAN Compression
  • Bi-GCN: Binary Graph Convolutional Network
  • Binary Graph Neural Networks

ICML

Pruning

  • Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework
  • SparseBERT: Rethinking the Importance Analysis in Self-attention
  • Group Fisher Pruning for Practical Network Compression
  • A Probabilistic Approach to Neural Network Pruning

Quantization

  • Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution
  • HAWQ-V3: Dyadic Neural Network Quantization
  • I-BERT: Integer-only BERT Quantization
  • Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
  • Training Quantized Neural Networks to Global Optimality via Semidefinite Programming

ICLR

Pruning

  • A Gradient Flow Framework for Analyzing Network Pruning

Quantization

  • Degree-Quant: Quantization-Aware Training for Graph Neural Networks
  • Training With Quantization Noise for Extreme Model Compression
  • BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
  • Neural Gradients Are Near-lognormal: Improved Quantized and Sparse Training
  • Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks
  • BiPointNet: Binary Neural Network for Point Clouds
  • Faster Binary Embeddings for Preserving Euclidean Distances
  • Growing Efficient Deep Networks by Structured Continuous Sparsification
  • CPT: Efficient Deep Neural Network Training Via Cyclic Precision

Distillation

  • MixKD: Towards Efficient Distillation of Large-scale Language Models
  • Knowledge Distillation As Semiparametric Inference
  • A Teacher-student Framework to Distill Future Trajectories
  • Is Label Smoothing Truly Incompatible with Knowledge Distillation: an Empirical Study
  • Rethinking Soft Labels for Knowledge Distillation: a Bias-variance Tradeoff Perspective
  • Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks
  • Knowledge Distillation Via Softmax Regression Representation Learning