- ServiceNow Research
- Canada
- https://ehsk.github.io
- @ehsk0
Stars
Recipes to scale inference-time compute of open models
verl: Volcano Engine Reinforcement Learning for LLMs
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Fully open reproduction of DeepSeek-R1
Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models
A bibliography and survey of the papers surrounding o1
TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle
🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python…
A blazing fast inference solution for text embeddings models
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
🌎💪 BrowserGym, a Gym environment for web task automation
Firefly III: a personal finances manager
A Comprehensive Assessment of Trustworthiness in GPT Models
Ranger helps you see the forest among the trees - Ranger is an effect-size meta analysis library creating beautiful forest plots!
The hub for EleutherAI's work on interpretability and learning dynamics
Utilities for decoding deep representations (like sentence embeddings) back to text
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
List of papers on hallucination detection in LLMs.
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
QLoRA: Efficient Finetuning of Quantized LLMs
Train transformer language models with reinforcement learning.