Triton Inference Server
What's Changed
- Add Support for Metrics While Benchmarking on Neuron and EC2 by @dheerajoruganty in #193
- Fix to correctly parse response in message API format for SageMaker inference by @fespigares in #165
- Tagging by @antara678 in #188
- adding platform identification for deployment by @madhurprash in #187
- update llama2 7b quick file by @madhurprash in #198
- Bug fix for evals by @madhurprash in #196
- Update config-ec2-llama3-8b.yml by @antara678 in #199
- Triton integration by @madhurprash in #194
New Contributors
- @fespigares made their first contribution in #165
Full Changelog: v2.0.6...v2.0.7