Releases: onnx/turnkeyml
v6.0.0
Summary
This is a major release that introduces an OpenAI-compatible server in a completely new serve tool, support for Quark quantization in the new quark tool, and many other fixes/improvements.
Breaking Changes
New OpenAI-Compatible Server
The previous serve tool has been replaced by a new standalone serving command. This new server has OpenAI API compatibility and will add Ollama compatibility in the near future.
- Old usage: lemonade -i CHECKPOINT oga-load --args serve
- New usage: lemonade serve, then use REST APIs to control model loading, completions, etc. See https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/server_spec.md to learn more.
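Once the server is running, any OpenAI-compatible client should be able to talk to it. The sketch below uses the openai Python package; the base URL, port, and model name are assumptions for illustration only, so consult the server spec linked above for the actual endpoints:

# Minimal sketch: point an OpenAI client at a locally running `lemonade serve`.
# The base_url, port, and model name are assumptions, not taken from these notes.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/api/v0", api_key="none")  # local server; key is unused
completion = client.chat.completions.create(
    model="CHECKPOINT",  # placeholder model identifier
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)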
The server can also be installed and used with no code by running Lemonade_Server_Installer.exe, which is provided as a release asset in this and all future releases.
The server code was also moved out of tools/chat.py into its own file in tools/serve.py. We also renamed chat.py to prompt.py for clarity, since that file now only contains the prompting tool.
The LEAP name has been deprecated
In the interest of reducing naming confusion, the "LEAP API" is now simply the "high-level lemonade API".
- Old usage: from lemonade.leap import from_pretrained
- New usage: from lemonade.api import from_pretrained
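For example, loading and prompting a model through the renamed API might look like the following minimal sketch; the checkpoint name, the recipe argument, and the generate/decode calls are illustrative assumptions rather than something specified in these notes:

# Minimal sketch of the high-level lemonade API after the rename.
# The checkpoint, recipe, and generation settings are assumptions for illustration.
from lemonade.api import from_pretrained

model, tokenizer = from_pretrained("facebook/opt-125m", recipe="hf-cpu")
input_ids = tokenizer("Hello, my name is", return_tensors="pt").input_ids
response = model.generate(input_ids, max_new_tokens=30)
print(tokenizer.decode(response[0]))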
Summary of Contributions
- The base checkpoint for models is retrieved from the Hugging Face API at loading time (@ramkrishna2910)
- The benchmarking tools (huggingface-bench, oga-bench, and llamacpp-bench) have been refactored to reduce code duplication and improve maintainability. They now also support a list of prompts (or prompt lengths) to be benchmarked, e.g. --prompts 128 256 512 (see the example after this list) (@amd-pworfolk)
- The avg_accuracy stat has been renamed to average_mmlu_accuracy for clarity with respect to non-MMLU accuracy tests (@jeremyfowers) (attn @apsonawane)
- Introduce Lemonade_Server_Installer.exe (@jeremyfowers)
- Implement an OpenAI-compatible server and remove the old serve tool (@danielholanda)
- Rename the chat module to prompt (@jeremyfowers)
- Improve the lemonade getting started documentation and remove the "LEAP" branding (@jeremyfowers)
- OGA 0.6.0 is the default package for CPU, CUDA, and DML (@jeremyfowers)
- Add support for Quark quantization with a new quark-quantize tool (@iswaryaalex)
- Clean up the lemonade getting started docs and remove some deprecated tools (@jeremyfowers)
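As an example of the new multi-prompt benchmarking, a sweep over several prompt lengths could look like the line below. This is a minimal sketch: the CHECKPOINT placeholder and the choice of oga-load followed by oga-bench are assumptions about a typical tool sequence, not commands quoted from this release.

lemonade -i CHECKPOINT oga-load oga-bench --prompts 128 256 512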
New Contributors
- @iswaryaalex made their first contribution in #290
Full Changelog: v5.1.1...v6.0.0
v5.1.1
What's Changed
- Fix broken lemonade link by @jeremyfowers in #278
- Update getting_started.md by @jeremyfowers in #282
- Avoid lemonade build cache collisions (@jeremyfowers).
  - All builds are now placed under <cache_dir>/builds/<build_name> instead of <cache_dir>/<build_name>
    - This creates a more hierarchical cache structure, where builds are peers to models and data.
  - All build names now include a timestamp
    - This ensures that build stats and logs will not collide with each other if we build the same model in the same cache, but with different parameters.
  - Revs the minor version number because all previous caches are invalidated.
- Enable ONNX model download for cpu and igpu in oga-load (@jeremyfowers)
- Improvements to memory tracking (@amd-pworfolk)
- Improve OGA testing (@jeremyfowers).
  - Run the server test last, since it is the most complex and has the worst telemetry
  - Stop deleting the entire cache directory between every test, since that deletes the model builder cache. Instead, just delete the cache/builds directory.
- Add average mmlu accuracy by @apsonawane in #287
- Update OGA LEAP recipes by @jeremyfowers in #289
Full Changelog: v5.0.5...v5.1.1
v5.0.5
What's Changed
- Early preview of new server interface by @danielholanda in #277
Full Changelog: v5.0.4...v5.0.5
v5.0.4
v5.0.3
What's Changed
- Bring OGA under test and fix OGA server. Improve llm-prompt. by @jeremyfowers in #272
- Always move HF tokenizer encodings to the target device by @jeremyfowers in #274
- Release v5.0.3: Lemonade installer and examples, repo reorg, and lots more by @jeremyfowers in #275
- Docs, tests, and examples have been moved into turnkey (CNNs and Transformers) vs. lemonade (LLMs) directories (@jeremyfowers)
  - For example: docs/lemonade/getting_started.md instead of docs/lemonade_getting_started.md
- Track the memory utilization of any lemonade or turnkey command and plot it on a graph by setting the --memory option (@amd-pworfolk).
- Add examples and demo applications for the high-level LEAP APIs in examples/lemonade (@jeremyfowers).
- Add LEAP support for all OGA backends (@jeremyfowers).
- Extend the llm-prompt tool to make it more useful for model and framework validation (@amd-pworfolk).
- Updates and fixes to lemonade test code in llm_api.py (@jeremyfowers).
- Fix not_enough_tokens bug on oga-bench (@danielholanda).
Full Changelog: v5.0.2...v5.0.3
v5.0.2
What's Changed
Re-issuing v5.0.1 to fix a pypi release bug.
- Moving HumanEval to pypi (@ramkrishna2910)
- Adds std dev for oga-bench (@amd-pworfolk)
- Updates build status monitor to change update frequency (@danielholanda)
- Fix linter issue (@ramkrishna2910)
- Fix llama.cpp issue introduced by their breaking change (@jeremyfowers)
- Polish llama.cpp implementation (@ramkrishna2910)
- Minor changes fixing onnxruntime_genai issue and input_path by @apsonawane in #267
New Contributors
- @apsonawane made their first contribution in #267
Full Changelog: v5.0.0...v5.0.2
v5.0.1
What's Changed
- Moving HumanEval to pypi (@ramkrishna2910)
- Adds std dev for oga-bench (@amd-pworfolk)
- Updates build status monitor to change update frequency (@danielholanda)
- Fix linter issue (@ramkrishna2910)
- Fix llama.cpp issue introduced by their breaking change (@jeremyfowers)
- Polish llama.cpp implementation (@ramkrishna2910)
- Minor changes fixing onnxruntime_genai issue and input_path by @apsonawane in #267
New Contributors
- @apsonawane made their first contribution in #267
Full Changelog: v5.0.0...v5.0.1
v5.0.0
What's Changed
- Improve documentation and LLM status clarity by @jeremyfowers in #261
- Move llm source code into src/lemonade dir. Add HumanEval. by @jeremyfowers in #262
- Adds llamacpp benchmarking support by @ramkrishna2910 in #263
Full Changelog: v4.0.11...v5.0.0
v4.0.11
What's Changed
- Hotfix: monitor progress bug by @jeremyfowers in #259
Full Changelog: v4.0.10...v4.0.11
v4.0.10
What's Changed
- Update ort_genai_hybrid.md by @jeremyfowers in #256
- Standardize Timestamps to Fixed Time Zone in TKML Runs by @danielholanda in #257
- Allow tools to display percent progress in the monitor by @jeremyfowers in #258
Full Changelog: v4.0.9...v4.0.10