Skip to content

Releases: onnx/turnkeyml

v6.0.0

27 Feb 20:35
Compare
Choose a tag to compare

Summary

This is a major release that introduces an OpenAI-compatible server in a completely new serve tool, support for Quark quantization in the new quark tool, and many other fixes/improvements.

Breaking Changes

New OpenAI-Compatible Server

The previous serve Tool has been replaced by a new standalone serving command. This new server has OpenAI API compatibility and will add Ollama compatibility in the near future.

The server can also be installed and used with no-code by running Lemonade_Server_Installer.exe, which is provided as a release asset in this and all future releases.

The server code was also moved out of tools/chat.py into its own file in tools/serve.py. We also renamed chat.py to prompt.py for clarity, since that file now only contains the prompting tool.

The LEAP name has been deprecated

In the interest of reducing naming confusion, the "LEAP API" is now simply the "high-level lemonade API".

  • Old usage: from lemonade.leap import from_pretrained
  • New usage: from lemonade.api import from_pretrained

Summary of Contributions

  • The base checkpoint for models is retrieved from the Hugging Face API at loading time (@ramkrishna2910)
  • The benchmarking tools (huggingface-bench, oga-bench, and llamacpp-bench) have been refactored to reduce code duplication and improve maintainability. They now also support a list of prompts (or prompt lengths) to be benchmarked: --prompts 128 256 512 (@amd-pworfolk)
  • The avg_accuracy stats has been renamed to average_mmlu_accuracy for clarity with respect to non-MMLU accuracy tests (@jeremyfowers), (attn @apsonawane)
  • Introduce Lemonade_Server_Installer.exe (@jeremyfowers)
  • Implement an OpenAI-compatible server and remove the old serve tool (@danielholanda)
  • Rename chat module to prompt (@jeremyfowers)
  • Improved lemonade getting started documentation and remove the "LEAP" branding (@jeremyfowers)
  • OGA 0.6.0 is the default package for CPU, CUDA, and DML (@jeremyfowers)
  • Add support for Quark quantization with a new quark-quantize tool (@iswaryaalex)
  • Clean up the lemonade getting started docs and remove some deprecated tools (@jeremyfowers)

New Contributors

Full Changelog: v5.1.1...v6.0.0

v5.1.1

06 Feb 19:31
b805839
Compare
Choose a tag to compare

What's Changed

  • Fix broken lemonade link by @jeremyfowers in #278
  • Update getting_started.md by @jeremyfowers in #282
  • Avoid lemonade build cache collisions (@jeremyfowers).
    • All builds are now placed under <cache_dir>/builds/<build_name> instead of <cache_dir>/<build_name>
      • This creates a more hierarchical cache structure, where builds are peer to models and data.
    • All build names now include a timestamp
      • This ensures that build stats and logs will not collide with each other if we build the same model in the same cache, but with different parameters.
    • Revs the minor version number because all previous caches are invalidated.
  • Enable ONNX model download for cpu and igpu in oga-load (@jeremyfowers)
  • Improvements to memory tracking (@amd-pworfolk)
  • Improve OGA testing (@jeremyfowers).
    • Run the sever test last, since it is the most complex and has the worst telemetry
    • Stop deleting the entire cache directory between every test, since that deletes the model builder cache. Instead, just delete the cache/builds directory.
  • Add average mmlu accuracy by @apsonawane in #287
  • Update OGA LEAP recipes by @jeremyfowers in #289

Full Changelog: v5.0.5...v5.1.1

v5.0.5

31 Jan 19:55
4c19a41
Compare
Choose a tag to compare

What's Changed

Full Changelog: v5.0.4...v5.0.5

v5.0.4

29 Jan 22:21
81708d4
Compare
Choose a tag to compare

What's Changed

Full Changelog: v5.0.3...v5.0.4

v5.0.3

28 Jan 17:52
b9e6223
Compare
Choose a tag to compare

What's Changed

  • Bring OGA under test and fix OGA server. Improve llm-prompt. by @jeremyfowers in #272
  • Always move HF tozenizer encodings to the target device by @jeremyfowers in #274
  • Release v5.0.3: Lemonade installer and examples, repo reorg, and lots more by @jeremyfowers in #275
    • Docs, test, and examples have been moved into turnkey (CNNs and Transformers) vs. lemonade (LLMs) directories (@jeremyfowers)
    • For example: docs/lemonade/getting_started.md instead of docs/lemonade_getting_started.md
    • Track the memory utilization of any lemonade or turnkey command and plot it on a graph by setting the --memory option (@amd-pworfolk).
    • Add examples and demo applications for the high-level LEAP APIs in examples/lemonade (@jeremyfowers).
    • Add LEAP support for all OGA backends (@jeremyfowers).
    • Extend the llm-prompt tool to make it more useful for model and framework validation (@amd-pworfolk).
    • Updates and fixes to lemonade test code in llm_api.py (@jeremyfowers).
    • Fix not_enough_tokens bug on oga-bench (@danielholanda).

Full Changelog: v5.0.2...v5.0.3

v5.0.2

16 Jan 00:41
bc33e79
Compare
Choose a tag to compare

What's Changed

Re-issuing v5.0.1 to fix a pypi release bug.

New Contributors

Full Changelog: v5.0.0...v5.0.2

v5.0.1

16 Jan 00:04
c107c2a
Compare
Choose a tag to compare

What's Changed

New Contributors

Full Changelog: v5.0.0...v5.0.1

v5.0.0

13 Jan 17:20
4e7450d
Compare
Choose a tag to compare

What's Changed

Full Changelog: v4.0.11...v5.0.0

v4.0.11

06 Jan 22:22
21a5c74
Compare
Choose a tag to compare

What's Changed

Full Changelog: v4.0.10...v4.0.11

v4.0.10

06 Jan 20:07
2b9a783
Compare
Choose a tag to compare

What's Changed

Full Changelog: v4.0.9...v4.0.10