Releases: tjake/Jlama

v0.8.2

05 Nov 04:37
33e4958

What's Changed

  • Better json detection and extraction from strings by @tjake in #105

Full Changelog: v0.8.1...v0.8.2

v0.8.1

04 Nov 04:05

What's Changed

  • Add missing F32 Q4 operations for arm by @tjake in #103

Full Changelog: v0.8.0...v0.8.1

v0.8.0

30 Oct 00:45

What's Changed

  • Adds Downloader class to enable fluent coding with different options by @lordofthejars in #87
  • chore: update JlamaRingWorkerService.java by @eltociear in #90
  • Gemma2 classifier sample by @tjake in #92
  • Add utf-8 charset to ui by @LuccaPrado in #96
  • Fix qwen tokenizer since they use more than one by @tjake in #99
  • Add support for granite models by @mariofusco in #98
  • Keep kv-cache to the actual data stays on disk but mmap is removed af… by @tjake in #100

Full Changelog: v0.7.0...v0.8.0

v0.7.0

21 Oct 02:53

What's Changed

  • Rest http port fix by @tjake in #76
  • Overloads download method to specify progress by @lordofthejars in #71
  • Set the max tokens based on the model and fix temp for now by @tjake in #77
  • Creates ProgressReporter interface instead of functional interface by @lordofthejars in #80
  • Add Qwen2 support and fix bug with small models using I8Q4 by @tjake in #82
  • Add support for gemma-2 models by @tjake in #83

Full Changelog: v0.6.0...v0.7.0

v0.6.0

16 Oct 01:51

What's Changed

  • Adds split by layer for distributed inference by @tjake in #60
  • #62 global preview feature description by @AdamBien in #63
  • Add Support for Q4 Embeddings and vector cleanup by @tjake in #70

Full Changelog: v0.5.0...v0.6.0