Releases · h2oai/h2ogpt
h2oGPT 0.2.1 Release
Official Release for h2oGPT 0.2.1
What's Changed
- Update Mac One click installer Mar 08, 2024 by @Mathanraj-Sharma in #1456
- Update gradio constraint to 4.20.1 by @Mathanraj-Sharma in #1457
- General chat_template handling + Sealion by @pseudotensor in #1463
- Update linux install script by @Mathanraj-Sharma in #1462
- CohereForAI/c4ai-command-r-v01 by @pseudotensor in #1467
- Set OpenAI proxy port to 5001 for MacOS by @Mathanraj-Sharma in #1468
- Do map[0] instead of map_reduce if all fits into context by @pseudotensor in #1475
- Image change style by @pseudotensor in #1444
- Fix history with images by @pseudotensor in #1479
- Support claude-3 as vision models by @pseudotensor in #1480
- Add gpt-4-vision support as vision model by @pseudotensor in #1481
- Support gemini-vision-pro as vision model by @pseudotensor in #1482
- [DOCS] Correct typos in FAQ and improve readability by @zainhaq-h2o in #1487
- Google auth by @pseudotensor in #1486
- Handle multiple images for gpt4-vision-preview, gemini-pro-vision and claude-3's by @pseudotensor in #1490
- [DOCS] Fix typos on Links page by @zainhaq-h2o in #1488
- clean-up use of grclient by @pseudotensor in #1493
- feat: Qdrant vector store by @Anush008 in #1460
- [Snyk] Fix for 3 vulnerabilities by @smg478 in #1498
- [Snyk] Fix for 15 vulnerabilities by @smg478 in #1501
- Fix trust by @pseudotensor in #1505
- Makelist by @pseudotensor in #1511
- Pass number of prompt tokens and prompt_raw back by @pseudotensor in #1514
- Handle multiple images for llava by @pseudotensor in #1516
- fix: Ignore Qdrant scroll offset gpt_langchain.py by @Anush008 in #1524
- JSON mode by @pseudotensor in #1527
- Gradio 4.25.0 by @pseudotensor in #1510
- Fix grounded template token counting by @pseudotensor in #1533
- Fix llava token counting by @pseudotensor in #1534
- Fixdocker by @pseudotensor in #1538
- Update docker_build_script_ubuntu.sh by @achraf-mer in #1541
- Check and version by @pseudotensor in #1542
- Back to gradio 4.20.1, 4.25.0 really bad in terms of speed and overall stressed performance. Eventually hangs server too easily by @pseudotensor in #1546
- Restore gradio 4.26.0 but no heartbeat by @pseudotensor in #1562
- Repair json if required, also pass back raw response without extraction by @pseudotensor in #1568
- Faster auth access using sqlite3 instead of full json load/change every minor operation by @pseudotensor in #1569
- Stream in async for summary/extract by @pseudotensor in #1575
- Isolate JSON prompts so can change language etc. by @pseudotensor in #1581
- Ensure llama-3 or other chat template based models handled by @pseudotensor in #1588
- Clean-up stopping to avoid hard-coded things for llama-3 as it was fixed 11 days ago. by @pseudotensor in #1590
- remove vllm-check/tgi-check init-container by @robinliubin in #1605
- Together.ai support and remove old chroma migration by @pseudotensor in #1607
- [HELM] Fixes - Add Args when running h2oGPT only by @EshamAaqib in #1610
- Improve split and merge by @pseudotensor in #1612
- [DOCS] Minor FAQ improvements by @zainhaq-h2o in #1613
- At least provide rules even if no schema by @pseudotensor in #1620
- Add OpenAI Proxy TTS by @pseudotensor in #1621
- set podSecurityContext to null, so umbrella can overwrite on openshift by @robinliubin in #1618
- guided_whitespace_pattern by @pseudotensor in #1625
- OpenAI proxy STT by @pseudotensor in #1622
- Refactor gradio tools to isolate non-gradio functions. Fix audio streaming for TTS through OpenAI. WIP for direct OpenAI nochat call without gradio. by @pseudotensor in #1543
- Add support for idefics2 vision model via TGI client by @pseudotensor in #1629
- Put file lock as deep as possible to avoid over locking by @pseudotensor in #1640
- Function server by @pseudotensor in #1641
- Use gunicorn so dead workers restart unlike uvicorn by @pseudotensor in #1645
- Cogvlm2 by @pseudotensor in #1651
- Fix asyncio sglang use by @pseudotensor in #1654
- Repair json work around by @pseudotensor in #1658
- Add function calling for mistralai for better json mode by @pseudotensor in #1659
New Contributors
Full Changelog: 0.2.0...0.2.1
h2oGPT 0.2.0 Release
Official Release for h2oGPT 0.2.0
What's Changed
- Add code to push spaces chatbot by @pseudotensor in #46
- Fixes #48 by @pseudotensor in #55
- More HF spaces restrictions to prevent OOM or no-good choices being chosen by @pseudotensor in #57
- Add max_beams to client_test.py by @lo5 in #64
- Fix directory name from h2o-llm to h2ogpt on install tutorial by @cpatrickalves in #63
- h2o theme for background by @jefffohl in #68
- Add option to save prompt and response as .json. by @arnocandel in #69
- Update tos.md by @eltociear in #70
- Use SAVE_DIR and --save_dir instead of SAVE_PATH and --save_path. by @arnocandel in #71
- Make chat optional from UI/client by @pseudotensor in #74
- Compare models by @pseudotensor in #42
- H2O gradio theme by @jefffohl in #84
- Refactor gradio into separate file and isolate it from torch specific stuff by @pseudotensor in #85
- Refactor finetune so some of it can be used to check data and its tokenization by @pseudotensor in #93
- Llama flash attn by @arnocandel in #86
- Give default context to help chatbot by @pseudotensor in #100
- CUDA mismatch work-around for no gradio case by @pseudotensor in #101
- Add Triton deployment template. by @arnocandel in #91
- Check data for unhelpful responses by @pseudotensor in #103
- Clear torch cache memory every 20s by @pseudotensor in #90
- Try transformers experimental streaming. Still uses threads, so probably won't fix browser exit GPU memory issue by @pseudotensor in #98
- Handle thread stream generate exceptions. by @pseudotensor in #110
- Specify chat separator by @pseudotensor in #114
- [DOCS] README typo fix and readability improvements by @zainhaq-h2o in #118
- Support OpenAssistant models in basic form, including 30B xor one by @pseudotensor in #119
- Add stopping condition to pipeline case by @pseudotensor in #120
- Allow auth control from CLI by @pseudotensor in #123
- Improve data prep by @arnocandel in #122
- [DOCS] Grammar / readability improvements for FAQ.md by @zainhaq-h2o in #124
- neox Flash attn by @arnocandel in #31
- Langchain integration by @pseudotensor in #111
- Allow CLI add to db and clean-up handling of evaluate args by @pseudotensor in #137
- Add zip upload and parallel doc processing by @pseudotensor in #138
- Control visibility of buttons, but still gradio issues mean can't spin/block button while processing in background by @pseudotensor in #140
- Add URL support by @pseudotensor in #142
- HTML, DOCX, and better markdown support by @pseudotensor in #143
- odt, pptx, epub, UI text paste, eml support (both text/html and text/plain) and refactor so glob simpler by @pseudotensor in #144
- Reform chatbot client API code by @pseudotensor in #117
- Add import control check to avoid leaking optional langchain stuff into generate/gradio. Add test by @pseudotensor in #146
- [DevOps] Snyk Integration by @ChathurindaRanasinghe in #131
- Add image support and show sources after upload by @pseudotensor in #147
- Update finetune.py by @orellavie1212 in #132
- ArXiv support via URL in chatbot UI by @pseudotensor in #152
- Improve caption, include blip2 as option by @pseudotensor in #153
- Control chats, save, export, import and otherwise manage by @pseudotensor in #156
- Mac/Windows install and GPT4All as base model for pure CPU mode support by @pseudotensor in #157
- Move loaders out of finetune, which is only for training, while loader used for generation too by @pseudotensor in #161
- Allow selection of subset of docs in collection for query by @pseudotensor in #163
- Improve datasource layout by @pseudotensor in #164
- Refactor run_qa_db a bit, so can do other tasks by @pseudotensor in #167
- Use latest peft/transformers/accelerate/bitsandbytes for 4-bit (qlora) by @arnocandel in #166
- Refactor out run_eval out of generate.py by @pseudotensor in #173
- Add CLI mode with tests by @pseudotensor in #174
- Separate out FAISS from requirements by @pseudotensor in #184
- Generalize h2oai_pipeline so works for any instruct model we have prompt_type for, so run_db_qa will stream and stop just like non-db code path by @pseudotensor in #190
- Ensure can use offline by @pseudotensor in #191
- Fix and test llamacpp by @pseudotensor in #197
- Improve use of ctx vs. max_new_tokens for non-HF models, and if no docs, don't insert == since no docs, just confuses model by @pseudotensor in #199
- UI help in FAQ by @pseudotensor in #205
- Quantized model updates, switch to recommending TheBloke by @pseudotensor in #208
- Fix nochat API by @pseudotensor in #209
- Move docs and optional reqs to directories by @pseudotensor in #214
- Allow for custom eval json file by @pseudotensor in #227
- Fix run_eval and validate parameters are all passed by @pseudotensor in #228
- Add setup.py wheel building option by @pseudotensor in #229
- [DevOps] Fix condition issue for snyk test & snyk monitor by @ChathurindaRanasinghe in #169
- Add weaviate support by @hsm207 in #218
- More weaviate tests by @pseudotensor in #231
- Allow add to db when loading from generate by @pseudotensor in #212
- Allow update db from UI if files changed, since normally not constantly checking for new files by @pseudotensor in #232
- More control over max_max_new_tokens and memory behavior from generate args by @pseudotensor in #234
- Make API easier, and add prompt_dict for custom control over prompt as example of new API parameter don't need to pass by @pseudotensor in #238
- Chunk improve by @pseudotensor in #239
- Fix `TypeError: can only concatenate str (not "list") to str` on startup by @this in #242
- Fix nochat in UI so enter works to submit again, and if langchain mode used then shows HTML links for sources by @pseudotensor in #244
- Improve subset words and code by @pseudotensor in #245
- use instructor embedding, and add migration of embeddings if ever changes, at least for chroma by @pseudotensor in #247
- Add extra clear torch cache calls so embedding on GPU doesn't stick to GPU by @pseudotensor in #252
- Fixes #249 by @pseudotensor in #255
- Support connecting to a local weaviate instance by @hsm207 in #236
- .gitignore updated for .idea and venv by @fazpu in #256
- move enums and add test for export copy since keep changing what files have what structures by @pseudotensor in #260
- Ensure generate hyperparameters are passed through to h2oai_pipeline.py for generation by @pseudotensor in #265
- Submit button is now primary + more spacing between prompt area and action buttons by @fazpu in #261
- input prompt - primary color border added + change in label text by @fazpu in #259
- prompt form moved to a separate file by @fazpu in #258
- Upgrade gradio by @pseudotensor in #269
- Fixes #270 by @pseudotensor in #272
- A couple of small updates to the documentati...