Support for LLM.int8
On GPU, you can now load the model with LLM.int8 quantization to reduce memory usage:
```python
from textsum.summarize import Summarizer

# loads the default model with LLM.int8, using roughly 1/4 of the memory
summarizer = Summarizer(load_in_8bit=True)
```
What's Changed
Full Changelog: v0.1.3...v0.1.5