master-39ed572
Various improvements (#131)

* Implement model head offloading
* Guess the tokenizer from n_vocab
* Make PyTorch optional for inference
* Add function to offload layers
* Add rwkv_eval_sequence_in_chunks