Update README.md
lixzou authored Jun 13, 2023
1 parent af4c07a commit fee2e98
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -64,7 +64,7 @@ Since our CheeseLLM focuses on Chinese language, we conduct an automatic evaluat
Briefly, our model achieves up to 90.5% performance of ChatGPT-3.5. The scores of CheeseLLM are from the corresponding int8 quantization version.
The questions, responses and ratings for all models in comparison are publicly released [here](https://github.com/WHUIR/Cheese-LLM/tree/main/evaluation/evaluation_documents).
Specifically, the [belle_1k_chinese_evaluation.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/belle_1k_chinese_evaluation.jsonl) contains the 1,000 evaluation questions.
- The [cheese_llm_7b_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/cheese_llm_7b_ans.jsonl), [chinese_llama_13b_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/chinese_llama_13b_ans.jsonl), [chinese_llama_7b_plus_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/chinese_llama_7b_plus_ans.jsonl) and [gpt3.5_turbo_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/gpt3.5_turbo_ans.jsonl) include the repsonses of CheeseLLM-v1.0, Chinese-Alpaca-13B, Chinese-Alpaca-Plus-7B and ChatGPT-3.5 respectively. The rest files, named as model1_VS_model2__gpt3.5_evlauation.jsonl, are the comparing results of model1 and model2 generated by GPT-3.5.
+ The following files: [cheese_llm_7b_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/cheese_llm_7b_ans.jsonl), [chinese_llama_13b_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/chinese_llama_13b_ans.jsonl), [chinese_llama_7b_plus_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/chinese_llama_7b_plus_ans.jsonl), and [gpt3.5_turbo_ans.jsonl](https://github.com/WHUIR/Cheese-LLM/blob/main/evaluation/evaluation_documents/gpt3.5_turbo_ans.jsonl) contain the responses of CheeseLLM-v1.0, Chinese-Alpaca-13B, Chinese-Alpaca-Plus-7B, and ChatGPT-3.5, respectively. The remaining files, named model1_VS_model2__gpt3.5_evaluation.jsonl, consist of the comparison results between model1 and model2 generated by GPT-3.5.


### Limitations