LLM-Profiler 0.2.0
Android LLM Inference Engine Profiler
Key features:
- Unplugged testing wrapped in the app: no adb needed, which better simulates real-world unplugged use.
- Temperature is kept below 40 °C before each test; the app automatically waits for the device to cool down, avoiding severe CPU/GPU throttling.
- Charge level is kept above 50% to prevent phones from some vendors from automatically activating power-saving mode.
- Supports most phones with Android API Level >= 30.
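The pre-test gating described above (device cooled below 40 °C, charge above 50%) can be sketched as a simple check. On a real device the readings would come from Android's `BatteryManager` / `ACTION_BATTERY_CHANGED` broadcast; here the thresholds are shown as plain logic, and the class/method names are illustrative, not from the project:

```java
public class PreTestGate {
    // Thresholds from the README: start a run only when the device has
    // cooled below 40 °C and the charge level is above 50%.
    static final double MAX_TEMP_C = 40.0;
    static final int MIN_CHARGE_PCT = 50;

    /** Returns true when a profiling run may start. */
    static boolean mayStart(double tempC, int chargePct) {
        return tempC < MAX_TEMP_C && chargePct > MIN_CHARGE_PCT;
    }

    public static void main(String[] args) {
        System.out.println(mayStart(38.5, 80)); // cooled and charged -> true
        System.out.println(mayStart(43.0, 80)); // still too hot      -> false
        System.out.println(mayStart(38.5, 40)); // charge too low     -> false
    }
}
```

In the app, the same predicate would be polled in a cooldown loop until both conditions hold.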
Currently Supported Engines:
- MNN (Our Modified Version of MNN-3.0.4) (CPU/OpenCL)
- llama.cpp (Version b4735) (CPU)
- MediaPipe
- MLC-LLM
- ExecuTorch
- mllm
Currently Supported Metrics:
- speed (tok/s)
- capacity consumption (uAh/tok)
- energy consumption (mJ/tok)
- perplexity
- accuracy
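The two power metrics above are related through battery voltage: 1 uAh of charge is 3.6e-3 coulomb, and energy is charge times voltage, so mJ/tok = uAh/tok × 3.6 × V. A sketch of this standard unit conversion (the method name is illustrative, not from the project):

```java
public class EnergyMetrics {
    /**
     * Convert per-token charge draw (uAh/tok) to per-token energy (mJ/tok).
     * 1 uAh = 3.6e-3 C, and E = Q * V, so
     * mJ/tok = uAh/tok * 3.6 * batteryVoltage.
     */
    static double uAhPerTokToMJPerTok(double uAhPerTok, double voltageV) {
        return uAhPerTok * 3.6 * voltageV;
    }

    public static void main(String[] args) {
        // e.g. 150 uAh/tok at a nominal 3.85 V battery: roughly 2079 mJ/tok
        System.out.println(uAhPerTokToMJPerTok(150.0, 3.85));
    }
}
```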
Currently Supported Models:
- Qwen Series (text-generation)
- Llama Series (text-generation)
- Gemma Series
- Phi-2
Currently Supported Test Modes:
- Dataset subset testing from json/jsonl/parquet files (subsets are used because testing a full dataset on a phone is costly in time and energy)
- Designated/fixed-length input testing
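For the subset mode, one common approach is a seeded shuffle so the same subset is drawn on every run, keeping on-device results comparable despite testing only part of the dataset. A hedged sketch (the class and method names are illustrative, not from the project):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SubsetSampler {
    /**
     * Pick a reproducible fixed-size subset of dataset records (e.g. lines
     * of a .jsonl file). A fixed seed keeps the subset identical across
     * runs, so per-device measurements stay comparable.
     */
    static List<String> sampleSubset(List<String> records, int k, long seed) {
        List<String> copy = new ArrayList<>(records);
        Collections.shuffle(copy, new Random(seed));
        return copy.subList(0, Math.min(k, copy.size()));
    }

    public static void main(String[] args) {
        List<String> data = List.of("{\"q\":\"a\"}", "{\"q\":\"b\"}",
                                    "{\"q\":\"c\"}", "{\"q\":\"d\"}");
        // Same seed -> same 2-record subset every run.
        System.out.println(sampleSubset(data, 2, 42L));
    }
}
```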
New Features:
- [Doc]: In docs/, add a model converter doc for each engine.
- [BugFix]: Fix incorrect result display in dataset test mode.