
v0.2.0

@huangzhengxiang released this 27 Feb 13:03

LLM-Profiler 0.2.0

Android LLM Inference Engine Profiler

Key features:

  • Unplugged testing is wrapped in an app, so no adb connection is needed; this better simulates real-world unplugged use.
  • Device temperature is kept below $40^\circ \mathrm{C}$ before each test: the app waits for a while so the phone can cool down automatically, avoiding severe CPU/GPU throttling.
  • Charge level is kept above 50% to prevent phones from some vendors from automatically activating power-saving mode (see the guard sketch after this list).
  • Supports most phones with Android API Level $\geq 30$.
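The thermal and charge guards above can be implemented with Android's standard battery broadcast. The sketch below is a minimal illustration, assuming a hypothetical `PreTestGuard` helper and the 40 °C / 50% thresholds mentioned above; it is not the profiler's actual code.

```kotlin
import android.content.Context
import android.content.Intent
import android.content.IntentFilter
import android.os.BatteryManager
import android.os.SystemClock

// Hypothetical pre-test guard; names, thresholds, and polling interval are
// illustrative assumptions, not taken from the LLM-Profiler source.
object PreTestGuard {
    private const val MAX_TEMP_C = 40.0   // cool-down threshold (°C)
    private const val MIN_CHARGE_PCT = 50 // avoid vendor power-saving modes

    /** Reads the sticky ACTION_BATTERY_CHANGED broadcast: temperature (°C) and charge (%). */
    private fun readBattery(context: Context): Pair<Double, Int> {
        val intent = context.registerReceiver(null, IntentFilter(Intent.ACTION_BATTERY_CHANGED))
        val tempTenthsC = intent?.getIntExtra(BatteryManager.EXTRA_TEMPERATURE, -1) ?: -1
        val level = intent?.getIntExtra(BatteryManager.EXTRA_LEVEL, -1) ?: -1
        val scale = intent?.getIntExtra(BatteryManager.EXTRA_SCALE, 100) ?: 100
        val chargePct = if (level >= 0 && scale > 0) level * 100 / scale else -1
        return Pair(tempTenthsC / 10.0, chargePct)
    }

    /** Blocks (call from a worker thread) until the device is cool and charged enough. */
    fun waitForSafeConditions(context: Context, pollMs: Long = 30_000L) {
        while (true) {
            val (tempC, chargePct) = readBattery(context)
            if (tempC in 0.0..MAX_TEMP_C && chargePct >= MIN_CHARGE_PCT) return
            SystemClock.sleep(pollMs) // let the phone keep cooling before the next check
        }
    }
}
```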

Currently Supported Engines:

  • MNN (Our Modified Version of MNN-3.0.4) (CPU/OpenCL)
  • llama.cpp (Version b4735) (CPU)
  • MediaPipe
  • MLC-LLM
  • ExecuTorch
  • mllm

Currently Supported Metrics:

  • speed (tok/s)
  • capacity consumption (µAh/tok)
  • energy consumption (mJ/tok) (see the computation sketch after this list)
  • perplexity
  • accuracy
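The capacity and energy metrics can be estimated from Android's battery charge counter and the reported battery voltage. The sketch below is a rough illustration of such a computation, assuming hypothetical names (`TokenMetrics`, `computeMetrics`) and a voltage-based energy estimate; it is not necessarily how LLM-Profiler computes these numbers.

```kotlin
import android.content.Context
import android.content.Intent
import android.content.IntentFilter
import android.os.BatteryManager

// Hypothetical per-token metric computation; the names and the voltage-based
// energy estimate are assumptions, not LLM-Profiler's actual formulas.
data class TokenMetrics(val tokPerSec: Double, val uAhPerTok: Double, val mJPerTok: Double)

fun computeMetrics(
    context: Context,
    chargeBeforeUAh: Long, // BATTERY_PROPERTY_CHARGE_COUNTER before the run (µAh)
    chargeAfterUAh: Long,  // same counter after the run (µAh)
    decodedTokens: Int,
    elapsedMs: Long
): TokenMetrics {
    // Capacity drawn from the battery per decoded token.
    val uAhPerTok = (chargeBeforeUAh - chargeAfterUAh).toDouble() / decodedTokens

    // EXTRA_VOLTAGE reports the current battery voltage in millivolts.
    val intent = context.registerReceiver(null, IntentFilter(Intent.ACTION_BATTERY_CHANGED))
    val voltageV = (intent?.getIntExtra(BatteryManager.EXTRA_VOLTAGE, 3800) ?: 3800) / 1000.0

    // 1 µAh = 3.6 mC, so energy per token (mJ) ≈ µAh/tok × 3.6 × voltage (V).
    val mJPerTok = uAhPerTok * 3.6 * voltageV

    val tokPerSec = decodedTokens / (elapsedMs / 1000.0)
    return TokenMetrics(tokPerSec, uAhPerTok, mJPerTok)
}
```

The before/after charge readings could be sampled with `BatteryManager.getLongProperty(BatteryManager.BATTERY_PROPERTY_CHARGE_COUNTER)`, which reports remaining capacity in microampere-hours on most devices.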

Currently Supported Models:

  • Qwen Series (text-generation)
  • Llama Series (text-generation)
  • Gemma Series
  • Phi-2

Currently Supported Test Modes:

  • dataset subset testing from JSON/JSONL/Parquet files (only subsets are tested because running a full large dataset on a phone is costly in time and energy; a loading sketch follows this list)
  • fixed-length input tests with a designated input length
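For the dataset test mode, a JSONL subset might be loaded as sketched below; the `loadJsonlSubset` helper, the `prompt` field name, and the default subset size are assumptions for illustration only.

```kotlin
import org.json.JSONObject
import java.io.File

// Hypothetical JSONL subset loader; the "prompt" field name and the default
// subset size are assumptions, not taken from LLM-Profiler.
fun loadJsonlSubset(path: String, maxSamples: Int = 50): List<String> =
    File(path).useLines { lines ->
        lines.filter { it.isNotBlank() }
            .map { JSONObject(it).optString("prompt") }
            .filter { it.isNotEmpty() }
            .take(maxSamples) // only a subset: full datasets are too costly on-device
            .toList()
    }
```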

New Features:

  • [Doc]: Added a model-conversion doc for each engine under docs/.
  • [BugFix]: Fixed incorrect result display in the dataset test mode.