# Android LLM Inference Engine Profiler
Key features:
- Unplugged testing wrapped in the app, with no adb needed, which better simulates real-world unplugged use.
- Temperature is kept below 40 °C before each test; the app automatically waits a while to cool down, to avoid severe CPU/GPU throttling.
- Charge level is kept above 50% to prevent phones from some vendors from automatically activating power-saving mode.
- Supports most phones with Android API level ≥ 30.
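The temperature and charge-level gating above is handled inside the app. For a manual sanity check of a device before a run, the same conditions can be read over adb from `dumpsys battery` (which reports temperature in tenths of a degree Celsius). This is an illustrative sketch, not part of the profiler; `check_ready` is a helper invented here:

```shell
# check_ready: read `dumpsys battery` output on stdin and verify the
# gating conditions used above: temperature < 40.0 °C and level > 50%.
# (dumpsys reports temperature in tenths of a degree Celsius.)
check_ready() {
  awk -F': ' '
    /temperature/ { temp  = $2 }
    /^ *level/    { level = $2 }
    END { exit !(temp + 0 < 400 && level + 0 > 50) }
  '
}

# With a device attached (not needed for normal in-app testing):
# adb shell dumpsys battery | check_ready && echo "cool and charged enough"
```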
Currently Supported Engines:
- MNN (Our Modified Version of MNN-3.0.4) (CPU/OpenCL)
- llama.cpp (Version b4735) (CPU)
- MediaPipe
- MLC-LLM
- ExecuTorch
- mllm
Currently Supported Metrics:
- speed (tok/s)
- capacity consumption (µAh/tok)
- energy consumption (mJ/tok)
- perplexity
- accuracy
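Capacity consumption and energy consumption are related through the battery voltage: 1 µAh of charge is 3.6 mC, and energy = charge × voltage, so mJ/tok ≈ (µAh/tok) × voltage [V] × 3.6. A quick sketch of the conversion; both the 25 µAh/tok figure and the 3.85 V nominal battery voltage are made-up example values, not measurements:

```shell
# Convert capacity consumption (uAh/tok) to energy consumption (mJ/tok):
#   1 uAh = 3.6 mC, energy = charge * voltage
#   => mJ/tok = (uAh/tok) * volts * 3.6
# Both numbers below are illustrative, not measurements.
uah_per_tok=25.0
volts=3.85
awk -v c="$uah_per_tok" -v v="$volts" \
    'BEGIN { printf "%.1f mJ/tok\n", c * v * 3.6 }'   # 346.5 mJ/tok
```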
Currently Supported Models:
- Qwen Series (text-generation)
- Llama Series (text-generation)
- Gemma Series
- Phi-2
Currently Supported Test Modes:
- dataset subset testing from json/jsonl/parquet files (subsets, because full-dataset testing on a phone is costly in time and energy)
- designated/fixed-length input testing
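For the dataset-subset mode, a subset can be as simple as the first N records of a jsonl file. An illustrative sketch; the file names are placeholders, and the stand-in dataset is generated only so the snippet runs anywhere:

```shell
# Create a stand-in jsonl dataset, then keep only the first 3 records
# for on-phone testing (full datasets are too costly in time/energy).
printf '{"id": %d}\n' 1 2 3 4 5 > sample.jsonl   # placeholder dataset
head -n 3 sample.jsonl > subset.jsonl
wc -l < subset.jsonl
```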
The Android demo is located in the `./android2` directory. You need all the submodules, so add `--recursive` to your `git clone`:
```shell
git clone --recursive https://github.com/huangzhengxiang/LLM-Profiler.git
```
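If the repository was already cloned without `--recursive`, the submodules (MNN-Habst, llama.cpp, ...) can still be fetched afterwards with `git submodule update --init --recursive` from inside the checkout. The snippet below wraps that command in a throwaway repo only so it is runnable anywhere:

```shell
# Fetch submodules after a non-recursive clone. In a real checkout:
#   cd LLM-Profiler && git submodule update --init --recursive
# Demonstrated here in a throwaway repo so the snippet runs anywhere:
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" submodule update --init --recursive && echo "submodules initialized"
```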
- Convert your model to mnn/gguf/tflite... format (for model conversion methods, please refer to each format's repository).
- Push your model to the `/data/local/tmp/llm/model` directory:
```shell
# example
adb shell mkdir /data/local/tmp/llm
adb shell mkdir /data/local/tmp/llm/model/
adb push model/qwen2_5-1_5b-int4-mnn/ /data/local/tmp/llm/model/
```
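After pushing, it is worth confirming that the model directory actually landed on the device. A sketch; `model_present` is a tiny helper invented here, and the adb invocation requires a connected device:

```shell
# model_present NAME: succeed iff NAME appears as a line on stdin.
# Intended to be fed the device-side directory listing.
model_present() {
  grep -q "^$1\$"
}

# With a device attached:
# adb shell ls /data/local/tmp/llm/model/ \
#   | model_present qwen2_5-1_5b-int4-mnn && echo "model pushed"
```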
The release version is ready to use at `android2/app/release/app-release.apk` or in GitHub Releases. Install it on your cell phone.
After uploading the model and the APK, install the APK and open the app. Tap 加载模型 (Load Model) first, then record your voice once 模型加载完成 (model loading finished) is shown.
Several LLM inference engines are contained in this app, including MNN-Habst (ours) and llama.cpp.
MNN-Habst is up to date with the MNN master branch at commit 5bd7ffc22a54f6436e387ec2a5cfde7e207feba1 (Version 3.0.4). On top of that, heterogeneity-aware backend selection and tuning (the Habst algorithm) is added in the MNN-Habst repository, which is a submodule.
llama.cpp is added at commit 73e2ed3ce3492d3ed70193dd09ae8aa44779651d (Version b4735), also as a submodule.
Then, open the project in Android Studio and build.
Internal: `Power_Normal`, `Power_High`, `Power_MemoryBound`, `Power_SelectCore` ("normal", "high", "memory", "select").
External additional options: "exhaustive" (requires an additional list of selected core group sizes, e.g., [1, 3, 2, 2] for the 8Gen3, ordered big → small; the results are stored in a local file) and "tune_prefill" (tune the prefill stage).
- multi-turn conversation dataset (ShareGPT-en): https://huggingface.co/datasets/shareAI/ShareGPT-Chinese-English-90k (./sharegpt_jsonl/common_en_70k.jsonl) (input prefill, controlled decode)
- role play (RoleLLM): https://huggingface.co/datasets/ZenMoore/RoleBench (./rolebench-eng/role-generalization/role_specific/test.jsonl) (input prefill, controlled decode)
- math problem QA (math_qa): https://huggingface.co/datasets/allenai/math_qa (input prefill, free talk decode)
- Open Domain QA (truthful_qa): https://huggingface.co/datasets/truthfulqa/truthful_qa (input prefill, free talk decode)
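Because on-phone runs are expensive, a subset that spans the whole file is often more representative than just its first N lines. A stride sample over a jsonl dataset can be done with awk; the file below is a generated stand-in for a real dataset file such as common_en_70k.jsonl:

```shell
# Keep every 100th record of a jsonl dataset, starting at the first.
# demo.jsonl is a generated stand-in for a real dataset file.
seq 1 500 | awk '{ printf "{\"id\": %d}\n", $1 }' > demo.jsonl
awk 'NR % 100 == 1' demo.jsonl > demo_subset.jsonl
wc -l < demo_subset.jsonl   # 5 records: ids 1, 101, 201, 301, 401
```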