This fork is maintained by dividiti Limited.
$ python -m pip install ck --user
$ ck pull repo:ck-nntest
$ ck install package --tags=lib,nntest
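To check which NNTest library packages CK has detected or installed (assuming the standard CK env module), you can run, e.g.:
$ ck show env --tags=lib,nntest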
To use Arm Compute Library (Neon) tests with the latest development branch:
$ ck install package --tags=lib,armcl,neon,dev
To install a specific release of the library (e.g. 20.05):
$ ck install package --tags=lib,armcl,neon,rel.20.05
To use Arm Compute Library (OpenCL) tests with the latest development branch:
$ ck install package --tags=lib,armcl,opencl,dev
To install a specific release of the library (e.g. 20.05):
$ ck install package --tags=lib,armcl,opencl,rel.20.05
To use TensorFlow CPU tests, install a third-party TensorFlow_CC package:
$ ck pull repo:ck-tensorflow
$ ck install package:lib-tensorflow_cc-shared-1.7.0 [--env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=2]
To install, follow the instructions in the Readme. NB: You may want to limit the number of build threads on a memory-constrained platform (e.g. to 2 as above).
To use Caffe tests, get the public CK-Caffe repository:
$ ck pull repo --url=https://github.com/dividiti/ck-caffe
To use Caffe CPU tests, install:
$ ck install package:lib-caffe-bvlc-master-cpu-universal
To use Caffe OpenCL tests, install one or more of the packages listed by:
$ ck list package:lib-caffe-bvlc-opencl-*-universal
For example, install Caffe with CLBlast:
$ ck install package:lib-caffe-bvlc-opencl-clblast-universal
To view all available NNTest test programs and data sets:
$ ck search program --tags=nntest | sort
$ ck search dataset --tags=nntest | sort
To compile and run a single test listed above, use e.g.:
$ ck run nntest:softmax-armcl-opencl
To view all the tests to be performed, run this command with the --list_tests or --dry_run option. The --list_tests option only lists all combinations of library, dataset and tensor shape to be processed, while the --dry_run option prepares a pipeline for each test but does not run it.
$ ck run nntest:softmax-armcl-opencl --list_tests
$ ck run nntest:softmax-armcl-opencl --dry_run
CK-NNTest supports the following operators for the Arm Compute Library:
- average pool (fp32, uint8)
- convolution (fp32, uint8)
- depthwise convolution (fp32, uint8)
- direct convolution (fp32, uint8)
- fully connected (fp32, uint8)
- gemm (fp32)
- reshape (fp32, uint8)
- resize bilinear (fp32, uint8)
- softmax (fp32, uint8)
- winograd convolution (fp32)
NB: Not all operators are supported for all libraries.
For each supported operator, kernel profiling and output validation can be run as follows (replace <platform> with a name identifying your platform):
- Kernel profiling:
$ ck run nntest:avgpool-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:avgpool-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:avgpool-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:avgpool-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:conv-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:conv-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:conv-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:conv-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:depthwiseconv-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:depthwiseconv-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:depthwiseconv-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:depthwiseconv-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:directconv-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:directconv-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:directconv-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:directconv-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:fullyconnected-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:fullyconnected-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:fullyconnected-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:fullyconnected-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:gemm-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:gemm-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:reshape-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:reshape-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:reshape-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:reshape-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:resizebilinear-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:resizebilinear-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:resizebilinear-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:resizebilinear-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:softmax-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:softmax-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:softmax-armcl-opencl-uint8 --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:softmax-armcl-opencl-uint8 --repetitions=10 --timestamp=<platform>-validation
- Kernel profiling:
$ ck run nntest:winogradconv-armcl-opencl --dvdt_prof --timestamp=<platform>-profiling
- Validation:
$ ck run nntest:winogradconv-armcl-opencl --repetitions=10 --timestamp=<platform>-validation
If more than one dataset suitable for the operator under test (softmax above) is found, you will be asked to choose one.
To test with a particular dataset, use e.g.:
$ ck run program:softmax-armcl-opencl --dataset_file=shape-1024-1-1
To test with all suitable datasets, use e.g.:
$ ck run nntest:softmax-armcl-opencl --iterations=1
NB: By default, ck run nntest:* iterates over the batch sizes ranging from 1 to 16; --iterations=1 stops the test after the first iteration (for the batch size of 1).
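You can also combine --iterations with the --repetitions flag used in the profiling/validation examples above to control how many times each shape is measured; for example, a quick sketch with a single batch size and 3 statistical repetitions:
$ ck run nntest:softmax-armcl-opencl --iterations=1 --repetitions=3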
To override one or more keys in a dataset, use e.g.:
$ ck run program:softmax-armcl-opencl --env.CK_IN_SHAPE_C=256
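Several keys can be overridden in a single invocation; for example, assuming the dataset also exposes the CK_IN_SHAPE_H and CK_IN_SHAPE_W keys described in the naming scheme below:
$ ck run program:softmax-armcl-opencl --env.CK_IN_SHAPE_C=256 --env.CK_IN_SHAPE_H=7 --env.CK_IN_SHAPE_W=7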
By default, test results are recorded in a local repository. To just print test results to the standard output, use the --no_record option, e.g.:
$ ck run nntest:softmax-armcl-opencl --iterations=1 --no_record
To list the results saved locally, use e.g.:
$ ck list local:experiment:nntest-softmax-armcl-opencl-*
If not all the dataset shapes have been processed during a test session (e.g. due to a user interrupt or the platform going offline), the session can be resumed later by running the same command with the --resume option, e.g.:
$ ck run nntest:conv-armcl-opencl --iterations=1 --repetitions=1 --timestamp=odroid-conv-0001 --resume
NB: It is essential to pass exactly the same --timestamp flag to correctly identify the test session to be resumed.
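For example, the resumed session above corresponds to an original run started with the same command without --resume:
$ ck run nntest:conv-armcl-opencl --iterations=1 --repetitions=1 --timestamp=odroid-conv-0001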
When a test is invoked with a particular dataset for the first time, CK saves its output as reference (e.g. a vector of floating-point values). In subsequent invocations of this test with the same dataset, CK validates its output against the reference.
To skip output validation, use e.g.:
$ ck run program:softmax-armcl-opencl --skip_output_validation
To replace the reference output, use e.g.:
$ ck run program:softmax-armcl-opencl --overwrite_reference_output
Output validation is performed within a certain threshold specified via the CK_ABS_DIFF_THRESHOLD key in the run_vars dictionary in the program metadata. That is, any differences smaller than the threshold are ignored.
To override the threshold value at run-time, use e.g.:
$ ck run program:softmax-armcl-opencl --env.CK_ABS_DIFF_THRESHOLD=0.01
Each reference output gets a unique name, e.g. default-3cda82464112173d-1000-1-1-2-42. Here:
- default is the command key of the given test program;
- 3cda82464112173d is the unique id of the dataset (ck-nntest:dataset:tensor-0001);
- 1000-1-1 are the dash-separated values of the keys in the dataset file (shape-1000-1-1.json) listed in alphabetical order (i.e. CK_IN_SHAPE_C, CK_IN_SHAPE_H, CK_IN_SHAPE_W);
- 2-42 are the dash-separated values of selected keys in the run_vars dictionary in the program metadata file listed in alphabetical order (i.e. CK_IN_SHAPE_N, CK_SEED).
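For example, under the same naming scheme, running the default command of this program on shape-1024-1-1.json from the same dataset with a batch size of 2 and a seed of 42 would produce the reference output name default-3cda82464112173d-1024-1-1-2-42.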
To visualize test results in a web browser, run:
$ ck dashboard nntest
and select "Raw results".
It is possible to run this dashboard on a different host and port:
$ ck dashboard nntest --host=192.168.99.1 --port=3355
It is also possible to specify an external host and port, which is useful for Docker instances:
$ ck dashboard nntest --wfe_host=192.168.99.1 --wfe_port=3355
You will be able to replay individual tests (to validate performance or fix bugs).
The simplest way is to select a given experiment from the above nntest dashboard,
and then click the "Copy to clipboard" button in the Reproduce field.
You can then paste and run the command in your shell. It will look similar to:
$ ck replay experiment:186380dfcd98cd7a --point=4e9e9476bab09b2c
Alternatively, you can see all available raw nntest experiments on your machine as follows:
$ ck search experiment --tags=nntest
For example, to rerun all softmax tests, pausing if any of them fails:
$ ck run nntest:*softmax* --iterations=4 --repetitions=1 --pause_if_fail
You can run some of the tests directly on Android devices connected to your host machine via ADB as follows (you need to have the Android NDK and SDK installed):
$ ck compile program:softmax-armcl-opencl --speed --target_os=android23-arm64
$ ck run program:softmax-armcl-opencl --speed --target_os=android23-arm64
We plan to add support for compiling and running ArmCL-based clients on Android too (there are some minor issues at this stage):
$ ck install package --tags=armcl,vopencl,vavgpool --env.USE_EMBEDDED_KERNELS=ON --target_os=android23-arm64
$ ck compile program:avgpool-armcl-opencl --speed --target_os=android23-arm64
$ ck run program:avgpool-armcl-opencl --speed --target_os=android23-arm64
Extra environment variables for development/debugging:
- --env.CK_ADD_RAW_NNTEST_OUTPUT=yes - add vector output to the CK pipeline
- --env.CK_ADD_RAW_DVDT_PROF=yes - add raw dvdt_prof profile to the CK pipeline
- --env.CK_ADD_RAW_MALI_HWC=yes - add Mali hardware performance counters to the CK pipeline
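These variables can be combined with any of the run commands above; for example, a minimal sketch that keeps the raw vector output for a single softmax iteration:
$ ck run nntest:softmax-armcl-opencl --iterations=1 --env.CK_ADD_RAW_NNTEST_OUTPUT=yes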
To record the hostname to the meta of all experimental entries:
$ ck set kernel var.record_nntest_hostname=yes
To turn off recording the hostname:
$ ck set kernel var.record_nntest_hostname=no
or
$ ck set kernel var.record_nntest_hostname=
The Arm Compute Library includes a validation suite that tests all internal Arm routines. It can be compiled for any ArmCL package as follows:
$ ck compile program:validation-armcl-opencl
It is possible to customize this build via --env.KEY=val.
For example, you can add CXX flags as follows:
$ ck compile program:validation-armcl-opencl --env.EXTRA_CXX_FLAGS="-DDVDT_DEBUG"
You can now run validation as follows (select the run command):
$ ck run program:validation-armcl-opencl
You can also filter tests, such as for softmax, as follows:
$ ck run program:validation-armcl-opencl --env.FILTER=CL/.*Softmax
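Assuming FILTER accepts any regular expression over the ArmCL validation test names, you could similarly restrict the run, e.g. to direct convolution tests:
$ ck run program:validation-armcl-opencl --env.FILTER=CL/.*DirectConvolution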