The repo can be built for the following platforms, using the provided setup and the following instructions.
Chip | Windows | Linux | OS X |
---|---|---|---|
x64 | Instructions | Instructions | Instructions |
The ML.NET repo can be built from a regular, non-admin command prompt. The build produces multiple binaries that make up the ML.NET libraries and the accompanying tests.
The dev workflow describes the development process to follow. It is divided into specific tasks that are fast, transparent and easy to understand. The tasks are represented in scripts (cmd/sh) in the root of the repo.
For more information about the different options that each task has, use the argument -help
when calling the script. For example:
build -help
Examples
- Initialize the repo to make build possible (if the build fails because it can't find
mf.cpp
then perhaps you missed this step)
git submodule update --init
- Building in release mode for platform x64
build.cmd -configuration Release /p:TargetArchitecture=x64
- Building the src and then building and running the tests
build.cmd -test
Note: Before working on individual projects or test projects you must run build
from the root once before beginning that work. It is also a good idea to run build
whenever you pull a large set of unknown changes into your branch.
Under the src directory is a set of directories, each of which represents a particular assembly in ML.NET.
For example the src\Microsoft.MachineLearning.Core directory holds the source code for the Microsoft.MachineLearning.Core.dll assembly.
You can build the DLL for Microsoft.MachineLearning.Core.dll by going to the src\Microsoft.MachineLearning.Core
directory and typing dotnet build
.
You can build the tests for Microsoft.MachineLearning.Core.dll by going to
test\Microsoft.MachineLearning.Core.Tests
directory and typing dotnet test
.
Note: We use build/vsts-ci.yml to define our official build
By default, building from the root or within a project will build the libraries in Debug mode.
One can build in Debug or Release mode from the root by doing build.cmd -configuration Release
or build.cmd -configuration Debug
.
Currently, the full list of supported configurations are:
Debug
,Release
(for .NET Core 3.1 and .NET Framework 4.6.1)
We support both 32-bit and 64-bit binaries. To build in 32-bit, use the TargetArchitecture
flag as below:
build.cmd -configuration Debug /p:TargetArchitecture=x86
During development, there may arise a need to update the current baseline core_manifest.json
and/or core_ep-list.tsv
files. For example, a change in the name or type of a variable in a given class in the API that is not reflected in core_manifest.json
will trigger the following failure:
*** Failure: Output and baseline mismatch at line 123 , expected ' ...x... ' but got ' ...y...' : '../Common/EntryPoints/core_manifest.json'
Steps to update core_manifest.json
and core_ep-list.tsv
:
- Unskip the
RegenerateEntryPointCatalog
unit test intest/Microsoft.ML.Core.Tests/UnitTests/TestEntryPoints.cs
. This can be done by temporarily commenting out the skip attribute on the unit test forRegenerateEntryPointCatalog
([Fact(Skip = "Execute this test if you want to regenerate the core_manifest and core_ep_list files")]
). - Run the unit tests
build.cmd -runTests
(alternatively, run theRegenerateEntryPointCatalog
unit test natively on Visual Studio through the Test Explorer or through Shortcuts by clicking onRegenerateEntryPointCatalog
intest/Microsoft.ML.Core.Tests/UnitTests/TestEntryPoints.cs
and pressing Ctrl+R,T). - Verify the changes to
core_manifest.json
andcore_ep-list.tsv
are correct. - Re-enable the skip attribute on the
RegenerateEntryPointCatalog
test. - Commit the updated
core_manifest.json
andcore_ep-list.tsv
files to your branch.
It may be necessary to run only specific unit tests on CI, and perhaps even run these tests back to back multiple times. The steps to run one or more unit tests are as follows:
- Set
runSpecific: true
andinnerLoop: false
in .vsts-dotnet-ci.yml per each build you'd like to run the specifics tests on CI. - Import
Microsoft.ML.TestFrameworkCommon.Attributes
in the unit test files that contain specific unit tests to be run. - Add the
[TestCategory("RunSpecificTest")]
to the unit test(s) you'd like to run specifically.
If you would like to run these specific unit test(s) multiple times, do the following for each unit test to run:
- Replace the
[Fact]
attribute with[Theory, IterationData(X)]
whereX
is the number of times to run the unit test. - Add the
int iteration
argument to the unit test you'd like to run multiple times. - Use the
iteration
parameter at least once in the unit test. This may be as simple as printing to console theiteration
parameter's value.
These steps are demonstrated in this demonstrative commit.
During development, there may also arise a need to debug hanging tests. In this scenario, it can be beneficial to collect the memory dump while a given test is hanging.
In this case, the given needs to be implemented according to the Microsoft test framework. Please check out the Microsoft test framework walkthrough and the VSTest sample demonstrating the "TestClass", "TestMethod", "DataTestMethod", and "DataRow" attributes.
Once the unit test(s) are implemented according to VSTest and ready to be debugged, the useVSTestTask
parameter in build\ci\job-template.yml
needs to be set to True
. Once these steps are completed and pushed in your pull request, the unit test(s) will run and produce a full memory dump. At the end of a run, the memory dump .dmp
file will be available for downloading and inspection in the published artifacts of the build, in the folder TestResults
.
Note: this is only supported on Windows builds, as ProcDump is officially only available on Windows.