ENH: Non-batched linear regression for high-dimensional problems #3058
base: main
Conversation
/intelci: run
cpp/daal/src/algorithms/linear_model/linear_model_train_normeq_update_impl.i
```diff
@@ -46,14 +46,14 @@ struct OpenBlas<double, cpu>
 {
     typedef DAAL_INT SizeType;

-    static void xsyrk(char * uplo, char * trans, DAAL_INT * p, DAAL_INT * n, double * alpha, double * a, DAAL_INT * lda, double * beta, double * ata,
-                      DAAL_INT * ldata)
+    static void xsyrk(const char * uplo, const char * trans, const DAAL_INT * p, const DAAL_INT * n, const double * alpha, const double * a,
```
I am OK with the change. But it's strange that it didn't trigger any compilation warnings.
…_update_impl.i
Co-authored-by: Victoriya Fedotova <[email protected]>
/intelci: run
Looks like this route generates some test failures in fp32 in sklearnex, but they are due to comparisons after the 5th decimal or so, which is to be expected for fp32.
/intelci: run
@Vika-F I've now made the thresholds for choosing the non-batched route configurable through the parameters system by adding 3 new parameters, but I'm not sure that I've added them correctly in all the places where they are necessary - could you take a look? Also, is there some way to expand the tests to execute the exact same linear regression tests with custom parameters, so that this route would be triggered?
/intelci: run
Description
This PR adds a preliminary non-batched version of the linear regression normal equations algorithm for high-dimensional problems, which calculates the aggregates $\mathbf{X}^T \mathbf{X}$ and $\mathbf{X}^T \mathbf{y}$ separately through corresponding calls to BLAS functions.
This code route is only meant for high-dimensional problems, for which the current approach does not deliver better performance. The idea is to make these thresholds configurable through the same parameters system used for e.g. the batch size in the regular route, but adding new configurable parameters appears to be a much trickier job, so this PR starts with hard-coded thresholds in the meantime.
A few notes:

- ~~The function that computes the aggregates takes a flag `initializeResult`, but if this flag is `false` and the results are not zeroed out beforehand, it leads to some bazel tests failing when this code route is used.~~ This was due to a bug in the PR code and has been fixed by now.
- There are methods that add these statistics in the `NumericTable` class in which the data is passed, but the inputs here have `const` qualifiers and those methods modify the inputs, so they cannot be used with `const` data.
- ~~I wasn't sure if there are dedicated procedures to allocate aligned arrays, so I'm just using a regular unique pointer for them.~~ Changed to use the dedicated `TArray` class.
- ~~While the new code route introduced here is meant for high-dimensional data only, for testing purposes I'm enabling it for all input sizes. It works only for row-major data though, as it requires contiguous arrays.~~ Modified the tests to cover both the regular route and this route for all linear regression cases on CPU.

Checklist to comply with before moving PR from draft:
PR completeness and readability
Testing
Tests will be left for a future PR once the thresholds for triggering this mode are made configurable.
Performance
Performance comparisons were shared internally, without sklearn_bench, as the situations that trigger this route are rather specific.