Efficient Hadamard multitask model #1910
Replies: 2 comments 4 replies
-
Have you tried using botorch's own Multitask GP here? https://botorch.org/api/models.html#botorch.models.multitask.MultiTaskGP In general, I believe you can exploit botorch's acquisition optimization to fix dimensions such as the task dimension in what you're trying to set up. Maybe try asking in the botorch discussions instead if the botorch implementation isn't efficient enough for you.
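A minimal sketch of that route, assuming a Hadamard-style setup where the last input column is the task index (the toy data, kernel defaults, and the EI choice below are placeholders, not anything from the original post):

```python
import torch
from botorch.models.multitask import MultiTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy Hadamard-style data: 2 design variables, last column = task index.
train_X = torch.rand(20, 2, dtype=torch.double)
tasks = torch.randint(0, 2, (20, 1)).to(train_X)
train_X = torch.cat([train_X, tasks], dim=-1)
train_Y = torch.sin(6.0 * train_X[:, :1]) + 0.1 * torch.randn(20, 1, dtype=torch.double)

model = MultiTaskGP(train_X, train_Y, task_feature=-1)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)  # older botorch versions: fit_gpytorch_model(mll)

acqf = ExpectedImprovement(model, best_f=train_Y.max())

# Fix the task dimension (column 2) to task 0, so the optimizer only
# searches over the two actual design variables.
bounds = torch.tensor([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]], dtype=torch.double)
candidate, value = optimize_acqf(
    acqf, bounds=bounds, q=1, num_restarts=5, raw_samples=32,
    fixed_features={2: 0.0},
)
```

With `fixed_features`, the task column never varies during candidate optimization, so you only search over the actual design variables.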
-
I dug a little deeper and found that memory usage is normal as long as simple kernel structures are used. However, as soon as kernels are combined, memory usage increases sharply. I am not sure whether this extra memory is really needed or whether a lazy tensor evaluation is executed inefficiently somewhere -- hence, I'm opening this discussion as a potential issue. Below is an example that demonstrates the problem. It is largely based on the Hadamard example, with only minor differences; in the code, I've marked the point from which onwards the original logic is used.
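A minimal sketch of such a setup, assuming a summed data kernel as the "combined" kernel (the Hadamard example otherwise unchanged):

```python
import torch
import gpytorch

class MultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # Combining kernels (here: a sum of two scaled kernels) is what
        # triggers the sharp memory increase; a single RBFKernel is fine.
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel()
        ) + gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel())
        self.task_covar_module = gpytorch.kernels.IndexKernel(num_tasks=2, rank=1)

    def forward(self, x, i):
        # --- from here on, the original Hadamard example logic is used ---
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        covar_i = self.task_covar_module(i)
        covar = covar_x.mul(covar_i)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar)
```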
-
Hi all,
I am currently trying to set up a multitask model and I'm wondering how to execute it efficiently. The code is based on this gpytorch example, where the covariance matrix is constructed using an IndexKernel as follows:
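For reference, the covariance construction in that example's `forward` looks roughly like this:

```python
def forward(self, x, i):
    mean_x = self.mean_module(x)
    # input-input covariance
    covar_x = self.covar_module(x)
    # task-task covariance from the IndexKernel
    covar_i = self.task_covar_module(i)
    # Hadamard product of the two
    covar = covar_x.mul(covar_i)
    return gpytorch.distributions.MultivariateNormal(mean_x, covar)
```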
I am puzzled about the line `covar = covar_x.mul(covar_i)`, which in contrast to `covar_x` and `covar_i` returns a `NonLazyTensor`. Now the issue is that I'm using the model in combination with botorch's acquisition functions, which, to evaluate a set of candidate points, treat these points as independent "t-batches" (see here). This means that, in order to evaluate `N` candidates, `N` covariance matrices are constructed, each covering the training data + 1 of the candidate points, even though only the `N` marginal distributions of the candidate points are needed in the end. My expectation was that this would be handled efficiently through lazy tensor evaluation; however, that does not seem to be the case, and it destroys the computation performance. I've also tried to explicitly use a lazy multiplication (see the sketch below), but this apparently does not solve the problem, as the computational burden gets shifted to the `root_decomposition()` calls in `MulLazyTensor`, which I haven't really grasped yet. I'm currently stuck and help would be much appreciated =)
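The lazy multiplication attempt, roughly (a sketch; constructing the `MulLazyTensor` directly instead of calling `.mul()`):

```python
from gpytorch.lazy import MulLazyTensor

# in forward(), instead of covar = covar_x.mul(covar_i):
covar = MulLazyTensor(covar_x, covar_i)
```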