Understanding IndexKernel for Hadamard Multitask model #1474
-
How should I construct the training index array? For 2 tasks and 1-dimensional inputs, the example notebook uses the following code:

```python
train_x1 = torch.rand(50)
train_x2 = torch.rand(50)

train_i_task1 = torch.full_like(train_x1, dtype=torch.long, fill_value=0)
train_i_task2 = torch.full_like(train_x2, dtype=torch.long, fill_value=1)

full_train_x = torch.cat([train_x1, train_x2])
full_train_i = torch.cat([train_i_task1, train_i_task2])
```

For example, if I had 3 tasks and my inputs were two-dimensional, how should I construct the index tensor?
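For reference, this quick check (my own illustration, not from the notebook) shows what that construction produces in the 2-task, 1-D case:

```python
import torch

train_x1 = torch.rand(50)
# full_like copies the shape of train_x1, so the index tensor is also 1-D
train_i_task1 = torch.full_like(train_x1, dtype=torch.long, fill_value=0)

print(train_i_task1.shape)  # torch.Size([50])
print(train_i_task1[:5])    # tensor([0, 0, 0, 0, 0])
```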
-
I managed.
-
I think that the documentation for the IndexKernel is still confusing. Since the Hadamard Multi-task notebook is, as far as I know, the only example that tells the user to use this kernel explicitly, and it covers a specific case (1-dimensional inputs and only 2 tasks), the code in the example might confuse people. The problem here might be my lack of knowledge, but please consider this:

The Hadamard Multi-task notebook builds the index tensors with

```python
train_i_task1 = torch.full_like(train_x1, dtype=torch.long, fill_value=0)
```

which, when the inputs are multidimensional, creates a tensor that repeats the index in every dimension (see the quick shape check after the model definition below). If it is supposed to be used like this, I don't really understand why: it makes me wonder why a single column of indices isn't enough. So I decided to try to build the indices tensor the way that makes sense to me. In this example I have 3 tasks and the inputs have 2 dimensions, but the indices tensor is 1-dimensional:

```python
import math
import torch
import gpytorch


class MultitaskGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(MultitaskGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.RBFKernel()
        self.task_covar_module = gpytorch.kernels.IndexKernel(num_tasks=3, rank=1)

    def forward(self, x, i):
        mean_x = self.mean_module(x)
        # Get input-input covariance
        covar_x = self.covar_module(x)
        # Get task-task covariance
        covar_i = self.task_covar_module(i)
        # Multiply the two together to get the covariance we want
        covar = covar_x.mul(covar_i)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar)
```
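To make the shape issue concrete, here is the quick check mentioned above (my own illustration, not from the notebook), contrasting the two constructions for 2-D inputs:

```python
import torch

x = torch.rand(5, 2)  # 5 observations, 2 input dimensions

# The notebook's approach: the index tensor inherits the (5, 2) shape,
# repeating the task index across both input dimensions
i_full_like = torch.full_like(x, dtype=torch.long, fill_value=0)
print(i_full_like.shape)  # torch.Size([5, 2])

# A single column of indices: one task index per observation
i_full = torch.full((5,), dtype=torch.long, fill_value=0)
print(i_full.shape)       # torch.Size([5])
```

With that in mind, here is the rest of the example.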
```python
# Three toy functions, one per task
def f1(v):
    return torch.sin(torch.sum(v) * 2 * math.pi) + torch.randn(1) * 0.2

def f2(v):
    return torch.cos(torch.sum(v) * 2 * math.pi) + torch.randn(1) * 0.2

def f3(v):
    return torch.exp(torch.sum(v)) + torch.randn(1) * 0.2

num_obs = 5
train_x1 = torch.rand(num_obs, 2)
train_x2 = torch.rand(num_obs, 2)
train_x3 = torch.rand(num_obs, 2)

train_y1 = torch.tensor([f1(v) for v in train_x1])
train_y2 = torch.tensor([f2(v) for v in train_x2])
train_y3 = torch.tensor([f3(v) for v in train_x3])

train_i_task1 = torch.full((num_obs,), dtype=torch.long, fill_value=0)
train_i_task2 = torch.full((num_obs,), dtype=torch.long, fill_value=1)
train_i_task3 = torch.full((num_obs,), dtype=torch.long, fill_value=2)

full_train_x = torch.cat([train_x1, train_x2, train_x3])
full_train_i = torch.cat([train_i_task1, train_i_task2, train_i_task3])
full_train_y = torch.cat([train_y1, train_y2, train_y3])

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = MultitaskGPModel((full_train_x, full_train_i), full_train_y, likelihood)
```

Notice that I used

```python
train_i_task1 = torch.full((num_obs,), dtype=torch.long, fill_value=0)
# I could also have used
# train_i_task1 = torch.full((num_obs, 1), dtype=torch.long, fill_value=0)
# because the __init__ of gpytorch.models.ExactGP
# reshapes to this shape anyway.
```

This way the indices tensor ends up like this:

```python
>>> full_train_i
tensor([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
>>> full_train_i.shape
torch.Size([15])
```

And if I try running the whole thing:
```python
likelihood = gpytorch.likelihoods.GaussianLikelihood()
# Here we have two items that we're passing in as train_inputs
model = MultitaskGPModel((full_train_x, full_train_i), full_train_y, likelihood)

# this is for running the notebook in our testing framework
import os
smoke_test = ('CI' in os.environ)
training_iterations = 2 if smoke_test else 50

# Find optimal model hyperparameters
model.train()
likelihood.train()

# Use the adam optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)  # Includes GaussianLikelihood parameters

# "Loss" for GPs - the marginal log likelihood
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

for i in range(training_iterations):
    optimizer.zero_grad()
    output = model(full_train_x, full_train_i)
    loss = -mll(output, full_train_y)
    loss.backward()
    print('Iter %d/%d - Loss: %.3f' % (i + 1, training_iterations, loss.item()))
    optimizer.step()
```

It runs. Thank you in advance.
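As a follow-up (not part of the original post), here is a minimal sketch of how the same 1-D index convention would carry over to prediction, assuming the `model` and `likelihood` trained above:

```python
# Switch to evaluation mode for posterior predictions
model.eval()
likelihood.eval()

# Hypothetical test points: 10 new 2-D inputs, all queried for task 0
test_x = torch.rand(10, 2)
test_i = torch.full((10,), dtype=torch.long, fill_value=0)

with torch.no_grad(), gpytorch.settings.fast_pred_var():
    pred = likelihood(model(test_x, test_i))
    mean = pred.mean
    lower, upper = pred.confidence_region()
```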