making Multinomial sampling slightly faster #786
base: master
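Context for readers (not part of the diff): the test below exercises an "alias multinomial" sampler. Assuming the PR follows the standard Walker/Vose alias method, the idea is an O(K) table build after which every draw costs O(1). A minimal CPU-side Lua sketch, with hypothetical helper names (buildAliasTable, aliasDraw):

-- Hedged sketch of Walker/Vose alias sampling (illustration only, not the
-- code in this PR). buildAliasTable and aliasDraw are hypothetical names.
require 'torch'

local function buildAliasTable(p)            -- p: 1D torch.Tensor summing to 1
   local K = p:size(1)
   local prob  = torch.Tensor(K)             -- acceptance threshold per bucket
   local alias = torch.LongTensor(K)         -- fallback outcome per bucket
   local scaled = p:clone():mul(K)           -- rescale so an average bucket is 1
   local small, large = {}, {}
   for i = 1, K do
      if scaled[i] < 1 then table.insert(small, i) else table.insert(large, i) end
   end
   while #small > 0 and #large > 0 do
      local s, l = table.remove(small), table.remove(large)
      prob[s]  = scaled[s]
      alias[s] = l
      scaled[l] = scaled[l] - (1 - scaled[s]) -- the large bucket donates its excess
      if scaled[l] < 1 then table.insert(small, l) else table.insert(large, l) end
   end
   for _, i in ipairs(small) do prob[i] = 1; alias[i] = i end
   for _, i in ipairs(large) do prob[i] = 1; alias[i] = i end
   return prob, alias
end

local function aliasDraw(prob, alias)        -- one O(1) draw from the table
   local i = math.random(prob:size(1))       -- uniform bucket
   if math.random() < prob[i] then return i else return alias[i] end
end

Each draw needs only one uniform bucket index and one comparison, which is what makes the method attractive when drawing many samples at once.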
New file (@@ -0,0 +1,52 @@): a benchmark and correctness test for the alias multinomial sampler on CUDA.
local tester = torch.Tester()

cmd = torch.CmdLine()
cmd:text()
cmd:text()
cmd:text('Testing alias multinomial on cuda')
cmd:text()
cmd:text('Options')
cmd:option('--compare', false, 'compare with cutorch multinomial')
cmd:text()

-- parse input params
params = cmd:parse(arg)  -- note: params.compare is not used below

require 'cutorch'
local function checkMultinomial()
   local n_class = {10, 100, 1000}
   local n_sample = {10, 100, 1000, 10000}
   local n_dist = 100
   for _, curr_n_class in pairs(n_class) do
      for _, curr_n_sample in pairs(n_sample) do
         print("")
         print("Benchmarking multinomial with "..curr_n_class.." classes and "..curr_n_sample.." samples")
         torch.seed()
         local probs = torch.CudaDoubleTensor(n_dist, curr_n_class):uniform(0, 1)

[Inline review comment on the CudaDoubleTensor allocation above]
Reviewer: I'm not sure that benchmarking using float64 is useful, since almost all work done with Torch is in float32. Furthermore, this will have very skewed results on different GPUs due to the lack of float64 ALUs.
Author: Do you suggest using …?
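A minimal sketch of the float32 variant the reviewer appears to be suggesting; only the allocation changes, since torch.CudaTensor is cutorch's float32 tensor type (the validation block near the end of the file would need the same substitution):

         -- hedged sketch, not part of the diff: benchmark in float32 so the
         -- numbers reflect the precision most Torch workloads actually use
         local probs = torch.CudaTensor(n_dist, curr_n_class):uniform(0, 1)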
         local a = torch.Timer()
         local cold_time = a:time().real  -- measured but not reported anywhere
         a:reset()
         cutorch.synchronize()
         a:reset()
         for i = 1, 10 do
            torch.multinomial(probs, curr_n_sample, true)
            cutorch.synchronize()

[Inline review comment on the cutorch.synchronize() call inside the loop]
Reviewer: Why are you synchronizing every time through for the benchmark? One should only synchronize at the beginning and the end.
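For reference, a sketch of the timing pattern the reviewer describes: synchronize once before starting the timer and once after the loop, so the measurement is not inflated by a forced device sync on every draw (names reuse the script's variables):

         cutorch.synchronize()  -- drain any pending GPU work first
         local timer = torch.Timer()
         for i = 1, 10 do
            torch.multinomial(probs, curr_n_sample, true)
         end
         cutorch.synchronize()  -- wait for all 10 draws to complete
         print("[CUDA] : torch.multinomial draw: "..(timer:time().real/10).." seconds (hot)")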
         end
         print("[CUDA] : torch.multinomial draw: "..(a:time().real/10).." seconds (hot)")
      end
      -- correctness check: with 5M samples per row, the empirical class
      -- frequencies should match the normalized probabilities to within 0.01
      torch.seed()
      local probs = torch.CudaDoubleTensor(3, curr_n_class):uniform(0, 1)
      for i = 1, 3 do
         probs[i]:div(probs[i]:sum())
      end
      local output = torch.multinomial(probs, 5000000, true)
      local counts = torch.Tensor(3, curr_n_class):zero()
      for i = 1, 3 do
         output[i]:long():apply(function(x) counts[{i, x}] = counts[{i, x}] + 1 end)
         counts[i]:div(counts[i]:sum())
      end
      tester:eq(probs:double(), counts, 0.01, "probs and counts should be approximately equal for n_class = "..curr_n_class)
   end
end
tester:add(checkMultinomial)
tester:run()

[Review comment on the file]
Reviewer: Plus, this wasn't converted.
Author: Yeah, I didn't do it for this one. I just tried it out for the other one. If the idea convinces everyone, I could do it for this one too.