-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Documentation, Examples #1
Comments
You are right, the API is actually documented fairly well, but end-user examples and a generated documentation website is still missing. This will come eventually. Regarding boundary conditions: accesses outside the array evaluate to nothing, so you can check for that and specify the behaviour yourself. using StaticKernels
a = collect(1:9) constantk = Kernel{(1:1,)}(w -> something(w[1], 5))
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 5] replicatek = Kernel{(-1:1,)}(w -> something(w[1], w[0]))
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 9] mirrork = Kernel{(-1:1,)}(w -> something(w[1], w[-1]))
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 8] circularcurrently no nice solution, but you can hack it by referencing the data array within your window function like this: k = Kernel{(-1:1,)}(w -> something(w[1], a[1]))
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 1] or more generally by using k = Kernel{(1:1,)}(boundary=5, w -> w[1])
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 5]
k = Kernel{(-1:1,)}(boundary=:replicate, w -> w[1])
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 9]
k = Kernel{(-1:1,)}(boundary=:mirror, w -> w[1]))
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 8]
k = Kernel{(-1:1,)}(boundary=:circular, w -> w[1])
map(k, a) # [2, 3, 4, 5, 6, 7, 8, 9, 1] |
I don't think the readme is the place for that. You can look at the documentation of |
Thank you for the suggestions.
But you are totally right: documentation for end-users is not great at the moment, I will try to fix that. |
Is there way to have |
I'm not sure I understand what you are searching for. It will already throw an error if the kernel makes out-of-bounds accesses on arrays without an extension defined. This is since boundary conditions are implemented by calling the window function with properly cropped windows and out-of-bounds access gets turned into calls to If you instead search for some kind of static analysis of the kernel function at compile-time such that no out-of-bounds access will be performed, then I think that is out-of-scope for this package: window functions are treated as more or less black box functions. The |
Think of a simple 3x3 kernel ( Let's say I want to apply this kernel over and over on the same image, but I only care about the inner part. I'd do something like: map!(k, mI, mI); Yet the problem is the return size is I was wondering if there is something built in. Also, any chance for a guide, for beginners, about creating a user defined extension / kernel? Specifically about handling the boundaries. |
Just use an extension: If for whatever reason you absolutely want to apply the kernel only on the relevant parts (which will be getting smaller each iteration) you need to do that yourself, but the provided For extensions there is not much documentation available but it is usually enough to implement |
I will try to give a concrete example, from the discussion at https://discourse.julialang.org/t/92691: using BenchmarkTools;
using LoopVectorization;
using StaticKernels;
function SolveLaplace!( mT, numItr )
numRows, numCols = size(mT);
for kk in 1:numItr
for jj in 2:(numCols - 1)
for ii in 2:(numRows - 1)
mT[ii, jj] = 0.25 * (mT[ii - 1, jj] + mT[ii + 1, jj] + mT[ii, jj - 1] + mT[ii, jj + 1]);
end
end
end
return mT;
end
function SolveLaplaceTurbo!( mT, numItr )
numRows, numCols = size(mT);
for kk in 1:numItr
@turbo for jj in 2:(numCols - 1), ii in 2:(numRows - 1)
mT[ii, jj] = 0.25 * (mT[ii - 1, jj] + mT[ii + 1, jj] + mT[ii, jj - 1] + mT[ii, jj + 1]);
end
end
return mT;
end
function SolveLaplaceKernel!( mT, numItr )
numRows, numCols = size(mT);
k = @kernel w -> 0.25 * (w[-1, 0] + w[1, 0] + w[0, -1] + w[0, 1])
mV = @view mT[2:(numRows - 1), 2:(numCols - 1)]
for kk in 1:numItr
# map!(k, mT, extend(mT, StaticKernels.ExtensionReplicate()));
map!(k, mV, mT);
end
return mT;
end
n = 100;
T = zeros(n, n);
T[1, 1:n] .= 500;
T[n, 1:n] .= 500;
T[1:n, 1] .= 200;
T[1:n, n] .= 500;
numItr = 1000;
@btime SolveLaplace!($T, $numItr);
@btime SolveLaplaceTurbo!($T, $numItr);
@btime SolveLaplaceKernel!($T, $numItr); I wanted to show that Is there any simple optimization to |
Your output and input array for the kernel operation is the same, which means that you almost certainly compute the wrong thing here since you overwrite relevant input data. Nevertheless you could use an intermediate buffer like so to improve: function SolveLaplaceKernel2!( mT, numItr )
numRows, numCols = size(mT);
mBuf = similar(mT);
k = @kernel w -> 0.25 * (w[-1, 0] + w[1, 0] + w[0, -1] + w[0, 1])
mTExt = extend(mT, StaticKernels.ExtensionReplicate())
mBufExt = extend(mBuf, StaticKernels.ExtensionReplicate())
for kk in 1:numItr÷2
map!(k, mBuf, mTExt);
map!(k, mT, mBufExt);
end
if isodd(numItr)
map!(k, mBuf, mTExt)
copy!(mT, mBuf)
end
return mT;
end For me this has an factor of ~8 improvement over your version, but please benchmark for yourself. PS: I don't think this issue tracker is the right place for these discussions. |
In this context it is OK to work in place (Gauss Seidel vs. Jacobi iterations). Regarding the discussion, any chance you enable "Discussions" for this repository? |
Ok, in that case you can ignore my example since I assumed it was unintended. If you have this data dependency, then there are there are almost no options for vectorization since computation for each component depends on the result of the previous computation, which is why you have similar performance as the naive loop. When you use I'll try enabling the discussion feature. |
Hi, In documentation: # 2d total variation
k = @kernel w -> abs(w[1,0] - w[0,0]) + abs(w[0,1] - w[0,0])
sum(k, extend(a, StaticKernels.ExtensionReplicate())) What's the difference between using |
Hello,
This looks like a great addition to Julia for those who are working on Image Processing.
I think it is missing documentation.
But for now, could you show examples with all supported boundary conditions?
Usually we have:
Thank You.
The text was updated successfully, but these errors were encountered: