Slowdown above some array-size threshold #2019

jzluo · 2022-09-13T19:46:31Z

Sorry, wasn't sure how to title this. Continued from #2016.

Thanks so much for all your work. It is much improved, and for my case of 20 columns it is no longer slower than NumPy. I played around with it a little more and also tested an equivalent formulation, (X.T * rootW).T (call it XtW), alongside the original rootW[:, np.newaxis] * X (call it XW). Please see the plot: XtW takes no performance hit with SIMD, whereas XW is still slower than NumPy once the number of columns exceeds 20 in my example. However, at some point (>200 columns in my case) they all become much slower for some reason, which I suppose belongs in a different issue.

import numpy as np

#pythran export XW_pythran(float[:,:], float[])
def XW_pythran(X, preds):
    rootW = np.sqrt(preds * (1 - preds))
    XW = rootW[:, np.newaxis] * X
    return XW

#pythran export XtW_pythran(float[:,:], float[])
def XtW_pythran(X, preds):
    rootW = np.sqrt(preds * (1 - preds))
    XW = (X.T * rootW).T
    return XW

# for plot
import perfplot

np.random.seed(0)
preds = np.random.random(20000)
perfplot.show(
    setup=lambda n: np.random.rand(20000, n),
    kernels=[
        lambda X: get_XW(X, preds),  # pure numpy version of XW_pythran
        lambda X: XW_pythran(X, preds),   # -O3 -march=native
        lambda X: XW_pythran_simd(X, preds),  # -O3 -march=native -DUSE_XSIMD
        lambda X: XtW_pythran(X, preds),
        lambda X: XtW_pythran_simd(X, preds)
    ],
    labels=["np", "pythran_XW", "pythran_simd_XW", "pythran_XtW", "pythran_simd_XtW"],
    n_range=list(range(20, 280, 20)),
    xlabel="n_cols",
    relative_to=0,
)
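The get_XW baseline is not shown in the script. A minimal sketch, assuming it simply mirrors XW_pythran without compilation (the body of get_XW and the helper get_XtW are assumptions for illustration, not the code used for the plot), together with a check that the XW and XtW formulations produce the same values:

```python
import numpy as np

def get_XW(X, preds):
    # Assumed pure-NumPy baseline: same expression as XW_pythran, uncompiled.
    rootW = np.sqrt(preds * (1 - preds))
    return rootW[:, np.newaxis] * X

def get_XtW(X, preds):
    # Hypothetical NumPy counterpart of XtW_pythran.
    rootW = np.sqrt(preds * (1 - preds))
    return (X.T * rootW).T

rng = np.random.default_rng(0)
X = rng.random((200, 20))   # small sizes, just for the equivalence check
preds = rng.random(200)

XW = get_XW(X, preds)
XtW = get_XtW(X, preds)

# Both formulations scale row i of X by rootW[i], so the values should agree.
same_values = np.allclose(XW, XtW)
print(same_values)
```

The two expressions differ only in how the broadcast is arranged, which is why any timing gap between them points at iteration order rather than at the arithmetic.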

Originally posted by @jzluo in #2016 (comment)

serge-sans-paille · 2022-09-16T12:50:23Z

I ran the kernel under perf stat, and the performance issue is due to a large number of L1 cache misses. We must be doing something not smart with respect to the order of iteration :-/
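As a rough illustration of why iteration order matters for these arrays (a sketch of the memory layout only, not a diagnosis of the loops Pythran generates), the strides of X and X.T can be inspected in NumPy. The array shape below is an assumption chosen to match the 200-column case from the plot:

```python
import numpy as np

X = np.random.rand(2000, 200)   # C-ordered float64, 200 columns
Xt = X.T                        # a view on the same buffer, no copy

# Strides are in bytes: moving along a row of X advances 8 bytes,
# moving down a column jumps a whole row (200 * 8 = 1600 bytes).
print(X.strides)    # (1600, 8)
print(Xt.strides)   # (8, 1600)

# The transpose is Fortran-contiguous, so its fast axis is the first one.
print(Xt.flags['F_CONTIGUOUS'], np.shares_memory(X, Xt))
```

A loop whose innermost index walks the 1600-byte stride touches a different cache line on every step, which is the kind of access pattern that tends to show up as L1 misses once rows get wide enough.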
