-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance TODO #19
Comments
Do people expect monad transformers to work with unlawful monads as a regular thing? I was hoping we could just tell people they'll get undefined behavior if they do that. If it's a big concern, we could use the module system to make the optimization "opt-in". |
As long as the expectations and consequences are adequately documented, the sheriff won't come and drag you away. I'm mostly imagining issues with others' test suites that intentionally use (slightly) unlawful instances. That said, if you're actually able to get reliably good performance in realistic situations using rewrite rules, it's hard to argue with that result. |
I attempted a |
I'm having the same issue with a rewrite rule for I'll explain how far I made it with stream fusion. First I added these definitions, which I believe to be a pretty straightforward aspect of the stream fusion literature, although I'm new to this so I might have done something goofy here without realizing it: data Step s a = Done | Yield a s
data Stream a = forall s. Stream (s -> Step s a) s
type ViewS m a = Stream (m (View m a))
stream :: SeqT m a -> ViewS m a
stream (SeqT m) = Stream next m where
{-# INLINE next #-}
next s = case viewl s of
EmptyL -> Done
h S.:< t -> Yield h t
{-# INLINE[0] stream #-}
unstream :: ViewS m a -> SeqT m a
unstream (Stream next s0) = SeqT (unfold s0)
where
unfold s = case next s of
Done -> S.empty
Yield x xs -> x S.<| unfold xs
{-# INLINE[0] unstream #-}
{-# RULES
"stream-unstream" [1] forall (s::ViewS m a). stream (unstream s) = s;
#-} For a warm up, I tried to make a fusable version of alt :: SeqT m a -> SeqT m a -> SeqT m a
alt a b = unstream (alt_s (stream a) (stream b))
{-# INLINE[2] alt #-}
alt_s :: ViewS m a -> ViewS m a -> ViewS m a
alt_s (Stream next_a a) (Stream next_b b) = Stream next (a,b)
where
{-# INLINE next #-}
next (s_a,s_b) = case next_a s_a of
Done -> case next_b s_b of
Done -> Done
Yield y ys -> Yield y (s_a, ys)
Yield x xs -> Yield x (xs, s_b)
{-# INLINE[2] alt_s #-} I'm not a huge fan of how Next I thought I would attempt bind :: Monad m => SeqT m a -> (a -> SeqT m b) -> SeqT m b
bind m f = unstream (bind_s (stream m) (stream . f))
bind_s :: Monad m => ViewS m a -> (a -> ViewS m b) -> ViewS m b
bind_s (Stream next_a a) f = Stream next a
where
{-# INLINE next #-}
next s_a = case next_a s_a of
Done -> Done
Yield y ys -> Yield (y >>= go ys) ys
go ys Empty = return Empty
go ys (b :< bs) = case f b `alt_s` (bind_s (stream bs) f) of
Stream next_fb fb -> case next_fb fb of
Done -> return Empty
Yield h t -> h In simple test cases like I think the issue is in the last line of I think to fix the code above, I would need something of type: flatten :: ViewS m a -> m (View m a) And I'm not sure that function makes sense. I thought "okay, maybe deal with That said, I'm curious though, if we only could fuse chooseSeqT :: (Foldable t, Monad m) => t a -> SeqT m a
chooseSeqT xs = unstream (foldr (alt_s . stream . pure) done xs)
{-# INLINEABLE chooseSeqT #-}
{-# SPECIALIZE chooseSeqT :: [a] -> Seq a #-}
{-# SPECIALIZE chooseSeqT :: Foldable t => t a -> Seq a #-}
done :: ViewS m a
done = Stream (const Done) Empty
{-# INLINE CONLIKE [0] done #-} However, that doesn't appear to lead to any better performance than just using the existing stuff with good inlining/specialization. However, I did learn one thing along the way. That functions like |
Ah, you're looking at stream fusion of the queue.... I was thinking more about |
That's an interesting avenue but I would expect it to also be more brittle and harder to ensure correctness. I guess when you have multiple rules you need to ensure they are confluent. And that's one of the nice things about using a single general rule like |
I believe what I'm talking about should be confluent too—very much like
fold/build fusion. The bigger question is what approach works better (in
different contexts).
…On Sun, Jul 25, 2021, 7:53 PM Jason Dagit ***@***.***> wrote:
That's an interesting avenue but I would expect it to also be more brittle
and harder to ensure correctness. I guess when you have multiple rules you
need to ensure they are confluent. And that's one of the nice things about
using a single general rule like stream-unstream and manually writing the
fuseable definitions. You don't have to think about making things confluent
and there are fewer rules to potentially not apply for random reasons.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#19 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7KEBWAZEFFBUXW6NFTTZSPWPANCNFSM5A474RSA>
.
|
I worked on this a bit more today after all. A friend (dmwit) helped me with it a bit. It's in the |
Ah, I know Daniel Wagner too, though I haven't seen him since the pandemic
started.
…On Sun, Jul 25, 2021, 11:45 PM Jason Dagit ***@***.***> wrote:
I worked on this a bit more today after all. A friend (dmwit) helped me
with it a bit. It's in the stream branch if you want to see it. It's back
to the original performance of SeqT before we started optimizing things.
So that's a bummer, however I did see that stream-unstream was firing on
my local example code. So that's kind of cool.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#19 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7N2KXX5DX2NP3D2OOLTZTK4HANCNFSM5A474RSA>
.
|
Okay, I really goofed implementing Stream version:
LogicT:
Basically, the it's within 2x of |
Nice!!!
…On Mon, Jul 26, 2021, 2:15 PM Jason Dagit ***@***.***> wrote:
Okay, I really goofed implementing >>= yesterday. I just pushed a new
version thanks to dmwit. The stream version is now the fastest version by
quite a bit.
Stream version:
SeqT IO with msplit:
Batch generated in: 0.24157 seconds
Batch generated in: 0.236334 seconds
Batch generated in: 0.22816799999999998 seconds
Batch generated in: 0.228131 seconds
Batch generated in: 0.22116 seconds
Total time: 1.155495 seconds
LogicT:
LogicT Identity without msplit:
Batch generated in: 0.14227499999999998 seconds
Batch generated in: 0.124216 seconds
Batch generated in: 0.1135 seconds
Batch generated in: 0.159997 seconds
Batch generated in: 0.120277 seconds
Total time: 0.660371 seconds
Basically, the it's within 2x of LogicT now.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#19 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAOOF7LVTCWEDLUNXE5C3STTZWQ4BANCNFSM5A474RSA>
.
|
I'm opening this ticket to track performance improvement ideas for a bit.
(>>=)
and so on. In particular, this would allow GHC to inlinef
inm >>= f
. Does this actually help? We'll need measurements to see.sequence
ortype-aligned
dependencies for performance reasons? #4.Monad
instances.The text was updated successfully, but these errors were encountered: