With a big dataset, switching between tags in workview (actionable) mode is slower than it used to be in 0.3.x #402

Closed
nekohayo opened this issue Jul 8, 2020 · 3 comments · Fixed by #530
Labels
  • bug
  • needinfo: We need more info to solve the issue, or we can't fix it.
  • patch-or-wont-happen: Core maintainers would like this, but lack time/energy. Contribute a patch or it won't happen.
  • performance: Issues affecting the speed and responsiveness of the application
  • priority:high
  • reproducible-in-git: Issues that affect the current dev version

Comments

@nekohayo
Member

nekohayo commented Jul 8, 2020

GTG has long been rather slow at displaying large numbers of tasks, particularly when you have several hundred of them and click "All tasks" (instead of a specific tag) while not in the "Actionable" view mode (a.k.a. workview).

Up until GTG 0.3.1, this was very painful for startup times (and for exiting the workview). This is issue #109.

However, with GTG 0.4 / git, the performance behavior pattern is different from 0.3.x:

  • In 0.3.x, if you were inside the workview, and loaded only the tasks from a particular tag, instead of trying to display everything, then it was very fast.
  • In 0.4/git, switching between tags is slow even if you are in "Actionable" view mode, even if it ends up displaying only a couple of tasks in the end.
  • Interestingly, one area where 0.4/git is faster than 0.3.x is toggling between Open and Actionable (workview) modes. In 0.3.x, it was excruciatingly slow; in 0.4, it's instantaneous.

So, from a user's perspective, you lose something and you gain something. The performance problem kinda swapped around between 0.3.1 and 0.4: switching tags was fast and is now slow, switching view modes was slow and is now fast.

You can run ./launch.sh -s bryce to load the heavy (~1MB) "bryce" dataset, which makes performance issues easier to test.

While the port to LXML might fix the symptoms of issue #109 and of this new ticket, I think it's important to report that there is some sort of underlying performance issue before the problem gets swept under the carpet by LXML's superior performance:

  • Normally, displaying small numbers of tasks should be near instantaneous, or at least much faster than displaying "all tasks". Right now, when a tag has over 100 tasks in it, it feels equally slow to switch to that tag whether you are in Open, Actionable or Closed tasks view mode.
  • This symptom tells me there is "something wrong" in the logic GTG 0.4 uses to fetch and display tasks: it is clearly doing superfluous work, which is what most performance problems boil down to. If I were to make a wild guess, it is probably loading all the tasks of that tag and then re-filtering them for display depending on the view, or something like that, instead of requesting/loading only a subset of the tasks. Or... maybe it is doing everything right. I don't know; someone needs to investigate/profile the code. For some general tips, see https://fortintam.com/blog/profiling-specto-and-whole-python-applications-in-general/ and other posts in https://fortintam.com/blog/tag/profiling/
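One way to check the guess in the list above is to profile a single tag switch with Python's built-in cProfile module. The sketch below is only an illustration: switch_to_tag and the task dictionaries are hypothetical stand-ins, not GTG's actual code or data model. In practice you would wrap whatever function GTG really runs when a tag is selected.

```python
import cProfile
import io
import pstats


def switch_to_tag(tasks, tag):
    """Hypothetical stand-in for the work done when a tag is selected."""
    matching = [t for t in tasks if tag in t["tags"]]
    return sorted(matching, key=lambda t: t["title"])


def profile_call(func, *args, top=10):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


# Made-up data, only there so the sketch runs on its own.
tasks = [
    {"title": f"task {i}", "tags": ["work"] if i % 2 else ["home"]}
    for i in range(1000)
]
result, report = profile_call(switch_to_tag, tasks, "work")
print(report)
```

The call counts in the report show whether far more tasks are being touched than end up on screen.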

Running sysprof while switching between tags suggests this is not necessarily a disk I/O bound problem: the graph shows some "write" spikes on my SSD, but they don't fill the graph, so at least we're probably not reading/writing to the disk all the time. It looks like a CPU-bound issue instead, suggesting that GTG recalculates everything every time. (Note that the graph doesn't show all my cores/threads in use; the low CPU usage is certainly because the work is single-threaded.)

[Image: Screenshot from 2020-07-08 19-20-52]

After switching between tags a couple of times, Sysprof reports a bunch of calls being "hit" a total of... over 73 thousand times? I don't actually know how to interpret the functions/callers/descendants views in Sysprof...
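A rough way to cross-check the "CPU-bound rather than I/O-bound" reading from inside Python is to compare wall-clock time with CPU time for the same operation. This is a minimal sketch under stated assumptions: busy_sort and idle_wait are made-up stand-ins, and in GTG you would wrap the real tag-switching call instead.

```python
import time


def measure(func, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


def busy_sort(n):
    """CPU-only stand-in: sort n integers with a Python-level key."""
    return sorted(range(n, 0, -1), key=lambda x: x % 97)


def idle_wait(seconds):
    """Waiting stand-in: sleeps, so it uses almost no CPU time."""
    time.sleep(seconds)


busy_result, busy_wall, busy_cpu = measure(busy_sort, 200_000)
_, idle_wall, idle_cpu = measure(idle_wait, 0.2)

print(f"busy_sort: wall={busy_wall:.3f}s cpu={busy_cpu:.3f}s")
print(f"idle_wait: wall={idle_wall:.3f}s cpu={idle_cpu:.3f}s")
```

When CPU time is close to wall time, the operation is spending its time computing. When CPU time is much smaller, it is mostly waiting on something such as disk access or sleeping.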

@nekohayo added the bug, priority:high, needinfo, reproducible-in-git, patch-or-wont-happen and performance labels on Jul 8, 2020
@diegogangl
Contributor

LXML won't change much, because XML parsing/writing isn't the bottleneck; it's probably a very small part of the time spent. Also, reading is only done on startup. IIRC the biggest amount of time is spent on sorting (inside liblarch).
Not sure sysprof can be used for Python.
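To make the cost of sorting concrete, here is a minimal sketch (not liblarch's actual code) that counts comparisons when a task list is sorted in full and then filtered, versus filtered first and then sorted. The task dictionaries, the 827-task size (borrowed from the "bryce" dataset) and the tag distribution are assumptions made up for the example.

```python
import functools
import random


def make_tasks(n=827, seed=0):
    """Build n made-up tasks; every 16th one carries the 'work' tag."""
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "title": f"task {rng.randrange(10_000)}",
            "tag": "work" if i % 16 == 0 else "other",
        }
        for i in range(n)
    ]


def counting_key(counter):
    """Sort key that increments counter[0] on every comparison."""
    def compare(a, b):
        counter[0] += 1
        return (a["title"] > b["title"]) - (a["title"] < b["title"])
    return functools.cmp_to_key(compare)


def sort_then_filter(tasks, tag):
    counter = [0]
    ordered = sorted(tasks, key=counting_key(counter))
    return [t for t in ordered if t["tag"] == tag], counter[0]


def filter_then_sort(tasks, tag):
    counter = [0]
    subset = [t for t in tasks if t["tag"] == tag]
    return sorted(subset, key=counting_key(counter)), counter[0]


tasks = make_tasks()
all_first, all_first_cmps = sort_then_filter(tasks, "work")
subset_first, subset_first_cmps = filter_then_sort(tasks, "work")

print(f"sort everything, then filter: {all_first_cmps} comparisons")
print(f"filter, then sort the subset: {subset_first_cmps} comparisons")
```

Both orders produce the same list because Python's sort is stable, but sorting only the subset needs far fewer comparisons. Whether GTG or liblarch actually sorts more than it displays is something only profiling can confirm.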

@nekohayo nekohayo linked a pull request Jan 3, 2021 that will close this issue
@nekohayo
Member Author

nekohayo commented Jan 3, 2021

@jaesivsm's changes in PR 530 might actually solve this, or maybe part of it only (the regression part, but not the "slow switching in absolute terms" part)...

@nekohayo
Member Author

nekohayo commented Jan 15, 2021

Here are my benchmark observations upon the completion of PR #530 and the accompanying commit in liblarch. I tested with the "bryce" dataset and a copy of my own (even bigger) dataset.

Benchmark data set: "Bryce"

I saw a 2x improvement with various operations when running the "bryce" sample data set (./launch.sh -s "bryce"), which has 827 tasks.

Git master

  • startup time: 23.0 secs
  • switching to "Actionable" view: 0 secs
    • switching to "work" tag: 6.2 secs
    • switching back to all tasks: 13.6 secs
  • switching back to "Open" tasks view: 0 secs
    • switching to "work" tag: 6.2 secs
    • switching back to all tasks: 13.6 secs

With the optimizations from PR 530

  • startup time: 13.5 secs
  • switching to "Actionable" view: 6.3 secs (the first time only)
    • switching to "work" tag: 3.4 secs
    • switching back to all tasks: 6.4 secs
  • switching back to "Open" tasks view: 0 secs
    • switching to "work" tag: 3.5 secs
    • switching back to all tasks: 6.4 secs

Note: performance gains on the "bryce" dataset are 2x rather than the potential 3x because that dataset has at most one closed task in it, so from a data perspective it's almost as if it only had two panes in practice.
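The "2x" figure can be checked directly from the "bryce" timings listed above. The snippet below is just that arithmetic written out; the dictionary keys are labels chosen for this example, and the numbers are copied from the two lists.

```python
# Timings in seconds, copied from the "bryce" benchmark lists above.
before = {
    "startup": 23.0,
    "switch to 'work' tag": 6.2,
    "switch back to all tasks": 13.6,
}
after = {
    "startup": 13.5,
    "switch to 'work' tag": 3.4,
    "switch back to all tasks": 6.4,
}


def speedup(before_secs, after_secs):
    """How many times faster the operation became."""
    return before_secs / after_secs


factors = {name: speedup(before[name], after[name]) for name in before}
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x faster")
```

The first-time switch to the "Actionable" view is left out because it went from 0 secs to 6.3 secs, so a ratio is not meaningful there.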

Benchmark data set: my own data as of 2021-01-14

I saw 20% to 85% better performance with the changes from PR 530.

Git master

  • startup time: 23.3 secs
  • switching to "Actionable" view: 0 secs
    • switching to "apparte" tag: 3.7 secs
    • switching back to all tasks: 15.2 secs
  • switching back to "Open" tasks view: 0 secs
    • switching to "apparte" tag: 3.7 secs
    • switching back to all tasks: 15.2 secs

With the optimizations from PR 530

  • startup time: 20.9 secs
  • switching to "Actionable" view: 3.0 secs (the first time only)
    • switching to "apparte" tag: 0.9 secs
    • switching back to all tasks: 2.7 secs
  • switching back to "Open" tasks view: 0 secs
    • switching to "apparte" tag: 2.9 secs
    • switching back to all tasks: 12.3 secs

Overall:

  • We only get a slight regression in performance (not that big/slow) when switching between Open/Actionable the first time. Further switches between "Open" and "Actionable" views are instantaneous, even if you change the contents of a task.
  • Everything else is faster, sometimes by 20-30%, sometimes by a factor of 2x, sometimes by 80%+ (e.g. going from 15 secs to 2-3 secs to switch to "All tasks" in actionable view with my personal dataset).
  • Verdict: in practical use, it's a net win.
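For anyone who wants to recompute the percentages, here is the arithmetic for the personal dataset, using the timings listed above. The dictionary keys are labels chosen for this example.

```python
# (before, after) timings in seconds, copied from the personal-dataset lists above.
timings = {
    "startup": (23.3, 20.9),
    "Actionable: 'apparte' tag": (3.7, 0.9),
    "Actionable: all tasks": (15.2, 2.7),
    "Open: 'apparte' tag": (3.7, 2.9),
    "Open: all tasks": (15.2, 12.3),
}


def percent_reduction(before_secs, after_secs):
    """Share of the original time that was saved, in percent."""
    return (before_secs - after_secs) / before_secs * 100


reductions = {
    name: percent_reduction(b, a) for name, (b, a) in timings.items()
}
for name, value in reductions.items():
    print(f"{name}: {value:.0f}% less time")
```

The first-time switch to the "Actionable" view (0 secs before, 3.0 secs after) is omitted, since a percentage relative to zero is undefined.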
