With a big dataset, switching between tags in workview (actionable) mode is slower than it used to be in 0.3.x #402

nekohayo · 2020-07-08T23:26:03Z

It is a long-standing fact that GTG has always been rather slow when displaying large numbers of tasks, particularly when you have several hundred of them and you click "All tasks" (instead of a specific tag) while not in the "Actionable" view mode (a.k.a. workview).

Up until GTG 0.3.1, this was very painful for startup times (and for exiting the workview). This is issue #109.

However, with GTG 0.4 / git, the performance behavior pattern is different from 0.3.x:

In 0.3.x, if you were inside the workview, and loaded only the tasks from a particular tag, instead of trying to display everything, then it was very fast.
In 0.4/git, switching between tags is slow even if you are in "Actionable" view mode, even if it ends up displaying only a couple of tasks in the end.
Interestingly, one area where 0.4/git is faster than 0.3.x is toggling between Open and Actionable (workview) modes. In 0.3.x, it was excruciatingly slow; in 0.4, it's instantaneous.

So, from a user's perspective, you lose something and you gain something. The performance problem kinda swapped around between 0.3.1 and 0.4: switching tags was fast and is now slow, switching view modes was slow and is now fast.

You can use ./launch.sh -s bryce to use the heavy (~1MB) "bryce" dataset for testing performance issues more easily.

While the port to LXML might fix the symptoms of issue #109 and of this new ticket here, I think it's important to report that there is some sort of underlying performance issue before the problem gets "hidden under the carpet" by LXML's superior performance:

Normally, displaying small numbers of tasks should be near instantaneous, or at least much faster than displaying "all tasks". Right now, when a tag has over 100 tasks in it, it feels equally slow to switch to that tag whether you are in Open, Actionable or Closed tasks view mode.
This symptom tells me there is "something wrong" in the logic that GTG 0.4 applies to fetch and display the tasks, because it's clearly doing "too much" (a.k.a. superfluous) work, which is typically what 99% of performance problems boil down to. If I were to make a wild guess, it's probably loading all the tasks of that tag and then re-filtering them for display depending on the view, or something like that, instead of requesting/loading only a subset of the tasks. Or... maybe it is doing everything right. I don't know; one needs to investigate/profile the code. For some general tips, see https://fortintam.com/blog/profiling-specto-and-whole-python-applications-in-general/ and other posts in https://fortintam.com/blog/tag/profiling/

Running sysprof while switching between tags, this does not seem to be a disk I/O-bound problem: the graph has some "writes" spikes on my SSD, but they don't fill the graph, so we're probably not reading/writing to the disk all the time. It looks more like a CPU-bound issue (the graph doesn't use all my cores/threads; the low overall CPU usage is certainly because GTG is single-threaded), suggesting that GTG recalculates everything every time.

Switching between tags a couple of times, it "hits" a bunch of calls a total of... over 73 thousand times? I don't actually know how to interpret the functions/callers/descendants views in Sysprof...


diegogangl · 2020-07-08T23:51:16Z

Lxml won't change much because xml parsing/writing isn't the bottleneck. It's probably a very small part of the time spent too. Also reading is only done on startup. IIRC the biggest amount of time is spent on sorting (inside liblarch).
Not sure sysprof can be used for Python.
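For Python-level hotspots, the standard library's cProfile may be easier to read than sysprof. Below is a minimal sketch of profiling a single call and listing the most expensive functions by cumulative time. Note that `switch_tag` and the task dictionaries are hypothetical stand-ins made up for illustration, not GTG or liblarch code; in practice you would wrap whatever GTG callback runs when a tag is selected.

```python
import cProfile
import io
import pstats


def switch_tag(tasks, tag):
    """Hypothetical stand-in for the work done when a tag is selected."""
    matching = [t for t in tasks if tag in t["tags"]]
    return sorted(matching, key=lambda t: t["title"])


def profile_call(func, *args, limit=10):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the callers that dominate come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


# Fake dataset with the same task count as "bryce" (827); contents are invented.
tasks = [
    {"title": f"task {i:04d}", "tags": ["work"] if i % 3 == 0 else ["home"]}
    for i in range(827)
]
result, report = profile_call(switch_tag, tasks, "work")
print(report)
```

If sorting inside liblarch really is where most of the time goes, it should show up near the top of such a report when the real callback is profiled.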

nekohayo · 2021-01-03T00:46:16Z

@jaesivsm's changes in PR 530 might actually solve this, or maybe part of it only (the regression part, but not the "slow switching in absolute terms" part)...

nekohayo · 2021-01-15T05:44:27Z

Here are my benchmark observations upon the completion of PR #530 and the accompanying commit in liblarch. I tested with the "bryce" dataset and a copy of my own (even bigger) dataset.

Benchmark data set: "Bryce"

I saw a 2x improvement with various operations when running the "bryce" sample data set (./launch.sh -s "bryce"), which has 827 tasks.

Git master

startup time: 23.0 secs
switching to "Actionable" view: 0 secs
- switching to "work" tag: 6.2 secs
- switching back to all tasks: 13.6 secs
switching back to "Open" tasks view: 0 secs
- switching to "work" tag: 6.2 secs
- switching back to all tasks: 13.6 secs

With the optimizations from PR 530

startup time: 13.5 secs
switching to "Actionable" view: 6.3 secs (the first time only)
- switching to "work" tag: 3.4 secs
- switching back to all tasks: 6.4 secs
switching back to "Open" tasks view: 0 secs
- switching to "work" tag: 3.5 secs
- switching back to all tasks: 6.4 secs

Note: performance gains on the "bryce" dataset are 2x rather than the potential 3x, because that dataset doesn't have more than one closed task in it, so from a data perspective it's almost as if it only had two panes in practice.
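To illustrate why the number of panes matters, here is a rough sketch of the general "don't refresh hidden panes" idea suggested by the PR title. This is not the actual code from PR 530 or liblarch; the `Pane` and `Browser` classes and the `dirty` flag are assumptions made up for this example. A tag change refreshes only the visible pane and marks the others as stale, so a stale pane pays its refresh cost the first time it is shown.

```python
class Pane:
    """Hypothetical pane; refresh_count stands in for expensive sort/filter work."""

    def __init__(self, name):
        self.name = name
        self.visible = False
        self.dirty = True  # needs a refresh before its contents are valid
        self.refresh_count = 0

    def refresh(self):
        self.refresh_count += 1
        self.dirty = False


class Browser:
    """Hypothetical browser with three panes, as in the note above."""

    def __init__(self):
        self.panes = {n: Pane(n) for n in ("open", "actionable", "closed")}
        self.show("open")

    def show(self, name):
        for pane in self.panes.values():
            pane.visible = pane.name == name
        pane = self.panes[name]
        if pane.dirty:
            pane.refresh()  # cost is paid only when a stale pane is shown

    def apply_tag_filter(self, tag):
        # The tag value is unused here; only the refresh bookkeeping is modelled.
        for pane in self.panes.values():
            if pane.visible:
                pane.refresh()
            else:
                pane.dirty = True  # defer the work instead of doing it now


browser = Browser()
browser.apply_tag_filter("work")  # refreshes only the "open" pane
browser.show("actionable")        # first switch: one deferred refresh
browser.show("open")              # already fresh: no work
browser.show("actionable")        # already fresh: no work
print({name: p.refresh_count for name, p in browser.panes.items()})
```

With three panes refreshed eagerly, each tag switch does three refreshes; with deferral it does one, which is where the "potential 3x" would come from under this simplified model.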

Benchmark data set: my own data as of 2021-01-14

I saw 20% to 85% better performance with the changes from PR 530.

Git master

startup time: 23.3 secs
switching to "Actionable" view: 0 secs
- switching to "apparte" tag: 3.7 secs
- switching back to all tasks: 15.2 secs
switching back to "Open" tasks view: 0 secs
- switching to "apparte" tag: 3.7 secs
- switching back to all tasks: 15.2 secs

With the optimizations from PR 530

startup time: 20.9 secs
switching to "Actionable" view: 3.0 secs (the first time only)
- switching to "apparte" tag: 0.9 secs
- switching back to all tasks: 2.7 secs
switching back to "Open" tasks view: 0 secs
- switching to "apparte" tag: 2.9 secs
- switching back to all tasks: 12.3 secs

Overall:

We only get a slight (and not that big/slow) regression in performance when switching between Open/Actionable the first time. Further switches between "Open" and "Actionable" views are instantaneous, even if you change the contents of a task.
Everything else is faster, sometimes by 20-30%, sometimes by a factor of 2x, sometimes by 80%+ (e.g. going from 15 secs to 2-3 secs to switch to "All tasks" in actionable view with my personal dataset).
Verdict: in practical use, it's a net win.
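For anyone wanting to redo the arithmetic, here is a small sketch that turns before/after timings into a speedup factor and a percentage of time saved. The `improvement` helper and the labels are made up for this example; the timings are copied from the benchmark lists above.

```python
def improvement(before, after):
    """Return (speedup factor, percent of time saved) for two timings in seconds."""
    speedup = before / after
    saved_pct = (before - after) / before * 100
    return speedup, saved_pct


# (git master, with PR 530) timings in seconds, taken from the lists above.
timings = {
    "bryce: startup": (23.0, 13.5),
    "bryce: actionable, work tag": (6.2, 3.4),
    "bryce: actionable, all tasks": (13.6, 6.4),
    "own data: startup": (23.3, 20.9),
    "own data: actionable, apparte tag": (3.7, 0.9),
    "own data: actionable, all tasks": (15.2, 2.7),
    "own data: open, apparte tag": (3.7, 2.9),
    "own data: open, all tasks": (15.2, 12.3),
}

for label, (before, after) in timings.items():
    speedup, saved_pct = improvement(before, after)
    print(f"{label}: {speedup:.1f}x faster, {saved_pct:.0f}% less time")
```

With these figures, the 15.2 to 2.7 secs case works out to about 82% less time, and the bryce 13.6 to 6.4 secs case to about 2.1x.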

nekohayo added this to the 0.5 "You Can (Not) Improve Performance" milestone Jul 8, 2020

nekohayo mentioned this issue Jul 8, 2020

Very slow to launch and display more than 200 tasks #109

Open

diegogangl removed this from the 0.5 "You Can (Not) Improve Performance" milestone Oct 24, 2020

nekohayo linked a pull request Jan 3, 2021 that will close this issue

Optim/no refresh on hidden pane #530

Merged

nekohayo added this to the 0.5 "You Can (Not) Improve Performance" milestone Jan 15, 2021

nekohayo mentioned this issue Jan 15, 2021

Optim/no refresh on hidden pane #530

Merged

nekohayo closed this as completed in #530 Jan 15, 2021
