Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Old posts get synced with a too high hot_rank #3125

Closed
4 tasks done
sunaurus opened this issue Jun 15, 2023 · 2 comments · Fixed by #3131
Closed
4 tasks done

[Bug]: Old posts get synced with a too high hot_rank #3125

sunaurus opened this issue Jun 15, 2023 · 2 comments · Fixed by #3131
Labels
bug Something isn't working

Comments

@sunaurus
Copy link
Collaborator

sunaurus commented Jun 15, 2023

Requirements

  • Is this a bug report? For questions or discussions use https://lemmy.ml/c/lemmy_support
  • Did you check to see if this issue already exists?
  • Is this only a single bug? Do not put multiple bugs in one issue.
  • Is this a UI / front end issue? Use the lemmy-ui repo.

Summary

By default, every row in post_aggregates starts with a hot_rank of 1728. (Same is true for comments)

This works well for new posts, but when a user searches for an old post on another instance, this older post will also get an entry in the post_aggregates table, with the same default value. The result of this is that users can flood the front page with 2-3Y old posts from other instances, as long as those posts had never been synced before.

This issue is compounded by the fact that hot_rank calculation for older posts only runs on server startup, and in fact will very often fail completely due to database deadlocks (#3076).

I propose significantly reducing the default value of hot_rank in post_aggregates (and also comment_aggregates) for any entry where published is older than 24h. Perhaps even reducing it to 0. I can make a PR with this proposed change.

Steps to Reproduce

  1. Sync an old post from any other instance
  2. Check it's hot_rank in the post_aggregates table

Technical Details

N/A

Version

0.17.4

Lemmy Instance URL

No response

@sunaurus sunaurus added the bug Something isn't working label Jun 15, 2023
@Nutomic
Copy link
Member

Nutomic commented Jun 15, 2023

Yes this makes sense, seems like we missed an edge case there. You can find the relevant code in #2952. I think any post/comment db writes from crates/apub/ need to calculate the rank manually.

@dessalines
Copy link
Member

dessalines commented Jun 15, 2023

@sunaurus I'd prefer this edge-case be handled in two ways, and not through editing or messing with SQL triggers:

  • Since this is only an issue for federated posts and comments, then in the apub code, on receive, update the relevant post_aggregates hot_rank and hot_rank_active columns, for only that item. (Same for comment_aggregates) Check the scheduled_tasks.rs for how to do this already in diesel.
    • I like this the best, because it only changes that row.
  • Another hacky solution, would be for the scheduled_tasks.rs to set all historical rows older than a week to zero, but I don't like that as much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants