Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The get_rooms_paginate DB queries are slow #17330

Open
anoadragon453 opened this issue Jun 19, 2024 · 0 comments
Open

The get_rooms_paginate DB queries are slow #17330

anoadragon453 opened this issue Jun 19, 2024 · 0 comments

Comments

@anoadragon453
Copy link
Member

The RoomWorkerStore.get_rooms_paginate DB store method contains some slow SQL queries. This method is currently only called when hitting the List Room Admin API. The current codebase contains this query:

# Create one query for getting the limited number of events that the user asked
# for, and another query for getting the total number of events that could be
# returned. Thus allowing us to see if there are more events to paginate through
info_sql = """
SELECT state.room_id, state.name, state.canonical_alias, curr.joined_members,
curr.local_users_in_room, rooms.room_version, rooms.creator,
state.encryption, state.is_federatable, rooms.is_public, state.join_rules,
state.guest_access, state.history_visibility, curr.current_state_events,
state.room_type
FROM room_stats_state state
INNER JOIN room_stats_current curr USING (room_id)
INNER JOIN rooms USING (room_id)
{where}
ORDER BY {order_by} {direction}, state.room_id {direction}
LIMIT ?
OFFSET ?
""".format(
where=where_statement,
order_by=order_by_column,
direction="ASC" if order_by_asc else "DESC",
)

Due to the two INNER JOINs, the statement takes about 700ms to execute on matrix.org (and this is after caching, as I ran the statement once before attempting to ANALYZE it).

https://github.com/element-hq/synapse/pull/17276/files#r1631162300 added a similar set of INNER JOINs to the count_sql query.

These joins are necessary to access metadata about rooms, which is spread across multiple tables. For instance, the rooms table contains an is_public column, which allows for filtering room results based on whether a room is published in the public room directory.

Can we consolidate this information into a single table somehow? Or is there a faster way to access information across multiple tables than INNER JOINs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant