Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure out the final partitioning structure for solo_scores #16

Closed
Tracked by #15 ...
peppy opened this issue Aug 31, 2023 · 3 comments
Closed
Tracked by #15 ...

Figure out the final partitioning structure for solo_scores #16

peppy opened this issue Aug 31, 2023 · 3 comments
Assignees

Comments

@peppy
Copy link
Member

peppy commented Aug 31, 2023

I setup partitioning to work with my last major effort on this table, but two pieces are missing.

Partition scheme includes ruleset_id but it is not used

Originally we were looking to partition per ruleset, which seems like a great idea from a performance angle. As such, this was included in the partition scheme, but it hasn't actually been used in the constructed partitions.

I want to test the overhead having ruleset_id in the primary key adds, and if it is negligible then we should leave it there and ensure the constructed partitions are per-ruleset.

(will be tested in #17)

Conclusion: Including ruleset in the partitioning schema is not a good direction, in terms of future extensibility and complexity in partition rotation.

Partition rotation is not yet implemented

Currently we have only two partitions – preserve=0 and preserve=1. The plan is to add partition rotation for the p0 case, but this hasn't been done yet. It still seems like a good idea, but comes with a concern:

We are going in a direction where a score can be switched between preserve=0 and preserve=1. We also have tooling for removing preserve=0 scores after a certain delay and when all criteria are correct (see ppy/osu-queue-score-statistics#141).

So, what happens when a score becoming p=0 and falls into a partition that is about to be rotated for cleanup? If the cleanup process is guaranteeing a time window before scores are cleaned up after the last preserve flag switch, then they may end up getting rotated out of existence too soon. An example would be a user unpinning a score, then realising they want to pin it again a few minutes later. With unfortunate timing it would be lost during this period.

A solution may be to change the partitioning to be on updated_at instead of created_at and ensure we are updating updated_at on any row change.

Partition rotation should be considered for other tables

  • solo_score_tokens
@peppy peppy changed the title Figure out the final partitioning structure Figure out the final partitioning structure for solo_scores Aug 31, 2023
@peppy peppy self-assigned this Aug 31, 2023
@peppy
Copy link
Member Author

peppy commented Sep 29, 2023

As a status update here, the actual partition structure (in terms of included columns) has been finalised in #17. What remains is rotation considerations, aka how to split up the partition ranges using updated_at, and where the rotation process is run for cycling.

@peppy
Copy link
Member Author

peppy commented Dec 14, 2023

Found out we can't use TIMESTAMP in a RANGE COLUMNS partition schema.

 CREATE TABLE `scores` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int unsigned NOT NULL,
  `beatmap_id` mediumint unsigned NOT NULL,
  `ruleset_id` smallint unsigned NOT NULL,
  `data` json NOT NULL,
  `has_replay` tinyint(1) DEFAULT '0',
  `preserve` tinyint(1) NOT NULL DEFAULT '0',
  `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `unix_updated_at` int NOT NULL DEFAULT (unix_timestamp()),
  PRIMARY KEY (`id`,`preserve`,`updated_at`),
  KEY `user_ruleset_index` (`user_id`,`ruleset_id`),
  KEY `beatmap_user_index` (`beatmap_id`,`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
/*!50500 PARTITION BY RANGE  COLUMNS(`preserve`,`unix_updated_at`)
(PARTITION p20231215 VALUES LESS THAN (0,1702512000) ENGINE = InnoDB,
 PARTITION p20231216 VALUES LESS THAN (0,1702598400) ENGINE = InnoDB,
 PARTITION p20231217 VALUES LESS THAN (0,1702684800) ENGINE = InnoDB,
 PARTITION p0catch VALUES LESS THAN (0,MAXVALUE) ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN (MAXVALUE,MAXVALUE) ENGINE = InnoDB) */

This might be the new best-we-can-do.

@peppy
Copy link
Member Author

peppy commented Dec 15, 2023

Above has been deployed and is rotating successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant