Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: recreate indexes by swapping them under a common alias #63

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

slri
Copy link

@slri slri commented May 12, 2021

Note: This is currently a draft

Problem:
When working with data, some objects will be intentionally deleted. If we want a fresh instance of the index without stale/deleted data, we'd need to recreate it. When recreating it the only available way right now, we'd delete the index and then wait for it to be recreated again. This approach harms overall performance since the site is technically down or will have partial data until the indexing finishes.

Solution:
Use index name as an alias instead and create indices with a timestamp of the indexing time attached. While the new index is being created, the old index will still keep the website operational. Once the new index is ready, delete the old index. This ensures that even though we might have stale data for a little while longer, we at least will have it.

Copy link
Contributor

@dkarlovi dkarlovi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall good work. some small tweaks.

@@ -26,9 +26,9 @@ public function __construct(PersisterFactory $persisterFactory)
/**
* @return array<ImporterFactory>
*/
public function create(?string $site = null, ?string $type = null, ?string $language = null, ?string $store = null, ?\DateTimeInterface $since = null): array
public function create(?string $site = null, ?string $type = null, ?string $language = null, ?string $store = null, ?\DateTimeInterface $since = null, ?string $runTimestamp = null): array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this to be configurable, it will never be changed to anything else.

$indexSettings = new IndexSettings(
$className,
$indexName,
$runTimestamp ? $indexName.'_'.$runTimestamp : $indexName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better name, maybe $indexSuffix?

@@ -35,6 +35,7 @@ protected function configure()
->addArgument('language', InputArgument::OPTIONAL, 'Language to index')
->addArgument('store', InputArgument::OPTIONAL, 'Site store to index')
->addOption('updated-since', 's', InputOption::VALUE_OPTIONAL, 'Fetch objects updated in the relative timeframe ("5minute", "2hour", "1day", "yesterday" etc)')
->addOption('swap', null, InputOption::VALUE_OPTIONAL, 'Do a full reingest by swapping indices', false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Booleans use InputOption::VALUE_NONE, they're false by default.

return $this->baseIndexService;
}

public function __call($method, $args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate, but I guess it's the only way to do it. Wish IndexService had an interface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants