Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Archive Deals Insight #412

Merged
merged 9 commits into from
Nov 30, 2020
Merged

Archive Deals Insight #412

merged 9 commits into from
Nov 30, 2020

Conversation

asutula
Copy link
Member

@asutula asutula commented Nov 19, 2020

  • Adds DealInfo[] to the Mongo Archive object
  • Populates Archive.DealInfo[] every time we poll Powergate for job status
  • Exposes current and historical Archive data via a new/simplified API
  • Uses configurable "slow" and "fast" job polling intervals to periodically check job status with the interval depending on whether or not the archive contains deals that are sealing or not.

Makefile Show resolved Hide resolved
Key: key,
})
// Archives returns information about current and historical archives.
func (c *Client) Archives(ctx context.Context, key string) (*pb.ArchivesResponse, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combining those two old methods into one here. Archives returns an object that looks like:

{
  Current: nil|Archive,
  History: Archive[]
}

If the bucket has never been archived, both those values are the nil/empty values.
If the bucket has an archive currently processing, it will be listed in Current and is updates in near-real-time as new deal information is available. The previously completed archive is moved to the first slot in History.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this comment here in the client, but the change is reflected all the way through the stack.

@@ -22,8 +23,7 @@ const (
)

var (
CheckInterval = time.Second * 15
JobStatusPollInterval = time.Minute * 30
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now configured and passed in here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@@ -169,11 +177,10 @@ func (t *Tracker) trackArchiveProgress(
jid string,
bucketRoot cid.Cid,
powInfo *mdb.PowInfo,
) (bool, string, error) {
) (time.Duration, string, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now returns the duration that should be used for rescheduling. It will be different depending on if we have any deals in sealing state or not. 0 is interpreted as "don't reschedule".


if !isJobStatusFinal(res.StorageJob.Status) {
return true, "no final status yet", nil
return t.jobPollIntervalSlow, fmt.Sprintf("getting current job %s for bucket %s: %s", jid, buckKey, err), nil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For recoverable errors, we reschedule with the "slow" interval.

return t.jobPollIntervalSlow, fmt.Sprintf("updating archive status: %s", err), nil
}

rescheduleDuration := t.jobPollIntervalFast
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the fast interval by default.

message := "non-final status"
if isJobStatusFinal(res.StorageJob.Status) {
message = "reached final status"
rescheduleDuration = 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the job is final, use duration 0 meaning "don't reschedule"


if rescheduleDuration > 0 {
for _, dealInfo := range res.StorageJob.DealInfo {
if dealInfo.StateId == storagemarket.StorageDealSealing {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to reschedule, inspect the deal states and decide if we should use short or long duration.

I really hate the use of storagemarkets here, but it's necessary since we aren't modeling those states ourself at any point in the stack. Maybe we should so we can avoid this and have typed states for deals.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. We should eventually have our own states in the API.
Even if we use storagemarket here, there's the chance that storagemarket version used in Hub might diverge from the one returned by Powergate, if both are running different versions that doesn't break the API. (In fact, that was already the case for a long time since we didn't break APIs in Powergate and didn't update the Powergate client in Hub).

To be fair, I see a really tiny chance that the market's code will be changing the numbers of statuses, so isn't a panicking situation. But I completely agree that the API should be self-contained... not only because is nicer, but because is safer.

Might worth creating an issue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err := t.colls.BucketArchives.Replace(ctx, ba); err != nil {
return fmt.Errorf("updating bucket archives status: %s", err)
}
return nil
}

func (t *Tracker) saveDealsInArchive(
func (t *Tracker) saveDealsInBucket(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename for clarity.

}

// ArchiveWatch delivers messages about the archive status.
func (b *Bucket) ArchiveWatch(ctx context.Context, watch bool) (<-chan ArchiveStatusMessage, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to simplify the old pattern of "get status optionally watch by providing a flag" because the types of data returned by status (now Archives) and ArchiveWatch are very different.

},
"archivesJobPollIntervalFast": {
Key: "archives.job_poll_interval_fast",
DefValue: time.Minute * 15,
Copy link
Member Author

@asutula asutula Nov 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Poll interval flags and their default values.

AbortedMsg string `bson:"aborted_msg"`
FailureMsg string `bson:"failure_msg"`
CreatedAt int64 `bson:"created_at"`
DealInfo []DealInfo `bson:"deal_info"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New model to hold deal information. This would just unmarshal to an empty array for pre-existing archives I think? Will look at writing a test for that. Also, maybe we want to consider writing a migration for those pre-existing archives. Not sure how complex that could be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd think the same: unmarshal to empty array for existing things.

Regarding the migration, could be. We could wait a bit to see if existing archives really start asking for this data to be populated, and if justified do it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a plan to me.

@asutula asutula self-assigned this Nov 25, 2020
@asutula asutula marked this pull request as ready for review November 25, 2020 05:20
Signed-off-by: Aaron Sutula <[email protected]>
Copy link
Contributor

@jsign jsign left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Makefile Show resolved Hide resolved
api/bucketsd/pb/bucketsd.proto Show resolved Hide resolved
@@ -22,8 +23,7 @@ const (
)

var (
CheckInterval = time.Second * 15
JobStatusPollInterval = time.Minute * 30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!


if rescheduleDuration > 0 {
for _, dealInfo := range res.StorageJob.DealInfo {
if dealInfo.StateId == storagemarket.StorageDealSealing {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. We should eventually have our own states in the API.
Even if we use storagemarket here, there's the chance that storagemarket version used in Hub might diverge from the one returned by Powergate, if both are running different versions that doesn't break the API. (In fact, that was already the case for a long time since we didn't break APIs in Powergate and didn't update the Powergate client in Hub).

To be fair, I see a really tiny chance that the market's code will be changing the numbers of statuses, so isn't a panicking situation. But I completely agree that the API should be self-contained... not only because is nicer, but because is safer.

Might worth creating an issue?

Comment on lines +250 to +251
conf.ArchiveJobPollIntervalFast = time.Second * 5
conf.ArchiveJobPollIntervalSlow = time.Second * 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You scaled-down respecting semantics :P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha of course!

AbortedMsg string `bson:"aborted_msg"`
FailureMsg string `bson:"failure_msg"`
CreatedAt int64 `bson:"created_at"`
DealInfo []DealInfo `bson:"deal_info"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd think the same: unmarshal to empty array for existing things.

Regarding the migration, could be. We could wait a bit to see if existing archives really start asking for this data to be populated, and if justified do it?

Signed-off-by: Aaron Sutula <[email protected]>
@asutula asutula merged commit 2e8cb77 into master Nov 30, 2020
@asutula asutula deleted the asutula/archive-deals-api branch November 30, 2020 22:48
@jimmylee
Copy link

jimmylee commented Dec 2, 2020

💯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants