Crash with invalid memory address or nil pointer dereference when restoring #556

Closed
jonicohn opened this issue Jan 5, 2024 · 6 comments · Fixed by #557

Comments

jonicohn commented Jan 5, 2024

Hi @benbjohnson,

first of all: Thank you for this project. This helps a lot for my databases!

I'm using it in several projects and databases, and it works quite well.

But now I have an issue when restoring a database after it had been working for months. The database file has 10 tables, and the largest one has 11072 rows with 35 columns. The total size of the database file is ~1 MB. The restore crashes with the following error message:

./litestream restore -config litestream.yml /tmp/litestream/experiment.db

/tmp/litestream/experiment.db(s3): restoring snapshot 4b0d294741b7c9ad/00000000 to /tmp/litestream/experiment.db.tmp
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xed93f9]

goroutine 1 [running]:
github.com/benbjohnson/litestream/s3.(*ReplicaClient).SnapshotReader(0xc00005df00, {0x178bd50, 0xc000120000}, {0xc000515bac, 0x10}, 0x0?)
        /home/runner/work/litestream/litestream/s3/replica_client.go:295 +0x3b9
github.com/benbjohnson/litestream.(*Replica).restoreSnapshot(0xc00037e0f0, {0x178bd50, 0xc000120000}, {0xc000515bac, 0x10}, 0x7fffffff?, {0xc000759200, 0x21})
        /home/runner/work/litestream/litestream/replica.go:1308 +0x18d
github.com/benbjohnson/litestream.(*Replica).Restore(0xc00037e0f0, {0x178bd50, 0xc000120000}, {{0x7ffd1a68bf29, 0x1d}, {0x0, 0x0}, {0xc000515bac, 0x10}, 0x7fffffff, ...})
        /home/runner/work/litestream/litestream/replica.go:1074 +0xae5
main.(*RestoreCommand).Run(0x20a8d18, {0x178bd50, 0xc000120000}, {0xc00012e020, 0x3, 0x3})
        /home/runner/work/litestream/litestream/cmd/litestream/restore.go:90 +0x971
main.(*Main).Run(0xc0000061a0?, {0x178bd50, 0xc000120000}, {0xc00012e010, 0x4, 0x4})
        /home/runner/work/litestream/litestream/cmd/litestream/main.go:123 +0x165
main.main()
        /home/runner/work/litestream/litestream/cmd/litestream/main.go:43 +0x7c

If I comment out lines 294 and 295 and recompile the code:

	out, err := c.s3.GetObjectWithContext(ctx, &s3.GetObjectInput{
		Bucket: aws.String(c.Bucket),
		Key:    aws.String(key),
	})
	if isNotExists(err) {
		return nil, os.ErrNotExist
	} else if err != nil {
		return nil, err
	}
	internal.OperationTotalCounterVec.WithLabelValues(ReplicaClientType, "GET").Inc()
	internal.OperationBytesCounterVec.WithLabelValues(ReplicaClientType, "GET").Add(float64(*out.ContentLength))
	return out.Body, nil
}

the restore works again. It also works if I instead remove 52 rows from my table or remove one column.

When I download the snapshot file manually and decompress it, that works too.

I don't really know Go, and I'm not sure what these two lines are needed for. If I understand correctly, they are used for metrics, but I'm not sure whether it is safe to remove them, and it would be better to fix the root cause anyway.

Does anybody know why this happens?

Thank you!

Collaborator

hifi commented Jan 5, 2024

Hi!

Does this only happen with a specific release or a specific database and is it 100% reproducible?

Did you look up those lines on GitHub in the same release you're running? They have changed a little since the last release.

Thanks!

Collaborator

hifi commented Jan 5, 2024

Ah, I see what the issue is now that I'm at my laptop. The code does expect a successful request to the bucket to return the content length of the object (the snapshot in this case), which is then recorded in metrics. For some reason the AWS SDK is not providing the length of the object.

Is that a standard AWS S3 bucket you are using, or a third-party provider? If it's a third party, which one, if you can say?

Thanks.
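
The failure described above can be reproduced in isolation. The following is a minimal sketch, not Litestream code: `getObjectOutput` and `bytesForMetrics` are hypothetical names invented for illustration, assuming only that the response models the optional content length as a `*int64`, as the `*out.ContentLength` expression in the quoted snippet implies.

```go
package main

import "fmt"

// getObjectOutput is a hypothetical stand-in for the SDK's response type.
// Optional fields are pointers, so a missing Content-Length shows up as nil.
type getObjectOutput struct {
	ContentLength *int64
}

// bytesForMetrics mirrors the crashing pattern: it dereferences
// ContentLength unconditionally. The deferred recover turns the runtime
// panic into an error so the sketch can report it instead of exiting.
func bytesForMetrics(out *getObjectOutput) (n float64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return float64(*out.ContentLength), nil
}

func main() {
	size := int64(1024)

	// A response that reports its size works as expected.
	n, err := bytesForMetrics(&getObjectOutput{ContentLength: &size})
	fmt.Println(n, err)

	// A response without Content-Length triggers the nil dereference.
	n, err = bytesForMetrics(&getObjectOutput{})
	fmt.Println(n, err)
}
```

The second call fails with the same kind of runtime error as in the stack trace, which would explain why a provider that omits the length crashes the restore even though the snapshot body itself is intact.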

hifi added a commit to beeper/litestream that referenced this issue Jan 5, 2024
The SDK provides helper functions when you want an empty value if
the actual one is nil so we can just wrap all of them to avoid getting
bit.

Fixes benbjohnson#556
Replaces benbjohnson#555

Also fixes another panic when getting a delete error without key.
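
The helper approach the commit message describes might look roughly like the sketch below. This is not the diff from #557. `int64Value` and `stringValue` are local stand-ins written for this example, following the semantics of the SDK's `aws.Int64Value` and `aws.StringValue` (return the pointed-to value, or the zero value for a nil pointer); `deleteError` and `describeDeleteError` are hypothetical names used only to illustrate the "delete error without key" case.

```go
package main

import "fmt"

// int64Value is a local stand-in with the semantics of aws.Int64Value:
// it returns the pointed-to value, or 0 when the pointer is nil.
func int64Value(p *int64) int64 {
	if p == nil {
		return 0
	}
	return *p
}

// stringValue is a local stand-in with the semantics of aws.StringValue:
// it returns the pointed-to value, or "" when the pointer is nil.
func stringValue(p *string) string {
	if p == nil {
		return ""
	}
	return *p
}

// deleteError is a hypothetical type modelling a per-object delete failure
// whose fields a provider may leave unset.
type deleteError struct {
	Key     *string
	Message *string
}

// describeDeleteError formats the failure without dereferencing nil fields.
func describeDeleteError(e deleteError) string {
	return fmt.Sprintf("deleting object %q: %s", stringValue(e.Key), stringValue(e.Message))
}

func main() {
	// A missing content length is counted as 0 bytes instead of panicking.
	var missingLength *int64
	fmt.Println(float64(int64Value(missingLength)))

	// A delete error that carries a message but no key is still reportable.
	msg := "access denied"
	fmt.Println(describeDeleteError(deleteError{Message: &msg}))
}
```

The trade-off is that a missing length is recorded as zero bytes in the metrics rather than being flagged, which seems acceptable when the alternative is a crash during restore.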
Collaborator

hifi commented Jan 5, 2024

I opened #557, which fixes this along with a related issue of the same kind that we hit earlier.

hifi closed this as completed in #557 Jan 7, 2024
Author

jonicohn commented Jan 8, 2024

Sorry for the late follow-up:

In this case it is an S3-like API from Palantir Foundry.

Collaborator

hifi commented Jan 8, 2024

Ah, yeah, so their implementation doesn't seem to send that information. Regardless, that should not cause a crash, so the merged fix should help.

If you're able to build Litestream from source, try the latest code from main; it should work.

Author

jonicohn commented Jan 8, 2024

Already built it today and it works. Thank you!
