
Clair Postgres database storage space issue #2170

Open
Mkori1 opened this issue Nov 12, 2024 · 7 comments

Mkori1 commented Nov 12, 2024

Description of Problem / Feature Request

Postgres storage: 1800 GiB
The Clair Postgres database fills its storage over the span of about a week, and then roughly 1 TB of space is suddenly freed. This happens every week: storage fills up and is then freed all at once.
[Screenshot: 2024-11-12 at 12 49 46 PM]

Garbage collection is running daily.

I need to understand why the storage keeps filling up and why GC is not able to reclaim the space.

Environment:
clair:4.7.2
Combo Mode

config.yaml:

introspection_addr: :8089
http_listen_addr: :8080
log_level: debug
indexer:
  connstring: XXX
  scanlock_retry: 10
  layer_scan_concurrency: 5
  migrations: true
matcher:
  connstring: XXX
  migrations: true
  update_retention: 2
  period:
matchers:
  names:
    - alpine
    - aws
    - debian
    - oracle
    - photon
    - python
    - rhel
    - suse
    - ubuntu
updaters:
  sets:
    - alpine
    - aws
    - debian
    - oracle
    - photon
    - pyupio
    - rhel
    - suse
    - ubuntu
notifier:
  connstring: XXX
  delivery_interval: 1m
  poll_interval: 5m
  migrations: true
auth:
  psk:
    key: XXX
    iss: ["quay"]

tracing and metrics:

trace:
  name: "jaeger"
  probability: 1
  jaeger:
    agent:
      endpoint: "localhost:6831"
    service_name: "clair"
metrics:
  name: "prometheus"

@frostmar (Contributor)

Storage used by Postgres "deleted" rows only becomes available for reuse after a Postgres VACUUM has run.
Maybe check your database's VACUUM / AUTOVACUUM settings.
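
A quick, Clair-agnostic way to see whether autovacuum is keeping up is to look at dead-tuple counts and the last (auto)vacuum times per table, plus the autovacuum settings themselves. A minimal sketch against the Clair database:

-- Tables with the most dead tuples, and when they were last vacuumed
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Current autovacuum-related settings
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%';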

Mkori1 (Author) commented Nov 13, 2024

What does this mean in the log?
2024-11-11 11:19:26 UTC::@:[26410]:LOG: automatic vacuum of table "ClairBackend.public.vuln": index scans: 1
pages: 0 removed, 2801169 remain, 13106 skipped due to pins, 545696 skipped frozen
tuples: 96092 removed, 12632497 remain, 831982 are dead but not yet removable, oldest xmin: 934978812
buffer usage: 6392796 hits, 3018733 misses, 1438233 dirtied
avg read rate: 41.457 MB/s, avg write rate: 19.751 MB/s
system usage: CPU: user: 79.75 s, system: 24.19 s, elapsed: 568.88 s
WAL usage: 2086644 records, 1408460 full page images, 1823645535 bytes

Mkori1 (Author) commented Nov 13, 2024

Does automatic vacuum of table "ClairBackend.public.vuln": index scans: 1 relate to GC starting in Clair?

Also, does the statement below mean that the DELETE is not happening?
2024-11-13 15:49:17 UTC:10.116.14.176(14922):root@ClairBackend:[24095]:ERROR: canceling statement due to user request
2024-11-13 15:49:17 UTC:10.116.14.176(14922):root@ClairBackend:[24095]:STATEMENT: DELETE FROM layer WHERE NOT EXISTS (SELECT FROM manifest_layer WHERE manifest_layer.layer_id = layer.id);

@frostmar (Contributor)

The concept of VACUUM / AUTOVACUUM in Postgres is important here (and has nothing to do with Clair). Take a look at the docs, maybe starting at https://www.postgresql.org/docs/current/routine-vacuuming.html
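
Regarding the "831982 are dead but not yet removable, oldest xmin" line in your log: dead rows can only be reclaimed once no open transaction can still see them, so a long-running or idle-in-transaction session holds back the oldest xmin and blocks reclamation even when autovacuum runs. A sketch for spotting such sessions (standard Postgres catalog views, nothing Clair-specific):

-- Backends whose snapshots can hold back the oldest xmin, oldest first
SELECT pid, state, backend_xmin, xact_start,
       now() - xact_start AS xact_age,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY xact_start
LIMIT 10;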

Mkori1 (Author) commented Nov 13, 2024

Do we have any setting related to timeouts? I see errors in the log when the method is DELETE:
{"level":"debug","request_id":"da40e9d64c5747d9","component":"httptransport/New","code":500,"error":"unable to delete manifest: timeout: context canceled","time":"2024-11-13T17:49:34Z","message":"http error response"}
{"level":"info","component":"httptransport/New","request_id":"da40e9d64c5747d9","remote_addr":"100.123.110.151:55158","method":"DELETE","request_uri":"/indexer/api/v1/index_report/sha256:542c27dfe5f967b5bde01c11918871d309d30642a47848b9e13829647814b930","status":500,"duration":600097.463138,"time":"2024-11-13T17:49:34Z","message":"handled HTTP request"}

crozzy (Collaborator) commented Nov 13, 2024

So, excessive latency when deleting manifests was identified as an issue and dealt with here; I believe the earliest version the fix is available in is v4.7.3.

The change includes creating an index, which you can choose to pre-create concurrently if you have a larger instance, using the clairctl admin sub-command: https://github.com/quay/clair/blob/main/cmd/clairctl/admin.go#L42.
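
To illustrate what "pre-create concurrently" means in Postgres terms, a hypothetical sketch follows. The index name and column are assumptions for illustration only, not necessarily the exact index the linked change creates; the clairctl admin sub-command is the authoritative way to do this.

-- Hypothetical example only: CONCURRENTLY builds the index without holding
-- a long exclusive lock on the table, at the cost of a slower build.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_manifest_layer_layer_id
    ON manifest_layer (layer_id);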

crozzy (Collaborator) commented Nov 13, 2024

I'm not sure if this is related to the DB storage issue (i.e. the VACUUM isn't running for some reason because these DELETEs are in flight for a long time), but I echo what @frostmar said: checking how your DB is configured to (AUTO)VACUUM is likely going to be very useful.
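
If the big vuln table turns out to be accumulating dead tuples faster than the default thresholds trigger autovacuum, one common knob is a per-table scale factor. The values below are purely illustrative, not a recommendation for your instance:

-- Example only: trigger autovacuum on the vuln table at ~2% dead tuples
-- instead of the default 20% (autovacuum_vacuum_scale_factor = 0.2).
ALTER TABLE vuln SET (autovacuum_vacuum_scale_factor = 0.02);
ALTER TABLE vuln SET (autovacuum_analyze_scale_factor = 0.02);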
