The frontend (in-browser) metadata extraction pipelines run in web workers and were designed to automatically extract metadata (e.g. generate keywords and a title, detect spatial coverage) from various file formats (e.g. PDF, Word, Excel, CSV).
However, when large files are supplied (e.g. a 200MB CSV with 1M rows), processing can take a long time.
Considering that better approaches may be available, e.g.:
- leveraging an LLM to generate metadata (e.g. write a summary description)
- allowing users to trigger the process via UI interaction on a specific field, rather than as part of the file upload process

We want:
- to disable the feature by default from the v5 release onward
- to allow users to enable the feature via Helm chart config
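
As a rough illustration of the requested behaviour, the sketch below gates extraction behind a single boolean flag that is off unless explicitly enabled. It assumes the Helm chart value is surfaced to the web client as a runtime config field; the names `enableMetadataExtraction`, `WebClientConfig`, `isMetadataExtractionEnabled` and `processUploadedFile` are hypothetical and not taken from the Magda codebase.

```typescript
// Hypothetical runtime config for the web client.
// The flag name is an assumption, not an existing Magda setting.
interface WebClientConfig {
    enableMetadataExtraction?: boolean;
}

interface UploadedFile {
    name: string;
    size: number;
}

// Simplified stand-in for whatever the extraction pipelines return.
interface ExtractedMetadata {
    title?: string;
    keywords?: string[];
}

type Extractor = (file: UploadedFile) => Promise<ExtractedMetadata>;

// The feature is off unless the flag is explicitly set to true,
// which matches the proposed "disabled by default" behaviour.
function isMetadataExtractionEnabled(config: WebClientConfig): boolean {
    return config.enableMetadataExtraction === true;
}

async function processUploadedFile(
    file: UploadedFile,
    config: WebClientConfig,
    extract: Extractor
): Promise<ExtractedMetadata> {
    if (!isMetadataExtractionEnabled(config)) {
        // Skip the pipelines entirely; metadata is entered manually.
        return {};
    }
    return extract(file);
}
```

Treating a missing flag as disabled means existing deployments would pick up the new default without any change to their Helm values.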
Allow disabling the frontend auto metadata extraction feature via config
We have frontend (in-browser) metadata extraction pipelines here: https://github.com/magda-io/magda/tree/main/magda-web-client/src/Components/Dataset/MetadataExtraction