xidel, xquery, tuple & download ? #113

Open
Baltazar500 opened this issue Aug 28, 2024 · 8 comments

@Baltazar500

Hi.

How can I use elements (or variables) of an array (tuple) inside an XQuery expression in xidel as the file name and link for downloading a file, without external utilities like curl? I ask because xidel does not support "x:download" :(

How do I use "array[1]" as the URL to download and "array[2]" as the filename?

echo '[{"a":"https://example.com/example.zip","b":"Filename"}]'|xidel -se 'for $a in parse-json(($raw))() let $array:=(($a).a,($a).b) return $array[1] || " " || $array[2]'

How do I use separate variables as the URL and filename for the download?

echo '[{"a":"https://example.com/example.zip","b":"Filename"}]'|xidel -se 'for $a in parse-json(($raw))() let $title:=($a).b let $url:=($a).a return $url || " " || $title'

@Reino17

Reino17 commented Aug 28, 2024

Your $array-variable holds a sequence instead of a proper JSON array.

$ echo '["https://videlibri.sourceforge.net/img/xidel-windows-blue.png","xidel_screenshot.png"]' | \
  xidel -se '$json()'
https://videlibri.sourceforge.net/img/xidel-windows-blue.png
xidel_screenshot.png

Here the input is a proper JSON array with a working example url.

$ echo '["https://videlibri.sourceforge.net/img/xidel-windows-blue.png","xidel_screenshot.png"]' | \
  xidel - -f '$json(1)' --download .

This downloads and saves the image as 'xidel-windows-blue.png' to the current dir.
Normally the stdin dash, -, isn't necessary, but you can't use -f without specifying an input.

$ echo '["https://videlibri.sourceforge.net/img/xidel-windows-blue.png","xidel_screenshot.png"]' | \
  xidel - -f '$json(1)' --download '{$json(2)}'

This downloads and saves the image as 'xidel_screenshot.png' to the current dir.
Unlike -e, the query you enter for --download (and -d as well) is always treated as an extended string. So to evaluate $json(2), the curly brackets are necessary.

I ask because xidel does not support "x:download" :(

No, because obviously there's --download in the first place. There is however a (cumbersome) way to do it in-query:

$ echo '["https://videlibri.sourceforge.net/img/xidel-windows-blue.png","xidel_screenshot.png"]' | \
  xidel -se '
    file:write-binary(
      $json(2),
      string-to-base64Binary(x:request($json(1))/raw)
    )
  '

file:write-binary() (part of the EXPath File Module, integrated in Xidel)

@benibela At first I wanted to use one of the urls on https://nightly.link/benibela/xidel/workflows/main/master as an example url, but none of them seem to work.

@benibela
Owner

file:write-binary()

or file:write-text if it is all text

@benibela At first I wanted to use one of the urls on https://nightly.link/benibela/xidel/workflows/main/master as an example url, but none of them seem to work.

They are deleted after three months :/

@Baltazar500
Author

Reino17, thanks for the examples, it works (issue completed).

There is a problem with downloading files though. Small files are loaded and saved quickly, but large ones are loaded into RAM (?) and saved after being fully loaded. Is it possible to write them directly to disk? Or is this a question for benibela? Maybe I should open a new issue?

They are deleted after three months :/

Where can I find nightly builds now :) ?

@benibela
Owner

but large ones are loaded into RAM (?) and saved after being fully loaded.

it always does that

Where can I find nightly builds now :) ?

i just need to change something, so new ones are made

@Baltazar500
Author

it always does that

Can this be changed? For files of hundreds of megabytes or several gigabytes this will be a problem :(

@benibela
Owner

possibly, but not so soon.

and to run xpath on the file, it needs to have it in memory anyways

and you can save it multiple times, --download fileA --download fileB , then it should not download it multiple times, but copy the data from memory

@Reino17

Reino17 commented Sep 27, 2024

@Baltazar500, you can always use curl for big files.

I'll take my own domain as an example. The last 4 xidel releases on https://rwijnsma.home.xs4all.nl/files/xidel/ to be exact.

$ xidel -s "https://rwijnsma.home.xs4all.nl/files/xidel/" -e '
  "curl --create-dirs --output-dir temp --remote-name-all "||join(
    //tbody/tr[position() = last() - 4 to last()]/td/a/resolve-uri(@href)
  )
'
curl --create-dirs --output-dir temp --remote-name-all \
https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8828-df90faf-openssl-win32.7z \
https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8828-df90faf-win32.7z \
https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8842-e14a969-openssl-win32.7z \
https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8842-e14a969-win32.7z

Let xidel create the command string you want to execute. (The URLs are put on separate lines here for legibility.)

$ eval "$(
  xidel -s "https://rwijnsma.home.xs4all.nl/files/xidel/" -e '
    "curl --create-dirs --output-dir temp --remote-name-all "||join(
      //tbody/tr[position() = last() - 4 to last()]/td/a/resolve-uri(@href)
    )
  '
)"

And if it looks right, use eval to actually execute.
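
The "check first, then execute" step can also be done in one script by capturing the generated command in a variable, printing it, and then passing that same variable to eval. The following is only a sketch: build_cmd is a hypothetical stand-in for the xidel call, it prints a fixed string with made-up example.com URLs, and the leading "echo" is there so that eval prints the curl command instead of running it.

```shell
# Hypothetical stand-in for the xidel call above: it prints a fixed command
# string so the sketch runs without xidel or network access.
# The leading "echo" keeps the final eval harmless in this demo.
build_cmd() {
  printf '%s\n' 'echo curl --create-dirs --output-dir temp --remote-name-all https://example.com/a.7z https://example.com/b.7z'
}

# Step 1: capture the generated command once and show it for review.
cmd="$(build_cmd)"
printf 'Generated command: %s\n' "$cmd"

# Step 2: execute exactly the string that was reviewed.
result="$(eval "$cmd")"
printf '%s\n' "$result"
```

Because the command is generated only once, what gets executed is exactly what was printed, even if the remote page changes in between. URLs containing shell metacharacters such as & or ? would still need quoting before they are handed to eval.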

In this case all the files will obviously be downloaded with the remote filename. If you want full control over the filename, then you can use curl's -K, --config <file> parameter. For instance:

$ xidel -s "https://rwijnsma.home.xs4all.nl/files/xidel/" -e '
  //tbody/tr[position() = last() - 4 to last()]/td/a/(
    `url = "{resolve-uri(@href)}"`,
    `output = "xidel_test{position()}.7z"`
  )
'
url = "https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8828-df90faf-openssl-win32.7z"
output = "xidel_test1.7z"
url = "https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8828-df90faf-win32.7z"
output = "xidel_test2.7z"
url = "https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8842-e14a969-openssl-win32.7z"
output = "xidel_test3.7z"
url = "https://rwijnsma.home.xs4all.nl/files/xidel/xidel-0.9.9-8842-e14a969-win32.7z"
output = "xidel_test4.7z"

$ xidel -s "https://rwijnsma.home.xs4all.nl/files/xidel/" -e '
  //tbody/tr[position() = last() - 4 to last()]/td/a/(
    `url = "{resolve-uri(@href)}"`,
    `output = "xidel_test{position()}.7z"`
  )
' | curl --create-dirs --output-dir temp -# -K -
############################################################################################ 100.0%
############################################################################################ 100.0%
############################################################################################ 100.0%
############################################################################################ 100.0%
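
The same url/output config format can also be produced from the "URL filename" lines that the queries in the original question print. This is a minimal sketch in plain shell: pairs_to_curl_config is a hypothetical helper (not part of xidel or curl), and it assumes that the URL contains no spaces and that the filename contains no double quotes or backslashes.

```shell
# pairs_to_curl_config is a hypothetical helper (not part of xidel or curl).
# It reads "URL FILENAME" lines on stdin and prints curl config entries.
# Assumption: the URL contains no spaces; everything after the first space
# is taken as the filename.
pairs_to_curl_config() {
  while IFS= read -r line; do
    [ -n "$line" ] || continue      # skip empty lines
    url=${line%% *}                 # text before the first space
    name=${line#* }                 # text after the first space
    printf 'url = "%s"\noutput = "%s"\n' "$url" "$name"
  done
}

# Sample input in the shape printed by the queries in the original question.
config="$(printf '%s\n' 'https://example.com/example.zip Filename' | pairs_to_curl_config)"
printf '%s\n' "$config"
```

The printed config could then be piped to curl -K - in the same way as the xidel output above.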

@Baltazar500
Author

@benibela,

and you can save it multiple times, --download fileA --download fileB , then it should not download it multiple times, but copy the data from memory

But what if I need to download only one large file ?

@Reino17, Thank you for the examples, but I have previously used similar expressions with curl and xidel myself, which is why I mentioned earlier that I want to avoid external utilities :)
