Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove scrollIntoView #519

Merged
merged 1 commit into from
Nov 2, 2023
Merged

Remove scrollIntoView #519

merged 1 commit into from
Nov 2, 2023

Conversation

jrandolf-2
Copy link
Contributor

@jrandolf-2 jrandolf-2 commented Aug 25, 2023

This operation can be done on the client side without much effort.


Preview | Diff

This operation can be done on the client side.
Copy link
Contributor

@OrKoN OrKoN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's optional I don't think we need to remove it. Let's have a discussion in the WG.

@OrKoN OrKoN added the needs-discussion Issues to be discussed by the working group label Aug 25, 2023
@jimevans
Copy link
Collaborator

I’m opposed to removing this. The “easily can be handled client-side” would involve another round-trip to the remote end. For a user using a cloud provider for execution, that means full internet latency twice more than if the option exists. In my experience, the more capability we allow the user to have on the remote end, the better.

I’ve noticed an occasional tendency to think that since we give the user the ability to execute scripts remotely, then we’ve done all we need to do and not provide other features because that means more work for implementors. And this tendency can get even worse when we allow preload scripts, because then the answer is, “Just let the user set the right preload script, regardless of how complex that might be, and they’ll be fine.”

However, that also means the user has to be clever enough to run just the right script, and introduces multiple commands when one would do. Or worse, it forces client library authors to write complex workarounds and make them explain to their users (who are often less sophisticated than one might expect) why their solution is less performant than a competing technology.

The more things we can allow a single command to do at the remote end, the more capabilities we allow the remote end to do without multiple round trips, the better.

@OrKoN
Copy link
Contributor

OrKoN commented Aug 25, 2023

The roundtrip issue sounds more general and perhaps the we can solve it by other means (by batching multiple commands). I think adding all possible options to each of the commands does not scale too: for example, scroll into view can only happen in a single way defined in the spec and clients that want to scroll differently might still need an extra call anyway.

@jrandolf-2
Copy link
Contributor Author

jrandolf-2 commented Aug 25, 2023

@jimevans I disagree that this will cause more round trip. Considering that you have to query for the element, why can't you position it at that time? I agree with your sentiments completely, but for this specific case, I struggle to see how useful it will be in practice.

Take this from a "photographer" perspective. When you take a picture of a subject, you (the client) would either position your camera or your subject to be within frame of the camera.

With scrollIntoView, we are doing this positioning on behalf of you (which is inherently not the task of a camera) which may or may not be the expected behavior. In either case, we face an issue of expectation and confusion; "how do I make scrollIntoView scroll correctly?" and "why is my object not being scrolled into view?" (which the user may not realize is being obscured by another element). This will lead to issues being filed for "implementation error" when in fact the user must thoroughly understand what "scrollIntoView" does. It'd be far simpler (occam's razor) to just have the user do all this positioning.

@jrandolf-2
Copy link
Contributor Author

Since it's optional I don't think we need to remove it

The idea of "scrollIntoView" as a definable option is the fundamental problem. 'scrollIntoView' has a major caveat where element can be scrolled into view but still obscured by a larger element. Users may file an error saying this is not expected when it really is.

@OrKoN
Copy link
Contributor

OrKoN commented Aug 25, 2023

@jrandolf with or without the scroll into view call, the element can still be obscured and I don't see a big difference. So I don't see how scrollIntoView makes situation any worse. Note that if the element is obscured the screenshot is still what the user sees.

@jimevans
Copy link
Collaborator

@jrandolf But “querying for the element” can be done in ways other than executing script*, meaning that one would need to query for the element, then take the screenshot. Two commands.

*Or ought to be. Not having a separate “locate an element” command is a weakness of WebDriver BiDi at the moment, since it does not allow for giving the user any hope of a mechanism outside JavaScript for locating elements. I am holding out hope that we will be able to do so via the accessibility tree, for example. My ideal solution would make the element location mechanism extensible, but that’s a discussion better suited to #150.

@jrandolf-2
Copy link
Contributor Author

jrandolf-2 commented Aug 25, 2023

@jrandolf with or without the scroll into view call, the element can still be obscured and I don't see a big difference. So I don't see how scrollIntoView makes situation any worse. Note that if the element is obscured the screenshot is still what the user sees.

The element is not obscured in this case if there is no scrollIntoView call. The user just hasn't positioned their element. The semantics is important.

@jimevans
Copy link
Collaborator

@OrKoN One of the things I’m discovering is that users want things to Just Work™. If we give them the options for a single command that works in the X% case (where X is some number above 80) and let the outliers customize the behavior as needed, that is better than forcing every user to completely customize their experience manually. I could go on at length about how developers wax rhapsodically about choosing certain non-WebDriver-based browser automation libraries over WebDriver-based ones because of the perception that they Just Work.

Anyway, I’ve said my peace that I believe removing the option to be a mistake, and don’t agree with it. I doubt I’ll have more of substance to add, and whatever comes of the decision, I doubt I’ll change my opinion.

@jrandolf-2
Copy link
Contributor Author

jrandolf-2 commented Aug 25, 2023

One of the things I’m discovering is that users want things to Just Work™.

@jimevans It looks like, based on your arguments, you expect for screenshot to take a picture of the entire element when element clip and scrollIntoView is defined. This perspective is the reason why I think it's troubling to add scrollIntoView. The purpose of captureScreenshot is, at the moment, to capture the viewport. That's it. If an element clip is defined, it means capture and crop to the element, wherever it may be. Scroll into view is completely orthogonal to this feature. If we change the semantics of element clip to mean "capture the entire element", then all current implementations are insufficient.

Tl;dr, what's the goal of element clip? To capture the entire element? Or just clip the viewport to the element (if it's in the viewport). If it's the latter, then scrollIntoView is not relevant. If it's the former...well, scrollIntoView may help but it's certainly not going to work in all cases. In this case, I'd much rather have "Perform implementation-specific action bring element into view" always or propose another method like "BrowsingContext.captureNodeScreenshot` which really attempts to capture the entire element.

@jgraham
Copy link
Member

jgraham commented Aug 25, 2023

I think once we have full document screenshots, the combination of a full document screenshot and an element clip would capture the entire element irrespective of scroll.

I don't think we can easily remove scroll into view for element clipped viewport screenshots, because that's how WebDriver classic behaves.

@jrandolf-2
Copy link
Contributor Author

jrandolf-2 commented Oct 24, 2023

With full document screenshots available, can we remove scrollIntoView? For Webdriver, implementations can just call scrollIntoView with the element and then screenshot.

@jgraham
Copy link
Member

jgraham commented Oct 27, 2023

@jimevans To what extent does this resolve your concern? I think the vast majority of element screenshot cases would be "full screenshot, clipped to the element", which would give you the full element irrespective of viewport, and you'd only need to do scrollIntoView if you specifically want to know whether the element is clipped by the viewport after scrolling (or to precisely implement classic semantics).

@jimevans
Copy link
Collaborator

@jgraham I guess it's fine, though to implement precise classic semantics, that means a second round-trip. I'll remove my objection to landing this PR. If users find that the screenshot behavior as defined here (and as differs from WebDriver Classic) is insufficient, I reserve the right to crow, "I told you so," to the rafters. :)

@jrandolf-2
Copy link
Contributor Author

@jgraham @jimevans @OrKoN Please approve.

@jrandolf-2 jrandolf-2 requested a review from OrKoN October 27, 2023 15:51
@whimboo
Copy link
Contributor

whimboo commented Oct 30, 2023

Do we already have wpt tests that need an update or do we have to create new ones that allow capturing a screenshot outside of the viewport? @jrandolf

@jrandolf-2
Copy link
Contributor Author

Do we already have wpt tests that need an update or do we have to create new ones that allow capturing a screenshot outside of the viewport? @jrandolf

They already exist here: web-platform-tests/wpt#42804

@jrandolf-2 jrandolf-2 merged commit 27c6ed4 into main Nov 2, 2023
3 checks passed
@jrandolf-2 jrandolf-2 deleted the jrandolf/scroll-into-view branch November 2, 2023 10:54
github-actions bot added a commit that referenced this pull request Nov 2, 2023
SHA: 27c6ed4
Reason: push, by jrandolf

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@whimboo
Copy link
Contributor

whimboo commented Nov 2, 2023

@jrandolf FYI @lutien can take care of the removal of scrollIntoView in wpt tests via https://bugzilla.mozilla.org/show_bug.cgi?id=1862649 if you haven't started on that yet.

@jrandolf-2
Copy link
Contributor Author

@jrandolf FYI @lutien can take care of the removal of scrollIntoView in wpt tests via https://bugzilla.mozilla.org/show_bug.cgi?id=1862649 if you haven't started on that yet.

sgtm!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-discussion Issues to be discussed by the working group
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants