-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove scrollIntoView
#519
Conversation
This operation can be done on the client side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's optional I don't think we need to remove it. Let's have a discussion in the WG.
I’m opposed to removing this. The “easily can be handled client-side” would involve another round-trip to the remote end. For a user using a cloud provider for execution, that means full internet latency twice more than if the option exists. In my experience, the more capability we allow the user to have on the remote end, the better. I’ve noticed an occasional tendency to think that since we give the user the ability to execute scripts remotely, then we’ve done all we need to do and not provide other features because that means more work for implementors. And this tendency can get even worse when we allow preload scripts, because then the answer is, “Just let the user set the right preload script, regardless of how complex that might be, and they’ll be fine.” However, that also means the user has to be clever enough to run just the right script, and introduces multiple commands when one would do. Or worse, it forces client library authors to write complex workarounds and make them explain to their users (who are often less sophisticated than one might expect) why their solution is less performant than a competing technology. The more things we can allow a single command to do at the remote end, the more capabilities we allow the remote end to do without multiple round trips, the better. |
The roundtrip issue sounds more general and perhaps the we can solve it by other means (by batching multiple commands). I think adding all possible options to each of the commands does not scale too: for example, scroll into view can only happen in a single way defined in the spec and clients that want to scroll differently might still need an extra call anyway. |
@jimevans I disagree that this will cause more round trip. Considering that you have to query for the element, why can't you position it at that time? I agree with your sentiments completely, but for this specific case, I struggle to see how useful it will be in practice. Take this from a "photographer" perspective. When you take a picture of a subject, you (the client) would either position your camera or your subject to be within frame of the camera. With |
The idea of "scrollIntoView" as a definable option is the fundamental problem. 'scrollIntoView' has a major caveat where element can be scrolled into view but still obscured by a larger element. Users may file an error saying this is not expected when it really is. |
@jrandolf with or without the scroll into view call, the element can still be obscured and I don't see a big difference. So I don't see how scrollIntoView makes situation any worse. Note that if the element is obscured the screenshot is still what the user sees. |
@jrandolf But “querying for the element” can be done in ways other than executing script*, meaning that one would need to query for the element, then take the screenshot. Two commands. *Or ought to be. Not having a separate “locate an element” command is a weakness of WebDriver BiDi at the moment, since it does not allow for giving the user any hope of a mechanism outside JavaScript for locating elements. I am holding out hope that we will be able to do so via the accessibility tree, for example. My ideal solution would make the element location mechanism extensible, but that’s a discussion better suited to #150. |
The element is not obscured in this case if there is no scrollIntoView call. The user just hasn't positioned their element. The semantics is important. |
@OrKoN One of the things I’m discovering is that users want things to Just Work™. If we give them the options for a single command that works in the X% case (where X is some number above 80) and let the outliers customize the behavior as needed, that is better than forcing every user to completely customize their experience manually. I could go on at length about how developers wax rhapsodically about choosing certain non-WebDriver-based browser automation libraries over WebDriver-based ones because of the perception that they Just Work. Anyway, I’ve said my peace that I believe removing the option to be a mistake, and don’t agree with it. I doubt I’ll have more of substance to add, and whatever comes of the decision, I doubt I’ll change my opinion. |
@jimevans It looks like, based on your arguments, you expect for screenshot to take a picture of the entire element when element clip and scrollIntoView is defined. This perspective is the reason why I think it's troubling to add scrollIntoView. The purpose of captureScreenshot is, at the moment, to capture the viewport. That's it. If an element clip is defined, it means capture and crop to the element, wherever it may be. Scroll into view is completely orthogonal to this feature. If we change the semantics of element clip to mean "capture the entire element", then all current implementations are insufficient. Tl;dr, what's the goal of element clip? To capture the entire element? Or just clip the viewport to the element (if it's in the viewport). If it's the latter, then scrollIntoView is not relevant. If it's the former...well, scrollIntoView may help but it's certainly not going to work in all cases. In this case, I'd much rather have "Perform implementation-specific action bring element into view" always or propose another method like "BrowsingContext.captureNodeScreenshot` which really attempts to capture the entire element. |
I think once we have full document screenshots, the combination of a full document screenshot and an element clip would capture the entire element irrespective of scroll. I don't think we can easily remove scroll into view for element clipped viewport screenshots, because that's how WebDriver classic behaves. |
With full document screenshots available, can we remove |
@jimevans To what extent does this resolve your concern? I think the vast majority of element screenshot cases would be "full screenshot, clipped to the element", which would give you the full element irrespective of viewport, and you'd only need to do |
@jgraham I guess it's fine, though to implement precise classic semantics, that means a second round-trip. I'll remove my objection to landing this PR. If users find that the screenshot behavior as defined here (and as differs from WebDriver Classic) is insufficient, I reserve the right to crow, "I told you so," to the rafters. :) |
Do we already have wpt tests that need an update or do we have to create new ones that allow capturing a screenshot outside of the viewport? @jrandolf |
They already exist here: web-platform-tests/wpt#42804 |
@jrandolf FYI @lutien can take care of the removal of |
sgtm! |
This operation can be done on the client side without much effort.
Preview | Diff