Centralize extraction of URL parameters #2042
Comments
This comes back to the idea that URL extraction and parameter extraction could somehow be merged. I fervently maintain that they cannot and should not. Although I could come up with a giant list of reasons, the primary one comes down to all the places you can get parameters that have nothing to do with URLs. Think of an input tag on a form where there's no action attribute: no URL to be found. If all of those situations force you to do parameter extraction separately anyway, forcing the two together in the cases where you could would actually add a significant amount of complication.
The other big reason is that when you don't want to do parameter extraction, because you don't care about it on a particular scan, it's really nice to have a clean separation so you can just shut it off. If they were merged in any significant way, you'd basically have conjoined twins you'd never truly be able to separate.
So hopefully we can mind meld on that issue and converge our vision there. I have been trying to clean up excavate some (#2181), but I think there are really 2 separate discussion points.
To that point, I think we need to move the excavate submodules into their own folder, like how I do with the lightfuzz submodules.
I think this is alleviated somewhat with the aforementioned refactor/cleanup PR, but given the diversity of the scenarios covered there:
- Extracting parameters from HTTP_RESPONSE body

These are just fundamentally different things that don't lend themselves to being mashed together, again, without actually adding complexity. I think the solution there also comes down to breaking apart
Is there ever a case where we emit a URL's getparams but not the URL itself?
Plenty of cases where we are harvesting a parameter with no new URL information, or only partial new URL information (like maybe just the path), and have to combine that with the existing URL from the parent event. All of the logic to handle that has to exist separately from URL parsing. And it's extra overhead that you really don't want to incur if you aren't dealing with parameters. There are situations where we need information from 3 places to make a parameter and properly associate it with the correct URL:
Without the context from all three, you are going to get it wrong. Remember, forms can be submitted to different URLs than the parent, or to themselves.
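To illustrate the point about forms submitting to themselves or to a different URL, here is a minimal sketch using only Python's standard library. The helper name `resolve_form_target` and the example URLs are hypothetical and are not taken from excavate; the sketch only shows how partial URL information (such as a bare path) would have to be combined with the parent event's URL before a parameter can be associated with the right target.

```python
from urllib.parse import urljoin


def resolve_form_target(parent_url, action=None):
    """Return the URL a form would submit to (hypothetical helper, not BBOT code)."""
    # A missing or empty action means the form submits back to the page it was found on.
    if not action:
        return parent_url
    # Relative, path-only, and absolute actions are all resolved against the parent URL.
    return urljoin(parent_url, action)


if __name__ == "__main__":
    parent = "https://example.com/app/login"
    for action in (None, "/search", "submit.php", "https://other.example/submit"):
        print(repr(action), "->", resolve_form_target(parent, action))
```

None of this needs a new URL to be discovered: in the first case the only URL involved is the one already carried by the parent event.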
When a URL event is created, we should always save the GET parameters (before they're stripped off) in the event so that we can later speculate/excavate them. This will allow us to delete a lot of code in excavate, since right now we are extracting URL parameters in multiple places.
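As a rough sketch of what "save the GET parameters before they're stripped off" could look like, assuming only Python's standard `urllib.parse` (the function name `split_get_params` is hypothetical, and how the result would be attached to an event is left open):

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit


def split_get_params(url):
    """Return (url_without_query, params) for a URL (hypothetical helper, not BBOT code)."""
    parts = urlsplit(url)
    # Keep blank values so parameters like "?debug=" are not silently lost.
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Rebuild the URL without its query string (this sketch also drops the fragment).
    stripped = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return stripped, params


if __name__ == "__main__":
    stripped, params = split_get_params("https://example.com/search?q=test&page=2")
    print(stripped)
    print(params)
```

The parameters come back as a list of pairs rather than a dict so that repeated names and their original order survive for later excavation.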