Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce ListInput #2972

Open
wants to merge 26 commits into
base: main
Choose a base branch
from
Open

Conversation

riesentoaster
Copy link
Contributor

Compared to MultipartInput, this does not care about names, and supports mapping mutators that target the inner value to target the whole thing.

Useful, e.g. for stateful protocol fuzzing. If you would like me to, I'm happy to add an example (baby) fuzzer.

@riesentoaster
Copy link
Contributor Author

I've also added a bunch of mutators (remove last entry, remove random entry, append entry based on generator).

@domenukk
Copy link
Member

I still feel like this is basically just multipart input without names for parts?
I mean, I also don't understand what the name is needed for, but if I ignore it it still works the same as ListInput.

Happy to merge this anyway but let's ask @addisoncrump what he thinks

@domenukk
Copy link
Member

So, after talking to Addison: @riesentoaster can we not merge your ListInput mutators into MultipartInput? We can have the name of the Multipart input be a generic and then we can just pass in () as names...

That would be way less code and functionality duplication

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Feb 13, 2025

There is a good bit of additional complexity this way. I'm not sure the compiler is going to be able to optimize this all away.

Also: Once #2944 is solved, we should remove the DefaultMultipartMutator/impl_default_multipart! hack in favor of mapping mutators. It's so much cleaner because it doesn't have to be defined centrally, and one has to be more explicit in what they would like to happen. We could also remove the manual implementations for the bytes input crossover mutators then.

@riesentoaster
Copy link
Contributor Author

Actually on second thought: Why don't we have a generic ListInput and if you need names you can wrap your actual input in a tuple similar to what I changed MultipartInput into? That would require an additional level of mapping in the mutators, but If you need high performance lookups of names to internal inputs, you may want to use a hash map or something similar instead of a list anyways. I don't think names are ever used in anything I see within LibAFL or the example fuzzers, everything relies on idx and that'd be possible with ListInput as well.

@domenukk
Copy link
Member

Actually on second thought: Why don't we have a generic ListInput and if you need names you can wrap your actual input in a tuple similar to what I changed MultipartInput into? That would require an additional level of mapping in the mutators, but If you need high performance lookups of names to internal inputs, you may want to use a hash map or something similar instead of a list anyways. I don't think names are ever used in anything I see within LibAFL or the example fuzzers, everything relies on idx and that'd be possible with ListInput as well.

@addisoncrump opinions?

}

impl<I> Default for MultipartInput<I> {
impl<I> Default for ListInput<I> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not keep MultipartInput as name? They are roughly equal, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah wait I think I see what you did there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because there could be other inputs based on other collection types such as maps. I'd personally also like to rename MultipartInput, but kept it for now because it's a change that isn't strictly necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this work differently for maps? isn't multipart input alrady key-value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be tree maps or sth

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I agree

@addisoncrump
Copy link
Collaborator

I think that this change makes sense. Originally, the plan was to have multipart be generic over name, but for an initial implementation this wasn't necessary. So, from my perspective, ListInput would be multipart with () as the name type. But the other way around is fine as well.

@addisoncrump
Copy link
Collaborator

FWIW I have some colleagues who do use multipart specifically with names, and the crossover implementation which considers common names is very important in this regard. Please ensure this functionality stays.

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Feb 14, 2025

FWIW I have some colleagues who do use multipart specifically with names, and the crossover implementation which considers common names is very important in this regard. Please ensure this functionality stays.

They do for now. In the future, I'd suggest creating custom mutators for that, and manually combining them with the non-crossover mutators. One could argue the "default" crossover mutators should not consider names, but do crossover within byte arrays from any input part. And crossover within list inputs is ambiguous anyways, see the crossover mutators I have implemented for between the entire ListInput entries (which are important when the order in the list matters, e.g. when they represent network packets, and one wants to insert entire entries as opposed to doing crossover within entries).

I would like to remove the hack with DefaultMultipartMutator in favour of mapping mutators anyways. They are much more flexible and expressive, and just as performant. There are currently still limitations for crossover and mapping mutators (see #2944), but once that is fixed, I'll propose that change.

@addisoncrump
Copy link
Collaborator

I trust your judgement 👍 I just want to note how people expect things to be used. It's a shame we can't emit compiler warnings for specific monomorphisations...

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Feb 14, 2025

It's a shame we can't emit compiler warnings for specific monomorphisations...

💯

It only targets MultipartInput, so we can also leave it as "legacy" code without a compiler note. I will add comments and change the example fuzzer though, if that's fine.

Well, once #2944 is done anyways.

@riesentoaster
Copy link
Contributor Author

riesentoaster commented Feb 14, 2025

Also: Would you be open to renaming MultipartInput to NamedListInput?

Edit: actually, let's wait with that to keep the disruption as minimal as possible, while we don't change anything substantial about it.

/// Access multiple parts mutably.
///
/// ## Panics
///
/// Panics if idxs is not sorted, has duplicate elements, or any entry is out of bounds.
#[must_use]
pub fn parts_mut<const N: usize>(&mut self, mut idxs: [usize; N]) -> [&mut I; N] {
pub fn parts_by_idxs_mut<const C: usize>(&mut self, mut idxs: [usize; C]) -> [&mut I; C] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think by_idxs is a good name.
parts_at_offsets? parts_at_indexes? Or just keep parts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we do use idx across the codebase, so something with idx is fine? Rust uses index though

@domenukk
Copy link
Member

I am not deep enough into MultipartInputs, @addisoncrump can you take a look?

("part", BytesInput::new(vec![b'h', b'e', b'l', b'l', b'o'])),
("part", BytesInput::new(vec![b'h', b'e', b'l', b'l', b'o'])),
("part", BytesInput::new(vec![b'h', b'e', b'l', b'l', b'o'])),
((), BytesInput::new(vec![b'h', b'e', b'l', b'l', b'o'])),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this as string I think. Then we can merge imho

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants