-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added functionality to track overlapping accesses #258
base: main
Are you sure you want to change the base?
Conversation
I don't think this is working correctly yet. For indirect_calls/jumptable/clang:BAP, these are the relevant instructions with overlapping stack accesses:
The 128 bit access to R31 + 16 overlaps with the 64 bit access to R31 + 16 and the 64 bit access to R31 + 24. However, the analysis doesn't correctly determine that these overlap. The results it gives are that:
There is also node 450 which contains Stack_16 (of size 16) and Stack_12 (of size 4). It has three cells:
This is incorrect - accesses to R31 + 16 and R31 + 12 never overlap, and these offsets don't make any sense. Stack_12 and Stack_16 do not overlap and the analysis should not find that they do. It's alright if they both point to a collapsed node containing pointers both add_two and add_six, but it's even better if we can maintain the offsets if possible? |
4751106
to
159c5d9
Compare
This seems to be working now so I'll merge it soon. It would be nice to be able to improve the precision so it can tell that add_two is at offset 0 and add_six is at offset 8 in the combined node and then tell that 00000429_R8 is pointing to add_two, 00000438_R8 is pointing to add_six, and 0000040b_V0 is pointing to both, but that would probably be a fair bit more complex and isn't a must have. |
src/main/scala/analysis/data_structure_analysis/LocalPhase.scala
Outdated
Show resolved
Hide resolved
Oh you were planning to write tests weren't you? I'll wait for you do to those before merging. |
// 0x10DC0 contains a pointer to the procedure add_two: separate node with a single cell at the start which points to add_two once the analysis process here currently the analysis will grow the cell corresponding to 0x10DC0 the size of the access. it causes that cell to overlap the cell corresponding to 0x10DC8 and as a result their outgoing edges (their pointees) are also merged. This causes the collapse an alternative would be to not grow 0x10DC0's cell but instead do a multi load/write. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
Can you explain this further? I don't quite follow what the problem is, why don't we have a way of telling which offsets to use? |
after V0 := mem[0x10DC0, el]:u128 |
We do actually know the sizes. The way it works is that:
Because 0x10DC8 is 8 bytes after 0x10DC0 and they are loaded together, we know that the offset for cell 2 is 8 just from that. |
we would know the size if the pointee is a single (not merged with anything else) global object (possibly single stack objects aswell), because we can look it up in the symbol table and know its size and also know the object isn't merged with something else so we can rely on our known size. But I don't think we can tell the size in general. if pointee is not global or stack we won't be able to tell it's size, same thing follows if a stack or global object has been merged with some other nontrivial (not a place holder for SSA variables) object. But you are right we are able to get more precision in certain cases |
We know the size because there is an object at 0x10DC0 and an object at 0x10DC8, and we are accessing both via a 128-bit access to 0x10DC0. 0x10DC8 is offset from 0x10DC0 by 8 bytes, so cell2 should also have an offset of 8 bytes. This applies broadly, whether we are accessing global data, or the stack, or a heap location. We don't need to know anything about the symbol table for this (except that those objects exist at those addresses in this example). Of course, if those objects have already been merged with anything else that has caused their nodes to collapse then you can't possibly do anything except collapse these together too, but that's fine. If you don't know the sizes, you can't even know that there are overlapping accesses to begin with. |
You are so correct. OF COURSE we have sizes of the pointees because we use them to determine if there is an overlapping access. Thank you for clarifying that. My symbol table to global objects (Data Structure Nodes) conversion step is missing the middle indirection level (cell for 754 pointing to add_two) from 0x10DC0 to 754 to add_two. I'll fix that next and then should be able to get more precise overlapping accesses implemented. |
Thanks, that makes sense. I agree that we need to add that extra indirection layer that isn't in there at present. |
0185072
to
879508a
Compare
You accidentally overwrote some of your previous fixes with this force push |
879508a
to
0185072
Compare
This pull request address the issue of overlapping accesses for DSA as discussed by issue #254
Remaining: