-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include semantics for each instruction #72
Comments
hi. yeah, as you might imagine I get this request quite often. I agree would be useful but it is way beyond anything I am able to do on my own. A few years ago I started Intel’s ZSIM... |
...(hit wrong button, silly phone) which became Intel Simulation and Analysis Engine (Intel SAE) with years of work from many other dedicated and talented engineers. But that is not an open source project. And it is C code... |
Shameless plug, but Remill has C++ implementations of instructions that map to XED iforms. Also in progress is AArch64 support, based on the documents referenced by @jrmuizel. For example, the semantics for template <typename D, typename S1, typename S2>
DEF_SEM(ADDSD, D dst, S1 src1, S2 src2) {
auto lhs = FReadV64(src1);
auto rhs = FReadV64(src2);
auto sum = FAdd(FExtractV64(lhs, 0), FExtractV64(rhs, 0));
auto res = FInsertV64(lhs, 0, sum);
FWriteV64(dst, res);
return memory;
} And those semantics are mapped to XED iforms in the following way: DEF_ISEL(ADDSD_XMMsd_MEMsd) = ADDSD<V128W, V128, MV64>;
DEF_ISEL(ADDSD_XMMsd_XMMsd) = ADDSD<V128W, V128, V128>;
IF_AVX(DEF_ISEL(VADDSD_XMMdq_XMMdq_MEMq) = ADDSD<VV128W, VV128, MV64>;)
IF_AVX(DEF_ISEL(VADDSD_XMMdq_XMMdq_XMMq) = ADDSD<VV128W, VV128, VV128>;) This document describes a bit more about the process that Remill follows. |
Neat. The fact that this description uses templates means that it's probably possible to extract the semantics for use in languages other than C++. I'll take a look. It looks like https://github.com/StanfordPL/stoke also has a model for a lot of x86 semantics. However it seems like it's built more around a model of querying for an SMT formula for a given x86 instruction and not, for example, being able to generate an interpreter given the semantics. It would be interesting to use the Remill templates to generate SMT formulas and check them against stoke. |
@jrmuizel We use these Semantics in McSema, and some people run KLEE on McSema-produced bitcode. KLEE converts LLVM IR into SMT queries. I also use these semantics in a dynamic binary translator. I have an issue filed with the Stoke team to use Remill semantics, because I believe we support more instructions than they do. One way to get a better understanding of how to use Remill is to see how the test case systems lift instructions. You can see that here and here. You can find out more about Remill or talk to me (username |
ARM includes a full machine readable description of the semantics of each instruction in their xml files: https://developer.arm.com/products/architecture/a-profile/exploration-tools https://alastairreid.github.io/alastairreid.github.io/ARM-v8a-xml-release/
I'm not sure if it's practical to get to that level of accuracy/completeness but the more information available the better.
The text was updated successfully, but these errors were encountered: