diff --git a/docs/asmc.html b/docs/asmc.html new file mode 100644 index 000000000..94046de92 --- /dev/null +++ b/docs/asmc.html @@ -0,0 +1,3339 @@ + + + + + + + + + + How to Use Inline Assembly Language in C Code — gcc 6 documentation + + + + + + + + + + + + + + + +
+
+


+ + + +
+
+ +
+
+
+
+ +
+

How to Use Inline Assembly Language in C Code

+

The asm keyword allows you to embed assembler instructions within C code. GCC provides two forms of inline asm statements. A basic asm statement is one with no operands (see Basic Asm - Assembler Instructions Without Operands), while an extended asm statement (see Extended Asm - Assembler Instructions with C Expression Operands) includes one or more operands. The extended form is preferred for mixing C and assembly language within a function, but to include assembly language at top level you must use basic asm.

+

You can also use the asm keyword to override the assembler name +for a C symbol, or to place a C variable in a specific register.

+
+
    +
+
+
+

Basic Asm - Assembler Instructions Without Operands

+

A basic asm statement has the following syntax:

+
asm [ volatile ] ( ``AssemblerInstructions`` )
+
+

The asm keyword is a GNU extension. +When writing code that can be compiled with -ansi and the +various -std options, use __asm__ instead of +asm (see Alternate Keywords).

+
+
+

Qualifiers

+
+
volatile
+
The optional volatile qualifier has no effect. +All basic asm blocks are implicitly volatile.
+
+
+
+

Parameters

+
+
AssemblerInstructions
+

This is a literal string that specifies the assembler code. The string can +contain any instructions recognized by the assembler, including directives. +GCC does not parse the assembler instructions themselves and +does not know what they mean or even whether they are valid assembler input.

+

You may place multiple assembler instructions together in a single asm string, separated by the characters normally used in assembly code for the system. A combination that works in most places is a newline to break the line, plus a tab character (written as \n\t). Some assemblers allow semicolons as a line separator. However, note that some assembler dialects use semicolons to start a comment.

+
+
+
+
+

Remarks

+

Using extended asm typically produces smaller, safer, and more +efficient code, and in most cases it is a better solution than basic +asm. However, there are two situations where only basic asm +can be used:

+
    +
  • Extended asm statements have to be inside a C +function, so to write inline assembly language at file scope (‘top-level’), +outside of C functions, you must use basic asm. +You can use this technique to emit assembler directives, +define assembly language macros that can be invoked elsewhere in the file, +or write entire functions in assembly language.
  • +
  • Functions declared +with the naked attribute also require basic asm +(see Declaring Attributes of Functions).
  • +
+

Safely accessing C data and calling functions from basic asm is more +complex than it may appear. To access C data, it is better to use extended +asm.

+

Do not expect a sequence of asm statements to remain perfectly +consecutive after compilation. If certain instructions need to remain +consecutive in the output, put them in a single multi-instruction asm +statement. Note that GCC’s optimizers can move asm statements +relative to other code, including across jumps.

+

asm statements may not perform jumps into other asm statements. +GCC does not know about these jumps, and therefore cannot take +account of them when deciding how to optimize. Jumps from asm to C +labels are only supported in extended asm.

+

Under certain circumstances, GCC may duplicate (or remove duplicates of) your +assembly code when optimizing. This can lead to unexpected duplicate +symbol errors during compilation if your assembly code defines symbols or +labels.

+

Since GCC does not parse the AssemblerInstructions, it has no +visibility of any symbols it references. This may result in GCC discarding +those symbols as unreferenced.

+

The compiler copies the assembler instructions in a basic asm +verbatim to the assembly language output file, without +processing dialects or any of the % operators that are available with +extended asm. This results in minor differences between basic +asm strings and extended asm templates. For example, to refer to +registers you might use %eax in basic asm and +%%eax in extended asm.

+

On targets such as x86 that support multiple assembler dialects, +all basic asm blocks use the assembler dialect specified by the +-masm command-line option (see x86 Options). +Basic asm provides no +mechanism to provide different assembler strings for different dialects.

+

Here is an example of basic asm for i386:

+
/* Note that this code will not compile with -masm=intel */
+#define DebugBreak() asm("int $3")
+
+
+
+
+

Extended Asm - Assembler Instructions with C Expression Operands

+

With extended asm you can read and write C variables from +assembler and perform jumps from assembler code to C labels. +Extended asm syntax uses colons (:) to delimit +the operand parameters after the assembler template:

+
asm [volatile] ( ``AssemblerTemplate``
+                 : ``OutputOperands``
+                 [ : ``InputOperands``
+                 [ : ``Clobbers`` ] ])
+
+asm [volatile] goto ( ``AssemblerTemplate``
+                      :
+                      : ``InputOperands``
+                      : ``Clobbers``
+                      : ``GotoLabels``)
+
+

The asm keyword is a GNU extension. +When writing code that can be compiled with -ansi and the +various -std options, use __asm__ instead of +asm (see Alternate Keywords).

+
+
+

Qualifiers

+
+
volatile
+
The typical use of extended asm statements is to manipulate input +values to produce output values. However, your asm statements may +also produce side effects. If so, you may need to use the volatile +qualifier to disable certain optimizations. See Volatile.
+
goto
+
This qualifier informs the compiler that the asm statement may +perform a jump to one of the labels listed in the GotoLabels. +See Goto Labels.
+
+
+
+

Parameters

+
+
AssemblerTemplate
+
This is a literal string that is the template for the assembler code. It is a +combination of fixed text and tokens that refer to the input, output, +and goto parameters. See Assembler Template.
+
OutputOperands
+
A comma-separated list of the C variables modified by the instructions in the +AssemblerTemplate. An empty list is permitted. See Output Operands.
+
InputOperands
+
A comma-separated list of C expressions read by the instructions in the +AssemblerTemplate. An empty list is permitted. See Input Operands.
+
Clobbers
+
A comma-separated list of registers or other values changed by the +AssemblerTemplate, beyond those listed as outputs. +An empty list is permitted. See Clobbers.
+
GotoLabels
+

When you are using the goto form of asm, this section contains +the list of all C labels to which the code in the +AssemblerTemplate may jump. +See Goto Labels.

+

asm statements may not perform jumps into other asm statements, +only to the listed GotoLabels. +GCC’s optimizers do not know about other jumps; therefore they cannot take +account of them when deciding how to optimize.

+

The total number of input + output + goto operands is limited to 30.

+
+
+
+
+

Remarks

+

The asm statement allows you to include assembly instructions directly +within C code. This may help you to maximize performance in time-sensitive +code or to access assembly instructions that are not readily available to C +programs.

+

Note that extended asm statements must be inside a function. Only +basic asm may be outside functions (see Basic Asm - Assembler Instructions Without Operands). +Functions declared with the naked attribute also require basic +asm (see Declaring Attributes of Functions).

+

While the uses of asm are many and varied, it may help to think of an +asm statement as a series of low-level instructions that convert input +parameters to output parameters. So a simple (if not particularly useful) +example for i386 using asm might look like this:

+
int src = 1;
+int dst;
+
+asm ("mov %1, %0\n\t"
+    "add $1, %0"
+    : "=r" (dst)
+    : "r" (src));
+
+printf("%d\n", dst);
+
+
+

This code copies src to dst and adds 1 to dst.

+
+

Volatile

+

GCC’s optimizers sometimes discard asm statements if they determine +there is no need for the output variables. Also, the optimizers may move +code out of loops if they believe that the code will always return the same +result (i.e. none of its input values change between calls). Using the +volatile qualifier disables these optimizations. asm statements +that have no output operands, including asm goto statements, +are implicitly volatile.

+

This i386 code demonstrates a case that does not use (or require) the +volatile qualifier. If it is performing assertion checking, this code +uses asm to perform the validation. Otherwise, dwRes is +unreferenced by any code. As a result, the optimizers can discard the +asm statement, which in turn removes the need for the entire +DoCheck routine. By omitting the volatile qualifier when it +isn’t needed you allow the optimizers to produce the most efficient code +possible.

+
void DoCheck(uint32_t dwSomeValue)
+{
+   uint32_t dwRes;
+
+   // Assumes dwSomeValue is not zero.
+   asm ("bsfl %1,%0"
+     : "=r" (dwRes)
+     : "r" (dwSomeValue)
+     : "cc");
+
+   assert(dwRes > 3);
+}
+
+
+

The next example shows a case where the optimizers can recognize that the input +(dwSomeValue) never changes during the execution of the function and can +therefore move the asm outside the loop to produce more efficient code. +Again, using volatile disables this type of optimization.

+
void do_print(uint32_t dwSomeValue)
+{
+   uint32_t dwRes;
+
+   for (uint32_t x=0; x < 5; x++)
+   {
+      // Assumes dwSomeValue is not zero.
+      asm ("bsfl %1,%0"
+        : "=r" (dwRes)
+        : "r" (dwSomeValue)
+        : "cc");
+
+      printf("%u: %u %u\n", x, dwSomeValue, dwRes);
+   }
+}
+
+
+

The following example demonstrates a case where you need to use the +volatile qualifier. +It uses the x86 rdtsc instruction, which reads +the computer’s time-stamp counter. Without the volatile qualifier, +the optimizers might assume that the asm block will always return the +same value and therefore optimize away the second call.

+
uint64_t msr;
+
+asm volatile ( "rdtsc\n\t"    // Returns the time in EDX:EAX.
+        "shl $32, %%rdx\n\t"  // Shift the upper bits left.
+        "or %%rdx, %0"        // 'Or' in the lower bits.
+        : "=a" (msr)
+        :
+        : "rdx");
+
+printf("msr: %llx\n", (unsigned long long) msr);
+
+// Do other work...
+
+// Reprint the timestamp
+asm volatile ( "rdtsc\n\t"    // Returns the time in EDX:EAX.
+        "shl $32, %%rdx\n\t"  // Shift the upper bits left.
+        "or %%rdx, %0"        // 'Or' in the lower bits.
+        : "=a" (msr)
+        :
+        : "rdx");
+
+printf("msr: %llx\n", (unsigned long long) msr);
+
+
+

GCC’s optimizers do not treat this code like the non-volatile code in the +earlier examples. They do not move it out of loops or omit it on the +assumption that the result from a previous call is still valid.

+

Note that the compiler can move even volatile asm instructions relative +to other code, including across jump instructions. For example, on many +targets there is a system register that controls the rounding mode of +floating-point operations. Setting it with a volatile asm, as in the +following PowerPC example, does not work reliably.

+
asm volatile("mtfsf 255, %0" : : "f" (fpenv));
+sum = x + y;
+
+
+

The compiler may move the addition back before the volatile asm. To +make it work as expected, add an artificial dependency to the asm by +referencing a variable in the subsequent code, for example:

+
asm volatile ("mtfsf 255,%1" : "=X" (sum) : "f" (fpenv));
+sum = x + y;
+
+
+

Under certain circumstances, GCC may duplicate (or remove duplicates of) your +assembly code when optimizing. This can lead to unexpected duplicate symbol +errors during compilation if your asm code defines symbols or labels. +Using %= +(see Assembler Template) may help resolve this problem.

+
+
+

Assembler Template

+

An assembler template is a literal string containing assembler instructions. The compiler replaces tokens in the template that refer to inputs, outputs, and goto labels, and then outputs the resulting string to the assembler. The string can contain any instructions recognized by the assembler, including directives. GCC does not parse the assembler instructions themselves and does not know what they mean or even whether they are valid assembler input. However, it does count the statements (see Size of an asm).

+

You may place multiple assembler instructions together in a single asm string, separated by the characters normally used in assembly code for the system. A combination that works in most places is a newline to break the line, plus a tab character to move to the instruction field (written as \n\t). Some assemblers allow semicolons as a line separator. However, note that some assembler dialects use semicolons to start a comment.

+

Do not expect a sequence of asm statements to remain perfectly +consecutive after compilation, even when you are using the volatile +qualifier. If certain instructions need to remain consecutive in the output, +put them in a single multi-instruction asm statement.

+

Accessing data from C programs without using input/output operands (such as +by using global symbols directly from the assembler template) may not work as +expected. Similarly, calling functions directly from an assembler template +requires a detailed understanding of the target assembler and ABI.

+

Since GCC does not parse the assembler template, +it has no visibility of any +symbols it references. This may result in GCC discarding those symbols as +unreferenced unless they are also listed as input, output, or goto operands.

+
+
+
+

Special format strings

+

In addition to the tokens described by the input, output, and goto operands, +these tokens have special meanings in the assembler template:

+
+
%%
+
Outputs a single % into the assembler code.
+
%=
+
Outputs a number that is unique to each instance of the asm +statement in the entire compilation. This option is useful when creating local +labels and referring to them multiple times in a single template that +generates multiple assembler instructions.
+
%{ %| %}
+
Outputs {, |, and } characters (respectively) +into the assembler code. When unescaped, these characters have special +meaning to indicate multiple assembler dialects, as described below.
+
+
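As a hedged illustration of %=, the sketch below defines a local label inside an inline function, so the same template can be emitted at several call sites without duplicate-symbol errors. The function name clamp_to_zero and the label prefix .Lskip are hypothetical, and the asm assumes an x86 target with the default att dialect; other targets take the plain C fallback.

```c
#include <stdint.h>

/* Hypothetical helper: returns x, or 0 when x is negative.
   Assumes GCC or Clang on x86 with the default att dialect. */
static inline int32_t clamp_to_zero(int32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ ("testl %0, %0\n\t"   /* set the sign flag from x */
             "jns .Lskip%=\n\t"   /* non-negative: keep x as it is */
             "xorl %0, %0\n"      /* negative: replace x with 0 */
             ".Lskip%=:"          /* %= expands to a number unique to this asm */
             : "+r" (x)
             :
             : "cc");
#else
    if (x < 0)
        x = 0;                    /* portable fallback for other targets */
#endif
    return x;
}
```

Without %=, inlining clamp_to_zero into two callers would define the same label twice in one assembler file.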

Multiple assembler dialects in asm templates

On targets such as x86, GCC supports multiple assembler dialects. The -masm option controls which dialect GCC uses as its default for inline assembler. The target-specific documentation for the -masm option contains the list of supported dialects, as well as the default dialect if the option is not specified. This information may be important to understand, since assembler code that works correctly when compiled using one dialect will likely fail if compiled using another. See x86 Options.

+

If your code needs to support multiple assembler dialects (for example, if +you are writing public headers that need to support a variety of compilation +options), use constructs of this form:

+
{ dialect0 | dialect1 | dialect2... }
+
+
+

This construct outputs dialect0 +when using dialect #0 to compile the code, +dialect1 for dialect #1, etc. If there are fewer alternatives within the +braces than the number of dialects the compiler supports, the construct +outputs nothing.

+

For example, if an x86 compiler supports two dialects +(att, intel), an +assembler template such as this:

+
"bt{l %[Offset],%[Base] | %[Base],%[Offset]}; jc %l2"
+
+
+

is equivalent to one of

+
"btl %[Offset],%[Base] ; jc %l2"   /* att dialect */
+"bt %[Base],%[Offset]; jc %l2"     /* intel dialect */
+
+
+

Using that same compiler, this code:

+
"xchg{l}\t{%%}ebx, %1"
+
+
+

corresponds to either

+
"xchgl\t%%ebx, %1"                 /* att dialect */
+"xchg\tebx, %1"                    /* intel dialect */
+
+
+

There is no support for nesting dialect alternatives.
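The dialect construct can be tried out with a complete function. This is a minimal sketch, assuming an x86 target; the helper name add_u32 is hypothetical, and targets other than x86 use the plain C fallback.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + b using a single x86 add instruction.
   The template assembles under both -masm=att and -masm=intel. */
static uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* att dialect emits:   addl %1, %0
       intel dialect emits: add %0, %1   */
    __asm__ ("add{l %1, %0| %0, %1}"
             : "+r" (a)        /* read and written by the instruction */
             : "r" (b)
             : "cc");          /* add modifies the flags register */
    return a;
#else
    return a + b;              /* portable fallback for other targets */
#endif
}
```

Only the parts that differ between dialects (the size suffix and the operand order) sit inside the braces; the mnemonic is shared.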

+
+

Output Operands

+

An asm statement has zero or more output operands indicating the names +of C variables modified by the assembler code.

+

In this i386 example, old (referred to in the template string as +%0) and *Base (as %1) are outputs and Offset +(%2) is an input:

+
bool old;
+
+__asm__ ("btsl %2,%1\n\t" // Turn on zero-based bit #Offset in Base.
+         "sbb %0,%0"      // Use the CF to calculate old.
+   : "=r" (old), "+rm" (*Base)
+   : "Ir" (Offset)
+   : "cc");
+
+return old;
+
+
+

Operands are separated by commas. Each operand has this format:

+
[ [``asmSymbolicName``] ] ``constraint`` (``cvariablename``)
+
+
+
asmSymbolicName
+

Specifies a symbolic name for the operand. +Reference the name in the assembler template +by enclosing it in square brackets +(i.e. %[Value]). The scope of the name is the asm statement +that contains the definition. Any valid C variable name is acceptable, +including names already defined in the surrounding code. No two operands +within the same asm statement can use the same symbolic name.

+

When not using an asmSymbolicName, use the (zero-based) position +of the operand +in the list of operands in the assembler template. For example if there are +three output operands, use %0 in the template to refer to the first, +%1 for the second, and %2 for the third.

+
+
constraint
+

A string constant specifying constraints on the placement of the operand; see Constraints for details.

+

Output constraints must begin with either = (a variable overwriting an +existing value) or + (when reading and writing). When using +=, do not assume the location contains the existing value +on entry to the asm, except +when the operand is tied to an input; see Input Operands.

+

After the prefix, there must be one or more additional constraints +(see constraints) that describe where the value resides. Common +constraints include r for register and m for memory. +When you list more than one possible location (for example, "=rm"), +the compiler chooses the most efficient one based on the current context. +If you list as many alternates as the asm statement allows, you permit +the optimizers to produce the best possible code. +If you must use a specific register, but your Machine Constraints do not +provide sufficient control to select the specific register you want, +local register variables may provide a solution (see Specifying Registers for Local Variables).

+
+
cvariablename
+

Specifies a C lvalue expression to hold the output, typically a variable name. +The enclosing parentheses are a required part of the syntax.

+

When the compiler selects the registers to use to represent the output operands, it does not use any of the clobbered registers (see Clobbers).

+

Output operand expressions must be lvalues. The compiler cannot check whether +the operands have data types that are reasonable for the instruction being +executed. For output expressions that are not directly addressable (for +example a bit-field), the constraint must allow a register. In that case, GCC +uses the register as the output of the asm, and then stores that +register into the output.

+

Operands using the + constraint modifier count as two operands +(that is, both as input and output) towards the total maximum of 30 operands +per asm statement.

+

Use the & constraint modifier (see Constraint Modifier Characters) on all output +operands that must not overlap an input. Otherwise, +GCC may allocate the output operand in the same register as an unrelated +input operand, on the assumption that the assembler code consumes its +inputs before producing outputs. This assumption may be false if the assembler +code actually consists of more than one instruction.

+

The same problem can occur if one output parameter (a) allows a register +constraint and another output parameter (b) allows a memory constraint. +The code generated by GCC to access the memory address in b can contain +registers which might be shared by a, and GCC considers those +registers to be inputs to the asm. As above, GCC assumes that such input +registers are consumed before any outputs are written. This assumption may +result in incorrect behavior if the asm writes to a before using +b. Combining the & modifier with the register constraint on a +ensures that modifying a does not affect the address referenced by +b. Otherwise, the location of b +is undefined if a is modified before using b.

+

asm supports operand modifiers on operands (for example %k2 instead of simply %2). Typically these modifiers are hardware dependent. The list of supported modifiers for x86 is found at x86 Operand Modifiers.

+

If the C code that follows the asm makes no use of any of the output +operands, use volatile for the asm statement to prevent the +optimizers from discarding the asm statement as unneeded +(see Volatile).

+

This code makes no use of the optional asmSymbolicName. Therefore it +references the first output operand as %0 (were there a second, it +would be %1, etc). The number of the first input operand is one greater +than that of the last output operand. In this i386 example, that makes +Mask referenced as %1:

+
uint32_t Mask = 1234;
+uint32_t Index;
+
+  asm ("bsfl %1, %0"
+     : "=r" (Index)
+     : "r" (Mask)
+     : "cc");
+
+
+

That code overwrites the variable Index (=), +placing the value in a register (r). +Using the generic r constraint instead of a constraint for a specific +register allows the compiler to pick the register to use, which can result +in more efficient code. This may not be possible if an assembler instruction +requires a specific register.

+

The following i386 example uses the asmSymbolicName syntax. +It produces the +same result as the code above, but some may consider it more readable or more +maintainable since reordering index numbers is not necessary when adding or +removing operands. The names aIndex and aMask +are only used in this example to emphasize which +names get used where. +It is acceptable to reuse the names Index and Mask.

+
uint32_t Mask = 1234;
+uint32_t Index;
+
+  asm ("bsfl %[aMask], %[aIndex]"
+     : [aIndex] "=r" (Index)
+     : [aMask] "r" (Mask)
+     : "cc");
+
+
+

Here are some more examples of output operands.

+
uint32_t c = 1;
+uint32_t d;
+uint32_t *e = &c;
+
+asm ("mov %[e], %[d]"
+   : [d] "=rm" (d)
+   : [e] "rm" (*e));
+
+
+

Here, d may either be in a register or in memory. Since the compiler +might already have the current value of the uint32_t location +pointed to by e +in a register, you can enable it to choose the best location +for d by specifying both constraints.

+
+
+

Input Operands

+

Input operands make values from C variables and expressions available to the +assembly code.

+

Operands are separated by commas. Each operand has this format:

+
[ [``asmSymbolicName``] ] ``constraint`` (``cexpression``)
+
+
+
asmSymbolicName
+

Specifies a symbolic name for the operand. +Reference the name in the assembler template +by enclosing it in square brackets +(i.e. %[Value]). The scope of the name is the asm statement +that contains the definition. Any valid C variable name is acceptable, +including names already defined in the surrounding code. No two operands +within the same asm statement can use the same symbolic name.

+

When not using an asmSymbolicName, use the (zero-based) position +of the operand +in the list of operands in the assembler template. For example if there are +two output operands and three inputs, +use %2 in the template to refer to the first input operand, +%3 for the second, and %4 for the third.

+
+
constraint
+

A string constant specifying constraints on the placement of the operand; see Constraints for details.

+

Input constraint strings may not begin with either = or +. +When you list more than one possible location (for example, "irm"), +the compiler chooses the most efficient one based on the current context. +If you must use a specific register, but your Machine Constraints do not +provide sufficient control to select the specific register you want, +local register variables may provide a solution (see Specifying Registers for Local Variables).

+

Input constraints can also be digits (for example, "0"). This indicates +that the specified input must be in the same place as the output constraint +at the (zero-based) index in the output constraint list. +When using asmSymbolicName syntax for the output operands, +you may use these names (enclosed in brackets []) instead of digits.

+
+
cexpression
+

This is the C variable or expression being passed to the asm statement +as input. The enclosing parentheses are a required part of the syntax.

+

When the compiler selects the registers to use to represent the input operands, it does not use any of the clobbered registers (see Clobbers).

+

If there are no output operands but there are input operands, place two +consecutive colons where the output operands would go:

+
__asm__ ("some instructions"
+   : /* No outputs. */
+   : "r" (Offset / 8));
+
+
+

Warning: Do not modify the contents of input-only operands +(except for inputs tied to outputs). The compiler assumes that on exit from +the asm statement these operands contain the same values as they +had before executing the statement. +It is not possible to use clobbers +to inform the compiler that the values in these inputs are changing. One +common work-around is to tie the changing input variable to an output variable +that never gets used. Note, however, that if the code that follows the +asm statement makes no use of any of the output operands, the GCC +optimizers may discard the asm statement as unneeded +(see Volatile).

+

asm supports operand modifiers on operands (for example %k2 instead of simply %2). Typically these modifiers are hardware dependent. The list of supported modifiers for x86 is found at x86 Operand Modifiers.

+

In this example using the fictitious combine instruction, the +constraint "0" for input operand 1 says that it must occupy the same +location as output operand 0. Only input operands may use numbers in +constraints, and they must each refer to an output operand. Only a number (or +the symbolic assembler name) in the constraint can guarantee that one operand +is in the same place as another. The mere fact that foo is the value of +both operands is not enough to guarantee that they are in the same place in +the generated assembler code.

+
asm ("combine %2, %0"
+   : "=r" (foo)
+   : "0" (foo), "g" (bar));
+
+
+

Here is an example using symbolic names.

+
asm ("cmoveq %1, %2, %[result]"
+   : [result] "=r"(result)
+   : "r" (test), "r" (new), "[result]" (old));
+
+
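For a version of the matching constraint that uses a real instruction, here is a hedged sketch for x86 with the att dialect. The helper name rotate_left is hypothetical. The "0" constraint places value in the same register as the output, which is what the two-operand rol instruction requires.

```c
#include <stdint.h>

/* Hypothetical helper: rotates value left by count bits (count taken mod 32). */
static uint32_t rotate_left(uint32_t value, uint8_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t result;
    __asm__ ("roll %%cl, %0"     /* rotate operand 0 in place by %cl */
             : "=r" (result)
             : "0" (value),      /* must occupy the same location as operand 0 */
               "c" (count)       /* rol takes a variable count in %cl */
             : "cc");
    return result;
#else
    count &= 31;                 /* portable fallback for other targets */
    return count ? (value << count) | (value >> (32 - count)) : value;
#endif
}
```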
+
+
+

Clobbers

+

While the compiler is aware of changes to entries listed in the output +operands, the inline asm code may modify more than just the outputs. For +example, calculations may require additional registers, or the processor may +overwrite a register as a side effect of a particular assembler instruction. +In order to inform the compiler of these changes, list them in the clobber +list. Clobber list items are either register names or the special clobbers +(listed below). Each clobber list item is a string constant +enclosed in double quotes and separated by commas.

+

Clobber descriptions may not in any way overlap with an input or output +operand. For example, you may not have an operand describing a register class +with one member when listing that register in the clobber list. Variables +declared to live in specific registers (see Variables in Specified Registers) and used +as asm input or output operands must have no part mentioned in the +clobber description. In particular, there is no way to specify that input +operands get modified without also specifying them as output operands.

+

When the compiler selects which registers to use to represent input and output +operands, it does not use any of the clobbered registers. As a result, +clobbered registers are available for any use in the assembler code.

+

Here is a realistic example for the VAX showing the use of clobbered +registers:

+
asm volatile ("movc3 %0, %1, %2"
+                   : /* No outputs. */
+                   : "g" (from), "g" (to), "g" (count)
+                   : "r0", "r1", "r2", "r3", "r4", "r5");
+
+
+

Also, there are two special clobber arguments:

+
+
"cc"
+
The "cc" clobber indicates that the assembler code modifies the flags +register. On some machines, GCC represents the condition codes as a specific +hardware register; "cc" serves to name this register. +On other machines, condition code handling is different, +and specifying "cc" has no effect. But +it is valid no matter what the target.
+
"memory"
+

The "memory" clobber tells the compiler that the assembly code +performs memory +reads or writes to items other than those listed in the input and output +operands (for example, accessing the memory pointed to by one of the input +parameters). To ensure memory contains correct values, GCC may need to flush +specific register values to memory before executing the asm. Further, +the compiler does not assume that any values read from memory before an +asm remain unchanged after that asm; it reloads them as +needed. +Using the "memory" clobber effectively forms a read/write +memory barrier for the compiler.

+

Note that this clobber does not prevent the processor from doing +speculative reads past the asm statement. To prevent that, you need +processor-specific fence instructions.
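A common use of the "memory" clobber is an empty asm that acts purely as a compiler barrier. The sketch below is one such idiom, not something this manual defines: the macro name compiler_barrier and the variables payload and ready are hypothetical. It constrains only the compiler; ordering between processors still needs the fence instructions mentioned above.

```c
#include <stdint.h>

/* Emits no instructions. The compiler may not keep memory values cached in
   registers across this statement or move memory accesses past it. */
#define compiler_barrier() __asm__ __volatile__ ("" : : : "memory")

static uint32_t payload;   /* hypothetical data being handed over */
static uint32_t ready;     /* hypothetical flag announcing the data */

static void publish(uint32_t value)
{
    payload = value;
    compiler_barrier();    /* keep the payload store before the flag store */
    ready = 1;
}

static uint32_t consume(void)
{
    if (!ready)
        return 0;
    compiler_barrier();    /* do not read payload before checking the flag */
    return payload;
}
```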

+

Flushing registers to memory has performance implications and may be an issue +for time-sensitive code. You can use a trick to avoid this if the size of +the memory being accessed is known at compile time. For example, if accessing +ten bytes of a string, use a memory input like:

+

"m"( ({ struct { char x[10]; } *p = (void *)ptr ; *p; }) )

+
+
+
+
+

Goto Labels

+

asm goto allows assembly code to jump to one or more C labels. The GotoLabels section in an asm goto statement contains a comma-separated list of all C labels to which the assembler code may jump. GCC assumes that asm execution falls through to the next statement (if this is not the case, consider using the __builtin_unreachable intrinsic after the asm statement). Optimization of asm goto may be improved by using the hot and cold label attributes (see Label Attributes).

+

An asm goto statement cannot have outputs. +This is due to an internal restriction of +the compiler: control transfer instructions cannot have outputs. +If the assembler code does modify anything, use the "memory" clobber +to force the +optimizers to flush all register values to memory and reload them if +necessary after the asm statement.

+

Also note that an asm goto statement is always implicitly +considered volatile.

+

To reference a label in the assembler template, prefix it with %l (lowercase L) followed by its (zero-based) position in GotoLabels plus the number of input operands. For example, if the asm has three inputs and references two labels, refer to the first label as %l3 and the second as %l4.

+

Alternatively, you can reference labels using the actual C label name enclosed in brackets. For example, to reference a label named carry, you can use %l[carry]. The label must still be listed in the GotoLabels section when using this approach.

+

Here is an example of asm goto for i386:

+
asm goto (
+    "btl %1, %0\n\t"
+    "jc %l2"
+    : /* No outputs. */
+    : "r" (p1), "r" (p2)
+    : "cc"
+    : carry);
+
+return 0;
+
+carry:
+return 1;
+
+
+
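The same bit test can also be written with the bracketed %l[carry] form and named input operands. The following is a minimal sketch, not a definitive implementation: the function name bit_is_set and the operand names value and bit are hypothetical, and targets other than x86 take a plain C fallback.

```c
/* Hypothetical helper: returns 1 if bit number bit (0 to 31) of value is set. */
int bit_is_set(unsigned value, unsigned bit)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ goto ("btl %[bit], %[value]\n\t"  /* copy the selected bit into CF */
                  "jc %l[carry]"               /* jump to the C label if CF is set */
                  : /* No outputs. */
                  : [value] "r" (value), [bit] "r" (bit)
                  : "cc"
                  : carry);
    return 0;
carry:
    return 1;
#else
    /* Portable fallback for non-x86 targets. */
    return (int) ((value >> (bit & 31u)) & 1u);
#endif
}
```

Naming the label avoids having to recount positions when inputs are added or removed.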

The following example shows an asm goto that uses a memory clobber.

+
int frob(int x)
+{
+  int y;
+  asm goto ("frob %%r5, %1; jc %l[error]; mov (%2), %%r5"
+            : /* No outputs. */
+            : "r"(x), "r"(&y)
+            : "r5", "memory"
+            : error);
+  return y;
+error:
+  return -1;
+}
+
+
+
+
+

x86 Operand Modifiers

+

References to input, output, and goto operands in the assembler template +of extended asm statements can use +modifiers to affect the way the operands are formatted in +the code output to the assembler. For example, the +following code uses the h and b modifiers for x86:

+
uint16_t  num;
+asm volatile ("xchg %h0, %b0" : "+a" (num) );
+
+
+

These modifiers generate this assembler code:

+
xchg %ah, %al
+
+
+
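Wrapped in a function, the xchg example above becomes a 16-bit byte swap. This is a sketch under the assumption of an x86 target; the name swap_bytes is hypothetical, and other targets use the shift-based fallback.

```c
#include <stdint.h>

/* Hypothetical helper: exchanges the two bytes of a 16-bit value. */
uint16_t swap_bytes(uint16_t num)
{
#if defined(__x86_64__) || defined(__i386__)
    /* With the "a" constraint, %h0 prints %ah and %b0 prints %al. */
    __asm__ ("xchg %h0, %b0" : "+a" (num));
#else
    /* Portable fallback for non-x86 targets. */
    num = (uint16_t) ((num << 8) | (num >> 8));
#endif
    return num;
}
```

For example, swap_bytes(0x1234) returns 0x3412.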

The rest of this discussion uses the following code for illustrative purposes.

+
int main()
+{
+   int iInt = 1;
+
+top:
+
+   asm volatile goto ("some assembler instructions here"
+   : /* No outputs. */
+   : "q" (iInt), "X" (sizeof(unsigned char) + 1)
+   : /* No clobbers. */
+   : top);
+}
+
+
+

With no modifiers, this is what the output from the operands would be for the +att and intel dialects of assembler:

Operand | masm=att | masm=intel
%0 | %eax | eax
%1 | $2 | 2
%2 | $.L2 | OFFSET FLAT:.L2
+

The table below shows the list of supported modifiers and their effects.

Modifier | Description | Operand | masm=att | masm=intel
z | Print the opcode suffix for the size of the current integer operand (one of b/w/l/q). | %z0 | l |
b | Print the QImode name of the register. | %b0 | %al | al
h | Print the QImode name for a ‘high’ register. | %h0 | %ah | ah
w | Print the HImode name of the register. | %w0 | %ax | ax
k | Print the SImode name of the register. | %k0 | %eax | eax
q | Print the DImode name of the register. | %q0 | %rax | rax
l | Print the label name with no punctuation. | %l2 | .L2 | .L2
c | Require a constant operand and print the constant expression with no punctuation. | %c1 | 2 | 2
+
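As an illustration of the c modifier, the sketch below uses a constant operand as the displacement of an address, where the usual $ prefix would not be accepted by the assembler. The function name add_forty_two and the operand names are hypothetical, and the asm path assumes x86-64 with the default att dialect; other targets use plain C.

```c
/* Hypothetical helper: returns x + 42. */
long long add_forty_two(long long x)
{
    long long r;
#if defined(__x86_64__)
    /* %c[k] prints the constant as 42 rather than $42, so it can serve
       as a displacement, giving for example: leaq 42(%rdi), %rax */
    __asm__ ("leaq %c[k](%[x]), %[r]"
             : [r] "=r" (r)
             : [x] "r" (x), [k] "i" (42));
#else
    /* Portable fallback for targets other than x86-64. */
    r = x + 42;
#endif
    return r;
}
```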

x86 Floating-Point asm Operands

On x86 targets, there are several rules on the usage of stack-like registers in the operands of an asm. These rules apply only to the operands that are stack-like registers:

+
    +
  • Given a set of input registers that die in an asm, it is +necessary to know which are implicitly popped by the asm, and +which must be explicitly popped by GCC.

    +

    An input register that is implicitly popped by the asm must be +explicitly clobbered, unless it is constrained to match an +output operand.

    +
  • +
  • For any input register that is implicitly popped by an asm, it is necessary to know how to adjust the stack to compensate for the pop. If any non-popped input is closer to the top of the reg-stack than the implicitly popped register, it would not be possible to know what the stack looked like: it is not clear how the rest of the stack ‘slides up’.

    +

    All implicitly popped input registers must be closer to the top of +the reg-stack than any input that is not implicitly popped.

    +

    It is possible that if an input dies in an asm, the compiler might +use the input register for an output reload. Consider this example:

    +
    asm ("foo" : "=t" (a) : "f" (b));
    +
    +
    +

    This code says that input b is not popped by the asm, and that +the asm pushes a result onto the reg-stack, i.e., the stack is one +deeper after the asm than it was before. But, it is possible that +reload may think that it can use the same register for both the input and +the output.

    +

    To prevent this from happening, +if any input operand uses the f constraint, all output register +constraints must use the & early-clobber modifier.

    +

    The example above is correctly written as:

    +
    asm ("foo" : "=&t" (a) : "f" (b));
    +
    +
    +
  • +
  • Some operands need to be in particular places on the stack. All output operands fall in this category: GCC has no other way to know which registers the outputs appear in unless you indicate this in the constraints.

    +

    Output operands must specifically indicate which register an output +appears in after an asm. =f is not allowed: the operand +constraints must select a class with a single register.

    +
  • +
  • Output operands may not be ‘inserted’ between existing stack registers. +Since no 387 opcode uses a read/write operand, all output operands +are dead before the asm, and are pushed by the asm. +It makes no sense to push anywhere but the top of the reg-stack.

    +

    Output operands must start at the top of the reg-stack: output +operands may not ‘skip’ a register.

    +
  • +
  • Some asm statements may need extra stack space for internal +calculations. This can be guaranteed by clobbering stack registers +unrelated to the inputs and outputs.

    +
  • +
+

This asm +takes one input, which is internally popped, and produces two outputs.

+
asm ("fsincos" : "=t" (cos), "=u" (sin) : "0" (inp));
+
+
+

This asm takes two inputs, which are popped by the fyl2xp1 opcode, +and replaces them with one output. The st(1) clobber is necessary +for the compiler to know that fyl2xp1 pops both inputs.

+
asm ("fyl2xp1" : "=t" (result) : "0" (x), "u" (y) : "st(1)");
+
+
+
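The two-input pattern above can be tried with an instruction whose result is easy to check. The sketch below substitutes faddp for fyl2xp1 (an assumption of this example, not something the text above prescribes): faddp adds st(0) to st(1) and pops, so both inputs are consumed and one result is left at the top of the reg-stack. The name x87_add is hypothetical; non-x86 targets use plain C.

```c
/* Hypothetical helper: returns x + y, computed on the x87 stack on x86. */
double x87_add(double x, double y)
{
    double result;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("faddp"
             : "=t" (result)       /* output appears at the top of the stack */
             : "0" (x), "u" (y)    /* x in st(0), y in st(1) */
             : "st(1)");           /* y is implicitly popped by the asm */
#else
    /* Portable fallback for non-x86 targets. */
    result = x + y;
#endif
    return result;
}
```

As with fyl2xp1, the st(1) clobber is what tells the compiler that the second input does not survive the asm.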

Constraints for asm Operands

+

Here are specific details on what constraint letters you can use with +asm operands. +Constraints can say whether +an operand may be in a register, and which kinds of register; whether the +operand can be a memory reference, and which kinds of address; whether the +operand may be an immediate constant, and which possible values it may +have. Constraints can also require two operands to match. +Side-effects aren’t allowed in operands of inline asm, unless +< or > constraints are used, because there is no guarantee +that the side-effects will happen exactly once in an instruction that can update +the addressing register.

+
+
    +
+
+
+
+

Simple Constraints

+

The simplest kind of constraint is a string full of letters, each of +which describes one kind of operand that is permitted. Here are +the letters that are allowed:

+
+
whitespace
+
Whitespace characters are ignored and can be inserted at any position except the first. This enables each alternative for different operands to be visually aligned in the machine description even if they have a different number of constraints and modifiers.
+
m
+
A memory operand is allowed, with any kind of address that the machine +supports in general. +Note that the letter used for the general memory constraint can be +re-defined by a back end using the TARGET_MEM_CONSTRAINT macro.
+
o
+

A memory operand is allowed, but only if the address is offsettable. This means that a small integer (actually, the width in bytes of the operand, as determined by its machine mode) may be added to the address and the result is also a valid memory address.

+

For example, an address which is constant is offsettable; so is an +address that is the sum of a register and a constant (as long as a +slightly larger constant is also within the range of address-offsets +supported by the machine); but an autoincrement or autodecrement +address is not offsettable. More complicated indirect/indexed +addresses may or may not be offsettable depending on the other +addressing modes that the machine supports.

+

Note that in an output operand which can be matched by another +operand, the constraint letter o is valid only when accompanied +by both < (if the target machine has predecrement addressing) +and > (if the target machine has preincrement addressing).

+
+
V
+
A memory operand that is not offsettable. In other words, anything that +would fit the m constraint but not the o constraint.
+
+
+
<
+
A memory operand with autodecrement addressing (either predecrement or postdecrement) is allowed. In inline asm this constraint is only allowed if the operand is used exactly once in an instruction that can handle the side-effects. It is not valid to leave an operand with < in its constraint string out of the inline asm pattern entirely, or to use it in multiple instructions, because the side-effects would then not be performed at all or would be performed more than once. Furthermore, on some targets the operand with < in its constraint string must be accompanied by special instruction suffixes, such as %U0 on PowerPC or %P0 on IA-64.
+
+
+
>
+
A memory operand with autoincrement addressing (either preincrement or +postincrement) is allowed. In inline asm the same restrictions +as for < apply.
+
r
+
A register operand is allowed provided that it is in a general +register.
+
i
+
An immediate integer operand (one with constant value) is allowed. +This includes symbolic constants whose values will be known only at +assembly time or later.
+
n
+
An immediate integer operand with a known numeric value is allowed. +Many systems cannot support assembly-time constants for operands less +than a word wide. Constraints for these operands should use n +rather than i.
+
I, J, K, ... P
+
Other letters in the range I through P may be defined in +a machine-dependent fashion to permit immediate integer operands with +explicit integer values in specified ranges. For example, on the +68000, I is defined to stand for the range of values 1 to 8. +This is the range permitted as a shift count in the shift +instructions.
+
E
+
An immediate floating operand (expression code const_double) is +allowed, but only if the target floating point format is the same as +that of the host machine (on which the compiler is running).
+
F
+
An immediate floating operand (expression code const_double or +const_vector) is allowed.
+
G, H
+
G and H may be defined in a machine-dependent fashion to +permit immediate floating operands in particular ranges of values.
+
s
+

An immediate integer operand whose value is not an explicit integer is +allowed.

+

This might appear strange; if an insn allows a constant operand with a +value not known at compile time, it certainly must allow any known +value. So why use s instead of i? Sometimes it allows +better code to be generated.

+

For example, on the 68000 in a fullword instruction it is possible to +use an immediate operand; but if the immediate value is between -128 +and 127, better code results from loading the value into a register and +using the register. This is because the load into the register can be +done with a moveq instruction. We arrange for this to happen +by defining the letter K to mean ‘any integer outside the +range -128 to 127’, and then specifying Ks in the operand +constraints.

+
+
g
+
Any register, memory or immediate integer operand is allowed, except for +registers that are not general registers.
+
X
+
Any operand whatsoever is allowed.
+
0, 1, 2, ... 9
+

An operand that matches the specified operand number is allowed. If a +digit is used together with letters within the same alternative, the +digit should come last.

+

This number is allowed to be more than a single digit. If multiple digits are encountered consecutively, they are interpreted as a single decimal integer. There is scant chance for ambiguity, since to date it has never been desirable that 10 be interpreted as matching either operand 1 or operand 0. Should this be desired, one can use multiple alternatives instead.

+

This is called a matching constraint and what it really means is +that the assembler has only a single operand that fills two roles +which asm distinguishes. For example, an add instruction uses +two input operands and an output operand, but on most CISC +machines an add instruction really has only two operands, one of them an +input-output operand:

+
addl #35,r12
+
+

Matching constraints are used in these circumstances. +More precisely, the two operands that match must include one input-only +operand and one output-only operand. Moreover, the digit must be a +smaller number than the number of the operand that uses it in the +constraint.

+
+
p
+

An operand that is a valid memory address is allowed. This is +for ‘load address’ and ‘push address’ instructions.

+

p in the constraint must be accompanied by address_operand +as the predicate in the match_operand. This predicate interprets +the mode specified in the match_operand as the mode of the memory +reference for which the address would be valid.

+
+
other-letters
+
Other letters can be defined in machine-dependent fashion to stand for +particular classes of registers or other arbitrary operand types. +d, a and f are defined on the 68000/68020 to stand +for data, address and floating point registers.
+
+
+
+
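Several of the simple constraints above can be seen together in one statement. The following is a minimal sketch assuming an x86 target (the function name add_matching is hypothetical): the output uses r, the first input uses the matching constraint 0 because the x86 add instruction has a single input-output operand, and the second input uses g.

```c
/* Hypothetical helper: returns a + b using a two-operand add. */
unsigned add_matching(unsigned a, unsigned b)
{
    unsigned sum;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("addl %2, %0"
             : "=r" (sum)   /* operand 0: output in a general register */
             : "0" (a),     /* operand 1: must occupy the same place as operand 0 */
               "g" (b)      /* operand 2: register, memory or immediate */
             : "cc");
#else
    /* Portable fallback for non-x86 targets. */
    sum = a + b;
#endif
    return sum;
}
```

Note that the digit 0 is smaller than the number of the operand that uses it (operand 1), as the matching-constraint rule requires.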

Multiple Alternative Constraints

+

Sometimes a single instruction has multiple alternative sets of possible operands. For example, on the 68000, a logical-or instruction can combine a register or an immediate value into memory, or it can combine any kind of operand into a register; but it cannot combine one memory location into another.

+

These constraints are represented as multiple alternatives. An alternative +can be described by a series of letters for each operand. The overall +constraint for an operand is made from the letters for this operand +from the first alternative, a comma, the letters for this operand from +the second alternative, a comma, and so on until the last alternative.

+

If all the operands fit any one alternative, the instruction is valid. +Otherwise, for each alternative, the compiler counts how many instructions +must be added to copy the operands so that that alternative applies. +The alternative requiring the least copying is chosen. If two alternatives +need the same amount of copying, the one that comes first is chosen. +These choices can be altered with the ? and ! characters:

+
+
?
+
Disparage slightly the alternative that the ? appears in, +as a choice when no alternative applies exactly. The compiler regards +this alternative as one unit more costly for each ? that appears +in it.
+
!
+
Disparage severely the alternative that the ! appears in. +This alternative can still be used if it fits without reloading, +but if reloading is needed, some other alternative will be used.
+
^
+
This constraint is analogous to ? but it disparages slightly +the alternative only if the operand with the ^ needs a reload.
+
$
+
This constraint is analogous to ! but it disparages severely +the alternative only if the operand with the $ needs a reload.
+
+
+
+
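The logical-or situation described above has a close analogue on x86, which can serve as a sketch of multiple alternatives in an asm statement. This is an illustration under stated assumptions, not a definitive pattern: the function name or_into is hypothetical, the asm path is x86 only, and compilers other than GCC may handle alternatives less thoroughly.

```c
/* Hypothetical helper: returns dst | src. */
unsigned or_into(unsigned dst, unsigned src)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Alternative 1: dst in a register, src anything ("g").
       Alternative 2: dst in memory, src a register or immediate ("ri").
       Memory-to-memory is excluded by both alternatives. */
    __asm__ ("orl %1, %0"
             : "+r,m" (dst)
             : "g,ri" (src)
             : "cc");
#else
    /* Portable fallback for non-x86 targets. */
    dst |= src;
#endif
    return dst;
}
```

Each operand lists the same number of alternatives, and the compiler picks one alternative for the whole statement.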

Constraint Modifier Characters

+

Here are constraint modifier characters.

+
+
=
+
Means that this operand is written to by this instruction: +the previous value is discarded and replaced by new data.
+
+
+

Means that this operand is both read and written by the instruction.

+

When the compiler fixes up the operands to satisfy the constraints, +it needs to know which operands are read by the instruction and +which are written by it. = identifies an operand which is only +written; + identifies an operand that is both read and written; all +other operands are assumed to only be read.

+

If you specify = or + in a constraint, you put it in the +first character of the constraint string.

+
+
&
+

Means (in a particular alternative) that this operand is an +earlyclobber operand, which is written before the instruction is +finished using the input operands. Therefore, this operand may not lie +in a register that is read by the instruction or as part of any memory +address.

+

& applies only to the alternative in which it is written. In +constraints with multiple alternatives, sometimes one alternative +requires & while others do not. See, for example, the +movdf insn of the 68000.

+

An operand that is read by the instruction can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written. Adding alternatives of this form often allows GCC to produce better code when only some of the read operands can be affected by the earlyclobber. See, for example, the mulsi3 insn of the ARM.

+

Furthermore, if the earlyclobber operand is also a read/write +operand, then that operand is written only after it’s used.

+

& does not obviate the need to write = or +. As +earlyclobber operands are always written, a read-only +earlyclobber operand is ill-formed and will be rejected by the +compiler.

+
+
%
+

Declares the instruction to be commutative for this operand and the +following operand. This means that the compiler may interchange the +two operands if that is the cheapest way to make all operands fit the +constraints. % applies to all alternatives and must appear as +the first character in the constraint. Only read-only operands can use +%.

+

GCC can only handle one commutative pair in an asm; if you use more, the compiler may fail. Note that you need not use the modifier if the two alternatives are strictly identical; this would only waste time in the reload pass. The modifier is not operational after register allocation, so the results of define_peephole2 and define_split patterns performed after reload cannot rely on % to make the intended insn match.

+
+
#
+
Says that all following characters, up to the next comma, are to be +ignored as a constraint. They are significant only for choosing +register preferences.
+
*
+
Says that the following character should be ignored when choosing +register preferences. * has no effect on the meaning of the +constraint as a constraint, and no effect on reloading. For LRA +* additionally disparages slightly the alternative if the +following character matches the operand.
+
+
+
+
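The & modifier matters as soon as an asm contains more than one instruction. In the sketch below (x86 only, with the hypothetical name double_plus), the output is written by the first instruction while the input b is not read until the last one; without & the compiler would be free to place r and b in the same register.

```c
/* Hypothetical helper: returns 2 * a + b. */
unsigned double_plus(unsigned a, unsigned b)
{
    unsigned r;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl %1, %0\n\t"   /* r = a: the output is written here ... */
             "addl %1, %0\n\t"   /* r = 2 * a */
             "addl %2, %0"       /* ... but input b is still read here */
             : "=&r" (r)         /* earlyclobber: never shares a register with an input */
             : "r" (a), "r" (b)
             : "cc");
#else
    /* Portable fallback for non-x86 targets. */
    r = 2u * a + b;
#endif
    return r;
}
```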

Constraints for Particular Machines

+

Whenever possible, you should use the general-purpose constraint letters +in asm arguments, since they will convey meaning more readily to +people reading your code. Failing that, use the constraint letters +that usually have very similar meanings across architectures. The most +commonly used constraints are m and r (for memory and +general-purpose registers respectively; see Simple Constraints), and +I, usually the letter indicating the most common +immediate-constant format.

+

Each architecture defines additional constraints. These constraints +are used by the compiler itself for instruction generation, as well as +for asm statements; therefore, some of the constraints are not +particularly useful for asm. Here is a summary of some of the +machine-dependent constraints available on some particular machines; +it includes both constraints that are useful for asm and +constraints that aren’t. The compiler source file mentioned in the +table heading for each architecture is the definitive reference for +the meanings of that architecture’s constraints.

+

AArch64 family - config/aarch64/constraints.md

+
+
+
k
+
The stack pointer register (SP)
+
w
+
Floating point or SIMD vector register
+
I
+
Integer constant that is valid as an immediate operand in an ADD +instruction
+
J
+
Integer constant that is valid as an immediate operand in a SUB +instruction (once negated)
+
K
+
Integer constant that can be used with a 32-bit logical instruction
+
L
+
Integer constant that can be used with a 64-bit logical instruction
+
M
+
Integer constant that is valid as an immediate operand in a 32-bit MOV +pseudo instruction. The MOV may be assembled to one of several different +machine instructions depending on the value
+
N
+
Integer constant that is valid as an immediate operand in a 64-bit MOV +pseudo instruction
+
S
+
An absolute symbolic address or a label reference
+
Y
+
Floating point constant zero
+
Z
+
Integer constant zero
+
Ush
+
The high part (bits 12 and upwards) of the pc-relative address of a symbol +within 4GB of the instruction
+
Q
+
A memory address which uses a single base register with no offset
+
Ump
+
A memory address suitable for a load/store pair instruction in SI, DI, SF and +DF modes
+
+
+
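As a small sketch of a machine-specific constraint in use, the function below passes a constant with the AArch64 I constraint, which accepts only values valid as an ADD immediate. The name add16 is hypothetical; the asm path is taken only on AArch64, and every other target uses plain C.

```c
/* Hypothetical helper: returns x + 16. */
unsigned add16(unsigned x)
{
    unsigned r;
#if defined(__aarch64__)
    /* %w0 and %w1 print the 32-bit register names; operand 2 must be
       a constant that is valid as an ADD immediate. */
    __asm__ ("add %w0, %w1, %2"
             : "=r" (r)
             : "r" (x), "I" (16));
#else
    /* Portable fallback for targets other than AArch64. */
    r = x + 16u;
#endif
    return r;
}
```

If the constant did not fit the ADD immediate encoding, the compiler would reject the operand instead of emitting an instruction the assembler cannot encode.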

ARC - config/arc/constraints.md

+
+
+
q
+
Registers usable in ARCompact 16-bit instructions: r0-r3, +r12-r15. This constraint can only match when the -mq +option is in effect.
+
e
+
Registers usable as base-regs of memory addresses in ARCompact 16-bit memory +instructions: r0-r3, r12-r15, sp. +This constraint can only match when the -mq +option is in effect.
+
D
+
ARC FPX (dpfp) 64-bit registers. D0, D1.
+
I
+
A signed 12-bit integer constant.
+
Cal
+
Constant for arithmetic/logical operations. This might be any constant that can be put into a long immediate by the assembler or linker without involving a PIC relocation.
+
K
+
A 3-bit unsigned integer constant.
+
L
+
A 6-bit unsigned integer constant.
+
CnL
+
One’s complement of a 6-bit unsigned integer constant.
+
CmL
+
Two’s complement of a 6-bit unsigned integer constant.
+
M
+
A 5-bit unsigned integer constant.
+
O
+
A 7-bit unsigned integer constant.
+
P
+
An 8-bit unsigned integer constant.
+
H
+
Any const_double value.
+
+
+

ARM family - config/arm/constraints.md

+
+
+
h
+
In Thumb state, the core registers r8-r15.
+
k
+
The stack pointer register.
+
l
+
In Thumb state, the core registers r0-r7. In ARM state this is an alias for the r constraint.
+
t
+
VFP floating-point registers s0-s31. Used for 32 bit values.
+
w
+
VFP floating-point registers d0-d31 and the appropriate +subset d0-d15 based on command line options. +Used for 64 bit values only. Not valid for Thumb1.
+
y
+
The iWMMX co-processor registers.
+
z
+
The iWMMX GR registers.
+
G
+
The floating-point constant 0.0
+
I
+
Integer that is valid as an immediate operand in a data processing +instruction. That is, an integer in the range 0 to 255 rotated by a +multiple of 2
+
J
+
Integer in the range -4095 to 4095
+
K
+
Integer that satisfies constraint I when inverted (ones complement)
+
L
+
Integer that satisfies constraint I when negated (twos complement)
+
M
+
Integer in the range 0 to 32
+
Q
+
A memory reference where the exact address is in a single register (‘m’ is preferable for asm statements)
+
R
+
An item in the constant pool
+
S
+
A symbol in the text segment of the current file
+
Uv
+
A memory reference suitable for VFP load/store insns (reg+constant offset)
+
Uy
+
A memory reference suitable for iWMMXt load/store instructions.
+
Uq
+
A memory reference suitable for the ARMv4 ldrsb instruction.
+
+
+

AVR family - config/avr/constraints.md

+
+
+
l
+
Registers from r0 to r15
+
a
+
Registers from r16 to r23
+
d
+
Registers from r16 to r31
+
w
+
Registers from r24 to r31. These registers can be used in the adiw instruction
+
e
+
Pointer register (r26-r31)
+
b
+
Base pointer register (r28-r31)
+
q
+
Stack pointer register (SPH:SPL)
+
t
+
Temporary register r0
+
x
+
Register pair X (r27:r26)
+
y
+
Register pair Y (r29:r28)
+
z
+
Register pair Z (r31:r30)
+
I
+
Constant greater than -1, less than 64
+
J
+
Constant greater than -64, less than 1
+
K
+
Constant integer 2
+
L
+
Constant integer 0
+
M
+
Constant that fits in 8 bits
+
N
+
Constant integer -1
+
O
+
Constant integer 8, 16, or 24
+
P
+
Constant integer 1
+
G
+
A floating point constant 0.0
+
Q
+
A memory address based on Y or Z pointer with displacement.
+
+
+

Blackfin family - config/bfin/constraints.md

+
+
+
a
+
P register
+
d
+
D register
+
z
+
A call clobbered P register.
+
qn
+
A single register. If n is in the range 0 to 7, the corresponding D +register. If it is A, then the register P0.
+
D
+
Even-numbered D register
+
W
+
Odd-numbered D register
+
e
+
Accumulator register.
+
A
+
Even-numbered accumulator register.
+
B
+
Odd-numbered accumulator register.
+
b
+
I register
+
v
+
B register
+
f
+
M register
+
c
+
Registers used for circular buffering, i.e. I, B, or L registers.
+
C
+
The CC register.
+
t
+
LT0 or LT1.
+
k
+
LC0 or LC1.
+
u
+
LB0 or LB1.
+
x
+
Any D, P, B, M, I or L register.
+
y
+
Additional registers typically used only in prologues and epilogues: RETS, +RETN, RETI, RETX, RETE, ASTAT, SEQSTAT and USP.
+
w
+
Any register except accumulators or CC.
+
Ksh
+
Signed 16 bit integer (in the range -32768 to 32767)
+
Kuh
+
Unsigned 16 bit integer (in the range 0 to 65535)
+
Ks7
+
Signed 7 bit integer (in the range -64 to 63)
+
Ku7
+
Unsigned 7 bit integer (in the range 0 to 127)
+
Ku5
+
Unsigned 5 bit integer (in the range 0 to 31)
+
Ks4
+
Signed 4 bit integer (in the range -8 to 7)
+
Ks3
+
Signed 3 bit integer (in the range -4 to 3)
+
Ku3
+
Unsigned 3 bit integer (in the range 0 to 7)
+
Pn
+
Constant n, where n is a single-digit constant in the range 0 to 4.
+
PA
+
An integer equal to one of the MACFLAG_XXX constants that is suitable for +use with either accumulator.
+
PB
+
An integer equal to one of the MACFLAG_XXX constants that is suitable for +use only with accumulator A1.
+
M1
+
Constant 255.
+
M2
+
Constant 65535.
+
J
+
An integer constant with exactly a single bit set.
+
L
+
An integer constant with all bits set except exactly one.
+
+

H

+
+
Q
+
Any SYMBOL_REF.
+
+
+

CR16 Architecture - config/cr16/cr16.h

+
+
+
b
+
Registers from r0 to r14 (registers without stack pointer)
+
t
+
Register from r0 to r11 (all 16-bit registers)
+
p
+
Register from r12 to r15 (all 32-bit registers)
+
I
+
Signed constant that fits in 4 bits
+
J
+
Signed constant that fits in 5 bits
+
K
+
Signed constant that fits in 6 bits
+
L
+
Unsigned constant that fits in 4 bits
+
M
+
Signed constant that fits in 32 bits
+
N
+
Check for 64 bits wide constants for add/sub instructions
+
G
+
Floating point constant that is legal for store immediate
+
+
+

Epiphany - config/epiphany/constraints.md

+
+
+
U16
+
An unsigned 16-bit constant.
+
K
+
An unsigned 5-bit constant.
+
L
+
A signed 11-bit constant.
+
Cm1
+
A signed 11-bit constant added to -1. Can only match when the -m1reg-reg option is active.
+
Cl1
+
Left-shift of -1, i.e., a bit mask with a block of leading ones, the rest being a block of trailing zeroes. Can only match when the -m1reg-reg option is active.
+
Cr1
+
Right-shift of -1, i.e., a bit mask with a trailing block of ones, the rest being zeroes. Or to put it another way, one less than a power of two. Can only match when the -m1reg-reg option is active.
+
Cal
+
Constant for arithmetic/logical operations. +This is like i, except that for position independent code, +no symbols / expressions needing relocations are allowed.
+
Csy
+
Symbolic constant for call/jump instruction.
+
Rcs
+
The register class usable in short insns. This is a register class +constraint, and can thus drive register allocation. +This constraint won’t match unless -mprefer-short-insn-regs is +in effect.
+
Rsc
+
The register class of registers that can be used to hold a sibcall call address, i.e., a caller-saved register.
+
Rct
+
Core control register class.
+
Rgs
+
The register group usable in short insns. +This constraint does not use a register class, so that it only +passively matches suitable registers, and doesn’t drive register allocation.
+
Rra
+
Matches the return address if it can be replaced with the link register.
+
Rcc
+
Matches the integer condition code register.
+
Sra
+
Matches the return address if it is in a stack slot.
+
Cfm
+
Matches control register values to switch fp mode, which are encapsulated in +UNSPEC_FP_MODE.
+
+
+

FRV - config/frv/frv.h

+
+
+
a
+
Register in the class ACC_REGS (acc0 to acc7).
+
b
+
Register in the class EVEN_ACC_REGS (acc0 to acc7).
+
c
+
Register in the class CC_REGS (fcc0 to fcc3 and +icc0 to icc3).
+
d
+
Register in the class GPR_REGS (gr0 to gr63).
+
e
+
Register in the class EVEN_REGS (gr0 to gr63). +Odd registers are excluded not in the class but through the use of a machine +mode larger than 4 bytes.
+
f
+
Register in the class FPR_REGS (fr0 to fr63).
+
h
+
Register in the class FEVEN_REGS (fr0 to fr63). +Odd registers are excluded not in the class but through the use of a machine +mode larger than 4 bytes.
+
l
+
Register in the class LR_REG (the lr register).
+
q
+
Register in the class QUAD_REGS (gr2 to gr63). +Register numbers not divisible by 4 are excluded not in the class but through +the use of a machine mode larger than 8 bytes.
+
t
+
Register in the class ICC_REGS (icc0 to icc3).
+
u
+
Register in the class FCC_REGS (fcc0 to fcc3).
+
v
+
Register in the class ICR_REGS (cc4 to cc7).
+
w
+
Register in the class FCR_REGS (cc0 to cc3).
+
x
+
Register in the class QUAD_FPR_REGS (fr0 to fr63). +Register numbers not divisible by 4 are excluded not in the class but through +the use of a machine mode larger than 8 bytes.
+
z
+
Register in the class SPR_REGS (lcr and lr).
+
A
+
Register in the class QUAD_ACC_REGS (acc0 to acc7).
+
B
+
Register in the class ACCG_REGS (accg0 to accg7).
+
C
+
Register in the class CR_REGS (cc0 to cc7).
+
G
+
Floating point constant zero
+
I
+
6-bit signed integer constant
+
J
+
10-bit signed integer constant
+
L
+
16-bit signed integer constant
+
M
+
16-bit unsigned integer constant
+
N
+
12-bit signed integer constant that is negative, i.e. in the range of -2048 to -1
+
O
+
Constant zero
+
P
+
12-bit signed integer constant that is greater than zero, i.e. in the range of 1 to 2047.
+
+
+

Hewlett-Packard PA-RISC - config/pa/pa.h

+
+
+
a
+
General register 1
+
f
+
Floating point register
+
q
+
Shift amount register
+
x
+
Floating point register (deprecated)
+
y
+
Upper floating point register (32-bit), floating point register (64-bit)
+
Z
+
Any register
+
I
+
Signed 11-bit integer constant
+
J
+
Signed 14-bit integer constant
+
K
+
Integer constant that can be deposited with a zdepi instruction
+
L
+
Signed 5-bit integer constant
+
M
+
Integer constant 0
+
N
+
Integer constant that can be loaded with a ldil instruction
+
O
+
Integer constant whose value plus one is a power of 2
+
P
+
Integer constant that can be used for and operations in depi +and extru instructions
+
S
+
Integer constant 31
+
U
+
Integer constant 63
+
G
+
Floating-point constant 0.0
+
A
+
A lo_sum data-linkage-table memory operand
+
Q
+
A memory operand that can be used as the destination operand of an +integer store instruction
+
R
+
A scaled or unscaled indexed memory operand
+
T
+
A memory operand for floating-point loads and stores
+
W
+
A register indirect memory operand
+
+
+

Intel IA-64 - config/ia64/ia64.h

+
+
+
a
+
General register r0 to r3 for addl instruction
+
b
+
Branch register
+
c
+
Predicate register (c as in ‘conditional’)
+
d
+
Application register residing in M-unit
+
e
+
Application register residing in I-unit
+
f
+
Floating-point register
+
m
+
Memory operand. If used together with < or >, +the operand can have postincrement and postdecrement which +require printing with %Pn on IA-64.
+
G
+
Floating-point constant 0.0 or 1.0
+
I
+
14-bit signed integer constant
+
J
+
22-bit signed integer constant
+
K
+
8-bit signed integer constant for logical instructions
+
L
+
8-bit adjusted signed integer constant for compare pseudo-ops
+
M
+
6-bit unsigned integer constant for shift counts
+
N
+
9-bit signed integer constant for load and store postincrements
+
O
+
The constant zero
+
P
+
0 or -1 for dep instruction
+
Q
+
Non-volatile memory for floating-point loads and stores
+
R
+
Integer constant in the range 1 to 4 for shladd instruction
+
S
+
Memory operand except postincrement and postdecrement. This is +now roughly the same as m when not used together with < +or >.
+
+
+

M32C - config/m32c/m32c.c

+
+
+
Rsp Rfb Rsb
+
$sp, $fb, $sb.
+
Rcr
+
Any control register, when they’re 16 bits wide (nothing if control +registers are 24 bits wide)
+
Rcl
+
Any control register, when they’re 24 bits wide.
+
R0w R1w R2w R3w
+
$r0, $r1, $r2, $r3.
+
R02
+
$r0 or $r2, or $r2r0 for 32 bit values.
+
R13
+
$r1 or $r3, or $r3r1 for 32 bit values.
+
Rdi
+
A register that can hold a 64 bit value.
+
Rhl
+
$r0 or $r1 (registers with addressable high/low bytes)
+
R23
+
$r2 or $r3
+
Raa
+
Address registers
+
Raw
+
Address registers when they’re 16 bits wide.
+
Ral
+
Address registers when they’re 24 bits wide.
+
Rqi
+
Registers that can hold QI values.
+
Rad
+
Registers that can be used with displacements ($a0, $a1, $sb).
+
Rsi
+
Registers that can hold 32 bit values.
+
Rhi
+
Registers that can hold 16 bit values.
+
Rhc
+
Registers that can hold 16 bit values, including all control +registers.
+
Rra
+
$r0 through R1, plus $a0 and $a1.
+
Rfl
+
The flags register.
+
Rmm
+
The memory-based pseudo-registers $mem0 through $mem15.
+
Rpi
+
Registers that can hold pointers (16 bit registers for r8c, m16c; 24 +bit registers for m32cm, m32c).
+
Rpa
+
Matches multiple registers in a PARALLEL to form a larger register. +Used to match function return values.
+
Is3
+
-8 ... 7
+
IS1
+
-128 ... 127
+
IS2
+
-32768 ... 32767
+
IU2
+
0 ... 65535
+
In4
+
-8 ... -1 or 1 ... 8
+
In5
+
-16 ... -1 or 1 ... 16
+
In6
+
-32 ... -1 or 1 ... 32
+
IM2
+
-65536 ... -1
+
Ilb
+
An 8 bit value with exactly one bit set.
+
Ilw
+
A 16 bit value with exactly one bit set.
+
Sd
+
The common src/dest memory addressing modes.
+
Sa
+
Memory addressed using $a0 or $a1.
+
Si
+
Memory addressed with immediate addresses.
+
Ss
+
Memory addressed using the stack pointer ($sp).
+
Sf
+
Memory addressed using the frame base register ($fb).
+
Sb
+
Memory addressed using the small base register ($sb).
+
S1
+
$r1h
+
+
+

MeP-config/mep/constraints.md

+
+
+
a
+
The $sp register.
+
b
+
The $tp register.
+
c
+
Any control register.
+
d
+
Either the $hi or the $lo register.
+
em
+
Coprocessor registers that can be directly loaded ($c0-$c15).
+
ex
+
Coprocessor registers that can be moved to each other.
+
er
+
Coprocessor registers that can be moved to core registers.
+
h
+
The $hi register.
+
j
+
The $rpc register.
+
l
+
The $lo register.
+
t
+
Registers which can be used in $tp-relative addressing.
+
v
+
The $gp register.
+
x
+
The coprocessor registers.
+
y
+
The coprocessor control registers.
+
z
+
The $0 register.
+
A
+
User-defined register set A.
+
B
+
User-defined register set B.
+
C
+
User-defined register set C.
+
D
+
User-defined register set D.
+
I
+
Offsets for $gp-rel addressing.
+
J
+
Constants that can be used directly with boolean insns.
+
K
+
Constants that can be moved directly to registers.
+
L
+
Small constants that can be added to registers.
+
M
+
Long shift counts.
+
N
+
Small constants that can be compared to registers.
+
O
+
Constants that can be loaded into the top half of registers.
+
S
+
Signed 8-bit immediates.
+
T
+
Symbols encoded for $tp-rel or $gp-rel addressing.
+
U
+
Non-constant addresses for loading/saving coprocessor registers.
+
W
+
The top half of a symbol’s value.
+
Y
+
A register indirect address without offset.
+
Z
+
Symbolic references to the control bus.
+
+
+

MicroBlaze-config/microblaze/constraints.md

+
+
+
d
+
A general register (r0 to r31).
+
z
+
A status register (rmsr, $fcc1 to $fcc7).
+
+
+

MIPS-config/mips/constraints.md

+
+
+
d
+
An address register. This is equivalent to r unless +generating MIPS16 code.
+
f
+
A floating-point register (if available).
+
h
+
Formerly the hi register. This constraint is no longer supported.
+
l
+
The lo register. Use this register to store values that are +no bigger than a word.
+
x
+
The concatenated hi and lo registers. Use this register +to store doubleword values.
+
c
+
A register suitable for use in an indirect jump. This will always be +$25 for -mabicalls.
+
v
+
Register $3. Do not use this constraint in new code; +it is retained only for compatibility with glibc.
+
y
+
Equivalent to r; retained for backwards compatibility.
+
z
+
A floating-point condition code register.
+
I
+
A signed 16-bit constant (for arithmetic instructions).
+
J
+
Integer zero.
+
K
+
An unsigned 16-bit constant (for logic instructions).
+
L
+
A signed 32-bit constant in which the lower 16 bits are zero. +Such constants can be loaded using lui.
+
M
+
A constant that cannot be loaded using lui, addiu +or ori.
+
N
+
A constant in the range -65535 to -1 (inclusive).
+
O
+
A signed 15-bit constant.
+
P
+
A constant in the range 1 to 65535 (inclusive).
+
G
+
Floating-point zero.
+
R
+
An address that can be used in a non-macro load or store.
+
ZC
+
A memory operand whose address is formed by a base register and offset +that is suitable for use in instructions with the same addressing mode +as ll and sc.
+
ZD
+
An address suitable for a prefetch instruction, or for any other +instruction with the same addressing mode as prefetch.
+
+
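As a rough illustration of how the register and constant constraints in the MIPS table combine, the sketch below pairs d with I in a single addiu. The function name add_small_constant and the constant 100 are made up for this example; the asm path is only compiled when the compiler defines __mips__, and every other target falls back to plain C so the file still builds.

```c
/* Adds a small constant to x.  On MIPS the addition goes through inline
   asm; elsewhere a plain C expression stands in. */
int add_small_constant(int x)
{
#if defined(__GNUC__) && defined(__mips__)
    int result;
    /* "d" requests an address (general) register for the output and the
       first input; "I" requires a signed 16-bit constant, which is the
       kind of immediate addiu accepts. */
    __asm__("addiu %0, %1, %2" : "=d"(result) : "d"(x), "I"(100));
    return result;
#else
    /* Portable fallback (an assumption of this sketch, not part of the
       technique). */
    return x + 100;
#endif
}
```

If the constant did not fit in 16 signed bits, the I constraint would be rejected at compile time instead of producing an instruction the assembler cannot encode.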
+

Motorola 680x0-config/m68k/constraints.md

+
+
+
a
+
Address register
+
d
+
Data register
+
f
+
68881 floating-point register, if available
+
I
+
Integer in the range 1 to 8
+
J
+
16-bit signed number
+
K
+
Signed number whose magnitude is greater than 0x80
+
L
+
Integer in the range -8 to -1
+
M
+
Signed number whose magnitude is greater than 0x100
+
N
+
Range 24 to 31, rotatert:SI 8 to 1 expressed as rotate
+
O
+
16 (for rotate using swap)
+
P
+
Range 8 to 15, rotatert:HI 8 to 1 expressed as rotate
+
R
+
Numbers that mov3q can handle
+
G
+
Floating point constant that is not a 68881 constant
+
S
+
Operands that satisfy ‘m’ when -mpcrel is in effect
+
T
+
Operands that satisfy ‘s’ when -mpcrel is not in effect
+
Q
+
Address register indirect addressing mode
+
U
+
Register offset addressing
+
W
+
const_call_operand
+
Cs
+
symbol_ref or const
+
Ci
+
const_int
+
C0
+
const_int 0
+
Cj
+
Range of signed numbers that don’t fit in 16 bits
+
Cmvq
+
Integers valid for mvq
+
Capsw
+
Integers valid for a moveq followed by a swap
+
Cmvz
+
Integers valid for mvz
+
Cmvs
+
Integers valid for mvs
+
Ap
+
push_operand
+
Ac
+
Non-register operands allowed in clr
+
+
+

Moxie-config/moxie/constraints.md

+
+
+
A
+
An absolute address
+
B
+
An offset address
+
W
+
A register indirect memory operand
+
I
+
A constant in the range of 0 to 255.
+
N
+
A constant in the range of 0 to -255.
+
+
+

MSP430-config/msp430/constraints.md

+
+
+
R12
+
Register R12.
+
R13
+
Register R13.
+
K
+
Integer constant 1.
+
L
+
Integer constant -2^20..2^19.
+
M
+
Integer constant in the range 1 to 4.
+
Ya
+
Memory references which do not require an extended MOVX instruction.
+
Yl
+
Memory reference, labels only.
+
Ys
+
Memory reference, stack only.
+
+
+

NDS32-config/nds32/constraints.md

+
+
+
w
+
LOW register class $r0 to $r7 constraint for V3/V3M ISA.
+
l
+
LOW register class $r0 to $r7.
+
d
+
MIDDLE register class $r0 to $r11, $r16 to $r19.
+
h
+
HIGH register class $r12 to $r14, $r20 to $r31.
+
t
+
Temporary assist register $ta (i.e. $r15).
+
k
+
Stack register $sp.
+
Iu03
+
Unsigned immediate 3-bit value.
+
In03
+
Negative immediate 3-bit value in the range of -7 to 0.
+
Iu04
+
Unsigned immediate 4-bit value.
+
Is05
+
Signed immediate 5-bit value.
+
Iu05
+
Unsigned immediate 5-bit value.
+
In05
+
Negative immediate 5-bit value in the range of -31 to 0.
+
Ip05
+
Unsigned immediate 5-bit value for movpi45 instruction with range 16-47.
+
Iu06
+
Unsigned immediate 6-bit value constraint for addri36.sp instruction.
+
Iu08
+
Unsigned immediate 8-bit value.
+
Iu09
+
Unsigned immediate 9-bit value.
+
Is10
+
Signed immediate 10-bit value.
+
Is11
+
Signed immediate 11-bit value.
+
Is15
+
Signed immediate 15-bit value.
+
Iu15
+
Unsigned immediate 15-bit value.
+
Ic15
+
A constant which is not in the range of imm15u but ok for bclr instruction.
+
Ie15
+
A constant which is not in the range of imm15u but ok for bset instruction.
+
It15
+
A constant which is not in the range of imm15u but ok for btgl instruction.
+
Ii15
+
A constant whose complement value is in the range of imm15u +and ok for bitci instruction.
+
Is16
+
Signed immediate 16-bit value.
+
Is17
+
Signed immediate 17-bit value.
+
Is19
+
Signed immediate 19-bit value.
+
Is20
+
Signed immediate 20-bit value.
+
Ihig
+
An immediate value that can be loaded by simply setting the high 20 bits.
+
Izeb
+
The immediate value 0xff.
+
Izeh
+
The immediate value 0xffff.
+
Ixls
+
The immediate value 0x01.
+
Ix11
+
The immediate value 0x7ff.
+
Ibms
+
An immediate value that is a power of 2.
+
Ifex
+
An immediate value that is a power of 2 minus 1.
+
U33
+
Memory constraint for 333 format.
+
U45
+
Memory constraint for 45 format.
+
U37
+
Memory constraint for 37 format.
+
+
+

Nios II family-config/nios2/constraints.md

+
+
+
I
+
Integer that is valid as an immediate operand in an +instruction taking a signed 16-bit number. Range +-32768 to 32767.
+
J
+
Integer that is valid as an immediate operand in an +instruction taking an unsigned 16-bit number. Range +0 to 65535.
+
K
+
Integer that is valid as an immediate operand in an +instruction taking only the upper 16-bits of a +32-bit number. Range 32-bit numbers with the lower +16-bits being 0.
+
L
+
Integer that is valid as an immediate operand for a +shift instruction. Range 0 to 31.
+
M
+
Integer that is valid as an immediate operand for +only the value 0. Can be used in conjunction with +the format modifier z to use r0 +instead of 0 in the assembly output.
+
N
+
Integer that is valid as an immediate operand for +a custom instruction opcode. Range 0 to 255.
+
S
+
Matches immediates which are addresses in the small +data section and therefore can be added to gp +as a 16-bit immediate to re-create their 32-bit value.
+
+
+

PDP-11-config/pdp11/constraints.md

+
+
+
a
+
Floating point registers AC0 through AC3. These can be loaded from/to +memory with a single instruction.
+
d
+
Odd numbered general registers (R1, R3, R5). These are used for +16-bit multiply operations.
+
f
+
Any of the floating point registers (AC0 through AC5).
+
G
+
Floating point constant 0.
+
I
+
An integer constant that fits in 16 bits.
+
J
+
An integer constant whose low order 16 bits are zero.
+
K
+
An integer constant that does not meet the constraints for codes +I or J.
+
L
+
The integer constant 1.
+
M
+
The integer constant -1.
+
N
+
The integer constant 0.
+
O
+
Integer constants -4 through -1 and 1 through 4; shifts by these +amounts are handled as multiple single-bit shifts rather than a single +variable-length shift.
+
Q
+
A memory reference which requires an additional word (address or +offset) after the opcode.
+
R
+
A memory reference that is encoded within the opcode.
+
+
+

PowerPC and IBM RS6000-config/rs6000/constraints.md

+
+
+
b
+
Address base register
+
d
+
Floating point register (containing 64-bit value)
+
f
+
Floating point register (containing 32-bit value)
+
v
+
Altivec vector register
+
wa
+
Any VSX register if the -mvsx option was used or NO_REGS.
+
wd
+
VSX vector register to hold vector double data or NO_REGS.
+
wf
+
VSX vector register to hold vector float data or NO_REGS.
+
wg
+
If -mmfpgpr was used, a floating point register or NO_REGS.
+
wh
+
Floating point register if direct moves are available, or NO_REGS.
+
wi
+
FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.
+
wj
+
FP or VSX register to hold 64-bit integers for direct moves or NO_REGS.
+
wk
+
FP or VSX register to hold 64-bit doubles for direct moves or NO_REGS.
+
wl
+
Floating point register if the LFIWAX instruction is enabled or NO_REGS.
+
wm
+
VSX register if direct move instructions are enabled, or NO_REGS.
+
wn
+
No register (NO_REGS).
+
wr
+
General purpose register if 64-bit instructions are enabled or NO_REGS.
+
ws
+
VSX vector register to hold scalar double values or NO_REGS.
+
wt
+
VSX vector register to hold 128 bit integer or NO_REGS.
+
wu
+
Altivec register to use for float/32-bit int loads/stores or NO_REGS.
+
wv
+
Altivec register to use for double loads/stores or NO_REGS.
+
ww
+
FP or VSX register to perform float operations under -mvsx or NO_REGS.
+
wx
+
Floating point register if the STFIWX instruction is enabled or NO_REGS.
+
wy
+
FP or VSX register to perform ISA 2.07 float ops or NO_REGS.
+
wz
+
Floating point register if the LFIWZX instruction is enabled or NO_REGS.
+
wD
+
Int constant that is the element number of the 64-bit scalar in a vector.
+
wQ
+
A memory address that will work with the lq and stq +instructions.
+
h
+
MQ, CTR, or LINK register
+
q
+
MQ register
+
c
+
CTR register
+
l
+
LINK register
+
x
+
CR register (condition register) number 0
+
y
+
CR register (condition register)
+
z
+
XER[CA] carry bit (part of the XER register)
+
I
+
Signed 16-bit constant
+
J
+
Unsigned 16-bit constant shifted left 16 bits (use L instead for +SImode constants)
+
K
+
Unsigned 16-bit constant
+
L
+
Signed 16-bit constant shifted left 16 bits
+
M
+
Constant larger than 31
+
N
+
Exact power of 2
+
O
+
Zero
+
P
+
Constant whose negation is a signed 16-bit constant
+
G
+
Floating point constant that can be loaded into a register with one +instruction per word
+
H
+
Integer/Floating point constant that can be loaded into a register using +three instructions
+
m
+

Memory operand. +Normally, m does not allow addresses that update the base register. +If < or > constraint is also used, they are allowed and +therefore on PowerPC targets in that case it is only safe +to use m<> in an asm statement if that asm statement +accesses the operand exactly once. The asm statement must also +use %U``<opno>`` as a placeholder for the ‘update’ flag in the +corresponding load or store instruction. For example:

+
asm ("st%U0 %1,%0" : "=m<>" (mem) : "r" (val));
+
+
+

is correct but:

+
asm ("st %1,%0" : "=m<>" (mem) : "r" (val));
+
+
+

is not.

+
+
es
+
A ‘stable’ memory operand; that is, one which does not include any +automodification of the base register. This used to be useful when +m allowed automodification of the base register, but as those are now only +allowed when < or > is used, es is basically the same +as m without < and >.
+
Q
+
Memory operand that is an offset from a register (it is usually better +to use m or es in asm statements)
+
Z
+
Memory operand that uses indexed or indirect addressing from a register (it is +usually better to use m or es in asm statements)
+
R
+
AIX TOC entry
+
a
+
Address operand that uses indexed or indirect addressing from a register (p is +preferable for asm statements)
+
S
+
Constant suitable as a 64-bit mask operand
+
T
+
Constant suitable as a 32-bit mask operand
+
U
+
System V Release 4 small data area reference
+
t
+
AND masks that can be performed by two rldic{l, r} instructions
+
W
+
Vector constant that does not require memory
+
j
+
Vector constant that is all zeros.
+
+
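A small sketch of the b and I constraints from the PowerPC table may help here. It assumes a GCC-compatible compiler that defines __powerpc__ or __PPC__; the function name add_offset and the constant 16 are hypothetical. The b constraint matters for addi because that instruction reads a register field of 0 as the literal value zero rather than as r0, so the base operand must not be allocated to r0.

```c
/* Adds 16 to base.  On PowerPC this uses addi; elsewhere plain C. */
long add_offset(long base)
{
#if defined(__GNUC__) && (defined(__powerpc__) || defined(__PPC__))
    long result;
    /* "b" keeps base out of r0; "I" requires a signed 16-bit constant,
       matching the immediate field of addi. */
    __asm__("addi %0,%1,%2" : "=r"(result) : "b"(base), "I"(16));
    return result;
#else
    /* Portable fallback (an assumption of this sketch). */
    return base + 16;
#endif
}
```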
+

RL78-config/rl78/constraints.md

+
+
+
Int3
+
An integer constant in the range 1 ... 7.
+
Int8
+
An integer constant in the range 0 ... 255.
+
J
+
An integer constant in the range -255 ... 0
+
K
+
The integer constant 1.
+
L
+
The integer constant -1.
+
M
+
The integer constant 0.
+
N
+
The integer constant 2.
+
O
+
The integer constant -2.
+
P
+
An integer constant in the range 1 ... 15.
+
Qbi
+
The built-in compare types: eq, ne, gtu, ltu, geu, and leu.
+
Qsc
+
The synthetic compare types: gt, lt, ge, and le.
+
Wab
+
A memory reference with an absolute address.
+
Wbc
+
A memory reference using BC as a base register, with an optional offset.
+
Wca
+
A memory reference using AX, BC, DE, or HL for the address, for calls.
+
Wcv
+
A memory reference using any 16-bit register pair for the address, for calls.
+
Wd2
+
A memory reference using DE as a base register, with an optional offset.
+
Wde
+
A memory reference using DE as a base register, without any offset.
+
Wfr
+
Any memory reference to an address in the far address space.
+
Wh1
+
A memory reference using HL as a base register, with an optional one-byte offset.
+
Whb
+
A memory reference using HL as a base register, with B or C as the index register.
+
Whl
+
A memory reference using HL as a base register, without any offset.
+
Ws1
+
A memory reference using SP as a base register, with an optional one-byte offset.
+
Y
+
Any memory reference to an address in the near address space.
+
A
+
The AX register.
+
B
+
The BC register.
+
D
+
The DE register.
+
R
+
A through L registers.
+
S
+
The SP register.
+
T
+
The HL register.
+
Z08W
+
The 16-bit R8 register.
+
Z10W
+
The 16-bit R10 register.
+
Zint
+
The registers reserved for interrupts (R24 to R31).
+
a
+
The A register.
+
b
+
The B register.
+
c
+
The C register.
+
d
+
The D register.
+
e
+
The E register.
+
h
+
The H register.
+
l
+
The L register.
+
v
+
The virtual registers.
+
w
+
The PSW register.
+
x
+
The X register.
+
+
+

RX-config/rx/constraints.md

+
+
+
Q
+
An address which does not involve register indirect addressing or +pre/post increment/decrement addressing.
+
Symbol
+
A symbol reference.
+
Int08
+
A constant in the range -256 to 255, inclusive.
+
Sint08
+
A constant in the range -128 to 127, inclusive.
+
Sint16
+
A constant in the range -32768 to 32767, inclusive.
+
Sint24
+
A constant in the range -8388608 to 8388607, inclusive.
+
Uint04
+
A constant in the range 0 to 15, inclusive.
+
+
+

S/390 and zSeries-config/s390/s390.h

+
+
+
a
+
Address register (general purpose register except r0)
+
c
+
Condition code register
+
d
+
Data register (arbitrary general purpose register)
+
f
+
Floating-point register
+
I
+
Unsigned 8-bit constant (0-255)
+
J
+
Unsigned 12-bit constant (0-4095)
+
K
+
Signed 16-bit constant (-32768 to 32767)
+
L
+

Value appropriate as displacement.

+
+
(0..4095)
+
for short displacement
+
(-524288..524287)
+
for long displacement
+
+
+
M
+
Constant integer with a value of 0x7fffffff.
+
N
+

Multiple letter constraint followed by 4 parameter letters.

+
+
0..9:
+
number of the part counting from most to least significant
+
H,Q:
+
mode of the part
+
D,S,H:
+
mode of the containing operand
+
0,F:
+

value of the other parts (F: all bits set)

+

The constraint matches if the specified part of a constant has a value different from its other parts.

+
+
+

+
+
Q
+
Memory reference without index register and with short displacement.
+
R
+
Memory reference with index register and short displacement.
+
S
+
Memory reference without index register but with long displacement.
+
T
+
Memory reference with index register and long displacement.
+
U
+
Pointer with short displacement.
+
W
+
Pointer with long displacement.
+
Y
+
Shift count operand.
+
+
+

SPARC-config/sparc/sparc.h

+
+
+
f
+
Floating-point register on the SPARC-V8 architecture and +lower floating-point register on the SPARC-V9 architecture.
+
e
+
Floating-point register. It is equivalent to f on the +SPARC-V8 architecture and contains both lower and upper +floating-point registers on the SPARC-V9 architecture.
+
c
+
Floating-point condition code register.
+
d
+
Lower floating-point register. It is only valid on the SPARC-V9 +architecture when the Visual Instruction Set is available.
+
b
+
Floating-point register. It is only valid on the SPARC-V9 architecture +when the Visual Instruction Set is available.
+
h
+
64-bit global or out register for the SPARC-V8+ architecture.
+
C
+
The constant all-ones, for floating-point.
+
A
+
Signed 5-bit constant
+
D
+
A vector constant
+
I
+
Signed 13-bit constant
+
J
+
Zero
+
K
+
32-bit constant with the low 12 bits clear (a constant that can be +loaded with the sethi instruction)
+
L
+
A constant in the range supported by movcc instructions (11-bit +signed immediate)
+
M
+
A constant in the range supported by movrcc instructions (10-bit +signed immediate)
+
N
+
Same as K, except that it verifies that bits that are not in the +lower 32-bit range are all zero. Must be used instead of K for +modes wider than SImode
+
O
+
The constant 4096
+
G
+
Floating-point zero
+
H
+
Signed 13-bit constant, sign-extended to 32 or 64 bits
+
P
+
The constant -1
+
Q
+
Floating-point constant whose integral representation can +be moved into an integer register using a single sethi +instruction
+
R
+
Floating-point constant whose integral representation can +be moved into an integer register using a single mov +instruction
+
S
+
Floating-point constant whose integral representation can +be moved into an integer register using a high/lo_sum +instruction sequence
+
T
+
Memory address aligned to an 8-byte boundary
+
U
+
Even register
+
W
+
Memory address for e constraint registers
+
w
+
Memory address with only a base register
+
Y
+
Vector zero
+
+
+

SPU-config/spu/spu.h

+
+
+
a
+
An immediate which can be loaded with the il/ila/ilh/ilhu instructions. const_int is treated as a 64 bit value.
+
c
+
An immediate for and/xor/or instructions. const_int is treated as a 64 bit value.
+
d
+
An immediate for the iohl instruction. const_int is treated as a 64 bit value.
+
f
+
An immediate which can be loaded with fsmbi.
+
A
+
An immediate which can be loaded with the il/ila/ilh/ilhu instructions. const_int is treated as a 32 bit value.
+
B
+
An immediate for most arithmetic instructions. const_int is treated as a 32 bit value.
+
C
+
An immediate for and/xor/or instructions. const_int is treated as a 32 bit value.
+
D
+
An immediate for the iohl instruction. const_int is treated as a 32 bit value.
+
I
+
A constant in the range [-64, 63] for shift/rotate instructions.
+
J
+
An unsigned 7-bit constant for conversion/nop/channel instructions.
+
K
+
A signed 10-bit constant for most arithmetic instructions.
+
M
+
A signed 16 bit immediate for stop.
+
N
+
An unsigned 16-bit constant for iohl and fsmbi.
+
O
+
An unsigned 7-bit constant whose 3 least significant bits are 0.
+
P
+
An unsigned 3-bit constant for 16-byte rotates and shifts
+
R
+
Call operand, reg, for indirect calls
+
S
+
Call operand, symbol, for relative calls.
+
T
+
Call operand, const_int, for absolute calls.
+
U
+
An immediate which can be loaded with the il/ila/ilh/ilhu instructions. const_int is sign extended to 128 bit.
+
W
+
An immediate for shift and rotate instructions. const_int is treated as a 32 bit value.
+
Y
+
An immediate for and/xor/or instructions. const_int is sign extended to 128 bit.
+
Z
+
An immediate for the iohl instruction. const_int is sign extended to 128 bit.
+
+
+

TI C6X family-config/c6x/constraints.md

+
+
+
a
+
Register file A (A0-A31).
+
b
+
Register file B (B0-B31).
+
A
+
Predicate registers in register file A (A0-A2 on C64X and +higher, A1 and A2 otherwise).
+
B
+
Predicate registers in register file B (B0-B2).
+
C
+
A call-used register in register file B (B0-B9, B16-B31).
+
Da
+
Register file A, excluding predicate registers (A3-A31, +plus A0 if not C64X or higher).
+
Db
+
Register file B, excluding predicate registers (B3-B31).
+
Iu4
+
Integer constant in the range 0 ... 15.
+
Iu5
+
Integer constant in the range 0 ... 31.
+
In5
+
Integer constant in the range -31 ... 0.
+
Is5
+
Integer constant in the range -16 ... 15.
+
I5x
+
Integer constant that can be the operand of an ADDA or a SUBA insn.
+
IuB
+
Integer constant in the range 0 ... 65535.
+
IsB
+
Integer constant in the range -32768 ... 32767.
+
IsC
+
Integer constant in the range -2^{20} ... 2^{20} - 1.
+
Jc
+
Integer constant that is a valid mask for the clr instruction.
+
Js
+
Integer constant that is a valid mask for the set instruction.
+
Q
+
Memory location with A base register.
+
R
+
Memory location with B base register.
+
Z
+
Register B14 (aka DP).
+
+
+

TILE-Gx-config/tilegx/constraints.md

+
+
+
R00 R01 R02 R03 R04 R05 R06 R07 R08 R09 R10
+
Each of these represents a register constraint for an individual +register, from r0 to r10.
+
I
+
Signed 8-bit integer constant.
+
J
+
Signed 16-bit integer constant.
+
K
+
Unsigned 16-bit integer constant.
+
L
+
Integer constant that fits in one signed byte when incremented by one +(-129 ... 126).
+
m
+

Memory operand. If used together with < or >, the +operand can have postincrement which requires printing with %In +and %in on TILE-Gx. For example:

+
asm ("st_add %I0,%1,%i0" : "=m<>" (*mem) : "r" (val));
+
+
+
+
M
+
A bit mask suitable for the BFINS instruction.
+
N
+
Integer constant that is a byte tiled out eight times.
+
O
+
The integer zero constant.
+
P
+
Integer constant that is a sign-extended byte tiled out as four shorts.
+
Q
+
Integer constant that fits in one signed byte when incremented +(-129 ... 126), but excluding -1.
+
S
+
Integer constant that has all 1 bits consecutive and starting at bit 0.
+
T
+
A 16-bit fragment of a got, tls, or pc-relative reference.
+
U
+
Memory operand except postincrement. This is roughly the same as +m when not used together with < or >.
+
W
+
An 8-element vector constant with identical elements.
+
Y
+
A 4-element vector constant with identical elements.
+
Z0
+
The integer constant 0xffffffff.
+
Z1
+
The integer constant 0xffffffff00000000.
+
+
+

TILEPro-config/tilepro/constraints.md

+
+
+
R00 R01 R02 R03 R04 R05 R06 R07 R08 R09 R10
+
Each of these represents a register constraint for an individual +register, from r0 to r10.
+
I
+
Signed 8-bit integer constant.
+
J
+
Signed 16-bit integer constant.
+
K
+
Nonzero integer constant with low 16 bits zero.
+
L
+
Integer constant that fits in one signed byte when incremented by one +(-129 ... 126).
+
m
+

Memory operand. If used together with < or >, the +operand can have postincrement which requires printing with %In +and %in on TILEPro. For example:

+
asm ("swadd %I0,%1,%i0" : "=m<>" (mem) : "r" (val));
+
+
+
+
M
+
A bit mask suitable for the MM instruction.
+
N
+
Integer constant that is a byte tiled out four times.
+
O
+
The integer zero constant.
+
P
+
Integer constant that is a sign-extended byte tiled out as two shorts.
+
Q
+
Integer constant that fits in one signed byte when incremented +(-129 ... 126), but excluding -1.
+
T
+
A symbolic operand, or a 16-bit fragment of a got, tls, or pc-relative +reference.
+
U
+
Memory operand except postincrement. This is roughly the same as +m when not used together with < or >.
+
W
+
A 4-element vector constant with identical elements.
+
Y
+
A 2-element vector constant with identical elements.
+
+
+

Visium-config/visium/constraints.md

+
+
+
b
+
EAM register mdb
+
c
+
EAM register mdc
+
f
+
Floating point register
+
l
+
General register, but not r29, r30 and r31
+
t
+
Register r1
+
u
+
Register r2
+
v
+
Register r3
+
G
+
Floating-point constant 0.0
+
J
+
Integer constant in the range 0 .. 65535 (16-bit immediate)
+
K
+
Integer constant in the range 1 .. 31 (5-bit immediate)
+
L
+
Integer constant in the range -65535 .. -1 (16-bit negative immediate)
+
M
+
Integer constant -1
+
O
+
Integer constant 0
+
P
+
Integer constant 32
+
+
+

x86 family-config/i386/constraints.md

+
+
+
R
+
Legacy register: the eight integer registers available on all +i386 processors (a, b, c, d, +si, di, bp, sp).
+
q
+
Any register accessible as ``r``l. In 32-bit mode, a, +b, c, and d; in 64-bit mode, any integer register.
+
Q
+
Any register accessible as ``r``h: a, b, +c, and d.
+
a
+
The a register.
+
b
+
The b register.
+
c
+
The c register.
+
d
+
The d register.
+
S
+
The si register.
+
D
+
The di register.
+
A
+

The a and d registers. This class is used for instructions +that return double word results in the ax:dx register pair. Single +word values will be allocated either in ax or dx. +For example on i386 the following implements rdtsc:

+
unsigned long long rdtsc (void)
+{
+  unsigned long long tick;
+  __asm__ __volatile__("rdtsc":"=A"(tick));
+  return tick;
+}
+
+
+

This is not correct on x86-64 as it would allocate tick in either ax +or dx. You have to use the following variant instead:

+
unsigned long long rdtsc (void)
+{
+  unsigned int tickl, tickh;
+  __asm__ __volatile__("rdtsc":"=a"(tickl),"=d"(tickh));
+  return ((unsigned long long)tickh << 32)|tickl;
+}
+
+
+
+
f
+
Any 80387 floating-point (stack) register.
+
t
+
Top of 80387 floating-point stack (%st(0)).
+
u
+
Second from top of 80387 floating-point stack (%st(1)).
+
y
+
Any MMX register.
+
x
+
Any SSE register.
+
Yz
+
First SSE register (%xmm0).
+
I
+
Integer constant in the range 0 ... 31, for 32-bit shifts.
+
J
+
Integer constant in the range 0 ... 63, for 64-bit shifts.
+
K
+
Signed 8-bit integer constant.
+
L
+
0xFF or 0xFFFF, for andsi as a zero-extending move.
+
M
+
0, 1, 2, or 3 (shifts for the lea instruction).
+
N
+
Unsigned 8-bit integer constant (for in and out +instructions).
+
G
+
Standard 80387 floating point constant.
+
C
+
Standard SSE floating point constant.
+
e
+
32-bit signed integer constant, or a symbolic reference known +to fit that range (for immediate operands in sign-extending x86-64 +instructions).
+
Z
+
32-bit unsigned integer constant, or a symbolic reference known +to fit that range (for immediate operands in zero-extending x86-64 +instructions).
+
+
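The following is a minimal sketch of how several of the x86 constraints above fit together, assuming GCC-style AT&T operand order (the default). The function names mul_u32 and rotl8 are invented for illustration. The a and d constraints pin operands to the registers that mull uses implicitly, and I restricts the rotate count to a constant in the range 0 ... 31. On non-x86 targets the functions fall back to plain C.

```c
#include <stdint.h>

/* 32 x 32 -> 64-bit unsigned multiply. */
uint64_t mul_u32(uint32_t x, uint32_t y)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t lo, hi;
    /* mull multiplies EAX by its operand and leaves the product in
       EDX:EAX, so x goes in with "a" and the halves come out through
       "=a" and "=d". */
    __asm__("mull %3" : "=a"(lo), "=d"(hi) : "a"(x), "rm"(y) : "cc");
    return ((uint64_t)hi << 32) | lo;
#else
    return (uint64_t)x * y;  /* portable fallback */
#endif
}

/* Rotate left by 8 bits. */
uint32_t rotl8(uint32_t v)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "I" accepts only a constant in 0 ... 31; it is emitted as an
       immediate operand of roll. */
    __asm__("roll %1, %0" : "+r"(v) : "I"(8) : "cc");
    return v;
#else
    return (v << 8) | (v >> 24);  /* portable fallback */
#endif
}
```

Both asm statements list "cc" as a clobber because mull and roll modify the flags.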
+

Xstormy16-config/stormy16/stormy16.h

+
+
+
a
+
Register r0.
+
b
+
Register r1.
+
c
+
Register r2.
+
d
+
Register r8.
+
e
+
Registers r0 through r7.
+
t
+
Registers r0 and r1.
+
y
+
The carry register.
+
z
+
Registers r8 and r9.
+
I
+
A constant between 0 and 3 inclusive.
+
J
+
A constant that has exactly one bit set.
+
K
+
A constant that has exactly one bit clear.
+
L
+
A constant between 0 and 255 inclusive.
+
M
+
A constant between -255 and 0 inclusive.
+
N
+
A constant between -3 and 0 inclusive.
+
O
+
A constant between 1 and 4 inclusive.
+
P
+
A constant between -4 and -1 inclusive.
+
Q
+
A memory reference that is a stack push.
+
R
+
A memory reference that is a stack pop.
+
S
+
A memory reference that refers to a constant address of known value.
+
T
+
The register indicated by Rx (not implemented yet).
+
U
+
A constant that is not between 2 and 15 inclusive.
+
Z
+
The constant 0.
+
+
+

Xtensa-config/xtensa/constraints.md

+
+
+
a
+
General-purpose 32-bit register
+
b
+
One-bit boolean register
+
A
+
MAC16 40-bit accumulator register
+
I
+
Signed 12-bit integer constant, for use in MOVI instructions
+
J
+
Signed 8-bit integer constant, for use in ADDI instructions
+
K
+
Integer constant valid for BccI instructions
+
L
+
Unsigned constant valid for BccUI instructions
+
+
+
+
+
+

Controlling Names Used in Assembler Code

+

You can specify the name to be used in the assembler code for a C +function or variable by writing the asm (or __asm__) +keyword after the declarator as follows:

+
int foo asm ("myfoo") = 2;
+
+
+

This specifies that the name to be used for the variable foo in +the assembler code should be myfoo rather than the usual +_foo.

+

On systems where an underscore is normally prepended to the name of a C +function or variable, this feature allows you to define names for the +linker that do not start with an underscore.

+

It does not make sense to use this feature with a non-static local +variable since such variables do not have assembler names. If you are +trying to put the variable in a particular register, see Explicit +Reg Vars. GCC presently accepts such code with a warning, but will +probably be changed to issue an error, rather than a warning, in the +future.

+

You cannot use asm in this way in a function definition; but +you can get the same effect by writing a declaration for the function +before its definition and putting asm there, like this:

+
extern int func (int x, int y) asm ("FUNC");
+
+int func (int x, int y)
+/* ... */
+
+
+
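The declaration-before-definition pattern can be sketched as a complete, compilable unit. The names add_pair and ADD_PAIR are hypothetical, and __asm__ is used instead of asm so that the code also builds under the -std options.

```c
/* The declaration carries the assembler name and must come before the
   definition. */
extern int add_pair(int x, int y) __asm__("ADD_PAIR");

/* The definition has no asm label of its own; it inherits the assembler
   name from the earlier declaration. */
int add_pair(int x, int y)
{
    return x + y;
}
```

C code keeps calling the function as add_pair; only the symbol seen by the assembler and linker changes. On an ELF target, the object file's symbol table would be expected to list ADD_PAIR rather than add_pair.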

It is up to you to make sure that the assembler names you choose do not +conflict with any other assembler symbols. Also, you must not use a +register name; that would produce completely invalid assembler code. GCC +does not as yet have the ability to store static variables in registers. +Perhaps that will be added.

+
+
+

Variables in Specified Registers

+

GNU C allows you to put a few global variables into specified hardware +registers. You can also specify the register in which an ordinary +register variable should be allocated.

+
    +
  • Global register variables reserve registers throughout the program. +This may be useful in programs such as programming language +interpreters that have a couple of global variables that are accessed +very often.

    +
  • +
  • Local register variables in specific registers do not reserve the +registers, except at the point where they are used as input or output +operands in an asm statement and the asm statement itself is +not deleted. The compiler’s data flow analysis is capable of determining +where the specified registers contain live values, and where they are +available for other uses. Stores into local register variables may be deleted +when they appear to be dead according to dataflow analysis. References +to local register variables may be deleted or moved or simplified.

    +

    These local variables are sometimes convenient for use with the extended +asm feature (see Extended Asm - Assembler Instructions with C Expression Operands), if you want to write one +output of the assembler instruction directly into a particular register. +(This works provided the register you specify fits the constraints +specified for that operand in the asm.)

    +
  • +
+
+
    +
+
+
+

Defining Global Register Variables

+

You can define a global register variable in GNU C like this:

+
register int *foo asm ("a5");
+
+
+

Here a5 is the name of the register that should be used. Choose a +register that is normally saved and restored by function calls on your +machine, so that library routines will not clobber it.

+

Naturally the register name is CPU-dependent, so you need to +conditionalize your program according to CPU type. The register +a5 is a good choice on a 68000 for a variable of pointer +type. On machines with register windows, be sure to choose a ‘global’ +register that is not affected magically by the function call mechanism.

+

In addition, different operating systems on the same CPU may differ in how they +name the registers; then you need additional conditionals. For +example, some 68000 operating systems call this register %a5.

+

Eventually there may be a way of asking the compiler to choose a register +automatically, but first we need to figure out how it should choose and +how to enable you to guide the choice. No solution is evident.

+

Defining a global register variable in a certain register reserves that +register entirely for this use, at least within the current compilation. +The register is not allocated for any other purpose in the functions +in the current compilation, and is not saved and restored by +these functions. Stores into this register are never deleted even if they +appear to be dead, but references may be deleted or moved or +simplified.

+

It is not safe to access the global register variables from signal +handlers, or from more than one thread of control, because the system +library routines may temporarily use the register for other things (unless +you recompile them specially for the task at hand).

+

It is not safe for one function that uses a global register variable to +call another such function foo by way of a third function +lose that is compiled without knowledge of this variable (i.e. in a +different source file in which the variable isn’t declared). This is +because lose might save the register and put some other value there. +For example, you can’t expect a global register variable to be available in +the comparison-function that you pass to qsort, since qsort +might have put something else in that register. (If you are prepared to +recompile qsort with the same global register variable, you can +solve this problem.)

+

If you want to recompile qsort or other source files that do not +actually use your global register variable, so that they do not use that +register for any other purpose, then it suffices to specify the compiler +option -ffixed-reg. You need not actually add a global +register declaration to their source code.

+

A function that can alter the value of a global register variable cannot +safely be called from a function compiled without this variable, because it +could clobber the value the caller expects to find there on return. +Therefore, the function that is the entry point into the part of the +program that uses the global register variable must explicitly save and +restore the value that belongs to its caller.

+

On most machines, longjmp restores to each global register +variable the value it had at the time of the setjmp. On some +machines, however, longjmp does not change the value of global +register variables. To be portable, the function that called setjmp +should make other arrangements to save the values of the global register +variables, and to restore them in a longjmp. This way, the same +thing happens regardless of what longjmp does.

+

All global register variable declarations must precede all function +definitions. If such a declaration could appear after function +definitions, the declaration would be too late to prevent the register from +being used for other purposes in the preceding functions.

+

Global register variables may not have initial values, because an +executable file has no means to supply initial contents for a register.

+

On the SPARC, there are reports that g3 ... g7 are suitable +registers, but certain library functions, such as getwd, as well +as the subroutines for division and remainder, modify g3 and g4. g1 and +g2 are local temporaries.

+

On the 68000, a2 ... a5 should be suitable, as should d2 ... d7. +Of course, it does not do to use more than a few of those.

+
+
+

Specifying Registers for Local Variables

+

You can define a local register variable with a specified register +like this:

+
register int *foo asm ("a5");
+
+
+

Here a5 is the name of the register that should be used. Note +that this is the same syntax used for defining global register +variables, but for a local variable it appears within a function.

+

Naturally the register name is CPU-dependent, but this is not a +problem, since specific registers are most often useful with explicit +assembler instructions (see Extended Asm - Assembler Instructions with C Expression Operands). Both of these things +generally require that you conditionalize your program according to +CPU type.

+

In addition, operating systems on one type of CPU may differ in how they +name the registers; then you need additional conditionals. For +example, some 68000 operating systems call this register %a5.

+

Defining such a register variable does not reserve the register; it +remains available for other uses in places where flow control determines +the variable’s value is not live.

+

This option does not guarantee that GCC generates code that has +this variable in the register you specify at all times. You may not +code an explicit reference to this register in the assembler +instruction template part of an asm statement and assume it +always refers to this variable. +However, using the variable as an input or output operand to the asm +guarantees that the specified register is used for that operand. +See Extended Asm - Assembler Instructions with C Expression Operands, for more information.

+

Stores into local register variables may be deleted when they appear to be dead +according to dataflow analysis. References to local register variables may +be deleted or moved or simplified.

+

As with global register variables, it is recommended that you choose a +register that is normally saved and restored by function calls on +your machine, so that library routines will not clobber it.

+

Sometimes when writing inline asm code, you need to make an operand be a +specific register, but there’s no matching constraint letter for that +register. To force the operand into that register, create a local variable +and specify the register in the variable’s declaration. Then use the local +variable for the asm operand and specify any constraint letter that matches +the register:

+
register int *p1 asm ("r0") = ...;
+register int *p2 asm ("r1") = ...;
+register int *result asm ("r0");
+asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
+
+
+

Warning: In the above example, be aware that a register (for example r0) can be +call-clobbered by subsequent code, including function calls and library calls +for arithmetic operators on other variables (for example the initialization +of p2). In this case, use temporary variables for expressions between the +register assignments:

+
int t1 = ...;
+register int *p1 asm ("r0") = ...;
+register int *p2 asm ("r1") = t1;
+register int *result asm ("r0");
+asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
+
+
+

Size of an asm

+

Some targets require that GCC track the size of each instruction used +in order to generate correct code.  Because the final length of the +code produced by an asm statement is only known by the +assembler, GCC must make an estimate as to how big it will be. It +does this by counting the number of instructions in the pattern of the +asm and multiplying that by the length of the longest +instruction supported by that processor. (When working out the number +of instructions, it assumes that any occurrence of a newline or of +whatever statement separator character is supported by the assembler - +typically ; - indicates the end of an instruction.)

+

Normally, GCC’s estimate is adequate to ensure that correct +code is generated, but it is possible to confuse the compiler if you use +pseudo instructions or assembler macros that expand into multiple real +instructions, or if you use assembler directives that expand to more +space in the object file than is needed for a single instruction. +If this happens then the assembler may produce a diagnostic saying that +a label is unreachable.

+
+
+
+ + +
+
+
+
+
+ + + + \ No newline at end of file diff --git a/docs/index.html b/docs/index.html index 6c1c812f8..10f988153 100644 --- a/docs/index.html +++ b/docs/index.html @@ -1 +1 @@ -docs/index.html
2c3c telltale_heart.txt
75e2 iota.html
4558 library_of_babel.txt
???? kharms.txt
2f15 cosmic.htm
0acb nadsat.txt
118a grooks.txt
5b20 nothing_but_the_fruit.txt
4067 murderer.txt
f2f2 meta2.html
0302 o_where.txt
7814 economy_and_pleasure.txt
2f7f soft_rains.txt
ba3f individualism.txt
3bbd silence.txt
0373 golden_apples_of_the_sun.txt
0370 do_not_go_gentle.txt
???? machine_stops.txt
018e damoyselle_malade.txt
5a8d dssp.txt
1558 sybilline_books.txt
3988 great_forgetting.txt
???? combinatory.html
75e6 situated_software.html
0875 businessman.txt
aca2 joy_combinators.html
???? memo444.html
800b unknowable.html
???? autobiographical_notes.txt
54cc aristasian.txt
e44e joy_rewriting.html
14f5 weavers.txt
17b8 egg.txt
4e97 alchemy.html
???? memo1564.html
???? planetary.html
???? walking.txt
8678 interview.htm
0a05 the_garden.txt
???? memo528_cadr.html
356e second_narcissus.txt
5715 folie_du_jour.txt
6308 last_question.txt
3cd4 omelas.txt
1584 stonecutter.txt
d3c8 tiny_basic.txt
???? baker_thermodynamics.html
c553 joy_math.html
b594 as_we_may_think.txt
0b31 index.html
\ No newline at end of file +docs/index.html
2c3c telltale_heart.txt
75e2 iota.html
4558 library_of_babel.txt
???? kharms.txt
2f15 cosmic.htm
0acb nadsat.txt
118a grooks.txt
5b20 nothing_but_the_fruit.txt
bcff x86.html
???? asmc.html
4067 murderer.txt
f2f2 meta2.html
0302 o_where.txt
7814 economy_and_pleasure.txt
2f7f soft_rains.txt
ba3f individualism.txt
3bbd silence.txt
0373 golden_apples_of_the_sun.txt
0370 do_not_go_gentle.txt
???? machine_stops.txt
018e damoyselle_malade.txt
5a8d dssp.txt
1558 sybilline_books.txt
3988 great_forgetting.txt
???? combinatory.html
75e6 situated_software.html
0875 businessman.txt
aca2 joy_combinators.html
???? memo444.html
800b unknowable.html
???? autobiographical_notes.txt
54cc aristasian.txt
e44e joy_rewriting.html
14f5 weavers.txt
17b8 egg.txt
4e97 alchemy.html
???? memo1564.html
???? planetary.html
???? walking.txt
8678 interview.htm
0a05 the_garden.txt
???? memo528_cadr.html
356e second_narcissus.txt
5715 folie_du_jour.txt
6308 last_question.txt
3cd4 omelas.txt
1584 stonecutter.txt
d3c8 tiny_basic.txt
???? baker_thermodynamics.html
c553 joy_math.html
b594 as_we_may_think.txt
0b85 index.html
\ No newline at end of file diff --git a/docs/x86.html b/docs/x86.html new file mode 100644 index 000000000..718bd7445 --- /dev/null +++ b/docs/x86.html @@ -0,0 +1,1247 @@ + + + + + + + + + + + + + + Guide to x86 Assembly + + + + + + + + + +
+ +University of Virginia Computer Science
+CS216: Program and Data Representation, Spring 2006

+
+ + +
+ 08 March 2022
+
+ + +
+ + + + + + +

x86 Assembly Guide

+

+Contents: Registers | Memory and +Addressing | Instructions | Calling Convention +

+

+ +This guide describes the basics of 32-bit x86 assembly language +programming, covering a small but useful subset of the available +instructions and assembler directives. There are several different +assembly languages for generating x86 machine code. The one we will use +in CS216 is the Microsoft Macro Assembler (MASM). MASM uses +the standard Intel syntax for writing x86 assembly code. + +

+ +The full x86 instruction set is large and complex (Intel's x86 +instruction set manuals comprise over 2900 pages), and we do not cover +it all in this guide. For example, there is a 16-bit subset of the x86 +instruction set. Using the 16-bit programming model can be quite +complex. It has a segmented memory model, more restrictions on register +usage, and so on. In this guide, we will limit our attention to more +modern aspects of x86 programming, and delve into the instruction set +only in enough detail to get a basic feel for x86 programming. +

+

Resources

+ + + +

+ +

Registers

+

+ +Modern (i.e., 386 and beyond) x86 processors have eight 32-bit general +purpose registers, as depicted in Figure 1. The register names are +mostly historical. For example, EAX used to be called the +accumulator since it was used by a number of arithmetic operations, and +ECX was known as the counter since it was used to hold a loop +index. Whereas most of the registers have lost their special purposes in +the modern instruction set, by convention, two are reserved for special +purposes — the stack pointer (ESP) and the base pointer +(EBP). +

+ For the EAX, EBX, ECX, and +EDX registers, subsections may be used. For example, the least +significant 2 bytes of EAX can be treated as a 16-bit register +called AX. The least significant byte of AX can be +used as a single 8-bit register called AL, while the most +significant byte of AX can be used as a single 8-bit register +called AH. These names refer to the same physical +register. When a two-byte quantity is placed into DX, the +update affects the value of DH, DL, and +EDX. These sub-registers are mainly hold-overs from older, +16-bit versions of the instruction set. However, they are sometimes +convenient when dealing with data that are smaller than 32 bits +(e.g. 1-byte ASCII characters). +
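The aliasing between a 32-bit register and its sub-registers can be modeled in C with masks and shifts. This is only a sketch: the helper names (`low16`, `low8`, `high8`, `write_low16`) are made up for illustration and are not part of any assembler or library.

```c
#include <stdint.h>

/* Read the sub-registers of a 32-bit register value r. */
uint16_t low16 (uint32_t r) { return (uint16_t) (r & 0xFFFFu); }        /* e.g. AX from EAX */
uint8_t  low8  (uint32_t r) { return (uint8_t) (r & 0xFFu); }           /* e.g. AL */
uint8_t  high8 (uint32_t r) { return (uint8_t) ((r >> 8) & 0xFFu); }    /* e.g. AH */

/* Writing the 16-bit sub-register replaces the low half of the
   register and leaves the upper 16 bits unchanged. */
uint32_t write_low16 (uint32_t r, uint16_t v)
{
  return (r & 0xFFFF0000u) | v;
}
```

For example, writing 0xABCD into DX when EDX holds 0x12345678 yields 0x1234ABCD, after which DH reads 0xAB and DL reads 0xCD.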

+When referring to registers in assembly +language, the names are not case-sensitive. For example, the names +EAX and eax refer to the same register.

+

+

+
+
+Figure 1. x86 Registers +
+ +

Memory and Addressing Modes

+ +

Declaring Static Data Regions

+ +You can declare static data regions (analogous to global variables) in +x86 assembly using special assembler directives for this purpose. Data +declarations should be preceded by the .DATA +directive. Following this directive, the directives DB, DW, and DD can be used to declare one, two, and four byte +data locations, respectively. Declared locations can be labeled with +names for later reference — this is similar to declaring variables by +name, but abides by some lower level rules. For example, locations +declared in sequence will be located in memory next to one another. +

+Example declarations: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
.DATA
var DB 64   +; Declare a byte, referred to as location var, containing the value 64.
var2 DB ? +; Declare an uninitialized byte, referred to as location var2. + +
DB 10 +; Declare a byte with no label, containing the value 10. +Its location is var2 + 1. +
X DW ? ; Declare +a 2-byte uninitialized value, referred to as location X. + +
Y DD 30000    ; Declare a 4-byte value, referred to as +location Y, initialized to 30000. +
+
+ +

+Unlike in high level languages where arrays can have many dimensions and +are accessed by indices, arrays in x86 assembly language are simply a +number of cells located contiguously in memory. An array can be declared +by just listing the values, as in the first example below. Two other +common methods used for declaring arrays of data are the DUP directive and the use of string literals. The +DUP directive tells the assembler to duplicate an +expression a given number of times. For example, 4 DUP(2) is equivalent to 2, 2, 2, +2. +

+Some examples:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + +
Z DD 1, 2, 3 ; Declare three 4-byte values, initialized to 1, +2, and 3. The value of location Z + 8 will be 3. +
+bytes   +DB 10 DUP(?) + +; Declare 10 uninitialized bytes starting at +location bytes. + +
arr +DD 100 DUP(0)     +; Declare 100 4-byte words starting at location +arr, +all initialized to 0
str DB 'hello',0 ; Declare 6 bytes starting at the address str, +initialized to the ASCII character values +for hello and the null (0) +byte. +
+
+ +
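For readers who think in C, the array declarations above correspond roughly to the following definitions. This is an analogy rather than an exact translation: C zero-initializes `bytes`, whereas `DUP(?)` leaves the storage uninitialized, and the helper `dword_at_Z` is a hypothetical name added to show byte-offset addressing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

int32_t Z[] = { 1, 2, 3 };     /* Z     DD 1, 2, 3     */
uint8_t bytes[10];             /* bytes DB 10 DUP(?)   */
int32_t arr[100] = { 0 };      /* arr   DD 100 DUP(0)  */
char    str[] = "hello";       /* str   DB 'hello',0   */

/* Read the 4-byte value stored byte_offset bytes past the start of Z. */
int32_t dword_at_Z (size_t byte_offset)
{
  int32_t value;
  assert (byte_offset + sizeof value <= sizeof Z);
  memcpy (&value, (const unsigned char *) Z + byte_offset, sizeof value);
  return value;
}
```

Reading at byte offset 8 returns the third element, matching the remark that the value at location Z + 8 is 3.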

Addressing Memory

+ +Modern x86-compatible processors are capable of addressing up to +2^32 bytes of memory: memory addresses are 32 bits wide. In +the examples above, where we used labels to refer to memory regions, +these labels are actually replaced by the assembler with 32-bit +quantities that specify addresses in memory. In addition to +referring to memory regions by labels (i.e. constant values), the x86 +provides a flexible scheme for computing and referring to memory +addresses: up to two of the 32-bit registers and a 32-bit signed +constant can be added together to compute a memory address. One of the +registers can be optionally pre-multiplied by 2, 4, or 8.

+ +

+ +The addressing modes can be used with many x86 instructions +(we'll describe them in the next section). Here we illustrate some examples +using the mov instruction that moves data +between registers and memory. This instruction has two operands: the +first is the destination and the second specifies the source. + +

+ +Some examples of mov instructions +using address computations are:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+mov eax, [ebx]; +Move the 4 bytes in memory at the address contained in EBX into +EAX
+mov [var], ebx + +; Move the contents of EBX into the 4 bytes at +memory address var. (Note, var is a 32-bit +constant). +
+mov eax, [esi-4] +; Move 4 bytes at memory address +ESI + (-4) into EAX +
+mov [esi+eax], cl + +; Move the contents of CL into the +byte at address ESI+EAX +
+mov edx, [esi+4*ebx]     +; Move the 4 bytes of data at address ESI+4*EBX into EDX +
+
+ +Some examples of invalid address calculations include: +

+ +
+ + + + + + + + + + +
+mov eax, [ebx-ecx] +; Can only add register +values
+mov [eax+esi+edi], ebx     + +; At most 2 registers in address +computation +
+
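The address computation used by these addressing modes can be written out as a small C function. This is a sketch under the rules stated above (at most two registers, one of them optionally scaled, plus a signed constant); the function name and parameter names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = base + index * scale + displacement,
   truncated to 32 bits. Pass index = 0 when no index register is used. */
uint32_t effective_address (uint32_t base, uint32_t index,
                            unsigned scale, int32_t disp)
{
  assert (scale == 1 || scale == 2 || scale == 4 || scale == 8);
  return base + index * scale + (uint32_t) disp;
}
```

With this model, `[esi-4]` is `effective_address(esi, 0, 1, -4)` and `[esi+4*ebx]` is `effective_address(esi, ebx, 4, 0)`.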
+ +

Size Directives

+ +In general, the intended size of the data item at a given memory +address can be inferred from the assembly code instruction in which it +is referenced. For example, in all of the above instructions, the size +of the memory regions could be inferred from the size of the register +operand. When we were loading a 32-bit register, the assembler could +infer that the region of memory we were referring to was 4 bytes +wide. When we were storing the value of a one byte register to memory, +the assembler could infer that we wanted the address to refer to a +single byte in memory. + +

+ +However, in some cases the size of a referred-to memory region is +ambiguous. Consider the instruction mov [ebx], +2. Should this instruction move the value 2 into the +single byte at address EBX? Perhaps +it should move the 32-bit integer representation of 2 into the 4 bytes +starting at address EBX. Since either +is a valid interpretation, the assembler must be explicitly +directed as to which is correct. The size directives BYTE PTR, WORD +PTR, and DWORD PTR serve this purpose, +indicating sizes of 1, 2, and 4 bytes respectively. +

+For example: +
+ + + + + + + + + + + + + + + +
+mov BYTE PTR [ebx], 2; Move 2 into the single byte at the address +stored in EBX. +
+mov WORD PTR [ebx], 2; Move the 16-bit integer representation +of 2 into the 2 bytes starting at the address in EBX. + +
+mov DWORD PTR [ebx], 2     + +; Move the 32-bit integer representation of 2 into the +4 bytes starting at the address in EBX. + +
+
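To see why the size matters, the three stores above can be modeled in C. The sketch assumes little-endian byte order, which is what x86 uses; `store_le` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low `size` bytes of `value` at mem[0] .. mem[size-1],
   least significant byte first. */
void store_le (uint8_t *mem, size_t size, uint32_t value)
{
  assert (size == 1 || size == 2 || size == 4);
  for (size_t i = 0; i < size; i++)
    mem[i] = (uint8_t) (value >> (8 * i));
}
```

Storing 2 with size 1 changes one byte, while sizes 2 and 4 also overwrite the following bytes with zeros, so the three directives are not interchangeable.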
+ +

Instructions

+ +Machine instructions generally fall into three categories: data +movement, arithmetic/logic, and control-flow. In this section, we will +look at important examples of x86 instructions from each category. This +section should not be considered an exhaustive list of x86 instructions, +but rather a useful subset. For a complete list, see Intel's +instruction set reference. +

+We use the following notation:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+<reg32>    Any +32-bit register (EAX, +EBX, +ECX, +EDX, +ESI, +EDI, +ESP, or +EBP) +
+<reg16>Any +16-bit register (AX, +BX, +CX, or +DX) +
+<reg8>Any +8-bit register (AH, +BH, +CH, +DH, +AL, +BL, +CL, or +DL) +
+<reg>Any register
+<mem>A memory address (e.g., [eax], [var + 4], or +dword ptr [eax+ebx]) +
+<con32>Any 32-bit constant
+<con16>Any 16-bit constant
+<con8>Any 8-bit constant
+<con>Any 8-, 16-, or 32-bit constant
+
+ +

Data Movement Instructions

+ +mov — Move (Opcodes: 88, 89, 8A, +8B, 8C, 8E, ...) +
+The mov instruction copies the data item referred to by +its second operand (i.e. register contents, memory contents, or a constant +value) into the location referred to by its first operand (i.e. a register or +memory). While register-to-register moves are possible, direct memory-to-memory +moves are not. In cases where memory transfers are desired, the source memory +contents must first be loaded into a register, then can be stored to the +destination memory address.

+

+Syntax
+mov <reg>,<reg>
+mov <reg>,<mem>
+mov <mem>,<reg>
+mov <reg>,<con>
+mov <mem>,<con>
+

+Examples
+mov eax, ebx — copy the value in ebx into eax
+mov byte ptr [var], 5 — store the value 5 into the +byte at location var
+

+

+push — Push stack (Opcodes: +FF, 50-57, 6A, 68, ...) +
+The push instruction places its operand onto +the top of the hardware supported stack in memory. Specifically, push first decrements ESP by 4, then places its +operand into the contents of the 32-bit location at address [ESP]. ESP +(the stack pointer) is decremented by push since the x86 stack grows +down - i.e. the stack grows from high addresses to lower addresses. +

+Syntax
+push <reg32>
+push <mem>
+push <con32> +

+Examples
+push eax — push eax on the stack
+push [var] — push the 4 bytes at +address var onto the stack

+
+ +pop — Pop stack +
+The pop instruction removes the 4-byte data +element from the top of the hardware-supported stack into the specified +operand (i.e. register or memory location). It first moves the 4 bytes +located at memory location [ESP] into the +specified register or memory location, and then increments ESP by 4. +

+Syntax
+pop <reg32>
+pop <mem> +

+Examples
+pop edi — pop the top element of the stack into EDI.
+pop [ebx] — pop the top element of the +stack into memory at the four bytes starting at location EBX. +
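The order of operations in push and pop can be sketched with a toy machine in C. Everything here is illustrative: the `machine` structure, the 64-byte stack region, and the function names are assumptions, and `esp` is an offset into the array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };     /* hypothetical, tiny stack region */

struct machine {
  uint8_t  mem[STACK_BYTES];
  uint32_t esp;                /* offset into mem; STACK_BYTES means empty */
};

void machine_init (struct machine *m)
{
  memset (m->mem, 0, sizeof m->mem);
  m->esp = STACK_BYTES;
}

/* push: decrement ESP by 4, then store the operand at [ESP]. */
void push32 (struct machine *m, uint32_t value)
{
  assert (m->esp >= 4);
  m->esp -= 4;
  memcpy (&m->mem[m->esp], &value, 4);
}

/* pop: load the 4 bytes at [ESP], then increment ESP by 4. */
uint32_t pop32 (struct machine *m)
{
  uint32_t value;
  assert (m->esp + 4 <= STACK_BYTES);
  memcpy (&value, &m->mem[m->esp], 4);
  m->esp += 4;
  return value;
}
```

Values come back in the reverse of the order they were pushed, and `esp` moves toward lower offsets as the stack grows.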
+ +lea — Load effective address +
+The lea instruction places the address specified by its second operand +into the register specified by its first operand. Note, the contents of the memory location are not +loaded, only the effective address is computed and placed into the register. +This is useful for obtaining a pointer into a memory region.

+

+Syntax
+lea <reg32>,<mem> +

+Examples
+lea edi, [ebx+4*esi] — the quantity EBX+4*ESI is placed in EDI.
+lea eax, [var] — the address of var is placed in EAX.
+lea eax, [val] — the value val is placed in EAX. +
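The difference between computing an address and loading from it has a close analogue in C pointer arithmetic. In this sketch the array `table` and the function names are hypothetical.

```c
#include <stdint.h>

int32_t table[4] = { 10, 20, 30, 40 };   /* hypothetical data */

/* Like lea reg, [base+4*index]: compute the address only.
   Memory is not read. */
int32_t *lea_like (int32_t *base, uint32_t index)
{
  return base + index;   /* pointer arithmetic scales by sizeof (int32_t) == 4 */
}

/* Like mov reg, [base+4*index]: compute the same address,
   then load the 4 bytes stored there. */
int32_t mov_like (int32_t *base, uint32_t index)
{
  return *lea_like (base, index);
}
```

`lea_like(table, 2)` yields a pointer 8 bytes past the start of `table`, while `mov_like(table, 2)` yields the value stored there.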

+ +

Arithmetic and Logic Instructions

+ +add — Integer Addition +
+ +The add instruction adds +together its two operands, storing the result in its first +operand. Note, whereas both operands may be registers, at most one +operand may be a memory location. +

+ +Syntax
+add <reg>,<reg>
+add <reg>,<mem>
+add <mem>,<reg>
+add <reg>,<con>
+add <mem>,<con>
+

+Examples
+add eax, 10 — EAX ← EAX + 10
+add BYTE PTR [var], 10 — add 10 to the +single byte stored at memory address var +
+ +sub — Integer Subtraction +
+ +The sub instruction stores in the value of +its first operand the result of subtracting the value of its second +operand from the value of its first operand. As with add, +at most one operand may be a memory location.

+ +Syntax
+sub <reg>,<reg>
+sub <reg>,<mem>
+sub <mem>,<reg>
+sub <reg>,<con>
+sub <mem>,<con>
+

+Examples
+sub al, ah — AL ← AL - AH
+sub eax, 216 — subtract 216 from the +value stored in EAX +
+ +inc, dec — Increment, Decrement +
+The inc instruction increments +the contents of its operand by one. The dec +instruction decrements the contents of its operand by one.

+

+ +Syntax
+inc <reg>
+inc <mem>
+dec <reg>
+dec <mem> +

+Examples
+dec eax — subtract one from the contents of EAX.
+inc DWORD PTR [var] — add one to the +32-bit integer stored at location var +

+ +imul — Integer Multiplication +
+The imul instruction has two basic formats: +two-operand (first two syntax listings below) and three-operand (last +two syntax listings below). +

+The two-operand form multiplies its two operands together and stores the result +in the first operand. The result (i.e. first) operand must be a +register. +

+The three operand form multiplies its second and third operands together +and stores the result in its first operand. Again, the result operand +must be a register. Furthermore, the third operand is restricted to +being a constant value. +

+Syntax
+imul <reg32>,<reg32>
+imul <reg32>,<mem>
+imul <reg32>,<reg32>,<con>
+imul <reg32>,<mem>,<con> +

+Examples
+

+imul eax, [var] — multiply the contents +of EAX by the 32-bit contents of the memory location var. Store +the result in EAX. +
+
+imul esi, edi, 25 — ESI ← EDI * 25 +
+
+ +idiv — Integer Division +
+ +The idiv instruction divides the +contents of the 64 bit integer EDX:EAX (constructed by viewing EDX as +the most significant four bytes and EAX as the least significant four +bytes) by the specified operand value. The quotient result of the +division is stored into EAX, while the remainder is placed in EDX. +

+Syntax
+idiv <reg32>
+idiv <mem> +

+Examples +

+idiv ebx — divide the contents of +EDX:EAX by the contents of EBX. Place the quotient in EAX and the +remainder in EDX.
+
idiv DWORD PTR [var] — divide the +contents of EDX:EAX by the 32-bit value stored at memory location +var. Place the quotient in EAX and the remainder in EDX.
+
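The way idiv combines EDX and EAX into one dividend can be modeled in C as follows. This is a sketch: `idiv32` is a hypothetical name, the return value stands in for the divide error that the real instruction raises, and the conversion to `int64_t` assumes the usual two's complement behavior.

```c
#include <stdint.h>

/* Model of idiv <reg32>: divide the signed 64-bit value EDX:EAX by
   divisor. On success, store the quotient in *eax and the remainder in
   *edx and return 1. Return 0 where the real instruction would raise a
   divide error (zero divisor, or a quotient that does not fit in 32 bits). */
int idiv32 (uint32_t *eax, uint32_t *edx, int32_t divisor)
{
  int64_t dividend = (int64_t) (((uint64_t) *edx << 32) | *eax);
  if (divisor == 0)
    return 0;
  if (dividend == INT64_MIN && divisor == -1)
    return 0;                        /* also avoids undefined behavior in C */
  int64_t q = dividend / divisor;    /* C truncates toward zero, as idiv does */
  int64_t r = dividend % divisor;
  if (q < INT32_MIN || q > INT32_MAX)
    return 0;
  *eax = (uint32_t) q;
  *edx = (uint32_t) r;
  return 1;
}
```

Note that to divide a plain 32-bit signed value, EDX must hold the sign extension of EAX (all zeros for a non-negative value, all ones for a negative one).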
+and, or, xor — Bitwise logical +and, or and exclusive or +
+These instructions perform the specified logical operation (logical +bitwise and, or, and exclusive or, respectively) on their operands, placing the +result in the first operand location. +

+Syntax
+and <reg>,<reg>
+and <reg>,<mem>
+and <mem>,<reg>
+and <reg>,<con>
+and <mem>,<con>
+

+or <reg>,<reg>
+or <reg>,<mem>
+or <mem>,<reg>
+or <reg>,<con>
+or <mem>,<con>
+

+xor <reg>,<reg>
+xor <reg>,<mem>
+xor <mem>,<reg>
+xor <reg>,<con>
+xor <mem>,<con>
+

+Examples
+and eax, 0fH — clear all but the last 4 +bits of EAX.
+xor edx, edx — set the contents of EDX +to zero. +

+ +not — Bitwise Logical Not +
+Logically negates the operand contents (that is, flips all bit values in +the operand). +

+ +Syntax
+not <reg>
+not <mem> +

+Example
+not BYTE PTR [var] — negate all bits in the byte +at the memory location var. +

+

+neg — Negate +
+Performs the two's complement negation of the operand contents. +

+Syntax
+neg <reg>
+neg <mem> +

+Example
+neg eax — EAX ← -EAX +
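The relationship between not and neg can be checked with a short C sketch on 32-bit values; the function names are illustrative.

```c
#include <stdint.h>

/* not: flip every bit of the operand. */
uint32_t not32 (uint32_t x) { return ~x; }

/* neg: two's complement negation, which equals flipping every bit
   and then adding one. */
uint32_t neg32 (uint32_t x) { return ~x + 1u; }
```

For example, `neg32(5)` gives 0xFFFFFFFB, which is the 32-bit two's complement representation of -5.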

+ +shl, shr — Shift Left, Shift +Right +
+ +These instructions shift the bits in their first operand's contents +left and right, padding the resulting empty bit +positions with zeros. The shifted operand can be shifted up to 31 places. The +number of bits to shift is specified by the second operand, which can be +either an 8-bit constant or the register CL. In either case, shift counts +greater than 31 are performed modulo 32. +

+Syntax
+shl <reg>,<con8>
+shl <mem>,<con8>
+shl <reg>,<cl>
+shl <mem>,<cl> +

+shr <reg>,<con8>
+shr <mem>,<con8>
+shr <reg>,<cl>
+shr <mem>,<cl> +

+Examples
+

shl eax, 1 — Multiply the value of EAX +by 2 (if the most significant bit is 0)
+
shr ebx, cl — Store in EBX the floor of the result of dividing the value of EBX +by 2^n, where n is the value in CL.
+
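The modulo-32 behavior of the shift count can be modeled in C by masking the count. This is a sketch for 32-bit operands only, and the function names are made up for illustration.

```c
#include <stdint.h>

/* shl/shr on a 32-bit operand: only the low 5 bits of the count are
   used, so a count of 33 behaves like a count of 1. */
uint32_t shl32 (uint32_t value, uint8_t count) { return value << (count & 31); }
uint32_t shr32 (uint32_t value, uint8_t count) { return value >> (count & 31); }
```

The mask also keeps the C code well defined, since shifting a 32-bit value by 32 or more is undefined behavior in C.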
+ +

Control Flow Instructions

+ +The x86 processor maintains an instruction pointer (IP) register that is +a 32-bit value indicating the location in memory where the current +instruction starts. Normally, it increments to point to the next +instruction in memory after executing an instruction. The IP +register cannot be manipulated directly, but is updated implicitly by +the provided control flow instructions. +

+We use the notation <label> to refer to +labeled locations in the program text. Labels can be inserted anywhere +in x86 assembly code text by entering a label +name followed by a colon. For example, +

+
+       mov esi, [ebp+8]
+begin: xor ecx, ecx
+       mov eax, [esi]
+
+
+The second instruction in this code fragment is labeled begin. Elsewhere in the code, we can refer to the +memory location of this instruction using the +more convenient symbolic name begin. This +label is just a convenient way of expressing the location instead of its +32-bit value. +

+jmp — Jump +

+Transfers program control flow to the instruction at the memory +location indicated by the operand. +

+Syntax
+jmp <label> +

+Example
+jmp begin — Jump to the instruction +labeled begin. +

+jcondition — +Conditional Jump +
+ +These instructions are conditional jumps that are based on the status of +a set of condition codes that are stored in a special register called +the machine status word. The contents of the machine status +word include information about the last arithmetic operation +performed. For example, one bit of this word indicates if the last +result was zero. Another indicates if the last result was +negative. Based on these condition codes, a number of conditional jumps +can be performed. For example, the jz +instruction performs a jump to the specified operand label if the result +of the last arithmetic operation was zero. Otherwise, control proceeds +to the next instruction in sequence. +

+A number of the conditional branches are given names that are +intuitively based on the last operation performed being a special +compare instruction, cmp (see below). For example, conditional branches +such as jle and jne are based on first performing a cmp operation +on the desired operands. + +

+Syntax
+je <label> (jump when equal)
+jne <label> (jump when not equal)
+jz <label> (jump when last result was zero)
+jg <label> (jump when greater than)
+jge <label> (jump when greater than or equal to)
+jl <label> (jump when less than)
+jle <label> (jump when less than or equal to) +

+Example
+cmp eax, ebx
+jle done
+

+If the contents of EAX are less than or equal to the contents of EBX, +jump to the label done. Otherwise, continue to the next +instruction. +
+
+ +cmp — Compare +
+Compare the values of the two specified operands, setting the condition +codes in the machine status word appropriately. This instruction is +equivalent to the sub instruction, except the +result of the subtraction is discarded instead of replacing the first +operand. +

+Syntax
+cmp <reg>,<reg>
+cmp <reg>,<mem>
+cmp <mem>,<reg>
+cmp <reg>,<con> +

+Example
+cmp DWORD PTR [var], 10
+je loop
+

+If the 4 bytes stored at location var are equal to the 4-byte +integer constant 10, jump to the location labeled loop. +
+
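How cmp and the conditional jumps cooperate can be sketched in C by computing the condition codes explicitly. The guide does not name the individual bits; this sketch uses the x86 flag names ZF, SF, OF, and CF, and the structure and function names are illustrative.

```c
#include <stdint.h>

struct flags { int zf, sf, of, cf; };

/* cmp a, b: compute a - b, discard the result, keep the condition codes. */
struct flags cmp32 (uint32_t a, uint32_t b)
{
  uint32_t diff = a - b;
  struct flags f;
  f.zf = (diff == 0);                          /* result was zero */
  f.sf = (diff >> 31) & 1;                     /* sign bit of the result */
  f.cf = (a < b);                              /* unsigned borrow */
  f.of = (((a ^ b) & (a ^ diff)) >> 31) & 1;   /* signed overflow */
  return f;
}

/* Signed conditions, as tested by the corresponding jumps. */
int je_taken  (struct flags f) { return f.zf; }
int jl_taken  (struct flags f) { return f.sf != f.of; }
int jle_taken (struct flags f) { return f.zf || f.sf != f.of; }
int jg_taken  (struct flags f) { return !f.zf && f.sf == f.of; }
```

Comparing SF with OF, rather than looking at SF alone, is what keeps the signed comparisons correct when the subtraction overflows.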
+ +call, ret — Subroutine +call and return +
+These instructions implement a subroutine call and return. +The call instruction first pushes the current +code location onto the hardware supported stack in memory (see the push instruction for details), and then performs +an unconditional jump to the code location indicated by the label +operand. Unlike the simple jump instructions, the call instruction saves the location to return to +when the subroutine completes. +

+The ret instruction implements a subroutine +return mechanism. This instruction first pops a code location off the +hardware supported in-memory stack (see the pop instruction for details). It then performs an +unconditional jump to the retrieved code location. + +

+Syntax
+call <label>
+ret +
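As a rough illustration of the two instructions, the sketch below models a toy machine in C. Everything in it (the `struct machine` type, the 64-word stack, and the `do_call` and `do_ret` names) is an assumption made for the example; a real processor obtains the return address from its own instruction pointer rather than from an argument.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical toy machine: a stack that grows down, a stack pointer
   (ESP, as a byte offset) and an instruction pointer (EIP). */
struct machine {
    uint32_t stack[STACK_WORDS];
    uint32_t esp;   /* byte offset of the top of the stack */
    uint32_t eip;   /* address of the next instruction */
};

void machine_init(struct machine *m, uint32_t entry)
{
    m->esp = STACK_WORDS * 4;   /* empty stack: ESP just past the end */
    m->eip = entry;
}

void push32(struct machine *m, uint32_t value)
{
    assert(m->esp >= 4);        /* guard against stack overflow */
    m->esp -= 4;                /* the stack grows toward lower addresses */
    m->stack[m->esp / 4] = value;
}

uint32_t pop32(struct machine *m)
{
    assert(m->esp + 4 <= STACK_WORDS * 4);   /* guard against underflow */
    uint32_t value = m->stack[m->esp / 4];
    m->esp += 4;
    return value;
}

/* call target: push the address of the instruction that follows the
   call, then jump to the target. */
void do_call(struct machine *m, uint32_t return_address, uint32_t target)
{
    push32(m, return_address);
    m->eip = target;
}

/* ret: pop the saved address and jump back to it. */
void do_ret(struct machine *m)
{
    m->eip = pop32(m);
}
```

After a matching `do_call` and `do_ret`, both EIP and ESP are back where the caller expects them, which is the property the calling convention relies on.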

+ +

Calling Convention

+ +To allow separate programmers to share code and develop libraries for +use by many programs, and to simplify the use of subroutines in general, +programmers typically adopt a common calling convention. The +calling convention is a protocol about how to call and return from +routines. For example, given a set of calling convention rules, a +programmer need not examine the definition of a subroutine to determine +how parameters should be passed to that subroutine. Furthermore, given a +set of calling convention rules, high-level language compilers can be +made to follow the rules, thus allowing hand-coded assembly language +routines and high-level language routines to call one another.

+ +In practice, many calling conventions are possible. We will use the +widely used C language calling convention. Following this convention +will allow you to write assembly language subroutines that are safely +callable from C (and C++) code, and will also enable you to call C +library functions from your assembly language code. + +

+ +The C calling convention is based heavily on the use of the +hardware-supported stack. It is based on the push, pop, call, and ret +instructions. Subroutine parameters are passed on the stack. Registers +are saved on the stack, and local variables used by subroutines are +placed in memory on the stack. The vast majority of high-level +procedural languages implemented on most processors have used similar +calling conventions. + +

+ +The calling convention is broken into two sets of rules. The first set +of rules is employed by the caller of the subroutine, and the second set +of rules is observed by the writer of the subroutine (the callee). It +should be emphasized that mistakes in the observance of these rules +quickly result in fatal program errors since the stack will be left in +an inconsistent state; thus meticulous care should be used when +implementing the call convention in your own subroutines. +

+
+Stack during Subroutine Call
[Thanks to +Maxence Faldor for providing a correct figure and to James Peterson for finding and fixing the bug in +the original version of this figure!] +
+

+A good way to visualize the operation of the calling convention is to +draw the contents of the nearby region of the stack during subroutine +execution. The image above depicts the contents of the stack during the +execution of a subroutine with three parameters and three local +variables. The cells depicted in the stack +are 32-bit wide memory locations, thus the memory addresses of the cells +are 4 bytes apart. The first +parameter resides at an offset of 8 bytes from the base pointer. Above +the parameters on the stack (and below the base pointer), the call instruction placed the return address, thus +leading to an extra 4 bytes of offset from the base pointer to the first +parameter. When the ret instruction is used +to return from the subroutine, it will jump to the return address stored +on the stack. + + +
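The offsets described above follow a simple pattern, which can be written down as a small C sketch. The function names are hypothetical, and the arithmetic assumes that every parameter and every local variable occupies exactly 4 bytes, as in the figure.

```c
#include <assert.h>

/* Byte offset from EBP of the n-th parameter (n counts from 1).
   [EBP] holds the saved base pointer and [EBP+4] the return address,
   so the first parameter starts at +8. */
int param_offset(int n)
{
    assert(n >= 1);
    return 8 + 4 * (n - 1);
}

/* Byte offset from EBP of the n-th local variable (n counts from 1).
   Locals sit at lower addresses than EBP, so the offset is negative. */
int local_offset(int n)
{
    assert(n >= 1);
    return -4 * n;
}
```

For the subroutine in the figure, the three parameters are at EBP + 8, EBP + 12 and EBP + 16, and the three locals at EBP - 4, EBP - 8 and EBP - 12.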

Caller Rules

+ +To make a subroutine call, the caller should: +
    +
  1. Before calling a subroutine, the caller should +save the contents of certain registers that are designated +caller-saved. The caller-saved registers are EAX, ECX, EDX. +Since the called subroutine is allowed to modify these registers, if the +caller relies on their values after the subroutine returns, the caller +must push the values in these registers onto the stack (so they can be +restored after the subroutine returns). + +
  2. To pass parameters to the subroutine, push them onto the stack +before the call. The parameters should be pushed in inverted order +(i.e. last parameter first). Since the stack grows down, the first +parameter will be stored at the lowest address (this inversion of +parameters was historically used to allow functions to be passed a +variable number of parameters). + +
  3. To call the subroutine, use the call +instruction. This instruction places the return address on top of the +parameters on the stack, and branches to the subroutine code. This +invokes the subroutine, which should follow the callee rules below. +
+ +After the subroutine returns (immediately following the call instruction), the caller can expect to find +the return value of the subroutine in the register EAX. To restore the +machine state, the caller should: + +
    +
  1. Remove the parameters from stack. This restores the stack to its +state before the call was performed. + +
  2. Restore the contents of caller-saved registers (EAX, ECX, EDX) by +popping them off of the stack. The caller can assume that no other +registers were modified by the subroutine. +
+ +Example +
+The code below shows a function call that follows the caller rules. The +caller is calling a function _myFunc that takes three integer +parameters. The first parameter is in EAX, the second parameter is the +constant 216, and the third parameter is in memory location var. +
+
+push [var] ; Push last parameter first
+push 216   ; Push the second parameter
+push eax   ; Push first parameter last
+
+call _myFunc ; Call the function (assume C naming)
+
+add esp, 12
+
+
+ +Note that after the call returns, the caller cleans up the stack using +the add instruction. We have 12 bytes (3 +parameters * 4 bytes each) on the stack, and the stack grows down. Thus, +to get rid of the parameters, we can simply add 12 to the stack pointer. +

+The result produced by _myFunc is now available for use in the +register EAX. The values of the other caller-saved registers (ECX and EDX) +may have been changed. If the caller uses them after the call, it must +save them on the stack before the call and restore them +after it. + + + +
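The stack-pointer arithmetic behind the cleanup step can be checked with a few lines of C. This is only bookkeeping on the value of ESP, assuming 4-byte parameters; no memory is modeled, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes the caller must add to ESP after the call returns,
   assuming every parameter is 4 bytes wide. */
uint32_t cleanup_bytes(unsigned nparams)
{
    return 4u * nparams;
}

/* Follow ESP through the caller sequence and return its final value. */
uint32_t esp_after_caller_sequence(uint32_t esp, unsigned nparams)
{
    uint32_t start = esp;

    esp -= 4u * nparams;            /* push each 4-byte parameter */
    esp -= 4;                       /* call pushes the return address */
    /* ... the callee runs and leaves ESP at the return address ... */
    esp += 4;                       /* ret pops the return address */
    esp += cleanup_bytes(nparams);  /* add esp, 12 when nparams is 3 */

    assert(esp == start);           /* the stack is balanced again */
    return esp;
}
```

With three parameters, `cleanup_bytes(3)` is 12, matching the `add esp, 12` in the listing.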

Callee Rules

+ +The definition of the subroutine should adhere to the following rules at +the beginning of the subroutine: +
    +
  1. Push the value of EBP onto the stack, and then copy the value of ESP +into EBP using the following instructions: +
    +    push ebp
    +    mov  ebp, esp
    +
    + +This initial action maintains the base pointer, EBP. The base +pointer is used by convention as a point of reference for finding +parameters and local variables on the stack. When a subroutine is +executing, the base pointer holds a copy of the stack pointer value from +when the subroutine started executing. Parameters and local variables +will always be located at known, constant offsets away from the base +pointer value. We push the old base pointer value at the beginning of +the subroutine so that we can later restore the appropriate base pointer +value for the caller when the subroutine returns. Remember, the caller +is not expecting the subroutine to change the value of the base +pointer. We then move the stack pointer into EBP to obtain our point of +reference for accessing parameters and local variables. + +
  2. Next, allocate local variables by making space on the +stack. Recall, the stack grows down, so to make space on the top of the +stack, the stack pointer should be decremented. The amount by which the stack +pointer is decremented depends on the number and size of local variables +needed. For example, if 3 local integers (4 bytes each) were required, +the stack pointer would need to be decremented by 12 to make space for +these local variables (i.e., sub esp, 12). +As with parameters, local variables will be located at known offsets +from the base pointer.

    + +
  3. Next, save the values of the callee-saved registers that +will be used by the function. To save registers, push them onto the +stack. The callee-saved registers are EBX, EDI, and ESI (ESP and EBP +will also be preserved by the calling convention, but need not be pushed +on the stack during this step). +
+ +After these three actions are performed, the body of the +subroutine may proceed. When the subroutine returns, it must follow +these steps: +
    +
  1. Leave the return value in EAX.

    +
  2. Restore the old values of any callee-saved registers (EBX, EDI, and ESI) +that were modified. The register contents are restored by popping them +from the stack. The registers should be popped in the reverse +of the order in which they were pushed. +
  3. Deallocate local variables. The obvious way to do this might be to +add the appropriate value to the stack pointer (since the space was +allocated by subtracting the needed amount from the stack pointer). In +practice, a less error-prone way to deallocate the variables is to +move the value in the base pointer into the stack pointer: mov esp, ebp. This works because the +base pointer always contains the value that the stack pointer contained immediately +prior to the allocation of the local variables. +
  4. Immediately before returning, restore the caller's base pointer +value by popping EBP off the stack. Recall that the first thing we did on +entry to the subroutine was to push the base pointer to save its old +value. +
  5. Finally, return to the caller by executing a ret instruction. This instruction will find and +remove the appropriate return address from the stack. +
+ +Note that the callee's rules fall cleanly into two halves that are +basically mirror images of one another. The first half of the rules +applies to the beginning of the function and is commonly said +to define the prologue of the function. The latter half of the +rules applies to the end of the function and is thus commonly said to +define the epilogue of the function.
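The mirror-image structure can be sketched with a toy model in C. The `struct frame_state` type, the 256-byte stack and the function names are assumptions made for the example, and the saving and restoring of callee-saved registers is left out so that only the handling of EBP and ESP remains.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 256

/* Hypothetical toy state: just ESP, EBP and a small stack memory. */
struct frame_state {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;
    uint32_t ebp;
};

static void push32(struct frame_state *s, uint32_t v)
{
    assert(s->esp >= 4);
    s->esp -= 4;                 /* the stack grows down */
    s->mem[s->esp / 4] = v;
}

static uint32_t pop32(struct frame_state *s)
{
    assert(s->esp + 4 <= STACK_BYTES);
    uint32_t v = s->mem[s->esp / 4];
    s->esp += 4;
    return v;
}

/* Prologue: push ebp; mov ebp, esp; sub esp, local_bytes. */
void prologue(struct frame_state *s, uint32_t local_bytes)
{
    push32(s, s->ebp);
    s->ebp = s->esp;
    assert(s->esp >= local_bytes);
    s->esp -= local_bytes;
}

/* Epilogue: mov esp, ebp; pop ebp. This works however many bytes of
   locals were allocated, because EBP still holds the value ESP had
   just before the allocation. */
void epilogue(struct frame_state *s)
{
    s->esp = s->ebp;
    s->ebp = pop32(s);
}
```

Running `prologue` followed by `epilogue` leaves ESP and EBP with the values the caller had, regardless of the size passed for the locals.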

+ +Example
+ +Here is an example function definition that follows the callee rules: +
+
+.486
+.MODEL FLAT
+.CODE
+PUBLIC _myFunc
+_myFunc PROC
+  ; Subroutine Prologue
+  push ebp     ; Save the old base pointer value.
+  mov ebp, esp ; Set the new base pointer value.
+  sub esp, 4   ; Make room for one 4-byte local variable.
+  push edi     ; Save the values of registers that the function
+  push esi     ; will modify. This function uses EDI and ESI.
+  ; (no need to save EBX, EBP, or ESP)
+
+  ; Subroutine Body
+  mov eax, [ebp+8]   ; Move value of parameter 1 into EAX
+  mov esi, [ebp+12]  ; Move value of parameter 2 into ESI
+  mov edi, [ebp+16]  ; Move value of parameter 3 into EDI
+
+  mov [ebp-4], edi   ; Move EDI into the local variable
+  add [ebp-4], esi   ; Add ESI into the local variable
+  add eax, [ebp-4]   ; Add the contents of the local variable
+                     ; into EAX (final result)
+
+  ; Subroutine Epilogue 
+  pop esi      ; Recover register values
+  pop  edi
+  mov esp, ebp ; Deallocate local variables
+  pop ebp ; Restore the caller's base pointer value
+  ret
+_myFunc ENDP
+END
+
+
+ +The subroutine prologue performs the standard actions of saving a +snapshot of the stack pointer in EBP (the base pointer), allocating +local variables by decrementing the stack pointer, and saving register +values on the stack. +

+ +In the body of the subroutine we can see the use of the base +pointer. Both parameters and local variables are located at constant +offsets from the base pointer for the duration of the subroutine's +execution. In particular, we notice that since parameters were placed +onto the stack before the subroutine was called, they are always located +below the base pointer (i.e. at higher addresses) on the stack. The +first parameter to the subroutine can always be found at memory location +EBP + 8, the second at EBP + 12, the third at EBP + 16. Similarly, +since local variables are allocated after the base pointer is set, they +always reside above the base pointer (i.e. at lower addresses) on the +stack. In particular, the first local variable is always located at +EBP - 4, the second at EBP - 8, and so on. This conventional use of the +base pointer allows us to quickly identify the use of local variables +and parameters within a function body. + +

+ +The function epilogue is basically a mirror image of the function +prologue. The caller's register values are recovered from the stack, +the local variables are deallocated by resetting the stack pointer, the +caller's base pointer value is recovered, and the ret instruction is +used to return to the appropriate code location in the caller. + +
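For comparison, here is roughly what the same routine and its call site look like in C. The parameter and variable names are made up (the assembly only uses stack offsets), and a compiler is free to generate different instructions than the hand-written listing.

```c
#include <assert.h>

/* C counterpart of the _myFunc listing: three int parameters, one
   4-byte local, result returned in EAX under the C calling convention. */
int myFunc(int p1, int p2, int p3)
{
    int local;          /* the 4-byte slot at [ebp-4] */

    local = p3;         /* mov [ebp-4], edi */
    local += p2;        /* add [ebp-4], esi */
    return p1 + local;  /* add eax, [ebp-4] */
}

/* C counterpart of the caller example: push [var], push 216,
   push eax, call _myFunc, add esp, 12. */
int call_myFunc(int first, int var)
{
    return myFunc(first, 216, var);
}

/* Small self-check: the routine simply sums its three parameters. */
void myFunc_demo(void)
{
    assert(myFunc(1, 2, 3) == 6);
    assert(call_myFunc(1, 3) == 220);
}
```

The prologue, epilogue and stack cleanup have no counterpart in the C source because the compiler emits them according to the calling convention.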

+

+ +

Using these Materials

+ +These materials are released under a +Creative + Commons Attribution-Noncommercial-Share Alike 3.0 United States + License. We are delighted when people want to use or adapt the + course materials we developed, and you are welcome to reuse and adapt + these materials for any non-commercial purposes (if you would like to + use them for a commercial purpose, please + contact David Evans + for more information). + +If you do adapt or use these materials, please include a credit like +"Adapted from materials developed for University of Virginia cs216 by +David Evans. This guide was revised for cs216 by David Evans, based on +materials originally created by Adam Ferrari many years ago, and since +updated by Alan Batson, Mike Lack, and Anita Jones." and a link back to +this page. + +
+ + + + + + +
+ +CS216: Program and Data Representation
+University of Virginia
+
+ +David Evans
+evans@cs.virginia.edu
+Using these Materials +
+
+ + + + + + + + + + + + + diff --git a/media/refs/x86regs.svg b/media/refs/x86regs.svg new file mode 100644 index 000000000..3f8e41be0 --- /dev/null +++ b/media/refs/x86regs.svg @@ -0,0 +1,596 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/site/2023.html b/site/2023.html index 7507174c8..0990f75dd 100644 --- a/site/2023.html +++ b/site/2023.html @@ -10,6 +10,7 @@

2023-08-06 Lambdas

So, the first pass of review for Wiktopher is done.

Implementing the recent changes to Varvara in Oquonie, I noticed how many single-purpose labels I used merely to hop over short lengths of code, enough that, having run out of ideas for names to call them, I would default to things such as &skip, or &continue. The solution was to create anonymous labels, and so as to be capable of nesting them, I ended up inadvertently adding lambdas to Uxntal, which has drastically improved code readability, and as a side effect allowed for the rapid creation of tree data-structures.

diff --git a/site/assembly.html b/site/assembly.html index 307f4a098..516ae2d8a 100644 --- a/site/assembly.html +++ b/site/assembly.html @@ -2,7 +2,7 @@ XXIIVV — assembly
XXIIVV
-
6502 Development in Acme
6502 Development in Acme

6502 Assembly is the language used to program the Famicom, BBC Micro and Commodore 64 computers.

+
6502 Development in Acme
6502 Development in Acme

6502 Assembly is the language used to program the Famicom, BBC Micro and Commodore 64 computers.

Assembly is any low-level programming language in which there is a very strong correspondence between the instructions in the language and the architecture's machine code instructions. An assembler translates the assembly language syntax into their numerical equivalents.

diff --git a/site/bicycle.html b/site/bicycle.html index 2692595b0..260e09607 100644 --- a/site/bicycle.html +++ b/site/bicycle.html @@ -2,7 +2,7 @@ XXIIVV — bicycle
XXIIVV
-
Varvara's Bicycle Interpreter
Varvara's Bicycle Interpreter

Bicycle is an interactive Uxntal playground.

+
The Bicycle Interpreter
The Bicycle Interpreter

Bicycle is an interactive Uxntal playground.

Bicycle is a little Uxntal interpreter designed to teach the language in front of an audience. Bicycle allows you to diff --git a/site/calendar.html b/site/calendar.html index 6e1d9b25d..52a51a40d 100644 --- a/site/calendar.html +++ b/site/calendar.html @@ -7,7 +7,8 @@

This wiki uses the Arvelie time format, where the year is divided into 26 periods, or months, of 14 days, numbered from A to Z. The initial logging year and the Arvelie dates count upward from 2006. You can see more updates in the journal and now pages.

17S11 talk — Strange Loop 2023, St. Louis
-17M04 canada — Sail to Princess Louisa, Canada
+
16P09 varvara — Varvara Specs Ver.1
+
17M04 canada — Sail to Princess Louisa, Canada
17H00 oquonie — Oquonie Uxn Release
17E12 talk — Biosonic 2023, Galiano Island
17D01 alicef — Lovebyte Demoscene Party 2023
diff --git a/site/index.html b/site/index.html index caa051798..67771d3a1 100644 --- a/site/index.html +++ b/site/index.html @@ -303,6 +303,7 @@

  • programming
  • programming languages
  • assembly
  • +
  • x86
  • forth
  • lisp
  • secd
  • diff --git a/site/journal.html b/site/journal.html index dec82056e..d1937cfa7 100644 --- a/site/journal.html +++ b/site/journal.html @@ -2,7 +2,7 @@ XXIIVV — journal
    XXIIVV
    -
    Varvara's Bicycle Interpreter
    bicycle — Varvara's Bicycle Interpreter
    +
    The Bicycle Interpreter
    bicycle — The Bicycle Interpreter
    In Princess Louisa Inlet
    hundred rabbits — In Princess Louisa Inlet
    Moored in Princess Louisa Inlet
    pino — Moored in Princess Louisa Inlet
    Sail to Princess Louisa, Canada
    canada — Sail to Princess Louisa, Canada
    diff --git a/site/now.html b/site/now.html index 7940b079c..dde4802e8 100644 --- a/site/now.html +++ b/site/now.html @@ -26,6 +26,7 @@

    2023-08-06 Lambdas

    So, the first pass of review for Wiktopher is done.

    Implementing the recent changes to Varvara in Oquonie, I noticed how many single-purpose labels I used merely to hop over short lengths of code, enough that, having run out of ideas for names to call them, I would default to things such as &skip, or &continue. The solution was to create anonymous labels, and so as to be capable of nesting them, I ended up inadvertently adding lambdas to Uxntal, which has drastically improved code readability, and as a side effect allowed for the rapid creation of tree data-structures.

    diff --git a/site/postscript.html b/site/postscript.html index f2624e979..651881de3 100644 --- a/site/postscript.html +++ b/site/postscript.html @@ -344,6 +344,6 @@

    koch.ps

  • tutorials
  • -

      incoming dotgrid

      +

        incoming dotgrid uxntal lambdas

        \ No newline at end of file diff --git a/site/sitemap.html b/site/sitemap.html index 996076757..ff92399cc 100644 --- a/site/sitemap.html +++ b/site/sitemap.html @@ -292,6 +292,7 @@
      • programming
      • programming languages
      • assembly
      • +
      • x86
      • forth
      • lisp
      • secd
      • diff --git a/site/uxntal_lambdas.html b/site/uxntal_lambdas.html index 39aa5dad1..3bbdcb126 100644 --- a/site/uxntal_lambdas.html +++ b/site/uxntal_lambdas.html @@ -5,7 +5,8 @@

        Anonymous Functions in Uxntal

        In the context of Uxntal, lambdas are unlabeled inline routines delimited by -curlies. A lambda block is jumped over, and a pointer to the start of the +curlies, not unlike Postscript's anonymous +procedures. A lambda block is jumped over, and a pointer to the start of the lambda is pushed to the top of the return stack. The body of the lambda can be unquoted with the STH2r and JSR2 opcodes.

        diff --git a/site/uxntal_syntax.html b/site/uxntal_syntax.html index 6a9351122..4f215be4d 100644 --- a/site/uxntal_syntax.html +++ b/site/uxntal_syntax.html @@ -162,7 +162,8 @@

        Example

        Anonymous Functions in Uxntal

        In the context of Uxntal, lambdas are unlabeled inline routines delimited by -curlies. A lambda block is jumped over, and a pointer to the start of the +curlies, not unlike Postscript's anonymous +procedures. A lambda block is jumped over, and a pointer to the start of the lambda is pushed to the top of the return stack. The body of the lambda can be unquoted with the STH2r and JSR2 opcodes.

        diff --git a/site/varvara.html b/site/varvara.html index 21ab42686..27be1ddfa 100644 --- a/site/varvara.html +++ b/site/varvara.html @@ -652,6 +652,6 @@

        Datetime Device mask 0x07ff

        -

          incoming donsol roms left noodle nasu nebu adelie potato metadata metadata metadata oquonie paradise basic basic icn format chr format gly format ufx format tga format chip8 uxn uxntal syntax drifblim beetbug beetbug devlog computer

          +
          • 16P09 — Varvara Specs Ver.1

          incoming donsol roms left noodle nasu nebu adelie potato metadata metadata metadata oquonie paradise basic basic icn format chr format gly format ufx format tga format chip8 uxn uxntal syntax drifblim beetbug beetbug devlog computer

          \ No newline at end of file diff --git a/site/x86.html b/site/x86.html new file mode 100644 index 000000000..6357e9800 --- /dev/null +++ b/site/x86.html @@ -0,0 +1,41 @@ + + +XXIIVV — x86 +
          XXIIVV
          +

          x86 assembly

          + +

          To view the disassembly of a binary:

          + +
          +objdump [-Mintel] -d inline
          +
          + +

          ..

          + +
          +Label1:
          +	mov ax, bx
          +	;mov cx, ax   ; possibly _overwriting_ some needed value?
          +
          + +

          Inline

          + +
          +#include <stdio.h>
          +int main() {
          +	int src = 1, dst;
          +	/* AT&T operand order: copy src into dst, then add 1 to dst. */
          +	asm ("mov %1, %0;" "add $1, %0;" : "=r" (dst) : "r" (src));
          +	printf("%d\n", dst);
          +	return 0;
          +}
          +
          + + + + + +
            + + \ No newline at end of file diff --git a/src/htm/2023.htm b/src/htm/2023.htm index ce2ee62c7..518800320 100644 --- a/src/htm/2023.htm +++ b/src/htm/2023.htm @@ -6,6 +6,7 @@

            2023-08-06 Lambdas

            So, the first pass of review for Wiktopher is done.

            Implementing the recent changes to Varvara in Oquonie, I noticed how many single-purpose labels I used merely to hop over short lengths of code, enough that, having run out of ideas for names to call them, I would default to things such as &skip, or &continue. The solution was to create anonymous labels, and so as to be capable of nesting them, I ended up inadvertently adding lambdas to Uxntal, which has drastically improved code readability, and as a side effect allowed for the rapid creation of tree data-structures.

            diff --git a/src/htm/calendar.htm b/src/htm/calendar.htm index eb17f73d1..512ccf353 100644 --- a/src/htm/calendar.htm +++ b/src/htm/calendar.htm @@ -3,7 +3,8 @@

            The Calendar shows upcoming and past events from the journal.

            This wiki uses the Arvelie time format, where the year is divided into 26 periods, or months, of 14 days, numbered from A to Z. The initial logging year and the Arvelie dates count upward from 2006. You can see more updates in the journal and now pages.

            17S11 talk — Strange Loop 2023, St. Louis
            -17M04 canada — Sail to Princess Louisa, Canada
            +
            16P09 varvara — Varvara Specs Ver.1
            +
            17M04 canada — Sail to Princess Louisa, Canada
            17H00 oquonie — Oquonie Uxn Release
            17E12 talk — Biosonic 2023, Galiano Island
            17D01 alicef — Lovebyte Demoscene Party 2023
            diff --git a/src/htm/journal.htm b/src/htm/journal.htm index fdd65f4a0..c790849bf 100644 --- a/src/htm/journal.htm +++ b/src/htm/journal.htm @@ -1,4 +1,4 @@ -

            Varvara's Bicycle Interpreter
            bicycle — Varvara's Bicycle Interpreter
            +
            The Bicycle Interpreter
            bicycle — The Bicycle Interpreter
            In Princess Louisa Inlet
            hundred rabbits — In Princess Louisa Inlet
            Moored in Princess Louisa Inlet
            pino — Moored in Princess Louisa Inlet
            Sail to Princess Louisa, Canada
            canada — Sail to Princess Louisa, Canada
            diff --git a/src/htm/sitemap.htm b/src/htm/sitemap.htm index f7fcbe8f4..9f6dabfed 100644 --- a/src/htm/sitemap.htm +++ b/src/htm/sitemap.htm @@ -288,6 +288,7 @@
          • programming
          • programming languages
          • assembly
          • +
          • x86
          • forth
          • lisp
          • secd
          • diff --git a/src/htm/uxntal_lambdas.htm b/src/htm/uxntal_lambdas.htm index 612a10ecc..907c37631 100644 --- a/src/htm/uxntal_lambdas.htm +++ b/src/htm/uxntal_lambdas.htm @@ -1,7 +1,8 @@

            Anonymous Functions in Uxntal

            In the context of Uxntal, lambdas are unlabeled inline routines delimited by -curlies. A lambda block is jumped over, and a pointer to the start of the +curlies, not unlike Postscript's anonymous +procedures. A lambda block is jumped over, and a pointer to the start of the lambda is pushed to the top of the return stack. The body of the lambda can be unquoted with the STH2r and JSR2 opcodes.

            diff --git a/src/htm/x86.htm b/src/htm/x86.htm new file mode 100644 index 000000000..4676abd0a --- /dev/null +++ b/src/htm/x86.htm @@ -0,0 +1,34 @@ +

            x86 assembly

            + +

            To view the disassembly of a binary:

            + +
            +objdump [-Mintel] -d inline
            +
            + +

            ..

            + +
            +Label1:
            +	mov ax, bx
            +	;mov cx, ax   ; possibly _overwriting_ some needed value?
            +
            + +

            Inline

            + +
            +#include <stdio.h>
            +int main() {
            +	int src = 1, dst;
            +	/* AT&T operand order: copy src into dst, then add 1 to dst. */
            +	asm ("mov %1, %0;" "add $1, %0;" : "=r" (dst) : "r" (src));
            +	printf("%d\n", dst);
            +	return 0;
            +}
            +
            + + + + + diff --git a/src/tables/diary/15-19 b/src/tables/diary/15-19 index 8eda06700..7f7ec0eb3 100644 --- a/src/tables/diary/15-19 +++ b/src/tables/diary/15-19 @@ -1,5 +1,6 @@ 17S11+000 talk Strange Loop 2023, St. Louis -16I05-783 bicycle Varvara's Bicycle Interpreter +16P09+000 varvara Varvara Specs Ver.1 +16O05-783 bicycle The Bicycle Interpreter 17M06-815 hundred rabbits In Princess Louisa Inlet 17M05-814 pino Moored in Princess Louisa Inlet 17M04+816 canada Sail to Princess Louisa, Canada diff --git a/src/tables/lexicon b/src/tables/lexicon index 8c48aeb1f..34a4007a5 100644 --- a/src/tables/lexicon +++ b/src/tables/lexicon @@ -288,6 +288,7 @@ 3:programming 4 programming languages 5 assembly +6 x86 5;forth 5:lisp 6 secd