Understanding Extended Inline Assembly

**Absurd** · 10-19-2015

I'm having a really hard time understanding the concept of Extended Inline Assembly.
Every tutorial I found online did not attempt to explain what exactly is going on there, but rather only described the syntax.

Take, for example, the following snippet from here:

Code:

int no = 100, val;
asm ("movl %1, %%ebx;"
     "movl %%ebx, %0;"
     : "=r" (val)   /* output */
     : "r" (no)     /* input */
     : "%ebx"       /* clobbered register */
    );

It then goes on to say:

In the above example, "val" is the output operand, referred to by %0 and "no" is the input operand, referred to by %1. "r" is a constraint on the operands, which says to GCC to use any register for storing the operands.

This is very confusing for several reasons:

First, who's to say that "val" is referred to by %0 and "no" is referred to by %1? Why not the other way around? It doesn't say anything about order there.

Second, and more importantly: I don't understand the notion of "input operands" and "output operands" in that context (or in any other context, for that matter).
I know what operands are, and I know what input and output are, but there is no context in which "output operand" sounds reasonable to me.
Furthermore, there is no function nor operation here (just two movl instructions), so "input" into where, and "output" to where?

With love and peace,
Absurd.

**Nominal Animal** · 10-19-2015

We're talking about GCC extended inline assembly here.

Originally Posted by Absurd

First, who's to say that "val" is referred to by %0 and "no" is referred to by %1? Why not the other way around?

Operands are numbered starting at 0: first the output operands, in the order they appear in the output operand list, then the input operands.

For example, if you have two output operands and three input operands, the first output operand will be %0 and the second output operand %1, with the input operands being %2, %3, and %4, in the order you declare them.

The order in which you use the operands in your assembly does not matter at all; the order is determined by the declarations only.
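To make the numbering concrete, here is a minimal sketch with two output operands and three input operands. The function name sum_and_first and its variables are made up for illustration; the snippet is x86-only and uses AT&T syntax. One thing not covered above: the "&" in "=&r" marks an output as "early clobber", which is needed here because the outputs are written before all of the inputs have been read.

```c
/* Hypothetical helper, not from the tutorial: shows operand numbering with
 * two outputs (%0, %1) and three inputs (%2, %3, %4). x86 only. */
static void sum_and_first(int a, int b, int c, int *sum, int *first)
{
    int s, f;
    __asm__ ( "movl %2, %0\n\t"    /* s  = a   (%0 = 1st output, %2 = 1st input) */
              "addl %3, %0\n\t"    /* s += b   (%3 = 2nd input) */
              "addl %4, %0\n\t"    /* s += c   (%4 = 3rd input) */
              "movl %2, %1\n\t"    /* f  = a   (%1 = 2nd output) */
              : "=&r" (s), "=&r" (f)        /* outputs: %0, %1 */
              : "r" (a), "r" (b), "r" (c)   /* inputs:  %2, %3, %4 */
              : "cc"                        /* addl modifies the flags */
            );
    *sum = s;
    *first = f;
}
```

Swapping the two entries in the output list would swap what %0 and %1 refer to, without touching the assembly template at all.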

Originally Posted by Absurd

I don't understand the notion of "input operands" and "output operands" in that context (or in any other context, for that matter).

In this context, each input operand is

"constraint" (expression)

with commas separating multiple operands. The expression specifies some variable or expression the compiler must prepare to be accessible to your assembly snippet, with constraint telling the compiler where to put it.

Each output operand is

"constraint" (variable)

where constraint tells the compiler where a result you're interested in may be, and variable names the variable (or expression!) the compiler should move the value to.
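Applied to the snippet from the first post, this answers "input into where, output to where": the assembly snippet itself is the operation, and the operands are how values cross between C and the assembly. A rough sketch, with the snippet wrapped in a made-up function copy_via_ebx (x86 only); the comments describe what the compiler arranges around the template:

```c
/* The tutorial snippet wrapped in a hypothetical function (x86 only). */
static int copy_via_ebx(int no)
{
    int val;
    /* Before the asm: the compiler makes sure `no` is in some register of
     * its choosing and substitutes that register's name for %1. */
    __asm__ ( "movl %1, %%ebx\n\t"   /* input register -> ebx */
              "movl %%ebx, %0\n\t"   /* ebx -> output register */
              : "=r" (val)           /* output: result ends up in a register */
              : "r" (no)             /* input: value provided in a register */
              : "%ebx"               /* ebx is overwritten by the snippet */
            );
    /* After the asm: the compiler treats the register it substituted
     * for %0 as the new value of `val`. */
    return val;
}
```

So "input" means C to assembly, and "output" means assembly back to C.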

Consider the i386 movzbl (%edi), %eax instruction, which reads one byte from the memory address pointed to by the edi register into the eax register, zero-extending the value (clearing all high bits).

If we look at the i386 mnemonics, we can see that the format allows any general register to be used as the address, and any general register as the value, so we can use the =r constraint for output and r for input operands.

We could use =a and D constraints, respectively, to get the exact form above (=A would name the edx:eax register pair, not eax alone), but why restrict the compiler?

The inline assembly snippet would be

Code:

int read_byte(const void *const address)
{
    int result;
    __asm__ __volatile__ ( "movzbl (%1), %0\n\t"
                         : "=r" (result) /* output operand */
                         : "r" (address) /* input operand */
                         : "memory" /* the asm reads memory the compiler cannot see */
                         );
    return result;
}

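The __volatile__ tells the compiler not to optimize the assembly snippet away or merge repeated copies of it. By itself it does not stop the compiler from moving other code across the snippet; a "memory" clobber is what tells the compiler that the snippet accesses memory not listed in the operands. Here, it does not matter, really.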
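For comparison, here are two variants of the same function, as a sketch only (the names read_byte_fixed and read_byte_mem are made up, x86 only). The first pins the operands to the registers from the movzbl (%edi), %eax form using the a and D constraints. The second uses an "m" (memory) input operand, so the compiler knows exactly which byte is being read and substitutes a suitable addressing expression for %1 itself.

```c
/* Hypothetical variants of read_byte(), x86 only. */

/* "=a" pins the output to eax; "D" pins the input to edi (rdi on x86-64). */
static int read_byte_fixed(const void *const address)
{
    int result;
    __asm__ __volatile__ ( "movzbl (%1), %0\n\t"
                         : "=a" (result)   /* output must be in eax */
                         : "D" (address)   /* address must be in (e/r)di */
                         : "memory"        /* the asm reads memory behind the pointer */
                         );
    return result;
}

/* "m" passes the byte itself as a memory operand instead of its address. */
static int read_byte_mem(const void *const address)
{
    int result;
    __asm__ ( "movzbl %1, %0\n\t"
            : "=r" (result)
            : "m" (*(const unsigned char *)address)  /* the byte being read */
            );
    return result;
}
```

Both should return the same values as read_byte(); the difference is only in how much freedom and information the compiler gets.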

**Absurd** · 10-19-2015

Got it! Thank you.

One question though:

Originally Posted by Nominal Animal

Each output operand is

"constraint" (variable)

where constraint tells the compiler where a result you're interested in may be, and variable names the variable (or expression!) the compiler should move the value to.

If the output operand determines where the compiler will store the result, then how can it be an expression? Doesn't it have to be an lvalue?

**Nominal Animal** · 10-20-2015

Originally Posted by Absurd

Doesn't [output operand] have to [refer to] an lvalue?

Yes, it does; well spotted.

It is usually a variable name, but it can also be an expression that evaluates to an lvalue. For example:

Code:

struct foo {
    /* Stuff */
    int *a;
    /* Other stuff */
};

void foo_get(struct foo *const f, const void *const from)
{
    __asm__ __volatile__ ( "movzbl (%1), %0\n\t"
                         : "=r" (*(f->a)) /* output operand */
                         : "r" (from) /* input operand */
                         : "memory" /* the asm reads memory the compiler cannot see */
                         );
}

This copies the byte, zero-extended to an int, to wherever the a pointer in the foo structure points. The output operand *(f->a) is an expression that is an lvalue.

(I'm not sure if f->a[0] would be easier on the eyes; I wanted to make it explicit that we need to dereference the pointer to make it the correct lvalue -- just as if we were to write e.g. *(f->a) = 0; to assign a zero to the same place the assembly snippet copies the data to.)
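Another lvalue expression that works the same way is an array element. A small sketch, assuming a made-up helper store_byte_at (x86 only); the compiler picks a register for %0 and stores it into out[index] after the snippet:

```c
#include <stddef.h>

/* Hypothetical helper, not from the posts above (x86 only): the output
 * operand is the array element out[index], an expression that is an lvalue. */
static void store_byte_at(int *const out, const size_t index,
                          const void *const from)
{
    __asm__ __volatile__ ( "movzbl (%1), %0\n\t"
                         : "=r" (out[index])  /* output operand: lvalue expression */
                         : "r" (from)         /* input operand */
                         : "memory"           /* the asm reads the byte at `from` */
                         );
}
```

Something like out[index] + 1 would be rejected as an output operand, for the same reason you cannot write out[index] + 1 = 0; in C.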

**Absurd** · 10-20-2015

Thanks again for teaching me some more new stuff. Appreciate it.
