Doing it better 2000-06-08 f
###

> Ha-ha-hang on. The Alpha has a defined endianness in the sense that it's
> big or little endian, and you can find out which from a status register
> somewhere. sMite is just like that.

Eh? What defines the endianness, then? (Yes, this is just like sMite -
endianness is not defined except in terms of the semantics of the
byte-access instructions). Is the status register really telling you the
endianness of the network card (the only device which accesses bytes)?
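
To make the point concrete, here is a minimal sketch (in Python, purely for illustration; the helper name first_byte is hypothetical and has nothing to do with sMite or the Alpha). The word value is identical in both cases; "endianness" only shows up once you ask which byte sits at the lowest address.

```python
def first_byte(word, byteorder, width_bytes=4):
    """Return the byte found at the lowest address when `word` is laid out
    in memory with the given byte order ("little" or "big")."""
    return word.to_bytes(width_bytes, byteorder)[0]


word = 0x01020304
low_if_little = first_byte(word, "little")  # least significant byte comes first
low_if_big = first_byte(word, "big")        # most significant byte comes first
```

Nothing about the word itself changes between the two calls; only the byte-access rule does.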

> > To worry about low-level access to memory, you have to have a really quite
> > specialised motivation, and I think you need to make sure that you do and
> > that you understand that you do.
>
> I do: my specialised motivation is to model processors.

Ah, so you're not actually going to use it then? :-) You could model
processors quite happily, and perhaps even more accurately, if you
modelled memory as an array of words, and forgot about bytes, and if you
fixed a word width, albeit with a parameter.
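
Something like the following rough sketch is what I have in mind (Python for brevity; the class name WordMemory and its methods are made up for illustration, not part of sMite). Memory is just an array of words, the width is a parameter, and bytes never appear.

```python
class WordMemory:
    """Memory modelled as an array of fixed-width words, with no byte access."""

    def __init__(self, size_in_words, word_bits=32):
        self.word_bits = word_bits            # the word width is a parameter
        self.mask = (1 << word_bits) - 1      # all-ones value of that width
        self.words = [0] * size_in_words

    def load(self, index):
        """Read the word at word index `index`."""
        return self.words[index]

    def store(self, index, value):
        """Write `value` at word index `index`, wrapped to the word width."""
        self.words[index] = value & self.mask


mem = WordMemory(size_in_words=4, word_bits=64)
mem.store(0, -1)   # stored as the all-ones 64-bit word
```

Since addresses are word indices, the question of an unaligned access cannot even be asked in this model.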

Your motivation permits you to abandon any attempt to access a word that
is not 64-bit aligned, in the unlikely event that this leads to a better
implementation. Your motivation does not require the abstract model to map
onto the concrete one in any particular way. It says you can ignore other
things on the computer, such as C programs compiled into native code.

Most processors are quite similar, and you have indeed found a portable
language whose mapping is near-as-dammit trivial onto most of them, and it
seems unlikely that its implementation will be improved by greater
abstraction. However, in the places where processors are not similar, you
should not be surprised that the mapping is abstract, and that the image
of sMite does not cover the entire native instruction set.

> > I'm still not quite sure whether you do see sMite as a utility to be
> > called from C, or as the basis for Tau, which is why I asked for a
> > paragraph about your motivation.
>
> Both. But it's only a basis for Tau for raw code execution. e.g. it's not
> the same sort of thing as MIN; you could easily implement MIN in
> sMite. It *is* the same sort of thing as an x86 or Alpha, or Cintcode,
> i.e. a code target. It's just that, unlike the real chips, but like
> Cintcode, it provides binary portability (at a low level, and in ways
> which you can violate).

I have to object that binaries are not portable between sMite and
Cintcode. I know what you mean, but your analogy is confused.

> > You are muddying the semantics every time you mention bytes or word
> > widths. These considerations allow you to write code whose behaviour is
> > different on different platforms. A conventional CPU is at a lower level
> > than this: it completely specifies the semantics, or at least makes it
> > absolutely clear where they are undefined.
>
> Got it. Well, I just give the user/designer/programmer some control over
> the tradeoff. I think that's better than either being completely defined
> or completely undefined. Indeed, processors are sometimes designed in a
> parametrised way too (can't think of examples OOTOMH).

HP Risc, IIRC.

> > If you have to worry about whether a pointer is to a sMite-endian or
> > native-endian word, you will never get such a clear distinction (it could
> > be undecidable whether the semantics is defined!). Superficial syntax
> > aside, it is precisely this sort of programmer-beware approach which makes
> > C a higher-level language than machine code (by which I mean more
> > abstract, not easier to write!).
>
> Fine, so build systems with sMite that insist on using sMite-endianness,
> and converting with external libraries. But I think a better approach is
> to compile those libraries to sMite too and only enforce the
> sMite-endianness at a higher level. This gives better reuse.

Now I've got it too. You have two languages, one of which is like C, and
one of which is like a processor. They are extremely similar, and both
called sMite, except that one has muddy semantics, and the other has a
couple of extra restrictions.

> Of course you should always isolate tricky assumptions. e.g. in a Forth
> compiler written for sMite I would have the normal C@ and C! words to
> read and write bytes. These would be the only place I'd have to worry
> about endianness if I factored properly.

That assumes you want to compile Forth to the muddier version of sMite.
I'd compile C@ and C! to bit shifts, within the restricted version of
sMite, so as to avoid endianness issues. Effectively, the programmer's
model within Forth would then define an endianness.
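
As a sketch of what I mean (Python rather than sMite or Forth, and the names c_fetch, c_store and WORD_BITS are hypothetical), memory holds only words, and the byte operations are built from shifts and masks. The choice that byte 0 is the least significant byte of each word is an assumption I make here; it is exactly the place where the programmer's model fixes an endianness.

```python
WORD_BITS = 32                      # assumed word width, for illustration only
BYTES_PER_WORD = WORD_BITS // 8


def c_fetch(words, byte_addr):
    """C@ : read one byte from word-addressed memory using shifts and masks."""
    index, offset = divmod(byte_addr, BYTES_PER_WORD)
    return (words[index] >> (8 * offset)) & 0xFF


def c_store(words, byte_addr, value):
    """C! : write one byte, leaving the other bytes of the word untouched."""
    index, offset = divmod(byte_addr, BYTES_PER_WORD)
    shift = 8 * offset
    cleared = words[index] & ~(0xFF << shift)      # zero the target byte
    words[index] = cleared | ((value & 0xFF) << shift)


memory = [0x11223344, 0]
c_store(memory, 1, 0xAB)            # memory[0] is now 0x1122AB44
```

The result is the same on any host, because the native byte order is never consulted.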

> > Maybe you don't really want something that is like a CPU, though. Maybe
> > you want something that is like C. Motivation crises R us.
>
> I just think I can have my cake and eat it. The nice thing is that if I'm
> wrong I won't have to change the design, just the restrictions. I probably
> won't even have to change the implementation!

That is indeed a nice thing.

I feel we're making progress here. I certainly understand the issues
better.

Alistair