On June 29th, 2003, I announced that I was taking a break from OpenBSD, which was ``likely to last at least three months''. I definitely needed time to cool off and step back.
But although I was stepping back, I could not stop tinkering with the OpenBSD source code, and kept writing (and sharing) minor bugfixes.
In mid-July, I decided to reduce my time off to two months and come back on September 1st.
I realized this was the low-pressure time I needed to try and fix the compiler bugs. Either I would be successful and the mvme88k port would have a future, or (more likely) I would fail miserably, and no one would ever know about it and be disappointed.
I started from the assertion failure in uvm with the gcc 2.95 compiled kernel.
By July 15th, I had been able to produce a simple test case out of it.
Date: Tue, 15 Jul 2003 17:48:58 +0000
From: Miod Vallat
To: Marc Espie, Thierry Deval, Henning Brauer, Nick Holland
Subject: fun with gcc
No comments.
miod@arzon OpenBSD/mvme88k [/users/miod] $ uname -a
OpenBSD arzon 3.3 GENERIC#73 mvme88k
miod@arzon OpenBSD/mvme88k [/users/miod] $ gcc31 -v
Using builtin specs.
Reading specs from /usr/lib/gcc-lib/m88k-unknown-openbsd3.1/specs
gcc version 2.95.3 20010125 (prerelease)
miod@arzon OpenBSD/mvme88k [/users/miod] $ cat assert.c
#include <stdio.h>
#include <sys/types.h>
#define KASSERT(e) ((e) ? (void) 0 : __assert( __FUNCTION__, #e))
void
__assert(const char *funcname, const char *error)
{
printf("Assertion failed in %s: %s\n", funcname, error);
/* exit(0); */
}
void *
uvm_pagealloc_strat(void *obj, u_int64_t off, void *anon, int flags, int strat,
int free_list)
{
KASSERT(anon == NULL);
return obj;
}
void *
uvm_pagealloc(void *obj, u_int64_t off, void *anon, int flags)
{
KASSERT(anon == NULL);
return obj;
}
main()
{
char *pg;
char *kobj = "kobj";
pg = uvm_pagealloc(kobj, 0, NULL, 0);
pg = uvm_pagealloc(kobj, 0, NULL, 0);
pg = uvm_pagealloc_strat(kobj, 0, NULL, 0, 0, 0);
pg = uvm_pagealloc(kobj, 0, NULL, 0);
}
miod@arzon OpenBSD/mvme88k [/users/miod] $ gcc31 -O0 -o assert assert.c
miod@arzon OpenBSD/mvme88k [/users/miod] $ ./assert
Assertion failed in uvm_pagealloc: anon == NULL
miod@arzon OpenBSD/mvme88k [/users/miod] $ gcc -v
Reading specs from /usr/lib/gcc-lib/m88k-unknown-openbsd2.5/2.8.1/specs
gcc version 2.8.1
miod@arzon OpenBSD/mvme88k [/users/miod] $ gcc -O0 -o assert assert.c
miod@arzon OpenBSD/mvme88k [/users/miod] $ ./assert
miod@arzon OpenBSD/mvme88k [/users/miod] $
A few hours later, I had been able to shrink my test case even more, then understand the cause of the bug, and fix it.
Date: Tue, 15 Jul 2003 23:33:40 +0000
From: Miod Vallat
To: Marc Espie, Thierry Deval, Steve Murphree, Paul Weissmann, Theo de Raadt
Subject: One less gcc bug on m88k...
gcc 2.8:
$ grep FUNCTION_ARGS_ADVANCE *
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, TYPE_MODE (type), type,
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, Pmode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
function.c: FUNCTION_ARG_ADVANCE (args_so_far, promoted_mode,
gcc 2.95:
$ grep FUNCTION_ARGS_ADVANCE *
calls.c: FUNCTION_ARG_ADVANCE (*args_so_far, TYPE_MODE (type), type,
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, Pmode, (tree) 0, 1);
calls.c: FUNCTION_ARG_ADVANCE (args_so_far, mode, (tree) 0, 1);
function.c: FUNCTION_ARG_ADVANCE (args_so_far, promoted_mode,
Note how the first call uses a pointer dereference now? Can you already
guess what the bug description below is?
$ cd config/m88k
$ head -1069 m88k.h | tail -18
/* A C statement (sans semicolon) to update the summarizer variable
CUM to advance past an argument in the argument list. The values
MODE, TYPE and NAMED describe that argument. Once this is done,
the variable CUM is suitable for analyzing the *following* argument
with `FUNCTION_ARG', etc. (TYPE is null for libcalls where that
information may not be available.) */
#define FUNCTION_ARG_ADVANCE(CUM, MODE, TYPE, NAMED) \
do { \
enum machine_mode __mode = (TYPE) ? TYPE_MODE (TYPE) : (MODE); \
if ((CUM & 1) \
&& (__mode == DImode || __mode == DFmode \
|| ((TYPE) && TYPE_ALIGN (TYPE) > BITS_PER_WORD))) \
CUM++; \
CUM += (((__mode != BLKmode) \
? GET_MODE_SIZE (MODE) : int_size_in_bytes (TYPE)) \
+ 3) / 4; \
} while (0)
Note how CUM is unprotected, especially in CUM++ ...
What did this produce in practice? Well, the m88k calling convention
mandates that the arguments are passed in registers r2-r9, and if this
is not enough, on the stack. It also mandates that, if an argument can
not fit in one register (float [I really meant to write "double" here], or int64_t), it gets put in two
consecutive registers starting at an even number (so that double word
load and store opcodes can be used) - the CUM++ test logic is to know
whether we have to waste an unused odd-numbered register to respect
this.
In my test case, I used the following function prototypes:
void even64(int oddmaker, int evenmaker, u_int64_t stamper, int value);
void odd64(int oddmaker, u_int64_t stamper, int value);
Their calling conventions would be...
even64:
r2 - oddmaker
r3 - evenmaker
r4, r5 - stamper
r6 - value
and for odd64:
r2 - oddmaker
r3 - unused
r4, r5 - stamper
r6 - value
Compiling a call to odd64() would trigger the CUM++ statement from the
call in calls.c using the pointer. One more bug caused by the
preprocessor on apparently correct code, especially since it had been
working in previous gcc versions...
As a result, CUM would end up with a very huge (semi-random) value,
which would cause the register allocator to consider it had exhausted
the r2-r9 range, and use this calling convention:
r2 - oddmaker
r3 - unused
r4, r5 - stamper
stack - value
When applied to the kernel, this would cause any kernel compiled by gcc
2.95 to die horribly in the first few uvm KASSERT macros...
[...]
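As a rough illustration of the register assignment rule described in that mail, here is a simplified model. It is not the actual gcc code: assign_arg() and its constants are made-up names for this sketch, and it assumes that an argument which no longer fits in r2-r9 goes entirely to the stack.

```c
#define FIRST_ARG_REG 2   /* arguments start in r2 */
#define NUM_ARG_REGS  8   /* r2 through r9 */

/*
 * Assign one argument (hypothetical helper).
 * `cum' counts the register slots consumed so far; `words' is 1 for a
 * 32-bit argument and 2 for a 64-bit one.
 * Returns the number of the first register used, or -1 when the
 * argument has to be passed on the stack.
 */
int
assign_arg(int *cum, int words)
{
    int first;

    /* A register pair must start on an even register: waste one slot. */
    if (words == 2 && (*cum & 1))
        (*cum)++;

    first = *cum;
    *cum += words;

    if (first + words > NUM_ARG_REGS)
        return -1;
    return FIRST_ARG_REG + first;
}
```

Starting from a counter of 0, the sequence (int, 64-bit, int) yields r2, r4 and r6, with r3 left unused, while a counter holding a huge bogus value sends everything to the stack.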
(If you're not familiar with the way C preprocessor macros work, they perform a direct substitution of their arguments when "invoked". So when CUM was substituted with *args_so_far in the first use in calls.c, the payload of the first if statement in the macro expansion, intended to skip an odd-numbered register if the argument to pass to the function would be passed in a register pair, would be *args_so_far++, which would increment the pointer, but not the value it points to, instead of the intended (*args_so_far)++, which would increment the value it points to and leave the pointer unmodified.
Because of this, not only would the value of CUM be incorrect, but we would slowly corrupt the compiler's own memory. Apparently this was benign enough to not cause it to crash.)
This was, of course, trivial to fix, by making sure all the arguments of all the macros in the m88k backend were put in parentheses every time they were used, i.e. writing (CUM) instead of CUM. In fact, all other gcc backends had been fixed that way, but for whatever reason, the m88k backend hadn't.
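The difference between the two spellings can be reproduced outside of gcc with a miniature macro. This is only a sketch: ADVANCE_UNSAFE, ADVANCE_SAFE and the two wrapper functions are names invented for the example, and the macro body is much simpler than the real FUNCTION_ARG_ADVANCE.

```c
/* Miniature of the broken macro: CUM is pasted without parentheses. */
#define ADVANCE_UNSAFE(CUM, WORDS) \
    do { \
        if ((CUM & 1) && (WORDS) == 2) \
            CUM++; \
        CUM += (WORDS); \
    } while (0)

/* Same logic with every use of the argument parenthesized. */
#define ADVANCE_SAFE(CUM, WORDS) \
    do { \
        if (((CUM) & 1) && (WORDS) == 2) \
            (CUM)++; \
        (CUM) += (WORDS); \
    } while (0)

/* Both helpers return how far the pointer has moved from slots[0]. */
int
advance_unsafe(int *slots)
{
    int *p = slots;

    ADVANCE_UNSAFE(*p, 2);  /* CUM++ becomes *p++, that is *(p++) */
    return (int)(p - slots);
}

int
advance_safe(int *slots)
{
    int *p = slots;

    ADVANCE_SAFE(*p, 2);    /* (CUM)++ becomes (*p)++ */
    return (int)(p - slots);
}
```

Given a two-element array { 1, 100 }, the unsafe version moves the pointer by one element and adds 2 to the second element, leaving the counter itself untouched. The safe version leaves the pointer alone and turns the counter into 4.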
Fixing this gave me a kernel which booted without failing assertions upon startup.
I was however not willing to trust that compiler and that kernel yet, and rebooted on the old kernel compiled with gcc 2.8.1 I had been using.
Before I could start trusting anything built with gcc 2.95, I wanted it to be able to rebuild itself.
In fact, building gcc 2.95, with the macro fix, using gcc 2.8.1, would work. Then attempting to build gcc 2.95 again, this time with itself, would fail quickly, with one of its helper programs, genrecog, freshly built, dumping core.
The genrecog tool is used to generate the insn-recog.c file from the machine-dependent backend information, in our case config/m88k/m88k.md. It shares a lot of code with the other genfoo tools which extract various information from the m88k.md machine description file. I could very quickly track the breakage down to genrecog.c itself: compiling every other part of genrecog with gcc 2.95, and genrecog.c alone with gcc 2.8, would produce a working genrecog binary.
Also, apparently, that genrecog.c miscompilation was the only thing preventing gcc 2.95 from recompiling itself (with optimization disabled, at this point.) For a while, I procrastinated on this by using gcc 2.8.1 to compile that particular file, but there was nevertheless a bug waiting to be taken care of.
genrecog emits a lot of information, in fact C source code, on its standard output. When run manually, it would die very quickly with this output:
$ cd obj
$ ./genrecog /usr/src/gnu/egcs/gcc/config/m88k/m88k.md
/* Generated automatically by the program `genrecog'
from the machine description file `md'. */
#include "config.h"
#include "system.h"
#include "rtl.h"
#include "insn-config.h"
#include "recog.h"
#include "real.h"
#include "output.h"
#include "flags.h"
extern rtx gen_split_1 ();
extern rtx gen_split_2 ();
Memory fault (core dumped)
$
It was simple enough to trace the segfault to this snippet from main():
while (1)
  {
    c = read_skip_spaces (infile);
    if (c == EOF)
      break;
    ungetc (c, infile);
    desc = read_rtx (infile);
    if (GET_CODE (desc) == DEFINE_INSN)
      recog_tree = merge_trees (recog_tree,
                                make_insn_sequence (desc, RECOG));
    else if (GET_CODE (desc) == DEFINE_SPLIT)
      split_tree = merge_trees (split_tree,
                                make_insn_sequence (desc, SPLIT));
    if (GET_CODE (desc) == DEFINE_PEEPHOLE
        || GET_CODE (desc) == DEFINE_EXPAND)
      next_insn_code++;
    next_index++;
  }
make_insn_sequence might produce an insn sequence or whatever. To continue diving into genrecog, all we need to know is what kind of return value it provides. A quick glance at the source will tell that:
static struct decision_head make_insn_sequence PROTO((rtx, enum routine_type));
...
static struct decision_head merge_trees PROTO((struct decision_head,
struct decision_head));
So these functions work on decision_head structures, which are defined as:
/* Data structure for a listhead of decision trees. The alternatives
to a node are kept in a doublely-linked list so we can easily add nodes
to the proper place when merging. */
struct decision_head { struct decision *first, *last; };
Using the best debugging tools of the trade, also known as printf-based debugging, I added traces of the values of these decision_head structs.
Adding traces to genrecog is very easy: since it outputs C code, you just have to output your traces as C comments. Adding simple traces to merge_trees proved very quickly that it was given an invalid argument. But then traces in make_insn_sequence would prove that it would produce valid data! The output with my trace information would end like this:
[...]
extern rtx gen_split_1 ();
/* make_insn_sequence -> 2e000.2e000 */
/* merge_trees <- 0.0 8.4008 */
/* merge_trees -> 8.4008 */
extern rtx gen_split_2 ();
/* make_insn_sequence -> 2e380.2e380 */
/* merge_trees <- 8.4008 18.4018 */
Memory fault (core dumped)
$
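For reference, a trace helper producing this kind of output can be as small as the following. This is a reconstruction for illustration, not the code I actually added to genrecog: format_trace() is a made-up name, and it assumes that pointers fit in an unsigned long.

```c
#include <stdio.h>

struct decision;        /* the tree nodes themselves do not matter here */
struct decision_head { struct decision *first, *last; };

/*
 * Format one decision_head as a C comment, so that the trace can be
 * mixed with the generated C source without breaking it.
 * `what' is free-form text such as "merge_trees ->".
 * Returns whatever snprintf() returns.
 */
int
format_trace(char *buf, size_t len, const char *what,
    struct decision_head h)
{
    return snprintf(buf, len, "/* %s %lx.%lx */", what,
        (unsigned long)h.first, (unsigned long)h.last);
}
```

Writing the resulting buffer to standard output with puts() is enough to interleave the traces with the generated code.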
This was enough to hint that the problem was related to the way short structures were being passed to and returned by functions. Mimicking the decision_head struct layout, I came up with this test program:
#include <stdio.h>
struct maze { const char *item1, *item2; };
struct maze
builder(void)
{
    struct maze m;
    m.item1 = "you are lost";
    m.item2 = "in the maze.";
    printf("builder: m.item1 = %p, item2 = %p\n", m.item1, m.item2);
    return m;
}
void
checker(struct maze m)
{
    printf("checker: m.item1 = %p, item2 = %p\n", m.item1, m.item2);
}
main()
{
    checker(builder());
}
When compiled using gcc 2.8, it would run nicely...
$ gcc28 -O0 -o maze maze.c
$ ./maze
builder: m.item1 = 0x10f8, item2 = 0x1108
checker: m.item1 = 0x10f8, item2 = 0x1108
$
...while, once rebuilt with gcc 2.95, it would misbehave.
$ gcc295 -O0 -o maze maze.c
$ ./maze
builder: m.item1 = 0x10f8, item2 = 0x1108
checker: m.item1 = 0x0, item2 = 0x0
$
Tinkering with the size of the struct being used in this program showed that only structs whose size was between 8 and 32 bytes, inclusive, would not be passed correctly.
Looking at the generated code, gcc 2.8 would produce this code for main (prologue and epilogue omitted):
or r12,r0,r31
bsr _builder
bsr _checker
But gcc 2.95 would emit two more instructions:
ld.d r24,r0,r31
or r12,r0,r31
bsr _builder
st.d r24,r0,r31
bsr _checker
Before we go further, we need a little more information about the m88k calling convention. The standard is to never pass structures in registers, but always on the stack. And if the function itself returns a struct, the address of the returned struct is set in r12 by the caller, which will have allocated the appropriate space. This allows such function calls to be recursive, unlike the older PCC struct return convention, where the space for the struct would be a global anonymous variable in memory.
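Expressed in C rather than in assembly, the two struct return conventions look roughly like this. It is only a sketch of what the compiler does behind the scenes: builder_pcc() and builder_hidden() are invented names, and the explicit result pointer plays the role of r12.

```c
struct maze { const char *item1, *item2; };

/*
 * PCC style: the result lives in a single anonymous static buffer,
 * shared by every call to the function.
 */
static struct maze pcc_result;

struct maze *
builder_pcc(const char *a, const char *b)
{
    pcc_result.item1 = a;
    pcc_result.item2 = b;
    return &pcc_result;
}

/*
 * Caller-allocated style: the caller reserves room for the result and
 * passes its address, so every call gets its own storage.
 */
void
builder_hidden(struct maze *result, const char *a, const char *b)
{
    result->item1 = a;
    result->item2 = b;
}
```

Two successive calls to builder_pcc() return the same address, so the second result overwrites the first one, while two calls to builder_hidden() with distinct result slots do not interfere with each other.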
In the gcc 2.8 code, the compiler already optimized the call flow so that the structure is immediately on the stack frame, thus immediately usable in checker(). After r12 is set to point to the temporary location, builder() and checker() are invoked.
gcc 2.95, on the other hand, adds two extra statements. The first one saves the stack area which is about to be used by builder() in registers r24 and r25. The second statement restores this area immediately before invoking checker(), effectively making it check uninitialized memory!
In fact, if I introduce a temporary variable to store the builder() result, like this:
main()
{
    struct maze m = builder();
    checker(m);
}
Then both gcc 2.8 and gcc 2.95 would produce the same, correct, code:
addu r12,r30,8
bsr _builder
or r13,r0,r31
addu r11,r30,8
ld r12,r11,0
ld r11,r11,4
st r12,r13,0
st r11,r13,4
bsr _checker
In this case, local variable m is found at r30+8; then after builder returns, the address of the struct is computed again in r11 (remember this is without any form of optimization), its two fields are copied to the stack at r31+0, and that address passed to checker in r12. (The contents of m need to be copied to a different place, because C functions are allowed to modify the arguments they receive by value, but for their own use only; therefore they must receive a copy of that value.)
This time, I could not quickly figure out what made gcc 2.95 misbehave, and I asked people with better gcc skills than mine for help.
Date: Sun, 20 Jul 2003 00:09:32 +0000
From: Miod Vallat
To: Anil Madhavapeddy, Hiroaki Etoh, Marc Espie, Niklas Hallqvist
Subject: gcc help needed
Hi,
I am finally seriously working on gcc/m88k, in order to give the
OpenBSD/mvme88k a new chance to exist. I have been finding and fixing a
few issues which makes gcc 2.95 almost working on this platform.
Unfortunately, I hit a showstopper bug, which I don't know how to hunt
right now... Since you guys know gcc internals much better than I do, I
figured I might ask for some of your time on this.
A description of the problem [...]
[...] I have no idea where the two extra assembly
statements in gcc 2.95 come from... knowing where they are generated
would be a good start!
Thanks for your time,
Miod
I could not however simply wait for outside assistance, and kept poking. In particular, that problem was specific to the m88k backend, and my test program passed with flying colours on all hardware platforms I could try it on, from alpha to vax. Investigating the differences between m88k and the other processors was the next logical step.
Because function calls work in very different ways across the different CPUs, with various calling conventions and stack layouts, not even counting register windows, the machine-independent part of gcc tries to offer as much flexibility as possible, letting architecture-dependent configuration files define the different behaviours they provide or expect, using way too many macros.
Some of these macros must be defined for every architecture, while others are only defined if the architecture requires it. The m88k is pretty unique here, since it defines REG_PARM_STACK_SPACE and OUTGOING_REG_PARM_STACK_SPACE (a few other architectures do this as well), as well as STACK_PARMS_IN_REG_PARM_AREA.
What do these macros tell the compiler? Let's quote from the manual:
Nothing really fancy here. The m88k-specific backend indeed will automagically allocate 32 bytes on the stack during the function prologue. Hey, wait, 32 bytes, exactly like the structure size limit found earlier, with larger structures never being clobbered!
A simple experiment is to comment out OUTGOING_REG_PARM_STACK_SPACE. Compiling gcc with these settings will produce a stack-wasting compiler, because every function prologue will now automagically allocate an extra 64 bytes on the stack: 32 from the m88k-specific prologue, and 32 from the architecture-independent prologue code, since it has been now instructed to do so. However, when using this compiler, the problem disappears completely, whichever size the structure is.
Another path worth trying would be to remove this automatic stack allocation, and undefine REG_PARM_STACK_SPACE. However, I am afraid this could break some 88Open rule, or, even worse, some implicit assumption hidden in the m88k-specific backend code.
Now, time to look at STACK_PARMS_IN_REG_PARM_AREA... Quoting the manuals again:
This is very interesting, and also very obscure (you'll know what to blame for your next headache.) No architecture but m88k defines this. It means that the REG_PARM_STACK_SPACE area can be shared by both function call parameters, and local variables.
While gcc 2.8 would not seem to care much about this area being shared, and would assume we know what we are doing, gcc 2.95 seems to be more strict, and will explicitly save any part of this shared area which might be clobbered. And this is exactly what we had witnessed! The area on the stack which has been saved before invoking builder() and restored after it returned is the implicit location for a temporary struct maze.
I could not figure out how to make gcc 2.95 handle this shared area the way gcc 2.8 did, and actually, there were probably very good reasons to change this behaviour. To remain on the safe side, I opted to stop defining STACK_PARMS_IN_REG_PARM_AREA.
The generated code, with gcc 2.95, became:
addu r12,r30,8
bsr _builder
addu r13,r31,32
addu r11,r30,8
ld r12,r11,0
ld r11,r11,4
st r12,r13,0
st r11,r13,4
bsr _checker
which is the same code as when the result of builder() was stored in an explicit temporary variable, except for the stack location: instead of being at the beginning of the REG_PARM_STACK_SPACE area, it is now past this area.
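The effect of that change on the stack layout can be summed up in one line of arithmetic. The function below is only a model of it, with an invented name, and the 32 byte figure is the one quoted earlier for the m88k prologue.

```c
#define REG_PARM_AREA 32    /* bytes reserved on the stack for r2-r9 */

/*
 * Offset, from the stack pointer, of the first argument passed on the
 * stack. `shared' is nonzero when stack arguments are allowed to live
 * inside the register parameter area, which is what defining
 * STACK_PARMS_IN_REG_PARM_AREA used to permit.
 */
int
first_stack_arg_offset(int shared)
{
    return shared ? 0 : REG_PARM_AREA;
}
```

With the area shared, the temporary struct sits at r31+0; with the macro no longer defined, it sits at r31+32, as in the listing above.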
I could tell the people I had asked for help that their help was no longer needed, after only a mere 5 hours... and probably went to sleep immediately afterwards, given the timestamp of that mail.
Date: Sun, 20 Jul 2003 05:11:42 +0000
From: Miod Vallat
To: Anil Madhavapeddy, Hiroaki Etoh, Marc Espie, Niklas Hallqvist
Subject: Re: gcc help needed
[...]
I tracked this down to the machine-dependent backend, related to
REG_PARM_STACK_SPACE(), and I have a working, ugly, workaround until a
decent fix is ready.
Miod
I taunted the other developers a few days later.
Date: Mon, 21 Jul 2003 08:03:15 +0000
From: Miod Vallat
To: private OpenBSD mailinglist
Subject: bliss
$ uname -a
OpenBSD arzon 3.3 GENERIC#73 mvme88k
$ head -39 /usr/share/mk/sys.mk | tail -5
#.if (${MACHINE_ARCH} == "m88k")
#CFLAGS?= -O0 ${PIPE} ${DEBUG}
#.else
CFLAGS?= -O2 ${PIPE} ${DEBUG}
#.endif
$ gcc -v
Reading specs from /usr/lib/gcc-lib/m88k-unknown-openbsd3.3/specs
gcc version 2.95.3 20010125 (prerelease, propolice)
$
and from then on, it was obvious to everyone that I would be coming back soon.
Switching the default compiler flags from -O0 (no optimization at all) to -O2 (the usual set of optimizations), as hinted in the email above, was premature.
First, I needed to make sure I could build a complete OpenBSD/mvme88k userland, with the modified gcc 2.95, and optimization disabled; then, I could start enabling optimization and see how well it would work.
I lost some time trying to get perl to build. For reasons I don't remember, I had not rebuilt perl recently on mvme88k, and still had perl version 5.6.1 around. But in late October 2002, Todd Miller had updated perl to version 5.8. And perl would not build with the new compiler... but also not with the old one.
Eventually I tracked this down to a bug in the m88k-specific parts of libc, and I had to vent a bit.
Date: Tue, 29 Jul 2003 19:04:08 +0000
From: Miod Vallat
To: "Todd C. Miller", Theo de Raadt
Subject: perl on m88k
Todd, since you know Perl very well, how many grumpyness points do you
get for tracking perl's build problems (miniperl exiting with "panic:
top_lev") to a bug in siglongjmp(3)?
Miod
[Now if I could fix that isakmpd regress/x509 link error, i could even
make build without NO_REGRESS]
I came back on August 1st, after only one month of break, and it was business as usual.
My first commit that day was the compiler update.
A working gcc 2.95/m88k compiler, for some low standard value of working. Configuration settings mostly borrowed from the former gcc 2.8 configuration. A few typos and fixes backported from gcc 3.3, and a hell lot of fixes from my fingertips. This is enough to yield a compiler which will produce correct code at -O0. Optimization is slightly broken for some constructs, and more fixes are in the pipeline. ok deraadt@
The next one was the libc fix.
Fix the *longjmp() behaviour - it is legal to reuse a jmp_buf several times. Gets us a working perl 5.8.
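What "reusing a jmp_buf several times" means can be shown with a few lines of standard C. This is not the code perl uses, just a minimal illustration with an invented function name: the same jmp_buf is filled once by setjmp() and then jumped to repeatedly.

```c
#include <setjmp.h>

static jmp_buf env;

/*
 * Jump back to the same setjmp() point `wanted' times, and return the
 * number of jumps which actually landed there.
 */
int
count_jumps(int wanted)
{
    /* volatile, since it is modified between setjmp() and longjmp() */
    volatile int jumps = 0;

    if (setjmp(env) != 0)
        jumps++;            /* we got here through longjmp() */

    if (jumps < wanted)
        longjmp(env, 1);    /* reuse the very same jmp_buf */

    return jumps;
}
```

On a correct libc, count_jumps(3) returns 3. With a longjmp() implementation which damages the jmp_buf on its first use, the later jumps would go astray.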
A few kernel fixes followed, and then people could start making fun of my engrish commit messages again.
Date: Fri, 1 Aug 2003 23:31:06 +0000
From: Miod Vallat
To: "Todd C. Miller"
Subject: Re: CVS: cvs.openbsd.org: src
> > CVSROOT: /cvs
> > Module name: src
> > Changes by: miod@cvs.openbsd.org 2003/08/01 17:15:31
> >
> > Modified files:
> > sys/arch/mvme88k/mvme88k: pmap.c
> >
> > Log message:
> > The pmap potpourri du jour, while hunting for evil bugs:
>
> "potpourri du jour"? I thought this was an English list ;-)
Like "pmap" is an english word! (-:
Date: Fri, 1 Aug 2003 23:56:19 +0000
From: Miod Vallat
To: Xavier Santolaria
Subject: Re: CVS: cvs.openbsd.org: src
> > Modified files:
> > sys/arch/mvme88k/mvme88k: pmap.c
> >
> > Log message:
> > The pmap potpourri du jour, while hunting for evil bugs:
> ^^^^^^^^^^^^^^^^^
> > - de-cretinize pmap_testbit() and pmap_page_protect()
> ^^^^^^^^^^^^
>
> ca c'est du log!! :) [now that's some log!]
Je ne vois pas ce qu'il a de particulier. C'est de l'anglais moderne
correct...
[I don't see anything special in it. It's correct modern english...]
I was back and feeling much better, but gcc 2.95 was still in a recovery state. Every time I tried to enable optimization to some level, I would eventually end up, after having gcc recompile itself no more than a couple times, with a compiler which would either mysteriously fail in surprising ways, or surprisingly fail in mysterious ways, or both.
After disabling every optimizing feature under the sun, I identified the culprit.
Date: Wed, 6 Aug 2003 21:35:33 +0000
From: Miod Vallat
To: Theo de Raadt, Marc Espie
Subject: m88k -O1 workaround
The big picture: libgcc, on m88k, contains supposedly optimal block move
functions for small (less than a few hundred bytes) structures. The m88k
gcc backend will then try to generate calls to these routines whenever
possible, and revert to regular bcopy or memcpy otherwise.
It turns out that, the way this is written, the logic responsible to
invoke those functions is flawed in gcc 2.95 (and probably since the
beginning, in fact), and will sometimes miscompute the destination
address, resulting in garbage and problems.
I think I can fix this in about 8-10 gcc recompilations. That's 20-25
hours. In the meantime, as I would like to make some progress, I have
decided to simply prevent these functions from being used, hence the
ugly #if 0 ... #endif change below.
My plans on the long turn, after fixing this, are to add a
machine-specific option to control whether these routines should be
used, or not. I plan to NOT use these routines for stand/ (so as to get
rid of libgcc.a in the link phase) and for the kernel (which comes with
a heavily optimized bcopy() routine.
But in the meantime, before I enable -O1 by default on m88k (which will
wait until a real make build finishes here, as I am still running some
-O0 binaries), I would like to get this workaround in.
Objections?
Miod
Index: m88k.c
===================================================================
RCS file: /cvs/src/gnu/egcs/gcc/config/m88k/m88k.c,v
retrieving revision 1.2
diff -u -p -r1.2 m88k.c
--- m88k.c 2003/08/01 07:40:19 1.2
+++ m88k.c 2003/08/06 19:02:46
@@ -515,6 +515,7 @@ expand_block_move (dest_mem, src_mem, op
block_move_sequence (operands[0], dest_mem, operands[1], src_mem,
bytes, align, 0);
+#if 0 /* XXX */
else if (constp && bytes <= best_from_align[target][align])
block_move_no_loop (operands[0], dest_mem, operands[1], src_mem,
bytes, align);
@@ -523,6 +524,7 @@ expand_block_move (dest_mem, src_mem, op
block_move_loop (operands[0], dest_mem, operands[1], src_mem,
bytes, align);
+#endif
else
{
#ifdef TARGET_MEM_FUNCTIONS
I was actually way too optimistic in my fix estimate, as I have never been able to make these routines work correctly. Even though their constraints look correct to me, when enabled, there eventually are situations (in large code blocks) where using these routines corrupts a register. (I eventually removed the ability to invoke those routines completely years later, admitting defeat.)
The workaround I suggested in that mail got committed the day after.
And then, after more than 5 years, I succeeded in building a complete OpenBSD/mvme88k snapshot! The port was back from knocking on death's door to being barely alive, but that was nevertheless an important milestone.
Date: Sun, 10 Aug 2003 01:29:21 +0000
From: Miod Vallat
To: Paul Weissmann
Subject: mvme88k
I am uploading a snapshot, right now. bsd.rd and bootblocks work for me,
so you should be able to create a bootable tape very soon. Probably on
monday (it will need to be put on the ftp site and then the mirrors will
carry it).
Miod
There was still a lot of work to do before the system could be considered stable and reliable, though, as can be seen from this follow-up mail.
Date: Mon, 11 Aug 2003 21:40:19 +0000
From: Miod Vallat
To: Paul Weissmann, "Luke Th. Bullock"
Subject: mvme88k snapshot howto
Hello guys,
since you will probably be the only two people interested in the
OpenBSD/mvme88k new snapshot, here are roughly what I intended to post
to misc@, until i saw there was already something on deadly.org.
- installation notes on the snapshot are outdated. If you have an
OpenBSD-current source tree,
cd /usr/src/distrib/notes && make M=mvme88k
will produce a better INSTALL.mvme88k file.
- this is not for production use. Definitely not. There are kernel bugs
left to fix before this can be considered worth using beyond testing.
- only 187 works at the moment. 197 has never been really working (at
least on my boards), and 188 is broken by a side effect of a commit
made in april 2001 (yes, that's the truth), which I am currently
debugging.
- killer bug #1: vi.recover and sendmail will not work. So, after
installing the snapshot, boot -s on the next boot, mount filesystems
by hand, set TERM to the correct value, and edit /etc/rc to comment
out the vi.recover lines, and /etc/rc.conf to set sendmail_flags to
NO. If you don't, the machine will never finish a multiuser boot, and
will panic instead.
- killer bug #2: when a process exits, or a new one is started (I am not
sure of which of both conditions is the killer), the system may
completely freeze. Even the abort switch on the board will have no
effect. The only thing to do is to use the reset switch or power
cycle. This can happen as soon as during the multi user boot, or after
one week uptime. I have no real clue at the moment to fix this bug.
- don't use gcc -O2. The default make settings will use -O1, which is
known to work (with perhaps some subtle breakage I have not
experienced yet). However, last time I tried -O2, I ran into a lot of
problems. I am working on this.
Miod
The history of kernel changes around that time shows that I had a hard time figuring out the proper sequence in which to process DAE, short for Data Access Exceptions.
The exception model of the 88100 is quite rustic, and exposes a lot of the execution pipeline state. As a consequence of this, when some exception occurs, there may also be pending load or store operations, which the operating system kernel servicing the exception is expected to perform as part of the exception processing, because, on return from the exception, the execution pipeline starts afresh, with these pending operations having been discarded.
The fun starts when performing these operations themselves causes exceptions (for example, an invalid pointer dereference), or if they are part of a data exception fault.
It took me quite some time to find the right order in which to service these. And since there were other bugs interfering with that, sometimes fixing a bug in one area would suddenly expose another, completely unrelated bug, and it was sometimes difficult to figure out whether the last changes were regressions or only exposing something new.
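To give an idea of what "performing the pending operations" involves, here is a deliberately simplified toy model. It does not reflect the actual 88100 register layout or the OpenBSD code: the structure, the number of slots and the array standing for memory are all assumptions made for the sketch.

```c
#include <stddef.h>

#define NPENDING 3      /* assumption: a small, fixed number of slots */
#define MEMWORDS 16     /* size of the toy memory, in words */

struct pending_op {
    int valid;          /* this slot holds an operation to complete */
    int is_store;       /* store if nonzero, load otherwise */
    size_t index;       /* word index into the toy memory */
    unsigned data;      /* value to store, or result of the load */
};

/*
 * Complete the pending operations in order. Returns the number of
 * operations performed, or -1 as soon as one of them would itself
 * fault, leaving that slot and the following ones pending.
 */
int
replay_pending(struct pending_op ops[NPENDING], unsigned mem[MEMWORDS])
{
    int i, done = 0;

    for (i = 0; i < NPENDING; i++) {
        if (!ops[i].valid)
            continue;
        if (ops[i].index >= MEMWORDS)
            return -1;  /* the replay itself causes an exception */
        if (ops[i].is_store)
            mem[ops[i].index] = ops[i].data;
        else
            ops[i].data = mem[ops[i].index];
        ops[i].valid = 0;
        done++;
    }
    return done;
}
```

The -1 case is where the real difficulty lies: the kernel then has a second fault to service while the first one is only half processed.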
There were also a lot of cleaning tasks overdue, as shown in this mail, shortly before the 3.4 release cutoff.
Date: Tue, 9 Sep 2003 12:12:36 +0000
From: Miod Vallat
To: Theo de Raadt
Subject: mvme88k snapshot and todolist
I am uploading, right now, a more recent 3.4 snapshot to cvs.
Here is also my todolist for m88k, in case you're interested. I am
trying to keep [out of] sys/arch/mvme88k/include whenever possible.
Miod
URGENT (needed for 3.4)
* m8820x_cmmu_init()
-- not (| ! foo) but (& ~foo). Will really help 188.
-- may need an api change for m8820x_cmmu_set().
* test 188 and 188A with all the HyperModule configurations ASAP
-- my 188@20 SYSCON needs a new NVRAM
URGENT TOO (won't make 3.4, but I wish it could have)
* libpthread and arla MD threading bits
-- libpthread done, but with bugs
LESS URGENT
* Kill the bitfields, use constants
-- m88100.h dmt_reg
-- mmu.h cmmu_apr batc_entry_t
-- psl.h psr
* vector_init()
-- the 4 NOPs appear to be unnecessary
* db_machdep
-- we need PC_REGS inst_return and inst_call in a separate file suitable
for non-DDB kernels. See trap.c
-- PC_REGS() is inlined in different places (m88k_pc) . Factorize
-- NOISY2 and NOISY3 not used. NOISY used only once
-- l_PC_REGS and pC_REGS are not used
-- remove __GNUC__ test from db_machdep.h
* sigreturn()
-- compare to sendsig()
* rewrite copystr() in assembly?
* only print master CPU# if more than one
ANY TIME
* remove TODO and syscall.stub
* put back M88100 and M88110 defines in the kernel configuration files, a la
m68k. Will help files.conf, and make better dependencies...
* ... as we add an M88410 define to compile with 88410 support (197SP/DP only).
* kill locore_c_routines.c. Move DAE code to dae.c (depends: M88100), and the
rest to machdep.c
* constify string tables
COSMETIC
* typos to fix (grep the whole tree)
-- proccessor
-- PAMP
-- defintions
* no need to initialize cpuspeed
* g/c getscsiid()
* BROKEN_MMU_MASK
-- is the behaviour correct? can it be determined at runtime?
* *_BREATHING_ROOM unused
* autoconf.h
-- always_match() does not exist
* CLKF_INTR
-- theoretically only if frame points to interrupt trap. intstack not reliable
* cpu_number.h
-- provide short version ifndef MVME188, and KNF. But what about 197DP?
* cpus.h
-- no need to include in cmmu.c and m88110.c
* locore.h
-- fubail/subail not used anymore
-- db_are_interrupts_disabled move to db_machdep.h?
* PATC_ENTRIES unused
* mvme*.h
-- remove U(), we're in ansi world nowadays
* mvme197.h
-- also 187->197
* KNF, KNF, KNF (ugh)
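The first urgent item in that list, "not (| ! foo) but (& ~foo)", presumably refers to the classic mix-up between logical and bitwise negation when clearing a bit. A minimal illustration follows; FLAG and the two function names are made up, and this is not the actual m8820x_cmmu_init() code.

```c
#define FLAG 0x40u      /* hypothetical control bit */

/* Intended to clear FLAG, but !FLAG is 0, so nothing changes. */
unsigned
clear_wrong(unsigned reg)
{
    return reg | !FLAG;
}

/* Masks the bit out, which is what was meant. */
unsigned
clear_right(unsigned reg)
{
    return reg & ~FLAG;
}
```

With a register value of 0xff, the first form returns 0xff unchanged, while the second one returns 0xbf.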
All of this did not prevent me from giving MVME197 boards a try.
Date: Wed, 10 Sep 2003 10:37:35 +0000
From: Miod Vallat
To: Theo de Raadt, Steve Murphree
Subject: 197LE status
Short version:
Userland still does not work and I don't know how to make it
work at this point.
Long version:
With the help of an extremely verbose trap handler and syscall
routines, I came to the following results:
* init gets correctly scheduled, and starts executing. The kernel will
fault for every new page accessed, and uvm_fault() will do the
necessary page shuffling to load the pages from disk.
* everything goes well until a syscall fails.
The syscall code ends up, in userland like this:
tb 0, r0, 128 ! invoke system call
br _C_LABEL(_cerror) ! handle error
jmp r1 ! handle success (return)
If the syscall is successful, exip is moved to point to the "jmp r1"
instruction. If it is unsuccessful, exip is moved to point to the "br
_cerror" instruction.
The first instruction at _cerror puts the hi16 of the address of errno
in a register. Since _cerror is in a text page which has not been
accessed yet, I would expect to get an instruction access trap, reason
page fault, at _cerror. However, I get a data access trap, reason page
fault, wanted address 0 !!!
I thought initially that returning from a trap to a br instruction
would let the processor think that the br is safe, so it would not
check the address and fault to let us pick the page. However, this is
false, the br would execute and I would only get a trap later; plus I
built a libc and init with the syscall trampoline changed to
tb 0, r0, 128
br 1f
jmp r1
1:
br _C_LABEL(_cerror)
and still got the same result.
This may, of course, be a pmap problem; however, forcing the _cerror
page to be fetched with
extern void _cerror(int);
_cerror(42);
early in main() will cause the page to be fetched, and still the
problem to occur.
This is not a problem related to having fetched more than N pages, as
commenting out syscalls known to fail can make me go much further in
init.
Yet, as soon as I let a syscall fail, I'm dead. I tried flushing
caches, invalidating tlb, etc, to no avail.
If you have any idea on what to try or look at, in order to pass this
hurdle, I'd love to hear about it.
Miod
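The return logic described in this email can be sketched in C as follows. This is only an illustration under stated assumptions: the structure and function names are hypothetical (the real mvme88k trap frame holds many more registers), and handing the error number to _cerror in r2 is an assumption on my part. The only facts taken from the email are the layout of the three-instruction stub and where exip ends up in each case.

```c
#include <assert.h>
#include <stdint.h>

#define	M88K_INSN_SIZE	4	/* every m88k instruction is 4 bytes long */

/*
 * Hypothetical, stripped-down trap frame: only the two fields needed
 * for this sketch.
 */
struct sketch_frame {
	uint32_t exip;	/* where execution resumes after the trap */
	uint32_t r2;	/* first argument / return value register */
};

/*
 * Finish a system call issued by the trap instruction found at address
 * trap_addr, following the stub layout quoted in the email:
 *	trap_addr + 0:	tb 0, r0, 128	(invoke system call)
 *	trap_addr + 4:	br _cerror	(error path)
 *	trap_addr + 8:	jmp r1		(success path)
 */
void
sketch_syscall_return(struct sketch_frame *tf, uint32_t trap_addr,
    int error, uint32_t retval)
{
	assert(trap_addr % M88K_INSN_SIZE == 0);	/* aligned code */

	if (error == 0) {
		tf->r2 = retval;
		/* resume at "jmp r1" */
		tf->exip = trap_addr + 2 * M88K_INSN_SIZE;
	} else {
		/* assumption: the error number is passed to _cerror in r2 */
		tf->r2 = (uint32_t)error;
		/* resume at "br _cerror" */
		tf->exip = trap_addr + M88K_INSN_SIZE;
	}
}
```

The failing case is the only one that resumes on a branch instruction, which is why it was the natural suspect.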
After the 3.4 release binaries were built (the first real OpenBSD/mvme88k release!), I resumed working on the MVME188 support, since I knew it had been working sometime in the past.
The #1 reason why it would no longer work is that the clock interrupts were not being acknowledged correctly. This was fixed in a series of two commits on september 28th.
In november, Steve Murphree made an interesting discovery...
Date: Thu, 20 Nov 2003 02:03:49 -0800
From: Steve Murphree
To: Miod Vallat
Subject: mc88110
Miod,
I've been playing with a MVME197DP unit that I picked up. It has SVR4
on it with a complete development system on it. Finally, I have found
the 'filter' they use to preprocess asm files in order to avoid the
errata of the mc88110! The program is called siff. There are 2 scripts
that integrate into the compiler environment, asfilter0 and asfilter1.
They run siff with -fj and -fz, respectively. So it seems that those 2
cases are the ones that they were most worried about. I have also
provided the list of options and descriptions from siff.
With this, we might be able to duplicate the functionality of siff for
OpenBSD. Possibly, it can be integrated into the C compiler itself.
Contrary to the option output of siff, there is no man page. I tried
running eh.S through siff, but it only accepts SVR4 88k asm comment
syntax. :( And, it is designed to be used after any preprocessor macros
have been expanded. Basically, the .s output of a compiler.
Steve
The first script, asfilter0, had this content:
#!/bin/sh
#ident "@(#)asfilter0 4.1.1.1 13 Apr 1993 "
#
# (C) COPYRIGHT 1993 MOTOROLA, INC.
# ALL RIGHTS RESERVED
#
# THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF MOTOROLA, INC.
# The copyright notice above does not evidence any actual or
# intended publication of such source code.
#
# This asfilter script is intended to be used to resolve
# the 88110, chip revision 3.2, Errata #14:
#
# "A multicycle instruction with r1 as the destination which
# is followed later by a store of r1 followed later by a
# bsr/jsr may cause the store data to be incorrect if the
# bsr/jsr executes before the writeback of the multicycle
# instruction. The scoreboard bit for r1 is not checked in
# all circumstances before the execution of the bsr/jsr.
#
# Workaround: Ensure that the bsr cannot issue until a writeback
# by the multicycle instruction is forced to complete. For
# example, precede affected bsr/jsr instructions with
# or r0,r1,r0 to force the multicycle instruction to complete
# before the bsr can execute."
#
#
BINDIR=/usr/ccs/bin
SIFF=$BINDIR/siff # Silicon Filter binary
SIFF_FLAGS="-q" # Start with quiet mode
SIFF_FLAGS="$SIFF_FLAGS -fj" # Turn -M 4E93 filter on
#
$SIFF $SIFF_FLAGS -o $2 $1
(Let me laugh for a few seconds at the "UNPUBLISHED" part of the copyright notice, given that this file was apparently installed on all System V/88 installations as part of the development tools package.)
The other script, asfilter1, was identical, except that it invoked siff with the -fz option, and was documented as addressing this erratum:
# This asfilter script is intended to be used to resolve
# the 88110, chip revision 4.1, Errata #3:
#
# "Under certain conditions, a bcnd instruction may resolve
# incorrectly. Consider the following scenario: two single-
# cycle integer operations are issued on the clock cycle
# before the bcnd instruction is issued. This will result in
# both integer operations writing data back when the bcnd is
# executed. In this case, if the results of these two integer
# instructions contain many "1" bits, it is possible that the
# bcnd operation may be resolved incorrectly.
#
# Workaround: Precede a bcnd[.n] instruction with a pair of
# 'or r0,r0,r0' instructions."
Murphree also pasted the siff help messages:
Options for siff : ( Usage: siff [options] <Assembly File Name> )
==========================================================================
-o <filename> : Specify Output File. (Default: <filename - '.s'>.siff.s)
-r : Remove Comments.
-V : Print out the tool's name and version.
-q : Quiet Mode - No output messages
-O : List Options.
-f<c> : Set a specific filtering option On.
<c> should be set according to the 'Filter Options'
list below.
-w<c> : Set a specific warning option On.
<c> should be set according to the 'Warning Options'
list below.
-G <Register> : Specify a scratch general purpose register.
-C <Register> : Specify a scratch control register.
-M <Mask Name> : Set filtering options On for a given mask set.
<Mask Name> should be set according to the 'Mask Settings'
list below.
Filter Options:
-fd : NOP Insert - Single/Double Dest. Gen. Reg. of same even value.
-fn : Keep stores out of the delay slots of bsr.n and jsr.n
-fp : Repeat an ldcr rD, XPPU if not preceded by an ldcr rD, XPPL
instruction.
-f3 : Follow all ldcr instructions with a branch past 3 no-ops.
-fs : Insert a nop between any combination of ldcr/stcr instructions.
-fm : Insert a stcr r0, XSR before the stcr rS, XCMD part of a probe
command.
-fy : Pull all ldcr|stcr|xcr instr. out of delay slot of bsr|jmp|jsr &
insert align before it.
-ft : Insert an (or r0,r0,rD / mov x0,xD) before a store, where rD/xD is
the store's dest reg.
-fq : Insert a flush ICACHE instruction before an rte instruction.
-fr : Insert a flush ICACHE instruction sequence before an rte instruction.
-fi : Insert a flush load before all xmem instructions.
-fg : Precede a load extended of rd with a load single of rd to the same
address.
-fl : Follow a ld.x ra,rb,rc with a ld.b r0,rb,rc (Touch Load).
-ff : NOP Insert - Dest. Registers with = numbers but from diff. files
-fe : Follow ld.b[.usr]/ld.h[.usr] instructions with a signed extract.
-fx : Follow all xmem instructions with a trap not taken.
-fa : Insert a trap not taken after a carry out that's followed by a carry
in.
-fc : Precede add.co/sub.co instructions with a trap not taken.
-fb : Replace bcnd[.n] with cmp and bb0[.n].
-fw : Insert nops after a st.wt to keep loads at least 2 instructions away.
-fh : Insert nops after an alloc. load to keep stores >=2 instructions
away.
-fj : Insert an or r1,r1,r1 before all bsr[.n] instructions.
-fz : Insert two or r0,r0,r0 before all bcnd[.n] instructions.
-fk : Insure that the rD of the 2nd instr. preceeding a cond. branch != its
rS1
-fo : Insert a nop between a Arith. or logic instr. and store that both
have rD=r0.
Warning Options:
-wP : Issue warning if an ldcr rD, XPAR instruction is encountered.
-wD : Issue warning if st.d w/ an odd general purpose dest. reg. is found.
-wC : Issue warning if an add/sub[u].c[io] instruction is encountered.
-wW : Issue warning if a store with the .wt option is encountered.
-wX : Issue warning if st.x is encountered.
-wF : Issue warning if an fmul with unequal source specifiers is
encountered.
Mask Settings:
-M 0d18w :
Default Filters (-f?) : n, 3, x
Non Default Filters (-f?) :
Warnings (-w?) :
-M 1d18w :
Default Filters (-f?) : a, b, c, d, e, f, g, h, j, k, l, m, n, o, p, r,
s, t, w, x, y, 3
Non Default Filters (-f?) : i
Warnings (-w?) : C, D, F, P, X
-M 0e98b :
Default Filters (-f?) : x, q, l, g, b, y, j, k, o
Non Default Filters (-f?) :
Warnings (-w?) : W, P
-M 1e98b :
Default Filters (-f?) : x, q, l, g, b, y, j, k, o
Non Default Filters (-f?) :
Warnings (-w?) : W, P
** Please refer to the man page for a more detailed description of siff **
Despite what the last line said, and as mentioned in Murphree's email, there was no manual page for that binary to be found, at least on end-user systems. Maybe development environments internal to Motorola had a proper manual page installed.
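To give an idea of how little is needed for the two filters used by the asfilter scripts, here is a minimal sketch in C of -fj and -fz, written only from the help text above. It is not Motorola's code, and all the function names are made up. It assumes one instruction per line, with no label in front of the mnemonic, and, following the help text rather than the erratum description, it only pads bsr[.n] and bcnd[.n].

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the padding instruction to emit before `line', and store in
 * *count how many copies are needed (0, and NULL, if none is needed).
 * Assumes one instruction per line, without a leading label.
 */
const char *
sketch_padding(const char *line, int *count)
{
	char mnemonic[16];
	size_t len = 0;

	assert(line != NULL && count != NULL);

	/* Skip indentation, then copy the mnemonic (first word). */
	while (isspace((unsigned char)*line))
		line++;
	while (line[len] != '\0' && !isspace((unsigned char)line[len]) &&
	    len < sizeof(mnemonic) - 1) {
		mnemonic[len] = line[len];
		len++;
	}
	mnemonic[len] = '\0';

	if (strcmp(mnemonic, "bsr") == 0 || strcmp(mnemonic, "bsr.n") == 0) {
		*count = 1;		/* -fj */
		return "or r1,r1,r1";
	}
	if (strcmp(mnemonic, "bcnd") == 0 || strcmp(mnemonic, "bcnd.n") == 0) {
		*count = 2;		/* -fz */
		return "or r0,r0,r0";
	}
	*count = 0;
	return NULL;
}

/*
 * Copy one line of assembly (including its trailing newline) to `out',
 * preceded by its padding instructions, if any.
 */
void
sketch_filter_line(const char *line, FILE *out)
{
	int i, count;
	const char *pad = sketch_padding(line, &count);

	for (i = 0; i < count; i++)
		fprintf(out, "\t%s\n", pad);
	fputs(line, out);
}
```

A real tool would also have to cope with labels, comments and delay slots, which is where most of the work lies.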
Murphree then started implementing his own version of siff, based on the description of every option in the help output. He shared that code with me in late november.
Date: Thu, 27 Nov 2003 04:51:49 -0800
From: Steve Murphree
To: Miod Vallat
Subject: new siff
Miod,
I got out my green book and figured out some stuff. Most things are
implemented now. I hacked up a version of locore.S to look more like a
.s file and processed it. All I can say is "wow". This could lead to
something. The more I looked at what some of the filters do, the more I
think we were in a losing situation before.
Steve
I was quite worried about having to perform intrusive post-processing of the compiler output, and about the impact it could have, in code size and speed, on the existing 88100 systems. Also, given how far the kernel would run on an 88110, I really thought that what prevented userland from working was a kernel bug rather than a processor erratum. The fact that the first script targeted an erratum in version 3 of the 88110, while all the errata information I had was for versions 4 and 5, and all my 88110 hardware was also version 4 or 5, made me believe that these changes might be overkill and that we stood a chance of running without them.
(To this day, I still don't know if some of these changes would have helped - 88110-based systems, at least the MVME197LE, run reliably, except when sometimes they suddenly freeze, without being able to enter the kernel debugger, and I don't know whether this is a software bug or a bad combination. Maybe I should experiment a bit with siff on a rainy day...)
While the quest towards stability kept me busy, I nevertheless started to work on other improvements to the system. In late december, I significantly improved the MVME376 Ethernet driver, adding support for all Motorola board configurations, and reusing as much of the machine-independent AMD Lance Ethernet code as possible.
In early 2004, as I did not yet have a copy of the 88110 manual, and it had not been scanned for bitsavers yet, I asked Theo de Raadt several times for excerpts from it. He started taking pictures of pages of the book with a digital camera and sending them to me. That was better than nothing, but unfortunately it did not allow me to make much progress on MVME197 support.
Date: Wed, 14 Jan 2004 23:22:53 +0000
From: Miod Vallat
To: Theo de Raadt
Subject: 197
It's better. Init does not die. It won't spawn process correctly either:
197-Bug>bo 6 0 -s
Booting from: VME328, Controller 6, Drive 0
Loading: -s
Volume: M88K
IPL loaded at: $009F0000
Boot: bug device: ctrl=6, dev=0
bootxx: first level bootstrap program [$Revision: 1.1 $]
\
>> OpenBSD/mvme88k bootsd [$Revision: 1.2 $]
2007040+139264+255920+[75840+91275]=0x27347f
Start @ 0x10020 ...
Controler Address @ ffff9000 ...
[ using 167115 bytes of bsd a.out symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2004 OpenBSD. All rights reserved. http://www.OpenBSD.org
OpenBSD 3.4-current (GENERIC) #532: Wed Jan 14 23:00:16 GMT 2004
miod@ramade.gentiane.org:/usr/src/sys/arch/mvme88k/compile/GENERIC
real mem = 67104768
avail mem = 59047936 (14416 pages)
using 844 buffers containing 3457024 bytes of memory
mainbus0 (root): Motorola MVME197, 50MHz
cpu0: M88110 version 0x3
bussw0 at mainbus0 addr 0xfff00000: rev 1
pcctwo0 at bussw0 offset 0x42000: rev 0
clock0 at pcctwo0 ipl 5
nvram0 at pcctwo0 offset 0xc0000: MK48T08 len 8192
cl0 at pcctwo0 offset 0x45000 ipl 3: console
ssh0 at pcctwo0 offset 0x47000 ipl 2: version 2 target 7
scsibus0 at ssh0: 8 targets
vme0 at pcctwo0 offset 0x40000: vector base 0x80, system controller
vme0: using BUG parameters
vme0: 1phys 0x04000000-0xefff0000 to VME 0x04000000-0xefff0000
vme0: 2phys 0x00000000-0x00000000 to VME 0x00000000-0x00000000
vme0: 3phys 0x00000000-0x00000000 to VME 0x00000000-0x00000000
vme0: 4phys 0x00000000-0x00000000 to VME 0x00000000-0x00000000
vme0: vme to cpu irq level 1:1
vmes0 at vme0
vs0 at vmes0 addr 0xffff9000 vec 0x80 ipl 2: target 7
scsibus1 at vs0: 8 targets
sd0 at scsibus1 targ 0 lun 0: <COMPAQPC, DCAS-32160, S6CA> SCSI2 0/direct fixed
sd0: 2006MB, 8188 cyl, 3 head, 167 sec, 512 bytes/sec, 4110000 sec total
vmel0 at vme0
ie0 at pcctwo0 offset 0x46000 ipl 1: address 08:00:3e:22:db:21
boot device: sd0
root on sd0a
rootdev=0x400 rrootdev=0x800 rawdev=0x802
Enter pathname of shell or RETURN for sh:
Jan 14 23:18:55 Enter pathname of shell or RETURN for sh:
Jan 14 23:18:59 Enter pathname of shell or RETURN for sh:
Jan 14 23:19:01 Enter pathname of shell or RETURN for sh:
Jan 14 23:19:03 Enter pathname of shell or RETURN for sh:
Jan 14 23:19:06 Enter pathname of shell or RETURN for sh:
Jan 14 23:19:11 Enter pathname of shell or RETURN for sh:
Unable to make progress on the MVME197 for the time being, I turned my attention back to the compiler.
Date: Fri, 13 Feb 2004 09:43:02 +0000
From: Miod Vallat
To: Hiroaki Etoh
Subject: Help on a gcc problem
Hello,
I am still fighting a code generation bug in gcc 2.95 on mvme88k,
which only happens at -O2.
Unfortunately, I am quite stuck at the moment, so I figured I could
ask you for help, you always have good advice...
Before I expose the problem, here is a quick reminder of the m88k
calling convention and register usage:
r0 always zero
r1 return address
(i.e. you return from a routine with "jmp r1")
r2-r9 function parameters (if there are more parameters,
the remaining ones are passed on the stack)
r10-r13 non-preserved registers
r14-r25 preserved registers (callee must preserve them)
r26-r29 reserved by the ABI (needed for PIC code, etc)
r30 frame pointer
r31 stack pointer
The problem I see now, happens in a non-leaf function with a lot of
living local variables. In this case, the register allocator table
(REG_ALLOC_ORDER in config/m88k/m88k.h) will eventually suggest using
registers from the r2-r9 range, in my test case r8 and r9. This does not
cause a conflict, because this routine uses only r2-r4 as parameters.
Unfortunately, gcc will not produce code to initialize these
registers!
I was wondering if this was related to the fact that these registers
match FUNCTION_ARG_REGNO_P, contrary to the other registers; but then,
architectures such as arm always use unused parameter registers as
temporaries, and do not have such a problem.
Compiling with -O1 does not trigger the problem, only because the
generated code uses less registers as temporaries, and especially not
the two uninitialized registers.
You can find my current testcode on gentiane, in ~miod/test.c - I also
have gcc -S output with -O1 and -O2 as test.s1 and test.s2 respectively.
If you need to tinker with gcc on an mvme88k machine, you can login to
"arzon".
Thanks for any ideas you could have on this!
Miod
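The register usage listed in this email can be written down as a small classification function. This is just a sketch to make the table easier to refer to: the enum and function names are made up for the occasion, and the second function encodes my reading of the table, namely that anything which is neither fixed, reserved nor callee-preserved may be overwritten by a call.

```c
#include <assert.h>

/* Register roles, as listed in the email (names are hypothetical). */
enum sketch_reg_role {
	REG_ZERO,		/* r0: always zero */
	REG_RETURN_ADDR,	/* r1: return address */
	REG_ARGUMENT,		/* r2-r9: function parameters */
	REG_NON_PRESERVED,	/* r10-r13: non-preserved registers */
	REG_PRESERVED,		/* r14-r25: callee must preserve them */
	REG_ABI_RESERVED,	/* r26-r29: reserved by the ABI */
	REG_FRAME_POINTER,	/* r30 */
	REG_STACK_POINTER	/* r31 */
};

/* Classify general purpose register number regno (0 to 31). */
enum sketch_reg_role
sketch_classify_reg(int regno)
{
	assert(regno >= 0 && regno <= 31);

	if (regno == 0)
		return REG_ZERO;
	if (regno == 1)
		return REG_RETURN_ADDR;
	if (regno <= 9)
		return REG_ARGUMENT;
	if (regno <= 13)
		return REG_NON_PRESERVED;
	if (regno <= 25)
		return REG_PRESERVED;
	if (regno <= 29)
		return REG_ABI_RESERVED;
	if (regno == 30)
		return REG_FRAME_POINTER;
	return REG_STACK_POINTER;
}

/*
 * Nonzero if a value kept in regno may be gone after a function call:
 * r1 is overwritten by the call itself, and r2-r13 belong to the callee.
 */
int
sketch_clobbered_by_call(int regno)
{
	switch (sketch_classify_reg(regno)) {
	case REG_RETURN_ADDR:
	case REG_ARGUMENT:
	case REG_NON_PRESERVED:
		return 1;
	default:
		return 0;
	}
}
```

In the test case mentioned in the email, r8 and r9 fall in the REG_ARGUMENT range although the routine only receives parameters in r2-r4, which is what makes them available as temporaries.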
(You might remember Hiroaki Etoh as the stack-protector author; he had also been an invaluable help with some gcc bugs.)
This was not enough for him to figure out the cause of the problem, until my experiments with optimization options let me narrow it down considerably.
Date: Thu, 18 Mar 2004 17:19:01 +0000
From: Miod Vallat
To: Hiroaki Etoh
Subject: Re: Help on a gcc problem
Hello,
remember my m88k gcc -O2 problem?
It turns out it is caused by -O2 implying -fcaller-saves. Compiling
with -O2 -fno-caller-saves produces correct code.
I'll have a closer look at the caller-saves logic soon...
Miod
And then, this immediately rang a bell with him, which in turn allowed me to find the appropriate bugfix in the gcc 3.0 sources.
Date: Fri, 19 Mar 2004 10:17:13 +0000
From: Miod Vallat
To: Hiroaki Etoh
Subject: Re: Help on a gcc problem
> I remember the problem is caused by the bug of the register analysis on
> the architecture where a register is related to the other register. hmm, I
> can't explain well. let see register 8 and 9 on m88k. DImode r8 contains
> SImode r8 and r9. So, r9 is damaged after the set of DImode 8.
Apparently, revision 1.32 of caller-save.c fixes this:
* caller-save.c (mark_referenced_regs): Mark partially-overwritten
multi-word registers.
this was committed between 2.95 and 3.0.
I will try to build a compiler with this change backported, and check
how it behaves. In the meantime I'll force -fno-caller-saves on m88k so
that we can still benefit from other -O2 optimizations.
Miod
Date: Fri, 19 Mar 2004 22:48:02 +0000
From: Miod Vallat
To: "Luke Th. Bullock", Paul Weissmann, Kenji Aoyama
Subject: gcc/m88k -O2 fix you might want to play with...
Hello,
With the help of Hiroaki Etoh, I eventually found the -O2 killer bug
on m88k. It turns out this is a genuine gcc 2.95 bug, which was fixed in
gcc 3.0, but is triggered more easily on m88k than other architectures.
If you are tracking OpenBSD/mvme88k, you'll need to revert my last
gcc/config/m88k/m88k.h change, which forces -fno-caller-saves at all
optimization levels (-O2 and -Os imply -fcaller-saves). I committed it so
that the OpenBSD/mvme88k 3.5 release would ship with a compiler able to
produce correct -O2 code, though it will still use -O1 by default. If
you're not, your m88k.h is probably clean!
Once the release is over, I'll revert this change, commit the
following diff, and enable -O2 on mvme88k by default.
Here's the diff - a straight backport from gcc 3.0. Apply in
gnu/egcs/gcc/, and recompile gcc. Then you can safely use -O2 (until the
next subtle bug is found).
If you're curious, I have also attached a simple test program,
gccregtest.c, derived from the libc's strtoll() code, which will dump
core on m88k when compiled at -O2 by an unfixed gcc. If you have tried
to compile your libc at -O2, recompiled sshd afterwards, and found you
could not login through ssh anymore, you know what I mean (-:
Have fun,
Miod
Index: caller-save.c
===================================================================
RCS file: /cvs/src/gnu/egcs/gcc/caller-save.c,v
retrieving revision 1.1.1.1
diff -u -p -r1.1.1.1 caller-save.c
--- caller-save.c 1999/05/26 13:34:01 1.1.1.1
+++ caller-save.c 2004/03/19 15:25:16
@@ -504,7 +504,14 @@ mark_referenced_regs (x)
x = SET_DEST (x);
code = GET_CODE (x);
if (code == REG || code == PC || code == CC0
- || (code == SUBREG && GET_CODE (SUBREG_REG (x)) == REG))
+ || (code == SUBREG && GET_CODE (SUBREG_REG (x)) == REG
+ /* If we're setting only part of a multi-word register,
+ we shall mark it as referenced, because the words
+ that are not being set should be restored. */
+ && ((GET_MODE_SIZE (GET_MODE (x))
+ >= GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))))
+ || (GET_MODE_SIZE (GET_MODE (SUBREG_REG (x)))
+ <= UNITS_PER_WORD))))
return;
}
if (code == MEM || code == SUBREG)
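The condition added by this diff is easier to follow when restated as a standalone predicate. The following is only a sketch: the function name and the plain integer sizes are stand-ins for gcc's GET_MODE_SIZE() machinery, and UNITS_PER_WORD is assumed to be 4, as on m88k where registers are 32 bits wide.

```c
#include <assert.h>

#define	SKETCH_UNITS_PER_WORD	4	/* assumed: 32-bit registers */

/*
 * Mirror of the test added to mark_referenced_regs(): when the
 * destination of a SET is a SUBREG of a register, tell whether the
 * underlying register must still be marked as referenced.
 *
 * outer_size: size in bytes of the part being written (SUBREG mode)
 * inner_size: size in bytes of the whole register (SUBREG_REG mode)
 */
int
sketch_partial_set_needs_mark(unsigned int outer_size,
    unsigned int inner_size)
{
	assert(outer_size > 0 && inner_size > 0);

	/* The write covers the whole register: nothing survives it. */
	if (outer_size >= inner_size)
		return 0;
	/* The register fits in a single word: no other word to restore. */
	if (inner_size <= SKETCH_UNITS_PER_WORD)
		return 0;
	/*
	 * Only part of a multi-word register is written; the words left
	 * alone must be restored, so the register counts as referenced.
	 */
	return 1;
}
```

With an 8-byte DImode value living in the r8/r9 pair, writing only a 4-byte SImode half gives sketch_partial_set_needs_mark(4, 8) == 1. This is the case gcc 2.95 got wrong by returning early.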
For those interested, the test program was:
/*
 * This is a ``simple'' test program, derived from OpenBSD libc's strtoll()
 * function, which exposes a bug in gcc 2.95 -fcaller-saves feature on m88k.
 *
 * When compiled with -fcaller-saves, this program will dump core on m88k,
 * unless caller-save.c has the 1.32 revision fix, which was available in
 * gcc 3.0 onwards.
 */

#include <sys/types.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

u_quad_t __qdivrem(u_quad_t, u_quad_t, u_quad_t *);
quad_t my__divdi3(quad_t, quad_t);

/*
 * This is a slightly simplified version of strtoll(), as found in
 * /usr/src/lib/libc/stdlib/strtoll.c
 *
 * It will not always produce correct results anymore ! Its purpose
 * is _only_ to trigger the bug!
 *
 * Most of the code has been kept in order to:
 * - have a lot of _living_ local variables
 * - invoke __divdi3
 */
long long
mystrtoll(const char *nptr, char **endptr, int base)
{
	const char *s;
	long long acc;
	long long cutoff;
	int c;
	int neg, cutlim;

	s = nptr;
	do {
		c = (unsigned char) *s++;
	} while (isspace(c));
	if (c == '-') {
		neg = 1;
		c = *s++;
	} else {
		neg = 0;
		if (c == '+')
			c = *s++;
	}

	cutoff = neg ? LLONG_MIN : LLONG_MAX;
	cutlim = cutoff % base;
	/*
	 * The following statement will be invoked with incorrect value
	 * in the second parameter when compiled with -O2
	 */
	cutoff = my__divdi3(cutoff, (quad_t)base);
	for (acc = 0;; c = (unsigned char) *s++) {
		if (isdigit(c))
			c -= '0';
		else
			break;
		if (neg) {
			if (acc < cutoff || (acc == cutoff && c > cutlim)) {
				acc = LLONG_MIN;
			} else {
				acc *= base;
				acc -= c;
			}
		} else {
			if (acc > cutoff || (acc == cutoff && c > cutlim)) {
				acc = LLONG_MAX;
			} else {
				acc *= base;
				acc += c;
			}
		}
	}
	return (acc);
}

int
main(void)
{
	return mystrtoll("42", NULL, 10);
}

quad_t
my__divdi3(a, b)
	quad_t a, b;
{
	u_quad_t ua, ub, uq;

	ua = a;
	ub = b;
	/* printf("qdivrem(%lld, %lld)\n", ua, ub); */
	uq = __qdivrem(ua, ub, (u_quad_t *)NULL);
	return (uq);
}
(I guess it's sort of a good thing I keep archives of too many things, because, more than 20 years later, I have absolutely no recollection of that compiler bug at all.)
(Follow this link to go forward to the next part.)