OpenBSD on SGI, 6/6: The last challenges

2010, OpenBSD

The multiprocessing work eventually reached a state where the Octane multiprocessor kernel was running stably, and it was enabled in the builds on January 19th.

In early March, Joel Sing contributed a console video driver for the Odyssey frame buffer (VPro) found on the Fuel and some high-end Octane systems; I took care of the other Octane frame buffer, the Impact, immediately after.

Date: Sun, 7 Mar 2010 20:54:06 +0000
From: Miod Vallat
To: Theo de Raadt, Joel Sing
Subject: sgi (octane) impact frame buffer driver

This is the other framebuffer option available on Octane. Had I known I
could have gotten it done in two days, I would have written it much
earlier...

So do we want this for the release? (this would allow all Octane users
to be able to use glass console regardless of their setup). This diff
conveniently forgets to add a framebuffer type for it (thus no
wsconsio.h change) and a manpage, to avoid changing the tree outside
arch/sgi.

Miod

PS: this is heavily based upon the linux code, with the command fifo
programming made easier to understand, and putchar() able to use a 12x22
font, while the linux code does not work for width != 8.

We were getting close to the OpenBSD 4.7 release, so we had to hurry. I moved the frame buffer of one of the two Octanes in my lab into the other to create a multihead setup, and checked whether the OpenBSD kernel would make the same decision as the PROM about which of the two heads was the console. Guess what: it didn't, and more code was needed.

Date: Fri, 12 Mar 2010 16:46:26 +0000
From: Miod Vallat
To: Joel Sing, Mark Kettenis, Theo de Raadt
Subject: important fixes for multihead on sgi

After putting two heads on an Octane, the following fixes turned out to
be necessary:
- MAKEDEV: create device nodes for up to four heads.
- ip30_find_video(): when multiple heads are present, the PROM picks the
  highest widget as the console, not the lowest.
- impact_attach(): allocate non-console screens with M_ZERO, otherwise
  we will not allocate backing store correctly and panic at the first
  output.

Miod
[...]

These changes were small enough to be allowed to make the release.
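
The console selection rule from the email above can be illustrated with a minimal sketch. This is not the actual ip30_find_video() code: the function name, the array of widget numbers and the -1 return value are all assumptions made for illustration. It only shows the rule itself, which is to pick the highest-numbered widget carrying a frame buffer, as the PROM does.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: given the widget numbers on which a frame buffer
 * was found, return the one to use as the console.  The PROM picks the
 * highest widget, so the kernel must do the same to agree with it.
 * Widget numbers are assumed to be non-negative; -1 means "no head".
 */
int
pick_console_widget(const int *widgets, size_t count)
{
	int best = -1;
	size_t i;

	for (i = 0; i < count; i++) {
		if (widgets[i] > best)
			best = widgets[i];
	}
	return best;
}
```

With heads found on widgets 9, 12 and 10, this returns 12; picking the lowest one instead would have put the console on a different head than the one the PROM had been using.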

After that, there was not much sgi-specific activity for a while.

On his Origin 350 machine, Theo de Raadt noticed that, sometimes, all disk I/O would stall, and after a hard reset, the filesystems on the disk would often be so horribly corrupted that he would have to reinstall. I suspected a lost interrupt, but could never figure out the real cause of the problem (and therefore could not deliver a fix). I myself encountered this issue only twice in more than five years after getting my own Origin 350 system at the end of 2014. Since my machine uses slower 700MHz processors, compared to Theo's 1GHz processors, this could also be a race condition somewhere, which would explain why it happened much more often to him. But none of the tentative changes I made to try and understand the problem helped.

2012, OpenBSD

You might remember the email I sent to another developer in October 2009 about the work needed to run on Indy and Indigo2 systems? I eventually decided it was time to roll up my sleeves and do the work.

Date: Sun, 18 Mar 2012 18:49:51 +0000
From: Miod Vallat
To: other developer
Subject: Re: Indy

It's been a long time since we last discussed porting to the Indy and
related systems.

As a convenient excuse to keep myself busy and unable to work on other
topics, I have started porting the NetBSD code. This is still a long way
from being usable, but here is a teaser, with loads of debug messages:




                         Running power-on diagnostics...



                           Starting up the system...

               To perform system maintenance instead, press <Esc>
Setting $netaddr to 10.0.1.210 (from server )

OpenBSD/sgi-IP22 ARCBios boot
arg 0: bootp()/bootecoff
arg 1: OSLoadOptions=auto
arg 2: ConsoleIn=serial(0)
arg 3: ConsoleOut=serial(0)
arg 4: SystemPartition=bootp()
arg 5: OSLoader=bootecoff
arg 6: OSLoadPartition=bootp()
arg 7: OSLoadFilename=/bsd.IP22
Boot: bootp()/bsd.IP22
Setting $netaddr to 10.0.1.210 (from server )
Setting $netaddr to 10.0.1.210 (from server )
1239328+552872 [100Setting $netaddr to 10.0.1.210 (from server )
+95496+52587]=0x1d9f28
ARCS32 Firmware Version 1.10
MEM 0, 0x8002000 to  0x8740000
MEM 1, 0x8800000 to  0x14000000
Found SGI-IP22, setting up.
Initial setup done, switching console.
[ using 149088 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2012 OpenBSD. All rights reserved.  http://www.OpenBSD.org

uvm_km_kmem_grow: grown to 0xc00000000c000000
OpenBSD 5.1-current (GENERIC-IP22) #32: Sun Mar 18 18:25:48 GMT 2012
    miod@saliouse.gentiane.org:/usr/src/sys/arch/sgi/compile/GENERIC-IP22
real mem = 201326592 (192MB)
rsvd mem = 794624 (0MB)
avail mem = 191455232 (182MB)
mainbus0 at root: Indy
cpu0 at mainbus0: MIPS R5000 CPU rev 1.0 150 MHz, R5000 based FPC rev 1.0
cpu0: cache L1-I 32KB D 32KB 2 way
cpu0: L1 set size 16384:16384
cpu0: L1 line size 32:32
cpu0: Alias mask 0x3000
cpu0: Config Register 1043e6f3
cpu0: Cache configuration 2
cpu0: Status Register 100048a0
clock0 at mainbus0: ticker on int5 using count register
int0 at mainbus0 addr 0x1fbd9880
imc0 at mainbus0: revision 3, board rev 0
memory cfg: 2f407f20 00000000
gio0 at imc0
framebuffer at gio0 addr 0x1f000000 not configured
hpc0 at gio0 addr 0x1fb80000: SGI HPC3 (onboard)

Serial EEPROM:
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
         0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0xffff
zs0 at hpc0 offset 0x00059830
zstty0 at zs0 channel 1: console
zstty1 at zs0 channel 0
pckbc at hpc0 offset 0x00059840 not configured
sq at hpc0 offset 0x00054000 not configured
wdsc at hpc0 offset 0x00044000 not configured
haltwo at hpc0 offset 0x00058000 not configured
pione at hpc0 offset 0x00059800 not configured
panel at hpc0 offset 0x00059850 not configured
boot device: 'bootp()' unrecognized.
root device:

One week later, I gave him a status update.

Date: Sun, 25 Mar 2012 19:23:11 +0000
From: Miod Vallat
To: other developer
Subject: Re: Indy

Status update: at the moment I am running multiuser diskless on all my
R4400 and R5000 based systems. R4000-based systems exhibit very odd
crashes and what looks like memory corruption, so I don't know how much
of this is caused by these processors not being reliable in 64-bit mode,
and how much of this is caused by bugs in my shiny new cache routines.

There are issues with the Ethernet driver, though, some packets fail to
transmit with a `should never occur according to the specs' TX DMA
error. This causes ypbind to fail, and I currently have no idea what
causes this (except perhaps for bugs introduced while porting the
driver from NetBSD).

What is left to port is the SCSI driver, the frame buffer drivers, and
the few extras (Indy power button and volume controls, keyboard and
mouse ports, and Indy/Indigo2 audio). My intent is to commit this as
soon as the SCSI driver is working, and do the frame buffer / input
devices work later.

What is left to support as well:
- R4600 systems (I don't have any, a friend of mine has one so this will
  get tested eventually)
- L2 cache on R4600 and R5000 Indy. At the moment, the L2 cache on the
  R5000 Indy is disabled; I have work-in-progress code to handle it, but
  this can't really fit in until my revamp of the mips cache routines is
  ready and stable; but again, this can be completed later.

Mandatory boot log:

System Maintenance Menu

1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor

Option? 1


                           Starting up the system...

Setting $netaddr to 10.0.1.210 (from server )

OpenBSD/sgi-IP22 ARCBios boot
arg 0: bootp()/bootecoff
arg 1: OSLoadOptions=auto
arg 2: ConsoleIn=serial(0)
arg 3: ConsoleOut=serial(0)
arg 4: SystemPartition=bootp()
arg 5: OSLoader=bootecoff
arg 6: OSLoadPartition=bootp()
arg 7: OSLoadFilename=/bsd.IP22
Boot: bootp()/bsd.IP22
Setting $netaddr to 10.0.1.210 (from server )
Setting $netaddr to 10.0.1.210 (from server )
2824608+643136 [100Setting $netaddr to 10.0.1.210 (from server )
+175584+103501]=0x392ff8
ARCS32 Firmware Version 1.10
Found SGI-IP22, setting up.
Initial setup done, switching console.
[ using 280088 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2012 OpenBSD. All rights reserved.  http://www.OpenBSD.org

uvm_km_kmem_grow: grown to 0xc000000008000000
OpenBSD 5.1-current (GENERIC-IP22) #162: Sun Mar 25 19:11:16 GMT 2012
    miod@saliouse.gentiane.org:/usr/src/sys/arch/sgi/compile/GENERIC-IP22
real mem = 134217728 (128MB)
rsvd mem = 802816 (0MB)
avail mem = 128237568 (122MB)
mainbus0 at root: Indy
cpu0 at mainbus0: MIPS R5000 CPU rev 1.0 150 MHz, R5000 based FPC rev 1.0
cpu0: cache L1-I 32KB D 32KB 2 way
clock0 at mainbus0: ticker on int5 using count register
int0 at mainbus0 addr 0x1fbd9880
imc0 at mainbus0: revision 3
gio0 at imc0
hpc0 at gio0 addr 0x1fb80000: SGI HPC3 (onboard)
zs0 at hpc0 offset 0x00059830 irq 29
zstty0 at zs0 channel 1: console
zstty1 at zs0 channel 0
pckbc at hpc0 offset 0x00059840 irq 28 not configured
sq0 at hpc0 offset 0x00054000 irq 3: Seeq 80c03, address 08:00:69:09:62:15
wdsc at hpc0 offset 0x00044000 irq 1 not configured
haltwo at hpc0 offset 0x00058000 irq 12 not configured
pione at hpc0 offset 0x00059800 irq 5 not configured
panel at hpc0 offset 0x00059850 irq 9 not configured
dsclock0 at hpc0 offset 0x00060000
vscsi0 at root
scsibus0 at vscsi0: 256 targets
softraid0 at root
scsibus1 at softraid0: 256 targets
boot device: 'bootp()' unrecognized.
root device: sq0
nfs_boot: using interface sq0, with revarp & bootparams
nfs_boot: client_addr=10.0.1.210
nfs_boot: server_addr=10.0.1.1 hostname=auze
root on 10.0.1.1:/netboot/rance/root
swap on 10.0.1.1:/netboot/rance/swap
Automatic boot in progress: starting file system checks.
setting tty flags
pf enabled
ddb.console: 0 -> 1
vm.swapencrypt.enable: 1 -> 0
starting network
starting early daemons: syslogd pflogd ntpd.
starting RPC daemons: portmapnfs server odyssee:/netboot/rance/root/usr: not responding
nfs server odyssee:/netboot/rance/root/usr: is alive again
 ypbindsq0: transmit underflow
.

Two days later, I sent him basic instructions to give that work a try.

Date: Tue, 27 Mar 2012 16:56:21 +0000
From: Miod Vallat
To: other developer
Subject: OpenBSD/sgi on Indy quick crash course

Diskless and serial console only at the moment, and the L2 cache will
not be used. But you can impress your friends!

- get a recentish OpenBSD/sgi snapshot (at least base and etc).

- get bootecoff and bsd.IP22 from http://miod.online.fr/sgi/
  bsd.IP22 is a moving target, look for more files in there.

- setup your NFS server

  mkdir -p /exports/indy
  echo "/exports -maproot=root -alldirs -network=<mynet> -mask=<mymask>" \
    >> /etc/exports
  pgrep portmap || portmap
  pgrep mountd || mountd
  pgrep nfsd || nfsd
  pkill -HUP mountd nfsd
  echo "<myindyhostname> root=<server ip>:/exports/indy" \
    >> /etc/bootparams
  pgrep rpc.bootparamd && pkill rpc.bootparamd
  rpc.bootparamd

  cd /exports/indy
  tar xpzf /path/to/base51.tgz
  tar xpzf /path/to/etc51.tgz
  cd dev; sh ./MAKEDEV all

  (and of course you might want to setup hostname.sq0, hosts,
   resolv.conf, myname, mygate, etc). And change the root password,
   which is empty - but you can do it from the Indy later.

- setup your tftp server

  mkdir -p /tftpboot

  uncomment tftpd line in inetd.conf and HUP it, or use dlg's rewrite
  which I have not tinkered with yet

  cp bootecoff bsd.IP22 /tftpboot

- on the indy serial console

  abort the boot, pick `5' at the menu choice

  setenv netaddr <myip>
  setenv console d
  # to set up boot blocks path
  setenv SystemPartition bootp()
  setenv OSLoader bootecoff
  # kernel path
  setenv OSLoadPartition bootp()
  setenv OSLoadFilename bsd.IP22

  then enter `exit', and choose `1'.

What will happen:
- the Indy will try to fetch OSLoader from SystemPartition.
- SystemPartition being bootp() means network boot.
- network boot will issue a reverse ARP request unless `netaddr' is set
  (this saves time)
- bootecoff will get fetched from tftp
- bootecoff will try to fetch OSLoadFilename from OSLoadPartition, again
  using tftp
- the kernel will load, then there will be a pause while it gets read
  again for symbols (backwards seek do not exist in tftp). This will
  garble the display with several  ``Setting $netaddr to ...'' messages.
- the kernel will ask for its root device. Enter `sq0'.

If the Indy does not tftp load: try to
  sysctl net.inet.ip.portlast=32767
but R5k Indy should not need this at all. Only older PROMs need this.

In any case, `tcpdump ether host <ethernet address of the indy>' on the
server can help.

I also wanted to get other developers' opinions on support for some of these systems.

Date: Tue, 27 Mar 2012 21:19:38 +0000
From: Miod Vallat
To: private OpenBSD mailinglist
Subject: mips: do we want to run on R4000 systems?

I have been working recently on something I had promised for years:
extending our OpenBSD/sgi port to the Indigo, Indy and Indigo2 family.

These systems used to be quite popular and there are still people who
have one of these systems but nothing more recent (such as an O2 or an
Octane).

Supporting these systems makes sense, from a hobbyist point of view, and
also to keep people from installing NetBSD on these machines.

Now that's all for the nice side of this work.

The bad news: the R4000 and, to a lesser extent, the R4400 processor,
are plagued by processor bugs, when running in 64-bit mode. Some of them
can be worked around in the kernel, others can be worked around in the
compiler.

At this point, we have the following three choices:

1) don't bother trying to support these systems.

   pros: already done.
   cons: will make miod and [other developer] unhappy.

2) only support R4600- and R5000- based systems, which do not suffer
   from these horrible bugs. This means limiting support to the
   ``modern'' Indy systems only, since none of the Indigo and Indigo2
   systems exist with these processors.

   pros: it's just a bunch of rm in my Indigo tree...
   cons: will make miod unhappy. Especially since I would like to port
   to the R8000 and R10000-based Indigo2 systems in the future (assuming
   I can get any, that is)

3) support all 64-bit capable processors, including the R4000 and R4400.
   This requires the kernel and userland to be built with the following
   options: -mfix-r4000 -mfix-r4400. These options reorder some
   troublesome instructions to make sure there is no risk of triggering
   a processor errata.
   It is *very* important to compile everything with these options.
   Without them, a kernel dies with a bogus assertwaitok panic within
   seconds, because a M_NOWAIT argument becomes trashed into M_WAIT.
   Without them, openssl hangs trying to draw an RSA key. So will sshd.

   I have built the system with these options, and to my surprise this
   does not cause a noticeable code size growth, nor does it seem to
   make things run slower (on an R16000 cpu at least.)

   So, if we are considering doing this, it would make sense to make
   these options enabled by default when building on big-endian mips
   systems, so that packages could be used on R4k-based sgi, without
   affecting loongson systems.

   pros: allows us to support all 64-bit capable systems in the Ind*
   families, and will make miod very happy.
   cons: small side-effect of changing the gcc options on all sgi
   systems.

What would you guys prefer?

Hardware support status:
- processors: R4000, R4400 and R5000 supported. L2 cache on R5000 not
  supported yet (not enabled, code is in the works). R4600 support is
  missing, coming very soon as well (joint work with R5000SC support).
  Tested on: R4000PC (no L2), R4000SC, R4400SC, R5000SC. Will get tested
  on R4600PC soon.
- on-board devices: serial ports, SCSI and Ethernet are supported.
  frame buffers and input devices are next, as well as Indy/Indigo2
  audio. No plans to support the parallel port.
  There are spurious (but reliably reproducible) Ethernet TX errors on
  some systems, which I am investigating.
- expansion buses: SCSI and E++ Ethernet GIO options should be supported
  (I have an E++ and it seems to work). Fast Ethernet GIO option not
  supported yet (it's a de(4) behind a PCI-GIO bridge, driver can't be
  ported blindly from NetBSD without having the hardware). EISA bus on
  Indigo 2 is not supported yet, but I have work in progress code and
  hope to be able to support it soon (unless R4000 and R4400 support is
  ditched).

Mandatory bootlog. This is my Indigo R4000SC system, with a drive from
the ``expandable'' SCSI disk pile, used for testing until the scsi
controller becomes stable (it is now). Note how many kernels it took to
reach this state...


                           Starting up the system...


OpenBSD/sgi-IP20 ARCBios boot
arg 0: scsi(0)disk(2)rdisk(0)partition(8)/boot
arg 1: OSLoadOptions=auto
arg 2: ConsoleIn=serial(0)
arg 3: ConsoleOut=serial(0)
arg 4: SystemPartition=scsi(0)disk(2)rdisk(0)partition(8)
arg 5: OSLoader=boot
arg 6: OSLoadPartition=scsi(0)disk(2)rdisk(0)partition(0)
arg 7: OSLoadFilename=bsd
Boot: scsi(0)disk(2)rdisk(0)partition(0)bsd
2900720+643480 [100+180816+106161]=0x3a7978
ARCS32 Firmware Version 1.10
Found SGI-IP20, setting up.
Initial setup done, switching console.
[ using 287984 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2012 OpenBSD. All rights reserved.  http://www.OpenBSD.org

uvm_km_kmem_grow: grown to 0xc000000004000000
OpenBSD 5.1-current (GENERIC-IP22) #201: Tue Mar 27 20:37:09 GMT 2012
    miod@saliouse.gentiane.org:/usr/src/sys/arch/sgi/compile/GENERIC-IP22
real mem = 67108864 (64MB)
rsvd mem = 802816 (0MB)
avail mem = 61571072 (58MB)
mainbus0 at root: Indigo
cpu0 at mainbus0: MIPS R4000 CPU rev 2.2 100 MHz, R4010 FPC rev 0.0
cpu0: cache L1-I 8KB D 8KB direct, L2 1024KB direct
clock0 at mainbus0: ticker on int5 using count register
int0 at mainbus0 addr 0x1fb801c0
imc0 at mainbus0: revision 1
gio0 at imc0
hpc0 at gio0 addr 0x1fb80000: SGI HPC1 (onboard)
zs0 at hpc0 offset 0x00000d10 irq 5
zstty0 at zs0 channel 1: console
zstty1 at zs0 channel 0
zs1 at hpc0 offset 0x00000d00 irq 5
zstty2 at zs1 channel 1
zstty3 at zs1 channel 0
sq0 at hpc0 offset 0x00000100 irq 3: Seeq 80c03, address 08:00:69:06:b8:f2
wdsc0 at hpc0 offset 0x00000122 irq 2: WD33C93B (20.0 MHz clock, BURST DMA)
wdsc0: microcode revision 0x0d, Fast SCSI
scsibus0 at wdsc0: 8 targets, initiator 0
sd0 at scsibus0 targ 2 lun 0: <FUJITSU, M2624F-512, M405> SCSI1 0/direct fixed
sd0: 496MB, 512 bytes/sector, 1015812 sectors
dpclock0 at hpc0 offset 0x00000e00
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
boot device: sd0
root on sd0a (125e5cc33e7dbdca.a) swap on sd0b dump on sd0b
Automatic boot in progress: starting file system checks.
setting tty flags
pf enabled
ddb.console: 0 -> 1
starting network
starting early daemons: syslogd pflogd ntpd.
starting RPC daemons: portmap ypbind.
savecore: no core dump
checking quotas: done.
clearing /tmp
starting pre-securelevel daemons:.
setting kernel security level: kern.securelevel: 0 -> 1
creating runtime link editor directory cache.
preserving editor files.
starting network daemons: sshd sendmail inetd sndiod.
starting local daemons: cron.
Mon Mar 26 05:44:14 MDT 2012

OpenBSD/sgi (clamouse.gentiane.org) (console)

login:

The general answer was roughly "whatever, we don't care", so this code hit the tree on March 28th.


Investigating the Ethernet issues led to an interesting fix on April 8th.

Be more careful when reprogramming the sq(4) DMA and PIO timing parameters;
the current logic can be traced back to DaveM's internship at SGI in 1996,
and is adequate for the hardware he had access to.

However, ``recent'' Indigo2 and Indy systems are fit with a faster (33MHz
instead of 25MHz) GIO64 bus, which need different timing parameters, and
guess what? The PROM knows the right values to set.

Since programming these timing registers was apparently only necessary for
the Challenge S second interface:
1) only reprogram those registers on an IP24 (Indy, Challenge S) system.
2) pick proper values depending upon the actual GIO64 bus speed.

Item #1 fixes Ethernet operation on Indigo2 (at least my teal R4400SC).
Item #2 fixes Ethernet operation on my R5000SC Indy.

For the record, programming suboptimal values caused `TX DMA underrun' errors
(documented as `can't happen' in the HPC3 documentation, oh the irony),
which could be reproduced reliably with ypbind(8).

I also hit an interesting hardware behaviour: reading from an important hardware register, a ``bus arbitration register'' whose value should remain constant unless written to, would sometimes return a value completely different from its actual contents, as if the read cycle had not been performed correctly and stale bus data had been returned instead.

The kernel initialization code would read that register, set or clear some specific bits, and write it back. But if the value being read in the first place was incorrect, despite the register contents being correct, the kernel would write a completely bogus value, which could cause some busses in the system to be disabled or become nonresponsive.

This was fixed in the following commit by treating the register as write-only in the kernel, once a correct value has been read.

Reading the IMC bus arbitration register is not reliable, at least on IP20,
and can return completely bogus values; writing these values back to the
register can have unexpected and hilarious side effects, such as disabling the
frame buffer.

Work around this `feature' by reading the register in a loop until we read
twice the same value, and the value looks legit; then cache this value in a
global variable and handle the register from now on, as a write-only register.
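
The workaround described in this commit message can be sketched as follows. This is a minimal illustration, not the committed code: the names arb_init(), arb_update() and arb_shadow, the retry limit, and the use of a reserved-bits mask as the "looks legit" test are all assumptions. The real plausibility check depends on the IMC register layout, which is not given here.

```c
#include <stdint.h>

/* Shadow copy of the register; the only value trusted after arb_init(). */
static uint32_t arb_shadow;

/*
 * Read *reg until two consecutive reads agree and no bit of reserved_mask
 * is set (a hypothetical plausibility check), then cache the value.
 * Returns 0 on success, -1 if no trustworthy value was seen.
 */
int
arb_init(volatile uint32_t *reg, uint32_t reserved_mask, int max_tries)
{
	uint32_t prev, cur;
	int i;

	prev = *reg;
	for (i = 0; i < max_tries; i++) {
		cur = *reg;
		if (cur == prev && (cur & reserved_mask) == 0) {
			arb_shadow = cur;
			return 0;
		}
		prev = cur;
	}
	return -1;
}

/* Never read the hardware again: edit the shadow copy and write it out. */
void
arb_update(volatile uint32_t *reg, uint32_t clear, uint32_t set)
{
	arb_shadow = (arb_shadow & ~clear) | set;
	*reg = arb_shadow;
}
```

The point of the shadow copy is that every later read-modify-write works on a value known to be good, so an unreliable read can no longer end up being written back to the hardware.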

Once things were stabilized, I posted a summary of this work to the OpenBSD/sgi mailing list.

Date: Tue, 24 Apr 2012 21:58:40 +0000
From: Miod Vallat
To: sgi@openbsd.org
Subject: Support for R4k Indigo, Indy and Indigo2 added

Hello,

  I am happy to announce that the current snapshots of OpenBSD support
the IP20, IP22 and IP24 SGI systems. In other words:

  - R4000 and R4400 Indigo (IP20)
  - R4000, R4400 and R4600 Indigo2 and Challenge M (IP22)
  - all Indy and Challenge S (IP24)

  I know that several (many?) of you had been asking in the past, and I
had always intended to work on this (didn't I get an R4000 Indigo, 12
years ago, for this purpose?). Wait no more! This has eventually
happened.

  This work was helped by the existing NetBSD work, which provided basic
device support (serial, Ethernet and SCSI). Of course running these
systems in 64-bit mode opened quite a large Pandora box-of-problems, but
we have reached a point where the system is running stably and able to
rebuild itself. Oh, did I mention that I stumbled onto a few hardware
bugs, such as extremely important device registers not always returning
their value when being read?

  Anyways, what this really means is that you can wipe the dust off your
system, plug it back, boot the OpenBSD installer and get a working
system in less than half an hour (unless there really is a lot of dust
to clean first).


  There are still a few rough edges, though.


  What works:

- onboard serial, keyboard/mouse, SCSI and Ethernet controllers. Tested
  and confirmed to work.

- all frame buffers are supposed to work, at least in console mode.
  However, only Newport (XL/XGE, found on Indy and Indigo2) and
  Express/Ultra (XZ/Elan/Extreme, found on Indigo, Indy and Indigo2)
  have been tested. The Light/Entry/Starter graphics on Indigo, as well
  as Impact on Indigo2, ought to work out of the box, but could not be
  tested (donations welcome). There is no X11 server for any of these
  frame buffers yet.

- GIO E++ boards work. GIO32 SCSI boards ought to work too, but could
  not be tested.

- the Challenge S second Ethernet interface (on the IO+ mezzanine)
  ought to work.


  What doesn't work (yet):

- the extra SCSI controllers (WD33C95) on the Challenge S IO+ mezzanine
  are not supported (anyone got a Challenge S to spare?)

- Fast Ethernet GIO options (Phobos G130 and G160, as well as `Set
  Engineering' Fast Ethernet) are not supported. There is code to borrow
  from NetBSD, but it's not really an option without access to the
  hardware.

- on-board audio on Indigo (hdsp) and Indy/Indigo2 (haltwo).

- on-board parallel port (honestly, I couldn't care less).

- L2 cache on R4600SC and R5000SC processor modules. I am working on
  this, but need to introduce a few interfaces first, and this takes
  time checking I do not break other systems. Coming soon.


  What sort of works:

- the EISA bus on the Indigo2 attaches and cards get detected. However I
  have only been able to test it with 3Com ep(4) boards; the 10MBit/s
  models have abysmal performance, and the 100MBit/s 3C597 (or Phobos
  G100) doesn't seem to interrupt. Your mileage may vary.


  Snapshots are available from OpenBSD FTP mirrors near you; do not
hesitate and play with them! However, please, please, pretty please, do
not expose an R4000 or R4400 based system to the internet. These
processors suffer from unfixed errata which can be used by local users
to gain supervisor privileges. On these systems, you can't trust your
local users - at all. As in, never give me a shell access to the system
or I'll root it without even thinking. R4600 and R5000 based systems do
not suffer from such problems.

  If you have an early R4000 processor (either an 1.x or 2.x version),
other bugs will prevent it from running stably (the so-called ``end of
page'' bugs). The system will run multiuser, but from time to time, some
processes will misbehave. My own Indigo has a revision 2.2 R4000, so I
am experiencing these issues first hand (and this system is currently
not able to rebuild itself because of these processor bugs). However the
processor errata documentation is public, and I am working on designing
a good way to circumvent them (and it's no easy task, especially with
the goal of not affecting the performance of other processors).


  But enough said.


  The real reason for this work was to get good device support, in order
to be able to port OpenBSD to the Power Indigo2 systems: both the R8000
flavour (IP26) and the R10000 flavour (IP28). Those systems come with a
specific memory controller which needs some care in order to operate
properly; and of course the R8000 processor is an odd beast which, to
this day, is not supported by any free operating system.

  I would like to change this state of things, and have IP26 and IP28
systems be able to run OpenBSD, if only for the sake of it and the joy
of getting my code to run on formerly unsupported hardware.
Unfortunately, I do not own any such system - neither an R8000 Indigo2
nor an R10000 Indigo2. If you know of such a system collecting dust,
whose owner would not mind parting with, please let me know. I'm sure we
can arrange something.


Miod


PS: If you only have R4k ``PC'' processors, that is, without a secondary
cache... then do yourself a favour and don't try to compile anything on
them... these small caches get blown a few orders of magnitude by gcc 4
when compiling even the smallest program. This is very hard on the
R4000PC systems, which only have 2x8KB cache (I have such a system, it's
perfectly stable but glacial as soon as you are compiling anything on
it). R4400PC, R4600PC will be a bit more bearable because their cache is
twice larger.

This led to an interesting discussion with Steve Rumble about how to handle the R4000 end-of-page errata.

Technical details you may skip!
The R4000 end-of-page errata affects R4000 processors before the 3.x revision. On these processors, if the last instruction of an MMU page is a jump instruction, and the next page does not have a valid TLB entry, then in certain circumstances documented in the errata sheet, the processor will not service the TLB miss exception correctly. This usually manifests as a nested exception being taken, causing the userland program to be terminated with a segmentation fault.

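
To make the triggering condition more concrete, here is a minimal sketch of a check for "the last instruction of a page is a jump". The function names are hypothetical and the page is assumed to be an array of instruction words already in host byte order. It only recognizes the J, JAL, JR and JALR encodings; whether conditional branches also matter, and the additional circumstances listed in the errata sheet, are deliberately left out, so this over-reports risky pages.

```c
#include <stddef.h>
#include <stdint.h>

/* True if insn is one of the MIPS jump instructions J, JAL, JR or JALR. */
int
mips_is_jump(uint32_t insn)
{
	uint32_t opcode = insn >> 26;	/* bits 31..26 */
	uint32_t funct = insn & 0x3f;	/* bits 5..0 */

	if (opcode == 0x02 || opcode == 0x03)	/* J, JAL */
		return 1;
	if (opcode == 0x00 && (funct == 0x08 || funct == 0x09))
		return 1;			/* SPECIAL: JR, JALR */
	return 0;
}

/*
 * True if the last instruction word of the page is a jump.
 * page_size is in bytes.
 */
int
page_ends_with_jump(const uint32_t *page, size_t page_size)
{
	size_t nwords = page_size / sizeof(uint32_t);

	if (nwords == 0)
		return 0;
	return mips_is_jump(page[nwords - 1]);
}
```

A kernel could run such a check when an executable page is entered in the pmap, and only apply a workaround to the pages it flags.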
Date: Thu, 26 Apr 2012 20:15:51 +0000
From: Miod Vallat
To: Stephen M. Rumble
Subject: Re: Support for R4k Indigo, Indy and Indigo2 added

[...]
R4000 EOP pages are easy to workaround, really. There are multiple ways
to do it, but what worries me is the cost of them, speedwise.

One way to do it is to check all pages for the troublesome instructions
when they are pmap_enter'ed with PROT_X, and force the next page to
always be mapped in a TLB if the troublesome page is mapped. Using two
wired TLB pairs, this can be done. But this adds overhead because you
can no longer insert random TLB entries in the TLB miss handler, you
first need to check if these concern entries which are in the wired
slots. There is also work to do if you have several EOP-erratae pages in
sequence, because of the limited number of wired entries. So if you
have two contiguous EOP pages being an odd page (second half of wired
tlb #0) and an even page (first page of wired tlb #1), you need to also
map the second page of wired tlb #1. But what if it's also an EOP-errata
page? Then you need to map the page next to it, with yet another wired
tlb entry...

A way to block this chain is to map this page to a dedicated page full
of break instructions. Then if execution goes from the first EOP page
(last of tlb #0) to the second one (first of tlb #1), then to the third,
it will hit a break instruction and the kernel can realize that the
original EOP page which caused this wired TLB arrangement no longer needs
to be mapped (and actually needs to be unmapped). But this will cause a
lot of overhead in the TLB miss handlers.

A different, easier to implement, way to do this, is to overwrite the
potential errata triggering jump instruction with a special break
instruction, and remember the original instruction in the pcb. This
avoids the need for mapping the next page, but on the other hand this
will cause extra physical memory usage due to copy-on-write. And you
still have the problem of emulating the delay slot instruction because
the MIPS exception handling does not allow delay slots to be recovered
(unlike, say, sparc, where the exception code returns after setting the
addresses of the next two instructions to execute: the delay slot and
the branch destination).

I am still thinking about this, maybe there are different ways to
achieve this.

Miod

Date: Fri, 27 Apr 2012 04:40:31 +0000
From: Miod Vallat
To: "Stephen M. Rumble"
Subject: Re: Support for R4k Indigo, Indy and Indigo2 added

[...]
> > One way to do it is to check all pages for the troublesome instructions
> > when they are pmap_enter'ed with PROT_X, and force the next page to
> > always be mapped in a TLB if the troublesome page is mapped. Using two
> > wired TLB pairs, this can be done. But this adds overhead because you
> > can no longer insert random TLB entries in the TLB miss handler, you
> > first need to check if these concern entries which are in the wired
> > slots. There is also work to do if you have several EOP-erratae pages in
> > sequence, because of the limited number of wired entries. So if you
> > have two contiguous EOP pages being an odd page (second half of wired
> > tlb #0) and an even page (first page of wired tlb #1), you need to also
> > map the second page of wired tlb #1. But what if it's also an EOP-errata
> > page? Then you need to map the page next to it, with yet another wired
> > tlb entry...
>
> I think that SGI took this approach with a limit on the number of EOP
> pages in a process. There was some documentation I came across long ago
> warning that a few really old binaries might not work properly because
> of too many of these consecutive evil pages.

That would make sense. I have added code to check for such pages and
warn on the console, and got spammed by messages; however I don't
remember seeing consecutive pages having the problem.

Also, I was using a hammer and reporting all pages with a jump
instruction as the last word of the page, regardless of the preceding
instructions. According to the errata the errata will only trigger if
there are pending loads before the jump, but this makes deciding the
risk of errata tricky, and I don't want to go there yet. Turns out there
are several pages in libc which end up with jumps, so all binaries are
being reported as errata-prone.

On the other hand, we are using 16KB pages which cuts the odds of
triggering the errata by four.

> > A way to block this chain is to map this page to a dedicated page full
> > of break instructions. Then if execution goes from the first EOP page
> > (last of tlb #0) to the second one (first of tlb #1), then to the third,
> > it will hit a break instruction and the kernel can realize that the
> > original EOP page which caused this wired TLB arrangment no longer needs
> > to be mapped (and actually needs to be unmapped). But this will cause a
> > lot of overhead in the TLB miss handlers.
>
> That's a clever solution. It'd be interesting to see how many such bad
> pages actually show up in practice. It might be much worse today with
> larger binaries and libraries than it was 20 years ago.

I'd say the worst issue here would be jump tables in kernel code, or
other rodata merged with text. There would be the risk of code in the
EOP-errata page referencing memory in the page which temporarily points
to the break instructions. The data access will not fault but return
wrong data. Curse these unified TLBs!

[...]

> I think the simplest fix, especially as far as the kernel is concerned,
> would be to have the toolchain generate jumps and branches only at
> 8-byte aligned addresses. I believe that this is what SGI did, or
> perhaps they were more precise in banning the end-of-page jump/branch.
> I looked into hacking up gcc to emit .align directives around jumps a
> long time ago, but was out of my depth and didn't make much progress.

This is a no-go in my opinion, if only because, by default, the
assembler will do an optimization pass itself and reorder instructions
to get faster code.
[...]
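
To make the "hammer" check discussed in the mail above more concrete, here is a minimal sketch in C. It is not the code that was written at the time: the function names are hypothetical, and it only recognizes the integer jump and branch encodings (coprocessor branches are left out). It flags a page when its last word is a jump or branch, regardless of the preceding instructions, and counts such pages for a given page size. With 16KB pages there are four times fewer page-ending words than with 4KB pages, which is where the "by four" comes from.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return nonzero if the MIPS instruction word is an integer jump or
 * branch. Sketch only: coprocessor branches are deliberately ignored.
 */
int
insn_is_jump_or_branch(uint32_t insn)
{
	uint32_t op = insn >> 26;		/* major opcode */
	uint32_t rt = (insn >> 16) & 0x1f;	/* REGIMM sub-opcode */
	uint32_t funct = insn & 0x3f;		/* SPECIAL sub-opcode */

	switch (op) {
	case 0x00:	/* SPECIAL: jr, jalr */
		return funct == 0x08 || funct == 0x09;
	case 0x01:	/* REGIMM: bltz, bgez, and their likely/link forms */
		return rt <= 0x03 || (rt >= 0x10 && rt <= 0x13);
	case 0x02: case 0x03:				/* j, jal */
	case 0x04: case 0x05: case 0x06: case 0x07:	/* beq, bne, blez, bgtz */
	case 0x14: case 0x15: case 0x16: case 0x17:	/* likely variants */
		return 1;
	default:
		return 0;
	}
}

/*
 * Count the pages of a text region whose last word is a jump or
 * branch, for a given page size in bytes (hypothetical helper).
 */
size_t
count_suspect_pages(const uint32_t *text, size_t nbytes, size_t pagesize)
{
	size_t words_per_page = pagesize / sizeof(uint32_t);
	size_t npages = nbytes / pagesize;
	size_t i, n = 0;

	for (i = 0; i < npages; i++)
		if (insn_is_jump_or_branch(text[(i + 1) * words_per_page - 1]))
			n++;
	return n;
}
```

A more precise check would also have to look at the loads preceding the jump, which is exactly the part the mail says it did not want to tackle yet.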

In the second half of May, I managed to source a POWER Indigo2 R10000 (IP28) to try my luck with it.

Technical details you may skip!
An important difference between the R4000-based Indigo2 (IP22) and the other models, be they R8000-based (IP26) or R10000-based (IP28), is that the latter have an ECC memory subsystem. Because of the way the ECC memory controller works, it needs to perform extra operations whenever memory accesses bypass the processor's cache, i.e. when the processor performs uncached memory accesses.

In order to not permanently inflict the cost penalty of these extra operations, the controller works in two modes, slow and fast. The machine boots in slow mode, but once the operating system kernel has performed its initialization, it is expected to switch to fast mode to perform better.

The Indigo2 onboard Ethernet driver accesses its Ethernet buffer descriptors using uncached memory accesses. This would only work in slow mode, so porting to the IP28 design required some work in order to make this transparent to the driver code. I ended up modifying the driver to fetch descriptor data, alter it if needed, and store (write) it back; when running on the IP22 systems, the fetch and store operations would do nothing and the returned address would be the address of the descriptor in uncached memory, while on IP28 a copy of the descriptor in cached memory would be made, and written back later. This allowed me to narrow the moments where I had to switch back to slow mode to the smallest possible time frames.
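
The fetch/store arrangement described above can be sketched as follows. This is an illustration under assumptions, not the actual driver code: the structure layout, the function names and the slow-mode bracketing helpers are all hypothetical, and the switch to slow mode is only simulated by a counter.

```c
#include <stdint.h>

/* Hypothetical descriptor layout, for illustration only. */
struct rx_desc {
	uint32_t ctl;
	uint32_t bufaddr;
};

struct sketch_softc {
	int		 is_ip28;	/* nonzero on ECC (IP28-like) systems */
	struct rx_desc	*ring;		/* descriptors, reached via uncached memory */
	struct rx_desc	 shadow;	/* cached working copy, IP28 only */
	unsigned	 slow_switches;	/* how often slow mode was entered */
};

/* Stand-ins for the memory controller mode switch (assumed names). */
static void slow_mode_begin(struct sketch_softc *sc) { sc->slow_switches++; }
static void slow_mode_end(struct sketch_softc *sc) { (void)sc; }

/* Return a descriptor the caller may read and modify. */
struct rx_desc *
desc_fetch(struct sketch_softc *sc, int idx)
{
	if (!sc->is_ip28)
		return &sc->ring[idx];	/* IP22: work in place, no copy */

	slow_mode_begin(sc);
	sc->shadow = sc->ring[idx];	/* the only uncached read */
	slow_mode_end(sc);
	return &sc->shadow;
}

/* Write back a descriptor obtained from desc_fetch(). */
void
desc_store(struct sketch_softc *sc, int idx, const struct rx_desc *d)
{
	if (!sc->is_ip28)
		return;			/* IP22: already modified in place */

	slow_mode_begin(sc);
	sc->ring[idx] = *d;		/* the only uncached write */
	slow_mode_end(sc);
}
```

The point of the split is that the driver logic between the fetch and the store never touches uncached memory on IP28, so slow mode is only needed for the two short copies.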


In my archives, the oldest traces of a kernel booting on it appear in a discussion about how the Insite Floptical drives were being probed.

Date: Fri, 25 May 2012 08:27:35 +0000
From: Miod Vallat
To: Sebastian Reitenbach
Subject: Re: Support for R4k Indigo, Indy and Indigo2 added

> Do you have a floptical drive? I guess I should be able to remove it,
> without harm to the system, and could bring it with me for you to g2k12.

Does this match what you were seeing:

> bootp()boot64 bootp()bsd.IP28 --a
Setting $netaddr to 10.0.1.9 (from server )
Obtaining boot64 from server
1024+36624Setting $netaddr to 10.0.1.9 (from server )
+3824+1368+320Setting $netaddr to 10.0.1.9 (from server )
Setting $netaddr to 10.0.1.9 (from server )
 entry: 0x900000002fff4e70

OpenBSD/sgi-IP28 ARCBios boot version 1.1
arg 0: bootp()boot64
arg 1: bootp()bsd.IP28
arg 2: --a
arg 3: ConsoleIn=serial(0)
arg 4: ConsoleOut=serial(0)
arg 5: SystemPartition=scsi(0)disk(1)rdisk(0)partition(8)
arg 6: OSLoader=sash
Boot: bootp()bsd.IP28
Setting $netaddr to 10.0.1.9 (from server )
Obtaining bsd.IP28 from server
Setting $netaddr to 10.0.1.9 (from server )
Obtaining bsd.IP28 from server
3262800+915560 [91Setting $netaddr to 10.0.1.9 (from server )
Obtaining bsd.IP28 from server
+193752+115291]=0x447c90
ARCS64 Firmware Version 64.0
Found SGI-IP28, setting up.
Initial setup done, switching console.
[ using 309976 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2012 OpenBSD. All rights reserved.  http://www.OpenBSD.org

uvm_km_kmem_grow: grown to 0xc000000010000000
OpenBSD 5.1-current (GENERIC-IP28) #0: Fri May 25 08:20:58 GMT 2012
    miod@saliouse.gentiane.org:/usr/src/sys/arch/sgi/compile/GENERIC-IP28
real mem = 268435456 (256MB)
rsvd mem = 1064960 (2MB)
avail mem = 260440064 (248MB)
mainbus0 at root: POWER Indigo2 R10000
cpu0 at mainbus0: MIPS R10000 CPU rev 2.5 194 MHz, R10000 FPU rev 0.0
cpu0: cache L1-I 32KB D 32KB 2 way, L2 1024KB 2 way
clock0 at mainbus0: ticker on int5 using count register
int0 at mainbus0 addr 0x1fbd9000
imc0 at mainbus0: revision 5
gio0 at imc0
hpc0 at gio0 addr 0x1fb80000: SGI HPC3 (onboard)
zs0 at hpc0 offset 0x00059830 irq 29: 85230
zstty0 at zs0 channel 1: console
zstty1 at zs0 channel 0
pckbc0 at hpc0 offset 0x00059840 irq 28
sq0 at hpc0 offset 0x00054000 irq 3: Seeq 80c03, address 08:00:69:07:f0:e2
wdsc0 at hpc0 offset 0x00044000 irq 1: WD33C93B, 20.0 MHz, burst DMA
wdsc0: microcode revision 0x0d, fast SCSI
scsibus0 at wdsc0: 8 targets, initiator 0
sd0 at scsibus0 targ 1 lun 0: <SGI, Seagate ST11200N, 8996> SCSI2 0/direct fixed serial.SGI_Seagate_ST11200N00637118
sd0: 1014MB, 512 bytes/sector, 2076777 sectors
sd1 at scsibus0 targ 2 lun 0: <INSITE, I325VM *F, 0387> SCSI1 0/direct removable
probe(wdsc0:2:1): wdsc0: timed out; asr=0x00 [acb 0x9800000020dfc000 (flags 0x1, dleft 24)], <state 1, nexus 0x0, resid 24, msg(q 0,o 0)>probe(wdsc0:2:1): ABORT in timeout: csr=0x85, asr=0x00
probe(wdsc0:2:1): wdsc0: timed out; asr=0x00 [acb 0x9800000020dfc000 (flags 0x41, dleft 24)], <state 1, nexus 0x0, resid 24, msg(q 0,o 0)>probe(wdsc0:2:1): ABORT in timeout: csr=0x85, asr=0x00

The code was committed later the same day (with the Floptical removed from my machine for the time being, until some bugfixes in the SCSI controller driver solved the issue).

Support for the POWER Indigo2 R10000 systems (IP28). Currently running with
ECC checking disabled, which allows the existing Indigo2 drivers to run
unmodified.

In order to support the many processor combinations on the Indy and Indigo2 systems, I needed to convert the cache handling routines, of which there was a set per processor model, from almost unreadable assembly code to C code. I sent a mail to explain why.

Date: Mon, 28 May 2012 19:25:32 +0000
From: Miod Vallat
To: private OpenBSD mailinglist
Subject: A note about mips cache routines

Just a friendly reminder!

In case you didn't notice, I have started, over the last few weeks, to
replace assembly cache routines for MIPS processors, with C code.

I have three good reasons to do this:
- all this assembly code roots in 32 bit MIPS code written in 1993-1994
  or so.
- while gcc 2.7 would not compile C code to something as good as the
  assembly code we have, 15+ years later, this is definitely not the
  case. And if the C code is only slightly behind, it's way more
  readable.
- rewriting this code as C is easier to maintain, and will allow me to,
  eventually, provide tailor-built routines on a `what processor are we
  running on today?'-basis.

That third point by itself is the main reason behind this work, because
R5000 routines have become a mess (3 hierarchies of cache, of which L2
and L3 may not exist, and L2 may not be implemented the same way, and
thus need two different code paths).

Once everything is in C code, it will be easy to have the ``MI'' (well,
mips64-specific but port-agnostic) cache routines being curcpu()
function pointers.

This will finally allow me to write the necessary code for L2 cache on
R4600 and R5000 SGI Indy systems, without disrupting other platforms too
much (and then I might be able to convince jsg@ to buy me a beer or two,
next time we meet). <-- that's my #1 reason to do this, of course


Doing this is taking a bit longer than expected, because
1) it's a real PITA.
2) I can't help but want to fix or improve things in the process, yet
   this has to wait until the assembly code is converted to bug-identical
   C code; then we'll be able to improve things. This is known as the
   `art syndrome'. Example: the R10000 code we have assumes the L2 line
   size is 64 bytes. Well, this was true on O2 R10000 systems, but
   higher-end systems (Origin, Octane, Fuel...) use 128 bytes per line.
   This means we can actually make some L2 routines run faster on those
   systems... eventually. So while I have faster code ready, I first
   need to confirm the `iso' code works, before improving on it.

In particular, there is growing evidence that we need some cache
routines to operate on virtually indexed caches only (to avoid cache
aliasing and satisfy pmap_prefer when applicable), while some really
need to operate on all cache levels (when writebacks are really supposed
to hit memory, and not an intermediate cache level).

This mess will be definitely easier to clean once all cache code is
converted to tame C code.

I apologize in advance for possible broken kernels. If you run into
trouble with the kernels within the next few days (make that half a
month), please let me know.


Oh, and of course, make sure to rm cache_*.d from your kernel build
directories and rerun config, every time I commit new cache code.


Miod
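
As an illustration of the per-processor function pointers mentioned in the mail above, here is a minimal sketch. None of these names come from the OpenBSD tree: the structure, the setup routine and the range routine are hypothetical, and the routine only counts how many cache lines it would operate on instead of issuing cache instructions. It also shows why knowing the real L2 line size matters: the same range needs half as many operations with 128-byte lines as with 64-byte lines.

```c
#include <stddef.h>

struct cpu_cache_sketch;
typedef size_t (*l2_range_fn)(const struct cpu_cache_sketch *, size_t, size_t);

/* Per-processor cache description (hypothetical). */
struct cpu_cache_sketch {
	size_t		l2_line;	/* L2 line size in bytes, 0 if no L2 */
	l2_range_fn	l2_wbinv_range;	/* routine selected at setup time */
};

/* Processor without an L2 cache: nothing to do. */
size_t
l2_none(const struct cpu_cache_sketch *c, size_t pa, size_t len)
{
	(void)c; (void)pa; (void)len;
	return 0;
}

/*
 * Walk the range one L2 line at a time. A real routine would issue a
 * cache operation per line; this sketch returns the number of lines.
 */
size_t
l2_by_line(const struct cpu_cache_sketch *c, size_t pa, size_t len)
{
	size_t mask = c->l2_line - 1;
	size_t cur = pa & ~mask;
	size_t end = (pa + len + mask) & ~mask;
	size_t ops = 0;

	for (; cur < end; cur += c->l2_line)
		ops++;
	return ops;
}

/* Pick the routine once, according to what the processor has. */
void
cache_sketch_setup(struct cpu_cache_sketch *c, size_t l2_line)
{
	c->l2_line = l2_line;
	c->l2_wbinv_range = (l2_line == 0) ? l2_none : l2_by_line;
}
```

Callers then go through the pointer and never need to know which processor they are running on.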

This led to the R4600SC and R5000SC having their external cache supported.

Code for the external L2 cache controller on Indy/Indigo2 R4600SC and Indy
R5000SC processor modules; these sport an up to 512KB, physically indexed,
write-through L2 cache which is not connected to the canonical external cache
interface of these processors (hence requiring specific code to drive it).
The cache is enabled early and disabled before returning to ARCBios (for very
nasty things happen otherwise).

Tested on R5000SC, will be tested on R4600SC soon.

I could then move on to my next goal: running on the POWER Indigo2 R8000 (IP26). These are extremely rare and only appear on the second-hand market once in a blue moon, so I bought one from Ian Mapleson in the UK. I initially aimed for example configuration #31 on his Indigo2 list, but he offered me #30 for the same price (i.e. with an extra 128MB of memory). With shipping costs, I ended up paying UKP 340 for it, much less than the current (2026) prices... (Remember that Ian makes a living providing high quality systems and spares, with a warranty, and SGI hardware in working condition is becoming increasingly rare these days.)

I collected the machine from the local UPS dispatching center on July 3rd, and started working on it a few days later.

Technical note you may ignore!
The R8000 processor is quite different from the other MIPS processors. It has some unique features, such as not being able to run in 32-bit mode at all, the ability to use a different page size in the kernel than in userland, and a completely different TLB organization than the other 64-bit MIPS processors (in 3 banks of 128 independent entries).

It is also a highly superscalar architecture, able to schedule up to four instructions in parallel (two integer instructions and two floating-point instructions), but still performing in-order execution, which requires significant work from the compiler to order instructions so that as many of them as possible can be issued in parallel. With a good compiler, such as SGI's MIPSpro, the floating-point performance was impressive, and in fact, the code name for that processor was TFP, for Tremendous Floating-Point.

For a very long time, there was no public documentation for this processor. Thanks to the efforts of the linux-mips people, scans of a manual were released with permission to distribute them, and this allowed me to work on supporting that processor; various comments and macros in IRIX header files documented the processor errata and workarounds required to run stably.

Another particularity of that processor is that some exception conditions which are usually reported as internal exceptions on the other MIPS processors, are reported as external interrupts. All these details required some plumbing and internal reworkings of the generic MIPS code in the OpenBSD kernel, before I could try and run a kernel on my new toy system.

Cache memory operation on the R8000 is also unusual, to say the least. First, its internal (L1) cache is write-through only, and systems built around that processor, of which there are only two: the POWER Indigo2 R8000 (IP26) and the multiprocessor POWER Challenge R8000 and POWER Onyx R8000 (IP21), were using large and fast write-back L2 caches to compensate for this. Second, there is no documented way to invalidate the instruction cache (or a subset of it), and OpenBSD had no other choice than to dedicate a page full of nop instructions to fill the instruction cache and eventually evict its previous contents.


Date: Tue, 10 Jul 2012 20:58:02 +0000
From: Miod Vallat
To: Theo de Raadt
Subject: I think I am hitting a new level of madness

This is what I have been working in the last two days. Still far from
complete.

>> bootp()boot64 bootp()bsd.IP26
Setting $netaddr to 10.0.1.228 (from server )
Obtaining boot64 from server
1024+35376Setting $netaddr to 10.0.1.228 (from server )
+3920+1400+304Setting $netaddr to 10.0.1.228 (from server )
Setting $netaddr to 10.0.1.228 (from server )
 entry: 0xa80000003fff52e0
bios_base = 0x9800000000001000

OpenBSD/sgi-IP26 ARCBios boot version 1.3
arg 0: bootp()boot64
arg 1: bootp()bsd.IP26
arg 2: ConsoleIn=serial(0)
arg 3: ConsoleOut=serial(0)
arg 4: SystemPartition=scsi(0)disk(1)rdisk(0)partition(8)
arg 5: OSLoader=sash
arg 6: OSLoadPartition=scsi(0)disk(1)rdisk(0)partition(0)
arg 7: OSLoadFilename=/unix
Boot: bootp()bsd.IP26
Setting $netaddr to 10.0.1.228 (from server )
Obtaining bsd.IP26 from server
Setting $netaddr to 10.0.1.228 (from server )
Obtaining bsd.IP26 from server
3310176+751488 [91Setting $netaddr to 10.0.1.228 (from server )
Obtaining bsd.IP26 from server
+192696+114541]=0x42ada8
ARCS64 Firmware
Found SGI-IP26, setting up.
ip22_ecc_init: not working on IP26 yet
Initial setup done, switching console.
[ using 308168 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2012 OpenBSD. All rights reserved.  http://www.OpenBSD.org

OpenBSD 5.2-beta (GENERIC-IP26) #43: Tue Jul 10 20:40:31 GMT 2012
    miod@saliouse.gentiane.org:/usr/src/tfp/sys/arch/sgi/compile/GENERIC-IP26
real mem = 671088640 (640MB)
rsvd mem = 1064960 (2MB)
avail mem = 660078592 (629MB)
mainbus0 at root: POWER Indigo2 R8000
cpu0 at mainbus0: MIPS R8000 CPU rev 0.0 75 MHz, R8010 FPU rev 0.1
cpu0: cache L1-I 16KB D 16KB direct, L2 2048KB direct
cpu0: L1 set size 16384:16384
cpu0: L1 line size 32:32
cpu0: L2 line size 128
cpu0: cache configuration 0
cpu0: virtual alias mask 0x0
cpu0: config register 00000000000088b0, status register 0000000014034820
clock0 at mainbus0: ticker on int5 using count register
int0 at mainbus0 addr 0x1fbd9000
imc0 at mainbus0: revision 5
gio0 at imc0
grtwo0 at gio0 addr 0x1f000000: GU1-Extreme
grtwo0: device has not been setup by firmware!
hpc0 at gio0 addr 0x1fb80000: SGI HPC3 (onboard)
zs0 at hpc0 offset 0x00059830 irq 29: 85230
zstty0 at zs0 channel 1: console
zstty1 at zs0 channel 0
pckbc0 at hpc0 offset 0x00059840 irq 28
sq0 at hpc0 offset 0x00054000 irq 3: Seeq 80c03, address 08:00:69:08:6a:35
wdsc0 at hpc0 offset 0x00044000 irq 1: WD33C93B, 20.0 MHz, burst DMA
wdsc0: microcode revision 0x0d, fast SCSI
scsibus0 at wdsc0: 8 targets, initiator 0
sd0 at scsibus0 targ 1 lun 0: <COMPAQ, BF03685A35, HPB8> SCSI3 0/direct fixed eui.000c50fffe3d119a
sd0: 34732MB, 512 bytes/sector, 71132000 sectors
wdsc1 at hpc0 offset 0x0004c000 irq 2: WD33C93B, 20.0 MHz, burst DMA
wdsc1: microcode revision 0x0d, fast SCSI
scsibus1 at wdsc1: 8 targets, initiator 0
haltwo at hpc0 offset 0x00058000 irq 12 not configured
pione at hpc0 offset 0x00059800 irq 5 not configured
panel0 at hpc0 offset 0x00059850 irq 9: power button
dsclock0 at hpc0 offset 0x00060000
eisa0 at imc0 irq 27
bus error: cpu_stat 00000000 addr 1fa000e4, gio_stat 00000000 addr 1fbd9007
Stopped at      Debugger+0x4:   jr      ra
[...]
ddb>

Later that month:

<landry> mmh always nice when the sgi powers off.. much less noise.
<miod> balony!
<otto> it all music to miod's ears
<miod> ah, you can't understand. this machine he is complaining about used to be mine. some bits of our sgi port where written on this very machine.
<krw> hence its continued screams of pain?
<landry> it's a nice box... but noiser than an Ultra60, which is already not quiet :)
<miod> no, that's the baby-mulching machine in the background
<miod> landry, i think it would matter less if you were not running this machine in a closet
<miod> too bad i have not been able to figure out how to drive the fans on this machine. i'd be happy if i could report them, to be fair.

Getting the IP26 kernel to run turned out to be challenging. I eventually reached a state where the machine would run single-user, but processes would randomly fail with segmentation faults.

I committed the R8000 support on September 29th:

Basic R8000 processor support. R8000 processors require MMU-specific code,
exception-specific code, clock-specific code, and L1 cache-specific code. L2
cache is per-design, of which only two exist: SGI Power Indigo2 (IP26) and SGI
Power Challenge (IP21) and are not covered by this commit.

R8000 processors also are 64-bit only processors with 64-bit coprocessor 0
registers, and lack so-called ``compatibility'' memory spaces allowing 32-bit
code to run with sign-extended addresses and registers.

The intrusive changes are covered by #ifdef CPU_R8000 stanzas. However,
trap() is split into a high-level wrapper and a new function, itsa(),
responsible for the actual trap servicing (which name couldn't be helped
because I'm an incorrigible punster). While an R8000 exception may cause
(via trap() ) multiple exceptions to be serviced, non-R8000 processors will
always service one exception in trap(), but they are nevertheless affected
by this code split.

This was followed by the IP26 support minutes later.

Work in progress support for the Power Indigo2 R8000 system (IP26). This is
basically an IP22 system (R4000 Indigo2) with the ECC memory board of IP28,
and a so-called ``streaming'' L2 cache.

IP26 kernels currently boot single-user, but don't live long; I am suspecting
a bug in the tcc cache routines, but am currently not able to find it (come
to think of it, my understanding of how this cache works could be wrong, and
of course there is no documentation for it but what can be gathered from
IRIX' <sys/IP26.h> comments and defines).

Hopefully this situation will improve in the near future; in the meantime I
am commiting this as `work in progress' to make sure this code doesn't get
lost.

fall 2012 status

SGI model common name Linux NetBSD OpenBSD
IP6, IP10 Personal Iris 4D/2x complete distribution
no X server
IP12 Indigo (R3000) complete distribution
IP20 Indigo (R4000) complete distribution complete distribution
no X server
IP22 Indigo2 complete distribution
XL (newport) graphics only
complete distribution
XL (newport) graphics only
complete distribution
no X server
IP24 Indy complete distribution
XL (newport) graphics only
complete distribution
XL (newport) graphics only
complete distribution
no X server
IP26 POWER Indigo2 R8000 code in the public source tree
no distribution yet
IP27 Origin 200, Origin 2000 complete distribution complete distribution
no SMP
IP28 POWER Indigo2 R10000 same as IP22 complete distribution
no X server
IP30 Octane not-yet-integrated kernel patches
X server on Impact only
complete distribution
no X server
IP32 O2 complete distribution
no R10000 support
complete distribution complete distribution
IP35 Fuel, Origin 300, Origin 350, Origin 3000, Onyx 350, Onyx 4, Tezro complete distribution
no SMP
no X server
Origin 300 and 3000 not supported

2014, OpenBSD

Being unable to figure out what was wrong with my code on IP26 was quite a setback for me, and I didn't do much work on sgi after that.

Some time ago, I had swapped Matthieu Herrb's O2 mainboard with a matching mainboard (same processor and memory), in order to be able to update its older PROM to a version supporting the RM5200 processor, so that he could replace the 180MHz R5000 processor module with a 300MHz RM5200 module. The PROM update can only be done from a fairly recent IRIX system, using

  flashinst -T -y /usr/cpu/firmware/IP32prom.img

I had brought his mainboard to a friend who was running IRIX on his sgi systems and had agreed to let me perform the update, so we put Matthieu's mainboard in my friend's O2, started IRIX, and ran flashinst.

I brought the updated mainboard, with the new processor, to Matthieu in mid-December 2013. Shortly afterwards, he reported to me that, whenever he tried to halt or reboot the machine, it would shut down as usual, but then hit a kernel panic due to an invalid memory access.

The obvious culprit was the PROM update, and then I realized none of the O2 mainboards I had around was running the latest 4.18 PROM I had flashed for Matthieu. I thus visited my IRIX friend again on January 17th, to flash another O2 mainboard.

I went back home, put the mainboard back in its chassis, booted OpenBSD, logged in, issued "shutdown -r now", and hit the same kernel panic as Matthieu.

A few hours of debugging showed that the ARCBios-provided function pointers for halt, reboot and powerdown had wrong addresses! The OpenBSD kernel would trust these pointers but end up jumping into the middle of an ARCBios routine. No wonder this ended up with invalid memory accesses.

I added a crude workaround, checking whether the ARCBios function pointer values match the known 4.18 values, and correcting them in that case.

For some reason (lack of testing being my #1 candidate), IP32 PROM version 4.18
ends up with the CKSEG1 function pointer values being wrong. All of them.
Wrong, as in, off-by-0x60.

Of course, invoking the advertized function pointers leads to interesting
results, from bogus panics to invalid pointer dereferences.

Attempt to identify this revision by:
- checking that all five function pointer values, as set up by bios_ident(),
match the 4.18 bogus values;
- instructions at said pointer match the 4.18 values.

If the test is positive, then the pointer values are replaced with the correct
values. This allows O2 systems with 4.18 PROM to correctly powerdown, reboot
and halt.

Found the hard way by matthieu@ after a PROM upgrade, verified on a second
system by me.
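
The identification logic of this commit message can be sketched as follows. This is only an illustration: the function and structure names are hypothetical, the caller is expected to supply the table of known-bad pointer values, expected instruction words and replacement values (no real PROM addresses are shown here), and the instruction words found at the pointers are passed in as an array rather than read from memory.

```c
#include <stddef.h>
#include <stdint.h>

#define NVEC	5	/* number of function pointers being checked */

/* One entry per function pointer; the values are supplied by the caller. */
struct prom_quirk {
	uint64_t bogus;	/* pointer value advertised by the affected PROM */
	uint32_t insn;	/* instruction word expected at that address */
	uint64_t fixed;	/* value to use instead */
};

/*
 * Replace the pointers only if every one of them matches the known
 * bogus value AND the instruction found there matches as well.
 * Returns 1 if the fixup was applied, 0 if the vector was left alone.
 */
int
prom_fixup(uint64_t vec[NVEC], const uint32_t insn_at_vec[NVEC],
    const struct prom_quirk quirk[NVEC])
{
	size_t i;

	for (i = 0; i < NVEC; i++)
		if (vec[i] != quirk[i].bogus ||
		    insn_at_vec[i] != quirk[i].insn)
			return 0;	/* not the affected revision */

	for (i = 0; i < NVEC; i++)
		vec[i] = quirk[i].fixed;
	return 1;
}
```

Requiring all five pointers and all five instruction words to match keeps the risk of patching a different, correct PROM revision very low.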

2015, OpenBSD

In September 2015, Naruaki Etomi, who had an R8000 Indigo2 system, tried to run OpenBSD on it, noticed binaries would not work reliably, and investigated why. He sent a patch, which allowed him to boot multiuser, to the OpenBSD/sgi mailing list.

Although his patch was not completely correct, it made me realize my bug. In a set of assembler instructions computing the position of a page table entry, I was incorrectly clearing one bit too many. This caused half of the virtual addresses of a process to end up aliasing lower addresses instead of pointing to the right memory pages, causing memory corruption. In fact, it was a little miracle that I had been able to reach single-user mode (it's a good thing OpenBSD's /bin/sh is reasonably small; I doubt I would have had the same outcome had the shell been bash or zsh).
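
The effect of such a bug can be shown with a small C sketch. The real code was assembly and its exact constants are not given here, so the page shift and index width below are assumptions chosen only for illustration. With the index mask one bit too narrow, every address in the upper half of the range covered by a page table page lands on the entry of an address in the lower half.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT	14	/* assumed: 16KB pages */
#define SK_INDEX_BITS	10	/* assumed: 1024 entries per table page */

/* Correct index of the page table entry for a virtual address. */
unsigned
pte_index(uint64_t va)
{
	return (unsigned)(va >> SK_PAGE_SHIFT) & ((1u << SK_INDEX_BITS) - 1);
}

/* Same computation with one bit too many cleared from the index. */
unsigned
pte_index_buggy(uint64_t va)
{
	return (unsigned)(va >> SK_PAGE_SHIFT) &
	    ((1u << (SK_INDEX_BITS - 1)) - 1);
}
```

For instance, page number 600 gets index 600 with the correct routine but index 88 with the buggy one, the same entry as page number 88, while page number 88 itself is unaffected.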

I quickly committed a fix for the problem, and could enjoy my R8000 Indigo2 booting multiuser!

I could then enable the build of IP26 kernels and boot blocks in the OpenBSD/sgi snapshots, on September 27th.

Visa Hankala then spent almost the whole month of December getting multiprocessor operation working on the IP27 and IP35 systems, finally allowing Theo's Origin 350 machine to use its four processors. He also switched the pmap module to use 3-level page tables, finally allowing the 64-bit userland to use up to one terabyte of virtual memory, instead of the 2GB it was still constrained to.

fall 2015 status

SGI model common name Linux NetBSD OpenBSD
IP6, IP10 Personal Iris 4D/2x complete distribution
no X server
IP12 Indigo (R3000) complete distribution
IP20 Indigo (R4000) complete distribution complete distribution
no X server
IP22 Indigo2 complete distribution
XL (newport) graphics only
complete distribution
XL (newport) graphics only
complete distribution
no X server
IP24 Indy complete distribution
XL (newport) graphics only
complete distribution
XL (newport) graphics only
complete distribution
no X server
IP26 POWER Indigo2 R8000 complete distribution
no X server
IP27 Origin 200, Origin 2000 complete distribution complete distribution
IP28 POWER Indigo2 R10000 same as IP22 complete distribution
no X server
IP30 Octane not-yet-integrated patches
X server on Impact only
complete distribution
no X server
IP32 O2 complete distribution
no R10000 support
complete distribution complete distribution
IP35 Fuel, Origin 300, Origin 350, Origin 3000, Onyx 350, Onyx 4, Tezro complete distribution
no X server
Origin 300 and 3000 not supported

2018, OpenBSD

Although I had not been active in OpenBSD for three years, I was still tinkering from time to time with my SGI systems.

While playing with O2 machines, I had noticed that, in some cases, the network interface would operate very slowly, but had no idea why. Eventually, I started to investigate, and gathered some clues. This ended up with a fix sent to the OpenBSD tech mailing list.

Date: Sat, 8 Dec 2018 17:32:41 +0000
From: Miod Vallat
To: tech@openbsd.org
Subject: SGI O2 mec(4) cold boot issue (and workaround)

I have noticed, for a while, that my O2 systems were horribly slow
during installs or upgrades, when fetching sets from the network (28
*minutes* to fetch base64.tgz).

At first, I thought this was a bsd.rd specific bug, but couldn't find
anything obvious. After gathering enough data, I found out that the
problem only occurs on a cold boot. After a reboot, the network
performance is as good as it can be. That would explain why I would only
notice it during upgrades.

I also noticed that, on a warm boot, the dmesg would show:

mec0 at macebus0 base 0x00280000 irq 3: MAC-110 rev 1, address 08:00:69:0e:bf:a1
nsphy0 at mec0 phy 8: DP83840 10/100 PHY, rev. 1

but on cold boots, it would show:

mec0 at macebus0 base 0x00280000 irq 3: MAC-110 rev 1, address 08:00:69:0e:bf:a1
nsphy0 at mec0 phy 10: DP83840 10/100 PHY, rev. 1

Note that, in these cases, the phy seems to attach to a different
address. In these cases, after booting, "ifconfig mec0" would show:

mec0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 08:00:69:0e:bf:a1
        llprio 3
        media: Ethernet autoselect
        status: active
        inet 10.0.1.193 netmask 0xff000000 broadcast 10.255.255.255

while one would expect the "media" line to be similar to:

        media: Ethernet autoselect (100baseTX full-duplex)

Investigating further, it seems that, after a cold boot, the MII bus
takes some time to initialize; the phy does not answer to address 8 but
to a larger address (10 or 11), then, after being reset, to its correct
address of 8.

So the kernel would discover the phy at a wrong address, attach it, and
after it gets reset, reading from the phy at the wrong address would
return either all bits clear or all bits set, confusing the link speed
logic without any way to recover.

What I tried but did not work:
- invoking mii_attach() twice in the mec driver. This would attach nsphy
  twice, once at the wrong address, then once at the correct address,
  but the first (wrong) attachment would be preferred.
- adding a one second delay between the Ethernet interface reset and
  mii_attach(). This would work most of the time, but not always.

What I tried and works:
- the first time the interface is reset, the mii bus is walked and all
  phys found on it are reset. Thus, by the time mii_attach() runs and
  walks the bus again, the phy will answer at the right address.

The diff below implements this (last chunk of if_mec.c), and also cleans
the mii read/write routines a bit (all the other chunks).

Tested on three different R5K family O2 systems, which have all been
exposing that problem on cold boot.

Visa Hankala committed that fix (as well as a few other minor diffs I had) for me.
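
The workaround described in the mail can be sketched against a simulated MII bus. This is not the if_mec.c diff: the simulation structure and the function names are hypothetical, and only the register numbers and the reset bit (BMCR at register 0, BMSR at register 1, reset in bit 15 of BMCR) are the standard MII ones. The simulated phy answers at a wrong address until it is reset, as observed after a cold boot.

```c
#include <stdint.h>

#define MII_NPHY	32	/* addresses on an MII management bus */
#define MII_BMCR	0x00	/* basic mode control register */
#define MII_BMSR	0x01	/* basic mode status register */
#define BMCR_RESET	0x8000	/* software reset bit in BMCR */

/* Simulated bus with a single phy (hypothetical test fixture). */
struct mii_sim {
	int addr;	/* address the phy currently answers at */
	int good_addr;	/* address it answers at once reset */
};

uint16_t
sim_read(struct mii_sim *b, int phy, int reg)
{
	if (phy != b->addr)
		return 0xffff;		/* nobody home: all bits set */
	return reg == MII_BMSR ? 0x7809 : 0x0000; /* plausible status bits */
}

void
sim_write(struct mii_sim *b, int phy, int reg, uint16_t val)
{
	if (phy == b->addr && reg == MII_BMCR && (val & BMCR_RESET))
		b->addr = b->good_addr;	/* reset moves it to its real address */
}

/* Walk the bus and reset every phy which answers; return the count. */
int
mii_reset_all(struct mii_sim *b)
{
	int phy, n = 0;

	for (phy = 0; phy < MII_NPHY; phy++) {
		uint16_t bmsr = sim_read(b, phy, MII_BMSR);

		if (bmsr == 0x0000 || bmsr == 0xffff)
			continue;	/* no phy at this address */
		sim_write(b, phy, MII_BMCR, BMCR_RESET);
		n++;
	}
	return n;
}

/* What a later probe would see: first answering address, or -1. */
int
mii_find_phy(struct mii_sim *b)
{
	int phy;

	for (phy = 0; phy < MII_NPHY; phy++) {
		uint16_t bmsr = sim_read(b, phy, MII_BMSR);

		if (bmsr != 0x0000 && bmsr != 0xffff)
			return phy;
	}
	return -1;
}
```

Running the reset walk once before the regular probe means the probe only ever sees the phy at its final address.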

2021, OpenBSD

All this work was short-lived, as Theo de Raadt decided to pull the plug on OpenBSD/sgi in 2019.

Date: Mon, 26 Aug 2019 09:34:46 -0600
From: Theo de Raadt
To: Miod Vallat
Subject: sgi

I'm going to terminate sgi base builds.
It's been fun.

But the disk controller hang makes it pretty difficult for me to do
repeated builds.  It happens during intense-write, about every 4th-8th
release build, towards the end when installing to destdir.  But worse,
it also happens very often when relinking libc.

The octeon port is working a fair bit better these days, I've done 80
builds a in a row with visa's latest pmap change.  It is fairly fast.
Also, I've come to an agreement with Rhino to buy some of their machines
with have M.2 SATA slots, so we'll be able to stand up a pkg cluster.

For loongson, things are looking dire.  The addition of clang there has
been dismal.  longsoon takes longer to build than landisk.

I'll keep carrying on with as many test canary architectures as possible,
but each time the sgi eats itself there is a 10% chance it eats it's
filesystem quite badly....

Despite this mail, OpenBSD/sgi release builds were still performed until the 6.9 release in May 2021, after which the port was removed from the tree.

Note that, despite the end of the OpenBSD/sgi port, a kind soul is still keeping the sgi kernels up to date (the userland can be obtained from the OpenBSD/octeon distribution sets, as Octeon shares the same endianness as SGI processors).

fall 2021 status

SGI model | common name | Linux | NetBSD | OpenBSD
IP6, IP10 | Personal Iris 4D/2x | | complete distribution; no X server |
IP12 | Indigo (R3000) | | complete distribution |
IP20 | Indigo (R4000) | | complete distribution | no more...
IP22 | Indigo2 | complete distribution?; XL (newport) graphics only | complete distribution; XL (newport) graphics only | no more...
IP24 | Indy | complete distribution?; XL (newport) graphics only | complete distribution; XL (newport) graphics only | no more...
IP26 | POWER Indigo2 R8000 | | | no more...
IP27 | Origin 200, Origin 2000 | complete distribution? | | no more...
IP28 | POWER Indigo2 R10000 | same as IP22 | | no more...
IP30 | Octane | complete distribution?; X server on Impact only | | no more...
IP32 | O2 | complete distribution?; no R10000 support | complete distribution | no more...
IP35 | Fuel, Origin 300, Origin 350, Origin 3000, Onyx 350, Onyx 4, Tezro | | | no more...

(To be fair, I am not sure there were still Linux distributions supporting SGI hardware in 2021: the Gentoo efforts did not appear to produce anything, and Debian dropped all its mips ports in 2019, the year the Octane support was finally merged into the linux-mips tree...)

Conclusion

That has been quite an eventful story. What strikes me the most, in these 20+ years, is that, although SGI hardware was appealing to many kernel developers, and became affordable in the 2000s, most of the work towards supporting these systems has almost always been a one-man effort: in Linux, the O2 and the Octane work were each done by a single person; this has also been the case in NetBSD for most of the SGI families it runs on.

OpenBSD is not much better in that regard, as I have done quite a significant part of the work once Per Fogelström put the port on the right track (especially with a 64-bit kernel from almost the beginning). Of course, I am thankful to everyone who helped, and glad they did! I have some regrets, such as not trying harder to get the X server to run on more frame buffers (Newport and Impact could have been done in a reasonable amount of time) and not working on audio support.

Even if that work seems to be lost, it survives in the Attic of the OpenBSD source tree, and I hope that the various lengthy comments I have put in these drivers can be used as good documentation, should someone want to tinker on these SGI systems in the future, regardless of which operating system they want to run.

OpenBSD will forever remain the first free software operating system to run on the Fuel, Tezro and Origin 350, as well as on the POWER Indigo2 R8000, despite so many people saying that no system other than IRIX would ever run on that processor, and that's something no one can ever take away from me.

I remain proud of that work, and I hope you have enjoyed this (long) story.