Disinformation 23

On oddments and ANSI


Column IndexAnsible Information Home PageDavid Langford Home Page

Wrong Number Blues

Does FORMAT.EXE work with PC/Xi BIOS 2.7? I met conflicting reports, and a weird statistical anomaly whereby older, more experienced Ansible customers were generally less able to make FORMAT work. Sometimes, even more bafflingly, it would work for a bit and then start issuing rude "Wrong BIOS version" messages. Odd.

The culprit was Apricot's own SETUP disk configuration utility, the old PC/Xi version with ladder menus -- as supplied to early customers and used only by the brave. Although initially confusing, this seemed to work well... but secretly it was saving one wrong byte. In fact, SETUP configuration changes the "major" BIOS version number to 1 before writing its bumf to the disk label sector. (What does this unnoticed error suggest about the BIOS version prevalent when SETUP was written?) BIOS 2.7 is thus heavily disguised as 1.7, and so on.

At boot-up time, the bad information is copied to the BIOS configuration table in RAM, which is read by FORMAT. This table starts at a double-word address stored in the usual MicroSoft offset-then-segment format at 700 hex in the zero segment (address 0000:0700 in DEBUG display format): offset at 700H, segment at 702H. Using this information you can read the actual configuration table, which starts with... well, Peter Rodwell's Advanced User's Guide is strangely silent on the table's early bytes, I suspect because of the cock-up. You should be able to read the minor and major BIOS version numbers directly from the first and second bytes of the table, e.g. 7 and 2 for version 2.7, 1 and 3 for 3.1. To make life more difficult, the late BIOS 3.1.2 is reported as though it were 3.2. After SETUP, these version numbers come out as 1.7, 1.1 and 1.2. Perhaps Rodwell thought it better not to document such a seemingly unreliable feature.

Apricot have provided a cure with the disk label-sector editor LABEL.COM, which I've also seen renamed as DSKLABED.COM to distinguish it from the LABEL.EXE volume label utility supplied with Apricot MS-DOS 3.20. The third edit screen of LABEL allows you to modify the bytes "Version No. LO" and "Version No. HI", correcting SETUP's error and also allowing the potentially dangerous experiment of fooling BIOS-specific programs about the version under which they're running. Be warned! Despite what some people think and a few will loudly tell you, altering this version number information does not (of course) alter the BIOS version which actually boots up.

Since macho users avoid such almost-friendly utilities as LABEL, I've been curing afflicted floppies more swiftly with DEBUG. With the disk in drive A, you can load the label sector with L 100 0 0 1. The command D 180 shows the hexadecimal contents of the disk's configuration table, starting with the two bytes already mentioned and going on with a diagnostic byte, then a byte which is zero if the parallel printer is selected and 1 for the serial one, then a byte for beeper volume (0 full, 15 off), and on into information about DOS, keyboard, screen, comms, etc. If your BIOS version is 2.7 and the D display begins with 07 01, you can fix it with E 181 2 and write the corrected sector to disk: W 100 0 0 1.
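For anyone who would rather script the repair than type DEBUG commands, here is a minimal Python sketch of the same one-byte patch. It is an illustration only. It assumes the label sector has already been read into a bytes object by some other means, and the 512-byte sector size, the function name and the sample data are my own inventions rather than anything from Apricot. The offset 80 hex is simply D 180 minus the load address 100.

```python
CONFIG_OFFSET = 0x80  # D 180 minus the DEBUG load address 100


def fix_major_version(sector: bytes, true_major: int) -> bytes:
    """Return a copy of the label sector with the major BIOS version corrected.

    Byte CONFIG_OFFSET holds the minor version; byte CONFIG_OFFSET + 1 holds
    the major version, which is the one SETUP wrongly sets to 1.
    """
    patched = bytearray(sector)
    patched[CONFIG_OFFSET + 1] = true_major
    return bytes(patched)


# Hypothetical sector as SETUP would leave it on a BIOS 2.7 machine
# (512 bytes is an assumed sector size, used here only for illustration).
damaged = bytearray(512)
damaged[CONFIG_OFFSET] = 7      # minor version
damaged[CONFIG_OFFSET + 1] = 1  # major version, wrongly written as 1

repaired = fix_major_version(bytes(damaged), true_major=2)
print(repaired[CONFIG_OFFSET + 1], repaired[CONFIG_OFFSET])  # major, then minor
```

Only the one byte changes; everything else in the sector is copied through untouched, which is the whole point of E 181 2.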

Assuming the BIOS number is now reliable, your own programs might usefully read it to learn what facilities are available. Getting it from disk would be folly: a floppy could have been changed since boot-up. Instead, imitate FORMAT and read the RAM configuration table. In Turbo Pascal the information can be had as an Integer (or Word in version 4.0) via:

 MemW[MemW[0:$702]:MemW[0:$700]]

Or in assembler:

 XOR AX,AX
 MOV ES,AX
 MOV BX,ES:[700h]
 MOV ES,ES:[702h]
 MOV AX,ES:[BX]

In both cases the major and minor numbers will appear as the high and low bytes of the integer result (Turbo) or AX register (assembler). When constructing assembler code in DEBUG, the H-for-hexadecimal is omitted from numbers, and the ES: segment command deported to a line of its own before the MOV command... e.g. ES: and then MOV BX,[700].

In BASIC? I wish you wouldn't ask such tactless things, but just this once....

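 10 DEF SEG = 0                           ' the pointer lives in segment zero
 20 BOFS = PEEK(&H700) + 256*PEEK(&H701)  ' no % suffix: may exceed 32767
 30 BSEG = PEEK(&H702) + 256*PEEK(&H703)  ' likewise for the segment
 40 DEF SEG = BSEG
 50 MIN% = PEEK(BOFS)                     ' minor version, first byte
 60 MAJ% = PEEK(BOFS+1)                   ' major version, second byte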

The last line shows how to read any byte of the table -- just slot in the appropriate internal offset instead of the 1. Similar offsets can be inserted in assembler; the last line there would become MOV AL,ES:[BX+3] to get the printer flag byte already mentioned. The equivalent Turbo expression is

Mem[MemW[0:$702]:MemW[0:$700]+3]

... using Mem rather than MemW to extract a single byte.
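The pointer-chasing is the same in all three languages, and may be easier to see with the segment arithmetic written out. The Python below is a hedged sketch working on a fake memory image, not on a real Apricot. The table location 0060:0010 and the sample table contents are made up purely for the demonstration, and the helper names are mine.

```python
def linear(segment: int, offset: int) -> int:
    """Convert a real-mode segment:offset pair to a 20-bit linear address."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF


def read_word(mem: bytes, addr: int) -> int:
    """Read a little-endian 16-bit word."""
    return mem[addr] | (mem[addr + 1] << 8)


def config_byte(mem: bytes, index: int) -> int:
    """Read byte `index` of the configuration table via the pointer at 0000:0700."""
    table_offset = read_word(mem, linear(0, 0x700))   # offset stored first
    table_segment = read_word(mem, linear(0, 0x702))  # then the segment
    return mem[linear(table_segment, table_offset + index)]


# Hypothetical memory image: table placed at 0060:0010 for illustration only.
mem = bytearray(0x2000)
mem[0x700:0x704] = bytes([0x10, 0x00, 0x60, 0x00])  # offset 0010h, segment 0060h
mem[0x610:0x615] = bytes([7, 2, 0, 1, 15])          # minor, major, diagnostic, printer, beeper

major, minor = config_byte(mem, 1), config_byte(mem, 0)
printer_is_serial = config_byte(mem, 3) == 1
print(f"BIOS {major}.{minor}, serial printer: {printer_is_serial}")
```

Index 3 here plays the part of the +3 in the assembler and Turbo versions.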

A dictionary utility

Speaking of disks, the F2 and F10 floppy drives are widely rumoured to be untrustworthy. For a while I noticed no trouble, but eventually found that the F10 sometimes formatted disks less than perfectly. A knowledgeable pal claimed that early Apricot FORMATs would skip the odd sector -- and that to be on the safe side you should format every floppy two or three times. Doing this, and switching to the latest FORMAT, gave a slight improvement but not 100% reliability.

I suspected that the stepping mechanism for the drive head had got old and creaky; the problem seemed connected with head position. On rare good days the drive wrote to double-sided disks and verified OK; on average days only single-sided disks would work; on bad days the words Disk Error and Data Error were endlessly repeated like mantras. When this happened I used the empirically devised Bad Day Disk whose low-numbered data sectors were filled with about 50K of junk file: once this much had been written to a blank disk, the rest would usually copy safely.

Then Steve Gimblett of Western Computer Maintenance (0752 228564, plug plug) suggested the Dictionary Utility. The F10 is now handling floppies perfectly, with a copy of Webster's Seventh New Collegiate Dictionary stuffed between the main box and the monitor. With the monitor standing directly on top as usual, magnetic leakage from the CRT can interfere sporadically with disk operation -- depending, presumably, on screen brightness and tilt, the head's physical location, and even the bit-pattern being written to the floppy. Do try a dictionary before giving your drive heads the wire brush and Dettol treatment.

The WEBSTER.COM configuration utility also cured an F2 which would boot from drive B but not from drive A. Guess which of the vertically stacked drives is closer to the monitor?

The smartarse corner

In the last episode of Mark Whitehorn's thrill-packed Turbo Pascal primer (Release 3.11), I couldn't help noticing some oddities -- doubtless deliberately inserted as a test for budding programmers. These are connected with the code which writes quantities of goodies into random positions of a game-location array. The program GAMEDEMO.PAS defines an enumerated Contents type whose values include Empty, Water, Gold and Penguin, and the variable Location: Array[1..40,1..30] of Contents. So far, so good.

After setting every element of Location to Empty, and then filling 20 at random with Water, 20 with Gold and 10 with Penguins (in that order), the program claims in comments that the array will now contain 20 or fewer water jugs, 20 or fewer sacks of gold, and exactly 10 Penguins. Obviously there's a chance that some random Gold assignations will overwrite previous ones for Water, and that Penguins can overwrite either. The missed-out point is that there's a small but significant chance that with true random selection, Gold will also overwrite Gold and Penguins will overwrite Penguins. For the ten Penguins scattered over the array's 1200 locations, this should happen in approximately 3.7% of program runs. In other words, pedants should note that there'll be exactly ten Penguins only 96.3% of the time....
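The arithmetic behind that percentage is the familiar birthday-problem product, sketched here in Python for anyone who wants to check it. It assumes truly uniform random choices over all 40 x 30 = 1200 locations, and the function name is mine.

```python
from math import prod

CELLS = 40 * 30  # number of elements in the Location array


def p_all_distinct(items: int, cells: int = CELLS) -> float:
    """Probability that `items` uniformly random picks land on different cells."""
    return prod(1 - k / cells for k in range(items))


p_ten_penguins = p_all_distinct(10)
print(f"exactly ten Penguins:      {p_ten_penguins:.1%}")      # 96.3%
print(f"at least one Penguin lost: {1 - p_ten_penguins:.1%}")  # 3.7%
```

Since the Penguins go in last, nothing can overwrite them afterwards, so collisions among themselves are the only way to end up with fewer than ten.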

Also, here's a useful tip. GAMEDEMO.PAS sets all the elements of Location to Empty by use of nested For loops. No harm in that; but should program speed ever become vital, with the Location array being cleared frequently, Turbo offers a much faster way:

FillChar(Location,SizeOf(Location),Empty);

Note that this statement works no matter how you change the dimensions of Location!

Those were minor points. The major one, showing the commonest pitfall of array usage, appears when values are assigned to array elements by such statements as:

For Count:= 1 to 10 do
 Location[Random(40),Random(30)]:= Penguin;

Seems OK? Check the array definition and look again. The function Random(N) returns an integer value in the range zero to N-1. Not only will this statement never affect any array element whose first index is 40 or whose second is 30, it will occasionally try to change elements with a zero index -- outside the boundaries of the array. In the absence of array bounds checking, this could write Penguins (whatever numerical value they might actually have) into the data preceding the array proper... possibly even into the machine code. This is immensely dangerous, and a Bad Thing.

When developing a Turbo program with heavy array use, it's worth asking for index range checking, with the compiler directive {$R+}... just include it near the top of the program. Bad array calls, like an attempt to slot a Penguin into Location[0,0], will then stop the program with a "Run-time error 90" message and a note of the PC (program counter) value. It's easy then to locate the offending statement -- either immediately by pressing ESC if you're running the program in memory from within Turbo, or otherwise by asking for "Find run-time error" on Turbo's Options menu, and then entering the given PC value.

When all's well -- and remember, some array indexing errors may show up only rarely -- remove the {$R+} again to restore full (if reckless) program speed. Meanwhile, when trying last issue's GAMEDEMO.PAS, shift those random array positions into the correct range with, for example:

Location[1+Random(40),1+Random(30)]:= Penguin;
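To see how often the unshifted indices go astray, here is a rough Python simulation. Python's randrange(n) covers 0 to n-1, the same range as Turbo's Random(N), so it stands in reasonably well; the in_bounds helper is my own imitation of the {$R+} check, not Turbo's actual mechanism.

```python
import random

ROWS, COLS = 40, 30  # Location: Array[1..40,1..30]


def in_bounds(i: int, j: int) -> bool:
    """Imitate the range check for a 1-based 40 x 30 array."""
    return 1 <= i <= ROWS and 1 <= j <= COLS


rng = random.Random(1)  # fixed seed so the run is repeatable

# Unshifted indices, as in the original statement.
buggy = [(rng.randrange(ROWS), rng.randrange(COLS)) for _ in range(10_000)]
# Shifted into 1..40 and 1..30, as in the corrected statement.
fixed = [(1 + rng.randrange(ROWS), 1 + rng.randrange(COLS)) for _ in range(10_000)]

bad_buggy = sum(not in_bounds(i, j) for i, j in buggy)
bad_fixed = sum(not in_bounds(i, j) for i, j in fixed)
print(bad_buggy, bad_fixed)  # the second count is always zero
```

Roughly one unshifted assignment in seventeen has a zero index somewhere, so a run that plants 50 objects will usually stray outside the array at least once.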

Further escape plans

Bright ideas for future Disinformation columns are always welcome, unless they imply hard work. Eddie Bromhead suggested that I supplement last year's Complete Escape Sequences series with a piece on the ANSI (American National Standards Institute) screen controls, usually mentioned here in tones of dislike. Let's take a look, explaining in the process why this "IBM compatible" screen handling fails to impress me with its efficiency. First I'll expand on the original series's background, and mention display methods where ESC-sequence controls don't apply.

As detailed last year (Releases 2.9 to 2.11), "screen" escape sequences are strings of two or more characters which always begin with ESC, and which when "displayed" don't appear but are intercepted by the Apricot screen driver and interpreted as special BIOS commands. These usually affect the screen (erasing or deleting all or part of it, moving the cursor, setting attributes like underline or inverse video, etc) but can in principle do anything which came into the clever little heads of the BIOS authors (e.g. keyboard modification).

All this is tackled by what Apricot call the Interrupt F1H handler, through which all MS-DOS screen output is passed (this includes "file" writing and copying to the CON display device). The famously undocumented MS-DOS Interrupt 29H sends stuff to the display by the same route. Note that there are other routes! The Control Device screen handler (Int FCH, Device 31H) doesn't intercept ESC sequences and treat them as special: the ESC just appears as the left-pointing arrow which is its nominal font character. If an Apricot program using ESC sequences runs at all on a bog-standard IBM, there's usually a rash of these arrows, introducing ESC sequences which the IBM tactlessly displays without acting on.

ESC sequences are likewise ignored by Apricot's final BIOS 3.1 screen handler, working through Int ECH. An article on this is perhaps due: Apricot recommended it strongly for all "generic" applications, but although it's fast the user has to do a frightful lot of work. "Displaying a character" through Int F1H involves no more than sticking the character code in the AL register and calling the interrupt: the BIOS automatically puts the character at the current cursor position, making it bright, underlined or inverse-video according to the current default, and goes on to reposition the cursor and scroll the screen as necessary. Writing a character via Int ECH requires that you load CL with command code 4 ("display character"), CH with a screen ID, DX with the character plus attribute codes, AX with display position coordinates, SI with optional font information and DI with an "image ID" laboriously obtained from earlier calls... and after all that, if you want the cursor moved or the screen scrolled up you must do it yourself with more Int ECH calls. Help!

Finally... displaying text by the disreputable method of writing straight to video memory will naturally also bypass the ESC sequence handler, and ensure incompatibility not only with IBMs but also with other Apricots. The F series and Portable keep their video memory at one address; the PC, Xi and Xen use another location and a different format. In the latter case the video memory at segment F000H is the same as that for the Victor/Sirius: but Sirius programs which write straight to video memory produce gibberish on the Xi and Xen, because the Sirius character coding isn't compatible with Apricot's system of font pointers. Rule of computing: it's always more complicated than you thought.

Into the ANSI world

Once understood, Apricot's own "VT52" ESC sequences are straightforward. ESC Y is a good example. When this sequence is received, the BIOS "knows" that the next two characters passed to the screen driver will be coded row and column coordinates for cursor repositioning. The characters are intercepted and the cursor moved as appropriate.

My fundamental objections to the committee-designed ANSI screen controls are that they're invariably longer -- wasting time by requiring more characters to be sent to the display -- while their weird back-to-front syntax requires more processor overhead. Thus....

All ANSI sequences start with ESC [ (i.e. ASCII characters 27 and 91 decimal, 1B and 5B hex). This tends to be followed by a number or numbers, in the long-winded decimal ASCII format whereby the value 64 is sent as two characters, 6 and 4, rather than being compressed Apricot-fashion into the single character @ whose ASCII code is 64. (Already, processor time is being wasted converting raw numbers into numerical strings.) The rule is that the first alphabetic character in the sequence constitutes the "code" which finally tells the BIOS what the command has been all about. For example, the cursor positioning code is H (or f). To move the cursor to the tenth row and fortieth column of the screen, you send the eight-character string:

ESC [ 1 0 ; 4 0 H

Or, if you prefer:

ESC [ 1 0 ; 4 0 f

(The screen's first row and column are taken throughout as being number 1, not zero. Semicolons are used to separate multiple numerical ANSI parameters.)

The format accounts for my nasty remarks about processor overhead. Obviously, eight characters take twice as long to send to the screen driver as the four in Apricot's normal ESC Yyx. Perhaps less obviously, the fact that the command code comes at the end means that the screen driver has had to spend time testing each of the six characters after the ESC [ lead-in, to see whether it's a command letter (with the character being stored for future reference only if it isn't).
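The character counts are easy to confirm by building both strings. In this Python sketch the ANSI form follows the syntax given above; the ESC Y coding is an assumption on my part (each 1-based coordinate sent as the character with code 31 plus the value, the usual VT52 convention), since the coding isn't restated in this article.

```python
ESC = "\x1b"


def ansi_goto(row: int, col: int) -> str:
    """ANSI cursor positioning: ESC [ row ; col H, 1-based decimal parameters."""
    return f"{ESC}[{row};{col}H"


def vt52_goto(row: int, col: int) -> str:
    """ESC Y followed by one character per coordinate.

    ASSUMPTION: each 1-based coordinate is sent as chr(31 + value), as on a
    VT52; check the earlier escape-sequence articles for Apricot's own coding.
    """
    return f"{ESC}Y{chr(31 + row)}{chr(31 + col)}"


ansi = ansi_goto(10, 40)
vt52 = vt52_goto(10, 40)
print(len(ansi), len(vt52))  # 8 and 4 characters respectively
```

Whatever the exact coding, the ESC Y form stays at four characters for any screen position, while the ANSI form grows with the number of digits.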

To be fair, the ANSI sequences needn't always be quite this verbose. Numerical parameters can be omitted and, if so, a default value of 1 will be assumed. All the following sequences and their ESC [..f equivalents will home the cursor.

ESC [ 1 ; 1 H
ESC [ 1 ; H
ESC [ ; 1 H
ESC [ ; H
ESC [ H

...the last being only mildly less efficient than Apricot's coincidentally similar ESC H.
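The defaulting rule, and the business of scanning for the first alphabetic character, can be sketched in a few lines of Python. This is only my guess at the general shape of such a parser, not a description of how Apricot's BIOS actually does it; the function names are hypothetical.

```python
ESC = "\x1b"


def parse_ansi(seq: str):
    """Split an ANSI sequence into (command letter, list of numeric parameters).

    Omitted parameters default to 1. The scan stops at the first alphabetic
    character, which is taken as the command code.
    """
    if not seq.startswith(ESC + "["):
        raise ValueError("not an ANSI sequence")
    digits = ""
    params = []
    for ch in seq[2:]:
        if ch.isalpha():
            params.append(int(digits) if digits else 1)
            return ch, params
        if ch == ";":
            params.append(int(digits) if digits else 1)
            digits = ""
        elif ch.isdigit():
            digits += ch
        else:
            raise ValueError(f"unexpected character {ch!r}")
    raise ValueError("no command letter found")


def cursor_target(seq: str):
    """Return (row, col) for an H or f sequence, padding missing values with 1."""
    command, params = parse_ansi(seq)
    if command not in ("H", "f"):
        raise ValueError("not a cursor positioning sequence")
    row, col = (params + [1, 1])[:2]
    return row, col


for body in ("1;1H", "1;H", ";1H", ";H", "H"):
    print(repr(body), cursor_target(ESC + "[" + body))  # (1, 1) every time
```

Note how every character must be inspected and the digits hoarded before the parser learns what the command is, which is the overhead complained of above.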

In what follows, I'll use capital N to stand for a numerical ASCII parameter as described above. A value of (say) 10 for N means sending two ASCII characters, 1 and 0, as in the cursor positioning example. (0 and 1 are characters 48 and 49 in the ASCII code table, remember!) Where there are two Ns, for cursor positioning, we'll call them X and Y for column and row. Don't forget that omitting any N, X or Y will have the same effect as setting it to 1, and that the ; separator can be omitted when there's only one parameter or none at all. The actual capital letters N, X and Y are not part of any valid Apricot ANSI sequence, and neither are the spaces I've inserted for clarity.

1: Cursor Control

cursor up N rows              ESC [ N A
cursor down N rows            ESC [ N B
cursor right N columns        ESC [ N C
cursor left N columns         ESC [ N D
cursor moved to (X,Y)         ESC [ Y ; X H
ditto                         ESC [ Y ; X f
save cursor position          ESC [ s
restore saved position        ESC [ u
cursor position report        ESC [ 6 n

Notes: Single-step cursor movements with ESC [ 1 A or ESC [ A (and so on) bear an uncanny resemblance to the Apricot ESC A, ESC B etc. cursor commands. The ANSI codes will similarly not move the cursor beyond the edge of the current screen window: rather than wrapping around, it won't move at all.

Why two identical "move to X,Y" commands? Apricot's BIOS 3.1 manual lists the second only as "IBM compatible", though both are supported on the IBMs to hand here. Older Apricot manuals claim a different syntax for the second sequence, with the row and column parameters swapped: ESC [ X ; Y f. This is apparently just a mistake. As far back as BIOS 2.4, the commands work with identical syntax.

The ESC [s and ESC [u cursor save and restore are interchangeable with the more usual Apricot ESC j and ESC k: a position saved with ESC [s can be restored with ESC k, and so on.

The "cursor position report" function is documented as "device status report" and is a real weirdie. What it should do, according to IBM ANSI documentation (required to make sense of the BIOS 3.1 manual's incomprehensible reference to it), is to return the cursor position to the keyboard buffer in the format of the dummy ANSI sequence ESC [ Y ; X R. What it actually does is to imitate Apricot's own position-report sequence ESC n, returning the position not in ANSI format but as an Apricot ESC Yyx sequence which can be read as if from the keyboard. (See Release 2.10.) No IBM compatibility there.
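A program that has to cope with both behaviours could accept either reply format. The Python below is a hedged sketch of that idea. The ANSI branch follows the ESC [ Y ; X R format quoted above; the Apricot branch assumes the coordinates are coded as the character with code 31 plus the 1-based value, VT52 style, which is my assumption and should be checked against Release 2.10.

```python
import re

ESC = "\x1b"
ANSI_REPORT = re.compile(re.escape(ESC) + r"\[(\d+);(\d+)R")


def parse_report(reply: str):
    """Return (row, col), both 1-based, from either style of position report."""
    ansi = ANSI_REPORT.fullmatch(reply)
    if ansi:
        return int(ansi.group(1)), int(ansi.group(2))
    if len(reply) == 4 and reply.startswith(ESC + "Y"):
        # ASSUMPTION: coordinates coded as chr(31 + value), as on a VT52.
        return ord(reply[2]) - 31, ord(reply[3]) - 31
    raise ValueError("unrecognised cursor position report")


print(parse_report(ESC + "[10;40R"))                   # IBM-style reply
print(parse_report(ESC + "Y" + chr(41) + chr(71)))     # Apricot-style reply
```

Both calls describe row 10, column 40 under the stated assumption.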

2: Erase Screen Text

erase to end of screen        ESC [ 0 J
erase from start of screen    ESC [ 1 J
clear whole screen            ESC [ 2 J
erase to end of line          ESC [ 0 K
erase from start of line      ESC [ 1 K
clear whole current line      ESC [ 2 K

Notes: Erasures "to" a given point begin at and include the cursor position. Erasures "from" a point end at and include the cursor position. None of these commands extends beyond the current screen window or moves the cursor. (Some Apricot-standard sequences do; ESC E clears the screen and homes the cursor. Oddly enough, ESC [2J does home the cursor on IBMs. So much for the gesture towards compatibility!)
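The "begin at and include" and "end at and include" rules are the sort of thing that invites off-by-one errors, so here is a small Python model of the three line-erase modes acting on a string. It is a sketch of the semantics as described in the notes, with a 0-based cursor index and a visible fill character chosen for the demonstration.

```python
def erase_in_line(line: str, cursor: int, mode: int, blank: str = " ") -> str:
    """Apply ESC [ mode K to `line`; `cursor` is a 0-based index into the string.

    mode 0: erase from the cursor to the end of the line (cursor included)
    mode 1: erase from the start of the line to the cursor (cursor included)
    mode 2: erase the whole line
    """
    if mode == 0:
        return line[:cursor] + blank * (len(line) - cursor)
    if mode == 1:
        return blank * (cursor + 1) + line[cursor + 1:]
    if mode == 2:
        return blank * len(line)
    raise ValueError("mode must be 0, 1 or 2")


text = "ABCDEFGH"  # cursor on the D (index 3); dots stand in for blanks
print(erase_in_line(text, 3, 0, blank="."))  # ABC.....
print(erase_in_line(text, 3, 1, blank="."))  # ....EFGH
print(erase_in_line(text, 3, 2, blank="."))  # ........
```

In each case the character under the cursor goes, and the cursor index itself is left alone.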

3: Screen Text Attributes

turn off all attributes       ESC [ 0 m
turn on bold text             ESC [ 1 m
turn on underlining           ESC [ 4 m
turn on inverse video         ESC [ 7 m

Notes: The attributes can be freely combined (not so on an IBM, where monochrome inverse-video text can't be also bright or underlined). But ANSI's lack of specific "off" commands makes for clumsiness. If all three attributes are set and you want to turn off underlining alone, you must send ESC [0m ESC [1m ESC [7m (as opposed to the Apricot ESC 1).
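The reset-and-rebuild dance can be wrapped in a helper, as in this Python sketch. The attribute numbers come from the table above; the function and dictionary names are mine, and the approach of sending one sequence per attribute simply mirrors the example in the notes.

```python
ESC = "\x1b"
CODES = {"bold": 1, "underline": 4, "inverse": 7}  # ANSI attribute numbers


def turn_off(active: set, attribute: str) -> str:
    """Build the sequences needed to drop one attribute while keeping the rest."""
    remaining = sorted(CODES[name] for name in active if name != attribute)
    return f"{ESC}[0m" + "".join(f"{ESC}[{code}m" for code in remaining)


seq = turn_off({"bold", "underline", "inverse"}, "underline")
print(seq.replace(ESC, "ESC "))  # ESC [0mESC [1mESC [7m
print(len(seq))                  # 12 characters, against 2 for ESC 1
```

The caller has to remember which attributes are currently on, which is bookkeeping the Apricot sequences never ask for.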

4: Miscellaneous

set auto CR after LF          ESC [ 20 h
turn off auto CR after LF     ESC [ 20 l
set wrap at line end          ESC [ 7 h
turn off auto wrapping        ESC [ 7 l
set screen window
 (lines N1 to N2)             ESC [ N1 ; N2 r
set origin mode               ESC [ 6 h
turn off origin mode          ESC [ 6 l

Notes: The "wrap" and "auto CR after LF" features behave just like Apricot's standard ones. The "auto CR" feature works with all three of the "move cursor down one line" screen control characters, LF (0A hex), VT (0B) and FF (0C). In each case, sending the character to the screen when this feature is on will also cause the cursor to be moved to the left of the current window.

The "screen window" sequence isn't as good as Apricot's: you can set the window's top and bottom lines only. The cursor is homed in the selected screen area. If an Apricot ESC , sequence has already been used to specify a window, ESC [..r will preserve its left and right margins while resetting the top and bottom. N1 and N2 are always measured from the physical top of the screen, not that of the current window, and the first screen line is counted as line 1 as above -- except that on the F10 here, it's treated as line 0 for the purposes of this ANSI command only. Setting/clearing normal windows with ESC , or ESC . will also wipe out the effects of ANSI windowing. As with normal Apricot windowing, a window height of only one line, or two, can't be set on the F2/F10.

"Origin mode" doesn't feature in my IBM ANSI references, and to encourage completists Apricot have explained it only in obsolete Technical Reference Manuals -- the super final BIOS 3.1 manual merely mentions these commands. The obvious implication is that they don't actually work. With origin mode on, the ANSI cursor moves supposedly behave like ESC Y by refusing to move the cursor out of the current screen window. With origin mode off, you can supposedly move outside the window with ESC [..H or ESC [..f. In fact it makes no difference; apparently you can always leave the ANSI window with these sequences though not with ESC Y, while none of them will let you leave a non-ANSI-compatible window -- i.e. one set with ESC , which doesn't fill the whole screen width.

So much for the fabled ANSI sequences. On the whole I wouldn't bother.

