Time Trouble
Last Friday I had a moment of panic. While investigating why different run-time libraries might interpret file timestamps differently, I noticed that even Windows doesn’t always agree with itself. When was dos4gw.exe last modified, at 10:14 PM or 9:14 PM?
For some (but not all) files, the dir command in cmd shows timestamps that are off by an hour compared to timestamps shown by Explorer or PowerShell. How is this possible? And is either of those timestamps even correct?
Learn Something Old Every Day, Part XIV: read() Return Value May Surprise
Last week I amused myself by porting some source code from Watcom C to Microsoft C. In general that is not difficult, because Watcom C was intended to achieve a high degree of compatibility with Microsoft’s C dialect.
Yet one small-ish program kept crashing when built with Microsoft C. It didn’t seem to be doing anything suspicious and didn’t produce any noteworthy warnings when built with either compiler.
After some head scratching and debugging, I traced the difference to a piece of code like this:
if( read( hdl, buf, BUF_SIZE ) != BUF_SIZE )
    // Last file block read, deal with EOF
else
    // Not near end of file
To my surprise, the return value from read() is rather different between the two compilers’ run-time libraries when the file is open with the O_TEXT flag (and therefore meant to translate line endings from CR/LF to LF when reading).
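To see how a text-mode read can come up short without being anywhere near the end of the file, here is a minimal sketch of CR/LF-to-LF translation applied to a buffer. The name translate_crlf is hypothetical, and this is not the actual code of either run-time library; it only shows that the translated length can be smaller than the number of bytes fetched from disk.

```c
#include <stddef.h>

/* Hypothetical helper: collapse each CR/LF pair in buf[0..len) to a single
 * LF, in place, and return the new length. A lone CR is kept as-is. */
size_t translate_crlf(char *buf, size_t len)
{
    size_t src, dst = 0;

    for (src = 0; src < len; src++) {
        if (buf[src] == '\r' && src + 1 < len && buf[src + 1] == '\n')
            continue;   /* drop the CR; the LF is copied on the next pass */
        buf[dst++] = buf[src];
    }
    return dst;
}
```

If a library fetches BUF_SIZE raw bytes and then translates them, the count it returns is smaller than BUF_SIZE whenever the block contained a CR/LF pair. Comparing the result against BUF_SIZE is then not a reliable end-of-file test; a return value of 0 is.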
Learn Something Old Every Day, Part XIII: InDOS Is Not Enough
The other day I spent a while trying to understand the purpose of a rather strange looking piece of code inside Borland’s THELP.COM utility shipped with Turbo Pascal 6.0 (THELP.COM was misbehaving under emulated DOS).
The THELP utility performs the following actions:
- Use INT 21h/34h to get the address of the InDOS flag
- Starting from the beginning of the InDOS segment, search for word 3E80h using the SCASW instruction
- If found, check if the location in memory six bytes past the 3E80h word holds the value BCh
- If so, store the word just past 3E80h for later use
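The steps above can be sketched in C roughly as follows. This is an illustration, not THELP's code: the INT 21h/34h call is left out and a plain byte buffer stands in for the InDOS segment. Two details are assumptions on my part: the "six bytes past" offset is counted from the start of the matched word, and the search gives up if the first match fails the BCh check. The names find_signature, SIG_WORD, CHECK_OFFSET, and CHECK_BYTE are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define SIG_WORD     0x3E80u  /* stored little-endian as bytes 80h 3Eh */
#define CHECK_OFFSET 6        /* assumption: counted from the start of the match */
#define CHECK_BYTE   0xBCu

/* Scan seg[0..len) in word steps, as SCASW would starting at offset 0.
 * On success, store the word that follows the signature in *out and
 * return 1; otherwise return 0. */
int find_signature(const uint8_t *seg, size_t len, uint16_t *out)
{
    size_t off;

    for (off = 0; off + CHECK_OFFSET < len; off += 2) {
        uint16_t w = (uint16_t)(seg[off] | (seg[off + 1] << 8));

        if (w != SIG_WORD)
            continue;
        if (seg[off + CHECK_OFFSET] != CHECK_BYTE)
            return 0;   /* assumption: only the first hit is examined */
        *out = (uint16_t)(seg[off + 2] | (seg[off + 3] << 8));
        return 1;
    }
    return 0;
}
```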
This logic is applied to DOS version 4 and below (effectively 2.0 to 4.x), not newer versions. But what could it possibly be good for?
Minor 387 Documentation Mystery
So here I am, writing a bit of test code to figure out the behavior of x87 FPUs with regard to saving and loading the FPU state (FSTENV/FLDENV and FSAVE/FRSTOR instructions in different modes and formats).
The original real-mode only 8087 state format included the instruction and operand pointers as 20-bit linear addresses (because $REASONS) and also stored 11 bits of the floating-point (FP) opcode; the remaining five were always the ESC instruction.
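As a concrete illustration of that packing, here is a small sketch that unpacks a 7-word real-mode environment image. The field placement (upper four address bits in bits 15..12, opcode in bits 10..0 of the fifth word) follows the 16-bit real-mode layout as I understand it from Intel's documentation and is not spelled out above; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields recovered from a 7-word real-mode FSTENV image. */
struct fpu_ptrs {
    uint32_t ip;      /* 20-bit linear instruction pointer */
    uint32_t dp;      /* 20-bit linear operand pointer */
    uint16_t opcode;  /* full two-byte opcode, first byte in the high half */
};

/* env[0..2] are the control, status, and tag words. env[3..6] hold the
 * pointers: the low 16 bits in env[3] and env[5], the upper four bits in
 * bits 15..12 of env[4] and env[6], and the 11 opcode bits in bits 10..0
 * of env[4]. */
struct fpu_ptrs decode_real_mode_env(const uint16_t env[7])
{
    struct fpu_ptrs p;

    p.ip = env[3] | ((uint32_t)(env[4] & 0xF000u) << 4);
    p.dp = env[5] | ((uint32_t)(env[6] & 0xF000u) << 4);
    /* The five bits that are not stored are always 11011b (ESC). */
    p.opcode = (uint16_t)(0xD800u | (env[4] & 0x07FFu));
    return p;
}
```

For example, an FLD1 instruction (bytes D9h E8h) would be stored as the 11-bit value 1E8h, and OR-ing in D800h recovers D9E8h.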
The 287 also needed to be able to save the state in protected mode, with full segment and offset addresses. For whatever reason, Intel decided to keep the size of the saved FP environment (7 words); because the saved code and data pointer addresses used 32 bits each instead of 20, there was no longer room for the floating-point opcode. That wasn’t a huge deal because the opcode could be fished out of memory.
When the 387 came out, it naturally needed to support 32-bit state, with 16-bit segments and 32-bit offsets. Instead of 7 words, the 32-bit state was extended to 7 dwords, with extra padding in reserved fields.
I was pretty sure the floating-point opcode is there somewhere in the 32-bit protected-mode state, but needed a reminder as to where exactly. I happened to have Agarwal’s 80×86 Architecture & Programming, Volume II (1991) open on a nearby page, so I looked there first. On page 240, there’s a diagram of the 32-bit protected-mode FPU state format. But no FP opcode. Odd.
The next closest book was Hummel’s PC Magazine Programmer’s Technical Reference: The Processor and Coprocessor (1992). On page 696, documenting the FSTENV/FNSTENV instruction, there’s the saved state diagram. But again, no floating-point opcode! Is my memory that bad? (Rhetorical question, don’t answer!)
The Other Three
A previous blog post explored the semi-mysterious yet sometimes highly useful DOS APPEND command. Now it’s time to look at its relatives: ASSIGN, JOIN, and SUBST.
ASSIGN
ASSIGN is the oldest of the bunch. It was written by IBM and first appeared in PC DOS 2.0 in March 1983 (it wasn’t part of MS-DOS 2.x). It is very simple and rather limited.
ASSIGN re-routes requests to an existing drive to another drive. If the user runs
ASSIGN A=C
then requests to drive A: end up addressing drive C: instead.
Note “existing” — the drive letter that is being reassigned must exist. On a machine that has drives A:, B:, C:, and D:, an attempt to run
ASSIGN F=D
will fail with “Invalid parameter”.
Like APPEND, the ASSIGN command is a TSR, and it is one of the earliest DOS TSRs, together with the PRINT command.
ASSIGN works by intercepting the INT 25h and 26h vectors (direct disk I/O) and re-routing all accesses according to its internal drive map.
There are no provisions to unload ASSIGN, but running ASSIGN without any arguments will clear its drive mapping table and undo any effects of previous ASSIGN commands.
In later DOS versions,
ASSIGN /STATUS
will show the current drive mappings, if any.
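The drive map itself can be pictured as a small table indexed by drive number. The sketch below is only a model of the behavior described above, not ASSIGN's actual data structure; all names are hypothetical, and the last_drive parameter stands in for however ASSIGN determines which drives exist.

```c
#include <ctype.h>

#define NUM_DRIVES 26

/* drive_map[i] is the drive that requests to drive i are sent to (0 = A:). */
static unsigned char drive_map[NUM_DRIVES];

/* Convert a drive letter to an index, or -1 if it is not a letter. */
static int drive_index(char letter)
{
    int c = toupper((unsigned char)letter);
    return (c >= 'A' && c <= 'Z') ? c - 'A' : -1;
}

/* Plain "ASSIGN" with no arguments: every drive maps to itself again. */
void assign_reset(void)
{
    int i;

    for (i = 0; i < NUM_DRIVES; i++)
        drive_map[i] = (unsigned char)i;
}

/* "ASSIGN from=to". last_drive is the index of the highest existing drive.
 * Returns 0 on success, -1 where ASSIGN would say "Invalid parameter". */
int assign_set(char from, char to, int last_drive)
{
    int f = drive_index(from);
    int t = drive_index(to);

    if (f < 0 || t < 0 || f > last_drive)
        return -1;      /* the reassigned drive must exist */
    drive_map[f] = (unsigned char)t;
    return 0;
}

/* Look up where a request to the given drive index ends up. */
int assign_route(int drive)
{
    return drive_map[drive];
}
```

On a machine with drives A: through D: (last_drive of 3), assign_set('A', 'C', 3) succeeds and assign_set('F', 'D', 3) is rejected, matching the two examples above.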
I Thought I Found a Bug…
So I was working on improving a DOS emulator, when I found that something seemingly trivial wasn’t working right when COMMAND.COM was asked to do the following:
echo AB> foo.txt
echo CD>> foo.txt
Instead of ABCD, foo.txt contained ABBC.
I verified that yes, the right data was being passed to fwrite(), with the big caveat that what COMMAND.COM was doing wasn’t quite as straightforward as one might think:
- Open foo.txt
- Write ‘AB’
- Close foo.txt
- Open foo.txt
- Seek one byte backward from the end of the file
- Read one byte
- Write ‘CD’
- Close foo.txt
The reason for the complexity is that COMMAND.COM tries to deal with the case where the file ends with a Ctrl-Z character (which wasn’t the case for me), and if so, the Ctrl-Z needs to be deleted. Somehow the seek/read/write sequence was confusing things. But why?
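For reference, here is how the second half of that sequence might look when written directly against C stdio. This is a sketch, not the emulator's code, and the excerpt does not say what the actual cause was. It does show one rule that such a sequence can trip over: on a stream opened for update, the C standard does not allow output to directly follow input without an intervening positioning call (unless the read hit end of file). The function name append_text is hypothetical.

```c
#include <stdio.h>

/* Append len bytes to the file at path, overwriting a trailing Ctrl-Z
 * (1Ah) if there is one. Returns 0 on success, -1 on error. */
int append_text(const char *path, const char *data, size_t len)
{
    FILE *fp = fopen(path, "r+b");
    int rc = -1;

    if (fp == NULL)
        return -1;

    if (fseek(fp, -1L, SEEK_END) == 0) {
        int last = fgetc(fp);   /* read the final byte */

        /* Reposition before writing: step back over a Ctrl-Z, or do a
         * zero-distance seek just to separate the read from the write. */
        if (fseek(fp, last == 0x1A ? -1L : 0L, SEEK_CUR) != 0)
            goto done;
    } else if (fseek(fp, 0L, SEEK_END) != 0) {
        goto done;              /* empty file: nothing to inspect */
    }

    if (fwrite(data, 1, len, fp) == len)
        rc = 0;
done:
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

The zero-distance fseek() looks redundant, but leaving it out makes the behavior of the following fwrite() undefined as far as the C standard is concerned.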
DOS APPEND
For a long time, I couldn’t quite grasp what the DOS APPEND command could possibly be good for. Until I came across a situation which APPEND was made for.
When I worked on organizing and building the DOS 2.11 source code, I tried to place the source files in a tree structure similar to that used by DOS 3.x (this is known from DOS 3.x OAKs):
C:.
└───src
    ├───bios
    ├───cmd
    │   ├───chkdsk
    │   ├───command
    │   ├───debug
    │   ├───diskcopy
    │   ├───edlin
    │   ├───exe2bin
    │   ├───fc
    │   ├───find
    │   ├───format
    │   ├───more
    │   ├───recover
    │   ├───sort
    │   └───sys
    ├───dos
    ├───inc
    └───msdos
The inc subdirectory unsurprisingly contains shared include files such as DOSSYM.ASM, which are included from just about everywhere. No problem, right?
Except… to get output that most closely matches existing DOS 2.x binaries, it is necessary to use an old version of MASM (version 1.25 seems to do the trick). But MASM 1.25 is designed to run on top of DOS 1.x, and knows nothing whatsoever about directories.
It is possible that back in the day, DOS 2.x was built from a single huge directory on a hard disk. In fact it is known that DOS 2.0 could not be built on PCs at all, and was built on DEC mainframes. Yet DOS 2.11 was also clearly modified such that it could be built on PCs using Microsoft’s development tools.
However it was done back in 1983, lumping 150+ assembler source files into a single directory, and then adding hundreds of object and executable files, did not sound at all appealing. Cloning DOSSYM.ASM to every directory where it was needed seemed even worse.
That’s when I somehow remembered that APPEND exists, and realized that it’s the perfect solution to the problem. Before building, one can run
APPEND ..\..\INC;..\INC
and the inc directory becomes accessible from all of its sibling subdirectories and from subdirectories one level deeper. It would have been possible to use an absolute path as well, but this way the build batch file does not need to know where it lives.
With APPEND in place, the old MASM 1.25 which uses FCB I/O will find the centrally located include files, and the source code can be organized into a neat hierarchical structure that’s far easier to work with than one giant blob.
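To make the effect of the relative search list concrete, the sketch below enumerates the paths that would be tried for a given file name: first the name as given (the current directory), then each directory from the semicolon-separated list in order. It is a model of the lookup order only, not DOS's implementation, and append_candidate is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Build the index-th candidate path for name: index 0 is the name itself,
 * index 1 uses the first directory in list, and so on. Returns 1 and fills
 * out on success, 0 if there is no such candidate or out is too small. */
int append_candidate(const char *name, const char *list, int index,
                     char *out, size_t outsz)
{
    const char *p = list;
    int written;

    if (index == 0) {
        written = snprintf(out, outsz, "%s", name);
        return written >= 0 && (size_t)written < outsz;
    }
    while (*p != '\0') {
        size_t n = strcspn(p, ";");     /* length of this list entry */

        if (n > 0 && --index == 0) {
            written = snprintf(out, outsz, "%.*s\\%s", (int)n, p, name);
            return written >= 0 && (size_t)written < outsz;
        }
        p += n;
        if (*p == ';')
            p++;                        /* skip the separator */
    }
    return 0;
}
```

With the list from the command above, an assembler running in src\cmd\command finds DOSSYM.ASM through the first entry, ..\..\INC, while one running in src\dos finds it through the second, ..\INC.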
Stack Checking on OS/2
A while ago I was involved in debugging a seemingly simple yet mysterious problem:
A piece of code (a fairly simple interface DLL) built with the Open Watcom compiler was failing with a bogus stack overflow error. The mystery was that this failure only happened on OS/2 Warp Connect. It didn’t happen on OS/2 2.0 or Warp Server for e-Business (WSeB) or MCP2. And it also didn’t happen on Warp Connect updated to FixPack 40.
That’s weird, right? And getting to the bottom of the faulty stack check was a bit of a journey…
Programming NetBIOS on OS/2
Recently I spent some time trying to understand a piece of networking code, and it turned out to be far more difficult than it should have been. The code in question is the NetBIOS interface of C-Kermit and was originally written in the early 1990s.
The module uses two very similar but not identical code paths. The comments suggest that one alternative, using ACSNETB.DLL, is the “Traditional NetBios interface” used by IBM’s LAN Adapter and Protocol Support (aka LAPS), OS/2 Extended Services, and Communications Manager/2. The other code path uses NETAPI.DLL and it’s called “Newer Netbeui Interface”, used by the Microsoft LAN Manager Requester, IBM LAN Server Requester, and Novell Netware Requester. The comments are wrong, but more about that below.
The API used by NETAPI.DLL is documented in the Microsoft LAN Manager programming reference. It consists of four functions: NetBiosEnum, NetBiosOpen, NetBiosSubmit, and NetBiosClose. The NetBiosSubmit function takes an NCB (Network Control Block) as input, except the NCB usage is left more or less completely undocumented by Microsoft.
The LAN Manager document points programmers curious about NCBs to the “IBM PC-NET Technical Reference”, “IBM PC LAN Technical Reference”, and “IBM PC-LAN Technical Reference”, all in the same Microsoft manual. (Which is exactly why IBMers considered Microsoft sloppy and undisciplined.)
My initial search for programming the ACSNETB.DLL interface turned up nothing. I could not find anything in the Communications Manager manuals, or in the MPTS documentation, or really anywhere one would normally find OS/2 programming information. Because it’s not there.
OS/2 TCPBEUI Name Resolution
Sometimes I have the following problem to deal with: An OS/2 system uses NetBIOS over TCP/IP (aka TCPBEUI) and should communicate with a SMB server (likewise using TCPBEUI) on a different subnet. This does not work on OS/2 out of the box without a little bit of help.
History and Technology
NETBIOS (originally literally the ROM BIOS on the 1984 IBM PC Network adapter) was designed to work on a LAN, specifically a single LAN segment. There is no need for centralized infrastructure; workstations can come and go. This makes using ad hoc networks very easy and does not require additional dedicated infrastructure and administration.
When NETBIOS (or NetBIOS) moved to Ethernet, there were initially many different ways of implementing it. Eventually the world settled on NetBIOS Frames aka NBF.
But in the 1980s, there was also a parallel effort to move NetBIOS on top of TCP/IP, eventually standardized as RFC 1001 and RFC 1002 (both dated March 1987). This effort was originally driven by non-PC platforms, but soon enough DOS-based (e.g. HP ARPA Services circa 1990, PC/FTP likely earlier) and OS/2-based (MS LAN Manager 2.1 in 1991) implementations of NetBIOS over TCP/IP became available.
As mentioned above, classic NETBEUI (whether using the original IBM PC Network Adapter, Token Ring NETBEUI, the NBF protocol, or some other variant of NetBIOS over Ethernet) always resolves names using broadcasts. When a workstation (i.e. a NetBIOS application running on that workstation) looks for a NetBIOS “name”, it uses a broadcast to find out the network address of the machine which owns that name; this is not unlike ARP.