Microsoft Disk Operating System
Submitted By: Dheeraj Chugh
An operating system is a set of interrelated programs that manage and control computer processing. The Microsoft Disk Operating System, MS-DOS, is a traditional microcomputer operating system that consists of five major components.
The operating-system loader
The MS-DOS BIOS
The MS-DOS kernel
The user interface (shell)
Support programs
A very brief introduction to each of these components is given below:
The Operating-System Loader
The operating-system loader brings the operating system from the startup disk into RAM. The complete loading process is called bootstrapping because each level pulls up the next part of the system. The ROM loader, which is the first program the microcomputer executes when it is turned on, reads the disk bootstrap loader from the first (boot) sector of the startup disk and executes it. The disk bootstrap loader in turn reads the main portions of MS-DOS, the files MSDOS.SYS and IO.SYS, from the disk into memory.
The MS-DOS BIOS
The MS-DOS BIOS, loaded from the file IO.SYS during initialization, is the layer of the operating system that sits between the operating-system kernel and the hardware. An application performs input and output by making requests to the operating-system kernel, which in turn calls the MS-DOS BIOS routines that access the hardware directly.
The User Interface (Shell)
The user interface for an operating system, also called a shell or command processor, is generally a conventional program that allows the user to interact with the operating system itself. The default MS-DOS user interface is a replaceable shell program called COMMAND.COM.
One of the fundamental tasks of a shell is to load a program into memory on request and pass control of the system to the program so that the program can execute. When the program terminates, control returns to the shell, which prompts the user for another command. In addition, the shell usually includes functions for file and directory maintenance and display. In theory, most of these functions could be provided as programs, but making them resident in the shell allows them to be accessed more quickly. The tradeoff is memory space versus speed and flexibility. Early microcomputer-based operating systems provided a minimal number of resident shell commands because of the limited memory space; modern operating systems such as MS-DOS include a wide variety of these functions as internal commands.
The MS-DOS software includes support programs that provide access to the operating-system facilities not supplied as resident shell commands built into COMMAND.COM. Because these programs are stored as executable files on disk, they are essentially the same as application programs, and MS-DOS loads and executes them as it would any other program.
The MS-DOS Kernel
The MS-DOS kernel is the heart of MS-DOS. It is contained in a single proprietary file, MSDOS.SYS, supplied by Microsoft Corporation. The kernel provides its support functions, called system functions, to application programs in a hardware-independent manner and, in turn, is isolated from hardware characteristics by relying on the driver routines in the MS-DOS BIOS to perform physical input and output operations. Programs access system functions using software interrupt (INT) instructions. MS-DOS reserves interrupts 20H through 3FH for this purpose. The MS-DOS interrupts are:
20H Terminate Program
21H MS-DOS Function Calls
22H Terminate Routine Address
23H Control-C Handler Address
24H Critical Error Handler Address
25H Absolute Disk Read
26H Absolute Disk Write
27H Terminate and Stay Resident
28H-2EH Reserved
2FH Multiplex Interrupt
30H-3FH Reserved
The services provided to the application programs by the MS-DOS kernel include
A file system
Device Input and output
The main focus of this report is on the above listed topics, which we discuss in the next section:
The File System
The file system is one of the largest portions of the operating system. The file system is built on the storage medium of a block device (usually a floppy disk or a fixed disk) by mapping a directory structure and files onto the physical unit of storage. A file system on a disk contains, at a minimum, allocation information, a directory, and space for files.
The file allocation information can take various forms, depending on the operating system, but all forms basically track the space used by files and the space available for new data. The directory contains a list of the files stored on the device, their sizes, and information about where the data for each file is located. MS-DOS uses a particular allocation method called the file allocation table (FAT) and a hierarchical directory structure.
Layout of MS-DOS File System
Block devices are accessed on a sector basis. The MS-DOS kernel, through the device driver, sees a block device as a logical fixed-size array of sectors and assumes that the array contains a valid MS-DOS file system. The device driver translates the logical sector requests from MS-DOS into physical locations on the block device.
Boot sector (OEM identification, BIOS parameter block, loader routine) and reserved area
File allocation table (FAT) #1
Possible additional copies of the FAT
Root disk directory
File area
The MS-DOS file system (regions listed in order from the start of the partition)
The Boot Sector
The boot sector is always at the beginning of a partition. It contains the OEM identification, a loader routine, and a BIOS parameter block (BPB) with information about the device, and an optional area of reserved sectors follows it.
Offset  Size      Contents
00H     3 bytes   Jump to the loader routine (E9 XX XX or EB XX 90)
03H     8 bytes   OEM name and version
0BH     2 bytes   Bytes per sector
0DH     1 byte    Sectors per allocation unit
0EH     2 bytes   Reserved sectors, starting at 0
10H     1 byte    Number of FATs
11H     2 bytes   Number of root directory entries
13H     2 bytes   Total sectors in logical volume
15H     1 byte    Media descriptor type
16H     2 bytes   Number of sectors per FAT
18H     2 bytes   Sectors per track
1AH     2 bytes   Number of heads
1CH     2 bytes   Number of hidden sectors
Map of the boot sector of an MS-DOS disk
In the example disk, the BPB information contained in bytes 0BH through 17H indicates that there are 512 bytes per sector, 2 sectors per cluster, 1 reserved sector (for the boot sector), 2 FATs, 112 root directory entries, 1440 sectors on the disk, and 3 sectors per FAT.
Additional information after the BPB indicates that there are 9 sectors per track, 2 read/write heads, and 0 hidden sectors.
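The boot sector fields and the sample values above can be turned into a small parser. The following Python sketch is illustrative only: the function names are made up for this example, and the jump bytes, the OEM string, and the media descriptor value used to build the sample sector are assumptions, since the report does not give them.

```python
import struct

# Little-endian layout of the first 30 bytes of the boot sector:
# jump (3 bytes), OEM name (8 bytes), then the BPB and geometry fields.
BOOT_FORMAT = "<3s8sHBHBHHBHHHH"
FIELD_NAMES = (
    "jump", "oem_name", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "fat_count", "root_entries", "total_sectors",
    "media_descriptor", "sectors_per_fat", "sectors_per_track",
    "heads", "hidden_sectors",
)


def parse_boot_sector(sector: bytes) -> dict:
    """Decode the fixed fields at the start of a boot sector."""
    values = struct.unpack_from(BOOT_FORMAT, sector, 0)
    return dict(zip(FIELD_NAMES, values))


def build_sample_sector() -> bytes:
    """Build a 512-byte sector holding the sample values quoted in the text.

    The jump bytes, OEM string, and media descriptor are hypothetical.
    """
    header = struct.pack(
        BOOT_FORMAT,
        b"\xEB\x3C\x90",   # hypothetical jump in the EB XX 90 form
        b"MSDOS3.3",       # hypothetical 8-byte OEM name
        512, 2, 1, 2, 112, 1440,
        0xF9,              # hypothetical media descriptor
        3, 9, 2, 0,
    )
    return header.ljust(512, b"\x00")


if __name__ == "__main__":
    for name, value in parse_boot_sector(build_sample_sector()).items():
        print(f"{name:20} {value}")
```

Reading the fields with a single format string keeps the offsets implicit in the field sizes, which mirrors how the boot sector map itself is laid out.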
The File Allocation Table
The file allocation table provides a map to the storage locations of files on a disk by indicating which clusters are allocated to each file and in what order. To enable MS-DOS to locate a file, the file's directory entry contains its beginning FAT entry number. This FAT entry, in turn, contains the entry number of the next cluster if the file is larger than one cluster, or a last-cluster number if there is only one cluster associated with the file.
Additional copies of the FAT are used to provide backup in case of damage to the first. The FATs are arranged sequentially after the boot sector, with some possible intervening reserved area.
MS-DOS supports two types of FAT: one uses 12-bit links; the other, introduced with version 3.0 to accommodate large fixed disks with more than 4087 clusters, uses 16-bit links. The first two entries of the FAT are always reserved and are filled with a copy of the media descriptor byte. The remaining FAT entries have a one-to-one relationship with the clusters in the file data area. Each cluster's use status is indicated by its corresponding FAT value. If the FAT entry is nonzero, the corresponding cluster has been allocated. A free cluster is found by scanning the FAT from the beginning to find the first zero value.
FAT entry:  0     1     2     3     4     5     6
Value:      FFDH  FFFH  003H  005H  FF7H  000H  000H  (continues)
Decimal:    4093  4095  3     5     4087  0     0
Entry 0 (FFDH): disk is double sided, double density
Entry 4 (FF7H): unused; not available
Entries 5 and 6 (000H): unused; available cluster
Space allocation in the FAT for a typical MS-DOS disk
Free FAT entries contain a link value of zero; a link value of 1 is never used. Thus, the first allocatable link number, associated with the first available cluster in the file data area, is 2, which is the number assigned to the first physical cluster in the file data area.
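The chain-following and free-cluster scan described above can be sketched in Python. This is a minimal illustration under stated assumptions: the packing of two 12-bit entries into three bytes is the usual FAT12 convention rather than something this report spells out, the function names are invented for the example, and the sample table is hypothetical (it is adapted from the figure, with entry 5 changed to an end-of-chain mark so that the chain terminates).

```python
def pack_fat12(entries):
    """Pack 12-bit values into FAT12 bytes (helper used only for the demo)."""
    data = bytearray()
    for i in range(0, len(entries), 2):
        a = entries[i]
        b = entries[i + 1] if i + 1 < len(entries) else 0
        # Two 12-bit entries share three bytes.
        data += bytes([a & 0xFF, (a >> 8) | ((b & 0x0F) << 4), b >> 4])
    return bytes(data)


def fat12_entry(fat, n):
    """Return the 12-bit link stored in FAT entry n."""
    offset = n * 3 // 2
    word = fat[offset] | (fat[offset + 1] << 8)
    return word >> 4 if n % 2 else word & 0x0FFF


def cluster_chain(fat, start):
    """Follow the links from a file's starting cluster to its last cluster."""
    chain = []
    cluster = start
    # Stop at reserved, bad, or end-of-chain values, or on a looping chain.
    while 2 <= cluster < 0xFF0 and cluster not in chain:
        chain.append(cluster)
        cluster = fat12_entry(fat, cluster)
    return chain


def first_free_cluster(fat, entry_count):
    """Scan from entry 2 upward for the first zero (free) entry."""
    for n in range(2, entry_count):
        if fat12_entry(fat, n) == 0:
            return n
    return None


# Hypothetical table: a file occupies clusters 2, 3, and 5; cluster 4 is bad.
SAMPLE_ENTRIES = [0xFFD, 0xFFF, 0x003, 0x005, 0xFF7, 0xFFF, 0x000]
SAMPLE_FAT = pack_fat12(SAMPLE_ENTRIES)

if __name__ == "__main__":
    print(cluster_chain(SAMPLE_FAT, 2))
    print(first_free_cluster(SAMPLE_FAT, len(SAMPLE_ENTRIES)))
```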
The Root Directory
Directory entries, each 32 bytes long, are found in both the root directory and the subdirectories. Each entry includes a filename and an extension, the file's size, the starting FAT entry, the time and date the file was created or last revised, and the file's attributes. The root directory can optionally have a special type of entry called the volume label, identified by an attribute type of 08H, which is used to identify disks by name. A root directory can contain only one volume label. The root directory can also contain entries that point to subdirectories; an attribute type of 10H and a file size of zero identify such entries.
Two other special types of directory entries are found only within subdirectories. These entries have the filenames . and .. and correspond to the current directory and the parent directory of the current directory. These special entries, sometimes called directory aliases, can be used to move quickly through the directory structure.
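A 32-byte directory entry with the fields listed above can be decoded as in the following Python sketch. The exact byte layout used here (8-byte name, 3-byte extension, attribute byte, 10 reserved bytes, packed time and date words, starting cluster, 4-byte size) and the bit packing of the time and date are the commonly documented layout and are an assumption as far as this report is concerned; the function names and the sample file are hypothetical.

```python
import struct

DIR_ENTRY_FORMAT = "<8s3sB10sHHHI"   # 32 bytes in total
ATTR_VOLUME_LABEL = 0x08             # attribute value named in the text
ATTR_SUBDIRECTORY = 0x10             # attribute value named in the text


def parse_dir_entry(raw: bytes) -> dict:
    """Decode one 32-byte directory entry."""
    name, ext, attr, _reserved, time, date, start, size = struct.unpack(
        DIR_ENTRY_FORMAT, raw[:32]
    )
    return {
        "name": name.decode("ascii").rstrip(),
        "extension": ext.decode("ascii").rstrip(),
        "is_volume_label": bool(attr & ATTR_VOLUME_LABEL),
        "is_subdirectory": bool(attr & ATTR_SUBDIRECTORY),
        # Time word: hours, minutes, and seconds stored in 2-second units.
        "time": (time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2),
        # Date word: years since 1980, month, day.
        "date": ((date >> 9) + 1980, (date >> 5) & 0x0F, date & 0x1F),
        "start_cluster": start,
        "size": size,
    }


def make_dir_entry(name, ext, attr, hms, ymd, start, size) -> bytes:
    """Build an entry for the demo; the inverse of parse_dir_entry."""
    hours, minutes, seconds = hms
    year, month, day = ymd
    time = (hours << 11) | (minutes << 5) | (seconds // 2)
    date = ((year - 1980) << 9) | (month << 5) | day
    return struct.pack(
        DIR_ENTRY_FORMAT,
        name.ljust(8).encode("ascii"), ext.ljust(3).encode("ascii"),
        attr, b"\x00" * 10, time, date, start, size,
    )


if __name__ == "__main__":
    # Hypothetical file and subdirectory entries.
    entry = make_dir_entry("REPORT", "TXT", 0x20, (12, 30, 10), (1987, 6, 15), 2, 1500)
    print(parse_dir_entry(entry))
    subdir = make_dir_entry("DOCS", "", ATTR_SUBDIRECTORY, (9, 0, 0), (1987, 6, 15), 7, 0)
    print(parse_dir_entry(subdir)["is_subdirectory"])
```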
The File Area
The file area contains subdirectories, file data, and unallocated clusters. The area is divided into fixed-size clusters, and the use of a particular cluster is specified by the corresponding FAT entry.
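Because the regions of the file system follow one another in a fixed order, the first sector of the file area, and of any cluster in it, can be computed from the BPB values. The Python sketch below works through the arithmetic for the sample disk quoted earlier (1 reserved sector, 2 FATs of 3 sectors, 112 root directory entries of 32 bytes, 512-byte sectors, 1440 sectors, 2 sectors per cluster). The function and key names are illustrative, and the sketch assumes that cluster 2 starts at the first sector of the file area.

```python
def volume_layout(reserved, fat_count, sectors_per_fat, root_entries,
                  bytes_per_sector, total_sectors, sectors_per_cluster):
    """Return the starting logical sector of each region of the volume."""
    fat_start = reserved
    root_start = fat_start + fat_count * sectors_per_fat
    # Each directory entry is 32 bytes; round up to whole sectors.
    root_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    data_start = root_start + root_sectors
    cluster_count = (total_sectors - data_start) // sectors_per_cluster
    return {
        "fat_start": fat_start,
        "root_start": root_start,
        "root_sectors": root_sectors,
        "data_start": data_start,
        "cluster_count": cluster_count,
        "sectors_per_cluster": sectors_per_cluster,
        # 16-bit links are needed only above 4087 clusters.
        "fat_bits": 16 if cluster_count > 4087 else 12,
    }


def cluster_to_sector(layout, cluster):
    """First logical sector of a cluster; cluster numbering starts at 2."""
    return layout["data_start"] + (cluster - 2) * layout["sectors_per_cluster"]


SAMPLE_LAYOUT = volume_layout(
    reserved=1, fat_count=2, sectors_per_fat=3, root_entries=112,
    bytes_per_sector=512, total_sectors=1440, sectors_per_cluster=2,
)

if __name__ == "__main__":
    print(SAMPLE_LAYOUT)
    print(cluster_to_sector(SAMPLE_LAYOUT, 2), cluster_to_sector(SAMPLE_LAYOUT, 3))
```

For the sample disk this places the first FAT at logical sector 1, the root directory at sector 7, and the file area at sector 14.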
Memory Management
Because the amount of memory a program needs varies from program to program, a traditional operating system ordinarily provides memory-management functions. Memory requirements can also vary during program execution, and memory management is especially necessary when two or more programs are present in main memory at the same time.
MS-DOS memory management is based on a pool of variable-sized memory blocks. The two basic memory-management actions are to allocate a block from the pool and to return an allocated block to the pool. MS-DOS allocates program space from the pool when the program is loaded; programs themselves can allocate additional memory from the pool. Many programs perform their own memory management using a local memory pool, or heap: an additional memory block allocated from the operating system that the application program itself divides into blocks for use by its various routines.
Personal computers that are MS-DOS compatible can be outfitted with three kinds of RAM: conventional memory, expanded memory, and extended memory.
Conventional memory is the term used for the up to 1 MB of memory that is directly addressable by an Intel 8086/8088 microprocessor running in real mode. Physical addresses for references to conventional memory are generated by a 16-bit segment register, which acts as a base register and holds a paragraph address, combined with a 16-bit offset contained in an index register or in the instruction being executed (the segment value is multiplied by 16 and added to the offset). On IBM PCs and compatibles, MS-DOS and the programs that run under its control occupy the bottom 640 KB or less of the conventional memory space. The bottom 640 KB of memory administered by MS-DOS is divided into three zones:
The interrupt vector table
The operating system area
The transient program area
The interrupt vector table occupies the lowest 1024 bytes of memory (locations 00000H-003FFH); its address and length are hardwired into the processor and cannot be changed. Each doubleword position in the table is called an interrupt vector and contains the segment and offset of an interrupt handler routine for the associated hardware or software interrupt number.
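The segment-and-offset addressing used for conventional memory and the layout of the interrupt vector table can be illustrated with a few lines of Python. This is a sketch under assumptions: the function names are invented, the handler address stored for interrupt 21H is a made-up value, and the vector is assumed to be stored with the offset word first and the segment word second.

```python
import struct


def physical_address(segment: int, offset: int) -> int:
    """Combine a paragraph (segment) address and an offset into a 20-bit address."""
    return ((segment << 4) + offset) & 0xFFFFF


def vector_location(interrupt_number: int) -> int:
    """Each vector is a doubleword, so vector n sits at byte address n * 4."""
    return interrupt_number * 4


def read_vector(memory, interrupt_number: int):
    """Return (segment, offset) of the handler for an interrupt number."""
    offset, segment = struct.unpack_from("<HH", memory, vector_location(interrupt_number))
    return segment, offset


def demo_memory() -> bytearray:
    """A 1024-byte stand-in for the vector table with one hypothetical vector."""
    memory = bytearray(1024)
    # Hypothetical handler for interrupt 21H at 0273:1460.
    struct.pack_into("<HH", memory, vector_location(0x21), 0x1460, 0x0273)
    return memory


if __name__ == "__main__":
    segment, offset = read_vector(demo_memory(), 0x21)
    print(f"INT 21H vector at {vector_location(0x21):05X}H "
          f"-> {segment:04X}:{offset:04X} "
          f"= physical {physical_address(segment, offset):05X}H")
```

The table size follows directly from this layout: 256 interrupt numbers times 4 bytes per vector gives the 1024 bytes mentioned above.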
The operating system area begins immediately above the interrupt vector table and holds the operating system proper, its tables and buffers, any additional installable device drivers, and the resident portion of the command interpreter. The amount of memory occupied by the operating system area depends on the version of MS-DOS used, the number of disk buffers, and the number of installable drivers.
The transient program area (TPA) is the remainder of the RAM above the operating system area, extending to the 640 KB limit or the size of the RAM. The transient program area is organized into a structure called the memory arena, which is divided into portions called arena entries (or memory blocks). Each arena entry is preceded by a control structure called an arena entry header, which contains information indicating the size and status of the arena entry.
MS-DOS inspects the arena entry headers whenever a function requesting a memory block allocation, modification, or release is issued; when a program is loaded and executed with the EXEC function (interrupt 21H function 4BH); or when a program is terminated. If any of the arena entry headers appear to be damaged, MS-DOS returns an error to the calling process. If that process is COMMAND.COM, then it displays Memory allocation Error and halts the system.
MS-DOS Support for conventional memory management
The MS-DOS kernel supports three memory management functions, invoked with interrupt 21H, which operate on the TPA:
Function 48H (Allocate Memory block)
Function 49H (Free Memory block)
Function 4AH (Resize Memory block)
These three functions can be called by application programs, by the command processor, and by MS-DOS itself to dynamically allocate, free, and resize arena entries. When the MS-DOS kernel receives a memory allocation request, it inspects the chain of arena entry headers to find a free arena entry that can satisfy the request. The memory manager can use any of three allocation strategies:
First fit – the arena entry at the lowest address that is large enough to satisfy the request.
Best fit – the smallest available arena entry that satisfies the request, regardless of its position.
Last fit – the arena entry at the highest address that is large enough to satisfy the request.
If the arena entry selected is larger than the size needed to fulfill the request, the arena entry is divided and the program is given an entry of exactly the size it requires. A new arena entry header is then created for the remaining portion of the original arena entry; it is marked "unowned" and can be used to satisfy subsequent allocation calls.
MS-DOS uses the first-fit approach by default. However, under MS-DOS version 3.0 and later, a different strategy can be selected with interrupt 21H Function 58H (Get/Set Allocation Strategy).
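The three allocation strategies and the splitting of an oversized arena entry can be modelled in a few lines of Python. This is a simplified sketch, not the MS-DOS implementation: the class and function names are invented, sizes are counted in paragraphs, the space taken by the arena entry headers is ignored, and the allocated portion is always carved from the low end of the chosen entry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArenaEntry:
    start: int                    # starting paragraph address
    size: int                     # size in paragraphs
    owner: Optional[str] = None   # None means the entry is unowned (free)


def find_entry(arena: List[ArenaEntry], size: int, strategy: str = "first") -> Optional[int]:
    """Return the index of the entry chosen by the strategy, or None."""
    candidates = [i for i, e in enumerate(arena) if e.owner is None and e.size >= size]
    if not candidates:
        return None
    if strategy == "first":       # lowest address that is large enough
        return candidates[0]
    if strategy == "last":        # highest address that is large enough
        return candidates[-1]
    if strategy == "best":        # smallest entry that is large enough
        return min(candidates, key=lambda i: arena[i].size)
    raise ValueError(f"unknown strategy: {strategy}")


def allocate(arena: List[ArenaEntry], size: int, owner: str,
             strategy: str = "first") -> Optional[int]:
    """Allocate size paragraphs; split the chosen entry if it is too large."""
    index = find_entry(arena, size, strategy)
    if index is None:
        return None
    entry = arena[index]
    if entry.size > size:
        # The remainder becomes a new unowned entry right after this one.
        arena.insert(index + 1, ArenaEntry(entry.start + size, entry.size - size))
        entry.size = size
    entry.owner = owner
    return entry.start


def sample_arena() -> List[ArenaEntry]:
    """Hypothetical arena, ordered by address, with three free entries."""
    return [
        ArenaEntry(0x1000, 100),
        ArenaEntry(0x1064, 50, "PROGRAM_A"),
        ArenaEntry(0x1096, 30),
        ArenaEntry(0x10B4, 200),
    ]


if __name__ == "__main__":
    for strategy in ("first", "best", "last"):
        arena = sample_arena()
        start = allocate(arena, 20, "PROGRAM_B", strategy)
        print(f"{strategy:5} fit -> block at paragraph {start:04X}H")
```

A 20-paragraph request against the sample arena shows the difference clearly: each strategy picks a different one of the three free entries.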
Expanded Memory
The original Expanded Memory Specification (EMS) was designed to provide a uniform means for applications running on 8086/8088-based personal computers, or on 80286/80386-based computers in real mode, to circumvent the 1 MB limit on conventional memory, thus providing such programs with much larger amounts of fast random access memory. The EMS is a functional definition of a bank-switched memory subsystem; it consists of user-installable boards that plug into the IBM PC's expansion bus and a resident driver program called the Expanded Memory Manager (EMM). As much as 8 MB of expanded memory can be installed in a single machine. Expanded memory is made available to application software in 16 KB pages, which are mapped by the EMM into a 64 KB area called the page frame, located somewhere above the conventional memory area used by MS-DOS (0-640 KB). An application program can thus access four 16 KB expanded memory pages simultaneously.
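The bank-switching idea can be made concrete with a toy model in Python. This sketch only illustrates the arithmetic in the paragraph above (16 KB pages, a 64 KB page frame, up to 8 MB of expanded memory); the class and method names are invented and do not correspond to actual EMM calls.

```python
PAGE_SIZE = 16 * 1024                         # one expanded memory page
FRAME_SIZE = 64 * 1024                        # the page frame window
FRAME_SLOTS = FRAME_SIZE // PAGE_SIZE         # pages visible at once
MAX_PAGES = (8 * 1024 * 1024) // PAGE_SIZE    # pages in 8 MB


class PageFrame:
    """Toy model of a page frame with one logical page per slot."""

    def __init__(self):
        self.slots = [None] * FRAME_SLOTS

    def map_page(self, slot: int, logical_page: int) -> None:
        """Make a logical page visible through one slot of the page frame."""
        if not 0 <= slot < FRAME_SLOTS:
            raise ValueError("slot outside the page frame")
        if not 0 <= logical_page < MAX_PAGES:
            raise ValueError("logical page out of range")
        self.slots[slot] = logical_page

    def resolve(self, frame_offset: int):
        """Translate an offset within the frame to (logical page, page offset)."""
        slot, page_offset = divmod(frame_offset, PAGE_SIZE)
        page = self.slots[slot]
        if page is None:
            raise LookupError("no page mapped into this slot")
        return page, page_offset


if __name__ == "__main__":
    frame = PageFrame()
    frame.map_page(1, 300)
    print(FRAME_SLOTS, MAX_PAGES, frame.resolve(0x4000 + 5))
```

Remapping a slot changes which logical page the same frame addresses refer to, which is how a program reaches more memory than the frame itself can hold.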
The Expanded Memory Manager
The Expanded Memory Manager provides a hardware-independent interface between application programs and the expanded memory board(s). The EMM is supplied by the board manufacturer in the form of an installable character-device driver and is linked into MS-DOS by a DEVICE directive in the CONFIG.SYS file.
Internally, the EMM is divided into two distinct components that can be referred to as the driver and the manager. The driver portion mimics some of the actions of a genuine installable device driver, in that it includes initialization and output status subfunctions and a valid device driver header.
The second, and major, element of the EMM is the true interface between application software and the expanded memory hardware. It provides several classes of services:
Status of the expanded memory subsystem
Allocation of expanded memory pages
Mapping of logical pages into physical memory
Deallocation of expanded memory pages
Support for multitasking operating systems
Extended Memory
Extended memory is the storage at addresses above 1 MB that can be accessed by an 80286 or 80386 microprocessor in protected mode. Unlike expanded memory, extended memory is linearly addressable. The address of each memory cell is fixed, so no special manager program is needed.
Protected-mode operating systems, such as XENIX and MS OS/2, can use extended memory for execution of programs. MS-DOS, on the other hand, runs in real mode on an 80286 or 80386, and programs running under its control cannot ordinarily execute from extended memory or even address that memory for storage of data.
To provide some access to extended memory for real-mode programs, IBM PC/AT-compatible machines contain two routines in their ROM BIOS: one allows the amount of extended memory present to be determined (interrupt 15H Function 88H), and the other transfers blocks of data between conventional memory and extended memory (interrupt 15H Function 87H). These routines can be used by electronic disks (RAM disks) and by other programs that wish to use extended memory for fast storage and retrieval of information that would otherwise have to be written to a slower physical disk.
Device Input and Output
MS-DOS recognizes two types of devices: block devices, which are usually floppy disk or fixed disk drives, and character devices, such as the keyboard, display, printer, and communication ports.
MS-DOS identifies each block device by a drive letter assigned when the device's controlling software, the device driver, is loaded. A character device, on the other hand, is identified by a logical name (similar to a filename and subject to many of the same restrictions) built into the device driver.
One important distinction between block and character devices is that MS-DOS always adds new block-device drivers to the tail of the driver chain but adds new character-device drivers to the head of the chain. Thus, because MS-DOS searches the chain sequentially and uses the first device driver it finds that satisfies its search conditions, any existing character-device driver can be superseded simply by installing another driver with an identical logical name.
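The effect of adding character-device drivers at the head of the chain can be shown with a small Python model. This is a sketch under assumptions: the class, its methods, and the driver labels are invented for illustration and say nothing about how MS-DOS actually stores driver headers.

```python
class DriverChain:
    """Toy model of the driver chain; index 0 is the head of the chain."""

    def __init__(self):
        self.drivers = []

    def install(self, kind: str, name: str, label: str) -> None:
        driver = {"kind": kind, "name": name.upper(), "label": label}
        if kind == "char":
            self.drivers.insert(0, driver)   # character drivers go to the head
        elif kind == "block":
            self.drivers.append(driver)      # block drivers go to the tail
        else:
            raise ValueError("kind must be 'char' or 'block'")

    def find_char(self, name: str):
        """Sequential search: the first driver with a matching name wins."""
        for driver in self.drivers:
            if driver["kind"] == "char" and driver["name"] == name.upper():
                return driver
        return None


def demo_chain() -> DriverChain:
    """Hypothetical chain: built-in drivers, then a replacement CON driver."""
    chain = DriverChain()
    chain.install("char", "CON", "built-in console driver")
    chain.install("block", "A:", "built-in disk driver")
    chain.install("char", "CON", "replacement console driver")
    chain.install("block", "D:", "added disk driver")
    return chain


if __name__ == "__main__":
    chain = demo_chain()
    print(chain.find_char("con")["label"])
    print([d["label"] for d in chain.drivers])
```

The original console driver is still present in the chain; it is simply never reached, because the search stops at the newer driver with the same logical name.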
Application programs can use either of two basic techniques to access character devices in a portable manner under MS-DOS. A program can use the handle-type function calls, or it can use the so-called "traditional" character-device functions. A handle is a 16-bit number returned by the operating system whenever a file or device is opened or created by passing a name to MS-DOS interrupt 21H Function 3CH, 3DH, 5AH, or 5BH. The second method for accessing character devices is through the MS-DOS character input and output functions, interrupt 21H Functions 01H through 0CH. These functions are designed to communicate directly with the keyboard, display, printer, and serial port.
Every MS-DOS system supports at least the following set of logical character devices without the need for additional installable drivers.
CON Keyboard and display
PRN System list device, usually a parallel port
AUX Auxiliary device, usually a serial port
CLOCK$ System real time clock
NUL “Bit-bucket” device
These devices can be opened by name or they can be addressed through the “traditional” function calls; strings can be read from or written to the devices according to their capabilities on any MS-DOS system.
The operating system provides peripheral support to the programs through a set of operating system calls that are translated by the operating system into calls to the appropriate device drivers.
Peripheral support can be a direct logical-to-physical device translation, or the operating system can interject additional features or translations. Keyboards, displays, and printers usually require only logical-to-physical device translations; that is, the data is transferred between the application program and the physical device with minimal alteration, if any, by the operating system.
Process Control
Process, or task, control includes program loading, task execution, task termination, task scheduling, and intertask communication.
Although MS-DOS is not a multitasking operating system, it can have multiple programs residing in memory at the same time. One program can invoke another, which then becomes the active task. When the invoked task terminates, the invoking program again becomes the foreground task. Because these tasks never execute simultaneously, MS-DOS, even with this stack-like operation, is still considered a single-tasking operating system.
MS-DOS does have a few hooks that allow certain programs to do some multitasking on their own. For example, terminate-and-stay-resident (TSR) programs such as PRINT use these hooks to perform limited concurrent processing by taking control of system resources while MS-DOS is "idle," and the Microsoft Windows operating environment adds support for nonpreemptive task switching.
The traditional intertask communication methods include semaphores, queues, shared memory, and pipes. Of these, MS-DOS formally supports only pipes. (A pipe is a logical, unidirectional, sequential stream of data that is written by one program and read by another.) The data in a pipe resides in memory or in a disk file, depending on the implementation; MS-DOS uses disk files for intermediate storage of data in pipes because it is a single-tasking operating system.
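The disk-file approach to pipes can be imitated in Python: the first program runs to completion and writes its output to a temporary file, and only then does the second program run with that file as its input. This is an analogy rather than MS-DOS code; the function name and the two sample tasks are hypothetical.

```python
import os
import tempfile


def run_pipeline(producer, consumer):
    """Run producer, then consumer, passing data through a temporary file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Step 1: the producer runs alone and writes the whole stream to disk.
        with open(path, "w") as out:
            producer(out)
        # Step 2: the consumer runs afterwards and reads the stream back.
        with open(path) as inp:
            return consumer(inp)
    finally:
        os.remove(path)   # the intermediate file is discarded afterwards


def list_names(out):
    """Hypothetical first task: emit some unsorted lines."""
    out.write("pear\napple\nfig\n")


def sort_lines(inp):
    """Hypothetical second task: sort whatever the first task produced."""
    return sorted(inp.read().split())


if __name__ == "__main__":
    print(run_pipeline(list_names, sort_lines))
```

Because the two tasks never run at the same time, the temporary file is what carries the data from one to the other, which matches the single-tasking constraint described above.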
Summary
MS-DOS is a widely accepted traditional operating system. Each newer version added features that made the system easier to use for both users and programmers. While MS-DOS was evolving, intense efforts were put into the areas of user interfaces and multitasking operating systems. Microsoft Windows, first shipped in 1985, provides a multitasking, graphical user "desktop" for MS-DOS systems. Windows has kept on evolving, adding new features with every version, and recently Windows XP has been released.
MS-DOS supports two distinct but overlapping sets of file and record management services. The handle-oriented functions operate in terms of null-terminated filenames and 16-bit file identifiers, called handles, that MS-DOS returns when a file is created or opened. The other set is built around file control blocks (FCBs) maintained by the application program.
Personal computers that run MS-DOS can support as many as three different types of fast random access memory (RAM). Conventional memory is the term used for the 1 MB of address space accessible by an 8086/8088 microprocessor, or by an 80286 or 80386 microprocessor in real mode. As much as 8 MB of expanded memory can be installed in a PC; it is made available in 16 KB pages and is administered by a driver program called the Expanded Memory Manager. Extended memory refers to the memory addresses above 1 MB that can be accessed by an 80286 or 80386 in protected mode. As much as 15 MB of extended memory can be installed.
MS-DOS recognizes two types of devices: block devices and character devices. Block devices are usually floppy disks and fixed disks, which are accessed on a sector basis. Character devices, such as the keyboard, display, and printer, can be accessed using either of two methods: the handle-type function calls or the MS-DOS character input and output functions.
MS-DOS is not a multitasking operating system; it is a single-tasking operating system.