Today I am going to discuss WINDOWS ARCHITECTURE, and I will try to explain the architecture and working of Windows in simple words. The first important thing to note is that I will be discussing the architecture of Windows NT, as the Windows 9X line has already ended (Windows Me was its last release), while the NT line, which started with Windows NT 3.1, continues through Windows 2000 and later versions.
First I would like to describe the differences between the Windows 9X series and Windows NT, so that we have a clear idea of the OS versions and how they differ:
1) Windows NT is a symmetric multiprocessing (SMP) OS, which means it can use more than one processor at a time.
2) Windows NT is a truly 32-bit OS (Operating System). This was not the case with Windows 9X, which contained a lot of 16-bit code ported from Windows 3.1 (I doubt anyone uses it these days, so I am leaving it aside). This made Windows 9X less secure and less stable, because large portions of operating system memory were accessible from user mode.
3) Windows NT has a built-in file system security mechanism, which was missing from Windows 9X.
4) Memory sharing among processes/applications is allowed under both OSes, but there is a great difference between them. In Windows 9X shared memory is visible to all running processes/applications by default, whereas under Windows NT only those applications/processes that specifically request a shared memory resource can see that memory.
These are some of the most common differences between Windows 9X and NT. So I think we now have a rough idea of the loopholes MS tried to remove and the features MS tried to add with Windows NT. Now, coming back to our main point of discussion.
Before starting the discussion I would like to shed some light on a few common OS terms.
Process and Threads: In very general terms, “a process is a running instance of an application”, together with a set of resources that are allocated to the running application. For example, Notepad.exe (Notepad) is a process, and Winword.exe (MS Word) is a process. A thread is an object within a process that is allocated processor time by the operating system in order to execute code. In short, threads, not processes, execute program code. Every process must have at least one thread. The purpose of threads, of course, is to allow a process to maintain more than one line of execution, that is, to do more than one thing at a time. In a multiprocessor environment (a computer with more than one CPU), Windows NT (but not Windows 9X) can assign different threads to different processors, providing true multiprocessing. In a single-processor environment, the operating system must give each thread that is currently running on the system a time slice of the CPU.
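To make the process/thread distinction a bit more concrete, here is a minimal sketch in standard C++ (not Windows-specific code). The function name run_workers is my own, chosen just for illustration. One process starts several threads, each thread is its own line of execution, and on a multiprocessor machine the OS is free to run them on different CPUs.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: runs `thread_count` worker threads inside this one
// process and returns the total number of increments they performed.
long run_workers(unsigned thread_count, long increments_per_thread) {
    std::atomic<long> counter{0};   // memory shared by all threads of the process
    std::vector<std::thread> workers;
    workers.reserve(thread_count);

    for (unsigned i = 0; i < thread_count; ++i) {
        // Each std::thread is a separate line of execution; the OS scheduler
        // decides which processor (or time slice) it gets.
        workers.emplace_back([&counter, increments_per_thread] {
            for (long n = 0; n < increments_per_thread; ++n) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();              // wait for every thread to finish
    }
    return counter.load();
}
```

Calling run_workers(4, 1000) starts four threads in the same process, and the returned total is the combined work of all of them. How the threads are spread over the CPUs is entirely up to the operating system.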
Now let’s have a look at the Windows Architecture:
Figure 1: Windows Architecture
Confused? I was too, when I saw it for the first time :) No problem, I will make it simple for you. Intel x86 processors (such as the Pentium) support four privilege levels, also called RINGS. Every thread executes at one of these privilege levels. Ring 0 is the most privileged level, with complete access to all memory and CPU instructions, whereas Ring 3 is the least privileged level.
However, in order to maintain compatibility with non-Intel systems, Windows uses only two privilege levels: Ring 0 and Ring 3.
A thread running in Ring 0 is said to be in Kernel Mode, and a thread running in Ring 3 is said to be in User Mode. (As you can see in the figure, there are two broad classifications, namely KERNEL MODE and USER MODE.) Low-level OS code (and all system drivers) executes in kernel mode, and general applications run in user mode.
By this time I hope the meaning of kernel mode and user mode execution is clear. If I say that a thread is running in kernel mode, it means that it is running with the highest privilege on the OS and has full access to memory and the CPU. If I say that a thread is running in user mode, it means that it has limited/restricted access to memory and the CPU: it can't access all of memory or execute all CPU instructions directly.
However, there is one very interesting thing I would like to share with you at this point. Almost nothing useful can be done on Windows (no file access, no drawing on the screen, no process creation) without going to kernel mode at some point. So the strange question here is how applications, which have no control over kernel mode operations, are able to get anything done on Windows. The answer is: via Windows itself.
Let me explain it with some very simple examples:
Here is a line of VC++ code which will display a message box:
MessageBox(_T("DATA1"),_T("CAPTION"),0);
This code will output a message-box:
And the code
system("notepad");
will simply launch Notepad. The doubt here is: if such operations cannot be done without going to kernel mode, then how was my simple application (coded above) able to execute the given commands? The secret lies in the concept of control switching between user mode and kernel mode.
Both of the simple pieces of code I wrote above execute in user mode, but they get switched to kernel mode for a very short time to actually perform the actions, and then come back to user mode. Confused? :) Well, let me explain it in an easy way.
The application executes in user mode until it reaches the instruction
MessageBox(_T("DATA1"),_T("CAPTION"),0);
Now this application (or more precisely the thread) cannot carry out this operation by itself, as it does not have the privilege level needed to do so, so it requests the kernel to perform it on its behalf. At this moment the thread switches from user mode to kernel mode through a controlled entry point (a system call), and its user mode code stays suspended while the kernel mode code runs. The kernel mode code handles the request and carries out the appropriate action (creating a message box, starting a process, etc.) and then returns control to the user mode code, together with the return value of the function it executed on behalf of the thread.
And that is how all user mode threads actually perform their actions. This kind of architecture offers two major benefits to the OS:
1) A very clear separation of control.
2) An easy way to implement security policies. Before executing any request on behalf of a user mode thread, the kernel thoroughly checks whether that thread is allowed to perform the action. If the thread has all the necessary permissions, the kernel mode code executes the command on behalf of the thread; otherwise it returns an error value to the thread based on the failure (for example ERROR_ACCESS_DENIED, ERROR_PRIVILEGE_NOT_HELD, etc.).
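The switch-check-return cycle described above can be sketched as a toy model in plain C++. This is only an illustration of the idea, not how Windows implements it: ToyThread, toy_system_call, and the operation names are hypothetical names of my own. The two error constants carry the same numeric values as the Win32 codes ERROR_SUCCESS (0) and ERROR_ACCESS_DENIED (5).

```cpp
#include <cstdint>
#include <set>
#include <string>

// Same numeric values as the Win32 error codes of the same names.
constexpr std::uint32_t kErrorSuccess = 0;       // ERROR_SUCCESS
constexpr std::uint32_t kErrorAccessDenied = 5;  // ERROR_ACCESS_DENIED

enum class Mode { User, Kernel };  // Ring 3 and Ring 0 respectively

// Hypothetical stand-in for a thread and the rights granted to it.
struct ToyThread {
    Mode mode = Mode::User;
    std::set<std::string> permissions;
};

// Hypothetical "system service": the only gate through which ToyThread can
// get privileged work done. It enters kernel mode, validates the request,
// and always drops back to user mode before returning a status code.
std::uint32_t toy_system_call(ToyThread& thread, const std::string& operation) {
    thread.mode = Mode::Kernel;                      // user -> kernel transition
    std::uint32_t status = kErrorAccessDenied;
    if (thread.permissions.count(operation) != 0) {  // security check
        // ... the privileged work would be carried out here ...
        status = kErrorSuccess;
    }
    thread.mode = Mode::User;                        // kernel -> user transition
    return status;
}
```

A thread that was granted "create_file" gets kErrorSuccess for that operation and kErrorAccessDenied for anything else, and in both cases it is back in user mode when the call returns.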
So in layman's terms, an application thread switches from user mode to kernel mode when making certain API function calls that require a higher privilege level, such as those that involve accessing files or performing graphics-related functions. In fact, some user threads can spend more time in kernel mode than in user mode! When the kernel mode code is completed, the user thread is automatically switched back to user mode. This means an application programmer cannot write arbitrary instructions that run in kernel mode. The programmer can only call system functions that run in kernel mode.
On Windows NT, we can easily see when a thread is running in user mode and when it is running in kernel mode by using the tool “Perfmon” (built into the OS). (I am not going deep into this tool, as that would divert us from the main topic.)
So for the code I wrote above (which is running under the name “Blog.exe”), I have created a rule to find out when this application's thread is running in kernel mode and when it is running in user mode.
The bold red line in the graph shows the total privileged time (kernel mode execution time), whereas the bold green line shows the total user time (user mode execution time). At a quick glance we can say that this thread actually spends more time in kernel mode than in user mode.
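The same two numbers that Perfmon plots can also be read from code. Here is a small sketch, with names of my own choosing (CpuTimes, current_process_cpu_times), that asks the OS how much CPU time the current process has spent in user mode and in kernel mode. On Windows it uses the Win32 function GetProcessTimes, and on other systems I fall back to the POSIX call getrusage, purely so that the sketch can be compiled anywhere.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/resource.h>
#endif

// Hypothetical result type: CPU time consumed so far by this process.
struct CpuTimes {
    double user_seconds = 0.0;    // time spent executing user mode code
    double kernel_seconds = 0.0;  // time spent executing kernel mode code
    bool ok = false;              // false if the OS query failed
};

CpuTimes current_process_cpu_times() {
    CpuTimes result;
#ifdef _WIN32
    FILETIME creation_ft, exit_ft, kernel_ft, user_ft;
    if (GetProcessTimes(GetCurrentProcess(), &creation_ft, &exit_ft,
                        &kernel_ft, &user_ft)) {
        // FILETIME counts 100-nanosecond ticks split over two 32-bit halves.
        auto to_seconds = [](const FILETIME& ft) {
            const std::uint64_t ticks =
                (static_cast<std::uint64_t>(ft.dwHighDateTime) << 32) |
                ft.dwLowDateTime;
            return static_cast<double>(ticks) / 1e7;
        };
        result.user_seconds = to_seconds(user_ft);
        result.kernel_seconds = to_seconds(kernel_ft);
        result.ok = true;
    }
#else
    struct rusage usage {};
    if (getrusage(RUSAGE_SELF, &usage) == 0) {
        result.user_seconds = usage.ru_utime.tv_sec + usage.ru_utime.tv_usec / 1e6;
        result.kernel_seconds = usage.ru_stime.tv_sec + usage.ru_stime.tv_usec / 1e6;
        result.ok = true;
    }
#endif
    return result;
}
```

Calling this function before and after a piece of work and subtracting the two results gives a rough idea of how that work was split between user mode and kernel mode.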
Okay, now that we have a clear idea of how tasks get executed on Windows, let's jump to the Windows internal architecture.
In Windows there are several services running in user mode and kernel mode to carry out the OS operations successfully.
Let's examine these services one by one:
1) API Service: An API function or subroutine that performs an operating system "service," such as creating a file or drawing some graphics (a line or circle). For example, in my sample code above, MessageBox() is a Win32 API for creating a Windows message box.
2) System service: An undocumented function that is callable from user mode. These are often called by Win32 API functions to provide low-level services. For instance, the CreateProcess() API function (used to create a Windows process) calls the NtCreateProcess() system service to actually create the process. A user application might also call these functions directly, thus bypassing the Win32 APIs, but as most of them are undocumented, applications usually take the route of using the Win32 APIs.
(In Figure 1 we can see that these undocumented functions reside in NTDLL.dll, and a user application is free either to go via the Win32 API path (using the Win32 subsystem, kernel32.dll, user32.dll, etc.) or to call the functions inside NTDLL.dll directly.)
3) Internal Service: A function or subroutine that is callable only from kernel mode. These reside in the lower-level portions of Windows: the Windows NT executive, kernel, or Hardware Abstraction Layer (HAL).
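To see that system services such as NtCreateProcess() really are exported from NTDLL.dll, a program can look them up by name. Below is a minimal sketch. The helper name ntdll_exports is my own, while GetModuleHandleA and GetProcAddress are the standard Win32 functions for this kind of lookup. On non-Windows systems the helper simply reports false, which is only there so that the sketch compiles everywhere.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns true if NTDLL.dll, which is mapped into every
// Windows process, exports a function with the given name.
bool ntdll_exports(const std::string& function_name) {
#ifdef _WIN32
    HMODULE ntdll = GetModuleHandleA("ntdll.dll");
    if (ntdll == nullptr) {
        return false;
    }
    return GetProcAddress(ntdll, function_name.c_str()) != nullptr;
#else
    (void)function_name;  // no NTDLL outside Windows
    return false;
#endif
}
```

On a Windows machine I would expect ntdll_exports("NtCreateProcess") to report true. Actually calling such a function directly is a different matter, since the parameters are undocumented and may change between Windows versions.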
Then, there are several system processes. System processes are special processes that support the operating system. Every Windows system has the following system processes (and more) running at all times. Note that all of these processes run in user mode except the System process.
1) The IDLE process, a single-threaded process that monitors the CPU's idle time.
2) The System process (the only process which runs in kernel mode). This process contains system threads, which are kernel mode threads. Windows and various device drivers create system process threads for various reasons. For example, the memory manager creates system threads for performing virtual memory tasks, the cache manager uses system threads for managing cache memory, and the floppy disk driver uses a system thread to monitor the floppy drives.
3) The session manager (SMSS.EXE). It is one of the first processes to be created when the operating system boots. It performs important initialization functions, such as creating system environment variables, defining MS-DOS device names such as LPT1 and COM1, loading the kernel mode portion of the Win32 subsystem (discussed later), and starting the logon process WinLogon.
4) The Win32 subsystem (CSRSS.EXE). This is the main area of concern here. It is one type of Windows environment subsystem. Other Windows environment subsystems (not pictured in Figure 1) include POSIX and OS/2. POSIX stands for "Portable Operating System Interface," a family of standards based on UNIX; the POSIX subsystem provides limited support for UNIX-style applications. The purpose of an environment subsystem is to act as an interface between user applications and relevant portions of the Windows executive. Each subsystem exposes different functionality from the Windows executive. Every executable file is bound to one of the subsystems. For example, you may use the tool “PE Explorer” (more on this tool in a later post) to see which subsystem a given EXE is bound to. I have checked the subsystem for two tools: “Process Explorer” and the command prompt (cmd.exe, which resides under \WINDOWS\SYSTEM32\ on the OS drive).
Figure: Subsystem for Process Explorer Tool (WIN 32 GUI)
Figure: Subsystem for cmd.exe (WIN 32 Console)
As we can see here, the exe “process explorer.exe” is associated with WIN32 GUI, which means it is a UI-based Win32 application, and the exe “cmd.exe” is associated with WIN32 Console, which means it is a console-based Win32 application. The Win32 subsystem houses the Win32 API, in the form of DLLs such as KERNEL32.DLL, GDI32.DLL, and USER32.DLL. It is interesting to note that under Windows NT 4.0, Microsoft moved part of the Win32 subsystem from user mode to kernel mode.
IMP: Calling a Win32 API function
When an application calls a Win32 API function in the Win32 subsystem, one of several things may happen:
• If the subsystem DLL (such as USER32.DLL) that exports the API function contains all code necessary to execute the function, it will do so and return the results.
• The API function may require that additional code within the Win32 subsystem (but outside of the DLL that exports the function) be called in support of the function.
• The function may need the services of an undocumented system service. For instance, to create a new process, the CreateProcess() API function calls the undocumented system service NtCreateProcess() to actually create the process. This is done through the NTDLL.dll function library, which helps make the transition from user mode to kernel mode.
5) The WinLogon process (WINLOGON.EXE). This system process handles user logons and logoffs and processes the special Windows key combination Ctrl-Alt-Delete. WinLogon is responsible for starting the Windows shell (Windows Explorer).
You can easily see all these processes using “Process Explorer” (I have already discussed this tool in my first post, named “Basic System Security (I)”).
The Windows Executive:
Windows executive services make up the low-level kernel mode portion of Windows NT and are contained in the file NTOSKRNL.EXE. (This is the real kernel, exactly :) ) Executive services are sometimes divided into two groups: the executive (upper layer) and the kernel (lower layer). The kernel is the lowest layer of the operating system, performing the most fundamental services, such as:
• Thread scheduling
• Exception handling
• Interrupt handling
• Synchronization of processors in a multiprocessor system
• Creating kernel objects
Here are some of the major portions of the executive:
Process and Thread Manager
This component creates and terminates both processes and threads, using the services of the low-level kernel.
The Virtual Memory Manager
This component implements virtual memory.
The Input/Output Manager
This component implements device-independent input/output and communicates with device drivers.
The Cache Manager
This component manages disk caching.
The Object Manager
This component creates and manages Windows executive objects. Windows uses objects to represent its various resources, such as processes and threads.
Runtime Libraries
These components contain runtime library functions, such as string manipulation functions and arithmetic functions.
The Hardware Abstraction Layer (HAL)
The hardware abstraction layer (or HAL) is a kernel mode library (HAL.DLL) that provides a low-level interface with the hardware. Windows components and third-party device drivers communicate with the hardware through the HAL. Accordingly, there are many versions of HAL to accommodate different hardware platforms. The appropriate HAL is chosen when Windows is installed.
I would like to end this post with an interesting discussion:
People often ask WHETHER WINDOWS NT IS A MICROKERNEL-BASED SYSTEM. There is some confusion about this, but the answer is actually NO. By definition, a microkernel OS is one where the principal operating system components (such as the memory manager, process manager, and I/O manager) run as separate processes in their own private address spaces, layered on a primitive set of services which the microkernel provides. In Windows NT these components all run in kernel mode and share a single address space. The reason is simple: a pure microkernel design is commercially impractical because of its complexity and runtime inefficiency.
References
1) Windows Internals by Mark E. Russinovich & David Solomon.
2) Win32 API Programming by Steven Roman.