UNIX Application Migration Guide

08/30/2006

Chapter 9: Win32 Code Conversion

Larry Twork, Larry Mead, Bill Howison, JD Hicks, Lew Brodnax, Jim McMicking, Raju Sakthivel, David Holder, Jon Collins, Bill Loeffler

Microsoft Corporation

October 2002

Applies to:

Microsoft® Windows®

UNIX applications

The patterns & practices team has decided to archive this content to allow us to streamline our latest content offerings on our main site and keep it focused on the newest, most relevant content. However, we will continue to make this content available because it is still of interest to some of our users.

We offer this content as-is, without warranty that it is still technically accurate as some of the material is undoubtedly outdated. Note that the content may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

Summary: Chapter 9: Win32 Code Conversion covers the fundamentals of converting code from UNIX to Windows. Functional areas that surface when migrating to Win32 such as processes, threads, signal handling, memory management, networking and a host of related subjects are covered. In general, the material is presented first with background on the migration issue followed by code samples to illustrate before and after migration. (212 printed pages)

Contents

Introduction

Processes

Signals and Signal Handling

Threads

Memory Management

Users, Groups and Security

File and Data Access

Interprocess Communication

Sockets and Networking

The Process Environment

Multiprocessor Considerations

Daemons and Services

Appendixes

Introduction

This chapter describes how you can modify the source code for your UNIX application so that it will compile on the Microsoft® Windows® operating system. You need to modify your code due to the differences between the UNIX and Windows application and coding environments described earlier in this guide.

The potential coding differences that need to be addressed are described in the following categories:

Processes

Signals and signal handling

Threads

Memory management

Users, groups, and security

File and data access

Interprocess communication

Sockets and networking

Process environment

Multiprocessor considerations

Daemons and services

For each of these categories, this chapter:

Describes the coding differences.

Outlines options for converting the code.

Illustrates with source code examples.

You can then choose the solution appropriate to your application and use these examples as a basis for constructing your Windows code.

This guide gives you sufficient information to choose the best method of converting the code. Once you have made your choice, you can refer to the standard documentation to ensure that you understand the details of the Microsoft Win32® application programming interface (API) functions. Throughout this chapter, there are references for further information on the recommended coding changes. In particular, references to the details of the function calls and libraries are given.

Processes

The UNIX and Windows process models are very different, and the major difference lies in the creation of processes. UNIX uses fork to create a new copy of a running process and exec to replace a process's executable file with a new one. Windows does not have a fork function. Instead, Windows creates processes in one step by using CreateProcess. While there is no need in Win32 to execute the process after its creation (as it will already be executing the new code), the standard exec functions are still available in Win32.

These differences (and others) result in the need to convert the UNIX code before it can run on a Win32 platform.

The areas that you need to consider that are covered in this section are:

Process creation

Replacing a process's executable code

Spawning child processes and the process hierarchy

Waiting for a child process

Setting process resource limits

The concept of Windows Jobs is also introduced, which allows you to group processes together for management purposes. This functionality is not available in UNIX.

Note: There are a number of process management functions in the Win32 API. For details of these functions, consult the Win32 API reference.

Creating a New Process

In UNIX, a developer creates a new process by using fork. The fork function creates a child process that is an almost exact copy of the parent process. The fact that the child is a copy of the parent ensures that the process environment is the same for the child as it is for the parent.

In Windows, the CreateProcess function enables the parent process to create an operating environment for a new process. The environment includes the working directory, window attributes, environment variables, execution priority and command line arguments. A handle is returned by the CreateProcess function, which enables the parent application to perform operations on the process and its environment while it is executing. Unlike UNIX, the executable file run by CreateProcess is not a copy of the parent process, and it has to be explicitly specified in the call to CreateProcess.

An alternative to CreateProcess is to use one of the spawn functions that are present in the standard C runtime. There are 16 variations of the spawn function. Each spawn function creates and executes a new process. Many of these have the same functionality as the similarly named exec functions. The spawn functions take an additional mode argument that lets the new process replace the current process (_P_OVERLAY), suspend the calling process until the spawned process terminates (_P_WAIT), run asynchronously with the calling process (_P_NOWAIT), or run detached as a background process (_P_DETACH).

For a UNIX application to change the executable file run in the child process, the child process must explicitly call an exec function to overwrite the executable file with a new application. The combination of fork and exec is similar to, but not the same as, CreateProcess.

The example below shows a UNIX application that forks to create a child process and then runs the UNIX ps command by using execlp.

Creating a process in UNIX using fork and exec

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>

    int main()
    {
        pid_t pid;

        printf("Running ps with fork and execlp\n");
        pid = fork();
        switch (pid) {
        case -1:
            perror("fork failed");
            exit(1);
        case 0:
            /* Child: pass argv[0] and end the list with a null pointer. */
            if (execlp("ps", "ps", (char *)NULL) < 0) {
                perror("execlp failed");
                exit(1);
            }
            break;
        default:
            break;
        }
        printf("Done.\n");
        exit(0);
    }

You can port this code to Windows using the Win32 CreateProcess function discussed earlier, or by using a spawn function from the standard C runtime library. In both cases, the old and new processes run in parallel, asynchronously.

Creating a process in Windows using CreateProcess

    #include <windows.h>
    #include <process.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        STARTUPINFO si;
        PROCESS_INFORMATION pi;
        char cmdline[] = "notepad";  // Writable buffer for the command line

        GetStartupInfo(&si);
        printf("Running Notepad with CreateProcess\n");
        if (!CreateProcess(NULL,
                           cmdline,  // Name of app to launch
                           NULL,     // Default process security attributes
                           NULL,     // Default thread security attributes
                           FALSE,    // Don't inherit handles from the parent
                           0,        // Normal priority
                           NULL,     // Use the same environment as the parent
                           NULL,     // Launch in the current directory
                           &si,      // Startup Information
                           &pi))     // Process information stored upon return
        {
            printf("CreateProcess failed (%lu)\n", GetLastError());
            exit(1);
        }
        // Release our handles; the new process keeps running.
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
        printf("Done.\n");
        exit(0);
    }

The arguments supported by CreateProcess (shown in the preceding example) give you a considerable degree of control over the newly created process. This contrasts with the spawn functions, which do not provide options to set process priority, security attributes, or the debug status.

The next example shows a port of the same code using the _spawnlp function.

Creating a process in Windows using spawn

    #include <windows.h>
    #include <process.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        printf("Running Notepad with spawnlp\n");
        _spawnlp(_P_NOWAIT, "notepad", "notepad", NULL);
        printf("Done.\n");
        exit(0);
    }

Running either of the above examples yields a console window similar to that shown here:

Figure 1. Output from spawn example code

Replacing a Process Image (exec)

A UNIX application replaces the executing image with that of another application by using one of the exec functions. As mentioned previously, a fork followed by an exec is similar to CreateProcess.

Windows supports the six POSIX variants of the exec function plus two additional ones (execlpe and execvpe). The function signatures are identical, and the functions come as part of the standard C runtime. Porting UNIX code that uses exec to Win32 is therefore straightforward. The following is a simple UNIX example showing the use of the execlp function.

Note For more information on exec support on Win32, see the standard C runtime library documentation that comes with the Microsoft Visual Studio® development system.

Replacing a process image in UNIX using exec

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        printf("Running ps with execlp\n");
        execlp("ps", "ps", "-l", (char *)NULL);
        /* Reached only if execlp fails. */
        printf("Done.\n");
        exit(0);
    }

The preceding example compiles and runs on Windows with only minor modifications. It does, however, require an executable file called ps.exe to be available (one is included with the Interix product).

The <unistd.h> header file is not available on Windows. To build this example on Windows, replace it with <process.h>. Doing so allows you to compile, link, and run this simple application.

Waiting for a Spawned Process

In the preceding section, the example showed how you can create an asynchronous process where the parent and child processes execute simultaneously. No synchronization was performed. This section describes how to modify the previous example to include functionality that enables the parent process to wait for the child process to complete or terminate before continuing.

To accomplish this in UNIX, a developer would use one of the wait functions to suspend the parent process until the child process terminates. The same semantics are available when using Windows. The functions used are different, but the results are the same.

When you view the examples, keep in mind that this is not an exhaustive comparison between the two platforms. A very simple scenario is described, but if you need to expand the scenario to include waiting for multiple child processes, the spawn example does not map adequately as it does not include support for this functionality. In this case, you need to consider the CreateProcess approach and WaitForMultipleObjects.

To see the code for this example, see Appendix G: Waiting for a Spawned Process.

Process vs. Threads

In the next example, the UNIX code is forking a process, but not executing a separate runtime image. This creates a separate execution path within the application. When using Windows, this is achieved by using threads rather than processes. If your UNIX application creates separate threads of execution in this manner, you should use the Win32 API CreateThread.

The process of creating threads is covered in the next section, Threads.

UNIX code with forking executable

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>

    int main()
    {
        pid_t pid;

        printf("fork program started\n");
        pid = fork();
        switch (pid) {
        case -1:
            perror("fork failed");
            exit(1);
        case 0:
            puts("I'm the child");
            break;
        default:
            puts("I'm the parent");
            break;
        }
        exit(0);
    }

Managing Process Resource Limits

Developers often want to create processes that run with a specific set of resource restrictions. In some cases, they may impose limitations for the purposes of stress testing or forced failure condition testing. In other cases, however, the limitations may be imposed to restrict runaway processes from using up all available memory, CPU cycles, or disk space.

In UNIX, the getrlimit function retrieves resource limits for a process, the getrusage function retrieves current usage, and the setrlimit function sets new limits. The common limit names and their meanings are described in Table 1:

Table 1. Common limit names and definitions

| Limit | Description |
| --- | --- |
| RLIMIT_CORE | The maximum size, in bytes, of a core file created by this process. If the core file is larger than RLIMIT_CORE, the write is terminated at this value. If the limit is set to 0, then no core files are created. |
| RLIMIT_CPU | The maximum time, in seconds, of CPU time a process can use. If the process exceeds this time, the system generates SIGXCPU for the process. |
| RLIMIT_DATA | The maximum size, in bytes, of a process's data segment. If the data segment exceeds this value, the functions brk, malloc, and sbrk will fail. |
| RLIMIT_FSIZE | The maximum size, in bytes, of a file created by a process. If the limit is 0, the process cannot create a file. If a write or truncation call exceeds the limit, further attempts will fail. |
| RLIMIT_NOFILE | The highest possible value for a file descriptor, plus one. This limits the number of file descriptors a process may allocate. If more than RLIMIT_NOFILE files are allocated, functions allocating new file descriptors may fail with the error EMFILE. |
| RLIMIT_STACK | The maximum size, in bytes, of a process's stack. The stack won't automatically exceed this limit; if a process tries to exceed the limit, the system generates SIGSEGV for the process. |
| RLIMIT_AS | The maximum size, in bytes, of a process's total available memory. If this limit is exceeded, the memory functions brk, malloc, mmap, and sbrk will fail with errno set to ENOMEM, and automatic stack growth will fail as described for RLIMIT_STACK. |

Windows uses job objects to set job limits (rather than process limits). Unlike UNIX, Windows job objects do not have File input/output (I/O) source restrictions. If you require File I/O limits in your application, you need to create your own code to handle this.

Windows job objects

Windows supports the concept of job objects, which allows you to group one or more processes into a single entity. Once a job object has been populated with the desired processes, the entire group can be manipulated for various purposes ranging from termination to imposing resource restrictions.

The restrictions that job objects allow you to enforce are described in Table 2:

Table 2. Job objects

| Member | Description | Notes |
| --- | --- | --- |
| PerProcessUserTimeLimit | Specifies the maximum user-mode time allotted to each process (in 100 ns intervals). | The system automatically terminates any process that uses more than its allotted time. To set this limit, specify the JOB_OBJECT_LIMIT_PROCESS_TIME flag in the LimitFlags member. |
| PerJobUserTimeLimit | Specifies how much more user-mode time the processes in this job can use (in 100 ns intervals). | By default, the system automatically terminates all processes when this time limit is reached. You can change this value periodically as the job runs. To set this limit, specify the JOB_OBJECT_LIMIT_JOB_TIME flag in the LimitFlags member. |
| LimitFlags | Specifies which restrictions to apply to the job. | See the job objects API reference for more information. |
| MinimumWorkingSetSize / MaximumWorkingSetSize | Specifies the minimum and maximum working set size for each process (not for all processes within the job). | Normally, a process's working set can grow above its maximum; setting MaximumWorkingSetSize forces a hard limit. Once the process's working set reaches this limit, the process pages against itself. Calls to SetProcessWorkingSetSize by an individual process are ignored unless the process is just trying to empty its working set. To set this limit, specify the JOB_OBJECT_LIMIT_WORKINGSET flag in the LimitFlags member. |
| ActiveProcessLimit | Specifies the maximum number of processes that can run concurrently in the job. | Any attempt to go over this limit causes the new process to be terminated with a "not enough quota" error. To set this limit, specify the JOB_OBJECT_LIMIT_ACTIVE_PROCESS flag in the LimitFlags member. |
| Affinity | Specifies the subset of the CPU(s) that can run the processes. | Individual processes can limit this even further. To set this limit, specify the JOB_OBJECT_LIMIT_AFFINITY flag in the LimitFlags member. |
| PriorityClass | Specifies the priority class that all processes use. | If a process calls SetPriorityClass, the call will return successfully even though it actually fails. If the process calls GetPriorityClass, the function returns what the process has set the priority class to even though this might not be the process's actual priority class. In addition, SetThreadPriority fails to raise threads above normal priority but can be used to lower a thread's priority. To set this limit, specify the JOB_OBJECT_LIMIT_PRIORITY_CLASS flag in the LimitFlags member. |
| SchedulingClass | Specifies a relative time quantum difference assigned to threads in the job. | Value can be from 0 to 9 inclusive; 5 is the default. See the text after this table for more information. To set this limit, specify the JOB_OBJECT_LIMIT_SCHEDULING_CLASS flag in the LimitFlags member. |

As you may have observed by comparing the tables for setrlimit and job objects, the restrictions offered by job objects are comparable except in one major area: File I/O.

Limiting file I/O when using Windows

When a process is created in UNIX, the Process Control Block (PCB) in kernel space contains an array of limits that is initialized with default values. In the case of the RLIMIT_FSIZE limit, the write procedures in the kernel are aware of the limit structure in the PCB, and these functions make checks to enforce the limits. The Windows operating system does not implement similar limits on files. To solve this problem, you must write your own solution and build it into your application.

This section presents a solution that you could use in your application. This solution emulates the UNIX file resource limits with:

An array of limits held as a static variable. This is similar to how some of the C runtime functions use static variables.

Our own versions of the UNIX functions getrlimit() and setrlimit() . These functions manipulate the limit array.

Wrappers for each of the disk write functions. These wrappers are resource limit aware.

This solution is implemented as three files. Two of the files, resource.h and resource.c, implement the getrlimit(), setrlimit(), rfwrite() and _rwrite() functions. Only fwrite() and _write() are wrapped since they are the most common disk write functions encountered in the UNIX world. The third file is rlimit.c, which is a very simple test program used to confirm that rfwrite() fails when the limit is reached.

For more information, see Appendix B: Limiting File I/O.

Process Accounting

The Win32 API has several functions for gathering process accounting information:

GetProcessShutdownParameters

GetProcessTimes

GetProcessWorkingSetSize

SetPriorityClass

SetProcessShutdownParameters

SetProcessWorkingSetSize

Alternatively, a better method of obtaining process information is through the Windows Management Instrumentation (WMI) API.

For more information on WMI, see Windows Management Instrumentation (WMI) Tools.

Signals and Signal Handling

The UNIX operating system supports a wide range of signals. UNIX signals are software interrupts that catch or indicate different types of events. Windows, on the other hand, supports only a small set of signals that is restricted to exception events. Consequently, converting UNIX code to Win32 requires new techniques to replace some uses of UNIX signals.

The Windows signal implementation is limited to the following signals (Table 3):

Table 3. Windows signals

| Signal | Meaning |
| --- | --- |
| SIGABRT | Abnormal termination |
| SIGFPE | Floating-point error |
| SIGILL | Illegal instruction |
| SIGINT | CTRL+C signal |
| SIGSEGV | Illegal storage access |
| SIGTERM | Termination request |

Note When a Ctrl+C interrupt occurs, Win32 operating systems generate a new thread to handle the interrupt. This can cause a single-thread application, such as one ported from UNIX, to become multithreaded, potentially resulting in unexpected behavior.

When an application uses other signals not supported in Windows, you have two choices:

Use additional libraries that provide required signals, such as those provided by Microsoft Windows Services for UNIX.

Use a comparable Windows mechanism, such as Windows Messages.

This section focuses on the Windows mechanisms that you can use to replace the use of some UNIX signals. Table 4 shows the recommended mechanisms that you can use to replace common UNIX signals. There are three main mechanisms:

Native signals

Event objects

Messages

Table 4. UNIX signals and replacement mechanisms

| Signal name | Description | Replacement mechanism |
| --- | --- | --- |
| SIGABRT | Abnormal termination | SIGABRT |
| SIGALRM | Time-out alarm | SetTimer (WM_TIMER) or CreateWaitableTimer |
| SIGCHLD | Change in status of child | WaitForSingleObject |
| SIGCONT | Continue stopped process | WaitForSingleObject |
| SIGFPE | Floating point exception | SIGFPE |
| SIGHUP | Hangup | NA |
| SIGILL | Illegal hardware instruction | SIGILL |
| SIGINT | Terminal interrupt character | WM_CHAR |
| SIGKILL | Termination | WM_QUIT |
| SIGPIPE | Write to pipe with no readers | WaitForSingleObject |
| SIGQUIT | Terminal Quit character | WM_CHAR |
| SIGSEGV | Invalid memory reference | SIGSEGV |
| SIGSTOP | Stop process | WaitForSingleObject |
| SIGTERM | Termination | SIGTERM |
| SIGTSTP | Terminal Stop character | WM_CHAR |
| SIGTTIN | Background read from control tty | NA |
| SIGTTOU | Background write to control tty | NA |
| SIGUSR1 | User defined signal | SendMessage (WM_APP) |
| SIGUSR2 | User defined signal | SendMessage (WM_APP) |

Note Only POSIX signals are considered in this table (that is, Seventh Edition, System V, and BSD signals are not).

This section discusses how you can use the three mechanisms in Table 4 to convert the parts of your code that use signals into the Windows environment.

Another mechanism that can be useful when converting some UNIX uses of signals to Windows is event kernel objects. For more information on these objects, see the CreateEvent example in the Logging System Messages section later in this chapter.

Using Native Signals in Windows

In the following example, the simple case of catching SIGINT to detect Ctrl-C is demonstrated. As you can see from the two source listings, support for handling native signals in UNIX and Win32 is very similar.

Managing signals in UNIX

    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>

    /* The intrpt function reacts to the signal passed in the parameter
       signum. This function is called when a signal occurs. A message
       is output, then the signal handling for SIGINT (by default
       generated by pressing CTRL-C) is reset back to the default
       behavior. */
    void intrpt(int signum)
    {
        printf("I got signal %d\n", signum);
        (void) signal(SIGINT, SIG_DFL);
    }

    /* main intercepts the SIGINT signal generated when Ctrl-C is input.
       Otherwise, sits in an infinite loop, printing a message once a
       second. */
    int main()
    {
        (void) signal(SIGINT, intrpt);
        while (1) {
            printf("Hello World!\n");
            sleep(1);
        }
    }

Managing signals in Windows

    #include <windows.h>
    #include <signal.h>
    #include <stdio.h>

    void intrpt(int signum)
    {
        printf("I got signal %d\n", signum);
        (void) signal(SIGINT, SIG_DFL);
    }

    /* main intercepts the SIGINT signal generated when Ctrl-C is input.
       Otherwise, sits in an infinite loop, printing a message once a
       second. */
    int main()
    {
        (void) signal(SIGINT, intrpt);
        while (1) {
            printf("Hello World!\n");
            Sleep(1000);
        }
    }

Note By default, signal terminates the calling program with exit code 3, regardless of the value of sig. For more information, see the signal topic in the Visual C++ Run Time Library Reference.

With the exception of requiring an additional header file, and the different signature of the sleep function, these two examples are identical. Unfortunately, this is the extent of the similarities in signal handling between the two platforms.

Replacing UNIX Signals Within Windows

UNIX uses signals to send alerts to processes when specific actions occur. A UNIX application would use the kill function to activate signals internally. As discussed earlier, Win32 provides only limited support for signals. As a result, you have to rewrite your code to use another form of event notification in Win32.

The following example illustrates how you would convert UNIX code to Windows Messages or Event Objects. It shows a simple main that forks a child process, which issues the SIGALRM signal. The parent process catches the alarm and outputs a message when it is received.

Using the SIGALRM signal in UNIX

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <signal.h>

    static volatile sig_atomic_t alarm_fired = 0;

    /* The alrm_bell function simulates an alarm clock. */
    void alrm_bell(int sig)
    {
        alarm_fired = 1;
    }

    int main()
    {
        int pid;

        /* Child process waits for 5 sec's before sending SIGALRM to
           its parent. */
        printf("alarm application starting\n");
        if ((pid = fork()) == 0) {
            sleep(5);
            kill(getppid(), SIGALRM);
            exit(0);
        }

        /* Parent process arranges to catch SIGALRM with a call to
           signal and then waits for the child process to send
           SIGALRM. */
        printf("waiting for alarm\n");
        (void) signal(SIGALRM, alrm_bell);
        pause();
        if (alarm_fired)
            printf("Ring...Ring!\n");
        printf("alarm application done\n");
        exit(0);
    }

Replacing UNIX Signals with Windows Messages

In the first Win32 example below, a form of Microsoft Windows Messages is used to signal the parent process. In the example, the SetTimer function is used to signal the parent process that an alarm has been activated. While code could have been created to do the timing, using the SetTimer function greatly simplifies this example.

Another advantage of using SetTimer is that the callback function is invoked in the same thread that calls SetTimer. No synchronization is necessary.

If the requirements are simple, consider using a thread to act as a timer thread, which simply calls Sleep to create the desired delay. At the end of the delay, a call is made to a timer callback function. The problem with this approach is that the callback function is called from a different thread than your primary thread. If the callback function requires resources that are thread specific, you will need to use one of the appropriate synchronization mechanisms discussed later in the "Threads" section.

Additional code has been added to the example so that an application using this code can catch any standard Windows message as well as application and user defined messages. You can use these messages to engineer solutions to other signals that are not directly supported by the native signal implementation in Win32.

Replacing SIGALRM using Windows messages

    #include <windows.h>
    #include <stdio.h>
    #include <conio.h>
    #include <stdlib.h>

    static int alarm_fired = 0;

    /* The alrm_bell function simulates an alarm clock. */
    VOID CALLBACK alrm_bell(HWND hwnd, UINT uMsg, UINT_PTR idEvent, DWORD dwTime)
    {
        alarm_fired = 1;
        printf("Ring...Ring!\n");
    }

    int main()
    {
        MSG msg = { 0 };

        printf("alarm application starting\n");
        /* Set up a 5 second timer which calls alrm_bell */
        SetTimer(0, 0, 5000, alrm_bell);
        printf("waiting for alarm\n");
        /* Get the message, & dispatch. This causes alrm_bell to be
           invoked. */
        while (!alarm_fired)
            if (GetMessage(&msg, 0, 0, 0)) {
                if (msg.message == WM_TIMER)
                    printf("WM_TIMER\n");
                DispatchMessage(&msg);
            }
        printf("alarm application done\n");
        exit(0);
    }

Notice in this example that the WM_TIMER message is issued and captured by the GetMessage function. If you remove the call to DispatchMessage, the alrm_bell function would never be called, but the WM_TIMER message would be received. With this simple application, you can capture a variety of Windows messages. Moreover, if you want to trigger the callback function before the specified time, you can use the PostMessage(WM_TIMER) call. This is analogous to using the kill function to send a signal in UNIX.

Replacing UNIX Signals with Windows Event Objects

Some events that UNIX handles through signals are represented in Win32 as objects. Functions are available to wait on and test these objects; WaitForSingleObject is one example.

In the example code below, a timer object is used to signal when a timed interval has elapsed. Again, this example provides the same functionality as the UNIX SIGALRM example above.

Note While this illustration encompasses the process in a single thread, this is not a requirement. The timer object can be tested and waited for in other threads if necessary.

Replacing SIGALRM using event objects

    #define _WIN32_WINNT 0x0500
    #include <windows.h>
    #include <stdio.h>
    #include <conio.h>
    #include <stdlib.h>

    int main()
    {
        HANDLE hTimer = NULL;
        LARGE_INTEGER liDueTime;

        // A negative due time is relative, in 100 ns units (5 seconds)
        liDueTime.QuadPart = -50000000;
        printf("alarm application starting\n");
        // Set up a 5 second timer object
        hTimer = CreateWaitableTimer(NULL, TRUE, "WaitableTimer");
        SetWaitableTimer(hTimer, &liDueTime, 0, NULL, NULL, 0);
        // Now wait for the alarm
        printf("waiting for alarm\n");
        // Wait for the timer object
        WaitForSingleObject(hTimer, INFINITE);
        printf("Ring...Ring!\n");
        printf("alarm application done\n");
        CloseHandle(hTimer);
        exit(0);
    }

Porting the Sigaction Call

Win32 does not support sigaction. The UNIX example below shows how sigaction is typically used in a UNIX application. In this example, the handler for the SIGINT signal is set. Code of this kind can be converted to the native signal function or to Windows Messages, both of which were shown earlier.

Note To terminate this application from the keyboard, press CTRL+\.

    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>

    void intrpt(int signum)
    {
        printf("I got signal %d\n", signum);
    }

    int main()
    {
        struct sigaction act;

        act.sa_handler = intrpt;
        sigemptyset(&act.sa_mask);
        act.sa_flags = 0;
        sigaction(SIGINT, &act, 0);
        while (1) {
            printf("Hello World!\n");
            sleep(1);
        }
    }

Threads

A thread is an independent path of execution in a process that shares the address space, code, and global data of the process. Each thread is allocated time slices based on its priority and has its own set of registers, stack, I/O handles, and message queue. Threads can usually run on separate processors on a multiprocessor computer. Win32 enables you to assign threads to a specific processor on a multiprocessor hardware platform.

An application using multiple processes usually has to implement some form of interprocess communication (IPC). This can result in significant overhead, and possibly a communication bottleneck. In contrast, threads share the process data between them, and interthread communication can be much faster. The problem with threads sharing data is that this can lead to data access conflicts between multiple threads. You can address these conflicts using synchronization techniques, such as semaphores and mutexes.

In UNIX, developers implement threads by using the POSIX pthread functions. In Win32, developers can implement UNIX threading by using the Win32 API thread management functions. The functionality and operation of threads in UNIX and Win32 is very similar; however, the function calls and syntax are very different.

The following are some similarities between UNIX and Windows:

Every thread must have an entry point. The name of the entry point is entirely up to you so long as the signature is unique and the linker can adequately resolve any ambiguity.

Each thread is passed a single parameter when it is created. The contents of this parameter are entirely up to the developer and have no meaning to the operating system.

A thread function must return a value.

A thread function needs to use local parameters and variables as much as possible. When you use global variables or shared resources, threads must use some form of synchronization to avoid potentially clobbering and corrupting data.

This section looks at how you should go about converting UNIX threaded applications into Win32 thread applications. As you know from the preceding section about processes, you may also have decided to convert some of your application's use of UNIX processes into threads.

Note More information about programming with threads in Win32 can be found in the topic "Multithreading: Programming Tips" on the MSDN Web site.

For details on thread management functions in the Win32 API, see the Win32 API reference in Visual Studio or MSDN.

Creating a Thread

When creating a thread in UNIX, use the pthread_create function. This function takes four arguments: a pointer to a pthread_t variable that receives the thread identifier, an argument specifying the thread's attributes (usually set to NULL, indicating default settings), the function the thread will run, and the single argument passed to that function. The thread finishes execution by calling pthread_exit, which in this case returns a string. The process can wait for the thread to complete by using the pthread_join function.

The following simple UNIX example creates a thread and waits for it to finish.

Creating a single thread in UNIX

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <pthread.h>

    char message[] = "Hello World";

    void *thread_function(void *arg)
    {
        printf("thread_function started. Arg was %s\n", (char *)arg);
        sleep(3);
        strcpy(message, "Bye!");
        pthread_exit("See Ya");
    }

    int main()
    {
        int res;
        pthread_t a_thread;
        void *thread_result;

        res = pthread_create(&a_thread, NULL, thread_function, (void *)message);
        if (res != 0) {
            perror("Thread creation failed");
            exit(EXIT_FAILURE);
        }
        printf("Waiting for thread to finish...\n");
        res = pthread_join(a_thread, &thread_result);
        if (res != 0) {
            perror("Thread join failed");
            exit(EXIT_FAILURE);
        }
        printf("Thread joined, it returned %s\n", (char *)thread_result);
        printf("Message is now %s\n", message);
        exit(EXIT_SUCCESS);
    }

In Win32, threads are created using the CreateThread function. CreateThread requires:

The size of the thread's stack

The security attributes of the thread

The address at which to begin execution of a procedure

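An optional pointer-sized value (32 bits on 32-bit Windows) that is passed to the thread's procedure

Creation flags that control how the thread starts (for example, CREATE_SUSPENDED creates the thread in a suspended state)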

An address to store the system-wide unique thread identifier

Once a thread is created, the handle returned by CreateThread can be used to manage the thread until it has terminated; the thread identifier is a separate value that identifies the thread system-wide. The next example demonstrates how you should use CreateThread to create a single thread.

Creating a single thread in Windows

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char message[] = "Hello World";

    DWORD WINAPI thread_function(PVOID arg)
    {
        printf("thread_function started. Arg was %s\n", (char *)arg);
        Sleep(3000);
        strcpy(message, "Bye!");
        return 100;
    }

    int main()
    {
        HANDLE a_thread;
        DWORD a_threadId;
        DWORD thread_result;

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)message, 0, &a_threadId);
        if (a_thread == NULL) {
            // Win32 functions report errors through GetLastError, not errno.
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("Waiting for thread to finish...\n");
        if (WaitForSingleObject(a_thread, INFINITE) != WAIT_OBJECT_0) {
            fprintf(stderr, "Thread join failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Retrieve the code returned by the thread.
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        printf("Thread joined, it returned %lu\n", thread_result);
        printf("Message is now %s\n", message);
        exit(EXIT_SUCCESS);
    }

The UNIX and Win32 examples have roughly equivalent semantics. There are only two notable differences:

The thread function in the Win32 code cannot return a string value. Developers must use some other means to convey the string message back to the parent (for example, returning an index into a string array).

The Win32 version of the thread function simply returns a DWORD value rather than calling a function to terminate the thread. ExitThread could have been called, but this is not necessary because ExitThread is called automatically upon the return from the thread procedure. TerminateThread could also be called, but this is neither necessary nor recommended, because TerminateThread causes the thread to exit unexpectedly. The thread then has no chance to execute any user-mode code, and its initial stack is not deallocated. Furthermore, any DLLs attached to the thread are not notified that the thread is terminating. For more information, see Process and Thread Functions.

The two solutions have vastly different syntaxes. Win32 uses a different set of API calls to manage threads. As a result, the relevant data elements and arguments are considerably different.

Canceling a Thread

The details of terminating threads differ significantly between UNIX and Win32. While both environments allow threads to block termination entirely, UNIX offers additional facilities that allow a thread to specify if it is to be terminated immediately or deferred until it reaches a safe recovery point. Moreover, UNIX provides a facility known as cancellation cleanup handlers, which a thread can push and pop from a stack that is invoked in a last-in-first-out order when the thread is terminated. These cleanup handlers are coded to clean up and restore any resources before the thread is actually terminated.
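On the UNIX side, the deferred cancellation and cleanup-handler facilities described above typically look like the following. This is a minimal sketch to help you recognize the pattern in code you are converting; the names release_buffer, worker, cancel_with_cleanup, and cleanup_calls are hypothetical.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int cleanup_calls = 0;   /* written by the handler, read after the join */

/* Cleanup handler: runs when the thread is cancelled between the push and
   the pop, and also at the pop itself because the pop is passed 1. */
static void release_buffer(void *buf)
{
    free(buf);
    cleanup_calls++;
}

static void *worker(void *arg)
{
    char *buf = malloc(256);
    int i;

    (void)arg;
    pthread_cleanup_push(release_buffer, buf);
    for (i = 0; i < 60; i++)
        sleep(1);               /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);     /* normal completion: run the handler as well */
    return NULL;
}

/* Returns 1 if the thread was cancelled and its handler ran exactly once. */
int cancel_with_cleanup(void)
{
    pthread_t t;
    void *result;

    cleanup_calls = 0;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return 0;
    /* New threads use deferred cancellation by default, so the request
       takes effect only at the thread's next cancellation point. */
    pthread_cancel(t);
    if (pthread_join(t, &result) != 0)
        return 0;
    return result == PTHREAD_CANCELED && cleanup_calls == 1;
}
```

Each push and pop pair like this one marks a resource that the Win32 version must release explicitly, because there is no equivalent handler stack to rely on.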

The Win32 API allows you to terminate a thread asynchronously. Unlike UNIX, in Win32 code you cannot create cleanup handlers, and a thread cannot defer its own termination. Therefore, it is recommended that you design your code so that threads terminate by returning an exit code and are never terminated forcibly. To do this, you should design your thread code to accept some form of message or event that signals that the thread should terminate. Based on this notification, the thread logic can elect to execute cleanup-handling code and return normally.

To prevent a thread from being terminated, you should remove the security attributes for THREAD_TERMINATE from the thread object.

While forcing a thread to end by using TerminateThread is not recommended, for completeness, the following example shows how you could convert UNIX code that cancels a thread into Win32 code that cancels a thread using this method.

Canceling a thread in UNIX

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>

    void *thread_function(void *arg)
    {
        int i, res;

        res = pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
        if (res != 0) {
            perror("Thread pthread_setcancelstate failed");
            exit(EXIT_FAILURE);
        }
        res = pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL);
        if (res != 0) {
            perror("Thread pthread_setcanceltype failed");
            exit(EXIT_FAILURE);
        }
        printf("thread_function is running\n");
        for (i = 0; i < 10; i++) {
            printf("Thread is running (%d)...\n", i);
            sleep(1);
        }
        pthread_exit(0);
    }

    int main()
    {
        int res;
        pthread_t a_thread;
        void *thread_result;

        res = pthread_create(&a_thread, NULL, thread_function, NULL);
        if (res != 0) {
            perror("Thread creation failed");
            exit(EXIT_FAILURE);
        }
        sleep(3);
        printf("Cancelling thread...\n");
        res = pthread_cancel(a_thread);
        if (res != 0) {
            perror("Thread cancellation failed");
            exit(EXIT_FAILURE);
        }
        printf("Waiting for thread to finish...\n");
        res = pthread_join(a_thread, &thread_result);
        if (res != 0) {
            perror("Thread join failed");
            exit(EXIT_FAILURE);
        }
        exit(EXIT_SUCCESS);
    }

Canceling a thread in Windows

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    DWORD WINAPI thread_function(PVOID arg)
    {
        int i;

        // The thread is created with a NULL argument, so do not print it.
        printf("thread_function is running\n");
        for (i = 0; i < 10; i++) {
            printf("Thread is running (%d)...\n", i);
            Sleep(1000);
        }
        return 100;
    }

    int main()
    {
        HANDLE a_thread;
        DWORD thread_result;

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)NULL, 0, NULL);
        if (a_thread == NULL) {
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        Sleep(3000);
        printf("Cancelling thread...\n");
        if (!TerminateThread(a_thread, 0)) {
            fprintf(stderr, "Thread cancellation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("Waiting for thread to finish...\n");
        WaitForSingleObject(a_thread, INFINITE);
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        exit(EXIT_SUCCESS);
    }

When you compare the UNIX and Win32 examples, you can see that the Win32 implementation has no setting for deferred termination, because Win32 does not support it. The effect of TerminateThread is not predictable: the termination can occur at any point during the thread's execution. In contrast, UNIX threads marked as deferred terminate only when they reach a safe cancellation point.

If you need to match the UNIX behavior exactly in your Win32 application, you must create your own cancellation code and thereby avoid terminating the thread forcibly.

Thread Synchronization

When you have more than one thread executing simultaneously, you have to take the initiative to protect shared resources. For example, if your thread increments a variable, you cannot predict the result as the variable may have been modified by another thread before or after the increment. The reason that you cannot predict the result is that the order in which threads have access to a shared resource is indeterminate.

The following example illustrates code that is, in principle, indeterminate.

Note This is a very simple example and on most computers the result would always be the same, but the important point to note is that this is not guaranteed.

The main thread in the following example is represented by the parent. It outputs a "P", and the child, or secondary, thread outputs a "T". A UNIX example and a Windows example are shown.

Multiple non-synchronized threads in UNIX

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>

    void *thread_function(void *arg)
    {
        int count2;

        printf("thread_function is running. Argument was: %s\n", (char *)arg);
        for (count2 = 0; count2 < 10; count2++) {
            sleep(1);
            printf("T");
        }
        sleep(3);
        return NULL;    /* a thread function must return a value */
    }

    char message[] = "Hello I'm a Thread";

    int main()
    {
        int count1, res;
        pthread_t a_thread;
        void *thread_result;

        res = pthread_create(&a_thread, NULL, thread_function, (void *)message);
        if (res != 0) {
            perror("Thread creation failed");
            exit(EXIT_FAILURE);
        }
        printf("entering loop\n");
        for (count1 = 0; count1 < 10; count1++) {
            sleep(1);
            printf("P");
        }
        printf("\nWaiting for thread to finish...\n");
        res = pthread_join(a_thread, &thread_result);
        if (res != 0) {
            perror("Thread join failed");
            exit(EXIT_FAILURE);
        }
        printf("\nThread joined\n");
        exit(EXIT_SUCCESS);
    }

Multiple non-synchronized threads in Windows

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    DWORD WINAPI thread_function(PVOID arg)
    {
        int count2;

        printf("thread_function is running. Argument was: %s\n", (char *)arg);
        for (count2 = 0; count2 < 10; count2++) {
            Sleep(1000);
            printf("T");
        }
        Sleep(3000);
        return 0;
    }

    char message[] = "Hello I'm a Thread";

    int main()
    {
        HANDLE a_thread;
        DWORD a_threadId;
        DWORD thread_result;
        int count1;

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)message, 0, &a_threadId);
        if (a_thread == NULL) {
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("entering loop\n");
        for (count1 = 0; count1 < 10; count1++) {
            Sleep(1000);
            printf("P");
        }
        printf("\nWaiting for thread to finish...\n");
        if (WaitForSingleObject(a_thread, INFINITE) != WAIT_OBJECT_0) {
            fprintf(stderr, "Thread join failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Retrieve the code returned by the thread.
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        printf("\nThread joined\n");
        exit(EXIT_SUCCESS);
    }

No actual synchronization between these two threads is being performed, and both threads write to the same shared resource. The same problem arises when two threads store to a shared variable. Suppose that thread 1 stores 2 in a shared variable named run_now and thread 2 stores 1 in it. If the threads ran serially, the instructions would execute in the following order:

    MOV EAX, 2          ; Thread 1: Move 2 into a register.
    MOV [run_now], EAX  ; Thread 1: Store 2 in run_now.
    MOV EAX, 1          ; Thread 2: Move 1 into a register.
    MOV [run_now], EAX  ; Thread 2: Store 1 in run_now.

However, since there is no guarantee of the order in which the threads will be executed, the instructions could be interleaved as follows:

    MOV EAX, 2          ; Thread 1: Move 2 into a register.
    MOV EAX, 1          ; Thread 2: Move 1 into a register.
    MOV [run_now], EAX  ; Thread 1: Store 2 in run_now.
    MOV [run_now], EAX  ; Thread 2: Store 1 in run_now.

It is not possible to predict the output that you will see from these examples. In most applications, unpredictable results are an undesirable feature. Consequently, it is important that you take great care in controlling access to shared resources in threaded code. UNIX and Windows provide mechanisms for controlling resource access. These mechanisms are referred to as synchronization techniques, which are discussed in the next few sections.
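The increment case mentioned at the start of this section makes the risk concrete. The following is a minimal UNIX-side sketch, not one of the original samples, in which two threads each increment a shared counter 100,000 times while holding a mutex; the names counter, counter_lock, add_many, and count_with_two_threads are hypothetical.

```c
#include <pthread.h>
#include <stdio.h>

#define INCREMENTS 100000

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_many(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;    /* load, add, store: indivisible while the lock is held */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two incrementing threads and returns the final counter value,
   or -1 if a thread could not be created. */
long count_with_two_threads(void)
{
    pthread_t a, b;

    counter = 0;
    if (pthread_create(&a, NULL, add_many, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, add_many, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

With the lock in place the result is always 200,000. If the lock and unlock calls are removed, increments from the two threads can overlap and the total can come out lower. In Win32, the same protection is available through the InterlockedIncrement function or a critical section.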

Interlocked exchange

A simple form of synchronization is to use what is known as an interlocked exchange. An interlocked exchange performs a single operation that cannot be preempted. The threads of different processes can use this mechanism only if the variable is in shared memory. The variable pointed to by the target parameter must be aligned on a 32-bit boundary; otherwise, this function will fail on multiprocessor x86 systems. The following example gains little from the interlocked exchange, because the two threads still run unsynchronized, but it does illustrate the use of the InterlockedExchange function.

Rewriting the previous Win32 example by using InterlockedExchange results in the following code:

Thread synchronization using interlocked exchange in Windows

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    LONG new_value = 1;
    char message[] = "Hello I'm a Thread";

    DWORD WINAPI thread_function(PVOID arg)
    {
        int count2;

        printf("thread_function is running. Argument was: %s\n", (char *)arg);
        for (count2 = 0; count2 < 10; count2++) {
            Sleep(1000);
            printf("(T-%ld)", new_value);
            InterlockedExchange(&new_value, 1);
        }
        Sleep(3000);
        return 0;
    }

    int main()
    {
        HANDLE a_thread;
        DWORD a_threadId;
        DWORD thread_result;
        int count1;

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)message, 0, &a_threadId);
        if (a_thread == NULL) {
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("entering loop\n");
        for (count1 = 0; count1 < 10; count1++) {
            Sleep(1000);
            printf("(P-%ld)", new_value);
            InterlockedExchange(&new_value, 2);
        }
        printf("\nWaiting for thread to finish...\n");
        if (WaitForSingleObject(a_thread, INFINITE) != WAIT_OBJECT_0) {
            fprintf(stderr, "Thread join failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Retrieve the code returned by the thread.
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        printf("\nThread joined\n");
        exit(EXIT_SUCCESS);
    }

Synchronization with spin locks

In the previous example, as noted, you still have no synchronization between the two threads, so the output may still be out of order. One simple mechanism that offers synchronization is a spin lock. To implement one, you use another member of the Interlocked function family, InterlockedCompareExchange.

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    LONG run_now = 1;
    char message[] = "Hello I'm a Thread";

    DWORD WINAPI thread_function(PVOID arg)
    {
        int count2;

        printf("thread_function is running. Argument was: %s\n", (char *)arg);
        for (count2 = 0; count2 < 10; count2++) {
            // If run_now is 2, set it to 1 and take this thread's turn.
            if (InterlockedCompareExchange(&run_now, 1, 2) == 2)
                printf("T-2");
            else
                Sleep(1000);
        }
        Sleep(3000);
        return 0;
    }

    int main()
    {
        HANDLE a_thread;
        DWORD a_threadId;
        DWORD thread_result;
        int count1;

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)message, 0, &a_threadId);
        if (a_thread == NULL) {
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("entering loop\n");
        for (count1 = 0; count1 < 10; count1++) {
            // If run_now is 1, set it to 2 and take this thread's turn.
            if (InterlockedCompareExchange(&run_now, 2, 1) == 1)
                printf("P-1");
            else
                Sleep(1000);
        }
        printf("\nWaiting for thread to finish...\n");
        if (WaitForSingleObject(a_thread, INFINITE) != WAIT_OBJECT_0) {
            fprintf(stderr, "Thread join failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Retrieve the code returned by the thread.
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        printf("\nThread joined\n");
        exit(EXIT_SUCCESS);
    }

Spin locks work well for synchronizing access to a single object, but most applications are not this simple. Moreover, a spin lock is not the most efficient means of controlling access to a shared resource. Spinning in a loop in user mode while waiting for a global value to change wastes CPU cycles. A mechanism is needed that allows the thread to wait for a shared resource without consuming CPU time.

When a thread requires access to a shared resource (for example a shared memory object), it must either be notified or scheduled to resume execution. To accomplish this, a thread must call an operating system function, passing it parameters that indicate what the thread is waiting for. If the operating system detects that the resource is available, the function returns and the thread resumes.

If the resource is unavailable, the system places the thread in a wait state, making the thread not schedulable. This prevents the thread from wasting any CPU time. While the thread is waiting, the operating system acts as an intermediary between the thread and the resource: it tracks the resources that the thread needs and automatically resumes the thread when the resource becomes available. The thread's execution is thus synchronized with the availability of the resource.

Mechanisms that prevent the thread from wasting CPU time include critical sections (for example, the EnterCriticalSection function waits for ownership of the specified critical section object, and returns when the calling thread has been granted ownership), semaphores and mutexes. Windows includes all three of these mechanisms, and UNIX provides both semaphores and mutexes. These three mechanisms are described in the following sections.

Synchronization with critical sections

Another mechanism for solving this simple scenario is to use a critical section. A critical section is similar to InterlockedExchange except that you have the ability to define the logic that takes place as an atomic operation.

What follows is the simple example from the previous section with the InterlockedExchange call replaced by a critical section. On multiprocessor systems, it is best to use InitializeCriticalSectionAndSpinCount instead of InitializeCriticalSection, because it provides an optimized version of critical sections that employs spin counting. A critical section with a spin count allows EnterCriticalSection to retry up to the spin count number of times before transitioning into kernel mode to wait for the resource. The advantage is that a successful spin avoids the transition into kernel mode, which requires approximately 1,000 CPU cycles.

Moreover, there is a slight chance that initializing a critical section may fail due to memory limitations. InitializeCriticalSectionAndSpinCount reports this failure through its return value, where zero indicates failure. This is an improvement over InitializeCriticalSection, which has a void return type and can signal the failure only by raising a STATUS_NO_MEMORY exception.

In the following code, the calls that manage the critical section are marked with comments.

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    CRITICAL_SECTION g_cs;                      // critical section object
    char message[] = "Hello I'm a Thread";

    DWORD WINAPI thread_function(PVOID arg)
    {
        int count2;

        printf("\nthread_function is running. Argument was: %s\n", (char *)arg);
        for (count2 = 0; count2 < 10; count2++) {
            EnterCriticalSection(&g_cs);        // critical section: enter
            printf("T");
            LeaveCriticalSection(&g_cs);        // critical section: leave
        }
        Sleep(3000);
        return 0;
    }

    int main()
    {
        HANDLE a_thread;
        DWORD a_threadId;
        DWORD thread_result;
        int count1;

        InitializeCriticalSection(&g_cs);       // critical section: initialize

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)message, 0, &a_threadId);
        if (a_thread == NULL) {
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("entering loop\n");
        for (count1 = 0; count1 < 10; count1++) {
            EnterCriticalSection(&g_cs);        // critical section: enter
            printf("P");
            LeaveCriticalSection(&g_cs);        // critical section: leave
        }
        printf("\nWaiting for thread to finish...\n");
        if (WaitForSingleObject(a_thread, INFINITE) != WAIT_OBJECT_0) {
            fprintf(stderr, "Thread join failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Retrieve the code returned by the thread.
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        printf("\nThread joined\n");
        DeleteCriticalSection(&g_cs);           // critical section: delete
        exit(EXIT_SUCCESS);
    }

Synchronization using semaphores

In the following example, two threads are created that use a shared memory buffer. Access to the shared memory is synchronized using a semaphore. The primary thread (main) creates a semaphore object and uses this object to handshake with the secondary thread (thread_function). The primary thread creates the semaphore with an initial count of zero, which prevents the secondary thread from acquiring the semaphore as soon as it starts.

The primary thread relinquishes the semaphore once the user types in some text at the console and presses return. Once this is done, the secondary thread acquires the semaphore and processes the shared memory area. At this point, the main thread is blocked waiting for the semaphore, and will not resume until the secondary thread has relinquished control by calling ReleaseSemaphore.

The UNIX and Win32 examples that follow are somewhat different. In UNIX, the semaphore functions sem_post and sem_wait are all that are required to perform handshaking. With Win32, you must use a combination of WaitForSingleObject and ReleaseSemaphore in both the primary and the secondary threads in order to facilitate handshaking. The two solutions are also very different from a syntactic standpoint. The primary difference between their implementations lies in the API calls that are used to manage the semaphore objects.

One aspect of CreateSemaphore that you need to be aware of is the last argument in its parameter list. This is a string parameter specifying the name of the semaphore. You should not pass a NULL for this parameter. Most (but not all) of the kernel objects, including semaphores, are named. All kernel object names are stored in a common namespace, unless the computer is a server running Microsoft Terminal Server, in which case there is also a namespace for each session. If the namespace is global, one or more unassociated applications could attempt to use the same name for a semaphore. To avoid namespace contention, applications should use some unique naming convention. One solution is to base your semaphore names on globally unique identifiers (GUIDs).

Terminal server and naming semaphore objects

As mentioned earlier, Terminal Servers have multiple namespaces for kernel objects. There is one global namespace, which is used by kernel objects that are accessible by any and all client sessions and is usually populated by services. Additionally, each client session has its own namespace to prevent namespace collisions between multiple instances of the same application running in different sessions.

In addition to the session and global namespaces, Terminal Servers also have a local namespace. By default, an application's named kernel objects reside in the session namespace. It is possible, however, to override what namespace will be used. This is accomplished by prefixing the name with Global\ or Local\. These prefix names are reserved by Microsoft, are case sensitive and are ignored if the computer is not operating as a Terminal Server.

UNIX example: synchronization using semaphores

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <pthread.h>
    #include <semaphore.h>

    #define SHARED_SIZE 1024

    char shared_area[SHARED_SIZE];
    sem_t bin_sem;

    void *thread_function(void *arg)
    {
        sem_wait(&bin_sem);
        while (strncmp("done", shared_area, 4) != 0) {
            printf("You input %d characters\n", (int)strlen(shared_area) - 1);
            sem_wait(&bin_sem);
        }
        pthread_exit(NULL);
    }

    int main()
    {
        int res;
        pthread_t a_thread;
        void *thread_result;

        res = sem_init(&bin_sem, 0, 0);
        if (res != 0) {
            perror("Semaphore initialization failed");
            exit(EXIT_FAILURE);
        }
        res = pthread_create(&a_thread, NULL, thread_function, NULL);
        if (res != 0) {
            perror("Thread creation failed");
            exit(EXIT_FAILURE);
        }
        printf("Input some text. Enter 'done' to finish\n");
        while (strncmp("done", shared_area, 4) != 0) {
            fgets(shared_area, SHARED_SIZE, stdin);
            sem_post(&bin_sem);
        }
        printf("\nWaiting for thread to finish...\n");
        res = pthread_join(a_thread, &thread_result);
        if (res != 0) {
            perror("Thread join failed");
            exit(EXIT_FAILURE);
        }
        printf("\nThread joined\n");
        sem_destroy(&bin_sem);
        exit(EXIT_SUCCESS);
    }

Win32 example: synchronization using semaphores

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define SHARED_SIZE 1024

    char shared_area[SHARED_SIZE];
    LPCTSTR lpszSemaphore = TEXT("SEMAPHORE-EXAMPLE");
    HANDLE sem_t;

    DWORD WINAPI thread_function(PVOID arg)
    {
        LONG dwSemCount;
        HANDLE hSemaphore = OpenSemaphore(SYNCHRONIZE | SEMAPHORE_MODIFY_STATE,
                                          FALSE, lpszSemaphore);

        WaitForSingleObject(hSemaphore, INFINITE);
        while (strncmp("done", shared_area, 4) != 0) {
            printf("You input %d characters\n", (int)strlen(shared_area) - 1);
            ReleaseSemaphore(hSemaphore, 1, &dwSemCount);
            WaitForSingleObject(hSemaphore, INFINITE);
        }
        ReleaseSemaphore(hSemaphore, 1, &dwSemCount);
        CloseHandle(hSemaphore);
        return 0;
    }

    int main()
    {
        HANDLE a_thread;
        DWORD a_threadId;
        DWORD thread_result;
        LONG dwSemCount;

        // Initialize the semaphore object: initial count 0, maximum count 1.
        sem_t = CreateSemaphore(NULL, 0, 1, lpszSemaphore);
        if (sem_t == NULL) {
            fprintf(stderr, "Semaphore initialization failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)NULL, 0, &a_threadId);
        if (a_thread == NULL) {
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("Input some text. Enter 'done' to finish\n");
        while (strncmp("done", shared_area, 4) != 0) {
            fgets(shared_area, SHARED_SIZE, stdin);
            ReleaseSemaphore(sem_t, 1, &dwSemCount);
            WaitForSingleObject(sem_t, INFINITE);
        }
        printf("\nWaiting for thread to finish...\n");
        if (WaitForSingleObject(a_thread, INFINITE) != WAIT_OBJECT_0) {
            fprintf(stderr, "Thread join failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Retrieve the code returned by the thread.
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        CloseHandle(sem_t);
        printf("\nThread joined\n");
        exit(EXIT_SUCCESS);
    }

Synchronization using mutexes

A mutex is a kernel object that provides a thread with mutually exclusive access to a single resource. Any thread of the calling process can specify the mutex-object handle in a call to one of the wait functions. The single-object wait functions return when the state of the specified object is signaled. The state of a mutex object is signaled when it is not owned by any thread. When the mutex's state is signaled, one waiting thread is granted ownership, the mutex's state changes to nonsignaled, and the wait function returns. Only one thread can own a mutex at any given time. The owning thread uses the ReleaseMutex function to release its ownership.

The next example looks at the use of mutexes to coordinate access to a shared resource, and to handshake between two threads. The logic is virtually identical to the semaphore example in the previous section. The only real difference is that this example uses a mutex instead of a semaphore.

UNIX example: thread synchronization using mutexes

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <pthread.h>

    #define SHARED_SIZE 1024

    char shared_area[SHARED_SIZE];
    pthread_mutex_t shared_mutex;    /* protects shared_area */

    void *thread_function(void *arg)
    {
        pthread_mutex_lock(&shared_mutex);
        while (strncmp("done", shared_area, 4) != 0) {
            printf("You input %d characters\n", (int)strlen(shared_area) - 1);
            pthread_mutex_unlock(&shared_mutex);
            pthread_mutex_lock(&shared_mutex);
        }
        pthread_mutex_unlock(&shared_mutex);
        pthread_exit(0);
    }

    int main()
    {
        int res;
        pthread_t a_thread;
        void *thread_result;

        res = pthread_mutex_init(&shared_mutex, NULL);
        if (res != 0) {
            perror("Mutex initialization failed");
            exit(EXIT_FAILURE);
        }
        res = pthread_create(&a_thread, NULL, thread_function, NULL);
        if (res != 0) {
            perror("Thread creation failed");
            exit(EXIT_FAILURE);
        }
        pthread_mutex_lock(&shared_mutex);
        printf("Input some text. Enter 'done' to finish\n");
        while (strncmp("done", shared_area, 4) != 0) {
            fgets(shared_area, SHARED_SIZE, stdin);
            pthread_mutex_unlock(&shared_mutex);
            pthread_mutex_lock(&shared_mutex);
        }
        pthread_mutex_unlock(&shared_mutex);
        printf("\nWaiting for thread to finish...\n");
        res = pthread_join(a_thread, &thread_result);
        if (res != 0) {
            perror("Thread join failed");
            exit(EXIT_FAILURE);
        }
        printf("\nThread joined\n");
        pthread_mutex_destroy(&shared_mutex);
        exit(EXIT_SUCCESS);
    }

Win32 example: thread synchronization using mutexes

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define SHARED_SIZE 1024

    char shared_area[SHARED_SIZE];
    LPCTSTR lpszMutex = TEXT("MUTEX-EXAMPLE");
    HANDLE shared_mutex;

    DWORD WINAPI thread_function(PVOID arg)
    {
        HANDLE hMutex = OpenMutex(MUTEX_ALL_ACCESS, FALSE, lpszMutex);

        WaitForSingleObject(hMutex, INFINITE);
        while (strncmp("done", shared_area, 4) != 0) {
            printf("You input %d characters\n", (int)strlen(shared_area) - 1);
            ReleaseMutex(hMutex);
            WaitForSingleObject(hMutex, INFINITE);
        }
        ReleaseMutex(hMutex);
        CloseHandle(hMutex);
        return 0;
    }

    int main()
    {
        HANDLE a_thread;
        DWORD a_threadId;
        DWORD thread_result;

        // Create the mutex object, initially owned by this thread.
        shared_mutex = CreateMutex(NULL, TRUE, lpszMutex);
        if (shared_mutex == NULL) {
            fprintf(stderr, "Mutex initialization failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)NULL, 0, &a_threadId);
        if (a_thread == NULL) {
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        printf("Input some text. Enter 'done' to finish\n");
        while (strncmp("done", shared_area, 4) != 0) {
            fgets(shared_area, SHARED_SIZE, stdin);
            ReleaseMutex(shared_mutex);
            WaitForSingleObject(shared_mutex, INFINITE);
        }
        ReleaseMutex(shared_mutex);
        printf("Waiting for thread to finish...\n");
        if (WaitForSingleObject(a_thread, INFINITE) != WAIT_OBJECT_0) {
            fprintf(stderr, "Thread join failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Retrieve the code returned by the thread.
        GetExitCodeThread(a_thread, &thread_result);
        CloseHandle(a_thread);
        CloseHandle(shared_mutex);
        printf("Thread joined\n");
        exit(EXIT_SUCCESS);
    }

Thread Attributes

There are a number of attributes associated with threads in UNIX that you need to convert into equivalent attributes in Win32. This section contrasts the UNIX and Win32 thread attributes and describes how you should convert your code. Table 5 lists the relevant UNIX thread attributes, and then each attribute is discussed individually:

Table 5. UNIX thread attributes

| Attribute | Default values | Description |
| --- | --- | --- |
| detachstate | PTHREAD_CREATE_JOINABLE | Thread may be joined by other threads. |
| | PTHREAD_CREATE_DETACHED | Threads may not be waited on for termination. |
| inheritsched | PTHREAD_INHERIT_SCHED | Scheduling parameters, policy, and scope are inherited from creating thread. |
| | PTHREAD_EXPLICIT_SCHED | Scheduling parameters for the newly created thread are specified in the thread attribute. |
| schedparam | (none) | Priority set to default for scheduling policy. |
| schedpolicy | SCHED_OTHER | Scheduling policy is determined by the system. |
| | SCHED_FIFO | Threads are scheduled in a first-in-first-out order. |
| | SCHED_RR | Threads are scheduled in a round-robin fashion. |
| scope | PTHREAD_SCOPE_SYSTEM | Threads are scheduled system-wide. |
| | PTHREAD_SCOPE_PROCESS | Threads are scheduled based on other threads in the owning process. |
| stackaddr | N/A | Attribute not supported; address selected by the operating system. |
| stacksize | 0 | Stack size inherited from process stack size attribute. |

Detachstate

Detachstate indicates whether a thread can be waited on for termination. Within Win32, the same effect is achieved by closing any handles that exist for a given thread. Because the wait and thread management functions all require a handle, without one you are effectively prevented from acting on the thread. You can also control access to thread objects through a security descriptor that is provided at the time the thread is created.

Note For more information on Access-Control, see Platform SDK: Security: Access Control.

The handle returned by the CreateThread function has THREAD_ALL_ACCESS access to the thread object. When you call the GetCurrentThread function, the system returns a pseudohandle with the maximum access that the thread's security descriptor allows the caller.

The valid access rights for thread objects include the DELETE, READ_CONTROL, SYNCHRONIZE, WRITE_DAC, and WRITE_OWNER standard access rights, in addition to the thread-specific access rights shown in Table 6.

Table 6. Thread-specific access rights

| Value | Meaning |
|---|---|
| SYNCHRONIZE | A standard right required to wait for the thread to exit. |
| THREAD_ALL_ACCESS | Specifies all possible access rights for a thread object. |
| THREAD_DIRECT_IMPERSONATION | Required for a server thread that impersonates a client. |
| THREAD_GET_CONTEXT | Required to read the context of a thread by using GetThreadContext. |
| THREAD_IMPERSONATE | Required to use a thread's security information directly without calling it by using a communication mechanism that provides impersonation services. |
| THREAD_QUERY_INFORMATION | Required to read certain information from the thread object. |
| THREAD_SET_CONTEXT | Required to write the context of a thread. |
| THREAD_SET_INFORMATION | Required to set certain information in the thread object. |
| THREAD_SET_THREAD_TOKEN | Required to set the impersonation token for a thread. |
| THREAD_SUSPEND_RESUME | Required to suspend or resume a thread. |
| THREAD_TERMINATE | Required to terminate a thread. |

Inheritsched/schedparam/schedpolicy/scope

The inheritsched, schedparam, schedpolicy, and scope attributes indicate whether scheduling is inherited from the thread that created the new thread or is set explicitly. They also define the policy and scope applied to scheduling threads. In Win32, the priority class of a process is NORMAL_PRIORITY_CLASS by default. Use the CreateProcess function to specify the priority class of a child process when you create it.

If the calling process is IDLE_PRIORITY_CLASS or BELOW_NORMAL_PRIORITY_CLASS, the new process inherits this class. You use the GetPriorityClass function to determine the current priority class of a process and the SetPriorityClass function to change the priority class of a process.

Stacksize

The stack size of a thread is set when the thread is created by using CreateThread. The initial size of the stack is specified in bytes, and the system rounds this value up to a page boundary. If this parameter is zero, the new thread uses the default size for the executable.
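The page rounding can be sketched in portable C. This shows only the arithmetic, assuming the rounding is upward to a page boundary; it is not the system's implementation. The helper name round_up_to_page is hypothetical, and the 4096-byte page size in the usage note is an assumption (on Windows, GetSystemInfo reports the actual page size).

```c
#include <stddef.h>

/* Hypothetical helper: round a requested stack size up to a whole
   number of pages. page_size must be a nonzero power of two. */
size_t round_up_to_page(size_t requested, size_t page_size)
{
    return (requested + page_size - 1) & ~(page_size - 1);
}
```

With a 4096-byte page, a request for 4097 bytes becomes 8192 bytes, while a request for exactly 4096 bytes is left unchanged. A request of zero is a special case in CreateThread (it selects the executable's default size), so it should not be passed through a helper like this one.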

Setting thread attributes

Now that the thread attributes have been described, let's take a look at a simple example of how the attributes of a thread can be set.

The UNIX example below makes some basic use of thread attributes. The corresponding Win32 example does not need attributes to accomplish the same functionality. All that Win32 requires is a thread that cannot be waited on. This can be accomplished by passing NULL as the lpThreadId parameter to CreateThread, so that the thread identifier is not returned, and by closing the handle that is returned by the call.

The net effect of these combined activities effectively hinders an application's ability to manage the thread. This issue is addressed in the "Thread Scheduling and Priorities" section later in this chapter.

UNIX example: setting thread attributes

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <pthread.h>

    char message[] = "Hello I'm a Thread";
    int thread_finished = 0;

    void *thread_function(void *arg)
    {
        printf("thread_function running. Arg was: %s\n", (char *)arg);
        sleep(4);
        printf("Second thread setting finished flag, and exiting now\n");
        thread_finished = 1;
        pthread_exit(NULL);
    }

    int main()
    {
        int count = 0, res;
        pthread_t a_thread;
        pthread_attr_t thread_attr;

        // pthread functions return an error number instead of setting errno,
        // so the message is built with strerror(res) rather than perror.
        res = pthread_attr_init(&thread_attr);
        if (res != 0) {
            fprintf(stderr, "Attribute creation failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        res = pthread_attr_setdetachstate(&thread_attr, PTHREAD_CREATE_DETACHED);
        if (res != 0) {
            fprintf(stderr, "Setting detached attribute failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        res = pthread_create(&a_thread, &thread_attr, thread_function, (void *)message);
        if (res != 0) {
            fprintf(stderr, "Thread creation failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        (void)pthread_attr_destroy(&thread_attr);
        while (!thread_finished) {
            printf("Waiting for thread to finish (%d)\n", ++count);
            sleep(1);
        }
        printf("Other thread finished, See Ya!\n");
        exit(EXIT_SUCCESS);
    }

Win32 example: setting thread attributes

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    char message[] = "Hello I'm a Thread";
    int thread_finished = 0;

    DWORD WINAPI thread_function(PVOID arg)
    {
        printf("\nthread_function running. Arg was: %s\n", (char *)arg);
        Sleep(4000);
        printf("Second thread setting finished flag, and exiting now\n");
        thread_finished = 1;
        return 100;
    }

    int main(void)
    {
        int count = 0;
        HANDLE a_thread;

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)message, 0, NULL);
        if (a_thread == NULL) {
            // CreateThread reports failure through GetLastError, not errno.
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        // Closing the only handle means the thread can no longer be waited on.
        CloseHandle(a_thread);
        while (!thread_finished) {
            printf("Waiting for thread to finish (%d)\n", ++count);
            Sleep(1000);
        }
        printf("Other thread finished, See Ya!\n");
        exit(EXIT_SUCCESS);
    }

Win32 security and thread objects

Threads are kernel objects. As such, they are protected by Windows security, and a process must request permission before it attempts to manipulate an object. The creator of the object can prevent an unauthorized user from doing anything with the object by denying access to it.

Object flags are covered as part of the thread discussion here, but this information also pertains to other kernel objects that are obtained by using one of the Win32 Create functions.

Until now, threads have been created in these solutions with a NULL security attribute. This indicates that the thread should be created using the default security, and that the returned handle is not inheritable. If you want to change the behavior of the previous example to make the thread handle inheritable and/or to protect it from being closed, you can use the SetHandleInformation function. The following is an example of this:

    // Both constants are already defined by <windows.h>; they are repeated here for reference.
    #define HANDLE_FLAG_INHERIT 0x00000001
    #define HANDLE_FLAG_PROTECT_FROM_CLOSE 0x00000002

    // hThread is a thread handle returned by CreateThread.
    SetHandleInformation(hThread, HANDLE_FLAG_INHERIT, HANDLE_FLAG_INHERIT);
    SetHandleInformation(hThread, HANDLE_FLAG_PROTECT_FROM_CLOSE, HANDLE_FLAG_PROTECT_FROM_CLOSE);

To change both flags in a single call, bitwise OR the flags together in both the mask argument and the flags argument. After the handle has been protected, CloseHandle no longer closes it: the call fails, and an exception can be raised if the process is running under a debugger.
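SetHandleInformation takes two bit sets: a mask that selects which flags to change, and the new values for those flags. The following portable C sketch models only that mask-and-value arithmetic; it does not call the Win32 API. The function name apply_handle_flags and the FLAG_* macro names are hypothetical, although the two values match the HANDLE_FLAG_* constants quoted above.

```c
/* Values match the HANDLE_FLAG_* constants quoted in the text. */
#define FLAG_INHERIT            0x00000001u
#define FLAG_PROTECT_FROM_CLOSE 0x00000002u

/* Hypothetical model: bits selected by mask take their value from flags;
   every other bit keeps its current value. */
unsigned apply_handle_flags(unsigned current, unsigned mask, unsigned flags)
{
    return (current & ~mask) | (flags & mask);
}
```

Passing FLAG_INHERIT | FLAG_PROTECT_FROM_CLOSE as both mask and flags sets both bits in one step. Passing a flag in the mask with zero as the value clears it, which is how a protected handle would be unprotected again before it is closed.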

Thread Scheduling and Priorities

This section looks at how you can change the scheduling priority of a thread in UNIX and Win32.

Ideally, you want to map Win32 priority classes to UNIX scheduling policies, and Win32 thread priority levels to UNIX priority levels. Unfortunately, it isn't this simple.

The base priority of a Win32 thread is determined by both the priority class of its owning process and the thread's own priority level. The two values are combined to form the base priority of each thread.

The operating system uses the base priority level of all executable threads to determine which thread gets the next slice of CPU time. Threads are scheduled in a round-robin fashion at each priority level, and only when there are no executable threads at a higher level will scheduling of threads at a lower level take place.

UNIX offers both round robin and FIFO scheduling algorithms, whereas Windows uses only round robin. This does not mean that Windows is less flexible; it simply means that any fine tuning that was performed on thread scheduling in UNIX has to be implemented differently when using Windows.

Table 7 shows the base priority levels for combinations of priority class and priority value.

Table 7. Process and thread priority

Process priority class Thread priority level 1 IDLE_PRIORITY_CLASS THREAD_PRIORITY_IDLE 1 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_IDLE 1 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_IDLE 1 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_IDLE 1 HIGH_PRIORITY_CLASS THREAD_PRIORITY_IDLE 2 IDLE_PRIORITY_CLASS THREAD_PRIORITY_LOWEST 3 IDLE_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL 4 IDLE_PRIORITY_CLASS THREAD_PRIORITY_NORMAL 4 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST 5 IDLE_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL 5 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL 5 Background NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST 6 IDLE_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST 6 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL 6 Background NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL 7 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL 7 Background NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL 7 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST 8 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST 8 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL 8 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL 8 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST 9 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST 9 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL 9 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL 10 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL 10 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL 11 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST 11 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL 11 HIGH_PRIORITY_CLASS THREAD_PRIORITY_LOWEST 12 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST 12 HIGH_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL 13 HIGH_PRIORITY_CLASS THREAD_PRIORITY_NORMAL 14 HIGH_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL 15 HIGH_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST 15 HIGH_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL 15 IDLE_PRIORITY_CLASS 
THREAD_PRIORITY_TIME_CRITICAL 15 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL 15 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL 15 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL 16 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_IDLE 17 REALTIME_PRIORITY_CLASS -7 18 REALTIME_PRIORITY_CLASS -6 19 REALTIME_PRIORITY_CLASS -5 20 REALTIME_PRIORITY_CLASS -4 21 REALTIME_PRIORITY_CLASS -3 22 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_LOWEST 23 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL 24 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_NORMAL 25 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL 26 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST 27 REALTIME_PRIORITY_CLASS 3 28 REALTIME_PRIORITY_CLASS 4 29 REALTIME_PRIORITY_CLASS 5 30 REALTIME_PRIORITY_CLASS 6 31 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL
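The pattern in Table 7 can be condensed into a small lookup. The following portable C sketch is a model of the table only, not of the Windows scheduler: the enum prio_class, its member names, and the function base_priority are hypothetical. The level argument uses the numeric values of the Win32 thread priority constants (THREAD_PRIORITY_LOWEST is -2, THREAD_PRIORITY_NORMAL is 0, THREAD_PRIORITY_HIGHEST is 2, THREAD_PRIORITY_IDLE is -15, and THREAD_PRIORITY_TIME_CRITICAL is 15). The sketch also assumes that the two NORMAL_PRIORITY_CLASS rows for levels 8 and 9 that carry no Background or Foreground label belong to the background series, which is what their values suggest.

```c
/* Hypothetical class identifiers for illustration; these are not the
   Win32 *_PRIORITY_CLASS constants. */
enum prio_class {
    PC_IDLE, PC_BELOW_NORMAL, PC_NORMAL_BACKGROUND, PC_NORMAL_FOREGROUND,
    PC_ABOVE_NORMAL, PC_HIGH, PC_REALTIME
};

#define LEVEL_IDLE          (-15)  /* value of THREAD_PRIORITY_IDLE */
#define LEVEL_TIME_CRITICAL   15   /* value of THREAD_PRIORITY_TIME_CRITICAL */

/* Returns the base priority that Table 7 lists for a class and level. */
int base_priority(enum prio_class cls, int level)
{
    /* Base priority of each class at THREAD_PRIORITY_NORMAL (level 0). */
    static const int class_base[] = { 4, 6, 7, 9, 10, 13, 24 };
    int realtime = (cls == PC_REALTIME);

    if (level == LEVEL_IDLE)            /* pinned to the bottom of the range */
        return realtime ? 16 : 1;
    if (level == LEVEL_TIME_CRITICAL)   /* pinned to the top of the range */
        return realtime ? 31 : 15;
    return class_base[cls] + level;     /* ordinary levels are simple offsets */
}
```

For example, base_priority(PC_HIGH, 0) gives 13 and base_priority(PC_REALTIME, -7) gives 17, matching the corresponding rows of the table.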

Managing thread priorities in Windows

The Win32 API provides a number of functions for managing thread priorities:

GetThreadContext Returns the execution context of the specified thread. The following example shows how to read the thread context (the register fields shown are specific to x86):

    CONTEXT context;

    context.ContextFlags = CONTEXT_FULL | CONTEXT_DEBUG_REGISTERS;
    // For a running thread, including the current one, the returned context
    // is not reliable; suspend the target thread first to get a valid snapshot.
    GetThreadContext(GetCurrentThread(), &context);
    printf("CS=%X, EIP=%X, FLAGS=%X, DR1=%X\n",
           context.SegCs, context.Eip, context.EFlags, context.Dr1);

GetThreadPriority Returns the assigned thread priority level for the specified thread. The priority for the thread and the process class determine the thread's base priority level. To see how thread priority affects the system, you can add a simple test like the following to a Windows application:

    volatile long sink = 0;  // volatile keeps the compiler from removing the loops
    DWORD dwTicks;

    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_LOWEST);
    dwTicks = GetTickCount();
    for (long i = 0; i < 200000; i++)
        for (long j = 0; j < 2000; j++)
            sink++;
    printf("Test time=%lu\n", GetTickCount() - dwTicks);

Adjusting the thread priority should yield different time deltas when other threads are competing for the processor.

GetThreadPriorityBoost Retrieves the priority boost control state of the specified thread. Threads have a dynamic priority, which is the priority that the scheduler uses to identify which thread will execute. Initially, a thread's priority is the same as its base priority, but the system may increase or decrease the priority to maintain thread responsiveness. Only threads with a priority between 0 and 15 are eligible for dynamic priority boost. The system boosts the dynamic priority of a thread to enhance its responsiveness as follows:

When a process that uses NORMAL_PRIORITY_CLASS is brought to the foreground, the scheduler boosts the priority class of the process associated with the foreground window so that it is greater than or equal to the priority class of any background processes. The priority class returns to its original setting when the process is no longer in the foreground. In the Microsoft Windows NT® operating system, as well as in Windows 2000 or later, the user can control the boosting of processes that use NORMAL_PRIORITY_CLASS through the system Control Panel.

When a window receives input, such as timer messages, mouse messages, or keyboard input, the scheduler boosts the priority of the thread that owns the window.

When the wait conditions for a blocked thread are satisfied, the scheduler boosts the priority of the thread. For example, when a wait operation associated with disk or keyboard I/O finishes, the thread receives a priority boost.

SetPriorityClass Adjusts the priority class of a given process.

SetThreadIdealProcessor Specifies the preferred processor for a specific thread. The system schedules threads on the preferred processor when possible.

SetThreadPriority Changes the priority level for a thread. Consult the Win32 API reference for details on the different priority levels.

SetThreadPriorityBoost Enables or disables dynamic priority boost by the system.

An example of converting UNIX thread scheduling into Windows

In this example, the thread priority level is set to the lowest level within the given policy or class for UNIX and Windows respectively. For UNIX, this requires creating an attribute object before instantiating the thread and then setting the policy of the attribute object. The thread is then created with the modified attribute. After the thread has been instantiated successfully, its priority level is adjusted to the lowest level within the designated policy or class. In UNIX, this is accomplished by a call to pthread_setschedparam, because changing the attribute object after pthread_create has no effect on a thread that already exists. In Win32, it is accomplished by a call to SetThreadPriority.

UNIX example: thread scheduling

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sched.h>
    #include <pthread.h>

    char message[] = "Hello I'm a Thread";
    int thread_finished = 0;

    void *thread_function(void *arg)
    {
        printf("thread_function running. Arg was %s\n", (char *)arg);
        sleep(4);
        printf("Second thread setting finished flag, and exiting now\n");
        thread_finished = 1;
        pthread_exit(NULL);
    }

    int main()
    {
        int count = 0, res, min_priority;
        struct sched_param scheduling_params;
        pthread_t a_thread;
        pthread_attr_t thread_attr;

        // pthread functions return an error number instead of setting errno.
        res = pthread_attr_init(&thread_attr);
        if (res != 0) {
            fprintf(stderr, "Attribute creation failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        res = pthread_attr_setschedpolicy(&thread_attr, SCHED_OTHER);
        if (res != 0) {
            fprintf(stderr, "Setting schedpolicy failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        res = pthread_attr_setdetachstate(&thread_attr, PTHREAD_CREATE_DETACHED);
        if (res != 0) {
            fprintf(stderr, "Setting detached attribute failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        res = pthread_create(&a_thread, &thread_attr, thread_function, (void *)message);
        if (res != 0) {
            fprintf(stderr, "Thread creation failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        // Lower the running thread to the minimum priority of its policy.
        min_priority = sched_get_priority_min(SCHED_OTHER);
        scheduling_params.sched_priority = min_priority;
        res = pthread_setschedparam(a_thread, SCHED_OTHER, &scheduling_params);
        if (res != 0) {
            fprintf(stderr, "Setting schedparam failed: %s\n", strerror(res));
            exit(EXIT_FAILURE);
        }
        (void)pthread_attr_destroy(&thread_attr);
        while (!thread_finished) {
            printf("Waiting for thread to finish (%d)\n", ++count);
            sleep(1);
        }
        printf("Other thread finished, See Ya!\n");
        exit(EXIT_SUCCESS);
    }

Win32 example: thread scheduling

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>

    DWORD WINAPI thread_function(PVOID arg);

    char message[] = "Hello I'm a Thread";
    int thread_finished = 0;

    int main(void)
    {
        int count = 0;
        HANDLE a_thread;

        // Create a new thread.
        a_thread = CreateThread(NULL, 0, thread_function, (PVOID)message, 0, NULL);
        if (a_thread == NULL) {
            // Win32 functions report failure through GetLastError, not errno.
            fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        if (!SetThreadPriority(a_thread, THREAD_PRIORITY_LOWEST)) {
            fprintf(stderr, "Setting sched priority failed (error %lu)\n", GetLastError());
            exit(EXIT_FAILURE);
        }
        CloseHandle(a_thread);
        while (!thread_finished) {
            printf("Waiting for thread to finish (%d)\n", ++count);
            Sleep(1000);
        }
        printf("Other thread finished, bye!\n");
        exit(EXIT_SUCCESS);
    }

    DWORD WINAPI thread_function(PVOID arg)
    {
        printf("thread_function running. Arg was %s\n", (char *)arg);
        Sleep(4000);
        printf("Second thread setting finished flag, and exiting now\n");
        thread_finished = 1;
        return 100;
    }

In the preceding Win32 example, the priority level of the thread is adjusted to the lowest level within the priority class of the owning process. If you want to change the priority class as well as the priority level, insert the following code just before the SetThreadPriority call:

    SetPriorityClass(GetCurrentProcess(), PriorityClass);

Here, PriorityClass is one of the values listed in Table 8, which summarizes the priority classes that can be set for the owning process.

Table 8. PriorityClass values

| PriorityClass | Meaning |
|---|---|
| ABOVE_NORMAL_PRIORITY_CLASS | Windows 2000 and XP: Indicates a process that has priority above NORMAL_PRIORITY_CLASS but below HIGH_PRIORITY_CLASS. |
| BELOW_NORMAL_PRIORITY_CLASS | Windows 2000 and XP: Indicates a process that has priority above IDLE_PRIORITY_CLASS but below NORMAL_PRIORITY_CLASS. |
| HIGH_PRIORITY_CLASS | Specify this class for a process that performs time-critical tasks that must be executed immediately. The threads of the process preempt the threads of normal or idle priority class processes. An example is the Task List, which must respond quickly when called by the user, regardless of the load on the operating system. Use extreme care when using the high-priority class, because a high-priority class application can use nearly all available CPU time. |
| IDLE_PRIORITY_CLASS | Specify this class for a process whose threads run only when the system is idle. The threads of the process are preempted by the threads of any process running in a higher priority class. An example is a screen saver. The idle-priority class is inherited by child processes. |
| NORMAL_PRIORITY_CLASS | Specify this class for a process with no special scheduling needs. |
| REALTIME_PRIORITY_CLASS | Specify this class for a process that has the highest possible priority. The threads of the process preempt the threads of all other processes, including the operating system processes, which may be performing important tasks. For example, a real-time process that executes for more than a very brief interval can prevent disk caches from flushing, or can cause the mouse to be unresponsive. |

Managing Multiple Threads

In the next two examples, several threads are created that terminate at random times. The main thread then waits for each thread in turn and displays a message as each termination is detected.

Although this example is contrived, it does illustrate one key point: the semantics of creating multiple threads and waiting for their completion are similar in both platforms.

UNIX example: multiple threads

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <pthread.h>

    #define NUM_THREADS 5

    void *thread_function(void *arg)
    {
        int t_number = *(int *)arg;
        int rand_delay;

        printf("thread_function running. Arg was %d\n", t_number);
        // Seed the random-number generator with current time so that
        // the numbers will be different each time function is run.
        srand((unsigned)time(NULL));
        // random time delay from 1 to 10
        rand_delay = 1 + 9.0 * (float)rand() / (float)RAND_MAX;
        sleep(rand_delay);
        printf("See Ya from thread #%d\n", t_number);
        pthread_exit(NULL);
    }

    int main()
    {
        int res;
        pthread_t a_thread[NUM_THREADS];
        int thread_number[NUM_THREADS];
        void *thread_result;
        int multiple_threads;

        for (multiple_threads = 0; multiple_threads < NUM_THREADS; multiple_threads++) {
            // Give each thread its own copy of its number so that later
            // changes to the loop counter cannot affect it.
            thread_number[multiple_threads] = multiple_threads;
            res = pthread_create(&(a_thread[multiple_threads]), NULL,
                                 thread_function, (void *)&thread_number[multiple_threads]);
            if (res != 0) {
                fprintf(stderr, "Thread creation failed: %s\n", strerror(res));
                exit(EXIT_FAILURE);
            }
            sleep(1);
        }
        printf("Waiting for threads to finish...\n");
        for (multiple_threads = NUM_THREADS - 1; multiple_threads >= 0; multiple_threads--) {
            res = pthread_join(a_thread[multiple_threads], &thread_result);
            if (res == 0) {
                printf("Another thread\n");
            } else {
                fprintf(stderr, "pthread_join failed: %s\n", strerror(res));
            }
        }
        printf("All done\n");
        exit(EXIT_SUCCESS);
    }

Win32 example: multiple threads

    #include <windows.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define NUM_THREADS 5

    DWORD WINAPI thread_function(PVOID arg)
    {
        int t_number = *(int *)arg;
        int rand_delay;

        printf("thread_function running. Arg was %d\n", t_number);
        // Seed the random-number generator with current time so that
        // the numbers will be different each time function is run.
        srand((unsigned)time(NULL));
        // random time delay from 1 to 10
        rand_delay = 1 + (rand() % 10);
        Sleep(rand_delay * 1000);
        printf("See Ya from thread #%d\n", t_number);
        return 100;
    }

    int main(void)
    {
        HANDLE a_thread[NUM_THREADS];
        int thread_number[NUM_THREADS];
        int multiple_threads;

        for (multiple_threads = 0; multiple_threads < NUM_THREADS; multiple_threads++) {
            // Give each thread its own copy of its number so that later
            // changes to the loop counter cannot affect it.
            thread_number[multiple_threads] = multiple_threads;
            // Create a new thread.
            a_thread[multiple_threads] = CreateThread(NULL, 0, thread_function,
                                                      (PVOID)&thread_number[multiple_threads], 0, NULL);
            if (a_thread[multiple_threads] == NULL) {
                fprintf(stderr, "Thread creation failed (error %lu)\n", GetLastError());
                exit(EXIT_FAILURE);
            }
            Sleep(1000);
        }
        printf("Waiting for threads to finish...\n");
        for (multiple_threads = NUM_THREADS - 1; multiple_threads >= 0; multiple_threads--) {
            if (WaitForSingleObject(a_thread[multiple_threads], INFINITE) == WAIT_OBJECT_0) {
                printf("Another thread\n");
            } else {
                fprintf(stderr, "WaitForSingleObject failed (error %lu)\n", GetLastError());
            }
        }
        printf("All done\n");
        exit(EXIT_SUCCESS);
    }

Fibers

A fiber is a lightweight thread that must be scheduled by the owning thread. Fibers exist within the context of the thread that schedules them and operate with the identity of the thread. Fibers should not be considered a replacement for a well-designed, multithreaded application. Instead, fibers should be used in situations where a design requires finely tuned scheduling, and are typically used when porting applications that require proprietary task-switching algorithms.

The primary difference between fibers and threads is that fibers are not preemptively scheduled. One key point, however, is that fibers are owned by a thread, and threads can be preempted by the task switcher. When a thread is suspended, so is the current fiber, and when a thread is resumed, so is the fiber that was active before being preempted.

Memory Management

Like UNIX, Windows has the standard heap management functions. Windows also sports functions for managing memory on a thread basis. Like many of the functional comparisons between UNIX and Windows, you can be best served by consulting guides for UNIX and Win32 programming. The basic functional mapping is covered in the next few sections.

Heap

Windows provides services similar to UNIX with respect to heap management functionality. The standard C runtime includes comparable functions for calloc, malloc, free and so on. It also has additional functions that may or may not be available in UNIX. The more significant added functionality is covered briefly in the following section.

Thread Local Storage

This section is a brief introduction to Thread Local Storage (TLS). For complete details, you should consult the Win32 API reference. The purpose of TLS is to define memory on a per-thread basis. The typical scenario where TLS would be used is within a dynamic-linked library (DLL), but this is not the only possible use. In the case of the DLL scenario, here are some of the details of its use:

When a DLL attaches to a process, the DLL uses TlsAlloc to allocate a TLS index. The DLL then allocates some dynamic storage and uses the TLS index in a call to TlsSetValue to store the address in the TLS slot. This concludes the per-thread initialization for the initial thread of the process. The TLS index is stored in a global or static variable of the DLL.

Each time the DLL attaches to a new thread of the process, the DLL allocates some dynamic storage for the new thread and uses the TLS index in a call to TlsSetValue to store the address in the TLS slot. This concludes the per-thread initialization for the new thread.

Each time an initialized thread makes a DLL call requiring the data in its dynamic storage, the DLL uses the TLS index in a call to TlsGetValue to retrieve the address of the dynamic storage for that thread.

The functions used to manage Thread Local Storage are described below:

TlsAlloc Allocates a thread local storage (TLS) index. A TLS index is used by a thread to store and retrieve values that are local to the thread. The minimum number of indices available to each process is defined by TLS_MINIMUM_AVAILABLE. TLS indices are not valid across process boundaries.

TlsFree Releases a thread local storage index. This, however, does not release the data allocated and set in the TLS index slot.

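TlsSetValue Stores a value, typically a pointer to allocated memory, in the calling thread's slot for the specified thread local storage index.

TlsGetValue Returns the value stored in the calling thread's slot for the specified thread local storage index.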
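When migrating UNIX code, it can help to see the POSIX calls that play the corresponding roles: pthread_key_create for TlsAlloc, pthread_setspecific for TlsSetValue, pthread_getspecific for TlsGetValue, and pthread_key_delete for TlsFree. The following is a minimal sketch of the UNIX side only, not code from this guide; the names slot_key, worker, and tls_demo are hypothetical. One difference worth noting is that a POSIX key can carry a destructor (free in this sketch) that runs at thread exit, whereas TlsFree does not release the data stored in the slots.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t slot_key;   /* plays the role of the TLS index */

/* Each thread stores its own heap block under the shared key and reads
   it back. Returns its argument on success, NULL on failure. */
static void *worker(void *arg)
{
    int *value = malloc(sizeof *value);
    if (value == NULL)
        return NULL;
    *value = *(int *)arg;
    if (pthread_setspecific(slot_key, value) != 0) {
        free(value);
        return NULL;
    }
    /* Later code in the same thread retrieves the pointer by key. */
    int *seen = pthread_getspecific(slot_key);
    return (seen != NULL && *seen == *(int *)arg) ? arg : NULL;
}

/* Runs two threads that share one key; returns 0 if each thread saw
   only its own value, -1 otherwise. */
int tls_demo(void)
{
    pthread_t threads[2];
    int ids[2] = { 1, 2 };
    int created = 0, ok = 1;

    /* The destructor (free) releases each thread's block at thread exit. */
    if (pthread_key_create(&slot_key, free) != 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        if (pthread_create(&threads[i], NULL, worker, &ids[i]) != 0) {
            ok = 0;
            break;
        }
        created++;
    }
    for (int i = 0; i < created; i++) {
        void *result = NULL;
        if (pthread_join(threads[i], &result) != 0 || result != &ids[i])
            ok = 0;
    }
    pthread_key_delete(slot_key);
    return ok ? 0 : -1;
}
```

Calling tls_demo() from main returns 0 when both threads stored and retrieved their own values through the same key, which is the behavior the TLS functions above provide through a shared index.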

Thread local storage example

The following section shows a portion of an example application. It illustrates allocation of and access to memory on a per-thread basis. First, the main thread of the process allocates a TLS index. Each child thread then stores and modifies its own memory through that index. If several instances of the thread are active, they all use the same TLSIndex value, but each thread has its own slot behind that index, which ensures the separation and isolation of data and state.

    DWORD TLSIndex = 0;

    DWORD WINAPI ThreadProc(LPVOID lpData)
    {
        HWND hWnd = (HWND) lpData;
        LPVOID lpVoid = HeapAlloc(GetProcessHeap(), 0, 128);

        TlsSetValue(TLSIndex, lpVoid);
        // Do your processing on the memory within the thread here.
        // . . .
        HeapFree(GetProcessHeap(), 0, lpVoid);
        return 0;
    }

    LRESULT CALLBACK WndProc( HWND … )
    {
        switch (uMsg) {
        case WM_CREATE:
            TLSIndex = TlsAlloc();
            // Start your threads using CreateThread…
            break;
        case WM_DESTROY:
            TlsFree(TLSIndex);
            break;
        case WM_COMMAND:
            switch (LOWORD(wParam)) {
            case IDM_TEST:
                // Do something with the TLS value by a call to TlsGetValue(DWORD)
                break;
            }
        }
    }

Memory-Mapped Files

Windows supports memory-mapped files and memory-mapped page files. Memory-mapped page files are covered in the "Shared Memory" section as part of an exercise to port System V IPC shared memory to Windows using memory-mapped files.

Creating and using shared memory in UNIX and in Windows are conceptually the same, but syntactically different. A simple example of creating a shared memory area and mapping it in UNIX follows:

    if ( (fd = open("/dev/zero", O_RDWR)) < 0)
        err_sys("open error");
    if ( (area = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0)) == (caddr_t) -1)
        err_sys("mmap error");
    close(fd); // can close /dev/zero now that it's mapped

In Win32, it is coded as follows:

    hMapObject = CreateFileMapping(
        INVALID_HANDLE_VALUE,  // use paging file
        NULL,                  // no security attributes
        PAGE_READWRITE,        // read/write access
        0,                     // size: high 32-bits
        SHMEMSIZE,             // size: low 32-bits
        "dllmemfilemap");      // name of map object
    if (hMapObject != NULL) {
        // Get a pointer to the file-mapped shared memory.
        lpvMem = MapViewOfFile(
            hMapObject,        // object to map view of
            FILE_MAP_WRITE,    // read/write access
            0,                 // high offset: map from
            0,                 // low offset: beginning
            0);                // default: map entire file
        if (lpvMem == NULL) {
            CloseHandle(hMapObject);
        }
    }

For details on the CreateFileMapping and MapViewOfFile functions, see the Win32 API documentation.

Shared Memory

Shared memory permits two or more threads or processes to share a region of memory. It is generally considered the most performant method of IPC since data is not copied as part of the communication process. Instead, the same physical area of memory is accessed by both the client and the server.

Windows does not support the standard System V interprocess communications mechanisms for shared memory (the shm*() APIs). It does, however, support memory-mapped files and memory-mapped page files, which you can use as an alternative to the shm*() APIs.

Appendix A: Shared Memory contains an example of how to port a simple UNIX application based on System V IPC to Windows by using a memory-mapped page file.

Synchronizing Access to Shared Resources

The technical challenge of using shared memory is to ensure that the server and client do not attempt to access the shared resource simultaneously. This is particularly troublesome if one or both are writing to the same shared memory area. For example, if the server is writing to the shared memory, the client should not try to access the data until the server has completed the write operation.

To address this, several forms of synchronization are available for use in Windows. In Appendix C: Creating a Thread in Windows, the different forms of synchronization available in Windows are shown. These are:

Semaphore

Mutex

Event

Critical section

UNIX has two of these mechanisms, the semaphore and the mutex, as well as an additional mechanism: file locking.

The first three mechanisms have two states: signaled and non-signaled. The synchronization object is considered busy when it is in a non-signaled state. When the object is busy, a waiting thread will block until the object switches to a signaled state. At this time, the pending thread continues executing.

The last form of synchronization is the critical section object. A critical section can synchronize only threads within a single process, so it cannot coordinate separate instances of the example application and is not a true IPC mechanism. It is still worth considering, however, when you want to migrate an existing application from a multiprocess architecture to a single-process, multithreaded architecture.
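On the UNIX side, the closest counterpart to a critical section is a process-private pthread mutex: both serialize threads inside one process. The following minimal sketch, written as an assumption about how such code typically looks rather than as code from this guide, protects a shared counter with a mutex. The names counter_lock, shared_counter, add_to_counter, and run_counter_demo are hypothetical. When ported to Win32, the lock and unlock calls would correspond to EnterCriticalSection and LeaveCriticalSection.

```c
#include <pthread.h>

#define INCREMENTS 100000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* Thread body: increments the shared counter under the mutex. */
static void *add_to_counter(void *unused)
{
    (void)unused;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);    /* enter the protected region */
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);  /* leave the protected region */
    }
    return NULL;
}

/* Runs two threads against the shared counter.
   Returns the final count, or -1 if a thread could not be started. */
long run_counter_demo(void)
{
    pthread_t workers[2];
    int started = 0;

    shared_counter = 0;
    for (int i = 0; i < 2; i++) {
        if (pthread_create(&workers[i], NULL, add_to_counter, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(workers[i], NULL);
    return started == 2 ? shared_counter : -1;
}
```

Because every increment happens while the mutex is held, the two threads cannot interleave inside the update, and the final count is exactly twice INCREMENTS.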

A complete Windows example of using threads, shared memory, semaphores, mutexes, critical sections, and events can be found in Appendix C: Creating a Thread in Windows.

Note For applications that consume large amounts of memory and that are constrained by a lack of virtual address space, large memory support is available on Windows 2000 Advanced Server, Windows 2000 Datacenter Server, and Windows XP. A process running on Windows normally has 2 GB of virtual address space available in user space, with a further 2 GB reserved for system space. If the /3GB switch is inserted into the Boot.ini file, Windows changes the split to give user space 3 GB and system space 1 GB. This change is a system-wide option that applies to all applications run on the computer, so you should analyze the /3GB switch for undesired side effects before using it. See Knowledge Base article Q295443 for a sample of how to modify Boot.ini.

Applications that need to control the amount of stack or heap space can use linker switches for this purpose. The default size for both the stack and heap is 1 MB. Use the /STACK option to set the size of the stack and the /HEAP option to set the heap size. Both options take the size in bytes.

Further Reading on Memory Management

A few references on memory management that you may want to acquire are:

Solomon, David A., and Russinovich, Mark E. Inside Microsoft Windows 2000, Third Edition. Redmond, WA: Microsoft Press, 2000. (See Chapters 7 and 10.)

Richter, Jeffrey. Programming Applications for Microsoft Windows, Fourth Edition. Redmond, WA: Microsoft Press, 1999. (See Part III, Chapters 13-18.)

Stevens, W. Richard. Advanced Programming in the UNIX Environment. Reading, MA: Addison-Wesley Publishing Co., 1992.

Users, Groups and Security

The UNIX and Windows security models are quite different. Because Win32 uses the underlying Windows security model, Win32 security works differently from UNIX security in several key ways. Some of these differences have already been covered in the Comparison of Windows and UNIX Architectures section of Chapter 2. This section covers the differences in the security model and how you should modify your code to operate in Win32.

The key areas that are addressed here are:

A comparison of the UNIX and Win32 user and group APIs

Adding a new group

Adding a user to a group

Listing groups

Adding a user account

Changing a user's password

Removing a user account

Getting user information about all users

Getting information about a specific user

Retrieving the current user's user name

Security functions

UNIX and Win32 User and Group Functions

This section describes the different user and group functions for UNIX and Win32.

The UNIX user and group functions

Table 9 shows user and group management functions that control user and group accounts in a security database. To link some of the code examples that use these functions, you must add -lcrypt to the gcc option list.

Table 9. UNIX User and group functions—security

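| Category | Function | Description |
|---|---|---|
| Group database functions | endgrent | Close database |
| Group database functions | fgetgrent, fgetgrent_r | Get next Group database entry from FILE Stream |
| Group database functions | getgrent, getgrent_r | Get next Group database entry |
| Group database functions | getgrgid, getgrgid_r | Get Group database entry with Group ID |
| Group database functions | getgrnam, getgrnam_r | Get Group database entry with Name |
| Group database functions | setgrent | Rewind database |
| Supplementary group access list functions | getgroups | Get |
| Supplementary group access list functions | initgroups | Initialize |
| Supplementary group access list functions | setgroups | Set |
| User "shadow" database functions | endspent | Close database |
| User "shadow" database functions | fgetspent, fgetspent_r | Get next User "Shadow" database entry from FILE Stream |
| User "shadow" database functions | getspent, getspent_r | Get next User "Shadow" database entry |
| User "shadow" database functions | getspnam, getspnam_r | Get User "Shadow" database entry with Name |
| User "shadow" database functions | setspent | Rewind database |
| User database functions | endpwent | Close database |
| User database functions | fgetpwent, fgetpwent_r | Get next User database entry from FILE Stream |
| User database functions | getpw | User database "get" function to get passwd entry from UID |
| User database functions | getpwent, getpwent_r | Get next User database entry |
| User database functions | getpwnam, getpwnam_r | Get User database entry with Name |
| User database functions | getpwuid, getpwuid_r | Get User database entry with User ID |
| User database functions | setpwent | Rewind database |
| User database "Lock/UnLock" functions | lckpwdf | "Lock" function |
| User database "Lock/UnLock" functions | ulckpwdf | "UnLock" function |
| User database "Write" functions | putgrent | Write group file entry |
| User database "Write" functions | putpwent | Write password file entry |
| User database "Write" functions | putspent | Write shadow password file entry |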
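As a point of reference for the UNIX side of the migration, the following sketch shows how two of the functions in Table 9, getpwuid_r and getgrgid_r, are typically called. It is an illustration under stated assumptions rather than code from the example application: the lookup_user_name and lookup_group_name helpers are hypothetical, and the fixed 4096-byte scratch buffer is assumed to be large enough for the entry being read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: copies the login name for uid into name.
   Returns 0 on success, -1 if the user is not found or the lookup fails. */
int lookup_user_name(uid_t uid, char *name, size_t name_len)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    char buf[4096];   /* assumed large enough; holds the strings pwd points to */

    assert(name != NULL && name_len > 0);
    if (getpwuid_r(uid, &pwd, buf, sizeof buf, &result) != 0 || result == NULL)
        return -1;
    snprintf(name, name_len, "%s", result->pw_name);
    return 0;
}

/* Hypothetical helper: copies the group name for gid into name.
   Returns 0 on success, -1 if the group is not found or the lookup fails
   (including the case where the entry does not fit in the scratch buffer). */
int lookup_group_name(gid_t gid, char *name, size_t name_len)
{
    struct group grp;
    struct group *result = NULL;
    char buf[4096];   /* assumed large enough; a group with many members may need more */

    assert(name != NULL && name_len > 0);
    if (getgrgid_r(gid, &grp, buf, sizeof buf, &result) != 0 || result == NULL)
        return -1;
    snprintf(name, name_len, "%s", result->gr_name);
    return 0;
}
```

A caller would pass the value returned by getuid() or getgid() together with a destination buffer. The reentrant _r variants are used here because they write into caller-supplied storage, which keeps the helpers safe to call from several threads.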

The Win32 user functions

Table 10 shows Win32 user management functions that control a user's account in a security database. To link these functions in the example code later in this section, you must add Netapi32.lib to the Visual Studio project link-library list.

Table 10. Win32 User and Group functions—security

Function Description NetUserAdd Adds a user account and assigns a password and privilege level. NetUserChangePassword Changes a user's password for a specified network server or domain. NetUserDel Deletes a user account from the server. NetUserEnum Lists all user accounts on a server. NetUserGetGroups Returns a list of global group names to which a user belongs. NetUserGetInfo Returns information about a particular user account on a server. NetUserGetLocalGroups Returns a list of local group names to which a user belongs. NetUserSetGroups Sets global group membersh