This technical memo is a cautionary note on using the IO timeout and
interrupt features of the Netscape Portable Runtime (NSPR) on Windows
NT 3.51 and 4.0. Due to a limitation of the present implementation of
NSPR IO on NT, programs must observe the following guideline:

If a thread calls an NSPR IO function on a file descriptor and the IO
function fails with ``PR_IO_TIMEOUT_ERROR`` or
``PR_PENDING_INTERRUPT_ERROR``, the file descriptor must be closed
before the thread exits.

This memo explains the problem that this guideline works around and
discusses its limitations.

.. _NSPR_IO_on_NT:

NSPR IO on NT
-------------

The IO model of NSPR 2.0 is synchronous and blocking. A thread calling
an IO function is blocked until the IO operation finishes, either due to
a successful IO completion or an error. If the IO operation cannot
complete before the specified timeout, the IO function returns with
``PR_IO_TIMEOUT_ERROR``. If the thread gets interrupted by another
thread's ``PR_Interrupt()`` call, the IO function returns with
``PR_PENDING_INTERRUPT_ERROR``.

On Windows NT, NSPR IO is implemented using NT's *overlapped* (also
called *asynchronous*) *IO*. When a thread calls an IO function, the
thread issues an overlapped IO request using the overlapped buffer in
its ``PRThread`` structure. Then the thread is put to sleep. In the
meantime, there are dedicated internal threads (called the *idle
threads*) monitoring the IO completion port for completed IO requests.
If a completed IO request appears at the IO completion port, an idle
thread fetches it and wakes up the thread that issued the IO request
earlier. This is the normal way the thread is awakened.

.. _IO_Timeout_and_Interrupt:

IO Timeout and Interrupt
------------------------

However, NSPR may wake up the thread in two other situations:

-  if the overlapped IO request is not completed before the specified
   timeout. (Note that we can't specify a timeout on overlapped IO
   requests, so the timeouts are all handled at the NSPR level.) In this
   case, the error is ``PR_IO_TIMEOUT_ERROR``.
-  if the thread gets interrupted by another thread's
   ``PR_Interrupt()`` call. In this case, the error is
   ``PR_PENDING_INTERRUPT_ERROR``.

These two errors are generated by the NSPR layer, so the OS is unaware
of what is going on and the overlapped IO request is still in progress.
The OS still has a pointer to the overlapped buffer in the thread's
``PRThread`` structure. If the thread subsequently exits and its
``PRThread`` structure gets deleted, the pointer to the overlapped
buffer will be pointing to freed memory, and the OS may write to that
freed memory when the request eventually completes.

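The hazard can be illustrated with a small, self-contained model. This is
only a sketch under stated assumptions: ``ModelThread``, ``ModelRequest``,
``model_issue_io``, ``model_wake``, ``model_os_complete``, and
``model_thread_may_exit`` are hypothetical names invented for this
illustration, not NSPR or Win32 APIs, and the "OS" is reduced to a struct
that remembers a pointer to the thread's overlapped buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real structures; not NSPR types. */
typedef struct {
    char overlapped[32];   /* models the overlapped buffer in PRThread */
} ModelThread;

typedef struct {
    void *overlapped;      /* pointer the "OS" keeps for the request */
    bool  in_progress;     /* true until the OS completes the request */
} ModelRequest;

typedef enum { MODEL_OK, MODEL_TIMEOUT, MODEL_INTERRUPTED } ModelResult;

/* The thread issues a request that uses its own overlapped buffer. */
static void model_issue_io(ModelThread *t, ModelRequest *r) {
    assert(t != NULL && r != NULL);
    r->overlapped = t->overlapped;
    r->in_progress = true;
}

/* The "OS" finishes the request; only then does it drop the pointer. */
static void model_os_complete(ModelRequest *r) {
    r->in_progress = false;
    r->overlapped = NULL;
}

/* Wake the waiting thread for the given reason. Only a normal completion
 * (MODEL_OK) means the OS is done with the request. A timeout or an
 * interrupt is decided by the runtime layer and leaves the request
 * untouched. */
static ModelResult model_wake(ModelRequest *r, ModelResult reason) {
    if (reason == MODEL_OK) {
        model_os_complete(r);
    }
    return reason;
}

/* The thread's structure may be freed only if no in-progress request
 * still points at its overlapped buffer. */
static bool model_thread_may_exit(const ModelThread *t, const ModelRequest *r) {
    return !(r->in_progress && r->overlapped == (const void *)t->overlapped);
}
```

In this model, waking a thread with ``MODEL_TIMEOUT`` or
``MODEL_INTERRUPTED`` leaves ``model_thread_may_exit()`` returning
``false``, which corresponds to the dangling-pointer situation described
above.
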
.. _Canceling_Overlapped_IO_by_Closing_the_File_Descriptor:

Canceling Overlapped IO by Closing the File Descriptor
------------------------------------------------------

Therefore, we need to cancel the outstanding overlapped IO request
before the thread exits. NT's ``CancelIo()`` function would be
ideal for this purpose. Unfortunately, ``CancelIo()`` is not
available on NT 3.51, so we can't go this route as long as we are
supporting NT 3.51. The only reliable way to cancel an outstanding
overlapped IO request that works on both NT 3.51 and 4.0 is to close the
file descriptor, hence the rule of thumb stated at the beginning of this
memo.

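In code, the rule of thumb amounts to a check on the error code after a
failed IO call. The sketch below is written to compile without NSPR, so
everything in it is a hypothetical stand-in: ``SketchFd``,
``SketchError``, ``sketch_close``, and ``sketch_handle_io_failure`` are
names invented here. In a real program the error code would presumably
come from ``PR_GetError()``, the comparison would use
``PR_IO_TIMEOUT_ERROR`` and ``PR_PENDING_INTERRUPT_ERROR``, and the close
would be ``PR_Close()``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NSPR error codes. The numeric values are
 * arbitrary and are NOT the real NSPR values. */
typedef enum {
    SKETCH_IO_ERROR,                /* any other IO failure */
    SKETCH_IO_TIMEOUT_ERROR,        /* stands in for PR_IO_TIMEOUT_ERROR */
    SKETCH_PENDING_INTERRUPT_ERROR  /* stands in for PR_PENDING_INTERRUPT_ERROR */
} SketchError;

/* Hypothetical stand-in for a file descriptor. */
typedef struct {
    bool is_open;
} SketchFd;

/* Stands in for PR_Close(). */
static void sketch_close(SketchFd *fd) {
    assert(fd != NULL && fd->is_open);
    fd->is_open = false;
}

/* True for the two errors that leave an overlapped request outstanding. */
static bool sketch_leaves_io_outstanding(SketchError err) {
    return err == SKETCH_IO_TIMEOUT_ERROR ||
           err == SKETCH_PENDING_INTERRUPT_ERROR;
}

/* Call after an IO function has failed with `err`. Closes the descriptor
 * when the rule of thumb requires it and returns true if it was closed. */
static bool sketch_handle_io_failure(SketchFd *fd, SketchError err) {
    if (sketch_leaves_io_outstanding(err)) {
        sketch_close(fd);
        return true;
    }
    return false;
}
```

The guideline only requires the close to happen before the thread exits.
Closing immediately, as this sketch does, is simply the easiest way to make
sure that requirement is never missed.
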
.. _Limitations:

Limitations
-----------

This seemingly harsh way to force the completion of an outstanding
overlapped IO request has the following limitations:

-  It is difficult for threads to share a file descriptor. For example,
   suppose thread A and thread B call ``PR_Accept()`` on the same
   socket, and they time out at the same time. Following the rule of
   thumb, both threads would close the socket. The first
   ``PR_Close()`` would succeed, but the second ``PR_Close()``
   would free memory that has already been freed. A solution that may
   work is to use a lock to ensure that only one thread can be using
   that socket at any time.
-  Once there is a timeout or interrupt error, the file descriptor is no
   longer usable. If the file descriptor is intended to be used for
   the lifetime of the process (a log file, for example), this is
   really not acceptable. A possible solution is to add a
   ``PR_DisableInterrupt()`` function to turn off interrupts when
   accessing such file descriptors.

..

   *A related known bug is that timeout and interrupt don't work for*
   ``PR_Connect()`` *on NT. This bug is due to a different
   limitation in our NT implementation.*

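The lock suggested in the first limitation can be sketched as an
exclusive-ownership guard around the shared socket. This is a minimal
illustration under stated assumptions, not NSPR code: ``SharedSocket``,
``SHARED_SOCKET_INIT``, ``shared_socket_try_acquire``, and
``shared_socket_release`` are hypothetical names, and the guard is a
non-blocking try-lock built on a C11 ``atomic_flag`` so that the example
needs no threading library. A real program would more likely use a
blocking lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper around a shared socket; not an NSPR type. */
typedef struct {
    atomic_flag in_use;   /* set while some thread owns the socket */
    bool        is_open;  /* models whether the descriptor is still open */
} SharedSocket;

#define SHARED_SOCKET_INIT { ATOMIC_FLAG_INIT, true }

/* Try to become the only user of the socket. Returns false if another
 * thread currently owns it or if a previous owner already closed it. */
static bool shared_socket_try_acquire(SharedSocket *s) {
    assert(s != NULL);
    if (atomic_flag_test_and_set(&s->in_use)) {
        return false;                   /* someone else is using it */
    }
    if (!s->is_open) {
        atomic_flag_clear(&s->in_use);  /* nothing left to use */
        return false;
    }
    return true;
}

/* Give up ownership. An owner whose IO call timed out or was interrupted
 * passes close_it = true, so the descriptor is closed exactly once and
 * only while no other thread is using it. */
static void shared_socket_release(SharedSocket *s, bool close_it) {
    assert(s != NULL);
    if (close_it) {
        s->is_open = false;  /* a real program would close the descriptor here */
    }
    atomic_flag_clear(&s->in_use);
}
```

Because ``is_open`` is only read or written while the flag is held, two
threads can never both reach the close step, which is the double-close
scenario described in the first limitation. The cost is that the threads
can no longer wait on the socket concurrently.
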
.. _Conclusions:

Conclusions
-----------

As long as we need to support NT 3.51, we need to program under the
guideline that after an IO timeout or interrupt error, the thread must
make sure the file descriptor is closed before it exits. Programs should
also take care when sharing file descriptors and when using IO timeout
or interrupt on files that need to stay open for the lifetime of the
process.

When we stop supporting NT 3.51, we can look into using NT 4's
``CancelIo()`` function to cancel outstanding overlapped IO
requests when we get IO timeout or interrupt errors. If
``CancelIo()`` really works as advertised, that should
fundamentally solve this problem.

If these limitations of IO timeout and interrupt are not acceptable for
your programs, you can consider using the Win95 version of
NSPR. The Win95 version runs without trouble on NT, but you would lose
the better performance provided by NT fibers and asynchronous IO.

|

.. _Original_Document_Information:

Original Document Information
-----------------------------

-  Author: larryh@netscape.com
-  Last Updated Date: December 1, 2004
