[Python-Dev] GIL removal question

On 09.08.2011 11:33, Марк Коренберг wrote:
> Probably I want to re-invent a bicycle. I want developers to tell me
> why we cannot remove the GIL in this way:
>
> 1. Remove the GIL completely, with all its current logic.
> 2. Add its own RW-locking to every mutable object (like list or dict)
> 3. Add RW-locks to every context instance
> 4. Use RW-locks when accessing members of object instances
>
> The only reason I see not to do that is the performance of
> single-threaded applications. Why not turn the locking functions for
> these 4 cases into stubs when only one thread is present?

This has been discussed to death before, and is probably OT to this list.

There is another reason besides the speed of single-threaded applications, but it is rather technical: As CPython uses reference counting for garbage collection, we would get "false sharing" of reference counts -- which would work as an "invisible GIL" (synchronization bottleneck) anyway. That is, if one processor writes to memory in a cache line shared by another processor, they must stop whatever they are doing to synchronize the dirty cache lines with RAM. Thus, updating reference counts would flood the memory bus with traffic and be much worse than the GIL. Instead of doing useful work, the processors would be stuck synchronizing dirty cache lines. You can think of it as a severe traffic jam.

To get rid of the GIL, CPython would need either (a) another GC method (e.g. similar to .NET or Java) or (b) another threading model (e.g. one interpreter per thread, as in Tcl, Erlang, or .NET app domains). As CPython has neither, we are better off with the GIL.

Nobody likes the GIL; fork a project to write a GIL-free CPython if you can. But note that:

1. With Cython, you have full manual control over the GIL. IronPython and Jython do not have a GIL at all.

2. Much of the FUD against the GIL is plain ignorance: The GIL slows down parallel computational code, but any serious number crunching should use numerical performance libraries (i.e. C extensions) anyway. Libraries are free to release the GIL or spawn threads internally. Also, the GIL does not matter for (a) I/O bound code such as network servers or clients and (b) background threads in GUI programs -- which are the two common use-cases for threads in Python programs. If the GIL bites you, it's most likely a warning that your program is badly written, independent of the GIL issue.

There seems to be a common misunderstanding that Python threads work like fibers because of the GIL. They do not! Python threads are native OS threads and can do anything a thread can do, including executing library code in parallel. If one thread is blocking on I/O, the other threads can continue with their business. The only thing Python threads cannot do is access the Python interpreter concurrently. And the reason CPython needs that restriction is reference counting.

Sturla
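
P.S. A minimal sketch of the point about blocking threads, assuming time.sleep as a stand-in for a blocking I/O call (it releases the GIL while it waits). The names blocking_call and run_threads are made up for this illustration. If the threads really ran one after another, the wall time would be about n_threads * seconds; instead it should come out close to a single wait.

```python
import threading
import time


def blocking_call(seconds):
    # Stand-in for blocking I/O (assumption): time.sleep releases the GIL
    # while waiting, so other Python threads are free to run meanwhile.
    time.sleep(seconds)


def run_threads(n_threads=4, seconds=0.5):
    """Start n_threads native OS threads that each block for `seconds`
    and return the total wall-clock time until all of them finish."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_threads()
    # Serialized waits would take about 4 * 0.5 s; overlapping waits take
    # roughly as long as one of them.
    print(f"4 threads, 0.5 s each: {elapsed:.2f} s wall time")
```

The same experiment with a pure-Python CPU-bound loop in place of time.sleep would not be expected to overlap, since that code needs the interpreter the whole time.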