

Subject: [ruby-core:33487] Re: [Ruby 1.9-Feature#4085][Open] Refinements and nested methods

From: Charles Oliver Nutter <headius@ a u c >

Date: Tue, 30 Nov 2010 19:20:35 +0900

References: 33322

In-reply-to: 33322

This is a long response, and for that I apologize. I want to make sure I'm being clear about my concerns, so they can be addressed in a meaningful way.

SUMMARY:

* "using" not being a keyword requires all calls to check for refinements all the time, globally degrading performance.
* instance_eval propagating refinements requires all block invocations to localize what refinements they use for invocation on every activation.
* Shared, mutable structures can't be used to store the active refinement, due to concurrency issues.
* There are very likely many more complexities than illustrated here that result from the combination of runtime-mutable lexical scoping structures, concurrency, and method caching.

On Wed, Nov 24, 2010 at 7:12 AM, Shugo Maeda <redmine / ruby-lang.org> wrote:

> If a module or class is using refinements, they are enabled in
> module_eval, class_eval, and instance_eval if the receiver is the
> class or module, or an instance of the class.

I am surprised nobody else has questioned this behavior. I believe it is a problem.

Currently, blocks handle method dispatch like any other scope: they look only at the "self" object's class (for fcall/vcall) or the target object's class (for a normal call). A typical caching structure to optimize this has an entry that stores previously-seen method(s) and invalidates based on some trivial guard. In 1.9, this is a global serial number; in JRuby, it's a class hierarchy-based guard.

The global serial number approach in 1.9 means that any change that flips that serial number causes all caches everywhere to invalidate. Normally this only happens on method definition or module inclusion, which is why defining methods or including modules at runtime is strongly discouraged for performance reasons. Refinements make method lookup more complicated, since any scope where a refinement may be active can no longer use the simple "target class" lookup and cache-validation logic.
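To make the caching scheme under discussion concrete, here is a toy model (invented names, not CRuby's or JRuby's real data structures) of an inline cache guarded by a global serial number: every method definition bumps the global serial, so one unrelated definition invalidates every call site's cache at once.

```ruby
# Global serial number, as in the CRuby 1.9 scheme described above.
$global_serial = 0

# A toy class whose method table bumps the serial on every definition.
class ToyClass
  def initialize
    @methods = {}
  end

  def define(name, &body)
    @methods[name] = body
    $global_serial += 1   # any definition anywhere invalidates all caches
  end

  def lookup(name)
    @methods[name]
  end
end

# One inline cache per call site: remembers the method it found and the
# serial number at fill time. A changed serial forces the slow path.
class CallSite
  attr_reader :slow_lookups

  def initialize(klass, name)
    @klass, @name = klass, name
    @cached_serial = -1
    @cached_method = nil
    @slow_lookups  = 0
  end

  def call(*args)
    if @cached_serial != $global_serial
      @cached_method = @klass.lookup(@name)   # slow-path lookup
      @cached_serial = $global_serial
      @slow_lookups += 1
    end
    @cached_method.call(*args)
  end
end

klass = ToyClass.new
klass.define(:add) { |a, b| a + b }

site = CallSite.new(klass, :add)
site.call(1, 1)            # slow path fills the cache
site.call(2, 2)            # cache hit: serial unchanged
klass.define(:other) { }   # unrelated definition bumps the serial...
site.call(3, 3)            # ...forcing this site back onto the slow path
site.slow_lookups          # => 2
```

The point of the model: refinements that can appear at runtime would add yet another condition to the guard in `call`, on every call site, everywhere.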
Because refinements are enabled at runtime, after parse, this also means that all calls everywhere must constantly check whether a refinement is enabled. This is performance hit #1. If "using" were a keyword, we could know at parse time that some calls must check for refinements and other calls need not, localizing the performance hit to only those scopes where refinements are active. I would strongly encourage "using" be made into a keyword.

Without "using" being a keyword, we can still avoid a global performance hit by pretending it is a keyword and proactively changing how scopes are parsed in the presence of "using" in a containing scope. This is likely what we would do in JRuby, forcing all class-body calls named "using" to "damage" performance in child scopes. We would also disallow or strongly discourage aliasing of "using", since aliasing would make it impossible to make a proper decision at parse time. We already do this for methods like "eval", which force a method body to be completely deoptimized.

The instance_eval case basically makes it impossible to avoid the performance hit for method calls within a block, since at any time a previously-captured block could be instance_eval'ed against a receiver class with refinements enabled. So all blocks everywhere would have to constantly check for refinements, forever. One possible suggestion to get around this would be to make all method calls in blocks check a global serial number, as in CRuby. At best, this is still an additional check in implementations that don't use a global serial number to invalidate method caches. At worst, it's still infeasible.

Recall that previously, refinements were largely lexical and mostly static. In other words, even though refinements would not be applied at parse time, they would be applied at method-definition time and be unchangeable from then on. The instance_eval case throws this out completely: the same block could be instance_eval'ed against two different refinements at the same time.
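For reference, shipped Ruby (2.1 and later, well after this thread) settled on semantics close to what is argued for here: `using` is lexical, a block permanently captures the refinements active where it was written, and instance_eval does not propagate them. A sketch in modern Ruby (module and method names are invented for illustration):

```ruby
module Shout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

# Captured before any `using` is active: this block permanently carries
# "no refinements", no matter who later invokes it or how.
probe = proc do
  begin
    "hi".shout
  rescue NoMethodError
    :no_refinement
  end
end

module Demo
  using Shout   # lexically active for the rest of this module body only

  def self.inside
    "hi".shout  # refined here
  end

  def self.run_probe(blk)
    blk.call    # calling the block from a refined scope changes nothing
  end
end

Demo.inside                 # => "HI!"
probe.call                  # => :no_refinement
Demo.run_probe(probe)       # => :no_refinement
"hi".instance_eval(&probe)  # => :no_refinement (no propagation)
```

Under the proposal being criticized, that last `instance_eval` call would instead have activated the refinement inside `probe`, which is exactly the dynamic behavior the rest of this mail argues is unoptimizable.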
Since the current logic stores the active refinement in the cache, and the cache is shared across all invocations (including concurrent invocations), we now have a case where, mid-call, the static in-memory code/caches for a block would have to switch to a different refinement. Obviously this is intractable, since we wanted refinements in the first place for their isolation characteristics. In order to avoid this, all blocks everywhere would need to *never* cache method lookups, and always do a full slow-path lookup on their thread-local structures.

Even if an implementation isn't actually concurrent, things are still intractable, since any invocation of instance_eval against a refinement would have to force a global serial number change (at minimum) to force caches to be invalidated. This means that any use anywhere of instance_eval against a refinement would cause all block-borne method calls to flush and recache on their next invocation.

And if that's not bad enough, even on a non-concurrent implementation the context-switch boundaries are finer-grained than individual calls, so any shared mutable data structure indicating which refinement to use would *still* require a slow-path lookup every time in order to isolate one refinement-targeted instance_eval's effects from others. And even if we don't consider concurrency, there's the issue of the *same* block being used in the *same* call stack for *different* refinements. Any call you make downstream from a given block could cause that block's static in-memory structures to be modified.

It might be possible to reduce the slow-path logic to checking the call frame for *every single call* to see if a refinement is active; if the caller knows that a refinement is active, call frames would have to have this bit set. But the caller is not responsible for the call frame construction, so all calls everywhere would have to check the caller's frame to see if a refinement is active.
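The sharing hazard above can be shown with a deterministic toy model (no real threads; the "context switch" is simulated by explicit interleaving, and all names are invented): a single mutable cache slot shared by every invocation of a block is overwritten mid-"call" by a second invocation running under a different refinement.

```ruby
# One shared inline-cache slot for a block's call site, holding the
# refinement it was resolved under and the resolved target method.
SharedCache = Struct.new(:refinement, :target)

cache = SharedCache.new

# Slow path: resolve the method under `refinement`, then fill the
# shared cache so later invocations can skip the lookup.
fill = lambda do |refinement|
  cache.refinement = refinement
  cache.target = "plus_under_#{refinement}"
end

# "Thread A" begins a call under refinement :A and fills the cache.
fill.call(:A)
a_first_read = cache.target    # "plus_under_A" -- correct so far

# Context switch: "thread B" runs the same block under refinement :B,
# clobbering the shared slot.
fill.call(:B)

# "Thread A" resumes and trusts the shared cache again:
a_second_read = cache.target   # "plus_under_B" -- the wrong refinement!

[a_first_read, a_second_read]  # => ["plus_under_A", "plus_under_B"]
```

This is why the mail concludes that either the cache cannot be shared (per-thread slow-path lookups everywhere) or it cannot be a cache at all.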
Accessing the caller's frame means every call needs to do additional pointer dereferences and checks, for every Ruby method call.

And one last case that's a problem: author's intent. If I write a block of code that does "1 + 1", I intend for that code to do normal Fixnum#+, and I intend for the result to be 2. It should not be possible for a caller to change the intent of my code just because I passed it in a block. This has been my argument against using blocks as bindings, and it's another argument against instance_eval being able to force refinements into "someone else's code".

I could continue to try to theorize about possible implementations, but they all lead toward runtime-alterable refinements being a devilishly complicated thing to implement and potentially impossible to optimize. I could be wrong, especially if my understanding of the feature is flawed.

Now, some positive reinforcement for "using" being a keyword and instance_eval not propagating refinements. If "using" were a keyword, calls within the related lexical scopes would become "refinement-aware calls", localizing the performance impact to only those calls. They would need additional cache guards, potentially with global impact, but at least normal code would work exactly as it does today. Block bodies would be no exception; unless a "using" were active at parse time, all calls could be taken at face value.

This would also preclude instance_eval of a block propagating refinements, since the parse-time nature of "using" would mean a block is what it is, forever. Your code can't modify the intent of my code, and only calls where a parent scope at parse time contains the "using" keyword would know anything about refinements. This is, in fact, how I implemented "selector namespaces" over a year ago in JRuby, as an example.
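The author's-intent point above is the property shipped Ruby ultimately preserved. A sketch in modern Ruby (using a hypothetical named method `plus` rather than the literal `1 + 1`, and invented module names): a caller's refinement cannot change what a block written elsewhere computes.

```ruby
module BrokenMath
  refine Integer do
    def plus(other)
      42               # a deliberately "wrong" refined version
    end
  end
end

# Baseline Integer#plus so the block works everywhere (illustration only).
class Integer
  def plus(other)
    self + other
  end
end

# The author of this block intends ordinary arithmetic: 1 plus 1 is 2.
two = proc { 1.plus(1) }

module Caller
  using BrokenMath

  def self.local_sum
    1.plus(1)          # refined in this lexical scope: 42
  end

  def self.run(blk)
    blk.call           # the block keeps its author's semantics: 2
  end
end

Caller.local_sum   # => 42
Caller.run(two)    # => 2
two.call           # => 2
```

The refinement applies where its author opted in, and nowhere else, which is exactly the isolation this mail argues refinements exist to provide.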
http://groups.google.com/group/ruby-core-google/msg/6f45dcb363e75267

I can try to come up with a concrete example of the problems with the current proposal and implementation, but the concurrency cases would be difficult to show.

- Charlie