In what way does insertion into a sorted array take the same O(n) time?
Do you mean the cost of the memcpy after the insert position has been
found? Leaving the memcpy aside (which I believe is cheap at these sizes),
the cost of insertion is O(log n).
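To make the two costs under discussion concrete, here is a minimal Ruby sketch. It is not code from any of the patches; `sorted_insert` is a hypothetical helper. Finding the position takes O(log n) comparisons, while the shift that follows moves up to n elements, though with a small constant factor.

```ruby
# Insert +feature+ into the already-sorted array +sorted+, keeping it sorted.
# Hypothetical helper for illustration only.
def sorted_insert(sorted, feature)
  # Binary search: O(log n) comparisons to find the first element >= feature.
  pos = sorted.bsearch_index { |x| x >= feature } || sorted.size
  # Array#insert shifts the tail one slot to the right: up to n element moves.
  sorted.insert(pos, feature)
  pos
end

features = %w[a.rb c.rb e.rb]
position = sorted_insert(features, "d.rb")
puts "inserted at #{position}: #{features.inspect}"
```

Whether the shift matters in practice depends on n; for a few thousand entries it is a single block move of pointers.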

Can you show results with $LOAD_PATH caching but without $LOADED_FEATURES
indexing? I believe that indexing will add only a small improvement on top
of caching.

On 28.10.2012 at 5:30, "gregprice (Greg Price)" price@mit.edu wrote:

Issue #7158 has been updated by gregprice (Greg Price).

Sasada-san, thank you for the review. You're right, patch (4) should

(and doesn't) invalidate the cache when the working directory changes. I

believe Yura is right that it should also invalidate the cache when the

filesystem encoding changes.
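The working-directory case matters because a relative $LOAD_PATH entry expands against the current directory. A minimal illustration, assuming the cache stores results equivalent to what `File.expand_path` returns (the issue itself names rb_file_expand_path()):

```ruby
require "tmpdir"

# The same relative entry expands differently before and after a chdir,
# so a cached expansion becomes stale when the working directory changes.
before = File.expand_path("lib")
after = Dir.mktmpdir do |dir|
  Dir.chdir(dir) { File.expand_path("lib") }
end

puts before
puts after
```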

Yura, thanks for your existing patches and your detailed comments. At

first look, I suspect that the reason a sorted-list approach to

$LOADED_FEATURES was not a major speedup is that insertion is slow -- it

takes time O(n), on the same order as the lookup takes in trunk. So it

should still be helpful when one does many 'require' calls on the same

files, but not as much as if insertion is also fast. With the

hash-table-based index in (3), insertion takes time O(p), where p ~= 10 is

the number of path components in the library filenames, and lookup is O(1)

except in pathological cases.
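One way to get an O(p) insertion cost and O(1) lookup is to key a Hash by every trailing sub-path of a feature. The sketch below assumes that layout; `FeatureIndex` and its methods are hypothetical names, and the real code in patch (3) is C in load.c and may differ in detail.

```ruby
# Hypothetical sketch of a hash index over loaded features.
class FeatureIndex
  EXTENSION = /\.[^.\/]+\z/

  def initialize
    @index = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register the feature stored at +position+ in the features array.
  # Performs one hash insertion per path component, so p insertions in total.
  def add(path, position)
    parts = path.sub(EXTENSION, "").split("/").reject(&:empty?)
    parts.each_index do |i|
      @index[parts[i..-1].join("/")] << position
    end
  end

  # A single hash probe returns the candidate positions for +feature+.
  def lookup(feature)
    @index.fetch(feature.sub(EXTENSION, ""), [])
  end
end

index = FeatureIndex.new
index.add("/usr/lib/ruby/1.9.1/set.rb", 0)
p index.lookup("set")
p index.lookup("1.9.1/set.rb")
p index.lookup("missing")
```

A lookup returns candidates only; the caller would still have to confirm each candidate against $LOAD_PATH, which this sketch leaves out.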

I'm not too worried about recomputing the $LOAD_PATH cache from scratch

after it's modified. For one thing, we do so only in an actual 'require'

call, and it's only as expensive as the computation we do now in trunk on

every 'require' call, so there's no case in which it should cause a

regression. I suspect it's also the case that in common usage (with e.g.

Bundler) most $LOAD_PATH modifications happen all at once with no

intervening 'require' calls, and in that case we will only recompute it

once for all of those changes. (But I haven't checked that that's the

case.) So I think the trade-off of recomputing from scratch with less code

complexity is a good one.

Shirosaki-san, thanks for your additional patches. I'm reading them now.

Bug #7158: require is slow in its bookkeeping; can make Rails startup

2.2x faster

https://bugs.ruby-lang.org/issues/7158#change-31844

Author: gregprice (Greg Price)

Status: Assigned

Priority: Normal

Assignee: ko1 (Koichi Sasada)

Category: core

Target version:

ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35409) [i686-linux]

=begin

Starting a large application in Ruby is slow. Most of the startup

time is not spent in the actual work of loading files and running Ruby

code, but in bookkeeping in the 'require' implementation. I've

attached a patch series which makes that bookkeeping much faster.

These patches speed up a large Rails application's startup by 2.2x,

and a pure-'require' benchmark by 3.4x.

These patches fix two ways in which 'require' is slow. Both problems

have been discussed before, but these patches solve the problems with

less code and stricter compatibility than previous patches I've seen.

(1) Currently we iterate through $LOADED_FEATURES to see if anything
    matches the newly required feature. Further, each iteration
    iterates in turn through $LOAD_PATH. Xavier Shay spotted this
    problem last year and a series of patches was discussed
    (in Issue #3924) to add a Hash index alongside $LOADED_FEATURES,
    but for 1.9.3 none were merged; Masaya Tarui committed Revision r31875,
    which mitigated the problem. This series adds a Hash index,
    and keeps it up to date even if the user modifies $LOADED_FEATURES.
    This is worth a 40% speedup on one large Rails application,
    and 2.3x on a pure-'require' benchmark.

(2) Currently each 'require' call runs through $LOAD_PATH and calls
    rb_file_expand_path() on each element. Yura Sokolov (funny_falcon)
    proposed caching this last December in Issue #5767, but it wasn't
    merged. This series also caches $LOAD_PATH, and keeps the cache up
    to date with a different, less invasive technique. The cache takes
    34 lines of code, and is worth an additional 57% speedup in
    starting a Rails app and a 46% speedup in pure 'require'.

== Staying Compatible

With both the $LOADED_FEATURES index and the $LOAD_PATH cache,

* we exactly preserve the semantics of the user modifying $LOAD_PATH
  or $LOADED_FEATURES;

* both $LOAD_PATH and $LOADED_FEATURES remain ordinary Arrays, with
  no singleton methods;

* we make just one semantic change: each element of $LOAD_PATH and
  $LOADED_FEATURES is made into a frozen string. This doesn't limit
  the flexibility Ruby offers to the programmer in any way; to alter
  an element of either array, one simply reassigns it to the new
  value. Further, normal path-munging code which only adds and
  removes elements shouldn't have to change at all.
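The effect of the frozen-string change on user code can be sketched with an ordinary Array standing in for $LOAD_PATH. The paths are made up for illustration; only in-place mutation of an element stops working.

```ruby
# Stand-in for $LOAD_PATH after the patches: every element is frozen.
paths = ["/app/lib".freeze, "/app/vendor".freeze]

# Mutating an element in place now raises (FrozenError on Ruby 2.5+,
# RuntimeError on 1.9).
begin
  paths[0] << "/extra"
rescue StandardError => e
  error = e
end

# Reassigning the slot to a new string works as before.
paths[0] = paths[0] + "/extra"

# Adding and removing elements is unaffected.
paths << "/app/plugins"
paths.delete("/app/vendor")

p error.class
p paths
```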

These patches use the following technique to keep the cache and the

index up to date without modifying the methods of $LOADED_FEATURES or

$LOAD_PATH: we take advantage of the sharing mechanism in the Array

implementation to detect, in O(1) time, whether either array has been

mutated. We cause $LOADED_FEATURES to be shared with an Array we keep

privately in load.c; if anything modifies it, it will break the

sharing and we will know to rebuild the index. Similarly for

$LOAD_PATH.
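The control flow of this scheme (check on use, rebuild only if something changed) can be sketched in plain Ruby. Ruby code cannot observe the Array sharing that the C patch relies on, so the sketch below substitutes a snapshot comparison, which is O(n) rather than O(1); `LoadPathCache` is a hypothetical name and not part of the patches.

```ruby
# Hypothetical sketch: lazily rebuilt cache of expanded load-path entries.
class LoadPathCache
  attr_reader :rebuilds

  def initialize(load_path)
    @load_path = load_path
    @snapshot = nil
    @expanded = nil
    @rebuilds = 0
  end

  # Called from 'require'. Rebuilds only if the array changed since last use.
  def expanded
    # Stand-in for the O(1) "is the sharing still intact?" check in load.c.
    unless @snapshot == @load_path
      @snapshot = @load_path.dup
      @expanded = @snapshot.map { |dir| File.expand_path(dir) }
      @rebuilds += 1
    end
    @expanded
  end
end

load_path = ["lib"]
cache = LoadPathCache.new(load_path)
cache.expanded
cache.expanded                 # unchanged array: no rebuild
load_path << "vendor" << "ext" # two modifications, no 'require' in between
cache.expanded                 # a single rebuild covers both
puts cache.rebuilds
```

Because the check runs only when the cache is consulted, a burst of modifications with no intervening 'require' costs one rebuild in total.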

== Benchmarks

First, on my company's Rails application, where $LOAD_PATH.size is 207

and $LOADED_FEATURES.size is 2126. I measured the time taken by

'bundle exec rails runner "p 1"'.

                            Rails startup time,
  version                   best of 5            speedup
  v1_9_3_194                12.197s
  v1_9_3_194+index           8.688s              1.40x
  v1_9_3_194+index+cache     5.538s              2.20x

And now isolating the performance of 'require', by requiring

16000 empty files.

  version                   time, best of 5      speedup
  trunk (at r36920)         10.115s
  trunk+index                4.363s              2.32x
  trunk+index+cache          2.984s              3.39x

(The timings for the Rails application are based on the latest release

rather than trunk because a number of gems failed to compile against

trunk for me.)
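A pure-'require' measurement of this kind could be reproduced roughly as follows. This is a sketch under assumptions, not the script used for the numbers above: the file naming, the temporary directory, and the helper name `time_requires` are all invented here.

```ruby
require "benchmark"
require "tmpdir"

# Create +count+ empty .rb files in a fresh directory, put that directory
# on $LOAD_PATH, and return the seconds spent requiring all of them.
def time_requires(count)
  Dir.mktmpdir do |dir|
    # Unique prefix per run so repeated calls really load new features.
    prefix = File.basename(dir)
    names = Array.new(count) { |i| "#{prefix}_#{i}" }
    names.each { |name| File.write(File.join(dir, "#{name}.rb"), "") }

    $LOAD_PATH.unshift(dir)
    begin
      Benchmark.realtime { names.each { |name| require name } }
    ensure
      $LOAD_PATH.delete(dir)
    end
  end
end

# The issue used 16000 files; a smaller count keeps this example quick.
puts format("%.3fs", time_requires(1_000))
```

Taking the best of several runs, as the tables above do, reduces noise from filesystem caching.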

== The Patches

I've attached four patches:

(1) Patch 1 changes no behavior at all. It adds comments and

simplifies a bit of code to help in understanding why patch 3 is

correct. 42 lines, most of them comments.

(2) Patch 2 adds a function to array.c which will help us tell when

$LOAD_PATH or $LOADED_FEATURES has been modified. 17 lines.

(3) Patch 3 adds the $LOADED_FEATURES index. 150 lines.

(4) Patch 4 adds the $LOAD_PATH cache. 34 lines.

Reviews and comments welcome -- I'm sure there's something I could do

to make these patches better. I hope we can get some form of them

into trunk before the next release. My life has been happier since I

switched to this version because commands in my Rails application all

run faster now, and I want every Ruby programmer to be happier in the

same way with 2.0 and ideally with 1.9.4.

=end

--

http://bugs.ruby-lang.org/