Skip to content

Commit e5e8c0a

Browse files
committed
Split the download and install process of a gem:
- ### Problem Bundler awaits for the dependencies of a gem to be download and installed before it proceeds to downloading and installing the dependency itself. This creates a bottleneck at the end of the installation process and block a thread unnecessarily. ### Details The installation strategy in Bundler is to await for "leaf gems" to be download/installed before the "root gem" is processed. For instance, in this scenario: - Gem "foo" has a dependency on "bar" (We can call this the "root gem") - Gem "bar" has no dependency (We call cal this the "leaf gems") - A Gemfile adds "gem 'foo'" In this case, only a single thread will have work to do, that is because Bundler will queue the gem "bar" to be downloaded and installed, and only when "bar" is finished, then Bundler will queue work for "foo". For **pure ruby gems**, this strategy is a waste of time because during the installation, a gem's code is not evaluated and no "require" statement is evaluated. For gems with native extensions, this strategy make sense. When the `extconf.rb` of a gem is evaluated, it's possible that the extconf requires a dependency and therefore Bundler needs to wait for those dependencies to be installed before it can execute the extconf. A typical example is a native extension that require 'mini_portile2'. ### Solution From the explanation above, I'd like to split the download from the installation of a gem. The tricky aspect is that there is no RubyGems API to know whether a gem has a native extension. The only way to know is after we download the gem and read the `metadata` from the tarball. So the solution in this patch is as follow: 1. We download all gems without doing any checks. 2. Once the gems are downloaded, we now know whether a gem has a native extensions. 3. If a gem is a pure ruby gems, we install it without waiting. 4. If a gem has a native extension, we check whether its dependencies are installed. If a dependency is not yet installed, we put back the gem in the queue. It will be dequeued later on and we'll redo this logic. ### Performance gain The speed gain highly depends on how deep the dependency tree is. E.g. bar depends on foo which depends on baz which depends on jane ... Previously we'd proceed to installing gems just one by one and only a single thread would be used. In a freshly generated Rails application, the speed gain is not that important. And the reason is because we are having a "tail latency" issue. Around the end of the installation process, the remaining gems to be installed are the one with native extensions, since we had to wait for for their dependencies to be installed. Compiling them is the slowest part, and since we are doing it at the end then the speed gain is not that noticeable. ### Misc Another advantage of this change is to be able to more easily implement this kind of feature #9138 to let users solely download gems, without installing them .
1 parent 643174d commit e5e8c0a

7 files changed

Lines changed: 133 additions & 44 deletions

File tree

bundler/lib/bundler/installer/gem_installer.rb

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,20 @@ def install_from_spec
2525
[false, specific_failure_message(e)]
2626
end
2727

28+
def download
29+
spec.source.download(
30+
spec,
31+
force: force,
32+
local: local,
33+
build_args: Array(spec_settings),
34+
previous_spec: previous_spec,
35+
)
36+
37+
[true, nil]
38+
rescue Bundler::BundlerError => e
39+
[false, specific_failure_message(e)]
40+
end
41+
2842
private
2943

3044
def specific_failure_message(e)

bundler/lib/bundler/installer/parallel_installer.rb

Lines changed: 59 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,12 @@ def ready_to_enqueue?
3232
state == :none
3333
end
3434

35+
def ready_to_install?(installed_specs)
36+
return false unless state == :downloaded
37+
38+
spec.extensions.none? || dependencies_installed?(installed_specs)
39+
end
40+
3541
def has_post_install_message?
3642
!post_install_message.empty?
3743
end
@@ -84,6 +90,7 @@ def initialize(installer, all_specs, size, standalone, force, local: false, skip
8490

8591
def call
8692
if @rake
93+
do_download(@rake, 0)
8794
do_install(@rake, 0)
8895
Gem::Specification.reset
8996
end
@@ -107,26 +114,54 @@ def failed_specs
107114
end
108115

109116
def install_with_worker
110-
enqueue_specs
111-
process_specs until finished_installing?
117+
installed_specs = {}
118+
enqueue_specs(installed_specs)
119+
120+
process_specs(installed_specs) until finished_installing?
112121
end
113122

114123
def install_serially
115124
until finished_installing?
116125
raise "failed to find a spec to enqueue while installing serially" unless spec_install = @specs.find(&:ready_to_enqueue?)
117126
spec_install.state = :enqueued
127+
do_download(spec_install, 0)
118128
do_install(spec_install, 0)
119129
end
120130
end
121131

122132
def worker_pool
123133
@worker_pool ||= Bundler::Worker.new @size, "Parallel Installer", lambda {|spec_install, worker_num|
124-
do_install(spec_install, worker_num)
134+
case spec_install.state
135+
when :enqueued
136+
do_download(spec_install, worker_num)
137+
when :installable
138+
do_install(spec_install, worker_num)
139+
else
140+
spec_install
141+
end
125142
}
126143
end
127144

128-
def do_install(spec_install, worker_num)
145+
def do_download(spec_install, worker_num)
129146
Plugin.hook(Plugin::Events::GEM_BEFORE_INSTALL, spec_install)
147+
148+
gem_installer = Bundler::GemInstaller.new(
149+
spec_install.spec, @installer, @standalone, worker_num, @force, @local
150+
)
151+
152+
success, message = gem_installer.download
153+
154+
if success
155+
spec_install.state = :downloaded
156+
else
157+
spec_install.error = "#{message}\n\n#{require_tree_for_spec(spec_install.spec)}"
158+
spec_install.state = :failed
159+
end
160+
161+
spec_install
162+
end
163+
164+
def do_install(spec_install, worker_num)
130165
gem_installer = Bundler::GemInstaller.new(
131166
spec_install.spec, @installer, @standalone, worker_num, @force, @local
132167
)
@@ -147,9 +182,19 @@ def do_install(spec_install, worker_num)
147182
# Some specs might've had to wait til this spec was installed to be
148183
# processed so the call to `enqueue_specs` is important after every
149184
# dequeue.
150-
def process_specs
151-
worker_pool.deq
152-
enqueue_specs
185+
def process_specs(installed_specs)
186+
spec = worker_pool.deq
187+
188+
if spec.installed?
189+
installed_specs[spec.name] = true
190+
return
191+
elsif spec.failed?
192+
return
193+
elsif spec.ready_to_install?(installed_specs)
194+
spec.state = :installable
195+
end
196+
197+
worker_pool.enq(spec)
153198
end
154199

155200
def finished_installing?
@@ -185,18 +230,15 @@ def require_tree_for_spec(spec)
185230
# Later we call this lambda again to install specs that depended on
186231
# previously installed specifications. We continue until all specs
187232
# are installed.
188-
def enqueue_specs
189-
installed_specs = {}
190-
@specs.each do |spec|
191-
next unless spec.installed?
192-
installed_specs[spec.name] = true
193-
end
194-
233+
def enqueue_specs(installed_specs)
195234
@specs.each do |spec|
196-
if spec.ready_to_enqueue? && spec.dependencies_installed?(installed_specs)
197-
spec.state = :enqueued
198-
worker_pool.enq spec
235+
if spec.installed?
236+
installed_specs[spec.name] = true
237+
next
199238
end
239+
240+
spec.state = :enqueued
241+
worker_pool.enq spec
200242
end
201243
end
202244
end

bundler/lib/bundler/plugin/api/source.rb

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -74,6 +74,14 @@ def options_to_lock
7474
{}
7575
end
7676

77+
# Download the gem specified by the spec at appropriate path.
78+
#
79+
# A source plugin can implement this method to split the download and the
80+
# installation of a gem.
81+
#
82+
# @return [Boolean] Whether the download of the gem succeeded.
83+
def download(spec, opts); end
84+
7785
# Install the gem specified by the spec at appropriate path.
7886
# `install_path` provides a sufficient default, if the source can only
7987
# satisfy one gem, but is not binding.

bundler/lib/bundler/plugin/installer.rb

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -110,7 +110,8 @@ def install_from_specs(specs)
110110
paths = {}
111111

112112
specs.each do |spec|
113-
spec.source.install spec
113+
spec.source.download(spec)
114+
spec.source.install(spec)
114115

115116
paths[spec.name] = spec
116117
end

bundler/lib/bundler/self_manager.rb

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,7 @@ def install_and_restart_with(version)
6363
end
6464

6565
def install(spec)
66+
spec.source.download(spec)
6667
spec.source.install(spec)
6768
end
6869

bundler/lib/bundler/source.rb

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,8 @@ def version_message(spec, locked_spec = nil)
3131
message
3232
end
3333

34+
def download(*); end
35+
3436
def can_lock?(spec)
3537
spec.source == self
3638
end

bundler/lib/bundler/source/rubygems.rb

Lines changed: 47 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,8 @@ def initialize(options = {})
2121
@allow_local = options["allow_local"] || false
2222
@prefer_local = false
2323
@checksum_store = Checksum::Store.new
24+
@gem_installers = {}
25+
@gem_installers_mutex = Mutex.new
2426

2527
Array(options["remotes"]).reverse_each {|r| add_remote(r) }
2628

@@ -162,49 +164,40 @@ def specs
162164
end
163165
end
164166

165-
def install(spec, options = {})
167+
def download(spec, options = {})
166168
if (spec.default_gem? && !cached_built_in_gem(spec, local: options[:local])) || (installed?(spec) && !options[:force])
167-
print_using_message "Using #{version_message(spec, options[:previous_spec])}"
168-
return nil # no post-install message
169+
return true
169170
end
170171

171-
path = fetch_gem_if_possible(spec, options[:previous_spec])
172-
raise GemNotFound, "Could not find #{spec.file_name} for installation" unless path
173-
174-
return if Bundler.settings[:no_install]
175-
176-
install_path = rubygems_dir
177-
bin_path = Bundler.system_bindir
178-
179-
require_relative "../rubygems_gem_installer"
180-
181-
installer = Bundler::RubyGemsGemInstaller.at(
182-
path,
183-
security_policy: Bundler.rubygems.security_policies[Bundler.settings["trust-policy"]],
184-
install_dir: install_path.to_s,
185-
bin_dir: bin_path.to_s,
186-
ignore_dependencies: true,
187-
wrappers: true,
188-
env_shebang: true,
189-
build_args: options[:build_args],
190-
bundler_extension_cache_path: extension_cache_path(spec)
191-
)
172+
installer = rubygems_gem_installer(spec, options)
192173

193174
if spec.remote
194175
s = begin
195176
installer.spec
196177
rescue Gem::Package::FormatError
197-
Bundler.rm_rf(path)
178+
Bundler.rm_rf(installer.gem)
198179
raise
199180
rescue Gem::Security::Exception => e
200181
raise SecurityError,
201-
"The gem #{File.basename(path, ".gem")} can't be installed because " \
182+
"The gem #{installer.gem} can't be installed because " \
202183
"the security policy didn't allow it, with the message: #{e.message}"
203184
end
204185

205186
spec.__swap__(s)
206187
end
207188

189+
spec
190+
end
191+
192+
def install(spec, options = {})
193+
if (spec.default_gem? && !cached_built_in_gem(spec, local: options[:local])) || (installed?(spec) && !options[:force])
194+
print_using_message "Using #{version_message(spec, options[:previous_spec])}"
195+
return nil # no post-install message
196+
end
197+
198+
return if Bundler.settings[:no_install]
199+
200+
installer = rubygems_gem_installer(spec, options)
208201
spec.source.checksum_store.register(spec, installer.gem_checksum)
209202

210203
message = "Installing #{version_message(spec, options[:previous_spec])}"
@@ -516,6 +509,34 @@ def extension_cache_slug(spec)
516509
return unless remote = spec.remote
517510
remote.cache_slug
518511
end
512+
513+
# We are using a mutex to reaed and write from/to the hash.
514+
# The reason this double synchronization was added is for performance
515+
# and lock the mutex for the shortest possible amount of time. Otherwise,
516+
# all threads are fighting over this mutex and when it gets acquired it gets locked
517+
# until a threads finishes downloading a gem, leaving the other threads waiting
518+
# doing nothing.
519+
def rubygems_gem_installer(spec, options)
520+
@gem_installers_mutex.synchronize { @gem_installers[spec.name] } || begin
521+
path = fetch_gem_if_possible(spec, options[:previous_spec])
522+
raise GemNotFound, "Could not find #{spec.file_name} for installation" unless path
523+
524+
require_relative "../rubygems_gem_installer"
525+
526+
installer = Bundler::RubyGemsGemInstaller.at(
527+
path,
528+
security_policy: Bundler.rubygems.security_policies[Bundler.settings["trust-policy"]],
529+
install_dir: rubygems_dir.to_s,
530+
bin_dir: Bundler.system_bindir.to_s,
531+
ignore_dependencies: true,
532+
wrappers: true,
533+
env_shebang: true,
534+
build_args: options[:build_args],
535+
bundler_extension_cache_path: extension_cache_path(spec)
536+
)
537+
@gem_installers_mutex.synchronize { @gem_installers[spec.name] ||= installer }
538+
end
539+
end
519540
end
520541
end
521542
end

0 commit comments

Comments
 (0)