I have some rake tasks that sit alongisde a Rails app that fetch data from remote
sources and stuff into the database. These tasks sun in a background worker every 5 minutes. In Rails, rake tasks can load the entire Rails
environment with the environment option:
task some_task: :environment do
  SomeClass.do_my_thing!
endMy task was to move these tasks from the infrastructure where the Rails app lives into a CI-like environment that executes short-lived tasks more efficiently, and exposes better logs. The purpose of this move was to be able to debug errors in the data syncs more easily, restart them on demand, and to isolate logs away from the web server.
Because the new system does not have access to the production database server,
I had to make my rake tasks run independently and POST the data to the Rails server
at a privileged API endpoint.
Intro aside, the biggest hurdle I ran into during this conversion was to load code without Rails' autoloading feature.
Rails does this fantastic thing where you can reference any class from anywhere.
Code in Controllers or classes in lib/ can reference models, Models can reference classes from installed gems, etc. There are a set of rules that govern this automatic loading
behavior and its configurable with config.autoload_paths, but the point is that it's rare
to need explicit require statements.
Converting my rake task to run without loading the Rails environment made it painfully obvious
that organizing code that doesn't come with its own structure is challenging! The good news is
that I ended up with a decent pattern for using require and require_relative effectively:
Use
requirefor files outside the module andrequire_relativefor files inside the same module.
To illustrate, here's a sample directory structure:
thing_1
  - foo.rb
  - bar.rb
thing_2
  - baz.rb
thing_1.rb
thing_2.rbNote that thing_1 has both a root level file and a root level directory. The file thing_1.rb looks like this:
# thing_1.rb
Dir["./thing_1/**/*.rb"].each { |f| require f }
module Thing1
endand thing1/foo.rb looks like this:
### thing1/foo.rb
module Thing1
  class Foo
  end
endIf thing_2/baz.rb wants to use the Thing1::Foo class, it can require
it like this:
# thing2/baz.rb
require './thing_1'but if thing_1/bar.rb wants to use Thing1::Foo (inside the same module), it can require it like this:
# thing1/bar.rb
require_relative './foo'The tradeoff for this is that thing_1 is all-or-nothing. thing_2 cannot
import Thing1::Foo without also importing Thing1::Bar. I tend to think
this tradeoff is worth it, and if a module gets too big, it's time to think
of how to break it down further.
Another thing worth noting here is that require always runs relative to $CWD, whereas require_relative runs relative to __FILE__ (the current file). So if thing2/baz.rb really wanted to require only thing1/foo.rb (and not thing1/bar.rb), its two options would be:
require_relative '../thing1/foo.rb'
# or
require './thing1/foo.rb'I found that it was easier to constrain myself to requireing only the top level "loader" files than to reach into other modules and require indivudual files and have to remember the right path to use.
