Today I Learned

28 posts about #ruby

Finding Where a Method is Defined

One great thing about Ruby is how flexible it is, although that flexibility can sometimes make it hard to determine where a method definition comes from.

One useful technique to pull out in just this circumstance is the source_location method.

For example, suppose you see a call to the quantity method in a spec and you wonder where its definition comes from. Do some puts debugging by adding this line:

puts method(:quantity).source_location
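
As a minimal, self-contained sketch of what that call returns (the Inventory class is hypothetical and only stands in for whatever defines quantity): source_location gives back a [file, line] pair for methods written in Ruby, and nil for methods implemented in C.

```ruby
# Hypothetical class standing in for whatever defines `quantity`.
class Inventory
  def quantity
    42
  end
end

# For a method written in Ruby, source_location is a [file, line] pair.
file, line = Inventory.new.method(:quantity).source_location
puts "quantity is defined at #{file}:#{line}"

# Methods implemented in C inside the interpreter have no Ruby source,
# so source_location returns nil for them.
c_method_location = String.instance_method(:upcase).source_location
puts c_method_location.inspect
```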

Lambda argument passing shorthand

Suppose that I have a Ruby lambda like so:

greeting = -> name { "Hello #{name}" }

The conventional way of calling this lambda is like so:

greeting.call("Ned") # "Hello Ned"

However, for those of you dreading having to type in call (hello, JavaScript developers), you can use this syntactic sugar instead:

greeting["Ned"]
greeting.("Ned")

Source: https://stackoverflow.com/questions/18774139/how-works-with-lambdas

UPDATE

I am now a sad panda to learn that this shorthand syntax is not recommended in the ruby-style-guide: https://github.com/bbatsov/ruby-style-guide/issues/205 https://github.com/bbatsov/ruby-style-guide/blob/master/README.md#proc-call

It's still a cool learning for me to know that this syntax exists in the first place, though.

FactoryGirl identifiers as symbols not strings

Why is FactoryGirl.create(:my_object) better than FactoryGirl.create("my_object")?

Symbols are immutable objects that are stored by the Ruby interpreter as a numeric ID (an implementation detail; the design of Ruby leaves room for different storage mechanisms).

Each string literal, on the other hand, creates a new String object whose memory footprint grows with the number of characters in the string (unless frozen string literals are enabled). Using symbols will thus take up less memory and potentially be faster, because every occurrence of the same symbol refers to the same object in memory.

If you have n instances of the same string in your code, Ruby will use O(n) memory to store all those strings. If you have n instances of the same symbol, however, Ruby will use O(1) memory, because they all refer to a single object.
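
The difference can be observed directly with an identity check. This is a small sketch rather than a benchmark, and the variable names are made up for illustration:

```ruby
# Every occurrence of the same symbol is the very same object.
first_symbol  = :my_object
second_symbol = :my_object
puts first_symbol.equal?(second_symbol) # identity check

# Two strings with the same characters are equal in value, but each one
# is a separate object with its own memory. String.new is used here so the
# result does not depend on frozen-string-literal settings.
first_string  = String.new("my_object")
second_string = String.new("my_object")
puts first_string == second_string
puts first_string.equal?(second_string)
```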

Further reading: https://bugs.ruby-lang.org/issues/7792#note-58

This TIL came courtesy of a discussion with Evan and later on Arturo about best practices in writing test code, but shared here because it's broadly applicable and the FactoryGirl example is just one use case.

Squish Those Strings

While reading some Rails code, I came across a deprecation warning:

ActiveSupport::Deprecation.warn(<<-MESSAGE.squish)
  `redirect_to :back` is deprecated and will be removed from Rails 5.1.
  Please use `redirect_back(fallback_location: fallback_location)` where
  `fallback_location` represents the location to use if the request has
  no HTTP referer information.
MESSAGE

What caught my eye was the squish method on the heredoc. It's common to see methods after heredocs to clean up formatting, but squish is a great method name.

squish removes all leading and trailing whitespace, then replaces all consecutive whitespace with a single space. One application has been cleaning up long string messages that use the line continuation operator. The community style guide says to only use line continuations for concatenating strings, but I think squish is cleaner:

long_string = "a long string " \
              "spanning " \
              "three lines"
# => "a long string spanning three lines"

better_long_string = "a long string
                      squished to a single line
                      without extra spaces or backslashes".squish
# => "a long string squished to a single line without extra spaces or backslashes"

An all-too-common caveat: squish is an extension method from ActiveSupport (active_support), not part of core Ruby.
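
For a script that does not load ActiveSupport, the behaviour described above can be approximated in plain Ruby. This is only a sketch under that assumption; squish_sketch is a hypothetical helper name, not part of any library.

```ruby
# Collapse every run of whitespace into one space, then trim both ends.
def squish_sketch(text)
  text.gsub(/[[:space:]]+/, " ").strip
end

message = squish_sketch("  a long string
                         squished to a single line\t
                         without extra spaces  ")
puts message
```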

bundle update --conservative

Let's say you want to update the gem foo. After running bundle update foo, you look at your Gemfile.lock and find that not only was foo updated, but so were many of its dependencies. You also notice that many of these dependency updates weren't necessary, as the previously installed versions were compatible with the new version of foo.

The default bundle update foo behaviour will unlock and update all dependencies of foo.

If you don't want to update these dependencies unnecessarily, one solution is to pin their current versions in your Gemfile.

However, an easier way to prevent the updating of shared dependencies is to use Bundler's --conservative flag: bundle update --conservative foo.

Replacing num.times.map with Array.new(num)

Problem

I want to create an array containing objects created with an incrementing integer index parameter. For example, an array of hashes containing strings built off of the incrementing number.

Standard Solution

my_array = num.times.map do |index|
  build_hash(index)
end

BUT...rubocop didn't like this:

... C: Performance/TimesMap: Use Array.new with a block instead of .times.map

Improved Solution

The improved solution allows us to build the same array with fewer chained methods, thus improving readability:

my_array = Array.new(num) do |index|
  build_hash(index)
end
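
A quick way to convince yourself that the two forms are interchangeable is to run them side by side. The build_hash below is a hypothetical stand-in for the helper used above:

```ruby
# Hypothetical stand-in for the build_hash helper used above.
def build_hash(index)
  { name: "item-#{index}" }
end

num = 3
with_times_map = num.times.map { |index| build_hash(index) }
with_array_new = Array.new(num) { |index| build_hash(index) }

puts with_array_new.inspect
```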

Comparing Version Strings in Ruby

While writing a Ruby script, I needed to check the version of a binary dependency. The --version switch gets me the data, but how do I compare it to the required version?

The binary follows semver, so a quick and dirty attempt might be:

"1.4.2".gsub(".", "") >= "1.3.1".gsub(".", "")
# => true

Unfortunately, this is misleading: we are lexicographically comparing the strings and these strings happen to have the same length. Thus, "142" comes after "131".

Testing that version "1.200.0" is newer than "1.9.0" will fail, as "12000" comes before "190" lexicographically.

It would be straightforward to write a small class to parse the string and compare the major, minor, and patch values. But Ruby has a quick solution provided by RubyGems. Since Ruby 1.9, RubyGems has been included in Ruby's standard library:

Gem::Version.new("1.200.1") >= Gem::Version.new("1.3.1")
# => true

Gem also provides a way to handle pessimistic constraints:

dependency = Gem::Dependency.new("", "~> 1.3.1")
dependency.match?("", "1.3.9")
# => true
dependency.match?("", "1.4.1")
# => false
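
To see the naive comparison and Gem::Version side by side, here is a short sketch using the failing case mentioned above. The version numbers are illustrative only:

```ruby
require "rubygems" # loaded by default in modern Rubies; explicit for clarity

# Stripping the dots compares "12000" with "190" as strings.
naive = "1.200.0".delete(".") >= "1.9.0".delete(".")

# Gem::Version compares segment by segment, as numbers.
proper = Gem::Version.new("1.200.0") >= Gem::Version.new("1.9.0")

# Gem::Version is Comparable, so it also works for sorting.
versions = %w[1.9.0 1.200.0 1.10.1]
sorted = versions.sort_by { |version| Gem::Version.new(version) }

puts naive
puts proper
puts sorted.inspect
```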

Decorator Pattern in Ruby with SimpleDelegator

The Decorator Pattern allows us to chain new behaviours onto objects without modifying the underlying objects. It is an application of the Open/Closed Principle. This pattern is useful, for example, when we need to tack on logging, monitoring, and other non-functional requirements to objects.

In Java or C# this can be achieved using interfaces. In Ruby, we can use the SimpleDelegator class to achieve this:

require "delegate"

class FooDecorator < SimpleDelegator
  def bar
    "This is a decorated #{__getobj__.bar}"
  end
end

class Foo
  def bar
    "bar"
  end

  def fiz
    "Fiz"
  end
end

decorated = FooDecorator.new(Foo.new)
puts decorated.bar # outputs "This is a decorated bar"
puts decorated.fiz # outputs "Fiz"

double_decorated = FooDecorator.new(FooDecorator.new(Foo.new))
puts double_decorated.bar # outputs "This is a decorated This is a decorated bar"
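
As a sketch of the logging use case mentioned earlier, a decorator can record each call and then hand it on with super, which SimpleDelegator forwards to the wrapped object. Account and LoggingDecorator are hypothetical names used only for illustration.

```ruby
require "delegate"

# Hypothetical class to be decorated.
class Account
  def initialize
    @balance = 0
  end

  def deposit(amount)
    @balance += amount
  end
end

# Records each deposit, then delegates to the wrapped object.
class LoggingDecorator < SimpleDelegator
  attr_reader :log

  def initialize(object)
    super
    @log = []
  end

  def deposit(amount)
    @log << "deposit(#{amount})"
    super # forwarded to the wrapped Account
  end
end

account = LoggingDecorator.new(Account.new)
first_balance = account.deposit(50)
second_balance = account.deposit(25)
puts account.log.inspect
```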

Bundle Console

bundle console [GROUP]

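This starts an interactive Ruby console with the gems from your bundle loaded. If GROUP is given, the gems in that Gemfile group are loaded in addition to the default group.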

Passing all env. variables to a shell command

Some of the methods in the Kernel module allow you to pass environment variables to a shell command. So rather than doing:

system("RAILS_ENV=test rake do_stuff")

You can do:

system({ "RAILS_ENV" => "test" }, "rake do_stuff")

This is particularly useful when we want to explicitly pass along all the environment variables of our current process:

system(ENV, "rake do_stuff")
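
Here is a runnable sketch of the hash form. It assumes a made-up variable name (TIL_DEMO_FLAG) and uses the current Ruby interpreter as the child command in place of rake, so that it works outside a Rails project:

```ruby
require "rbconfig"

# The child exits with status 0 only if it can see the variable.
child_script = "exit(ENV['TIL_DEMO_FLAG'] == 'on' ? 0 : 1)"

# The hash is added to the child's environment for this call only.
# RbConfig.ruby is the path of the running interpreter.
child_saw_flag = system({ "TIL_DEMO_FLAG" => "on" }, RbConfig.ruby, "-e", child_script)

# Our own process environment is left untouched.
parent_has_flag = ENV.key?("TIL_DEMO_FLAG")

puts child_saw_flag
puts parent_has_flag
```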

Prefer sort_by to sort when providing a block

Prefer the sort_by method over the sort method whenever you provide a block to define the comparison.

Common form:

line_adds.sort do |x, y|
  x.elements["ItemRef/ListID"].text <=> y.elements["ItemRef/ListID"].text
end

Preferred form:

line_adds.sort_by { |x| x.elements["ItemRef/ListID"].text }

For small collections, both techniques have similar performance profiles, and when the sort key is something simple like an integer, there is no performance benefit from sort_by.

The performance difference becomes noticeable when the sort key is expensive to compute and/or you have a large collection to sort: sort re-evaluates the key on every comparison, while sort_by computes it once per element.

The technique that yields the performance benefit is known as the Schwartzian Transform.
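
One way to make the difference visible is to count how often the key is computed. The sketch below uses word length as a stand-in for an expensive key; the word list is arbitrary.

```ruby
words = %w[banana fig avocado pear apple]

# Count how often the (pretend-expensive) key is computed.
key_calls = 0
expensive_key = lambda do |word|
  key_calls += 1
  word.length
end

# sort calls the key twice for every comparison it makes.
by_sort = words.sort { |x, y| expensive_key.call(x) <=> expensive_key.call(y) }
sort_calls = key_calls

# sort_by computes the key once per element, then sorts the cached keys.
key_calls = 0
by_sort_by = words.sort_by { |word| expensive_key.call(word) }
sort_by_calls = key_calls

puts "sort: #{sort_calls} key computations, sort_by: #{sort_by_calls}"
```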

Ruby print to replace contents on same line

In Ruby, the print command can be used with the '\r' (carriage return) character to bring the cursor back to the beginning of the printed line, so that the next print call will replace the contents already outputted to that line. This is a very useful tool for printing status updates in a CLI script. For example:

print "#{index} done. Progress: %.2f%%" % (index.to_f / items * 100).round(2) + "\r" if (index % 10) == 0

This will print and replace a line in STDOUT to report the status of a list of items being processed by a function, like so (note the %% in the format string, which is how a literal percent sign is written):

200 done. Progress: 15.00%
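
A self-contained version of the same idea might look like the sketch below. The progress_line helper and the item count are hypothetical; the point is only the trailing "\r" and the final newline.

```ruby
# Builds one status line; the trailing "\r" moves the cursor back to the
# start of the line so the next print overwrites it.
def progress_line(done, total)
  format("%d done. Progress: %.2f%%\r", done, done.to_f / total * 100)
end

total = 50 # hypothetical number of items
(1..total).each do |index|
  # ... process the item here ...
  print progress_line(index, total) if (index % 10).zero?
end
puts # finish with a newline so the shell prompt starts on a fresh line
```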

Typewriters still hold a lasting impact on modern-day computing!

Reading Text File with Byte Order Mark Using Ruby

Ruby's File.read() can natively read a text file containing a byte order mark and strip it out:

text_without_bom = File.read("file.txt", encoding: "bom|utf-8")
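
To see the effect, the sketch below writes a small file that starts with the UTF-8 byte order mark into a temporary directory and reads it back both ways. The file name and contents are arbitrary.

```ruby
require "tmpdir"

raw_text = nil
text_without_bom = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "file.txt")
  # Write the three UTF-8 BOM bytes followed by the actual content.
  File.binwrite(path, "\xEF\xBB\xBFhello")

  raw_text = File.read(path, encoding: "utf-8")
  text_without_bom = File.read(path, encoding: "bom|utf-8")
end

puts raw_text.length         # the BOM counts as one extra character
puts text_without_bom.length
```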

A quick deep dive into 'rake gettext:find'

Problem

I am using Ruby Gettext to manage translations. But today, when I ran rake gettext:find to update my PO files, none of them got updated.

Why??

The Investigation

After some digging, I noticed that Ruby Gettext defines one FileTask (a specific type of Rake task) per PO file, which delegates the work to GNU gettext.

FileTask looks at the timestamps of dependent files, and only executes the supplied block if any of the dependent files have a timestamp later than the file to update.

For example:

dependent_files = ["translations_template_file.pot"]
file "file_to_update" => dependent_files do
  # update the file
end
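
The timestamp rule can be sketched in plain Ruby. This is not Rake's own code, just an illustration of the check described above; needs_update? is a hypothetical helper and the files live in a temporary directory.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper mirroring the rule described above.
def needs_update?(target, dependencies)
  return true unless File.exist?(target)

  dependencies.any? { |dep| File.mtime(dep) > File.mtime(target) }
end

stale_before_touch = nil
stale_after_touch = nil

Dir.mktmpdir do |dir|
  template = File.join(dir, "translations_template_file.pot")
  target   = File.join(dir, "file_to_update")

  # The dependency is older than the target, so nothing needs to run.
  FileUtils.touch(template, mtime: Time.now - 120)
  FileUtils.touch(target, mtime: Time.now - 60)
  stale_before_touch = needs_update?(target, [template])

  # Touching the dependency gives it a newer timestamp than the target.
  FileUtils.touch(template, mtime: Time.now)
  stale_after_touch = needs_update?(target, [template])
end

puts stale_before_touch
puts stale_after_touch
```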

Why gettext:find was not doing anything

It turned out that gettext uses two FileTasks.

One to update the template:

files_needing_translations = ["file1.js", "file2.rb"]
file "translations_template_file.pot" => files_needing_translations do
  # update the translations template file
end

and another to update the PO file:

file "en-US/translation_file.po" => ["translations_template_file.pot"] do
  # update "en-US/translation_file.po"
end

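gettext:find did not do anything because none of the files needing translation had been modified. The template was therefore considered up to date and was not regenerated, and since the template's timestamp did not change, no PO files were updated either.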

Solution

> touch one_of_the_files_that_gettext_looks_at.js
> rake gettext:find

Matching array subset in Ruby

Problem:

How do you evaluate whether one array is a subset of another? For example, are the elements [a,c] included in [a,b,c]?

First attempt:

I was hoping to find something like Array.include?([...]), but this only checks if the array includes the argument as one of its values.

Second attempt:

Another approach is to pass a block into Array.any?

!arr1.any? { |e| !arr2.include?(e) }

But the double negation is rather indirect and doesn't easily reveal the intent.

I considered extracting a method to name the functionality:

def subset?(arr1, arr2)
  !arr1.any? { |e| !arr2.include?(e) }
end

But it's still difficult to read, as it's not clear whether arr1 is a subset of arr2, or vice versa.

Final Solution:

The Enumerable module includes a to_set method to convert the array to a set, and Set includes a subset? method.

arr1.to_set.subset?(arr2.to_set)

Technically, you need to require set.rb to get this method defined on Enumerable:

require "set"

arr1.to_set.subset?(arr2.to_set)

But you get this require for free in Rails.
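
Applied to the example from the problem statement, with both directions checked so the argument order is unambiguous (a small sketch with arbitrary values):

```ruby
require "set" # needed on older Rubies; harmless on newer ones

arr1 = %w[a c]
arr2 = %w[a b c]

small_in_large = arr1.to_set.subset?(arr2.to_set)
large_in_small = arr2.to_set.subset?(arr1.to_set)

# Set#<= is an alias for subset?, which some find reads more naturally.
operator_form = arr1.to_set <= arr2.to_set

puts small_in_large
puts large_in_small
```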

A little thing about .to_str

I was playing with Ruby's === today and found some knowledge that's share-worthy. I noticed a reference to .to_str in the Ruby docs for String#=== and decided to investigate.

Nothing too exciting here, but it's an important point of reference.

> hello = "hello"
> goodbye = "goodbye"

> hello === hello     #=> true
> hello === goodbye   #=> false

This is also what would usually be expected. Hang in there...

> string = "string"
> object = Object.new
> string + object     #=> TypeError no implicit conversion

Here's where things get funky.

class SomeObjectWithToStr
  def to_str
    " is now a string"
  end
end

> string = "string"
> object = SomeObjectWithToStr.new

> string + object     #=> "string is now a string"
> "string is now a string" === "string" + object      #=> true

Hunh? Why did that work?

TIL that .to_str is the method Ruby calls when an operation needs to implicitly convert an object to a string. Only genuinely string-like objects should respond to it, so you'll likely have to define it yourself. Also note that the type of the receiver on the left determines what the argument on the right will try to convert into.
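
The contrast with the more familiar to_s can be sketched as follows. OnlyToS and WithToStr are hypothetical classes made up for this illustration.

```ruby
# Only to_s: fine for explicit conversion, not for implicit conversion.
class OnlyToS
  def to_s
    "explicit"
  end
end

# to_str signals "treat me as a string" to core methods such as String#+.
class WithToStr
  def to_str
    "implicit"
  end
end

interpolated = "value: #{OnlyToS.new}"   # interpolation calls to_s
concatenated = "value: " + WithToStr.new # String#+ calls to_str

# Without to_str, String#+ refuses the argument.
begin
  "value: " + OnlyToS.new
  raised_type_error = false
rescue TypeError
  raised_type_error = true
end

puts interpolated
puts concatenated
puts raised_type_error
```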

Do you know of any Objects that come with pre-defined .to_str methods?

What's a "twiddle-wakka"?

I'm proud to say that I now know what a twiddle-wakka is. It is the ~> notation (the pessimistic version constraint) that we use in our Ruby-flavoured semver notation in a Gemfile. Specifically, it allows the last segment of the specified version to increase while every segment before it stays fixed. For example, ~> 2.0 means 2.0 <= VERSION < 3.0, while ~> 2.0.1 means 2.0.1 <= VERSION < 2.1.
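
The ranges can be checked with the Gem::Requirement class that ships with RubyGems. This is a small sketch; the accepts lambda is just a local convenience and the version numbers are arbitrary.

```ruby
require "rubygems"

two_oh = Gem::Requirement.new("~> 2.0")
two_oh_one = Gem::Requirement.new("~> 2.0.1")

accepts = lambda do |requirement, version|
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts accepts.call(two_oh, "2.9.3")     # still below 3.0
puts accepts.call(two_oh, "3.0")       # next major version
puts accepts.call(two_oh_one, "2.0.9") # still below 2.1
puts accepts.call(two_oh_one, "2.1.0") # next minor version
```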

Reference: http://guides.rubygems.org/patterns/

But seriously, now I know what a twiddle-wakka is. :D

Enhancing rake tasks

Problem

I have a rake task, and I want to make it do something before and after it's done.

task :a_task do
  puts "task"
end

task :setup_task do
  puts "setup"
end

task :run_after do
  puts "after"
end

What are my options?

Solution

For pre-req tasks, this is what is often done:

task :a_task => [:setup_task] do
  puts "task"
end

task :setup_task do
  puts "setup"
end

> rake a_task
setup
task

However, this requires modifying the existing task. This might not even be an option for rake tasks from 3rd party gems. We can do better: Enhance the task!

task :a_task do
  puts "task"
end

task :setup_task do
  puts "setup"
end

Rake::Task[:a_task].enhance [:setup_task]

> rake a_task
setup
task

To run a task (or any code for that matter) after a rake task:

task :a_task do
  puts "task"
end

task :run_after do
  puts "after"
end

Rake::Task["a_task"].enhance do
  Rake::Task["run_after"].invoke
end

> rake a_task
task
after

Source: http://www.dan-manges.com/blog/modifying-rake-tasks

Expect to Receive and Call to Original

In integration specs, it is preferable to call the original method when setting up an expectation on an object to receive an invocation of that method. This way, the method isn't stubbed out but instead will still be invoked. Any downstream effects of calling that method won't be hidden.

class Calculator
  def self.add(x, y)
    x + y
  end
end

Should be tested like this:

require 'calculator'

RSpec.describe "and_call_original" do
  it "responds as it normally would" do
    expect(Calculator).to receive(:add).and_call_original
    expect(Calculator.add(2, 3)).to eq(5)  # any bugs inside of #add won't be hidden
  end
end

This code example is taken from: Relish - Calling the original implementation

Show definition of a method at runtime in Ruby

Many people know about method(:foo).source_location to find where a method is defined at runtime. I just found a better way by using pry.

From the pry console, run:

[8] pry(main)> show-source Bar.scoped.where
From: /Users/arturo/.rvm/gems/ruby-2.2.6/gems/activerecord-3.2.22.1/lib/active_record/relation/query_methods.rb @ line 132:
Owner: ActiveRecord::QueryMethods
Visibility: public
Number of lines: 7

def where(opts, *rest)
  return self if opts.blank?

  relation = clone
  relation.where_values += build_where(opts, rest)
  relation
end

Happy Hacking!

Inheritance does not affect method visibility

Contrary to visibility conventions in other languages such as Java, Ruby methods defined under a private block in a class definition are still accessible by that class' children:

class Foo
  private

  def private_method!
    p "Hello world!"
  end
end

class Bar < Foo
  def uses_private_method
    private_method!
  end
end

b = Bar.new
b.uses_private_method # => "Hello world!"

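This is because the private keyword in Ruby has nothing to do with inheritance; declaring a method as private only adds the restriction that it may not be invoked with an explicit receiver, as illustrated below. (Since Ruby 2.7, a literal self receiver is exempt from this restriction, so the explicit_receiver call below raises only on older versions.)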

class Quux < Foo
  def explicit_receiver
    self.private_method!
  end

  def implicit_receiver
    private_method!
  end
end

q = Quux.new
q.explicit_receiver # => NoMethodError: private method `private_method!' called for #<Quux:0x007fee689e0ff8>
q.implicit_receiver # => "Hello world!"
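
A version-independent way to see the rule is to call the private method from outside the object, which is rejected on every Ruby version. Parent and Child are hypothetical classes for this sketch.

```ruby
class Parent
  private

  def secret
    "Hello world!"
  end
end

class Child < Parent
  def reveal
    secret # implicit receiver: allowed, even though secret is inherited
  end
end

child = Child.new
from_inside = child.reveal

# From outside the object, the private method is still off limits.
begin
  child.secret
  outside_error = nil
rescue NoMethodError => e
  outside_error = e
end

puts from_inside
puts outside_error.class
```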

Fetch: two approaches for setting a default value

When using #fetch to assign a default value, the default can be passed either as an argument or as a block.

What are the implications for choosing one approach over the other?

# Option 1: block
setting = settings.fetch('key') { default_setting }

# Option 2: argument
setting = settings.fetch('key', default_setting)

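When the default value is passed as a block, it is only evaluated when needed, that is, when the key is missing (lazy evaluation).

With the argument approach, the default is evaluated on every call, even when the key is present. This could lead to serious performance issues if computing the default is an expensive operation.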
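
The difference can be made visible by counting how often the default is computed. In this sketch, default_setting is a lambda standing in for an expensive operation, and the keys are arbitrary.

```ruby
settings = { "key" => "stored value" }

default_calls = 0
default_setting = lambda do
  default_calls += 1 # stands in for an expensive computation
  "default value"
end

# Argument form: the default is computed before fetch even runs.
from_argument = settings.fetch("key", default_setting.call)
calls_after_argument = default_calls

# Block form: the block is skipped because the key exists.
from_block = settings.fetch("key") { default_setting.call }
calls_after_block = default_calls

# The block only runs when the key is missing.
missing = settings.fetch("other") { default_setting.call }

puts calls_after_argument
puts calls_after_block
puts default_calls
```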

Consider using the #public_send method

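Prefer the #public_send method to the #send method: #send bypasses visibility checks and will happily invoke private methods, while #public_send only invokes public ones.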

result_date = date.public_send(operation, delta)

In the example above, the first argument to the #public_send method (operation) is either :+ or :-.
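
Both points can be sketched in a few lines. The date is arbitrary, and Vault is a hypothetical class with a private method:

```ruby
require "date"

date = Date.new(2017, 1, 31)
next_day = date.public_send(:+, 1)
previous_day = date.public_send(:-, 1)

class Vault # hypothetical class with a private method
  private

  def combination
    "12-34-56"
  end
end

via_send = Vault.new.send(:combination) # send ignores visibility

# public_send respects visibility and refuses the private method.
begin
  Vault.new.public_send(:combination)
  blocked = false
rescue NoMethodError
  blocked = true
end

puts next_day
puts previous_day
puts blocked
```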

Loading Data into ELK

Scenario

I want to load data into Elasticsearch.

Solution

Modify this script to suit your own data:

require "elasticsearch"
require "typhoeus"
require "typhoeus/adapters/faraday"

client = Elasticsearch::Client.new(host: "localhost:9200")

scope = BackgroundTask
  .where("created_at > '2015-01-01'")
  .where("created_at < '2016-01-01'")

total = scope.count # count once rather than on every batch
count_so_far = 0

puts "Processing #{total} records"

scope.find_in_batches do |tasks|
  puts "#{count_so_far} of #{total}"
  count_so_far += tasks.count

  task_array = tasks.map do |task|
    {
      create: {
        _index: "background_tasks",
        _id: task.id,
        _type: "task",
        data: {
          task_type: task.type,
          created_at: task.created_at,
          waiting_time: task.queued_at - task.created_at,
          queued_time: task.run_at - task.queued_at,
          processing_time: task.completed_at - task.run_at
        },
      }
    }
  end

  client.bulk(body: task_array)
end

Problems with Reusing Ruby Standard Class Names

You might want to think twice before defining a class or module that reuses the name of a Ruby core or Standard Library class. If undetected, you will get strange, hard-to-debug behaviour in your app. Let's explore further with this Ruby file:

module Utils
  module String
    def self.some_useful_method
      # ...
    end
  end
end

module Utils
  module Foo
    def self.do_stuff(string)
      raise "Argument '#{string}' is not a string" unless string.is_a?(String)
      # ...
    end
  end
end

Utils::Foo.do_stuff("Hello World")

Okay, so Hello World is a String. And a String is a String, no questions asked. Right? Well...not quite.

RuntimeError: Argument 'Hello World' is not a string

Since Utils::String gets loaded by Ruby, all references to the String constant from code inside the Utils module will resolve to Utils::String. This example may seem simple and obvious, but imagine if these two modules were in separate files, or even separate libraries. How would it feel if you kept getting "string".is_a? String => false in your debugging sessions?

MORAL OF THE STORY: It probably isn't a good idea to reuse class names from the Ruby Standard Library. Naming the first module Utils::StringUtils is likely a better idea.
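
When renaming is not possible (for example, because the clashing module lives in a third-party library), the core class can still be reached explicitly with a leading ::. A minimal sketch, using a hypothetical Helpers namespace:

```ruby
module Helpers # hypothetical namespace
  module String
  end

  module Checker
    def self.shadowed?(value)
      value.is_a?(String)   # resolves to Helpers::String
    end

    def self.core?(value)
      value.is_a?(::String) # leading :: starts the lookup at the top level
    end
  end
end

puts Helpers::Checker.shadowed?("Hello World")
puts Helpers::Checker.core?("Hello World")
```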

UUID Generation in Ruby Standard Lib

There's no need for fancy gems in order to generate RFC 4122 version 4 compliant UUID strings:

>> require 'securerandom'
=> true
>> SecureRandom.uuid
=> "af04813c-6d80-4277-b4e7-7193f7413876"
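
If you want to double-check the shape of what comes back, the version and variant digits can be matched with a regular expression. This is a sketch; the v4_format pattern is written for this example and is not part of SecureRandom.

```ruby
require "securerandom"

uuid = SecureRandom.uuid

# Version 4 UUIDs have a "4" as the version digit and 8, 9, a or b as the
# first digit of the fourth group (the variant).
v4_format = /\A\h{8}-\h{4}-4\h{3}-[89ab]\h{3}-\h{12}\z/
looks_like_v4 = uuid.match?(v4_format)

puts uuid
puts looks_like_v4
```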

Curly braces vs. do/end: Operation Precedence

Choosing whether you use { ... } or do ... end around your blocks is more than just a stylistic choice in Ruby. It also affects how an expression is evaluated, because the two forms have different precedence. In a nutshell, a block with curly braces has higher precedence than a block with do/end.

Consider this example from a great Stack Overflow post:

f param { do_something() }

will execute differently than

f param do do_something() end

The former will bind the block to param, while the latter will bind the block to f. The more you know...
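
To watch the binding happen, here is a small sketch with two hypothetical methods, inner and outer, that each report whether they received a block:

```ruby
# Both helpers report whether they were handed a block.
def inner(&block)
  block ? :inner_got_block : :inner_no_block
end

def outer(argument, &block)
  [argument, block ? :outer_got_block : :outer_no_block]
end

# Braces bind to the nearest call, so the block goes to inner.
with_braces = outer inner { :ignored }

# do/end binds to the leftmost call, so the block goes to outer.
with_do_end = outer inner do
  :ignored
end

puts with_braces.inspect
puts with_do_end.inspect
```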

Reverse-search in IRB.

You can reverse-search through previously entered statements in IRB by pressing Ctrl-R:

> irb
2.1.6 :001 > def something_complicated(x,y); x + y; end
 => :something_complicated
2.1.6 :002 > quit

> irb
(reverse-i-search)`compli': def something_complicated(x,y); x + y; end

Happy hacking!