Enums and Queries in Rails 4.1, and Understanding Ruby
Sometimes when you get puzzled by what Rails is doing, you really just need to understand what Ruby is doing.
For example, given this simple code to get an attribute value:
# return value of some_attribute and foobar
def some_attribute_foobar
  "#{some_attribute} and foobar"
end
Beginners are often stumped by why this code does not set an attribute value:
# change the value of some_attribute to foobar
def change_some_attribute
  # why doesn't the next line set the some_attribute value to "foobar"?
  some_attribute = "foobar"
  save!
end
What's going on?
In the first method, some_attribute is actually a method call which gets the attribute value of the record. This works in Rails ActiveRecord due to the Ruby feature of method_missing, which allows some code to run when a method is called that does not exist.
In the second method, a local variable called some_attribute is getting assigned. There is no call to method_missing, as this is a variable assignment!
The correct code should have been:
# change the value of some_attribute to foobar
def change_some_attribute
  self.some_attribute = "foobar"
  save!
end
In this case, we're calling the method some_attribute= on the model instance, and we get the expected result of assigning an attribute value.
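The difference is easy to see outside of Rails. Here is a minimal plain-Ruby sketch, where the hypothetical Record class uses attr_accessor as a stand-in for an ActiveRecord attribute (my assumption for illustration, not how ActiveRecord defines its attributes):

```ruby
# Plain Ruby, no Rails: attr_accessor stands in for an ActiveRecord attribute.
class Record
  attr_accessor :some_attribute

  def change_with_local
    # Ruby parses this as assignment to a new local variable;
    # the some_attribute= method is never called.
    some_attribute = "foobar"
    some_attribute # returns the local, which disappears when the method ends
  end

  def change_with_self
    # The explicit receiver makes this a call to the some_attribute= method.
    self.some_attribute = "foobar"
  end
end

record = Record.new
record.change_with_local
local_result = record.some_attribute   # the attribute was never touched

record.change_with_self
setter_result = record.some_attribute  # the setter method ran
```

After the first call the attribute is still nil; only the call with the explicit self receiver changes it.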
Enums
For those not familiar with enums:
An enum type is a special data type that enables for a variable to be a set of predefined constants. The variable must be equal to one of the values that have been predefined for it.
Enums, introduced in Rails 4.1, are a place where a lot of Ruby magic happens! It's critical to understand Ruby well in order to understand how to use enums effectively. Let's suppose we have this simple example, copied over from the Rails docs:
class Conversation < ActiveRecord::Base
  enum status: [ :active, :archived ]
end
# conversation.update! status: 0
conversation.active!
conversation.active? # => true
conversation.status # => "active"
# conversation.update! status: 1
conversation.archived!
conversation.archived? # => true
conversation.status # => "archived"
# conversation.update! status: 1
conversation.status = "archived"
# conversation.update! status: nil
conversation.status = nil
conversation.status.nil? # => true
conversation.status # => nil
So what's going on in terms of Ruby meta-programming?
For all the enum values declared for Conversation, methods are created in the following forms. Let's use the model Conversation, column "status", and the enum "active" for this example:
method | description |
---|---|
self.status | Returns enum string value (not symbol, and not integer db value) |
self.status=<enum_string_value or integer_value> | Set the status to corresponding enum integer value using either a string, symbol, or integer. If you use an invalid value, you get an ArgumentError. String/symbol is converted to corresponding integer value. |
self.active! | Sets the status enum to "active". This syntax is a bit confusing in that you don't see the attribute you're assigning! ArgumentError if invalid enum. |
self.active? | equivalent to (self.status == "active"), and *not* equivalent to (self.status == :active) due to symbols not being equal to strings! |
Conversation.active | equivalent to Conversation.where(status: "active") . Again, it's a bit confusing not to see the column being queried. |
Conversation.statuses | Mapping of enum names to ordinal values { "active" => 0, "archived" => 1 } , of type HashWithIndifferentAccess , meaning you can use symbols or strings as keys |
Default Values for Enums
As the docs say, it's a good idea to use the default value from the database declaration, like:
create_table :conversations do |t|
  t.column :status, :integer, default: 0, null: false
end
More specifically, consider making the first declared status (enum db value zero) the default and not allowing null values. I've found that when I've allowed null values in enums, it makes all my code more complicated. This is an example of the Null Object Pattern. Nulls in your data, and checking for them in your code, will make your life more difficult! Instead, have an enum value for "I don't know" if that really is a possibility, and make it the first value, which has an index of zero, so you can set it as the database column default.
Queries on Enums
The docs say:
In rare circumstances you might need to access the mapping directly. The mappings are exposed through a class method with the pluralized attribute name
Conversation.statuses # => { "active" => 0, "archived" => 1 }
This is not rare! This is critical!
For example, suppose you want to query where the status is not "archived":
You might be tempted to think that Rails will be smart enough to figure out this query:
Conversation.where("status <> ?", "archived")
Rails is not smart enough to know that the ? is for status and that status is an enum. So you have to use this syntax:
Conversation.where("status <> ?", Conversation.statuses[:archived])
You might be tempted to think that this would work:
Conversation.where.not(status: :archived)
That throws an ArgumentError. Rails wants an integer and not a symbol, and Symbol does not define to_i.
What's worse is this one:
Conversation.where.not(status: "archived")
The problem is that ActiveRecord sees that the enum column is of type integer and calls #to_i on the value, so "archived".to_i gets converted to zero. In fact, all your enums will get converted to zero! And if you use the value of the enum attribute on an ActiveRecord instance (say a Conversation object), then you're using a string value!
If you're curious what the Rails source is, then take a look here: ActiveRecord::Type::Integer.
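You can check the collapse to zero in plain Ruby without touching a database. This sketch only mimics a to_i-based cast on a hand-written copy of the statuses mapping; it does not call any ActiveRecord code:

```ruby
# Hand-written copy of what Conversation.statuses returns.
statuses = { "active" => 0, "archived" => 1 }

# What a to_i-based integer cast does to each enum name:
cast = statuses.keys.map { |name| [name, name.to_i] }.to_h
# cast is { "active" => 0, "archived" => 0 }, so every name becomes 0

# Symbols cannot be cast this way at all, since Symbol has no to_i method.
symbol_castable = :archived.respond_to?(:to_i)
```

Every enum name maps to 0 after the cast, so only the first declared value ever produces the query you meant.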
Here's a guaranteed broken bit of code:
# my_conversation.status is a String!
Conversation.where.not(status: my_conversation.status)
You'd think that Rails would be clever enough to see that the key maps to an enum and then check if the comparison value is a String, and then it would not call to_i on the String! Instead, we are effectively running this code:
Conversation.where.not(status: 0)
An acceptable alternative to the last code example would be:
Conversation.where.not(status: Conversation.statuses[my_conversation.status])
If you left out the not, you could also do:
Conversation.send(my_conversation.status)
However, I really would like to simply do these, all of which DO NOT work:
Conversation.where(status: my_conversation.status)
Conversation.where(status: :archived)
Conversation.where(status: "archived")
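Until queries like these work, one workaround is to translate the conditions hash yourself before handing it to where. Here is a minimal sketch; translate_enum_conditions is a hypothetical helper of mine (not a Rails method), and the mapping hash is a hand-written stand-in for { status: Conversation.statuses }:

```ruby
# Hypothetical helper: replaces enum names in a conditions hash with the
# integers from the enum mapping, leaving every other condition untouched.
def translate_enum_conditions(conditions, enum_mappings)
  conditions.map do |column, value|
    mapping = enum_mappings[column]
    if mapping && mapping.key?(value.to_s)
      [column, mapping.fetch(value.to_s)] # "archived" or :archived becomes 1
    else
      [column, value]                     # integers and non-enum columns pass through
    end
  end.to_h
end

# Stand-in for { status: Conversation.statuses } in a real application.
enum_mappings = { status: { "active" => 0, "archived" => 1 } }

translated = translate_enum_conditions({ status: :archived, user_id: 25 }, enum_mappings)
# translated is { status: 1, user_id: 25 }
```

In a Rails app the translated hash could then be passed along, as in Conversation.where(translated), so the column being queried stays visible.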
Pluck vs Map with Enums
Here's another subtle issue with enums.
Should these two lines of code give the same result or a different result?
statuses_with_map = Conversation.select(:status).where.not(status: nil).distinct.map(&:status)
statuses_with_pluck = Conversation.distinct.where.not(status: nil).pluck(:status)
It's worth experimenting with this in the Pry console!
In the first case, with map, you get back an Array with 2 strings: ["active", "archived"]. In the second case, with pluck, you get back an Array with 2 integers: [0, 1].
What's going on here?
In the code where map calls the status method on each Conversation record, the status method converts the database integer value into the corresponding String value!
In the other code that uses pluck, you get back the raw database value. It's arguable whether or not Rails should intelligently transform this value into the string equivalent, since that is what is done in other uses of ActiveRecord. Changing this would be problematic, as there could be code that depends on getting back the numerical value.
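If you want the names after a pluck, you can map the raw integers back through the enum mapping yourself. A small plain-Ruby sketch, with hand-written stand-ins for Conversation.statuses and for the plucked values:

```ruby
# Stand-ins: statuses mimics Conversation.statuses, plucked mimics the
# raw integers that pluck(:status) returns.
statuses = { "active" => 0, "archived" => 1 }
plucked  = [0, 1]

# Hash#key does the reverse lookup from database integer to enum name.
names = plucked.map { |db_value| statuses.key(db_value) }
# names is ["active", "archived"]
```

This is the same reverse lookup the generated status getter performs for a single record.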
find_or_initialize_by, oh my!!!
Let's suppose we have this persisted in the database:
Conversation {
:id => 18,
:user => 25
:status => "archived" (1 in database)
}
And then we do a find_or_initialize_by:
[47] (pry) main: 0> conversation = Conversation.find_or_initialize_by(user: 25, status: "archived")
Conversation Load (4.6ms) SELECT "conversations".* FROM "conversations"
WHERE "conversations"."user_id" = 25
AND "conversations"."status" = 0 LIMIT 1
#<Conversation:> {
:id => nil,
:user_id => 25,
:status => "archived"
}
We got nil for :id, meaning that we're creating a new record. Wouldn't you expect to find the existing record? Well, maybe not given the way that ActiveRecord.where works, per the above discussion.
Next, the status on the new record is created with "archived", which is value 1. Hmmm… If you look closely above, the query uses
AND "conversations"."status" = 0
Let's look at another example:
Conversation {
:id => 19,
:user => 26
:status => "active" (0 in database)
}
And then we do a find_or_initialize_by:
[47] (pry) main: 0> conversation = Conversation.find_or_initialize_by(user: 26, status: "active")
Conversation Load (4.6ms) SELECT "conversations".* FROM "conversations"
WHERE "conversations"."user_id" = 26
AND "conversations"."status" = 0 LIMIT 1
#<Conversation:> {
:id => 19,
:user_id => 26,
:status => "active"
}
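Wow! This time the existing record was found, but only because "active".to_i is 0, which happens to be the database value for "active". Is this a source of subtle bugs and some serious yak shaving?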
Note, the above applies equally to ActiveRecord.find_or_create_by.
It turns out that the Rails methods that allow creation of a record via a Hash of attributes will convert the enum strings to the proper integer values, but this is not the case when querying!
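The mismatch between the two halves can be simulated in plain Ruby. In this sketch the rows array stands in for the conversations table, and find_or_initialize is a hypothetical function of mine that mimics the behavior described above: the query side casts the name with to_i, while the initialize side maps the name through the enum.

```ruby
# Hypothetical simulation of find_or_initialize_by with an enum string.
def find_or_initialize(rows, statuses, user_id, status_name)
  # Query side: the string is cast with to_i, so any name becomes 0.
  queried_status = status_name.to_i
  found = rows.find do |row|
    row[:user_id] == user_id && row[:status] == queried_status
  end
  return found if found

  # Initialize side: the enum setter maps the name to its real integer.
  { id: nil, user_id: user_id, status: statuses.fetch(status_name) }
end

statuses = { "active" => 0, "archived" => 1 }
rows = [
  { id: 18, user_id: 25, status: 1 }, # archived
  { id: 19, user_id: 26, status: 0 }  # active
]

archived = find_or_initialize(rows, statuses, 25, "archived")
active   = find_or_initialize(rows, statuses, 26, "active")
```

The "archived" lookup misses row 18 and yields a new record with a nil id, while the "active" lookup finds row 19 purely because its database value is 0.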
Rails Default Accessors For Setting Attributes
You may find it useful to know which Rails methods call the "Default Accessor" versus just going to the database directly. That makes all the difference in terms of whether or not you can/should use the string values for enums.
The key thing is that "Uses Default Accessor" means that string enums get converted to the correct database integer values.
Method | Uses Default Accessor (converts string enums to integers!) |
---|---|
attribute= | Yes |
write_attribute | No |
update_attribute | Yes |
attributes= | Yes |
update | Yes |
update_column | No |
update_columns | No |
Conversation::update | Yes |
Conversation::update_all | No |
For more information on this topic, see
- Different Ways to Set Attributes in ActiveRecord by @DavidVerhasselt.
- Official API of ActiveRecord::Base
- Official Readme of Active Record – Object-relational mapping put on rails.
While these don't mention Rails enums, it's critical to understand that enums create default accessors that do the mapping to and from Strings.
So when you call these methods, the default accessors are used:
conversation.status = "archived"
conversation.status = 1
puts conversation.status # prints "archived"
So keep in mind, per the table above, when those default accessors are used and when they are bypassed.
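Here is a plain-Ruby sketch of why that distinction matters. This Conversation class is a hypothetical stand-in, not an ActiveRecord model: status= plays the role of the default accessor, and raw_write and raw_read are made-up names for a write and read that skip it.

```ruby
# Hypothetical stand-in for a model with an enum, in plain Ruby.
class Conversation
  STATUSES = { "active" => 0, "archived" => 1 }.freeze

  def initialize
    @attributes = { status: 0 }
  end

  # "Default accessor": translates an enum name (or accepts a valid
  # integer) and stores the integer.
  def status=(value)
    if STATUSES.key?(value.to_s)
      @attributes[:status] = STATUSES.fetch(value.to_s)
    elsif STATUSES.value?(value)
      @attributes[:status] = value
    else
      raise ArgumentError, "'#{value}' is not a valid status"
    end
  end

  def status
    STATUSES.key(@attributes[:status])
  end

  # Made-up analogue of a write that bypasses the accessor: no translation.
  def raw_write(name, value)
    @attributes[name] = value
  end

  def raw_read(name)
    @attributes[name]
  end
end

conversation = Conversation.new
conversation.status = "archived"
via_accessor = conversation.raw_read(:status) # translated to the integer

conversation.raw_write(:status, "archived")
via_raw = conversation.raw_read(:status)      # the untranslated string
```

Going through the accessor stores 1, while the raw write stores the string "archived" as-is, which is the kind of value a "No" row in the table would hand to the database.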
Deep Dive: Enum Source
If you look at the Rails source code for ActiveRecord::Enum, you can see this at line 91, for the setter of the enum (I added some comments):
_enum_methods_module.module_eval do
  # def status=(value) self[:status] = statuses[value] end
  define_method("#{name}=") { |value|
    if enum_values.has_key?(value) || value.blank?
      # set the db value to the integer value for the enum
      self[name] = enum_values[value]
    elsif enum_values.has_value?(value) # values contains the integer
      self[name] = value
    else
      # enum_values did not have the key or value passed
      raise ArgumentError, "'#{value}' is not a valid #{name}"
    end
  }
From this definition, you see that both of these work:
conversation.status = "active"
conversation.status = 0
Here's the definition for the getter, which I've edited a bit for illustrative purposes:
# def status() statuses.key self[:status] end
define_method(name) do
  db_value = self[name] # such as 0 or 1
  enum_values.key(db_value) # the key value, like "archived" for db_value 1
end
Recommendations to the Rails Core Team
In response to this issue, I submitted this GitHub issue: Rails where query should see value is an enum and convert a string #17226.
- @Bounga and @rafaelfranca on GitHub suggest that we can't automatically convert enum string values in queries. I think that is true for converting cases of a ? or a named param, but I suspect that a quick map lookup to see that the attribute is an enum, and a string is passed, and then converting the string value to an integer is the right thing to do for 2 reasons:
  - This is the sort of "magic" that I expect from Rails.
  - Existing methods find_or_initialize_by and find_or_create_by will result in obscure bugs when string params are passed for enums.

  However, it's worth considering if all default accessor methods (setters) should consistently be called for purposes of passing values in a map to such methods. I would venture that Rails enums are some Rails-provided magic, and thus they should have a special case. If this shouldn't go into Rails, then possibly a gem extension could provide a method like Model.where_with_enum which would convert a String into the proper numerical value for the enum. I'm not a huge fan of the generated Model scopes for enums, as I like to see what database field is being queried against.
- Aside from adding automatic conversion of the enum hash attributes, I recommend we change the automatic conversion of Strings to integers to use the stricter Integer(some_string) rather than some_string.to_i. The difference is considerable: String#to_i is extremely permissive. Try it in a console. With the to_i method, any number characters at the beginning of the String are converted to an Integer. If the first character is not a number, 0 is returned, which is almost certainly a default enum value. Thus, this simple change would make it extremely clear when an enum string is improperly used. I would guess that this would make some existing code crash, but in all circumstances for a valid reason. Whether this change should be done for all integer attributes is a different discussion, as that could have backwards compatibility ramifications. This change would require changing the tests in ActiveRecord::ConnectionAdapters::TypesTest. For example, this test: assert_equal 0, type.type_cast_from_user('bad') would change to throw an exception, unless the cases are restricted to using Integer() for enums. It is inconsistent that some type conversions throw exceptions, such as converting a symbol to an integer. Whether or not they should is a much larger issue. In the case of enums, I definitely believe that a proper enum string value should not silently convert to zero every time.
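To see how different the two conversions are, here is a short plain-Ruby comparison. The strict_integer_cast wrapper is a hypothetical name of mine, only meant to sketch what a stricter cast could look like; it is not an existing Rails method.

```ruby
# Hypothetical strict cast: Kernel#Integer raises instead of guessing.
def strict_integer_cast(value)
  Integer(value)
rescue ArgumentError, TypeError
  raise ArgumentError, "#{value.inspect} cannot be cast to an integer"
end

# String#to_i is permissive: leading digits are kept, anything else is 0.
permissive = ["archived", "12abc", "7"].map(&:to_i)
# permissive is [0, 12, 7]

# The strict version refuses the enum name instead of returning 0.
strict_error =
  begin
    strict_integer_cast("archived")
    nil
  rescue ArgumentError => error
    error.message
  end
```

With the strict cast, passing "archived" where an integer is expected fails loudly right away, while a real numeric string such as "7" still converts.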
Conclusion
I hope this article has convinced you that understanding Ruby is worth as much as understanding Rails. Additionally, the new Enum feature in 4.1 requires some careful attention!
Thanks to Hack Hands for supporting the development of this content. You can find a copy of this article in their blog.