Behind the Scenes with nil? and empty?
At their most basic, the Ruby methods nil? and empty? are just content checks with slightly different focuses. But I thought it would be interesting, and maybe even useful, to look at how each actually operates.
Why are you writing this?
Mostly just for shits and giggles, and because I was curious about the core implementations of these methods.
nil?
First, let’s take a look at nil?. It is the simplest of the methods covered here and is easy to understand at the source code level. First we should know what the method does. Calling nil? simply checks the object it is called on and returns whether that object is nil or not.
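A few calls in plain Ruby illustrate that behaviour. This is only a small sketch; the `results` hash and its labels are names chosen here for illustration.

```ruby
# Only the nil object itself answers true to nil?.
results = {
  "nil"      => nil.nil?,
  "false"    => false.nil?,   # false is falsy, but it is not nil
  "0"        => 0.nil?,
  '""'       => "".nil?,      # an empty string is still an object
  "[]"       => [].nil?,
  "{}"       => {}.nil?,
  "NilClass" => NilClass.nil?
}

results.each { |label, value| puts "#{label}.nil? # => #{value}" }
```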
If you accept that nil? will only return true if its receiver is nil, then everything in this result set should make sense. The only thing that might give you pause is NilClass.nil? # => false, but this should be quickly cleared up by seeing that NilClass is the class of the singleton object nil in Ruby. This simply means that NilClass has exactly one instance, nil, and that instance is only reached through the keyword nil.
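The singleton relationship can be poked at from Ruby itself. The variable names below are just for illustration; the sketch assumes a stock Ruby interpreter with nothing monkeypatched.

```ruby
# nil is the sole instance of NilClass.
nil_class   = nil.class          # NilClass
same_object = nil.equal?(nil)    # every `nil` refers to the same object

# NilClass cannot be instantiated by hand: `new` is not defined on it.
can_instantiate =
  begin
    NilClass.new
    true
  rescue NoMethodError
    false
  end

# NilClass itself is an instance of Class, not of NilClass, so it is not nil.
class_of_nil_class = NilClass.class

puts "nil.class            # => #{nil_class}"
puts "NilClass.class       # => #{class_of_nil_class}"
puts "NilClass.new works?  # => #{can_instantiate}"
```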
To implement this functionality, Ruby does four things in its source:
Most of this is pretty straightforward, but there are a few things to make note of. VALUE is the C type Ruby uses to refer to an object (usually a pointer, although a few special values such as nil, true and false are encoded directly in it), so its use in this code simply states that both functions take in an object and return an object. More specifically, the rb_true and rb_false functions take in any Ruby object and return the single instance of TrueClass or FalseClass respectively.
Qtrue and Qfalse are VALUE constants that represent the singleton objects of the Ruby classes TrueClass and FalseClass. This is significant only in that it is useful to note that Ruby’s true and false are not the same thing as C’s true and false, although they can be converted.
Next, you’ll probably note the strange function rb_define_module. This function creates a module, in this case Kernel, which is mixed into the Object class defined by rb_cObject = rb_define_class("Object", rb_cBasicObject);. Methods that are core to the Object class and don’t extend to other classes are defined directly on the Object class, whereas methods that do extend are defined on the Kernel module.
Lastly, rb_define_method takes a class, the method’s name as a C string, a pointer to the function that produces the return value, and the number of arguments the method expects.
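The overall arrangement can be modelled in plain Ruby: one always-false method on Kernel, and one always-true method on NilClass that wins method lookup for nil. This is only a sketch of the idea, not the interpreter's actual code, and `my_nil?` is a made-up name used so the real nil? is left untouched.

```ruby
# Hypothetical model of the pattern described above.
# Every object picks up the Kernel version through Object...
module Kernel
  def my_nil?
    false
  end
end

# ...except nil, whose own class defines a version that is found first.
class NilClass
  def my_nil?
    true
  end
end

[nil, false, 0, "", NilClass].each do |obj|
  puts "#{obj.inspect}.my_nil? # => #{obj.my_nil?}"
end
```

Method lookup checks NilClass before it reaches Kernel, so nil never sees the always-false version.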
empty?
empty? is a core Ruby method, but it is only implemented on certain Ruby object types. It is not implemented on the Object class and then inherited by all other classes. Instead it is implemented only on classes that can have a length or count. First, a few examples.
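The following sketch shows typical results in plain Ruby (no Rails or ActiveSupport loaded). The `empty_results` hash and the two `*_responds` variables are illustrative names.

```ruby
empty_results = {
  '""'    => "".empty?,
  '" "'   => " ".empty?,     # whitespace still counts as content
  "[]"    => [].empty?,
  "[nil]" => [nil].empty?,   # one element, even though that element is nil
  "{}"    => {}.empty?,
  ':""'   => :"".empty?
}

# Objects without a length or count do not define empty? at all.
nil_responds = nil.respond_to?(:empty?)
int_responds = 1.respond_to?(:empty?)

empty_results.each { |label, value| puts "#{label}.empty? # => #{value}" }
puts "nil responds to empty? # => #{nil_responds}"
puts "1 responds to empty?   # => #{int_responds}"
```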
Again, going through how these results are generated will give a deeper understanding of what is actually happening when we make even these basic method calls.
First, let’s look at the source code for the empty? methods.
This looks fairly similar to what we saw before with the nil? definition. There’s a little more logic, but it’s really not too hard to work through.
First we see that the function takes in a VALUE and returns a VALUE. In this case it takes in an object that is expected to be a string and returns either Qtrue or Qfalse. When we look at the logic inside the function we can see that it does exactly what we’d expect it to do. It checks the string’s length using RSTRING_LEN, a macro Ruby’s C source provides for reading a string’s length in bytes, and returns Qtrue if that length is 0 and Qfalse otherwise.
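The same check can be modelled in Ruby as a comparison of the byte length against zero. The helper `str_empty?` is a hypothetical name written for this sketch; it is a model of the logic described above, not the interpreter's code.

```ruby
# Hypothetical helper modelling the C check: byte length equal to zero.
def str_empty?(str)
  str.bytesize.zero?
end

samples = ["", " ", "ruby", "\u00e9"]

samples.each do |s|
  puts "#{s.inspect}: model=#{str_empty?(s)} builtin=#{s.empty?}"
end
```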
Next, looking at the array empty? function we see very similar code. Again, the function takes in and returns VALUE, and again it uses a length macro, this time RARRAY_LEN, to check the length of the object. RARRAY_LEN simply reads the number of elements recorded in the array’s internal structure. It is worth noting that an empty array in Ruby does not correspond to an empty array in C.
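From the Ruby side, the array check tracks the element count rather than what the elements are. A short sketch, with variable names chosen for illustration:

```ruby
empty_array  = []
padded_array = Array.new(3)   # three nil elements, so the length is 3
cleared      = [1, 2, 3]
cleared.clear                 # removes every element in place

[empty_array, padded_array, cleared].each do |arr|
  puts "#{arr.inspect}: length=#{arr.length} empty?=#{arr.empty?}"
end
```

An array full of nils is therefore not empty, while an array that has been cleared is.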
Again, in the hash version of the function everything looks very similar. However, there is a small difference. This time we aren’t comparing the hash’s length to zero; instead we are checking directly that it is empty. A likely reason is that a hash’s internal table is a more involved structure than an array’s: with an array you can simply read the stored element count, whereas with a hash the entries live in a separate table that has to be consulted.
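Seen from Ruby, a hash is empty exactly when it holds no entries, regardless of any default value. The variable names here are illustrative only.

```ruby
plain     = {}
with_nil  = { key: nil }   # one entry, even though its value is nil
defaulted = Hash.new(0)    # a default value is not an entry
defaulted[:missing]        # reading the default does not store anything

{ "plain" => plain, "with_nil" => with_nil, "defaulted" => defaulted }.each do |label, hash|
  puts "#{label}: size=#{hash.size} empty?=#{hash.empty?}"
end
```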
Finally, looking at the function for Symbol#empty? we start to see why it works the way it does. The short answer is that it simply converts the symbol to a string before checking whether it is empty. But this has a few implications. First, working with the string form of a symbol involves more memory than the symbol alone. It’s also worth noting that the conversion actually requires a chain of function calls. First the symbol’s ID must be retrieved; this is what SYM2ID is doing. Next that ID is looked up in the global symbol table to get its string, and finally the string’s length is checked.
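The equivalence between the symbol check and the check on its string form is easy to observe from Ruby. This is a quick sketch; the `symbols` list is arbitrary.

```ruby
symbols = [:"", :a, :empty?]

symbols.each do |sym|
  puts "#{sym.inspect}: empty?=#{sym.empty?} via to_s=#{sym.to_s.empty?}"
end
```

Only the empty symbol :"" answers true, just as its string form "" does.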
In the grand scheme of things none of this is particularly costly, but it is enlightening to see what is actually going on behind the scenes.