This week we looked at S3 and S4 objects in R.
My code for the exercise is on GitHub here:
https://github.com/jered0/lis4930-rpackage/blob/master/module7/s3s4objects.r
I used the trees dataset bundled with R for the exercise.
The trees dataset works with several generic functions, but not all. For example, summary(trees) returns:
     Girth           Height       Volume
 Min.   : 8.30   Min.   :63   Min.   :10.20
 1st Qu.:11.05   1st Qu.:72   1st Qu.:19.40
 Median :12.90   Median :76   Median :24.20
 Mean   :13.25   Mean   :76   Mean   :30.17
 3rd Qu.:15.25   3rd Qu.:80   3rd Qu.:37.30
 Max.   :20.60   Max.   :87   Max.   :77.00
However, mean(trees) doesn’t work (in current versions of R it returns NA with a warning) because trees is a data frame, which is a list of columns, not a numeric vector.
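Here is a minimal sketch of that difference, using only base R. The variable names (result, col_means) are my own for illustration and are not from the exercise code.

```r
# trees is a data frame that ships with R in the "datasets" package.

# summary() has a method for data frames, so this prints a per-column summary.
summary(trees)

# mean() has no method for data frames. In current R versions it warns and
# returns NA, so the warning is suppressed here to keep the output clean.
result <- suppressWarnings(mean(trees))
is.na(result)

# To get a mean per column, use a function that works column by column.
col_means <- colMeans(trees)
print(col_means)
```

sapply(trees, mean) gives the same per-column means, since it calls mean() on each numeric column in turn.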
The trees dataset can be represented with both S3 and S4 classes. The data is straightforward (three numeric values per row), so it works well both as an S3-style list and as distinct values in an S4 class. In my code for the exercise I define both S3 and S4 classes that can contain records from the trees dataset, including a print() method for the S3 class and a show() method for the S4 class, and use apply() to map each row of the dataset to an object of each class.
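The following is a rough sketch of that approach, not a copy of the code in the repository. The S3 class name trees_s3, the constructor new_trees_s3(), and the wording of the printed output are assumptions for illustration. The S4 class name and slot names follow the otype() example further down.

```r
library(methods)  # S4 support; attached by default in most R sessions

# S3: a constructor that returns a list with a class attribute.
# "trees_s3" and new_trees_s3() are hypothetical names.
new_trees_s3 <- function(Girth, Height, Volume) {
  structure(list(Girth = Girth, Height = Height, Volume = Volume),
            class = "trees_s3")
}

# S3 print method, found by dispatch on the class attribute.
print.trees_s3 <- function(x, ...) {
  cat("S3 tree: girth", x$Girth, "height", x$Height, "volume", x$Volume, "\n")
  invisible(x)
}

# S4: a formal class with three typed slots.
setClass("trees_s4",
         representation(Girth = "numeric", Height = "numeric", Volume = "numeric"))

# S4 show method, used when an object of the class is printed.
setMethod("show", "trees_s4", function(object) {
  cat("S4 tree: girth", object@Girth, "height", object@Height,
      "volume", object@Volume, "\n")
})

# apply() with MARGIN = 1 passes each row to the function as a named
# numeric vector, so values can be picked out by column name.
s3_trees <- apply(trees, 1, function(r) {
  new_trees_s3(r[["Girth"]], r[["Height"]], r[["Volume"]])
})
s4_trees <- apply(trees, 1, function(r) {
  new("trees_s4", Girth = r[["Girth"]], Height = r[["Height"]], Volume = r[["Volume"]])
})

print(s3_trees[[1]])
print(s4_trees[[1]])
```

Both calls to apply() return a plain list with one object per row of the dataset.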
Exercise questions
1. How do you tell what OO system (S3 vs. S4) an object is associated with?
You can check which class system was used to create an object using the otype() function from the “pryr” package. For example, the following command, using the S4 class from my exercise code, would yield a result of “S4”:
otype(new("trees_s4", Girth=1, Height=2, Volume=3))
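If pryr isn’t installed, base R can give a rough equivalent. The sketch below assumes a helper I’ve called which_system() (a hypothetical name, not part of any package) and a hypothetical S3 class name trees_s3. Unlike otype(), it does not distinguish reference classes from S4.

```r
library(methods)

# Hypothetical S3 object: a list with a class attribute.
s3_obj <- structure(list(Girth = 1, Height = 2, Volume = 3), class = "trees_s3")

# S4 class with the same slot names as the otype() example.
setClass("trees_s4",
         representation(Girth = "numeric", Height = "numeric", Volume = "numeric"))
s4_obj <- new("trees_s4", Girth = 1, Height = 2, Volume = 3)

# Hypothetical helper: a base-R approximation of otype().
# is.object() is TRUE when a class attribute is set; isS4() detects S4 objects.
which_system <- function(x) {
  if (!is.object(x)) {
    "base"
  } else if (isS4(x)) {
    "S4"
  } else {
    "S3"
  }
}

which_system(s4_obj)  # "S4"
which_system(s3_obj)  # "S3"
which_system(trees)   # "S3", because a data frame is an S3 object
which_system(1)       # "base"
```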
2. How do you determine the base type (like integer or list) of an object?
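You can determine the base type of an object with the typeof() function, or with the older mode() function, which groups some types together (it reports both integers and doubles as numeric). Running mode(trees) will show that the trees dataset is a list, for example, while mode(1) will show that 1 is a numeric type.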
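A quick comparison of what base R reports for a few values. The variable names are only there so the results can be inspected.

```r
# mode() and typeof() agree for a data frame: underneath, it is a list.
trees_mode   <- mode(trees)    # "list"
trees_typeof <- typeof(trees)  # "list"

# class() reports the S3 class instead of the base type.
trees_class  <- class(trees)   # "data.frame"

# For numbers, typeof() is more specific than mode().
dbl_mode   <- mode(1)     # "numeric"
dbl_typeof <- typeof(1)   # "double"
int_mode   <- mode(1L)    # "numeric"
int_typeof <- typeof(1L)  # "integer"

print(c(trees_mode, trees_typeof, trees_class,
        dbl_mode, dbl_typeof, int_mode, int_typeof))
```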
3. What is a generic function?
A generic function is a function that can be implemented differently for each class but is called through a single dispatcher. Calling the generic print() function, for example, will cause the interpreter to check the class of the object in question for its own implementation of print(), falling back to a default method if there isn’t one. Generic functions offer a uniform interface that can be used without regard for implementation.
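As a small illustration of S3 dispatch, here is a made-up generic called describe(). The generic and its methods are hypothetical and exist only for this example.

```r
# The generic itself only dispatches: UseMethod() picks a method based on
# the class of the first argument.
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no class-specific method exists.
describe.default <- function(x, ...) {
  "some object"
}

# Method for data frames.
describe.data.frame <- function(x, ...) {
  paste("data frame with", nrow(x), "rows and", ncol(x), "columns")
}

# Method for numeric vectors.
describe.numeric <- function(x, ...) {
  paste("numeric vector of length", length(x))
}

describe(trees)        # "data frame with 31 rows and 3 columns"
describe(trees$Girth)  # "numeric vector of length 31"
describe("a")          # "some object"
```

The caller always writes describe(x), and R decides which implementation to run.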
4. What are the main differences between S3 and S4?
An S3 object is essentially a base object, usually a list, with a class attribute. That makes the implementation very flexible, as it’s not picky about how objects are defined and used. Code can add more items to a particular object’s list without a problem.
An S4 class is less flexible, requiring that the fields on an object (called slots) be limited to those established in its class definition. Slots in an S4 class are also strictly typed, throwing an error if a value of the wrong type is assigned to them.
An S4 object is not a list but an instance of a formally defined class. While that stricter implementation is more restrictive than S3, the well-defined type checking can prevent unexpected errors in code using an S4 class.
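The sketch below shows that contrast under a few assumptions: the S3 class name trees_s3 and the extra Label element are hypothetical, and the S4 class reuses the slot names from the otype() example. The values are the first row of the trees dataset.

```r
library(methods)

# S3: a list with a class attribute. Nothing stops later changes.
s3_obj <- structure(list(Girth = 8.3, Height = 70, Volume = 10.3),
                    class = "trees_s3")
s3_obj$Label <- "tree 1"  # adding an element that was never declared is fine
s3_obj$Girth <- "wide"    # so is replacing a number with a string

# S4: slots and their types are fixed by the class definition.
setClass("trees_s4",
         representation(Girth = "numeric", Height = "numeric", Volume = "numeric"))
s4_obj <- new("trees_s4", Girth = 8.3, Height = 70, Volume = 10.3)

# Assigning the wrong type to a slot raises an error, captured here as text.
type_error <- tryCatch({
  s4_obj@Girth <- "wide"
  NULL
}, error = function(e) conditionMessage(e))

# Assigning to a slot that is not in the class definition also raises an error.
slot_error <- tryCatch({
  s4_obj@Label <- "tree 1"
  NULL
}, error = function(e) conditionMessage(e))

print(type_error)
print(slot_error)
```

After both failed assignments, s4_obj still holds its original numeric values.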