R Module 4 – R Programming Structure

This week’s assignment was to create a boxplot graph and a histogram from a set of patient data. The full program is here:

https://github.com/jered0/lis4930-rpackage/blob/master/module4/patientdata.r

I wanted to use the data as presented (changing only a couple of curly quotes to straight quotes), then convert it within the program into numeric data that could be plotted. It was more work than expected, and there is probably a more efficient way to do it than the one I ended up with.

The first big obstacle was converting text to numeric values. I worked it out using ifelse(), which tests each text value and returns a number in its place.
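As a rough illustration of that ifelse() approach, here is a minimal sketch. The vector name first_opinion and the "bad"/"good" labels are made up for the example; the actual values and variable names are in patientdata.r.

```r
# Hypothetical text ratings (illustrative only, not the assignment data)
first_opinion <- c("bad", "bad", "good", "good", "bad")

# ifelse() is vectorized: every element is tested and mapped to 1 or 0
first_numeric <- ifelse(first_opinion == "bad", 1, 0)

print(first_numeric)   # 1 1 0 0 1
```

Because ifelse() works on the whole vector at once, no loop is needed, and the result can go straight into a plotting function.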

The other big obstacle involved data types. When I created a data frame from the values, R stored them as “factor” types, which couldn’t be plotted. Converting a factor directly to a numeric type gave strange values, because as.numeric() returns the factor’s internal level codes rather than the original numbers. To work around that, I converted the values to characters first, then to the numeric type.
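The factor problem can be reproduced in a few lines. This is only a sketch with made-up readings; the column name bp is an assumption, and stringsAsFactors = TRUE is set explicitly to mimic the default that data.frame() had before R 4.0.0.

```r
# Hypothetical readings entered as text (illustrative only)
patients <- data.frame(bp = c("103", "87", "32", "205"),
                       stringsAsFactors = TRUE)

class(patients$bp)                    # "factor"

# Direct conversion returns the level codes, not the readings
codes <- as.numeric(patients$bp)
print(codes)                          # 1 4 3 2

# Going through character first recovers the original numbers
readings <- as.numeric(as.character(patients$bp))
print(readings)                       # 103 87 32 205
```

The codes come out as 1 4 3 2 because the factor levels are sorted as text ("103", "205", "32", "87"), and each value is replaced by the position of its level.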

I also went overboard on the boxplot. Looking for ways to make it useful to a doctor reviewing it, I wanted to use the boxplot itself to show the average of the two doctors’ opinions on the patient condition. To make that more meaningful, I plotted two other values as points on that graph – one for the ER priority assigned to the patient (so doctor opinions could be contrasted with the patient’s ER condition), and one for blood pressure (so a doctor could see whether there was a correlation between BP and patient condition).

Then I had to learn how to use a legend, because it felt wrong putting so many colored dots on the graph without a legend to explain them.
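For anyone curious how a boxplot, overlaid points, and a legend fit together in base R, here is a small sketch. All values, the 0-to-1 scaling, the colors, and the variable names are assumptions made for the example, not what patientdata.r actually does.

```r
# Made-up values for illustration only
avg_opinion <- c(0.5, 1, 0, 0.5, 1, 0.5)      # mean of two doctors' 0/1 ratings
er_priority <- c(0, 1, 0, 1, 1, 0)            # hypothetical ER priority (0 = low, 1 = high)
bp_scaled   <- c(0.4, 0.9, 0.2, 0.5, 0.8, 0.3) # hypothetical BP rescaled to 0..1

# Draw the boxplot first; it sets up the plot region at x = 1
box_stats <- boxplot(avg_opinion,
                     ylim = c(0, 1),
                     main = "Doctor opinions vs. ER priority and BP",
                     ylab = "Condition (0 = good, 1 = bad)")

# Overlay the other two measures as points, nudged left and right of the box
points(rep(0.8, length(er_priority)), er_priority, col = "red",  pch = 19)
points(rep(1.2, length(bp_scaled)),   bp_scaled,   col = "blue", pch = 19)

# The legend explains what each color of dot represents
legend("topright",
       legend = c("ER priority", "Blood pressure (scaled)"),
       col    = c("red", "blue"),
       pch    = 19)
```

points() and legend() both draw on top of whatever plot is already open, so the order matters: boxplot() first, then the extras.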

The histogram, by comparison, was straightforward – I plotted only one value (frequency of patient visits) so doctors could see how frequently patients tended to come in for examinations. Then it was just a matter of setting axis labels and such.
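A histogram along those lines might look like the sketch below. The visit_freq values, labels, and color are invented for the example rather than taken from the assignment data.

```r
# Hypothetical visit frequencies (illustrative only)
visit_freq <- c(0.6, 0.3, 0.4, 0.4, 0.2, 0.6, 0.3, 0.4, 0.9, 0.2)

# hist() bins the values and draws the bars; labels are set in the same call
visit_hist <- hist(visit_freq,
                   main = "Frequency of patient visits",
                   xlab = "Visit frequency",
                   ylab = "Number of patients",
                   col  = "lightblue")
```

hist() also returns the bin breaks and counts invisibly, so assigning the result (as visit_hist does here) makes it easy to check what was plotted.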
