R 3.3 – Access or Create Columns in Data Frames, or Simplify a Data Frame using aggregate()

October 10, 2019

In this video, I’ll take a look at three functions: with, within, and aggregate. To explore these functions, I’m going to load in the Mario Kart data set.

The with function can be used to access variables in a data frame with ease. This is especially important when the name of the data frame is long and multiple variables are being accessed for a calculation. I can read this command as: with mariokart, compute the total price minus the shipping price.

The within function is similar
to with, except that it’s used to create new columns and merge them with the original data set. For instance, here I’ll add an auction closing price to the data frame and store the result in a new data frame called mk. Notice that the commands to generate the new variables go inside braces as the second argument. If I had wanted, I could’ve added multiple variables by creating more variables inside of the braces. As you can probably guess, the with function is useful for one-off calculations, while within is useful for expanding a data set to include new variables.

When I want to aggregate or
summarize results, I often use the aggregate function. Here, I’ll aggregate across the number of wheels and the condition variables. The first argument is a formula. The dot on the left side of the tilde means keep all the variables for the output, and the variables on the right side, separated by a plus sign, indicate which variables to aggregate across. The second argument is a data frame, and the third argument is the function used to aggregate the variables. Had I only wanted to summarize one variable, say, the total price, I could’ve replaced the dot with the total price variable. Notice the similarities between what the aggregate and tapply functions produce. Which you should use depends on what you will use the output for.

Congratulations. You’ve finished section three. The next section will introduce how to write your own functions.
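
For readers following along without the video, here is a rough sketch of the kind of commands described in this transcript. It is not the exact code shown on screen. The small data frame is a made-up stand-in for the Mario Kart data, and the column names (cond, wheels, ship_pr, total_pr), the close_pr calculation, and the use of mean as the aggregating function are all assumptions made for illustration.

```r
# Hypothetical stand-in for the Mario Kart data. The values are invented,
# and the column names are assumed, not taken from the video.
mariokart <- data.frame(
  cond     = c("new", "used", "new", "used", "new", "used"),
  wheels   = c(1, 1, 2, 2, 1, 2),
  ship_pr  = c(4, 3, 0, 4, 4, 0),
  total_pr = c(51, 37, 60, 44, 53, 46)
)

# with(): evaluate an expression using the columns of the data frame
# directly, without repeating the data frame name for each variable.
item_price <- with(mariokart, total_pr - ship_pr)

# within(): variables created inside the braces come back as new columns
# on a copy of the data frame. The closing-price formula is an assumption.
mk <- within(mariokart, {
  close_pr <- total_pr - ship_pr
})

# aggregate(): the dot keeps every remaining column in the output, and
# wheels and cond define the groups. mean is an assumed choice of summary.
agg_all <- aggregate(. ~ wheels + cond, data = mariokart, FUN = mean)

# Naming one variable on the left of the tilde summarizes only that variable.
agg_total <- aggregate(total_pr ~ wheels + cond, data = mariokart, FUN = mean)

# tapply() computes the same group means but returns a matrix indexed by
# wheels (rows) and cond (columns) instead of a data frame.
tab_total <- with(mariokart, tapply(total_pr, list(wheels, cond), mean))
```

Printing agg_total and tab_total side by side shows the same group means in two shapes: aggregate returns a data frame with one row per group, which is convenient for further data frame work, while tapply returns a matrix that is easy to index by group labels.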
