Please run these code chunks before you start.
library(vembedr)
library(ggplot2)
library(palmerpenguins)
load("../../Homeworks/Module2/Homework.RData")
Insert a code chunk to determine what datatype the variables FFF, GGG, HHH, and III are.
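As a reminder of the tools available for this kind of check, here is a minimal sketch using base R's class() and typeof(). The variables x_num, x_int, x_chr, and x_lgl are made up for illustration and are not part of Homework.RData.

```r
# Hypothetical example vectors (not the homework variables)
x_num <- c(1.5, 2, 3)      # decimal numbers
x_int <- 1:3               # whole numbers created with the colon operator
x_chr <- c("a", "b")       # text
x_lgl <- c(TRUE, FALSE)    # logical values

# class() reports the datatype as R presents it to the user
class(x_num)   # "numeric"
class(x_int)   # "integer"
class(x_chr)   # "character"
class(x_lgl)   # "logical"

# typeof() reports the internal storage type
typeof(x_num)  # "double"
```

The same calls can be applied to any variable loaded into the workspace.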
Inspect the values of GGG and HHH in the Environment tab of the Environments pane. What sort of data structures are they?
Why are GGG and HHH different types even though both have the same values?
Print out the 31st value of FFF.
Print out the 13th through 23rd values of III.
Print out the value of the variable JJJ. What kind of data structure is JJJ?
Write code to print out the number of rows and the number of columns of JJJ.
Use the summary() function to determine the datatype of the col3 column of mpg.
Print out the 17th entry in the 2nd column of JJJ.
Print out the entire column col1.
Print out the 11th row of JJJ.
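The indexing exercises above rely on a handful of base R idioms for two-dimensional data. The following is a minimal sketch on a small, made-up data frame called df (a hypothetical stand-in, not JJJ), assuming the object in question behaves like a data frame.

```r
# Hypothetical data frame for illustration only
df <- data.frame(
  id    = 1:5,
  score = c(10.5, 8, 9.25, 7, 6.5),
  group = c("a", "b", "a", "b", "a")
)

nrow(df)           # number of rows: 5
ncol(df)           # number of columns: 3

df[4, 2]           # entry in row 4, column 2: 7
df$score           # an entire column, selected by name
df[2, ]            # an entire row (note the trailing comma)

summary(df$group)  # summary() output includes the column's class and mode
```

Leaving the row or column position empty inside the square brackets selects all rows or all columns, respectively.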
Note: In all plots, give the plot a title and create proper x-, y- and color-axis (if applicable) labels with units if available.
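One way to satisfy the labeling requirement is ggplot2's labs() function. The sketch below uses the built-in mtcars dataset purely as an illustration; the choice of variables and the label wording are assumptions, not part of the assignment.

```r
library(ggplot2)

# Scatter plot of car weight vs. fuel economy, colored by cylinder count.
# mtcars ships with base R; wt is recorded in units of 1000 lbs.
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(
    title = "Fuel economy vs. weight",
    x     = "Weight (1000 lbs)",
    y     = "Fuel economy (miles per gallon)",
    color = "Number of cylinders"
  )

p
```

The color argument of labs() sets the legend title, which plays the role of the color-axis label.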
Read about the chickwts dataset. What kind of variable is feed? What would be an appropriate way to visualize its distribution? Generate the plot.
What kind of variable is the weight? What would be an appropriate plot for visualizing its distribution? Visualize its distribution.
What would be an appropriate way to visualize the distribution of weight as it depends on feed? Visualize it. What may be concluded?
What would be an appropriate plot to visualize the proportions of transmission types used in each class of vehicle? Make the plot. In which class of vehicle are manual transmissions most common? In which class of vehicle are automatic transmissions most popular?
In the mpg dataset, visualize the relationship between engine displacement and the highway mileage. Visualize how this relationship changes with the drive train using facets. What may be concluded?
The trees dataset contains the diameter, height, and volume for black cherry trees. Use ?trees in the console to read up on the variables. What kind of variables are the three? What would be an appropriate type of plot to visualize the relationship of all three variables? Make the plot. What can you infer about the relationship between the three variables?
Use a relational operator to write an expression that checks whether the body mass of the 6th penguin in the penguins dataset is greater than or equal to that of the 33rd penguin and assigns the result to a variable. Print out the value of the variable. What is its datatype? Is the 6th penguin greater than or equal to the 33rd in body mass?
The sample(x, size, replace = TRUE) function generates a random sample of size size from vector x with replacement (the same element of x can be sampled multiple times). Assign a vector of the numbers 1 through 6, representing the outcome of a die roll, to a variable. Use the sample function and this vector to randomly sample 100 die rolls and assign the result to another variable. What is the data structure produced by sample? What are its dimensions? How many of these do you expect to be equal to 2? Write an expression that checks which rolls resulted in 2’s and assign the result to a variable. What data structure and type is this variable? Write an expression to determine how many rolls resulted in 2’s. Did the result match your expectation?
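To illustrate how sample() and a relational operator work together, here is a minimal sketch that uses coin flips instead of die rolls. The names coin, flips, and is_heads, the sample size of 20, and the seed are all illustrative assumptions.

```r
set.seed(1)  # fixes the random number stream so the run is repeatable

# Possible outcomes of a single coin flip
coin <- c("H", "T")

# Draw 20 flips with replacement
flips <- sample(coin, size = 20, replace = TRUE)

length(flips)            # 20, one element per flip

# Element-wise comparison: TRUE where the flip came up heads
is_heads <- flips == "H"

class(is_heads)          # "logical"

# TRUE counts as 1 and FALSE as 0, so sum() counts the heads
sum(is_heads)
```

Because the sample is random, the count of heads will usually be close to, but not exactly, half the number of flips.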
Use indexing to assign the 3rd through the 6th value of the 8th row (penguin) in the penguins dataset to a variable. Do the same for the 150th row. Write a relational statement checking whether the 33rd penguin has smaller measurements than the 8th and assign the result to a variable. What is the data structure of the result? What is the datatype?
Use relational and Boolean operators to determine how many Adelie penguins live on the Biscoe island.
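As a sketch of the general pattern, the example below combines two relational tests with the Boolean AND operator (&) and counts the matches. The pets data frame and its columns are invented for illustration and have nothing to do with the penguins data.

```r
# Hypothetical data for illustration only
pets <- data.frame(
  species   = c("cat", "dog", "cat", "dog", "cat"),
  city      = c("Oslo", "Oslo", "Rome", "Rome", "Oslo"),
  weight_kg = c(4.1, 20.3, 3.8, 25.0, 5.2)
)

# TRUE only where BOTH conditions hold for the same row
is_oslo_cat <- pets$species == "cat" & pets$city == "Oslo"

# Count the rows that satisfy both conditions
sum(is_oslo_cat)   # 2

# The OR operator (|) is TRUE where at least one condition holds
is_cat_or_rome <- pets$species == "cat" | pets$city == "Rome"
sum(is_cat_or_rome)   # 4
```

Note that & and | operate element by element on whole vectors, whereas && and || are meant for single TRUE/FALSE values.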
Use the approach from problem #2 above to sample 1000 die rolls and assign the result to a variable. Sample a second time (representing another die) and assign the result to another variable. Then use relational and Boolean operators to determine the rolls in which both dice produced 1 or both dice produced 2. How many do you expect? Did the result match your expectation?
In the mpg dataset, determine how many subcompacts were made by Ford, Honda, and Subaru in the year 2008.
Refer to problem 1 in the Boolean operator exercises above. Use logical indexing to subset the penguins dataset for Adelies living on Biscoe and plot only their flipper length vs body mass.
Refer to question 3 in the Boolean operator exercises above. Use logical indexing to determine which subcompact models were made by Ford, Honda, and Subaru in 2008.
Are there any flowers in the iris dataset with petal length more than 10 times petal width? What species have such thin petals? Use relational and Boolean operators and logical indexing to answer.
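Logical indexing means placing a logical vector in the row position of the square brackets so that only the rows marked TRUE are kept. Here is a minimal sketch on a made-up data frame of rectangles; the boxes data frame, its columns, and the factor of 3 are illustrative assumptions.

```r
# Hypothetical data for illustration only
boxes <- data.frame(
  label  = c("A", "B", "C", "D"),
  length = c(10, 4, 9, 6),
  width  = c(2, 3, 1, 5)
)

# TRUE for rows whose length is more than 3 times their width
is_long <- boxes$length > 3 * boxes$width

# any() answers "is there at least one such row?"
any(is_long)              # TRUE

# Keep only the rows where the condition is TRUE (all columns)
long_boxes <- boxes[is_long, ]

# List the distinct labels among the matching rows
unique(long_boxes$label)  # "A" "C"
```

The subset is itself a data frame, so it can be passed straight to a plotting function or indexed further.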
End of Module 2 HW