5 Data transformation
5.2.4 Exercises
1. Find all flights that:
   - Had an arrival delay of two or more hours
   - Flew to Houston (`IAH` or `HOU`)
   - Were operated by United, American, or Delta
   - Departed in summer (July, August, and September)
   - Arrived more than two hours late, but didn’t leave late
   - Were delayed by at least an hour, but made up over 30 minutes in flight
   - Departed between midnight and 6am (inclusive)
2. Another useful dplyr filtering helper is `between()`. What does it do? Can you use it to simplify the code needed to answer the previous challenges?
3. How many flights have a missing `dep_time`? What other variables are missing? What might these rows represent?
4. Why is `NA ^ 0` not missing? Why is `NA | TRUE` not missing? Why is `FALSE & NA` not missing? Can you figure out the general rule? (`NA * 0` is a tricky counterexample!)
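
If you want a place to start experimenting, here is a minimal sketch, assuming dplyr is installed. The vector `dep_month` and the list `na_results` are made-up names for illustration and are not part of the flights data. The sketch only shows what the expressions evaluate to; working out why is still the exercise.

```r
library(dplyr)

# Hypothetical toy vector standing in for a column such as `month`.
dep_month <- c(1, 7, 8, 9, 12, NA)

# between(x, left, right) is a shortcut for x >= left & x <= right.
in_summer <- between(dep_month, 7, 9)
in_summer
#> [1] FALSE  TRUE  TRUE  TRUE FALSE    NA

# The same test written out by hand gives an identical result.
identical(in_summer, dep_month >= 7 & dep_month <= 9)
#> [1] TRUE

# Evaluate the expressions from the last exercise before reasoning about them.
na_results <- list(
  power = NA ^ 0,
  or    = NA | TRUE,
  and   = FALSE & NA,
  times = NA * 0
)
str(na_results)
```

Note that both bounds of `between()` are inclusive, which matters for conditions such as "between midnight and 6am (inclusive)".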