readxl
でExcelファイルを読み込むときに,日付列が文字列と日付の混在だった場合,とりあえず col_types
をlist
にして読み込む.
d <- read_excel("data.xlsx", skip = 2, col_names = c("Date", "Temp", "RH", "WindSpd", "GustSpd", "Rain", "WindDir"), col_types = c("list", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric")) head(d)
# A tibble: 6 x 7 Date Temp RH WindSpd GustSpd Rain WindDir <list> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 <dttm [1]> 28.4 82.5 0. 0. NA 1.40 2 <dttm [1]> 28.9 77.5 0.250 3.02 0. 78.6 3 <dttm [1]> 31.9 61.2 0.500 2.27 0. 135. 4 <dttm [1]> 25.6 82.6 0.250 4.53 0. 129. 5 <dttm [1]> 22.8 95.3 0. 1.76 8.60 170. 6 <dttm [1]> 22.8 96.1 0. 0. 0. 147.
次に,変換する関数を定義.
形式が mdy HMS p
で,日付で読まれた方もコケているので修正する(lubridate
を使用)
f_parse_date_column <- function(x){ if (is.character(x[[1]])) { parse_date_time(x[[1]], order = "mdy HMS p") } else { this_date = x[[1]] y = year(this_date) m = month(this_date) d = day(this_date) hr = hour(this_date) mn = minute(this_date) sc = second(this_date) make_datetime(y, d, m, hr, mn, sc) } }
最後に,dplyr::mutate
で変換.
dd <- d %>% rowwise() %>% mutate(Date = f_parse_date_column(Date)) head(dd)
# A tibble: 6 x 7 Date Temp RH WindSpd GustSpd Rain WindDir <dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 2015-03-04 14:14:54 28.4 82.5 0. 0. NA 1.40 2 2015-03-04 15:14:54 28.9 77.5 0.250 3.02 0. 78.6 3 2015-03-04 16:14:54 31.9 61.2 0.500 2.27 0. 135. 4 2015-03-04 17:14:54 25.6 82.6 0.250 4.53 0. 129. 5 2015-03-04 18:14:54 22.8 95.3 0. 1.76 8.60 170. 6 2015-03-04 19:14:54 22.8 96.1 0. 0. 0. 147.