readxl
でExcelファイルを読み込むときに,日付列が文字列と日付の混在だった場合,とりあえず col_types
をlist
にして読み込む.
d <- read_excel("data.xlsx",
skip = 2,
col_names = c("Date", "Temp", "RH", "WindSpd", "GustSpd", "Rain", "WindDir"),
col_types = c("list", "numeric", "numeric", "numeric", "numeric", "numeric", "numeric"))
head(d)
# A tibble: 6 x 7
Date Temp RH WindSpd GustSpd Rain WindDir
<list> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 <dttm [1]> 28.4 82.5 0. 0. NA 1.40
2 <dttm [1]> 28.9 77.5 0.250 3.02 0. 78.6
3 <dttm [1]> 31.9 61.2 0.500 2.27 0. 135.
4 <dttm [1]> 25.6 82.6 0.250 4.53 0. 129.
5 <dttm [1]> 22.8 95.3 0. 1.76 8.60 170.
6 <dttm [1]> 22.8 96.1 0. 0. 0. 147.
次に,変換する関数を定義.
形式が mdy HMS p
で,日付で読まれた方もコケているので修正する(lubridate
を使用)
f_parse_date_column <- function(x){
if (is.character(x[[1]])) {
parse_date_time(x[[1]], order = "mdy HMS p")
} else {
this_date = x[[1]]
y = year(this_date)
m = month(this_date)
d = day(this_date)
hr = hour(this_date)
mn = minute(this_date)
sc = second(this_date)
make_datetime(y, d, m, hr, mn, sc)
}
}
最後に,dplyr::mutate
で変換.
dd <- d %>%
rowwise() %>%
mutate(Date = f_parse_date_column(Date))
head(dd)
# A tibble: 6 x 7
Date Temp RH WindSpd GustSpd Rain WindDir
<dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2015-03-04 14:14:54 28.4 82.5 0. 0. NA 1.40
2 2015-03-04 15:14:54 28.9 77.5 0.250 3.02 0. 78.6
3 2015-03-04 16:14:54 31.9 61.2 0.500 2.27 0. 135.
4 2015-03-04 17:14:54 25.6 82.6 0.250 4.53 0. 129.
5 2015-03-04 18:14:54 22.8 95.3 0. 1.76 8.60 170.
6 2015-03-04 19:14:54 22.8 96.1 0. 0. 0. 147.