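Time series visualization is most common with periodic data and stock market data. ggTimeSeries provides several novel ways to plot time series, including calendar heatmaps, horizon plots, steamgraphs, and more.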

ggTimeSeries is fairly simple to use and is part of the ggplot2 ecosystem. Its installation and usage are briefly introduced below:


# Install from GitHub (requires the devtools package)
devtools::install_github('AtherEnergy/ggTimeSeries')

1) Calendar heatmap


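# Load the packages used in the examples
library(data.table)
library(ggplot2)
library(ggTimeSeries)

# Create demo data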
set.seed(1)
dtData = data.table(
      DateCol = seq(
         as.Date("1/01/2014", "%d/%m/%Y"),
         as.Date("31/12/2015", "%d/%m/%Y"),
         "days"
      ),
      ValueCol = runif(730)
   )
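# Add a random boost on weekends (%u: 6 = Saturday, 7 = Sunday)
dtData[, ValueCol := ValueCol + (strftime(DateCol, "%u") %in% c("6", "7")) * runif(.N) * 0.75]
# Add a random seasonal effect that grows with distance from mid-year
dtData[, ValueCol := ValueCol + abs(as.numeric(strftime(DateCol, "%m")) - 6.5) * runif(.N) * 0.75]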

# Plot
p1 = ggplot_calendar_heatmap(
   dtData,
   'DateCol',
   'ValueCol'
)

# Customize the style
p1 +
   xlab(NULL) +
   ylab(NULL) +
   scale_fill_continuous(low = 'green', high = 'red') +
   facet_wrap(~Year, ncol = 1)
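
To get a feel for the grid a calendar heatmap is laid out on, the sketch below splits each date into a year, a week of the year, and a weekday using base R only. It illustrates the idea and is not taken from ggTimeSeries internals; the column names `Year`, `Week`, and `Weekday` are assumptions chosen for this example.

```r
# Illustrative only: base R date decomposition, not ggTimeSeries code.
dates <- seq(as.Date("2014-01-01"), as.Date("2015-12-31"), by = "day")

calendar_grid <- data.frame(
   Date    = dates,
   Year    = as.integer(format(dates, "%Y")),  # one panel per year
   Week    = as.integer(format(dates, "%W")),  # week of the year, weeks start on Monday
   Weekday = as.integer(format(dates, "%u"))   # 1 = Monday ... 7 = Sunday
)

head(calendar_grid)
```

Each row then maps to one cell: the week gives the position along one axis, the weekday along the other, and the year selects the panel.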

2) Horizon plot


# Create demo data
set.seed(1)
dfData = data.frame(x = 1:1000, y = cumsum(rnorm(1000)))

# Plot
p1 = ggplot_horizon(dfData, 'x', 'y')
# Customize the plot
p1 +
   xlab(NULL) +
   ylab(NULL) +
   scale_fill_continuous(low = 'green', high = 'red') +
   coord_fixed( 0.5 * diff(range(dfData$x)) / diff(range(dfData$y)))

3) Steamgraph


# Create demo data
set.seed(10)
dfData = data.frame(
   Time = 1:1000,
   Signal = abs(
      c(
         cumsum(rnorm(1000, 0, 3)),
         cumsum(rnorm(1000, 0, 4)),
         cumsum(rnorm(1000, 0, 1)),
         cumsum(rnorm(1000, 0, 2))
      )
   ),
   VariableLabel = rep(c('Class A', 'Class B', 'Class C', 'Class D'), each = 1000)
)

# Plot
p1 = ggplot(dfData, aes(x = Time, y = Signal, group = VariableLabel, fill = VariableLabel)) +
  stat_steamgraph()


# Customize the plot
p1 +
   xlab(NULL) +
   ylab(NULL) +
   coord_fixed( 0.2 * diff(range(dfData$Time)) / diff(range(dfData$Signal)))

4) Waterfall plot


# Create demo data
set.seed(1)
dfData = data.frame(x = 1:100, y = cumsum(rnorm(100)))

# Plot
p1 = ggplot_waterfall(
   dtData = dfData,
   'x',
   'y'
)

# Adjust the axis labels
p1 +
   xlab(NULL) +
   ylab(NULL)
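
A waterfall plot emphasizes the change from one observation to the next rather than the values themselves. As a rough base R sketch of that idea (not the package's implementation), the step sizes can be computed with `diff()`; the names `steps`, `direction`, and `rebuilt` are introduced here for illustration only.

```r
# Illustrative only: the step-to-step changes a waterfall plot draws attention to.
set.seed(1)
dfData <- data.frame(x = 1:100, y = cumsum(rnorm(100)))

# Change between consecutive observations
steps <- diff(dfData$y)

# Label each step as an increase or a decrease
direction <- ifelse(steps >= 0, "up", "down")

# The first value plus the cumulative steps reproduces the series
rebuilt <- c(dfData$y[1], dfData$y[1] + cumsum(steps))
all.equal(rebuilt, dfData$y)

table(direction)
```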

5) Occurrence dot plot


# Create demo data
set.seed(1)
dfData = data.table(x = 1:100, y = floor(4 * abs(rnorm(100, 0 , 0.4))))

# Plot
p1 = ggplot(dfData, aes(x = x, y = y)) +
   stat_occurrence()

# Customize the plot
p1 +
   xlab(NULL) +
   ylab(NULL) +
   coord_fixed(ylim = c(0,1 + max(dfData$y)))
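
In the demo data, `y` holds a small count of events at each `x`. The base R sketch below expands those counts into one row per event, which is the kind of one-dot-per-occurrence layout this plot conveys. It is an illustration under that assumption, not the internals of `stat_occurrence`; `counts`, `dots`, and `stack_position` are hypothetical names.

```r
# Illustrative only: turn per-x counts into one row per occurrence.
set.seed(1)
counts <- data.frame(x = 1:100, y = floor(4 * abs(rnorm(100, 0, 0.4))))

dots <- data.frame(
   # Repeat each x value once per occurrence (x values with y = 0 drop out)
   x = rep(counts$x, times = counts$y),
   # Position of each dot within its stack: 1, 2, ..., y
   stack_position = sequence(counts$y)
)

# There is exactly one row per counted occurrence
nrow(dots) == sum(counts$y)
```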

6) Marimekko (mosaic) plot (shows the relationship between a pair of variables in categorical data)


# Create demo data
set.seed(1)

dfData = data.frame(Signal = pmax(pmin(rnorm(10000), 3), -3))

dfData2 = data.frame(
   Signal = round(head(dfData$Signal, -1),0),
   NextSignal = round(tail(dfData$Signal, -1),0),
   Weight = 1
)

# Plot
p1 = ggplot(dfData2, aes(xbucket = Signal, ybucket = NextSignal, fill = NextSignal, weight = Weight)) +
   stat_marimekko(color = 'black', xlabelyposition = -0.1)

# Customize the plot
p1 +
   xlab('Signal occurrence %') +
   ylab('Signal | Next signal occurrence %') +
   scale_x_continuous(breaks = 0:10/10) +
   scale_y_continuous(breaks = 0:10/10)
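
The two axis labels describe a marginal and a conditional distribution. As a hedged base R sketch of the quantities involved (not how `stat_marimekko` computes them), the proportions can be tabulated directly; `transitions`, `signal_share`, and `next_given_signal` are names introduced for this example.

```r
# Illustrative only: the proportions behind the two axes of the plot.
set.seed(1)
signal <- pmax(pmin(rnorm(10000), 3), -3)

transitions <- data.frame(
   Signal     = round(head(signal, -1), 0),
   NextSignal = round(tail(signal, -1), 0)
)

# Share of observations in each Signal bucket (roughly, the column widths)
signal_share <- prop.table(table(transitions$Signal))

# Share of each NextSignal bucket within a Signal bucket (roughly, the segment heights)
next_given_signal <- prop.table(
   table(transitions$Signal, transitions$NextSignal),
   margin = 1
)

round(signal_share, 3)
round(next_given_signal, 3)
```

Each row of `next_given_signal` sums to 1, mirroring how every column in the plot spans the full height.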

References:

1. https://github.com/AtherEnergy/ggTimeSeries