Linear Prediction with a Bayesian Gaussian Basis Model (R)
이성령
Prior
$p(w \mid \alpha) = N(w \mid 0, \alpha^{-1} I)$
Likelihood
$p(\mathbf{t} \mid \mathbf{X}, w, \beta) = \prod_{n=1}^{N} N(t_n \mid w^T \phi(x_n), \beta^{-1})$
Posterior
$p(w \mid \mathbf{t}, \alpha, \beta) = N(w \mid m_N, S_N)$
$m_N = \beta S_N \Phi^T \mathbf{t}$
$S_N^{-1} = \alpha I + \beta \Phi^T \Phi$
Posterior Predictive
$p(t^* \mid x, \mathbf{t}, \alpha, \beta) = \int p(t^* \mid x, w, \beta)\, p(w \mid \mathbf{t}, \alpha, \beta)\, dw = N(t^* \mid m_N^T \phi(x), \sigma_N^2(x))$
$\sigma_N^2(x) = \frac{1}{\beta} + \phi(x)^T S_N \phi(x)$
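The visualization loop further down calls posterior() and posterior_predictive(), which are not listed in this post (they come from the Python original that was converted). A minimal R sketch of the two functions, written directly from the formulas above and assuming the argument order used in the loop, could look like this:

# Posterior over the weights: S_N^{-1} = alpha*I + beta*Phi^T Phi,  m_N = beta*S_N*Phi^T t
posterior <- function(Phi, t_vec, alpha, beta){
  S_N_inv <- alpha * diag(ncol(Phi)) + beta * crossprod(Phi)
  S_N <- solve(S_N_inv)
  m_N <- beta * S_N %*% crossprod(Phi, t_vec)
  return(list(mu = m_N, sigma = S_N))
}

# Posterior predictive for each row phi(x) of Phi_test:
# mean = m_N^T phi(x),  variance = 1/beta + phi(x)^T S_N phi(x)
posterior_predictive <- function(Phi_test, m_N, S_N, beta){
  y <- as.vector(Phi_test %*% m_N)
  y_var <- 1 / beta + rowSums((Phi_test %*% S_N) * Phi_test)
  return(list(y = y, y_var = y_var))
}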
Example: Gaussian basis model
- True DGP: $y = 0.5 + \sin(2\pi x) + \epsilon$, with $x \sim U(0, 1)$ and $\epsilon \sim N(0, \beta^{-1})$
- Gaussian basis model: $\sin(2\pi x) \approx \theta_0 + \theta_1 \phi_1(x) + \theta_2 \phi_2(x) + \dots + \theta_9 \phi_9(x)$
- where $\phi_k(x) = \exp\left(-\frac{(x - k/9)^2}{2 \cdot (1/8)^2}\right)$, $k = 1, 2, \dots, 9$, i.e. a Gaussian bump centered at $k/9$ with scale $1/8$
- Prior: $\theta = (\theta_0, \theta_1, \dots, \theta_9)^T \sim N(0, \alpha^{-1} I)$ (a few draws from this prior are sketched below)
- $\alpha = 2$, $\beta = 25$
- $N = 1, 2, 20, 50$
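Before fitting anything, a quick sanity check (not in the original post, just an illustrative sketch) is to draw a few weight vectors from the prior $\theta \sim N(0, \alpha^{-1} I)$ and look at the random functions they imply:

# Draw 5 weight vectors from the prior and evaluate the implied functions on a grid
alpha <- 2
centers <- seq(1/9, 1, length.out = 9)
x_grid <- seq(0, 1, length.out = 100)
Phi_grid <- cbind(1, sapply(centers, function(mu) exp(-0.5 * (x_grid - mu)^2 / (1/8)^2)))
theta <- matrix(rnorm(10 * 5, mean = 0, sd = sqrt(1 / alpha)), nrow = 10)  # 10 weights x 5 draws
matplot(x_grid, Phi_grid %*% theta, type = "l", lty = 1,
        xlab = "x", ylab = "functions drawn from the prior")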
# Data generating process: y = 0.5 + sin(2*pi*x) + noise, with noise ~ N(0, noise_variance)
# (rnorm expects a standard deviation, hence sqrt(noise_variance))
DGP <- function(X, noise_variance){
  noise <- matrix(rnorm(length(X), mean = 0, sd = sqrt(noise_variance)),
                  nrow = dim(X)[1], ncol = dim(X)[2])
  return(0.5 + sin(2 * pi * X) + noise)
}
# Gaussian basis function centered at mu with scale sigma (default 1/8)
Gaussian_Basis <- function(x, mu, sigma = 1/8){
  return(exp(-0.5 * (x - mu)^2 / sigma^2))
}
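As a quick usage check (illustrative only), evaluating all nine basis functions at a single input gives the basis activations that make up one row of the design matrix (before the bias column is added):

# Activations of the 9 Gaussian bumps (centers 1/9, 2/9, ..., 1) at x = 0.5
Gaussian_Basis(0.5, mu = seq(1/9, 1, length.out = 9))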
# Build the design matrix: a leading bias column of 1s, then one column per basis center
expand <- function(x, bf, bf_args){
  if (is.null(dim(x))){
    a <- matrix(1, nrow = length(x), ncol = 1)   # plain vector input: one row per observation
  } else {
    a <- matrix(1, nrow = dim(x)[1], ncol = 1)   # column-matrix input (n x 1)
  }
  for (mu in bf_args){
    a <- cbind(a, bf(x, mu))
  }
  return(a)
}
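For example (not from the original post), expanding three inputs with the nine Gaussian centers gives a 3 x 10 design matrix: the bias column plus nine basis columns.

Phi_example <- expand(matrix(c(0.1, 0.5, 0.9), ncol = 1),
                      bf = Gaussian_Basis, bf_args = seq(1/9, 1, length.out = 9))
dim(Phi_example)   # 3 10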
Visualization setup
library(ggplot2)     # plots
library(gridExtra)   # grid.arrange
library(latex2exp)   # TeX() for the plot titles

N_list = c(1, 2, 20, 50)
beta = 25.0    # noise precision
alpha = 2.0    # prior precision

# Training inputs (max(N_list) points on [0, 1]) and their noisy targets
X = matrix(runif(N_list[length(N_list)], min = 0, max = 1), 1, N_list[length(N_list)])
t = DGP(X, noise_variance = 1 / beta)

# Test grid, its design matrix, and the noise-free true function
X_test = matrix(seq(from = 0.0, to = 1.0, length.out = 100), 100, 1)
Phi_test = expand(X_test, bf = Gaussian_Basis,
                  bf_args = seq(1/9, 1, length.out = 9))
y_true = DGP(X_test, noise_variance = 0)
Visualization Loop
i = 1
p <- list()
for (N in N_list){
  # First N training points and their targets
  X_N = matrix(X[1:N], ncol = 1)
  t_N = matrix(t[1:N], ncol = 1)
  Phi_N = expand(X_N, bf = Gaussian_Basis, bf_args = seq(1/9, 1, length.out = 9))

  # Posterior over the weights given the first N observations
  post = posterior(Phi_N, t_N, alpha, beta)
  m_N <- matrix(post$mu, ncol = 1)
  S_N <- post$sigma

  # Posterior predictive mean and standard deviation on the test grid
  OUTPUT = posterior_predictive(Phi_test, m_N, S_N, beta)
  pred <- OUTPUT$y
  std <- sqrt(OUTPUT$y_var)

  # Plot the predictive mean +/- one standard deviation against the true function
  d = data.frame(X_test, pred, std, y_true)
  out <- ggplot(d) +
    geom_ribbon(aes(X_test, ymin = pred - std, ymax = pred + std),
                color = "yellow", alpha = 0.1) +
    geom_line(aes(X_test, pred, color = "darkblue"), size = 2) +
    geom_line(aes(X_test, y_true, color = "red"), size = 2) +
    scale_color_discrete(name = " ", labels = c("Prediction", "TRUE")) +
    theme_bw() +
    ggtitle(TeX(paste0("$sin(2\\pi x)+$", "$\\epsilon$ ", "(N : ", N, ")"))) +
    labs(x = "Data Range", y = "") +
    theme(plot.title = element_text(size = 25, face = "bold", hjust = 0.5),
          legend.position = "top",
          axis.line = element_line(colour = "black"),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.border = element_blank(),
          panel.background = element_blank()) +
    geom_point(data = data.frame(X_N, t_N), aes(X_N, t_N))
  p[[i]] <- out
  i = i + 1
}
Visualization
do.call(grid.arrange,p)
This post is a conversion of the original Python code into R.