Python) Various Ways to Visualize Permutation Importance
2022. 1. 31. 20:29 · Analysis Python/Implementation and Resources
The functions are collected in the corresponding file.
In [26]:
from IPython.display import display, HTML
display(HTML("<style>.container {width:90% !important;}</style>"))
In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from tqdm import tqdm
from sklearn.linear_model import Ridge, LogisticRegression
from sklearn.datasets import fetch_openml, fetch_california_housing
In [2]:
from advanced_feature_importance import PermutationImportance
Regression
In [3]:
### READ DATA AND FIT A SIMPLE MODEL ###
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
m = Ridge()
m.fit(X,y)
Out[3]:
Ridge()
In [4]:
perf = PermutationImportance(model=m , output_type="regression")
In [5]:
imp_table = perf.make_imp_table(X,y,column_names=list(X))
100%|██████████| 8/8 [00:00<00:00, 28.34it/s]
In [6]:
imp_table
Out[6]:
feature | mean | std |
---|---|---|
MedInc | 1.379051 | 0.014738 |
HouseAge | 0.028542 | 0.001399 |
AveRooms | 0.139063 | 0.002949 |
AveBedrms | 0.186004 | 0.002765 |
Population | 0.000045 | 0.000056 |
AveOccup | 0.002938 | 0.000298 |
Latitude | 1.620343 | 0.012850 |
Longitude | 1.522178 | 0.013827 |
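The internals of `make_imp_table` are not shown in this post. As a rough illustration of how a mean/std table like the one above could be produced, here is a minimal sketch that assumes importance is the increase in mean squared error after shuffling one column, averaged over several repeats. The function name `permutation_importance_table`, the toy data, and the least-squares stand-in model are all hypothetical and are not part of `advanced_feature_importance`.

```python
import numpy as np

def permutation_importance_table(predict, X, y, n_repeats=10, seed=0):
    """Return (mean, std) arrays of the MSE increase per column.

    Assumption: importance = MSE after shuffling a column - baseline MSE.
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean((y - predict(X)) ** 2)
    increases = np.empty((X.shape[1], n_repeats))
    for j in range(X.shape[1]):
        for r in range(n_repeats):
            X_perm = X.copy()
            # Shuffle column j only, which breaks its link to the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases[j, r] = np.mean((y - predict(X_perm)) ** 2) - baseline
    return increases.mean(axis=1), increases.std(axis=1)

# Toy data: y depends strongly on column 0, weakly on column 1, not on column 2.
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(500, 3))
y_toy = 3.0 * X_toy[:, 0] + 0.5 * X_toy[:, 1] + rng.normal(scale=0.1, size=500)

# Ordinary least squares as a stand-in for the fitted model.
coef, *_ = np.linalg.lstsq(X_toy, y_toy, rcond=None)

def predict(X):
    return X @ coef

imp_mean, imp_std = permutation_importance_table(predict, X_toy, y_toy)
for name, m, s in zip(["x0", "x1", "x2"], imp_mean, imp_std):
    print(f"{name}: mean={m:.4f} std={s:.4f}")
```

In this toy setup the strongly predictive column should receive the largest mean, and the unused column a value close to zero, which mirrors the gap between `Latitude` and `Population` in the table.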
In [7]:
perf.plot_permutation_importance(imp_table)
In [8]:
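# Draw 1000 distinct row positions (row 0 included)
sample_idx = np.random.choice(len(X), 1000, replace=False)
In [9]:
# Use positional indexing for both X and y
sample_X , sample_y = X.iloc[sample_idx , : ] , y.iloc[sample_idx]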
In [10]:
sample_imp_table = perf.make_sample_imp_table(sample_X,sample_y,column_names=list(X))
100%|██████████| 8/8 [00:00<00:00, 40.32it/s]
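How `make_sample_imp_table` scores individual rows is not documented here. One plausible reading, sketched below under that assumption, is that each row gets its own importance per column: the average increase in that row's squared error when the column is shuffled across rows. The function `sample_permutation_importance`, the toy data, and the fixed linear model are hypothetical illustrations only.

```python
import numpy as np

def sample_permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Return an (n_samples, n_features) array of per-row importances.

    Assumption: a row's importance for column j is the mean increase in
    that row's squared error when column j is shuffled across rows.
    """
    rng = np.random.default_rng(seed)
    base_err = (y - predict(X)) ** 2  # per-row baseline squared error
    out = np.zeros(X.shape)
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            out[:, j] += (y - predict(X_perm)) ** 2 - base_err
    return out / n_repeats

# Toy data and a fixed linear model that ignores the third column.
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(200, 3))
weights = np.array([3.0, 0.5, 0.0])
y_toy = X_toy @ weights + rng.normal(scale=0.1, size=200)

def predict(X):
    return X @ weights

sample_imp = sample_permutation_importance(predict, X_toy, y_toy)
print(sample_imp.shape)         # one row per sample, one column per feature
print(sample_imp.mean(axis=0))  # column averages give a global view
```

A matrix of this shape is what a per-sample heatmap needs: rows are samples, columns are features, and averaging over rows gives back a global ranking.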
In [11]:
perf.plot_sample_permutatiom_importance(sample_imp_table)
In [12]:
perf.plot_one_sample_importacne_heatmap(sample_imp_table , 0)
In [13]:
perf.plot_sample_importacne_heatmap_betwwen_columns(sample_X , sample_imp_table , 'MedInc' , "HouseAge" , "AveBedrms")
In [14]:
perf.plot_sample_importance_heatmap(sample_X ,sample_imp_table)
Classification
In [15]:
X, y = fetch_openml(name='wine-quality-red', return_X_y=True, as_frame=True)
m = LogisticRegression(C=10, max_iter=10_000)
m.fit(X,y)
Out[15]:
LogisticRegression(C=10, max_iter=10000)
In [16]:
perf = PermutationImportance(model=m , output_type="classification")
In [17]:
imp_table = perf.make_imp_table(X,y,column_names=list(X))
100%|██████████| 11/11 [00:00<00:00, 29.70it/s]
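As a sanity check for a classifier, scikit-learn ships its own `sklearn.inspection.permutation_importance`, which can be used to build a comparable mean/std table. The sketch below runs it on hypothetical toy data with accuracy as the score. The metric used by `PermutationImportance` is not stated in this post, so the numbers from the two approaches will not necessarily match.

```python
import numpy as np
import pandas as pd
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

# Toy binary problem: "signal" drives the label, "noise" is irrelevant.
rng = np.random.default_rng(0)
X_toy = pd.DataFrame(rng.normal(size=(600, 3)), columns=["signal", "weak", "noise"])
score = 2.0 * X_toy["signal"] + 0.5 * X_toy["weak"] + rng.normal(scale=0.5, size=600)
y_toy = (score > 0).astype(int)

clf = LogisticRegression(max_iter=1000).fit(X_toy, y_toy)

# Importance here is the drop in accuracy after shuffling each column.
result = permutation_importance(
    clf, X_toy, y_toy, scoring="accuracy", n_repeats=10, random_state=0
)
check_table = pd.DataFrame(
    {"mean": result.importances_mean, "std": result.importances_std},
    index=X_toy.columns,
)
print(check_table)
```

Comparing the feature ranking, rather than the raw values, is usually the more meaningful check when the two implementations use different scores.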
In [18]:
perf.plot_permutation_importance(imp_table)
In [19]:
# Draw 1000 distinct row positions and index X and y positionally
sample_idx = np.random.choice(len(X), 1000, replace=False)
sample_X , sample_y = X.iloc[sample_idx , : ] , y.iloc[sample_idx]
In [20]:
sample_imp_table = perf.make_sample_imp_table(sample_X,sample_y,column_names=list(X))
100%|██████████| 11/11 [00:02<00:00, 4.26it/s]
In [21]:
perf.plot_sample_permutatiom_importance(sample_imp_table)
In [22]:
perf.plot_one_sample_importacne_heatmap(sample_imp_table , 1)
In [23]:
perf.plot_sample_importance_heatmap(sample_X ,sample_imp_table)
In [24]:
perf.plot_sample_importacne_heatmap_betwwen_columns(sample_X , sample_imp_table , 'fixed_acidity' , "volatile_acidity" , "chlorides")
In [25]:
perf.plot_sample_importance_target_heatmap(sample_X , sample_imp_table,"fixed_acidity",n_bins=20)