[ Python ] Scikit-Learn: a Pipeline that standardizes numeric features and one-hot encodes categorical features, with modeling code
2019. 6. 15. 18:38 · Analysis Python/Scikit Learn
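from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Numeric columns: fill missing values with the median, then standardize
numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])

# Categorical columns: fill missing values with the constant 'missing', then one-hot encode
categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

# Route each column group to its own transformer
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

# Chain preprocessing and the classifier into a single estimator
clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression(solver='lbfgs'))])

# Grid keys follow the <step>__<sub-step>__<parameter> naming convention
param_grid = {
    'preprocessor__num__imputer__strategy': ['mean', 'median'],
    'classifier__C': [0.1, 1.0, 10, 100],
}

# X_train and y_train are not defined in this post, so the search stays commented out.
# The iid argument was removed in scikit-learn 0.24, so it is dropped here.
#grid_search = GridSearchCV(clf, param_grid, cv=10)
#grid_search.fit(X_train, y_train)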
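Since the post never defines X_train and y_train, here is a minimal end-to-end sketch of how the commented-out grid search might be run. The data is synthetic and only mimics the Titanic column names; the sample size, the label rule, cv=5, and the train/test split are all assumptions made for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the Titanic data (assumption, for illustration only).
rng = np.random.RandomState(0)
n = 200
X = pd.DataFrame({
    'age': rng.uniform(1, 80, n),
    'fare': rng.uniform(5, 500, n),
    'embarked': rng.choice(['S', 'C', 'Q'], n),
    'sex': rng.choice(['male', 'female'], n),
    'pclass': rng.choice([1, 2, 3], n),
})
# Inject missing values so that both imputers have something to do.
X.loc[rng.choice(n, 20, replace=False), 'age'] = np.nan
X.loc[rng.choice(n, 10, replace=False), 'embarked'] = np.nan

# Hypothetical label: mostly determined by 'sex', with 20% of labels flipped.
is_female = (X['sex'] == 'female').to_numpy()
flip = rng.rand(n) < 0.2
y = np.logical_xor(is_female, flip).astype(int)

numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])

categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression(solver='lbfgs'))])

param_grid = {
    'preprocessor__num__imputer__strategy': ['mean', 'median'],
    'classifier__C': [0.1, 1.0, 10, 100],
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# cv=5 is an arbitrary choice for this small sample.
grid_search = GridSearchCV(clf, param_grid, cv=5)
grid_search.fit(X_train, y_train)

test_score = grid_search.score(X_test, y_test)
print('best params:', grid_search.best_params_)
print('test accuracy:', test_score)
```

Because the imputers, the scaler, and the encoder live inside the pipeline, each cross-validation fold refits them on its own training portion, so no statistics from the validation rows leak into preprocessing.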
Reference: Column Transformer with Mixed Types, scikit-learn 0.21.2 documentation (scikit-learn.org)
This example illustrates how to apply different preprocessing and feature extraction pipelines to different subsets of features, using sklearn.compose.ColumnTransformer.