All I Need Is Data.

[ Python ] Scikit-Learn: Pipeline code that standardizes numeric features, one-hot encodes categorical features, and fits a model

2019. 6. 15. 18:38ㆍAnalysis Python/Scikit Learn

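from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

## numeric features: impute missing values with the median, then standardize
numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])
## categorical features: fill missing values with a constant, then one-hot encode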
categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression(solver='lbfgs'))])

param_grid = {
    'preprocessor__num__imputer__strategy': ['mean', 'median'],
    'classifier__C': [0.1, 1.0, 10, 100],
}

# Grid search over the whole pipeline (the iid argument was removed in scikit-learn 0.24)
#grid_search = GridSearchCV(clf, param_grid, cv=10)
#grid_search.fit(X_train, y_train)
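The snippet above leaves X_train and y_train undefined, so here is a minimal end-to-end sketch of how the pipeline and the grid search might be run. The data is synthetic and only imitates Titanic-style columns; the sample size, the label rule, the train/test split, and cv=5 are all assumptions made for illustration, not values from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the Titanic columns used above (assumption).
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    'age': rng.normal(30, 10, n),
    'fare': rng.exponential(30, n),
    'embarked': rng.choice(['S', 'C', 'Q'], n),
    'sex': rng.choice(['male', 'female'], n),
    'pclass': rng.choice([1, 2, 3], n),
})
# Noisy made-up label so that the classes are not perfectly separable.
score = 1.5 * (X['sex'] == 'female') + X['fare'] / 40 + rng.normal(0, 1, n)
y = (score > 1.5).astype(int)
# Punch some holes into the data so the imputers have work to do.
X.loc[rng.choice(n, 30, replace=False), 'age'] = np.nan
X.loc[rng.choice(n, 15, replace=False), 'embarked'] = np.nan

numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])
categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])
preprocessor = ColumnTransformer(transformers=[
    ('num', numeric_transformer, numeric_features),
    ('cat', categorical_transformer, categorical_features)])
clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', LogisticRegression(solver='lbfgs'))])

# Keys follow the <step>__<sub-step>__<parameter> naming of nested estimators.
param_grid = {
    'preprocessor__num__imputer__strategy': ['mean', 'median'],
    'classifier__C': [0.1, 1.0, 10, 100],
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
grid_search = GridSearchCV(clf, param_grid, cv=5)
grid_search.fit(X_train, y_train)

print('best params:', grid_search.best_params_)
print('test accuracy:', grid_search.score(X_test, y_test))
```

Because the preprocessing sits inside the pipeline, the imputers, the scaler, and the encoder are refit on the training part of every cross-validation fold, so the held-out fold does not leak into the statistics they learn.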

https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html

Column Transformer with Mixed Types — scikit-learn 0.21.2 documentation

This example illustrates how to apply different preprocessing and feature extraction pipelines to different subsets of features, using sklearn.compose.ColumnTransformer.

scikit-learn.org
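To make "different preprocessing for different subsets of features" concrete, the following small sketch shows what a ColumnTransformer emits. The four-row DataFrame is made up, the categorical imputer is left out for brevity, and sparse_threshold=0 is set only so that the result is a dense array that is easy to inspect; none of these choices come from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up training frame (assumption, for illustration only).
train = pd.DataFrame({
    'age': [22.0, 38.0, np.nan, 35.0],
    'fare': [7.25, 71.28, 7.92, 53.10],
    'embarked': ['S', 'C', 'S', 'C'],
    'sex': ['male', 'female', 'female', 'male'],
})

preprocessor = ColumnTransformer(
    transformers=[
        # Numeric columns: median imputation followed by standardization.
        ('num', Pipeline(steps=[
            ('imputer', SimpleImputer(strategy='median')),
            ('scaler', StandardScaler())]), ['age', 'fare']),
        # Categorical columns: one-hot encoding, unknown categories ignored.
        ('cat', OneHotEncoder(handle_unknown='ignore'), ['embarked', 'sex']),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

train_out = preprocessor.fit_transform(train)
# Column layout: [age, fare, embarked=C, embarked=S, sex=female, sex=male]
print(train_out.shape)

# 'Q' was never seen during fit, so its one-hot block is all zeros.
new_row = pd.DataFrame({'age': [30.0], 'fare': [10.0],
                        'embarked': ['Q'], 'sex': ['male']})
new_out = preprocessor.transform(new_row)
print(new_out[0, 2:4])  # embarked block
print(new_out[0, 4:6])  # sex block
```

The outputs of the transformers are concatenated in the order they are listed, which is why the scaled numeric columns come first and the one-hot blocks follow.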

