[Python] Trying Out Random Forest with H2O

Setting Up the Algorithm

import h2o
from h2o.estimators import H2ORandomForestEstimator
h2o.init()
# Import the cars dataset into H2O:
cars = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/junit/cars_20mpg.csv")
## Target variable (classification): must be a factor
cars["economy_20mpg"] = cars["economy_20mpg"].asfactor()
## Target variable (regression): must be numeric
cars["economy"] = cars["economy"].asnumeric()
predictors = ["displacement","power","weight","acceleration","year"]
# The factor column is the classification target, the numeric column the regression target
factor_response = "economy_20mpg"
numeric_response = "economy"

train, valid = cars.split_frame(ratios=[.8], seed=1234)

Classification

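# Build and train the model.
# calibrate_model=True also needs a calibration_frame; the validation split is reused here:
cars_drf = H2ORandomForestEstimator(ntrees=10,
                                    max_depth=5,
                                    min_rows=10,
                                    calibrate_model=True,
                                    calibration_frame=valid,
                                    binomial_double_trees=True)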

cars_drf.train(x=predictors,
               y=factor_response,
               training_frame=train,
               validation_frame=valid)

Regression

# Build and train the model:
cars_drf = H2ORandomForestEstimator(ntrees=10,
                                    max_depth=5,
                                    min_rows=10,
                                    calibrate_model=False,
                                    binomial_double_trees=False)

cars_drf.train(x=predictors,
               y=numeric_response,
               training_frame=train,
               validation_frame=valid)

Check Performance

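# Evaluate performance. With no arguments this returns the training metrics;
# valid=True returns the metrics for the validation frame instead:
perf = cars_drf.model_performance(valid=True)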

# Generate predictions on a validation set (if necessary):
pred = cars_drf.predict(valid)
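
To pull a couple of headline numbers out of the performance object, something like the sketch below might do. The helper name validation_report is hypothetical (it is not part of H2O); it only relies on model_performance(test_data=...), mse() and rmse(), which should be available for both the classification and the regression model.

```python
def validation_report(model, frame):
    """Score `model` on `frame` and return MSE and RMSE as a dict.

    `validation_report` is a hypothetical helper, not an H2O function.
    """
    # Score the model on the given frame (e.g. the validation split)
    perf = model.model_performance(test_data=frame)
    return {"mse": perf.mse(), "rmse": perf.rmse()}
```

With the objects from this post, it would be called as validation_report(cars_drf, valid).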

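Inspecting the Model Results

# To fetch a trained model again by its ID (a string):
# model1 = h2o.get_model(model_id)
# Variable importances as a pandas DataFrame, then as a plot:
cars_drf.varimp(use_pandas=True)
cars_drf.varimp_plot()
# Partial dependence plot for the first predictor:
cars_drf.partial_plot(train, cols=[predictors[0]])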
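
If only the names of the strongest predictors are needed, a small helper like the one below could work. This is just a sketch: top_variables is a hypothetical name, and it assumes that varimp(use_pandas=False) returns rows shaped like (variable, relative_importance, scaled_importance, percentage).

```python
def top_variables(varimp_rows, k=3):
    """Return the names of the k most important variables.

    Assumes each row is a tuple of
    (variable, relative_importance, scaled_importance, percentage).
    """
    # Sort by relative importance, largest first
    ranked = sorted(varimp_rows, key=lambda row: row[1], reverse=True)
    return [row[0] for row in ranked[:k]]
```

Under that assumption, the call would look like top_variables(cars_drf.varimp(use_pandas=False), k=3).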

Good!
