Note
Go to the end to download the full example code.
A demo for multi-output regression
The demo is adapted from scikit-learn:
See Multiple Outputs for more information.
Note
The feature is experimental. For the multi_output_tree strategy, many features are missing.
import argparse
from typing import Dict, List, Optional, Tuple

import matplotlib
import numpy as np
from matplotlib import pyplot as plt

import xgboost as xgb


def plot_predt(
    y: np.ndarray, y_predt: np.ndarray, name: str, ax: matplotlib.axes.Axes
) -> None:
    """Scatter-plot the 2-dim targets and the model predictions on *ax*."""
    s = 25
    # Ground truth in navy, predictions in cornflower blue.
    ax.scatter(y[:, 0], y[:, 1], c="navy", s=s, edgecolor="black", label=name)
    ax.scatter(
        y_predt[:, 0], y_predt[:, 1], c="cornflowerblue", s=s, edgecolor="black"
    )
    ax.legend()


def gen_circle() -> Tuple[np.ndarray, np.ndarray]:
    "Generate a sample dataset that y is a 2 dim circle."
    rng = np.random.RandomState(1994)
    X = np.sort(200 * rng.rand(100, 1) - 100, axis=0)
    y = np.array([np.pi * np.sin(X).ravel(), np.pi * np.cos(X).ravel()]).T
    # Perturb every 5th sample, then rescale targets into [0, 1].
    y[::5, :] += 0.5 - rng.rand(20, 2)
    y = y - y.min()
    y = y / y.max()
    return X, y


def rmse_model(strategy: str, ax: Optional[matplotlib.axes.Axes]) -> None:
    """Draw a circle with 2-dim coordinate as target variables."""
    X, y = gen_circle()
    # Train a regressor on it
    reg = xgb.XGBRegressor(
        tree_method="hist",
        n_estimators=128,
        n_jobs=16,
        max_depth=8,
        multi_strategy=strategy,
        subsample=0.6,
    )
    reg.fit(X, y, eval_set=[(X, y)])
    y_predt = reg.predict(X)
    if ax:
        plot_predt(y, y_predt, f"RMSE-{strategy}", ax)


def custom_rmse_model(strategy: str, ax: Optional[matplotlib.axes.Axes]) -> None:
    """Train using Python implementation of Squared Error."""

    def gradient(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
        """Compute the gradient squared error."""
        y = dtrain.get_label().reshape(predt.shape)
        return predt - y

    def hessian(predt: np.ndarray, dtrain: xgb.DMatrix) -> np.ndarray:
        """Compute the hessian for squared error."""
        return np.ones(predt.shape)

    def squared_log(
        predt: np.ndarray, dtrain: xgb.DMatrix
    ) -> Tuple[np.ndarray, np.ndarray]:
        """Custom squared-error objective: return (gradient, hessian)."""
        grad = gradient(predt, dtrain)
        hess = hessian(predt, dtrain)
        # both numpy.ndarray and cupy.ndarray works.
        return grad, hess

    def rmse(predt: np.ndarray, dtrain: xgb.DMatrix) -> Tuple[str, float]:
        """Custom evaluation metric: root mean squared error."""
        y = dtrain.get_label().reshape(predt.shape)
        v = np.sqrt(np.mean(np.power(y - predt, 2)))
        return "PyRMSE", v

    X, y = gen_circle()
    Xy = xgb.DMatrix(X, y)
    results: Dict[str, Dict[str, List[float]]] = {}
    # Make sure the `num_target` is passed to XGBoost when custom objective is used.
    # When builtin objective is used, XGBoost can figure out the number of targets
    # automatically.
    booster = xgb.train(
        {
            "tree_method": "hist",
            "num_target": y.shape[1],
            "multi_strategy": strategy,
        },
        dtrain=Xy,
        num_boost_round=128,
        obj=squared_log,
        evals=[(Xy, "Train")],
        evals_result=results,
        custom_metric=rmse,
    )

    y_predt = booster.inplace_predict(X)
    if ax:
        plot_predt(y, y_predt, f"PyRMSE-{strategy}", ax)
    # The custom Python metric should closely track the builtin RMSE.
    np.testing.assert_allclose(
        results["Train"]["rmse"], results["Train"]["PyRMSE"], rtol=1e-2
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--plot", choices=[0, 1], type=int, default=1)
    args = parser.parse_args()

    if args.plot == 1:
        _, axs = plt.subplots(2, 2)
    else:
        # A 2x2 grid of ``None`` placeholders keeps the indexing below uniform
        # when plotting is disabled.
        axs = np.full(shape=(2, 2), fill_value=None)
    assert isinstance(axs, np.ndarray)

    # Train with builtin RMSE objective
    # - One model per output.
    rmse_model("one_output_per_tree", axs[0, 0])

    # - One model for all outputs, this is still working in progress, many features are
    # missing.
    rmse_model("multi_output_tree", axs[0, 1])

    # Train with custom objective.
    # - One model per output.
    custom_rmse_model("one_output_per_tree", axs[1, 0])

    # - One model for all outputs, this is still working in progress, many features are
    # missing.
    custom_rmse_model("multi_output_tree", axs[1, 1])

    if args.plot == 1:
        plt.show()