Using xgboost on GPU devices
Shows how to train a model on the forest cover type dataset using GPU acceleration. The forest cover type dataset has 581,012 rows and 54 features, making it time-consuming to process. We compare the run-time and accuracy of the GPU and CPU histogram algorithms.
In addition, the demo showcases using the GPU with other GPU-related libraries, including cupy and cuml. These libraries are not strictly required; a host-only alternative is sketched below.
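If cupy and cuml are not available, the data preparation can stay on the host. A minimal sketch, assuming NumPy arrays and scikit-learn's splitter in place of the RAPIDS equivalents (XGBoost accepts host arrays even when device="cuda" is set, copying them to the device as needed):

# Host-only alternative: no cupy/cuml required.
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split

X, y = fetch_covtype(return_X_y=True)
y -= y.min()  # shift class labels to start at 0, as XGBoost expects
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, train_size=0.75, random_state=42
)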
import time

import cupy as cp
from cuml.model_selection import train_test_split
from sklearn.datasets import fetch_covtype

import xgboost as xgb

# Fetch dataset using sklearn
X, y = fetch_covtype(return_X_y=True)
X = cp.array(X)
y = cp.array(y)
y -= y.min()

# Create 0.75/0.25 train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, train_size=0.75, random_state=42
)

# Specify sufficient boosting iterations to reach a minimum
num_round = 3000

# Leave most parameters as default
clf = xgb.XGBClassifier(device="cuda", n_estimators=num_round)

# Train model
start = time.time()
clf.fit(X_train, y_train, eval_set=[(X_test, y_test)])
gpu_res = clf.evals_result()
print("GPU Training Time: %s seconds" % (str(time.time() - start)))

# Repeat for CPU algorithm
clf = xgb.XGBClassifier(device="cpu", n_estimators=num_round)
start = time.time()
clf.fit(X_train, y_train, eval_set=[(X_test, y_test)])
cpu_res = clf.evals_result()
print("CPU Training Time: %s seconds" % (str(time.time() - start)))
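Once both runs finish, the recorded evaluation histories can be compared directly. A short sketch, assuming the default mlogloss metric that XGBClassifier reports for multi-class targets and the "validation_0" key that evals_result() assigns to the first eval_set entry:

# Compare the final multi-class log loss of the two runs; each history
# is a dict keyed by eval-set name, then by metric name.
gpu_loss = gpu_res["validation_0"]["mlogloss"]
cpu_loss = cpu_res["validation_0"]["mlogloss"]
print("Final GPU mlogloss: %.5f" % gpu_loss[-1])
print("Final CPU mlogloss: %.5f" % cpu_loss[-1])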