3.0.3 Patch Release (Jul 30 2025)

  • Fix NDCG metric with non-exp gain. (#11534)

  • Avoid using the mean intercept for rmsle. (#11588)

  • [jvm-packages] Add setNumEarlyStoppingRounds API (#11571)

  • Avoid implicit synchronization in GPU evaluation. (#11542)

  • Remove CUDA check in the array interface handler (#11386)

  • Fix check in GPU histogram. (#11574)

  • Support Rapids 25.06 (#11504)

  • Adding enable_categorical to the sklearn apply method (#11550)

  • Make xgboost.testing compatible with scikit-learn 1.7 (#11502)

  • Add support for building xgboost wheels on Win-ARM64 (#11572,#11597,#11559)

3.0.2 Patch Release (May 25 2025)

  • Dask 2025.4.0 scheduler info compatibility fix (#11462)

  • Fix CUDA virtual memory fallback logic on WSL2 (#11471)

3.0.1 Patch Release (May 13 2025)

  • Use nvidia-smi to detect the driver version and handle old drivers that don’t support virtual memory. (#11391)

  • Optimize deep trees for GPU external memory. (#11387)

  • Small fix for page concatenation with external memory (#11338)

  • Build xgboost-cpu for manylinux_2_28_x86_64 (#11406)

  • Workaround for different Dask versions (#11436)

  • Output models now use denormal floating-point instead of nan. (#11428)

  • Fix aarch64 CI. (#11454)

3.0.0 (2025 Feb 27)

3.0.0 is a milestone for XGBoost. This note summarizes some general changes and then lists package-specific updates. The bump in the major version is for a reworked R package along with a significant update to the JVM packages.

External Memory Support

This release features a major update to the external memory implementation, with improved performance, a new ExtMemQuantileDMatrix for more efficient data initialization, and new feature coverage including categorical data support and quantile regression support. Additionally, GPU-based external memory is reworked to support using CPU memory as a data cache. Last but not least, we worked on distributed training with external memory, along with initial support in the Spark package.

  • A new ExtMemQuantileDMatrix class for fast data initialization with the hist tree method. The new class supports both CPU and GPU training. (#10689,#10682,#10886,#10860,#10762,#10694,#10876)

  • External memory now supports distributed training (#10492,#10861). In addition, the Spark package can use external memory (the host memory) when the device is GPU. The default package on Maven doesn’t support RMM yet; for better performance, one needs to compile XGBoost from source for now. (#11186,#11238,#11219)

  • Improved performance with new optimizations for both the hist-specific training and the approx (DMatrix) method. (#10529,#10980,#10342)

  • New demos and documents for external memory, including distributed training. (#11234,#10929,#10916,#10426,#11113)

  • Reduced binary cache size and memory allocation overhead by not writing the cut matrix. (#10444)

  • More feature coverage, including categorical data and all objective functions, such as quantile regression. In addition, various prediction types like SHAP values are supported. (#10918,#10820,#10751,#10724)

Significant updates for the GPU-based external memory training implementation. (#10924,#10895,#10766,#10544,#10677,#10615,#10927,#10608,#10711)

  • GPU-based external memory supports both batch-based and sampling-based training. Before the 3.0 release, XGBoost concatenated the data during training and stored the cache on disk. In 3.0, XGBoost can stage the data in host memory and fetch it by batch. (#10602,#10595,#10606,#10549,#10488,#10766,#10765,#10764,#10760,#10753,#10734,#10691,#10713,#10826,#10811,#10810,#10736,#10538,#11333)

  • XGBoost can now utilize NVLink-C2C for GPU-based external memory training and can handle up to terabytes of data.

  • Support prediction cache (#10707).

  • Automatic page concatenation for improved GPU utilization (#10887).

  • Improved quantile sketching algorithm for batch-based inputs. See the Features section below for more info.

  • Optimization for nearly-dense inputs; see the Optimization section below for more info.

See our latest document, Using XGBoost External Memory Version, for details. The PyPI package (pip install) doesn’t have RMM support, which is required by the GPU external memory implementation. To experiment, you can compile XGBoost from source or wait for the RAPIDS conda package to be available.
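
For illustration, here is a minimal Python sketch of the workflow described above: a custom DataIter feeds batches into the new ExtMemQuantileDMatrix, which is then used for hist training. The shard files, loader, and cache location are placeholders, and minor details (such as the return convention of next()) may differ between versions.

    import numpy as np
    import xgboost

    class NumpyBatches(xgboost.DataIter):
        """Yield pre-partitioned batches from disk, one at a time."""

        def __init__(self, batch_files):
            self._files = batch_files       # hypothetical list of .npz shards
            self._it = 0
            # cache_prefix tells XGBoost where to place the quantized cache.
            super().__init__(cache_prefix="./xgb-cache")

        def next(self, input_data) -> bool:
            if self._it == len(self._files):
                return False                # signal the end of iteration
            batch = np.load(self._files[self._it])
            input_data(data=batch["X"], label=batch["y"])
            self._it += 1
            return True

        def reset(self) -> None:
            self._it = 0

    it = NumpyBatches(["part-0.npz", "part-1.npz"])
    Xy = xgboost.ExtMemQuantileDMatrix(it)
    booster = xgboost.train({"tree_method": "hist"}, Xy, num_boost_round=100)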

Networking

Continuing the work from the previous release, we updated the network module to improve reliability. (#10453,#10756,#11111,#10914,#10828,#10735,#10693,#10676,#10349,#10397,#10566,#10526,#10349)

The timeout option is now supported for NCCL using the NCCL asynchronous mode (#10850,#10934,#10945,#10930).

In addition, a new Config class is added for users to specify various options, including the timeout and the tracker port, for distributed training. Both the Dask interface and the PySpark interface support the new configuration. (#11003,#10281,#10983,#10973)
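
As a hedged illustration only: the note above describes a new Config class for distributed training; the module path, field names, and the coll_cfg keyword in the sketch below are assumptions based on that description, not confirmed API.

    import xgboost
    from xgboost import collective

    # Assumed fields: a retry count, a timeout in seconds, and the tracker port.
    cfg = collective.Config(retry=2, timeout=300, tracker_port=9091)

    # The Dask and PySpark interfaces are described as accepting the configuration,
    # for example (keyword name assumed):
    # output = xgboost.dask.train(client, params, dtrain, coll_cfg=cfg)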

SYCL

Continuing the work on the SYCL integration, this release brings significant improvements in feature coverage, from more training parameters and more objectives to distributed training, along with various optimizations (#10884,#10883).

Starting with 3.0, the SYCL plugin is close to feature-complete; users can start working on SYCL devices for in-core training and inference. Newly introduced features include:

Other related PRs (#10842,#10543,#10806,#10943,#10987,#10548,#10922,#10898,#10576)

Features

This section describes new features in the XGBoost core. For language-specific features, please visit the corresponding sections.

  • A new initialization method for objectives derived from GLM. The new method is based on the mean value of the input labels and changes the result of the estimated base_score. (#10298,#11331)

  • The xgboost.QuantileDMatrix can be used with all prediction types for both CPU and GPU.

  • In prior releases, XGBoost made a copy of the booster to release memory held by internal tree methods. We have formalized the procedure into a new booster method, reset() / XGBoosterReset(); see the sketch after this list. (#11042)

  • The OpenMP thread setting is exposed in the XGBoost global configuration. Users can use it to work around hardcoded OpenMP environment variables. (#11175)

  • We improved learning-to-rank tasks for better hyper-parameter configuration and for distributed training.

    • In 3.0, all three distributed interfaces (Dask, Spark, and PySpark) support sorting the data based on query ID. The option for the DaskXGBRanker is true by default and can be opted out of. (#11146,#11007,#11047,#11012,#10823,#11023)

    • Also for learning to rank, a new parameter, lambdarank_score_normalization, is introduced to make one of the normalizations optional. (#11272)

    • The lambdarank_normalization now uses the number of pairs when normalizing the mean pair strategy. Previously, the gradient was used for both topk and mean. (#11322)

  • We have improved GPU quantile sketching to reduce memory usage. The improvement helps the construction of the QuantileDMatrix and the new ExtMemQuantileDMatrix.

    • A new multi-level sketching algorithm is employed to reduce the overall memory usage with batched inputs.

    • In addition to algorithmic changes, the internal memory usage estimation and the quantile container are also updated. (#10761,#10843)

    • The change introduces two more parameters for the QuantileDMatrix and DataIter, namely max_quantile_batches and min_cache_page_bytes.

  • More work is needed to improve support for categorical features. This release supports plotting trees with statistics for categorical nodes (#11053). In addition, some preparation work is ongoing for automatically re-coding categories. (#11094,#11114,#11089) These are feature enhancements rather than blocking issues.

  • Implement weight-based feature importance for vector-leaf. (#10700)

  • Reduced logging in the DMatrix construction. (#11080)
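
Referring to the reset() item above, here is a minimal sketch of releasing training-time memory before serving predictions; the data and parameters are placeholders.

    import numpy as np
    import xgboost

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1024, 16))
    y = X.sum(axis=1)

    booster = xgboost.train({"tree_method": "hist"}, xgboost.DMatrix(X, y))

    # Release caches held by the internal tree method instead of copying the booster.
    booster.reset()

    predictions = booster.predict(xgboost.DMatrix(X))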

Optimization

In addition to the external memory and quantile sketching improvements, we have a number of optimizations and performance fixes.

  • GPU tree methods now use significantly less memory for both dense inputs and near-dense inputs. (#10821,#10870)

  • For near-dense inputs, GPU training is much faster for both hist (about 2x) and approx.

  • Quantile regression on CPU can now handle imbalanced trees much more efficiently. (#11275)

  • Small optimization for DMatrix construction to reduce latency. Also, C users can now reuse the ProxyDMatrix for multiple inference calls. (#11273)

  • CPU prediction performance for QuantileDMatrix has been improved (#11139) and is now on par with the normal DMatrix.

  • Fixed a performance issue when running CPU inference with an extremely sparse QuantileDMatrix (#11250).

  • Optimize CPU training memory allocation for improved performance. (#11112)

  • Improved RMM (RAPIDS Memory Manager) integration. Now, with the help of config_context(), all memory allocated by XGBoost should be routed to RMM; see the sketch after this list. As a bonus, all thrust algorithms now use the async policy. (#10873,#11173,#10712,#10712,#10562)

  • When used without RMM, XGBoost is more careful with its use of the caching allocator to avoid holding too much device memory. (#10582)
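
Referring to the RMM item above, a minimal sketch of routing XGBoost's device allocations through RMM via config_context(); the pool setup shown is one common arrangement and assumes a working CUDA environment with the rmm package installed.

    import numpy as np
    import rmm
    import xgboost

    # Use an RMM pool allocator for the current device.
    rmm.reinitialize(pool_allocator=True)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2048, 32))
    y = X[:, 0]

    with xgboost.config_context(use_rmm=True):
        Xy = xgboost.QuantileDMatrix(X, y)
        booster = xgboost.train({"device": "cuda", "tree_method": "hist"}, Xy, num_boost_round=50)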

Breaking Changes

This section lists breaking changes that affect all packages.

  • Remove the deprecated DeviceQuantileDMatrix. (#10974,#10491)

  • Support for saving the model in the deprecated binary format has been removed. Users can still load old models in 3.0. (#10490)

  • Support for the legacy (blocking) CUDA stream is removed (#10607)

  • XGBoost now requires CUDA 12.0 or later.

Bug Fixes

  • Fix the quantile error metric (pinball loss) with multiple quantiles. (#11279)

  • Fix a potential access error when running prediction in a multi-threaded environment. (#11167)

  • Check the correct dump format for gblinear. (#10831)

Documentation

Python Package

  • The feature_weights parameter in the sklearn interface is now defined as a scikit-learn parameter. (#9506)

  • Initial support for polars; categorical features are not yet supported. (#11126,#11172,#11116)

  • Reduce pandas dataframe overhead and overhead for various imports. (#11058,#11068)

  • Better xlabel in plot_importance() (#11009)

  • Validate the reference dataset for training. The train() function now throws an error if a QuantileDMatrix is used as a validation dataset without a reference; a sketch appears at the end of this section. (#11105)

  • Fix misleading errors when feature names are missing during inference (#10814)

  • Add stacklevel to the Python warning callback. The change helps improve error messages from the Python package. (#10977)

  • Remove circular reference in DataIter. It helps reduce memory usage. (#11177)

  • Add checks for invalid inputs for cv. (#11255)

  • Update Python project classifiers. (#10381,#11028)

  • Support doc links for the sklearn module. Users can now find links to documentation in a Jupyter notebook. (#10287)

  • Dask

    • Prevent the training from hanging due to aborted workers. (#10985) This helps Dask XGBoost be robust against errors. When a worker is killed, the training will fail with an exception instead of hanging.

    • Optional support for client-side logging. (#10942)

    • Fix LTR with empty partition and NCCL error. (#11152)

    • Update to work with the latest Dask. (#11291)

    • See the Features section for changes to ranking models.

    • See the Networking section for changes to the communication module.

  • PySpark

    • Expose Training and Validation Metrics. (#11133)

    • Add barrier before initializing the communicator. (#10938)

    • Extend support for columnar input to CPU (GPU-only previously). (#11299)

    • See the Features section for changes to ranking models.

    • See the Networking section for changes to the communication module.

  • Document updates (#11265).

  • Maintenance. (#11071,#11211,#10837,#10754,#10347,#10678,#11002,#10692,#11006,#10972,#10907,#10659,#10358,#11149,#11178,#11248)

  • Breaking changes

    • Remove the deprecated feval. (#11051)

    • Remove dask from the default import. (#10935) Users are now required to import the XGBoost Dask module through:

      from xgboost import dask as dxgb

      instead of:

      import xgboost as xgb
      xgb.dask

      The change helps avoid introducing dask into the default import set.

    • Bump Python requirement to 3.10. (#10434)

    • Drop support for datatable. (#11070)
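
Following up on the reference-dataset validation item earlier in this section, a minimal sketch of constructing a validation QuantileDMatrix with a reference to the training data; the arrays are placeholders.

    import numpy as np
    import xgboost

    rng = np.random.default_rng(0)
    X_train, y_train = rng.normal(size=(1000, 8)), rng.normal(size=1000)
    X_valid, y_valid = rng.normal(size=(200, 8)), rng.normal(size=200)

    Xy_train = xgboost.QuantileDMatrix(X_train, y_train)
    # The validation set must reference the training set so both share the same quantile cuts.
    Xy_valid = xgboost.QuantileDMatrix(X_valid, y_valid, ref=Xy_train)

    booster = xgboost.train(
        {"tree_method": "hist"},
        Xy_train,
        evals=[(Xy_valid, "valid")],
        num_boost_round=50,
    )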

R Package

We have been reworking the R package for a few releases now. In 3.0, we will start publishing a new R package on R-universe, before moving toward a CRAN update. The new package features a much more ergonomic interface, which is also more idiomatic for R users. In addition, a range of new features are introduced to the package. To name a few, the new package includes categorical feature support, QuantileDMatrix, and an initial implementation of external memory training. To test the new package:

install.packages('xgboost', repos = c('https://dmlc.r-universe.dev', 'https://cloud.r-project.org'))

Also, we finally have an online documentation site for the R package featuring both vignettes and API references (#11166,#11257). A good starting point for the new interface is the new xgboost() function. We won’t list all the feature gains here, as there are too many! Please visit the XGBoost R Package documentation for more info. There’s a migration guide (#11197) there if you use a previous XGBoost R package version.

JVM Packages

The XGBoost 3.0 release features a significant update to the JVM packages, and in particular, the Spark package. There are breaking changes in packaging and some parameters; please visit the migration guide for related changes. The work brings new features and a more unified feature set between the CPU and GPU implementations. (#10639,#10833,#10845,#10847,#10635,#10630,#11179,#11184)

Maintenance

Code maintenance includes refactoring (#10531,#10573,#11069), cleanups (#11129,#10878,#11244,#10401,#10502,#11107,#11097,#11130,#10758,#10923,#10541,#10990), and improvements to tests (#10611,#10658,#10583,#11245,#10708), along with fixes for various compiler warnings and test dependencies (#10757,#10641,#11062,#11226). There are also miscellaneous updates, including some dev scripts and profiling annotations (#10485,#10657,#10854,#10718,#11158,#10697,#11276).

Lastly, dependency updates (#10362,#10363,#10360,#10373,#10377,#10368,#10369,#10366,#11032,#11037,#11036,#11035,#11034,#10518,#10536,#10586,#10585,#10458,#10547,#10429,#10517,#10497,#10588,#10975,#10971,#10970,#10949,#10947,#10863,#10953,#10954,#10951,#10590,#10600,#10599,#10535,#10516,#10786,#10859,#10785,#10779,#10790,#10777,#10855,#10848,#10778,#10772,#10771,#10862,#10952,#10768,#10770,#10769,#10664,#10663,#10892,#10979,#10978).

CI