Switching the import order (torch before xgboost) fixes the problem, and so does running export OMP_NUM_THREADS=1. From my quick debugging session, I think xgboost loads libomp from ...
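For anyone who wants to try both workarounds from inside a script rather than the shell, here is a rough sketch. It assumes the environment variable needs to be set before either library is imported, so that the OpenMP runtime picks it up when it initializes. The try/except guard is my addition so the snippet still runs if a package is missing.

```python
import os

# Workaround 2: limit OpenMP to one thread. Set this before the imports
# below (assumption: the runtime reads it when it is first loaded).
os.environ["OMP_NUM_THREADS"] = "1"

try:
    # Workaround 1: import torch before xgboost.
    import torch
    import xgboost
except ImportError:
    # Guard added for illustration only; not part of the workaround itself.
    torch = None
    xgboost = None
```

Either workaround alone was enough in my case, so you can drop the environment variable line if you only want to test the import order.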
Hi, I'm trying to use ExtMemQuantileDMatrix to train on a huge dataset on GPUs. For example, training on a 1 TB raw fp32 dataset with 4/8x RTX 4090 (24 GB) + 2/4 TB of memory (which is sufficient for the same dataset ...