How to write a CV splitter for ragged data? #31281
I was writing a grid search experiment for interpolation/smoothing. I train on some regular interval of data points and validate on the intermediate points, so here's my splitter:

```python
import numpy as np
from sklearn.model_selection import BaseCrossValidator


class InterpSplitter(BaseCrossValidator):
    def __init__(self, thin_step: int = 2):
        if thin_step < 2:
            raise ValueError("thin_step must be >= 2")
        self.thin_step = thin_step

    def split(self, t_obs: np.ndarray, x: None = None, groups: None = None):
        n_obs = len(t_obs)
        # Every thin_step-th observation is used for training ...
        train_set = tuple(range(n_obs))[:: self.thin_step]
        # ... and the observations in between are held out for validation.
        test_set = tuple(sorted(set(range(n_obs)) - set(train_set)))
        yield np.array(train_set), np.array(test_set)

    def get_n_splits(self, X=None, y=None, groups=None):
        return 1
```

If I have a list of trajectories of the same length, it should work to stack the data in shape (n_obs, n_trajectories). However, if the trajectories are different lengths, I can't stack them. The only solution I can think of is to pad with NaNs and stack, but that's a bad idea: it requires changes to the library's

Would love any advice on how to write a splitter for ragged (list of lists) data. Or, more generally, how to write a splitter for custom containers.
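One possible direction, offered only as a sketch and not something confirmed in this thread: instead of padding, concatenate the trajectories into one flat array and keep a `groups` vector that records which trajectory each row came from. The splitter can then thin each trajectory separately. The names `flatten_ragged` and `GroupedInterpSplitter` below are hypothetical, and the sketch assumes the estimator can work with flattened rows plus a group label. It also relies on the assumption that scikit-learn accepts any object exposing `split` and `get_n_splits` as `cv`; subclassing `BaseCrossValidator` as in the original splitter should work as well.

```python
import numpy as np


def flatten_ragged(trajectories):
    """Concatenate variable-length trajectories; label each row with its trajectory id."""
    lengths = [len(traj) for traj in trajectories]
    flat = np.concatenate([np.asarray(traj) for traj in trajectories])
    groups = np.repeat(np.arange(len(trajectories)), lengths)
    return flat, groups


class GroupedInterpSplitter:
    """Hypothetical splitter: thin every trajectory on its own, using `groups`."""

    def __init__(self, thin_step=2):
        if thin_step < 2:
            raise ValueError("thin_step must be >= 2")
        self.thin_step = thin_step

    def split(self, X, y=None, groups=None):
        if groups is None:
            raise ValueError("groups is required to separate the trajectories")
        groups = np.asarray(groups)
        train_mask = np.zeros(len(groups), dtype=bool)
        for g in np.unique(groups):
            idx = np.flatnonzero(groups == g)  # flat row indices of this trajectory
            train_mask[idx[:: self.thin_step]] = True  # keep every thin_step-th row
        # Single split: thinned rows for training, the rows in between for validation.
        yield np.flatnonzero(train_mask), np.flatnonzero(~train_mask)

    def get_n_splits(self, X=None, y=None, groups=None):
        return 1


# Two trajectories of different lengths (5 and 3 observations).
trajectories = [[0.0, 0.1, 0.2, 0.3, 0.4], [1.0, 1.1, 1.2]]
flat, groups = flatten_ragged(trajectories)
splitter = GroupedInterpSplitter(thin_step=2)
train_idx, test_idx = next(splitter.split(flat, groups=groups))
print(train_idx, test_idx)
```

With this layout the indices refer to rows of the flat array, so no padding is needed. If this is used with a grid search, `groups` would have to be passed through to the splitter, for example via the `groups` argument of `fit`.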