Nested cross-validation: why does the proposed example not have a hold-out set? #31510

Unanswered
SamGG asked this question in Q&A

I am learning Machine Learning and exploring nested cross-validation.

I don't understand the example given in scikit-learn. The model seems to learn from the whole dataset and the evaluation is not performed on a hold-out set.
See the scikit-learn documentation and the example implementation:

```python
# Setup abridged from the scikit-learn nested CV example
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC
import numpy as np

NUM_TRIALS = 30
X_iris, y_iris = load_iris(return_X_y=True)
p_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}
svm = SVC(kernel="rbf")
nested_scores = np.zeros(NUM_TRIALS)

# Loop for each trial
for i in range(NUM_TRIALS):
    # Choose cross-validation techniques for the inner and outer loops,
    # independently of the dataset.
    inner_cv = KFold(n_splits=4, shuffle=True, random_state=i)
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=i)

    # Nested CV with parameter optimization
    clf = GridSearchCV(estimator=svm, param_grid=p_grid, cv=inner_cv)
    nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv)
    nested_scores[i] = nested_score.mean()
```

From what I read in Applied Predictive Modeling by Kuhn & Johnson, the model resulting from the inner loop should be evaluated on the hold-out set of the outer loop, and the following post adheres to this point: machinelearningmastery blog
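For what it's worth, here is a rough sketch of the explicit version I have in mind, where the inner loop tunes hyperparameters on the outer training fold only and the tuned model is then scored on the outer hold-out fold. I am assuming the same iris/SVC setup as the scikit-learn example; the parameter grid here is just illustrative:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
p_grid = {"C": [1, 10, 100]}  # illustrative grid, not the full example's

outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)
inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)

outer_scores = []
for train_idx, test_idx in outer_cv.split(X):
    # Inner loop: hyperparameter search sees ONLY the outer training fold
    clf = GridSearchCV(SVC(kernel="rbf"), p_grid, cv=inner_cv)
    clf.fit(X[train_idx], y[train_idx])
    # The tuned model is evaluated on the outer hold-out fold,
    # which played no part in the hyperparameter selection
    outer_scores.append(clf.score(X[test_idx], y[test_idx]))

print(np.mean(outer_scores))
```

My understanding is that this is what the book describes, and I would like to know how it relates to the `cross_val_score(clf, ..., cv=outer_cv)` one-liner above.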

As I am far from a Python expert, could you tell me the advantages, drawbacks and purposes of both of these implementations?

I read #21621, but I am not sure it really answers my question. If it does, let me know and I will try to carefully understand it.

