13

I have trained a RandomForestClassifier from Python's scikit-learn module on a very large dataset. How can I save this model so that other people can apply it on their end? Thank you!

asked Apr 10, 2014 at 23:09 by user3038725

2 Answers

29

The recommended method is to use joblib, which handles the large NumPy arrays inside a fitted model more efficiently than plain pickle and can also compress the output file:

    import joblib  # older scikit-learn releases exposed this as sklearn.externals.joblib

    joblib.dump(clf, 'filename.pkl')

    # then your colleagues can load it
    clf = joblib.load('filename.pkl')

See the online docs
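As a rough illustration of the file-size point, here is a minimal sketch that saves the same forest with and without joblib's `compress` option and then checks that the reloaded model predicts identically. The synthetic dataset, the file names, and the compression level of 3 are assumptions made for the example, not part of the original answer.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small synthetic dataset standing in for the real training data (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file names, used only for this comparison.
    raw_path = os.path.join(tmp_dir, "forest_raw.joblib")
    packed_path = os.path.join(tmp_dir, "forest_packed.joblib")

    joblib.dump(clf, raw_path)                 # no compression
    joblib.dump(clf, packed_path, compress=3)  # zlib compression, level 3

    raw_size = os.path.getsize(raw_path)
    packed_size = os.path.getsize(packed_path)
    print(f"uncompressed: {raw_size} bytes, compressed: {packed_size} bytes")

    # Load the compressed file back, as a colleague would.
    restored = joblib.load(packed_path)

# The restored model should give the same predictions as the original.
same_predictions = np.array_equal(clf.predict(X), restored.predict(X))
print("predictions match:", same_predictions)
```

Higher `compress` levels trade slower dump and load times for smaller files, so it may be worth trying a couple of values on the real model before sharing it.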

answered Apr 11, 2014 at 13:09 by EdChum (edited by Tom Briggs)

5

Have you tried pickling the RandomForestClassifier using the pickle module and then saving it to disk?

Here’s an example based on the pickle docs:

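    import pickle

    from sklearn.ensemble import RandomForestClassifier

    classifier = RandomForestClassifier()  # set hyperparameters and call fit() as needed

    # Write the classifier to disk in binary mode.
    with open('classifier.pkl', 'wb') as output:
        pickle.dump(classifier, output)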

The “other people” could then reload the pickled object as follows:

    import pickle

    # Read the classifier back in binary mode.
    with open('classifier.pkl', 'rb') as f:
        classifier = pickle.load(f)
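One caveat when handing a pickle to other people: a pickled scikit-learn model is not guaranteed to load correctly under a different scikit-learn version, and a pickle file should only be loaded if it comes from a trusted source. The sketch below shows one possible way to make a version mismatch visible by storing the version next to the model. The dictionary layout, the `sklearn_version` key, and the synthetic data are assumptions for illustration, not a scikit-learn convention.

```python
import os
import pickle
import tempfile

import sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small synthetic dataset standing in for the real training data (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
classifier = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Hypothetical bundle layout: the model plus the library version that built it.
bundle = {"sklearn_version": sklearn.__version__, "model": classifier}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "classifier.pkl")

    # Sender side: write the bundle to disk.
    with open(path, "wb") as output:
        pickle.dump(bundle, output)

    # Receiver side: read it back (only do this with files you trust).
    with open(path, "rb") as f:
        received = pickle.load(f)

# Compare the recorded version with the one installed on the receiving machine.
version_ok = received["sklearn_version"] == sklearn.__version__
if not version_ok:
    print("warning: model was saved with scikit-learn", received["sklearn_version"])

loaded = received["model"]
predictions_match = bool((loaded.predict(X) == classifier.predict(X)).all())
print("predictions match:", predictions_match)
```

Note that the check only runs if unpickling itself succeeds, so it is also worth telling recipients which scikit-learn version to install.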

answered Apr 11, 2014 at 0:24 by Ryan D.W. (edited by NatNgs)

1 Comment

joblib is preferred and less verbose (i.e. smaller file): scikit-learn.org/stable/tutorial/basic/…
