I'm new to scikit-learn. I'm following the book “Hands-On Machine Learning with Scikit-Learn and TensorFlow”. All the models I try to implement perform quite poorly.
I don't get any errors, and the code is pretty much the same as in the book.
I installed the same scikit-learn version the book uses, in case that was the cause.
I'm using the default hyperparameters, so the models aren't well tuned for the task. But they perform far worse than in the book, and only just above random.
I think it might be because I'm using a laptop that is not very powerful, so the models stop training prematurely.
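On the hardware worry: as far as I know, estimators such as SGDClassifier stop on an iteration limit or a tolerance, not on elapsed time, so a slow machine should only make training take longer. A minimal sketch of how that could be checked, assuming a recent scikit-learn where `max_iter`, `tol`, `n_iter_` and `ConvergenceWarning` are available. The data comes from `make_classification` and is only a stand-in, not the book's dataset.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in data (assumption: this is not the book's dataset).
X, y = make_classification(n_samples=2000, n_features=20, random_state=42)

clf = SGDClassifier(max_iter=1000, tol=1e-3, random_state=42)

# Record any warnings raised while fitting instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clf.fit(X, y)

# True if scikit-learn reported that the iteration limit was reached.
hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)

print("epochs run:", clf.n_iter_, "of at most", clf.max_iter)
print("stopped at the iteration limit:", hit_limit)
```

If the number of epochs run is below `max_iter` and no convergence warning shows up, training ended because the tolerance criterion was met, regardless of how fast the machine is.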
I've tried LinearRegression, RandomForestRegressor, and SVR on a housing dataset of 20000 instances with 12 features.
I've also tried an SGDClassifier on the MNIST dataset as a binary classifier.
All of that following the book's instructions.
    # RANDOM FOREST MODEL
    from sklearn.ensemble import RandomForestRegressor

    forest_reg = RandomForestRegressor()
    forest_reg.fit(housing_pr, housing_labels)
    forest_reg_scores = cross_val_score(forest_reg, housing_pr, housing_labels,
                                        scoring="neg_mean_squared_error", cv=10)
    forest_reg_rmse_scores = np.sqrt(-forest_reg_scores)
    display_scores(forest_reg_rmse_scores)
The result is:

    Scores: [100358.84813795  59740.95594336  73069.35686091  58367.36656326
              70119.66693956  61570.40051825  49889.14813703  80314.78172767
              73177.26056318 102031.12922303]
    Mean: 72863.89146141837
    Standard deviation: 16454.877060423143
In the book, the mean is 52634.191 and the standard deviation is 1576.
The price range is between 120000 and 265000.
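For anyone who wants to reproduce the evaluation above without the housing data, here is a minimal self-contained sketch. Everything not in my snippet is an assumption: the data comes from `make_regression` as a synthetic stand-in, `n_estimators=10` and `random_state=42` are arbitrary choices, and `display_scores` is written here as a plain print helper that matches the output format shown above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def display_scores(scores):
    """Print per-fold scores with their mean and standard deviation."""
    print("Scores:", scores)
    print("Mean:", scores.mean())
    print("Standard deviation:", scores.std())


# Synthetic stand-in for the housing data (assumption, not the real dataset).
X, y = make_regression(n_samples=500, n_features=12, noise=10.0,
                       random_state=42)

forest_reg = RandomForestRegressor(n_estimators=10, random_state=42)

# cross_val_score clones and refits the model on every fold, so a separate
# fit() call beforehand is not required for the scores.
neg_mse = cross_val_score(forest_reg, X, y,
                          scoring="neg_mean_squared_error", cv=10)

# The scorer returns negative MSE, so negate before taking the square root.
rmse_scores = np.sqrt(-neg_mse)
display_scores(rmse_scores)
```

The numbers from this sketch are not comparable to the book's. It only shows the mechanics of turning the negative MSE scores into per-fold RMSE values.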
The difference is a lot larger on the MNIST dataset. When building a binary classifier to detect the digit 5, the book explains precision and recall. The author's values are 0.7687 and 0.79136, while mine are 0.092217 and 0.06972.
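A minimal sketch of how precision and recall for a "5 vs. not 5" classifier can be computed, in case it helps to compare steps. It makes several assumptions: `load_digits` (8x8 images bundled with scikit-learn) stands in for MNIST, predictions come from `cross_val_predict` with `cv=3`, and the `max_iter`, `tol` and `random_state` values are arbitrary choices for a recent scikit-learn version.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import cross_val_predict

# load_digits is a small stand-in for MNIST (assumption, not the book's data).
X, y = load_digits(return_X_y=True)
y_is_5 = (y == 5)  # binary target: True for the digit 5, False otherwise

sgd_clf = SGDClassifier(max_iter=1000, tol=1e-3, random_state=42)

# Out-of-fold predictions: each sample is predicted by a model
# that did not see it during training.
y_pred = cross_val_predict(sgd_clf, X, y_is_5, cv=3)

precision = precision_score(y_is_5, y_pred)  # TP / (TP + FP)
recall = recall_score(y_is_5, y_pred)        # TP / (TP + FN)
print("precision:", precision)
print("recall:", recall)
```

For scale: roughly 9% of the MNIST training images are 5s, so a precision of about 0.09 is close to what predictions carrying no information about the label would give.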