DEV Community
Shakir for AWS Community Builders


M/L Learning Byte: Linear Regression

Hello 👋, in this post we shall see the procedures involved in training a simple linear model with the Keras API in TensorFlow. Note that we will not optimize the model by training it iteratively with different parameters; we will focus on some of the standard steps involved. You may check this post 📄 for a refresher on some of the pandas methods we use here. Ready to go!!!

Sagemaker Studio Lab

I'll be doing this exercise on Amazon SageMaker Studio Lab 🆓. You can request an account there, and once it's approved you should receive a sign-up link. Note that the approval expires in 7 days, so sign up before then.

I log in to Studio Lab and start a runtime with CPU as the compute type.
(Image: Start runtime)

Open the project once the runtime has started, and make sure popups are allowed for this site in your browser. JupyterLab, i.e. the SageMaker Studio Lab environment, should open.

Click the plus icon next to the Getting Started notebook to see the launcher. From there, I launch a notebook 📔 with the sagemaker-distribution environment.
(Image: New notebook)

We will execute the code covered in this post in the notebook we just launched.

Dataset

Let's say we have a simple dataset like the one below (generated with ChatGPT):

Age (years) | Income (thousands) | Hours_Worked | Salary (thousands)
32          | 45                 | 50           | 70
41          | 50                 | 45           | 80
28          | 30                 | 60           | 60
35          | 38                 | 55           | 75
45          | 60                 | 42           | 90
29          | 32                 | 48           | 65
37          | 40                 | 35           | 75
42          | 55                 | 47           | 85
36          | 48                 | 38           | 80
31          | 35                 | 52           | 70

In simple terms, regression is about predicting labels/targets (numbers) from one or more inputs/features (numbers). We call it linear regression when a linear function can describe the relation between the features and the labels.

Let's consider Age, Income and Hours_Worked as the features and Salary as the label that we want to predict. As a baseline, we assume the model is linear, meaning the data should approximately fit a linear equation (y = w1x1 + w2x2 + w3x3 + b). In other words, you should be able to predict y (Salary) from x1 (Age), x2 (Income) and x3 (Hours_Worked) using this equation. However, you don't know the weights (w1, w2, w3) and the bias (b). Finding the best weights and bias is the model's job, and that is what happens when the model is trained, i.e. when it learns.
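To make the linear equation concrete, here is a minimal sketch in plain Python. The function name, the weights and the bias are made up purely for illustration; they are not the values the model learns later in this post.

```python
def predict_salary(age, income, hours_worked, weights, bias):
    """Evaluate y = w1*x1 + w2*x2 + w3*x3 + b for a single row."""
    w1, w2, w3 = weights
    return w1 * age + w2 * income + w3 * hours_worked + bias


# Hypothetical weights and bias, chosen only to illustrate the equation.
example_weights = (0.5, 1.0, 0.1)
example_bias = 4.0

# First row of the dataset: Age=32, Income=45, Hours_Worked=50.
prediction = predict_salary(32, 45, 50, example_weights, example_bias)
print(prediction)
```

Training is the search for the weights and bias that bring such predictions as close as possible to the real Salary values across all rows.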

Datasets are usually quite large and are often loaded from URLs. We have chosen a small dataset here so that the concepts covered in this post are easier to follow.

File

Add a file in our Studio Lab project that holds the dataset in CSV format.

%%writefile dataset.csv
Age (years),Income (thousands),Hours_Worked,Salary (thousands)
32,45,50,70
41,50,45,80
28,30,60,60
35,38,55,75
45,60,42,90
29,32,48,65
37,40,35,75
42,55,47,85
36,48,38,80
31,35,52,70

Writing dataset.csv

Data readiness

Let's load our dataset and shuffle it.

import pandas as pd

df = pd.read_csv('dataset.csv')
df = df.sample(frac=1)

Let's add extra columns to the dataframe by min-max scaling each of the features.

for feature in ['Age (years)', 'Income (thousands)', 'Hours_Worked']:
    df[f'scaled_{feature}'] = (df[feature] - df[feature].min()) / (df[feature].max() - df[feature].min())
print(df.head(1))

   Age (years)  Income (thousands)  Hours_Worked  Salary (thousands)  \
9           31                  35            52                  70

   scaled_Age (years)  scaled_Income (thousands)  scaled_Hours_Worked
9            0.176471                   0.166667                 0.68
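As a sanity check on the min-max scaling, the sketch below recomputes the scaled values for row 9 (Age 31, Income 35, Hours_Worked 52) in plain Python. The helper name `min_max_scale` is just for illustration; the column values are copied from the dataset above.

```python
def min_max_scale(values):
    """Scale values to the 0..1 range: (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Columns copied from dataset.csv, in file order (row 9 is the last row).
ages = [32, 41, 28, 35, 45, 29, 37, 42, 36, 31]
incomes = [45, 50, 30, 38, 60, 32, 40, 55, 48, 35]
hours = [50, 45, 60, 55, 42, 48, 35, 47, 38, 52]

scaled_row_9 = (
    round(min_max_scale(ages)[9], 6),
    round(min_max_scale(incomes)[9], 6),
    round(min_max_scale(hours)[9], 6),
)
print(scaled_row_9)
```

The three numbers agree with the scaled columns printed by `df.head(1)`.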

We can now split the dataframe into training (80%) and test (20%) dataframes.

train_df = df.sample(frac=0.8)
test_df = df.drop(train_df.index)
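The split above is random, so every run gives different rows. pandas' `sample` accepts a `random_state` argument to make it repeatable. The idea behind "sample 80%, then drop those rows to get the rest" can be sketched with the standard library alone; the seed value 42 here is an arbitrary assumption.

```python
import random

row_indices = list(range(10))   # the 10 rows of the dataset

# A fixed seed makes the split reproducible; 42 is an arbitrary choice.
rng = random.Random(42)
train_indices = rng.sample(row_indices, k=int(len(row_indices) * 0.8))

# Everything that was not sampled for training becomes the test set.
test_indices = [i for i in row_indices if i not in train_indices]

print(sorted(train_indices), test_indices)
```

With 10 rows this gives 8 training rows and 2 test rows, and no row lands in both sets.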

Model

We have the data ready. It's time to create the model.

We will be building a sequential model for this purpose with just one layer. That layer will have 3 inputs (features) and 1 output (label).

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[3])
])

Note that sequential models are used in Keras when there is a stack of layers, with each layer having one input tensor and one output tensor.

A tensor is TensorFlow's version of a NumPy array with more features, which in turn is similar to a list in Python but with extra attributes/methods.

In our case there are 3 features, yet they form only one tensor; think of it as a rectangular matrix with 3 columns. Likewise, though there is only one output/label, it is still one tensor (a single-column matrix).

We have created the model. Initially it has random weights and a zero bias. Collectively, the weights and the bias are referred to as just weights.

w, b = model.weights
tf.print('Initial weights:', w)
tf.print('Initial bias:', b)

Initial weights: [[0.787701]
 [-0.283494174]
 [0.238811135]]
Initial bias: [0]

One thing to note: TensorFlow is usually known for Deep Neural Networks (DNN). What we have done follows the same approach we would use for neural networks, but our model is not deep, as it has just 1 layer (depth = 1), and not wide either, with just 1 unit in the layer (width = 1). We also do not have any activation functions, which are used when we need non-linear functions (for example the rectifier function) to map the input to the output.

Compile

Our model's performance can be calculated with a loss function. Mean absolute error is one such loss function used with regression. There should also be a way (an algorithm) to reduce this loss by adjusting the weights and bias, and that is the optimizer. Adam is one popularly used optimizer.

Let's compile our model with these settings.

model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss='mean_absolute_error'
)
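To see what the mean absolute error loss measures, here is a small plain-Python sketch. The all-zero predictions are a hypothetical stand-in for an untrained model whose outputs are close to zero; they are not actual model output.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |label - prediction| over all rows."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Salary column of the dataset (in thousands).
salaries = [70, 80, 60, 75, 90, 65, 75, 85, 80, 70]

# Hypothetical predictions from a model that outputs zero for every row.
zero_predictions = [0.0] * len(salaries)

mae_zero = mean_absolute_error(salaries, zero_predictions)
print(mae_zero)
```

Because the error is an average of absolute differences, it is in the same unit as the label (thousands), which makes it easy to judge how far off the predictions are.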

Train

We can finally train (fit) the model and assign the result to a variable. We shall keep 20% of the training data as validation data, and determine the loss for each of these subsets. I have set verbose to 0 to suppress the output while the training happens.

features = ['scaled_Age (years)', 'scaled_Income (thousands)', 'scaled_Hours_Worked']
label = 'Salary (thousands)'

history = model.fit(
    train_df[features],
    train_df[label],
    validation_split=0.2,
    verbose=0
)

We have done the training, so let's see what the loss is.

print(history.history)

{'loss': [78.85342407226562], 'val_loss': [70.26549530029297]}

So the training loss is approximately 79 and the validation loss approximately 70. There is a parameter called epochs that sets how many full passes over the training dataset the training makes.

print(len(history.epoch))

1

So by default it's just 1 epoch.
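It helps to know how little work one epoch does here. The sketch below is a back-of-the-envelope calculation, assuming Keras' default batch size of 32 (we did not pass batch_size to fit) and assuming the validation rows are taken from the end of the 8 training rows, with the row count rounded down.

```python
import math

n_train_rows = 8          # 80% of the 10-row dataset
validation_split = 0.2
batch_size = 32           # Keras default when batch_size is not passed to fit()

# Assumption: rows kept for fitting are rounded down, the rest go to validation.
n_fit_rows = int(n_train_rows * (1 - validation_split))
n_val_rows = n_train_rows - n_fit_rows

# Number of weight updates (batches) in one epoch.
steps_per_epoch = math.ceil(n_fit_rows / batch_size)

print(n_fit_rows, n_val_rows, steps_per_epoch)
```

Under these assumptions the model fits on 6 rows, validates on 2 rows, and updates its weights only once per epoch.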

Let's try with 10 epochs.

history = model.fit(
    train_df[features],
    train_df[label],
    validation_split=0.2,
    verbose=0,
    epochs=10
)
print(history.history)

{'loss': [78.8173599243164, 78.81478881835938, 78.81220245361328, 78.80962371826172, 78.80704498291016, 78.8044662475586, 78.80188751220703, 78.79930877685547, 78.79672241210938, 78.79415130615234], 'val_loss': [70.24092102050781, 70.23916625976562, 70.23741149902344, 70.23565673828125, 70.23390197753906, 70.23213958740234, 70.23037719726562, 70.22862243652344, 70.22686767578125, 70.22511291503906]}

So this time we see the training and validation losses for 10 epochs. We can access just the final training and validation losses with the last index.

print('Final training loss:', history.history['loss'][-1])
print('Final validation loss:', history.history['val_loss'][-1])

Final training loss: 78.79415130615234
Final validation loss: 70.22511291503906

We can see there is not much improvement in the losses from increasing the epochs; the loss was kinda similar in all of them. We will try with a higher value, say 1000 epochs.

history = model.fit(
    train_df[features],
    train_df[label],
    validation_split=0.2,
    verbose=0,
    epochs=1000
)
print('done')

done

As there are 1000 losses each for training and validation, rather than printing them, we can plot the loss at each epoch.

import matplotlib.pyplot as plt

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

(Image: Plot loss)

Let's see what our final weights and bias are.

w, b = model.weights
tf.print(w, b)

[[1.7877351]
 [0.716494739]
 [1.23881245]] [0.999988]
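Comparing these values with the initial weights printed earlier gives a hint about why the loss moved so slowly. The sketch below is a rough estimate, not an exact account of Adam. It assumes one weight update per epoch (the few training rows fit in a single default batch of 32), Adam's default learning rate of 0.001 in Keras, and that each Adam step changes a parameter by roughly at most the learning rate.

```python
# Values copied from the outputs in this post: three weights, then the bias.
initial = [0.787701, -0.283494174, 0.238811135, 0.0]
final = [1.7877351, 0.716494739, 1.23881245, 0.999988]

total_epochs = 1 + 10 + 1000   # the three fit() calls continue from each other
updates_per_epoch = 1          # assumption: one batch per epoch
learning_rate = 0.001          # Adam's default in Keras

# Rough limit on how far any single parameter could have moved.
rough_limit = total_epochs * updates_per_epoch * learning_rate

changes = [f - i for i, f in zip(initial, final)]
print([round(c, 4) for c in changes], round(rough_limit, 3))
```

Each parameter moved by about 1.0, close to the rough limit of about 1.011. With labels in the 60 to 90 range, that is far too little movement to bring the loss down much, so a larger learning rate or many more updates would be needed.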

Note that these graphs are not the best, and our example is not the best either, as it was quite a small dataset. The aim of this exercise is not really to optimize the training or to get the best loss values, or the weights and bias that give the best loss. It was more about knowing the procedures involved in training a simple (one layer, one unit) linear network with TensorFlow.

Evaluate & Predict

We'll see a couple more steps. First, we can evaluate our model with the test dataset, i.e. we see what the test loss is.

model.evaluate(
    test_df[features],
    test_df[label],
    verbose=0
)

76.93582153320312

Next, we can predict the values for a new dataset that doesn't have labels. Let's add a new file for the prediction dataset.

%%writefile to_predict.csv
Age (years),Income (thousands),Hours_Worked
33,46,49
38,52,44
27,28,59
44,58,43
30,34,51
50,70,30
29,33,47
34,39,56
41,54,41
48,65,36

Writing to_predict.csv

We can scale the features just like we did for the training data.

to_predict = pd.read_csv('to_predict.csv')
to_predict = (to_predict - to_predict.min()) / (to_predict.max() - to_predict.min())
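One thing to keep in mind: the code above scales the new data with its own minimum and maximum, so the same raw value can map to a different scaled value than it did during training. A common alternative, shown here only as a sketch and not what this post does, is to reuse the minimum and maximum from the training data. The helper name `scale` is just for illustration.

```python
def scale(values, lo, hi):
    """Min-max scale values using the given minimum and maximum."""
    return [(v - lo) / (hi - lo) for v in values]


# Age column of dataset.csv (used for training) and of to_predict.csv.
train_ages = [32, 41, 28, 35, 45, 29, 37, 42, 36, 31]
new_ages = [33, 38, 27, 44, 30, 50, 29, 34, 41, 48]

# What the post does: scale the new data with its own min and max.
own_scaled = scale(new_ages, min(new_ages), max(new_ages))

# Alternative: reuse the min and max seen during training.
reused_scaled = scale(new_ages, min(train_ages), max(train_ages))

print(round(own_scaled[0], 6), round(reused_scaled[0], 6))
```

With the reused statistics, an age of 33 maps to about 0.294 instead of 0.261, and values outside the training range (such as age 50) fall outside 0 to 1, which tells you the model is being asked to extrapolate.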

We can predict now.

print(model.predict(to_predict))

1/1 [==============================] - 0s 38ms/step
[[2.5850587]
 [2.8624647]
 [2.2388005]
 [3.3884692]
 [2.2325983]
 [3.5042179]
 [1.9669406]
 [2.842394 ]
 [3.0016134]
 [3.5197716]]

I know the predictions are bad: the model is predicting quite low salaries 💵 compared to the training set.

Math

Let's see the math used in calculating the predictions. We know the final weights are 1.7877351, 0.716494739 and 1.23881245, and the bias is 0.999988. Let's take the first row from to_predict.

print(to_predict.head(1))

   Age (years)  Income (thousands)  Hours_Worked
0      0.26087            0.428571      0.655172

Let's do the math with the linear equation y = w1x1 + w2x2 + w3x3 + b.
This becomes y = 1.7877351*0.26087 + 0.716494739*0.428571 + 1.23881245*0.655172 + 0.999988 = 2.585058552816369. This matches the first entry of the predictions (2.5850587) up to rounding.
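The same arithmetic can be repeated for every row. The sketch below is plain Python and uses only values printed in this post: the rows of to_predict.csv, the final weights and bias, and the predictions returned by the model. It scales each column with its own minimum and maximum (as done above) and applies the linear equation.

```python
# Rows of to_predict.csv: (Age, Income, Hours_Worked).
rows = [
    (33, 46, 49), (38, 52, 44), (27, 28, 59), (44, 58, 43), (30, 34, 51),
    (50, 70, 30), (29, 33, 47), (34, 39, 56), (41, 54, 41), (48, 65, 36),
]
weights = (1.7877351, 0.716494739, 1.23881245)
bias = 0.999988

# Predictions printed by model.predict in this post.
model_output = [2.5850587, 2.8624647, 2.2388005, 3.3884692, 2.2325983,
                3.5042179, 1.9669406, 2.842394, 3.0016134, 3.5197716]

# Min-max scale each column with that column's own min and max.
columns = list(zip(*rows))
scaled_columns = [
    [(v - min(col)) / (max(col) - min(col)) for v in col] for col in columns
]
scaled_rows = list(zip(*scaled_columns))

# y = w1*x1 + w2*x2 + w3*x3 + b for every row.
manual = [sum(w * x for w, x in zip(weights, row)) + bias for row in scaled_rows]

for ours, theirs in zip(manual, model_output):
    print(round(ours, 6), theirs)
```

Every manually computed value lines up with the model's output to within rounding, which confirms that a single Dense unit without an activation function is just this linear equation.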

Summary

So we saw some important ⭐ steps such as creating, training, evaluating and predicting with a model. We could build upon this knowledge to try regression with a bigger dataset, optimize our model for lower losses, fine-tune parameters and get better predictions. These tasks are kinda iterative in nature and are usually implemented with automated workflows, i.e. pipelines.

That's it for the post, thanks for reading!!!
