Machine Learning : Simple Linear Regression with Python, ASSIGNMENT

SOLUTIONS

In this task we predict students' scores from the number of hours they studied. This is a simple linear regression problem because there is only one input variable (Hours) and one output variable (Scores).
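As a rough illustration of what "simple linear regression" means here, the sketch below fits a straight line Scores ≈ intercept + slope * Hours using the textbook least-squares formulas. It is not part of the original assignment: the helper name `fit_line` is hypothetical, the values are typed in from the dataset printed in this post, and the line is fitted on all 23 rows rather than on a training split, so its coefficients will differ slightly from the scikit-learn model trained later.

```python
import numpy as np

# Hours and Scores typed in from the dataset printed in this post (23 rows).
hours = np.array([7.7, 5.9, 4.5, 3.3, 1.1, 8.9, 2.5, 1.9, 2.7, 8.3, 5.5, 9.2,
                  1.5, 3.5, 8.5, 3.2, 6.5, 2.5, 9.6, 4.3, 4.1, 3.0, 2.6])
scores = np.array([79, 60, 45, 33, 12, 87, 21, 19, 29, 81, 58, 88,
                   14, 34, 85, 32, 66, 21, 96, 42, 40, 30, 25], dtype=float)


def fit_line(x, y):
    """Hypothetical helper: ordinary least squares for one predictor."""
    x_mean, y_mean = x.mean(), y.mean()
    # slope = sum of co-deviations of x and y / sum of squared deviations of x
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # The fitted line always passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept


slope, intercept = fit_line(hours, scores)
print(f"Scores = {intercept:.2f} + {slope:.2f} * Hours")
```

The slope can be read as the expected change in score for one additional hour of study.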

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

data = pd.read_csv("/content/StudentHoursScores.csv")

print(data)

OUTPUT:

    Hours  Scores
0     7.7      79
1     5.9      60
2     4.5      45
3     3.3      33
4     1.1      12
5     8.9      87
6     2.5      21
7     1.9      19
8     2.7      29
9     8.3      81
10    5.5      58
11    9.2      88
12    1.5      14
13    3.5      34
14    8.5      85
15    3.2      32
16    6.5      66
17    2.5      21
18    9.6      96
19    4.3      42
20    4.1      40
21    3.0      30
22    2.6      25

# Scatter plot: the rows are not sorted by Hours, so a line plot would zigzag between points
plt.scatter(data['Hours'], data['Scores'], color='g')
plt.xlabel('Hours')
plt.ylabel('Scores')
plt.title('StudentHoursScores')
plt.show()



print(data.head())
print()
print(data.tail())
print()
print(data.shape)
print()
print(data.dtypes)
print()
print(data.columns)
print()
print(data.corr())
print()
print(data.describe())
print()
print(data.min())
print()
print(data.info())
print()
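# Features: every column except the last, kept 2-D as scikit-learn expects
x = data.iloc[:, :-1]
# Target: the last column (Scores) as a 1-D Series
y = data.iloc[:, -1]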
print(x.head())
print(y.head())

OUTPUT:

   Hours  Scores
0    7.7      79
1    5.9      60
2    4.5      45
3    3.3      33
4    1.1      12

    Hours  Scores
18    9.6      96
19    4.3      42
20    4.1      40
21    3.0      30
22    2.6      25

(23, 2)

Hours     float64
Scores      int64
dtype: object

Index(['Hours', 'Scores'], dtype='object')

           Hours    Scores
Hours   1.000000  0.997656
Scores  0.997656  1.000000

           Hours     Scores
count  23.000000  23.000000
mean    4.817391  47.695652
std     2.709688  27.103228
min     1.100000  12.000000
25%     2.650000  27.000000
50%     4.100000  40.000000
75%     7.100000  72.500000
max     9.600000  96.000000

Hours      1.1
Scores    12.0
dtype: float64

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23 entries, 0 to 22
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Hours   23 non-null     float64
 1   Scores  23 non-null     int64  
dtypes: float64(1), int64(1)
memory usage: 496.0 bytes
None

   Hours
0    7.7
1    5.9
2    4.5
3    3.3
4    1.1
0    79
1    60
2    45
3    33
4    12
Name: Scores, dtype: int64


from sklearn.model_selection import train_test_split

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size = 0.2, random_state = 0)

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(xtrain, ytrain)
ypred = model.predict(xtest)
print("Prediction of testing data by model:\n", ypred)

OUTPUT:
Prediction of testing data by model:
 [91.81882791 54.56931042 29.40071751 84.7716219  40.47489839]
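The predictions above are easier to judge next to the actual test scores. The following is a minimal sketch, not part of the original assignment, that repeats the same split and fit and then compares predicted and actual values. Choosing mean absolute error and R² (from `sklearn.metrics`) is an assumption on my part, and the data is typed inline from the table printed earlier so the sketch runs without the CSV file.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Same 23 rows as the dataset printed earlier in this post.
data = pd.DataFrame({
    "Hours": [7.7, 5.9, 4.5, 3.3, 1.1, 8.9, 2.5, 1.9, 2.7, 8.3, 5.5, 9.2,
              1.5, 3.5, 8.5, 3.2, 6.5, 2.5, 9.6, 4.3, 4.1, 3.0, 2.6],
    "Scores": [79, 60, 45, 33, 12, 87, 21, 19, 29, 81, 58, 88,
               14, 34, 85, 32, 66, 21, 96, 42, 40, 30, 25],
})

x = data[["Hours"]]  # 2-D feature frame
y = data["Scores"]   # 1-D target

# Same split settings as in the post: 20% test data, fixed random seed.
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=0)

model = LinearRegression()
model.fit(xtrain, ytrain)
ypred = model.predict(xtest)

# Put actual and predicted scores side by side for the held-out rows.
comparison = xtest.copy()
comparison["Actual"] = ytest
comparison["Predicted"] = ypred
print(comparison)

# Summary metrics (an illustrative choice, not prescribed by the assignment).
mae = mean_absolute_error(ytest, ypred)
r2 = r2_score(ytest, ypred)
print("Mean absolute error:", mae)
print("R^2 on test data:", r2)

# The learned line: Scores = intercept + slope * Hours
print("Slope:", model.coef_[0])
print("Intercept:", model.intercept_)
```

With only five test rows these metrics are a rough check rather than a reliable estimate of how the model would perform on new students.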


Pulipati Sravya

Machine Learning : Simple Linear Regression with Python, ASSIGNMENT - 7, GO_STP_8113
