Machine Learning : Simple Linear Regression with Python, ASSIGNMENT - 7, GO_STP_8113

SOLUTIONS


In this task we predict students' scores from the number of hours they studied. This is a simple linear regression problem because there is a single predictor (Hours) and a single target (Scores).
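To make "simple linear regression" concrete: the model is a straight line, Scores = intercept + slope * Hours, with the slope and intercept chosen to minimise the squared errors. The sketch below is only an illustration, not the method used in this assignment. It assumes the 23 rows printed in the output further down, typed in by hand, and the variable names (hours, scores, slope, intercept) are illustrative. Because it fits on all rows rather than on a training split, its numbers will differ slightly from the scikit-learn model fitted later.

```python
import numpy as np

# Hours and Scores copied by hand from the dataset printed in this post.
hours = np.array([7.7, 5.9, 4.5, 3.3, 1.1, 8.9, 2.5, 1.9, 2.7, 8.3, 5.5, 9.2,
                  1.5, 3.5, 8.5, 3.2, 6.5, 2.5, 9.6, 4.3, 4.1, 3.0, 2.6])
scores = np.array([79, 60, 45, 33, 12, 87, 21, 19, 29, 81, 58, 88,
                   14, 34, 85, 32, 66, 21, 96, 42, 40, 30, 25], dtype=float)

# Closed-form least-squares estimates for: scores = intercept + slope * hours
x_mean, y_mean = hours.mean(), scores.mean()
slope = np.sum((hours - x_mean) * (scores - y_mean)) / np.sum((hours - x_mean) ** 2)
intercept = y_mean - slope * x_mean

print("slope:", slope)
print("intercept:", intercept)

# Predicted score for a hypothetical student who studies 5 hours
print("predicted score for 5 hours:", intercept + slope * 5.0)
```

The slope can be read as the expected change in score for each additional hour of study.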

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv("/content/StudentHoursScores.csv")
print(data)

OUTPUT:

    Hours  Scores
0     7.7      79
1     5.9      60
2     4.5      45
3     3.3      33
4     1.1      12
5     8.9      87
6     2.5      21
7     1.9      19
8     2.7      29
9     8.3      81
10    5.5      58
11    9.2      88
12    1.5      14
13    3.5      34
14    8.5      85
15    3.2      32
16    6.5      66
17    2.5      21
18    9.6      96
19    4.3      42
20    4.1      40
21    3.0      30
22    2.6      25


# Scatter plot: the rows are not sorted by Hours, so a line plot would zigzag between points
plt.scatter(data['Hours'], data['Scores'], color='g')
plt.xlabel('Hours')
plt.ylabel('Scores')
plt.title('StudentHoursScores')
plt.show()



print(data.head())
print()
print(data.tail())
print()
print(data.shape)
print()
print(data.dtypes)
print()
print(data.columns)
print()
print(data.corr())
print()
print(data.describe())
print()
print(data.min())
print()
print(data.info())
print()
x = data.iloc[:,0:-1]
y = data.iloc[:, 1]
print(x.head())
print(y.head())

OUTPUT:

   Hours  Scores
0    7.7      79
1    5.9      60
2    4.5      45
3    3.3      33
4    1.1      12

    Hours  Scores
18    9.6      96
19    4.3      42
20    4.1      40
21    3.0      30
22    2.6      25

(23, 2)

Hours     float64
Scores      int64
dtype: object

Index(['Hours', 'Scores'], dtype='object')

           Hours    Scores
Hours   1.000000  0.997656
Scores  0.997656  1.000000

           Hours     Scores
count  23.000000  23.000000
mean    4.817391  47.695652
std     2.709688  27.103228
min     1.100000  12.000000
25%     2.650000  27.000000
50%     4.100000  40.000000
75%     7.100000  72.500000
max     9.600000  96.000000

Hours      1.1
Scores    12.0
dtype: float64

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23 entries, 0 to 22
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Hours   23 non-null     float64
 1   Scores  23 non-null     int64
dtypes: float64(1), int64(1)
memory usage: 496.0 bytes
None

   Hours
0    7.7
1    5.9
2    4.5
3    3.3
4    1.1
0    79
1    60
2    45
3    33
4    12
Name: Scores, dtype: int64


from sklearn.model_selection import train_test_split

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size = 0.2, random_state = 0)

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(xtrain, ytrain)
ypred = model.predict(xtest)
print("Prediction of testing data by model:\n", ypred)

OUTPUT:
Prediction of testing data by model:
 [91.81882791 54.56931042 29.40071751 84.7716219  40.47489839]
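The predictions on their own do not show how close the model is to the actual scores. One possible way to check, sketched below, is to put the predicted and actual test scores side by side and compute a couple of standard metrics. This is an addition, not part of the original assignment: it rebuilds the dataset by hand from the printed output instead of reading the CSV, and the choice of mean absolute error and R^2 as metrics is an assumption.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Dataset rebuilt by hand from the printed output (assumption: it matches the CSV)
data = pd.DataFrame({
    "Hours": [7.7, 5.9, 4.5, 3.3, 1.1, 8.9, 2.5, 1.9, 2.7, 8.3, 5.5, 9.2,
              1.5, 3.5, 8.5, 3.2, 6.5, 2.5, 9.6, 4.3, 4.1, 3.0, 2.6],
    "Scores": [79, 60, 45, 33, 12, 87, 21, 19, 29, 81, 58, 88,
               14, 34, 85, 32, 66, 21, 96, 42, 40, 30, 25],
})

x = data[["Hours"]]   # 2-D feature frame, as scikit-learn expects
y = data["Scores"]    # 1-D target series

# Same split settings as in the post
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=0)

model = LinearRegression()
model.fit(xtrain, ytrain)
ypred = model.predict(xtest)

# Actual vs. predicted scores for the held-out rows
comparison = pd.DataFrame({"Hours": xtest["Hours"], "Actual": ytest, "Predicted": ypred})
print(comparison)

# Fitted line and error metrics on the test set
print("Slope:", model.coef_[0])
print("Intercept:", model.intercept_)
print("Mean absolute error:", mean_absolute_error(ytest, ypred))
print("R^2:", r2_score(ytest, ypred))
```

With only five test rows the metrics are a rough indication at best; a different random_state would give a different split and somewhat different numbers.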


