Posts

Predicting a Startup's Profit/Success Rate using Multiple Linear Regression in Python, ASSIGNMENT - 8, GO_STP_8113

The 50 Startups dataset contains five columns: “R&D Spend”, “Administration”, “Marketing Spend”, “State”, and “Profit”. The first three columns give the spending on research, administration, and marketing respectively. State indicates the state in which the startup is based. Profit indicates how much profit the startup earned. Because there is more than one independent variable, this is a multiple linear regression problem.

Prepare a prediction model for profit of 50_Startups data in Python

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd

    dataset = pd.read_csv('50_Startups.csv')
    X = dataset.iloc[:, :-1]
    y = dataset.iloc[:, 4]

    states = pd.get_dummies(X['State'], drop_first=True)
    X = X.drop('State', axis=1)
    X = pd.concat([X, states], axis=1)

    from sklearn.model_selection import train_test_split
    X_train,...
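The excerpt above stops before the model is fitted, so the following is only a sketch of how the encoded features might feed a multiple linear regression. It assumes a small synthetic frame (`demo`, with invented coefficients 10, 0.8 and 0.1) instead of the real 50_Startups.csv, and it uses ordinary least squares through `numpy.linalg.lstsq` as a stand-in for whatever estimator the original post goes on to use.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for 50_Startups.csv (assumption: the real file is not used here).
n = 30
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "R&D Spend": rng.uniform(0, 100, n),
    "Administration": rng.uniform(0, 100, n),
    "Marketing Spend": rng.uniform(0, 100, n),
    "State": np.resize(["California", "Florida", "New York"], n),
})
# Invented, noise-free relationship so the fitted weights are easy to check.
demo["Profit"] = 10 + 0.8 * demo["R&D Spend"] + 0.1 * demo["Marketing Spend"]

# Same preparation steps as in the post.
X = demo.iloc[:, :-1]
y = demo.iloc[:, 4]
states = pd.get_dummies(X["State"], drop_first=True)  # drops "California"
X = pd.concat([X.drop("State", axis=1), states], axis=1)

# Ordinary least squares with an explicit intercept column.
A = np.column_stack([np.ones(n), X.to_numpy(dtype=float)])
coef, *_ = np.linalg.lstsq(A, y.to_numpy(dtype=float), rcond=None)
intercept = coef[0]
weights = dict(zip(X.columns, coef[1:]))

print("intercept:", round(intercept, 3))
for name, value in weights.items():
    print(name, round(value, 3))
```

`drop_first=True` leaves one dummy column fewer than there are states, which avoids a column that is fully determined by the others plus the intercept.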

Machine Learning : Simple Linear Regression with Python, ASSIGNMENT - 7, GO_STP_8113

SOLUTIONS

In this task we have to predict students' scores from their study hours. This is a simple linear regression problem because it has only two variables: one independent variable (Hours) and one dependent variable (Scores).

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    data = pd.read_csv("/content/StudentHoursScores.csv")
    print(data)

OUTPUT:

        Hours  Scores
    0     7.7      79
    1     5.9      60
    2     4.5      45
    3     3.3      33
    4     1.1      12
    5     8.9      87
    6     2.5      21
    7     1.9      19
    8     2.7      29
    9     8.3      81
    10    5.5      58
    11    9.2      88
    12    1.5      14
    13    3.5      34
    14    8.5      85
    15    3.2      32
    16    6.5      66
    17    2.5      21
    18    9.6      96
    19    4.3      42
    20    4.1      40
    21    3.0      30
    22    2.6      25

    plt.plot(data['Hours'], data['Scores'], color='g')
    plt.xlabel('Hours')
    plt.ylabel('Scores')
    plt.title('StudentHoursScores')
    plt.show()

    print(data.head())
    print()
    print(data.ta...
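To make the idea of a two-variable regression concrete, here is a minimal sketch that fits a straight line to the 23 rows printed above using the closed-form least-squares formulas. It is not necessarily how the post itself fits the model (the excerpt is cut off before that step); the names `slope`, `intercept` and `predict` are illustrative.

```python
from statistics import mean

# The 23 (Hours, Scores) rows from the printed output above.
hours = [7.7, 5.9, 4.5, 3.3, 1.1, 8.9, 2.5, 1.9, 2.7, 8.3, 5.5, 9.2,
         1.5, 3.5, 8.5, 3.2, 6.5, 2.5, 9.6, 4.3, 4.1, 3.0, 2.6]
scores = [79, 60, 45, 33, 12, 87, 21, 19, 29, 81, 58, 88,
          14, 34, 85, 32, 66, 21, 96, 42, 40, 30, 25]

x_bar, y_bar = mean(hours), mean(scores)

# slope = covariance(x, y) / variance(x); intercept follows from the means.
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
         / sum((x - x_bar) ** 2 for x in hours))
intercept = y_bar - slope * x_bar


def predict(h):
    """Predicted score for h study hours under the fitted line."""
    return intercept + slope * h


print("slope:", slope)
print("intercept:", intercept)
print("predicted score for 5 hours:", predict(5.0))
```

Since the scores in the table are roughly ten times the hours, the fitted slope should come out close to 10.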

Data Science Matplotlib Library Data Visualization, ASSIGNMENT - 6, GO_STP_8113

SOLUTIONS:

1. Load the necessary package for plotting using pyplot from matplotlib. Example: Days (x-axis) represents 8 days and Speed represents a car’s speed. Plot a basic line plot between days and car speed, put the x axis label as days and the y axis label as car speed, and put the title Car Speed Measurement.

    Days=[1,2,3,4,5,6,7,8]
    Speed=[60,62,61,58,56,57,46,63]

    import matplotlib.pyplot as plt

    Days = [1, 2, 3, 4, 5, 6, 7, 8]
    Speed = [60, 62, 61, 58, 56, 57, 46, 63]
    plt.plot(Days, Speed)
    plt.xlabel('days')
    plt.ylabel('car speed')
    plt.title('Car Speed Measurement')
    plt.show()

OUTPUT:

2. Now apply some format options to the above car data, such as a line style (for example a green dotted line), a marker shape like +, a different markersize, markerfacecolor, etc.

    #line style green dotted line
    Days = [1, 2, 3, 4, 5, 6, 7, 8]
    Speed = [60, 62, 61, 5...
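As a rough illustration of the styling options named in step 2, the sketch below sets the line style, colour, marker and marker size through keyword arguments of `Axes.plot`. The specific values (size 8, red face colour, the `"o"` marker) are assumptions chosen for the example, not the ones used in the truncated post, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5, 6, 7, 8]
speed = [60, 62, 61, 58, 56, 57, 46, 63]

fig, ax = plt.subplots()
# Green dotted line with filled circular markers.
(line,) = ax.plot(
    days,
    speed,
    color="green",
    linestyle=":",
    marker="o",
    markersize=8,
    markerfacecolor="red",
)
ax.set_xlabel("days")
ax.set_ylabel("car speed")
ax.set_title("Car Speed Measurement")
```

A filled marker such as `"o"` is used here because `"+"` is an unfilled marker, so a face colour would not be visible on it. In an interactive session, `plt.show()` displays the figure.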

ASSIGNMENT - 5, GO_STP_8113

1. Load the IMDb dataset and read it

    import numpy as np
    import pandas as pd
    df = pd.read_csv('IMDB-Movie-Data.csv')
    print(df)

2. View the dataset

    print(df.head(12))

3. Understand some basic information about the dataset and inspect the dataframe: its columns, shape, variable types, etc.

    print(df.columns)
    print(df.shape)
    print(df.dtypes)

4. Data Selection – Indexing and Slicing data

    print(df.iloc[0])
    print(df[1:3])

5. Data Selection – Based on Conditional filtering

    df['Votes'] > 8

6. Groupby operations

    print(df.groupby(['Rating', 'Votes']).groups)
    print(df.groupby('Votes').groups)
    grouped = df.groupby('Year')
    print(grouped.get_group(2014))

7. Sorting operation

    x = df.sort_values(by='Revenue (Millions)')
    print(x)

8. Dealing with missing values

    print(df.isnull())

9. Dropping columns and null values

    df = df.dropna(axis=1)
    print(df)

10. Apply() functions

    y=df.apply( lambda  x: [ 1 , ...
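The selection, grouping, sorting and missing-value steps above can be tried on a tiny frame without the CSV. The sketch below uses an invented four-row table (`movies`, with made-up titles and numbers) that borrows a few column names from the listing; it is meant only to show what each call returns. Note that a comparison such as `movies["Rating"] > 8` yields a boolean mask, and indexing the frame with that mask is what actually filters the rows.

```python
import numpy as np
import pandas as pd

# Invented sample data; only the column names echo the IMDb listing above.
movies = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Year": [2014, 2014, 2016, 2016],
    "Rating": [8.1, 7.0, 8.5, 6.2],
    "Revenue (Millions)": [333.1, np.nan, 126.5, 325.0],
})

# Conditional filtering: build a boolean mask, then index with it.
mask = movies["Rating"] > 8
top_rated = movies[mask]

# Groupby: pull out every row belonging to one group.
from_2014 = movies.groupby("Year").get_group(2014)

# Sorting: ascending by default, with missing values placed last.
by_revenue = movies.sort_values(by="Revenue (Millions)")

# Missing values: count them per column, then drop rows or columns.
missing_per_column = movies.isnull().sum()
rows_kept = movies.dropna(axis=0)  # drops the row with a missing revenue
cols_kept = movies.dropna(axis=1)  # drops the whole revenue column instead

print(top_rated)
print(missing_per_column)
```

The last two lines of the block highlight the design choice in step 9: `dropna(axis=1)` removes every column that contains any null, which can discard far more data than dropping the affected rows.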

ASSIGNMENT - 4, GO_STP_8113

GO_STP_8113 Assignment - 4, Goeduhub Technologies

Task link: https://www.goeduhub.com/11546/question-exercise-practice-solutions-science-machine-learning

#OnlineSummerTraining #machinelearning #datascience #PYTHONNUMPY #pandas #pythonprogramming #goeduhub

SOLUTIONS:

1. Import the numpy package under the name np and print the numpy version and the configuration

    import numpy as np
    print(np.__version__)
    np.show_config()

2. Create a null vector of size 10

    import numpy as np
    print(np.zeros(10))

3. Create a simple 1-D array and check its type and the data type of its elements

    import numpy as np
    a = np.arange(0, 10)
    print(type(a))
    print(a.dtype)

4. How to find number of dimensions, bytes per element and bytes of memory used?

    import numpy as np
    a = np.arange(0, 10)
    print(a.ndim)
    print(a.itemsize)
    print(a.nbytes)

5. Create a null vector of size 10 but the fifth value which is 1

    import numpy as np
    a = np.zeros(10)
    a[4]=1
    print(a)

6. Create a vector w...
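For item 4, it may help to see how the three attributes relate: `nbytes` is the number of elements multiplied by `itemsize`. The sketch below fixes the dtypes explicitly (an assumption made for the example, since the default integer type of `np.arange` can differ between platforms) so the byte counts are predictable.

```python
import numpy as np

# 1-D array of ten 64-bit integers.
a = np.arange(0, 10, dtype=np.int64)
print(a.ndim, a.itemsize, a.nbytes)

# 2-D array of twelve 32-bit floats.
b = np.zeros((3, 4), dtype=np.float32)
print(b.ndim, b.itemsize, b.nbytes)

# Total memory for the data is element count times bytes per element.
print(a.nbytes == a.size * a.itemsize)
print(b.nbytes == b.size * b.itemsize)
```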