Building Regression Model #8

Open
aditbala99 opened this issue Mar 21, 2023 · 0 comments


# Building the regression model
library(caTools)
set.seed(123)  # make the random split reproducible
# Hold out 10% of the rows as a test set
split <- sample.split(datan$median_house_value, SplitRatio = 0.9)
training_set <- subset(datan, split == TRUE)
test_set <- subset(datan, split == FALSE)
# Fit a linear model on all remaining columns
m <- lm(formula = median_house_value ~ ., data = training_set)
summary(m)

Call:
lm(formula = median_house_value ~ ., data = training_set)

Residuals:
    Min      1Q  Median      3Q     Max
-365334  -44799   -8876   33157  516625

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)
(Intercept)                -17713.419   3007.209  -5.890 3.92e-09 ***
housing_median_age           1403.913     46.780  30.011  < 2e-16 ***
total_rooms                   -16.570      1.225 -13.529  < 2e-16 ***
total_bedrooms                125.034      8.350  14.973  < 2e-16 ***
population                    -64.708      1.588 -40.739  < 2e-16 ***
households                    157.760      9.630  16.382  < 2e-16 ***
median_income               51145.454    470.785 108.639  < 2e-16 ***
ocean_proximity.INLAND     -63853.921   1348.467 -47.353  < 2e-16 ***
ocean_proximity.ISLAND     171468.785  31044.200   5.523 3.37e-08 ***
ocean_proximity.NEAR BAY    -1717.301   1767.588  -0.972    0.331
ocean_proximity.NEAR OCEAN  11860.808   1620.804   7.318 2.62e-13 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 69340 on 18673 degrees of freedom
Multiple R-squared: 0.6453, Adjusted R-squared: 0.6451
F-statistic: 3397 on 10 and 18673 DF, p-value: < 2.2e-16
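
As a quick sanity check on the summary above, the adjusted R-squared can be recomputed from the printed multiple R-squared and degrees of freedom. This is only a sketch using the rounded values shown in the output; the variable names are illustrative, and the sample size `n` is an assumption inferred as residual degrees of freedom plus the 11 estimated coefficients.

```r
# Values copied from the summary output above (rounded as printed).
r2       <- 0.6453   # Multiple R-squared
df_resid <- 18673    # residual degrees of freedom
p        <- 10       # number of predictors (numerator DF of the F-statistic)

# Assumption: n = residual DF + predictors + intercept.
n <- df_resid + p + 1

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)
```

With roughly 18,700 training rows and only 10 predictors the penalty is tiny, which is why the result agrees with the reported 0.6451 to four decimals and sits so close to the unadjusted value.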

Predicting the Test set results

y_pred = predict(m, newdata = test_set)
MSE <- mean((y_pred - test_set$median_house_value)^2)
MSE
[1] 4247739708
totalss <- sum((test_set$median_house_value - mean(test_set$median_house_value))^2)
totalss
[1] 2.08637e+13
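
The MSE above is in squared units of the response, which makes it hard to read. Taking the square root gives an RMSE in the same units as `median_house_value` (assumed here to be dollars), and that can be compared with the residual standard error from the training fit. A minimal sketch using the numbers printed above; `rmse` and `train_rse` are illustrative names, not objects from the original script.

```r
# Test-set MSE copied from the output above.
mse <- 4247739708

# RMSE is in the same units as median_house_value.
rmse <- sqrt(mse)
rmse

# Residual standard error reported by summary(m) on the training set.
train_rse <- 69340

# A ratio near 1 suggests test error is in line with training error.
rmse / train_rse
```

Here the test RMSE comes out at roughly 65,000, slightly below the training residual standard error, so there is no obvious sign of overfitting from this split alone.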

Regression and residual sums of squares.

regss <- sum((y_pred - mean(test_set$median_house_value))^2)
regss
[1] 1.476199e+13
resiss <- sum((test_set$median_house_value - y_pred)^2)
resiss
[1] 8.308579e+12

Calculate R squared.

R2 <- regss/totalss
R2
[1] 0.7075441
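
One caveat on the R squared above: `regss / totalss` equals `1 - resiss / totalss` only when the sums of squares are computed on the same data an OLS model with an intercept was fit to. On a held-out test set the decomposition `totalss = regss + resiss` need not hold, so the two formulas can disagree. The sketch below checks this using only the values printed above; `r2_explained` and `r2_residual` are illustrative names.

```r
# Sums of squares copied from the test-set output above.
totalss <- 2.08637e13
regss   <- 1.476199e13
resiss  <- 8.308579e12

# For an in-sample OLS fit this ratio would be 1; on the test set it is not.
(regss + resiss) / totalss

# Explained-variance form, as used above.
r2_explained <- regss / totalss

# Residual-based form, the usual choice for out-of-sample data.
r2_residual <- 1 - resiss / totalss

c(explained = r2_explained, residual = r2_residual)
```

With these figures the residual-based value comes out near 0.60, lower than the 0.7075 obtained above, because the regression and residual sums of squares together exceed the total on the test set.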
