{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Regression using sklearn\n", "\n", "Author: Carlos J. Costa, ISEG\n", "\n", "Purpose: Identify the weight of several features in Boston home prices.\n", "\n", "**1** Import the libraries needed: sklearn and pandas\n", "\n", "**2** Use the Boston dataset from https://scikit-learn.org/stable/datasets/index.html and convert it into two DataFrames: X for the features and Y for the target.\n", "\n", "**3** Inspect the first 5 rows of the feature variables\n", "\n", "**4** Verify the data types\n", "\n", "**5** Create a new features DataFrame with only 2 variables\n", "\n", "**6** Create and fit the model: model = LinearRegression().fit(features, target)\n", "\n", "**7** Obtain the intercept\n", "\n", "**8** Obtain the coefficients\n", "\n", "**9** Obtain R^2\n", "\n", "Comment the following code:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Import the linear regression estimator and pandas\n", "from sklearn.linear_model import LinearRegression\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Load the Boston housing data and build the feature (X) and target (Y) DataFrames\n", "# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release\n", "from sklearn.datasets import load_boston\n", "boston = load_boston()\n", "X = pd.DataFrame(boston.data, columns=boston.feature_names)\n", "Y = pd.DataFrame(boston.target)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
" | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT |\n",
"---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.0900 | 1.0 | 296.0 | 15.3 | 396.90 | 4.98 |\n",
"1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.90 | 9.14 |\n",
"2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 |\n",
"3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 |\n",
"4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.90 | 5.33 |\n", "