Thursday April 27th 2017




Select Features and Target in Scikit Learn

To do machine learning in scikit-learn, the first step is to import the following:
import pandas as pd
import numpy as np
Step 1. We read the file into a pandas DataFrame.
In the Jupyter Notebook we defined the DataFrame, called df in the commands that follow, as shown in Pic 1.
Pic 1. Data import in SKLearn
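
As a rough sketch of this step, the snippet below loads CSV text into a DataFrame with pd.read_csv. The file used in this post is not reproduced here, so the two data rows are made-up values under the same column names, and the in-memory csv_text is only a hypothetical stand-in for a real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the data file: two rows of made-up values
# (not taken from the real file) under the column names used in this post.
csv_text = """RI,Na,Mg,Al,Si,K,Ca,Ba,Fe,Type
1.52,13.6,4.5,1.1,71.8,0.06,8.8,0.0,0.0,1
1.51,13.9,3.6,1.4,72.7,0.48,7.8,0.0,0.0,2
"""

# pd.read_csv accepts a file path or a file-like object. With a real file
# on disk, the argument would be the path to that file instead.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names, with the target column last
```

With a real file, only the argument to pd.read_csv changes. The rest of the post assumes the target column is the last one in the DataFrame.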

To select the features and the target for machine learning, we use the following commands:

X = df[list(df.columns)[:-1]]  # input
y = df['Type']
X is the input (Features)
y is the output (Target)
With the command X = df[list(df.columns)[:-1]] we kept every column except the last one, Type, as the input features, and then used y = df['Type'] as the target.
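
The selection described above can be tried on a small DataFrame. This is only a sketch: the rows are made-up values and only three feature columns are included to keep it short. It also shows two other ways to write the same selection, df.iloc[:, :-1] and df.drop(columns=['Type']), which are alternatives and not the commands used in this post.

```python
import pandas as pd

# Small made-up DataFrame: three feature columns plus the target column last.
df = pd.DataFrame({
    "RI": [1.52, 1.51, 1.53],
    "Na": [13.6, 13.9, 13.2],
    "Mg": [4.5, 3.6, 0.0],
    "Type": [1, 2, 7],
})

# The selection from this post: every column except the last as features.
X = df[list(df.columns)[:-1]]
y = df["Type"]

# Two alternative spellings of the same feature selection.
X_iloc = df.iloc[:, :-1]            # positional slice: all rows, all but last column
X_drop = df.drop(columns=["Type"])  # drop the target column by name

print(list(X.columns))
print(X.equals(X_iloc), X.equals(X_drop))
```

Dropping by name is safer when the target is not guaranteed to be the last column, while the [:-1] slice is shorter when it is.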
If we inspect X (the summary below is in the format printed by X.info()), we see the following columns:

Int64Index: 214 entries, 0 to 213
Data columns (total 9 columns):
RI 214 non-null float64
Na 214 non-null float64
Mg 214 non-null float64
Al 214 non-null float64
Si 214 non-null float64
K 214 non-null float64
Ca 214 non-null float64
Ba 214 non-null float64
Fe 214 non-null float64
dtypes: float64(9)
memory usage: 16.7 KB

You will notice that the 'Type' column, which is the target variable, is not shown above, because [:-1] drops the last column, which was the target column. This is a very useful command: if we enter -2, the last 2 columns are removed, and so on.
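
The slicing behaviour can be checked on a plain Python list of the column names. A minimal sketch, assuming the columns are in the order shown in this post with Type last:

```python
# Column names in the order used in this post, with the target last.
columns = ["RI", "Na", "Mg", "Al", "Si", "K", "Ca", "Ba", "Fe", "Type"]

all_but_last = columns[:-1]      # drops "Type"
all_but_last_two = columns[:-2]  # drops "Fe" and "Type"

print(all_but_last)
print(all_but_last_two)
```

Note that [:-2] removes whatever the last two columns are, so here it would also discard the Fe feature along with the target.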