# Use pandas to read the .csv file
import pandas as pd
# Import the data from the CSV file
data = pd.read_csv('bank_note_data.csv')
data.head()
When using neural network and deep learning based systems, it is usually a good idea to standardize your data. This step isn't actually necessary for our particular data set, but let's run through it for practice!
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
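# Fit the scaler on the feature columns (everything except the 'Class' label)
scaler.fit(data.drop('Class', axis=1))
# Apply the fitted scaler to obtain the standardized features
scaled_features = scaler.transform(data.drop('Class', axis=1))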
See what our data looks like after feature scaling.
X = pd.DataFrame(scaled_features, columns=data.columns[:-1])
X.head()
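To make the scaling step concrete, here is a minimal sketch of the per-column computation that standardization performs: subtract each column's mean and divide by its standard deviation. The array `toy` is made-up illustrative data, not the bank note data, and the helper name `standardize` is hypothetical. The sketch assumes a 2-D input and uses the population standard deviation (`ddof=0`).

```python
import numpy as np

def standardize(features):
    """Return a copy of a 2-D array with zero mean and unit variance per column."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)   # population standard deviation (ddof=0)
    std[std == 0] = 1.0          # leave constant columns unscaled instead of dividing by zero
    return (features - mean) / std

# Made-up data: two columns on very different scales
toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
toy_scaled = standardize(toy)

print(toy_scaled.mean(axis=0))  # each column mean is (numerically) zero
print(toy_scaled.std(axis=0))   # each column standard deviation is one
```

After scaling, both columns contain the same values even though the raw columns differed by a factor of ten, which is the point of putting features on a common scale.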
y = data['Class']
X = X.values
y = y.values
Use the .values attribute on X and y and reassign them to the result (the older .as_matrix() method does the same thing, but it was removed in pandas 1.0). We need to do this so that TensorFlow receives the data as NumPy arrays instead of a pandas DataFrame and Series.
Train/test splitting the data.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
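For intuition about what a 70/30 split does, here is a rough sketch that shuffles row indices and slices them. The helper `simple_train_test_split`, the `seed` parameter, and the toy arrays are hypothetical and only illustrate the idea; rounding the test count up is a choice made for this sketch, not a statement about scikit-learn's internals.

```python
import math
import numpy as np

def simple_train_test_split(X, y, test_size=0.3, seed=0):
    """Shuffle the rows, then hold out a `test_size` fraction as the test set."""
    X = np.asarray(X)
    y = np.asarray(y)
    n_samples = len(X)
    n_test = math.ceil(test_size * n_samples)  # assumption: round the test count up
    order = np.random.default_rng(seed).permutation(n_samples)
    test_idx, train_idx = order[:n_test], order[n_test:]
    # Index X and y with the same indices so features and labels stay aligned
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Made-up data: 10 rows with 2 features each, alternating labels
toy_X = np.arange(20).reshape(10, 2)
toy_y = np.array([0, 1] * 5)

X_tr, X_te, y_tr, y_te = simple_train_test_split(toy_X, toy_y, test_size=0.3)
print(X_tr.shape, X_te.shape)  # (7, 2) (3, 2)
```

The important detail is that the same shuffled indices are applied to both the features and the labels, so every row keeps its own label.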
Contrib.learn
Import tensorflow.contrib.learn as learn. Note that tf.contrib exists only in TensorFlow 1.x, so this section requires a 1.x installation.
import tensorflow.contrib.learn.python.learn as learn
feature_columns = learn.infer_real_valued_columns_from_input(X)
feature_columns
[_RealValuedColumn(column_name='', dimension=4, default_value=None, dtype=tf.float64, normalizer=None)]
Create an object called classifier, which is a DNNClassifier from learn. Set it to have 2 classes and a [10, 20, 10] hidden-unit layer structure, and pass it the feature columns inferred above.
classifier = learn.DNNClassifier(hidden_units=[10, 20, 10], feature_columns=feature_columns, n_classes=2)
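To picture what a [10, 20, 10] hidden-unit structure means, here is a small NumPy sketch of the forward pass through a network with 4 inputs (matching dimension=4 above), three hidden layers, and 2 output classes. It is not the DNNClassifier implementation: the ReLU activation, the softmax output, and the random weights are assumptions made purely to show the layer shapes, and the helper names are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(batch, weights, biases):
    """Run a batch through the hidden layers (ReLU) and the output layer (softmax)."""
    activation = batch
    for w, b in zip(weights[:-1], biases[:-1]):
        activation = relu(activation @ w + b)
    logits = activation @ weights[-1] + biases[-1]
    return softmax(logits)

rng = np.random.default_rng(0)
layer_sizes = [4, 10, 20, 10, 2]  # 4 inputs, hidden layers [10, 20, 10], 2 classes
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

# Five made-up input rows with 4 features each
probs = forward(rng.normal(size=(5, 4)), weights, biases)
print([w.shape for w in weights])  # [(4, 10), (10, 20), (20, 10), (10, 2)]
print(probs.shape)                 # (5, 2)
```

Each row of `probs` holds two class probabilities that sum to one. Training adjusts the weight matrices; the shapes stay fixed by the hidden_units and n_classes settings.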
Fitting data to the classifier and making predictions for X_test
Fit the data to the classifier. Use 200 steps with a batch_size of 20. You can adjust these values depending on your machine's limits.
Note: Ignore any warnings you get; they won't affect your output.
classifier.fit(X_train, y_train, steps=200, batch_size=20)
note_predictions = classifier.predict(X_test)
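To get a feel for what steps=200 and batch_size=20 amount to, the sketch below walks through the training data in consecutive mini-batches and counts how many examples are visited. The generator `iterate_minibatches` and the training-set size `n_train = 1000` are hypothetical; the real input pipeline may shuffle or batch differently.

```python
def iterate_minibatches(n_samples, batch_size, steps):
    """Yield (start, stop) row ranges for `steps` mini-batches, wrapping around the data."""
    start = 0
    for _ in range(steps):
        stop = min(start + batch_size, n_samples)
        yield start, stop
        # Restart from the first row once the end of the data is reached
        start = 0 if stop >= n_samples else stop

n_train = 1000  # hypothetical training-set size, for illustration only
batches = list(iterate_minibatches(n_train, batch_size=20, steps=200))
examples_seen = sum(stop - start for start, stop in batches)

print(examples_seen, examples_seen / n_train)  # 4000 4.0
```

With these made-up numbers, 200 steps of 20 rows each visit 4000 examples, roughly four passes over the data. Raising steps or batch_size increases the amount of training work proportionally.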
Model Evaluation
Import the metrics.
from sklearn.metrics import classification_report, confusion_matrix
print(classifier.evaluate(X_test,y_test)["accuracy"])
1.0
from numpy import array
# Collect the predictions into a NumPy array
pre = array(list(note_predictions))
print(confusion_matrix(y_test,pre))
[[218   0]
 [  0 194]]
print(classification_report(y_test,pre))
             precision    recall  f1-score   support

          0       1.00      1.00      1.00       218
          1       1.00      1.00      1.00       194

avg / total       1.00      1.00      1.00       412
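The numbers in the report follow directly from the confusion matrix. As a rough sketch, the hypothetical helper `report_from_confusion` below recomputes per-class precision, recall, and F1, plus overall accuracy, from the matrix printed above. It assumes rows are true classes and columns are predicted classes, and that every class is predicted at least once (otherwise the divisions would hit zero).

```python
import numpy as np

def report_from_confusion(cm):
    """Compute per-class precision, recall, F1 and overall accuracy from a confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)  # column sums: how often each class was predicted
    recall = true_pos / cm.sum(axis=1)     # row sums: how often each class occurred (support)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = true_pos.sum() / cm.sum()
    return precision, recall, f1, accuracy

# Confusion matrix reported above
cm = [[218, 0], [0, 194]]
precision, recall, f1, accuracy = report_from_confusion(cm)
print(precision, recall, f1, accuracy)
```

Because the off-diagonal entries are zero, every metric comes out as 1.0, which matches the accuracy and the classification report above.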