
### Logistic Regression with Numpy and Pandas

#### Import Libraries

Import a few libraries you think you'll need (or just import them as you go along!).

```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

#### Getting the Data

If the data is in a CSV file, we can import it as below. pandas includes many other functions for importing and preprocessing data.

```python
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
```
Let's see how the data looks after importing. `iloc` selects rows and columns by position, and `.values` returns the result as a NumPy array.

`dataset.head()`
`X`
```array([[    19,  19000],
[    35,  20000],
[    26,  43000],
[    27,  57000],
[    19,  76000],
[    27,  58000],
[    27,  84000],...])```
`y`
`array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, ....])`

#### Splitting the dataset into the Training set and Test set

It is always recommended to split the data so you can test the model's accuracy on unseen data. You can play around with `test_size` and `random_state`, which control the fraction of data held out for testing and the random selection of data points, respectively.

```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
```

#### Feature Scaling

In a dataset, the columns may be on very different scales — for example, the number of rooms versus the size of a room. It is good practice to apply feature scaling to the data before feeding it to the training model.

```python
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```
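To see what `StandardScaler` actually computes, here is a minimal sketch on a small sample (the rows are taken from the array shown earlier): each column is standardized to z = (x − mean) / std.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Small sample of the feature array shown above (age, estimated salary).
X_sample = np.array([[19.0, 19000.0],
                     [35.0, 20000.0],
                     [26.0, 43000.0],
                     [27.0, 57000.0]])

sc = StandardScaler()
X_scaled = sc.fit_transform(X_sample)

# Manual standardization per column gives the same result.
manual = (X_sample - X_sample.mean(axis=0)) / X_sample.std(axis=0)
print(np.allclose(X_scaled, manual))  # True
```

Note that the scaler is fit on the training set only and then reused (`transform`) on the test set, so the test data is scaled with the training set's mean and standard deviation.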

#### Fitting Logistic Regression to the Training set

We can import libraries as we need them — it's not like C++ :). While defining the classifier object for logistic regression, we can play with the value of `random_state`.
```python
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)
```
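Under the hood, the fitted model stores a coefficient vector and an intercept, and its probability estimates are just the sigmoid of the linear combination of the features. A small sketch on synthetic data (hypothetical, standing in for the scaled features) demonstrates this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic, roughly separable data set (hypothetical stand-in).
X = np.array([[-2.0, -1.0], [-1.5, -0.5], [1.0, 1.5], [2.0, 1.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression(random_state=0).fit(X, y)

# predict_proba applies the sigmoid to the learned linear combination:
# p(y=1 | x) = 1 / (1 + exp(-(x @ coef_ + intercept_)))
z = X @ clf.coef_.ravel() + clf.intercept_[0]
p_manual = 1.0 / (1.0 + np.exp(-z))
print(np.allclose(p_manual, clf.predict_proba(X)[:, 1]))  # True
```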

#### Predicting the Test set results

`y_pred = classifier.predict(X_test)`
`classifier.score(X_test,y_test)`
0.89000000000000001
`from sklearn.metrics import confusion_matrix,classification_report`
`print(confusion_matrix(y_test, y_pred))`
```
[[65  3]
 [ 8 24]]
```
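The accuracy reported by `classifier.score` can be read straight off the confusion matrix: correct predictions sit on the diagonal. Using the matrix printed above:

```python
import numpy as np

# Confusion matrix from above: rows are true classes, columns are predictions.
cm = np.array([[65, 3],
               [8, 24]])

tn, fp = cm[0]
fn, tp = cm[1]

# Accuracy = (TN + TP) / total predictions.
accuracy = (tn + tp) / cm.sum()

# Precision and recall for class 1, as in the classification report.
precision_1 = tp / (tp + fp)   # 24 / 27 ≈ 0.89
recall_1 = tp / (tp + fn)      # 24 / 32 = 0.75

print(accuracy)  # 0.89 — matches classifier.score(X_test, y_test)
```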
`print(classification_report(y_test,y_pred))`
```
             precision    recall  f1-score   support

          0       0.89      0.96      0.92        68
          1       0.89      0.75      0.81        32

avg / total       0.89      0.89      0.89       100
```
#### Visualization of Logistic Regression
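One way to see logistic regression working is to colour the plane by the model's prediction, which reveals the linear decision boundary it learns. A sketch using synthetic 2-D data (hypothetical, since the `Social_Network_Ads.csv` file is not bundled here):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to view interactively
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# Synthetic 2-D data: two Gaussian blobs (stand-in for the scaled features).
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(50, 2) - 1, rng.randn(50, 2) + 1])
y = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression(random_state=0).fit(X, y)

# Evaluate the classifier on a grid and colour each region by its prediction.
x1, x2 = np.meshgrid(np.linspace(-4, 4, 200), np.linspace(-4, 4, 200))
grid = np.c_[x1.ravel(), x2.ravel()]
Z = clf.predict(grid).reshape(x1.shape)

plt.contourf(x1, x2, Z, alpha=0.3, cmap="coolwarm")
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm", edgecolors="k")
plt.xlabel("Feature 1 (scaled)")
plt.ylabel("Feature 2 (scaled)")
plt.title("Logistic regression decision regions")
plt.savefig("decision_boundary.png")
```

The boundary between the two coloured regions is a straight line because logistic regression is a linear classifier in the feature space.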