Sigmoid Neuron Model and Gradient Descent, with Sample Code
The limitation of the perceptron model is its very harsh change in output (a binary output), which requires linearly separable data. In most real-life cases, however, we need a continuous output, so we propose the sigmoid neuron model.
The function shown above is the sigmoid function: it takes the weighted linear input and produces a smooth, continuous output (the red line), sigma(w*x + b) = 1 / (1 + e^-(w*x + b)).
Here the red line is the output of the sigmoid model and the blue line is the output of the perceptron model. The output always lies in [0, 1], irrespective of the number of inputs; as the weighted sum changes, the output moves smoothly along the red line.
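A minimal sketch that reproduces the figure described above (the plotting range is illustrative):
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-10, 10, 200)              #the weighted sum w*x + b
sigmoid_out = 1 / (1 + np.exp(-z))         #smooth sigmoid output (red)
perceptron_out = (z >= 0).astype(float)    #harsh step output (blue)

plt.plot(z, sigmoid_out, 'r', label='sigmoid neuron')
plt.plot(z, perceptron_out, 'b', label='perceptron')
plt.xlabel('w*x + b')
plt.ylabel('output')
plt.legend()
plt.show()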
The sigmoid model can be used for both regression and classification problems. For regression, the output of the sigmoid function is taken directly as the predicted y value; for classification, we first predict with the sigmoid function and then choose a threshold that maps the predicted y values to classes. The threshold can be 0.5, the mean of the predicted y values, or anything else depending on the problem.
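For example, a minimal sketch of thresholding (the y_pred values and the 0.5 cutoff are illustrative):
import numpy as np

y_pred = np.array([0.1, 0.4, 0.7, 0.9])      #hypothetical sigmoid outputs
threshold = 0.5                              #could also be np.mean(y_pred), etc.
y_class = (y_pred >= threshold).astype(int)  #array([0, 0, 1, 1])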
Loss function
Cross-entropy loss or squared-error loss is used in the case of the sigmoid neuron model.
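As a minimal NumPy sketch (the function names are mine; cross-entropy assumes binary labels y in {0, 1}):
import numpy as np

def squared_error(y, y_pred):
    #mean squared error between targets and sigmoid outputs
    return np.mean((y - y_pred) ** 2)

def cross_entropy(y, y_pred, eps=1e-12):
    #binary cross-entropy; clip predictions to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))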
Learning algorithm (the mathematics behind gradient descent)
By changing the values of the weight (w) and bias (b), you get a family of sigmoid functions.
We start with random values of w and b, compute the loss, then update w and b, and repeat. Initially you might start with the worst sigmoid function, but as the updates to w and b happen, it approaches the optimal sigmoid function.
Nowadays this is built into PyTorch/TensorFlow, which automatically compute the gradients and find the w and b that minimise the loss.
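For illustration, a minimal PyTorch sketch of one such automatic update (toy scalar data; the 0.1 learning rate is illustrative):
import torch

x = torch.tensor([0.5])
y = torch.tensor([1.0])
w = torch.randn(1, requires_grad=True)   #random initial weight
b = torch.zeros(1, requires_grad=True)

y_pred = torch.sigmoid(w * x + b)        #forward pass
loss = ((y_pred - y) ** 2).mean()        #squared-error loss
loss.backward()                          #autograd fills w.grad and b.grad

with torch.no_grad():                    #one gradient-descent step
    w -= 0.1 * w.grad
    b -= 0.1 * b.grad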
To know how to change each pair (w, b) so that the loss decreases, we need to find grad-w (the gradient of the loss with respect to w) and grad-b (the gradient with respect to b).
Here we use the Taylor series to approximate the loss after the (w, b) update; the newly calculated loss should be less than the previous one. Our objective is to choose the update to (w, b) such that the second (first-order) term of the Taylor approximation is negative.
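To make that concrete, here is the first-order Taylor expansion written out (a standard identity, restated because the original equation figure is not reproduced here), with theta = (w, b), step size eta, and update direction u:

L(\theta + \eta u) \approx L(\theta) + \eta\, u^{\top} \nabla_{\theta} L(\theta)

Choosing u = -\nabla_{\theta} L(\theta) makes the second term -\eta \|\nabla_{\theta} L(\theta)\|^2, which is negative whenever the gradient is non-zero, so the updated loss is (to first order) smaller than the previous one. This yields the update rule w := w - eta * grad-w and b := b - eta * grad-b.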
How to calculate the gradients
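For the squared-error loss L = (sigma(w x + b) - y)^2, the chain rule together with sigma'(z) = sigma(z)(1 - sigma(z)) gives (dropping the constant factor 2, as the code below also does):

\frac{\partial L}{\partial w} = (\hat{y} - y)\,\hat{y}\,(1 - \hat{y})\,x, \qquad \frac{\partial L}{\partial b} = (\hat{y} - y)\,\hat{y}\,(1 - \hat{y})

where y-hat = sigma(w x + b). These are exactly the grad_w and grad_b methods in the code below.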
The overall learning algorithm is given below: initialise w and b randomly, then iterate until satisfied, updating w := w - eta * grad-w and b := b - eta * grad-b each step (a minimal code sketch follows the list).
"Iterate until satisfied" can mean stopping:
- after a pre-decided number of iterations
- when the loss falls below a pre-decided value
- when the change in (w, b) between iterations is negligible
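A minimal single-input NumPy sketch of this loop (the data points, learning rate, and iteration count are illustrative, not from the original):
import numpy as np

w, b, eta = np.random.randn(), 0.0, 0.5   #random init; illustrative learning rate
X = np.array([0.5, 2.5])                  #hypothetical 1-D inputs
Y = np.array([0.2, 0.9])                  #hypothetical targets
for i in range(1000):                     #pre-decided number of iterations
    dw, db = 0.0, 0.0
    for x, y in zip(X, Y):                #accumulate gradients over the data
        y_pred = 1 / (1 + np.exp(-(w * x + b)))
        dw += (y_pred - y) * y_pred * (1 - y_pred) * x
        db += (y_pred - y) * y_pred * (1 - y_pred)
    w -= eta * dw                         #step against the gradient
    b -= eta * db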
Example: the multi-variable case
#importing libraries
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
import matplotlib.colors
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, mean_squared_error
from tqdm import tqdm_notebook

#sigmoid function for two input variables
def sigmoid(x1, x2, w1, w2, b):
    return 1/(1 + np.exp(-(w1*x1 + w2*x2 + b)))
#calculate the squared-error loss over the dataset
def calculate_loss(X, Y, w1_est, w2_est, b_est):
    loss = 0
    for x, y in zip(X, Y):
        loss += (y - sigmoid(x[0], x[1], w1_est, w2_est, b_est))**2
    return loss
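Since mplot3d is imported above, here is a hedged sketch of what it can be used for: plotting the loss surface of calculate_loss over a grid of (w1, w2) values with b fixed at 0 (the data values below are hypothetical):
X = np.array([[0.5, 2.5], [1.0, 1.5]])   #hypothetical inputs (two features)
Y = np.array([0.2, 0.9])                 #hypothetical targets
W1, W2 = np.meshgrid(np.linspace(-4, 4, 50), np.linspace(-4, 4, 50))
L = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        L[i, j] = calculate_loss(X, Y, W1[i, j], W2[i, j], 0)
ax = plt.axes(projection='3d')
ax.plot_surface(W1, W2, L, cmap='viridis')
ax.set_xlabel('w1')
ax.set_ylabel('w2')
ax.set_zlabel('loss')
plt.show()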
class SigmoidNeuron:
    def __init__(self):
        self.w = None
        self.b = None
    def perceptron(self, x):
        #weighted sum of inputs: w.x + b
        return np.dot(x, self.w.T) + self.b
    def sigmoid(self, x):
        return 1.0/(1.0 + np.exp(-x))
    def grad_w(self, x, y):
        #gradient of the squared-error loss w.r.t. w
        y_pred = self.sigmoid(self.perceptron(x))
        return (y_pred - y) * y_pred * (1 - y_pred) * x
    def grad_b(self, x, y):
        #gradient of the squared-error loss w.r.t. b
        y_pred = self.sigmoid(self.perceptron(x))
        return (y_pred - y) * y_pred * (1 - y_pred)
    def fit(self, X, Y, epochs=1, learning_rate=1, initialise=True, display_loss=False):
        #initialise w, b
        if initialise:
            self.w = np.random.randn(1, X.shape[1])
            self.b = 0
        if display_loss:
            loss = {}
        for i in tqdm_notebook(range(epochs), total=epochs, unit="epoch"):
            dw = 0
            db = 0
            for x, y in zip(X, Y):
                #accumulate gradients over the whole dataset
                dw += self.grad_w(x, y)
                db += self.grad_b(x, y)
            #step against the accumulated gradient
            self.w -= learning_rate * dw
            self.b -= learning_rate * db
            if display_loss:
                Y_pred = self.sigmoid(self.perceptron(X))
                loss[i] = mean_squared_error(Y_pred, Y)
        if display_loss:
            plt.plot(list(loss.values()))
            plt.xlabel('Epochs')
            plt.ylabel('Mean Squared Error')
            plt.show()
    def predict(self, X):
        Y_pred = []
        for x in X:
            y_pred = self.sigmoid(self.perceptron(x))
            Y_pred.append(y_pred)
        return np.array(Y_pred)
Evaluation
- Accuracy in the case of a classification problem
- RMSE in the case of a regression problem
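A hedged end-to-end sketch using the class and imports above (the toy dataset and the 0.5 classification threshold are my assumptions, not from the original):
#hypothetical toy data: 100 points, 2 features, binary labels
X_data = np.random.randn(100, 2)
Y_data = (X_data[:, 0] + X_data[:, 1] > 0).astype(float)
X_train, X_test, Y_train, Y_test = train_test_split(
    X_data, Y_data, stratify=Y_data, random_state=0)
scaler = StandardScaler()                 #standardise the features
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

sn = SigmoidNeuron()
sn.fit(X_train, Y_train, epochs=500, learning_rate=0.05, display_loss=True)
Y_pred = sn.predict(X_test).ravel()

print('RMSE:', np.sqrt(mean_squared_error(Y_test, Y_pred)))              #regression view
print('Accuracy:', accuracy_score(Y_test, (Y_pred >= 0.5).astype(int)))  #classification view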
Detailed code is given below in the GitHub link.
References
- Wikipedia
- Deep Learning by One Fourth Labs (special thanks)