  • Intro
  • Hadoop and Spark (HDFS / RDD / MapReduce / DataFrame)
  • Examples: DataFrame / MLlib models on Docker
  • MLlib pipeline

Data that can fit on a local computer, on a scale of 0–32 GB depending on RAM. …


  • Intro
  • Traditional methods (Moving Average / ARIMA / Regression / KNN / Prophet)
  • Advanced methods (univariate single-step / multivariate single-step / multivariate multi-step time-series forecasting)

There are so many factors involved in the prediction — physical factors vs. psychological, rational and irrational behavior, etc. All these aspects combine to make share prices volatile and very difficult to predict with a high degree of accuracy. Using…

Ninjacart, a six-year-old company, is India’s largest B2B fresh produce supply chain platform, solving one of the toughest supply chain problems in our country through technology. They source fresh produce from farmers and deliver it to businesses (a just-in-time supply chain), moving around 200 tonnes of fresh produce from farm to retailer in less than 12 hours, every day. Ninjacart connects vegetable and fruit farmers directly with businesses. On one end, they help farmers get better prices and ensure consistent demand; on the other, they help retailers source fresh vegetables at competitive prices directly from farmers. They optimize effectively at a…


  1. Intro
  2. Types (filter, wrapper, embedded, and hybrid methods): information gain / chi-square / correlation / MAD / stepwise / logistic / RF
  3. Genetic algorithm for feature selection

The goal of feature selection in machine learning is to find the best set of features that…

FE for ML

Feature engineering, also known as feature creation, is the process of constructing new features from existing data to train a machine learning model. Typically, feature engineering is a drawn-out manual process, relying on domain knowledge, intuition, and data manipulation. This process can be extremely tedious, and the final features will be limited both by human subjectivity and by time. Automated feature engineering aims to help the data scientist by automatically creating many candidate features out of a dataset, from which the best can be selected and used for training.
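To make the manual process concrete, here is a minimal sketch of hand-built aggregate features. The transaction rows, the `build_customer_features` helper, and the feature names are all hypothetical and chosen only for illustration; this is plain Python, not any particular library's approach.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: one row per purchase as (customer_id, amount).
transactions = [
    ("c1", 20.0),
    ("c1", 35.0),
    ("c2", 5.0),
    ("c2", 15.0),
    ("c2", 10.0),
]


def build_customer_features(rows):
    """Aggregate per-purchase rows into one feature dict per customer."""
    amounts = defaultdict(list)
    for customer_id, amount in rows:
        amounts[customer_id].append(amount)
    # Each aggregate below is a hand-picked feature; choosing them is the
    # manual, domain-driven step the paragraph above describes.
    return {
        cid: {
            "n_purchases": len(vals),
            "total_spent": sum(vals),
            "mean_spent": mean(vals),
            "max_spent": max(vals),
        }
        for cid, vals in amounts.items()
    }


features = build_customer_features(transactions)
```

Every aggregate here had to be thought of and written by hand, which is why the number of candidate features stays small when the work is manual.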

Fortunately, featuretools is exactly the solution we are looking for. This open-source…


  1. Introduction
  2. Train our own word embedding (code)
  3. Phrases(bigrams)
  4. t-SNE visualizations in 2D
  5. Retrain GloVe vectors on top of my own data

Word embedding is one of the most popular representations of document vocabulary. It is capable of capturing the context of a word in a document, semantic and syntactic similarity, relation with other words, etc.

Word2Vec is one of the most popular techniques to learn word embeddings using a shallow neural network. It was developed by Tomas Mikolov in 2013 at Google.
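As a rough illustration of what "context" means here, the sketch below builds the (center word, context word) pairs that the skip-gram variant of Word2Vec trains on. It covers only the pair generation, not the neural network itself; the `skipgram_pairs` helper, the window size, and the sample sentence are assumptions made for the example.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical one-sentence corpus, split on whitespace.
sentence = "the cat sat on the mat".split()
pairs = skipgram_pairs(sentence, window=1)
```

Words that repeatedly share context words end up with similar vectors once a model is trained on such pairs, which is where the semantic and syntactic similarity comes from.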

After playing around with GloVe, you will quickly find that certain words in your training data…

The biggest problem with deep learning is overfitting. A deep neural network has many hidden layers and therefore many trainable weights, which gives it enough capacity to fit noise in the training data. You almost always have to regularize to avoid overfitting. In classical machine learning, we extensively use L1 and L2 regularization for this. Let's discuss some of the parameters used in deep learning to control overfitting.
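For reference, the L1 and L2 penalties mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the penalty is simply added to the data loss; the function names, the example weights, and the strength `lam` are hypothetical.

```python
def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weights (encourages sparsity)."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights (shrinks large weights)."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam, kind="l2"):
    """Add the chosen penalty to an already-computed data loss."""
    if kind == "l1":
        return data_loss + l1_penalty(weights, lam)
    if kind == "l2":
        return data_loss + l2_penalty(weights, lam)
    raise ValueError("kind must be 'l1' or 'l2'")


# Hypothetical weights and regularization strength.
weights = [0.5, -1.5, 2.0]
lam = 0.1
l1_value = l1_penalty(weights, lam)
l2_value = l2_penalty(weights, lam)
total = regularized_loss(1.0, weights, lam, kind="l2")
```

Because the penalty grows with the size of the weights, minimizing the combined loss trades a little training accuracy for smaller, less overfit weights.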


  1. Basic MLP terminology explained
  2. Application on MNIST data using Keras
  3. Hyperparameter tuning (sklearn / hyperopt)

1. Dropout layers & Regularization:

Dropout is a general concept used for regularization.

Dropout rate: the probability that a neuron is inactive (dropped out) in a given…
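A minimal sketch of the idea, assuming the common "inverted dropout" variant in which surviving activations are rescaled by 1 / (1 - rate) during training so that nothing needs to change at inference time. The `dropout` helper and its parameters are illustrative and not taken from any specific framework.

```python
import random


def dropout(activations, rate, training=True, rng=None):
    """Zero each activation with probability `rate`; rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time every neuron stays active.
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]


# Hypothetical layer output: 1000 activations, all equal to 1.0.
layer_output = [1.0] * 1000
dropped = dropout(layer_output, rate=0.5, rng=random.Random(0))
unchanged = dropout(layer_output, rate=0.5, training=False)
```

With a rate of 0.5, roughly half of the activations come back as zero and the rest are doubled, so the expected sum of the layer output stays about the same.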


Ottonova is a consumer-centric, “digital-first” insurance company based in Munich (Germany) that provides health insurance for patients and employees. The company develops an AI-based chat app platform where customers can register and access clinic insurance coverage plans for medical services. It offers automated AI-based concierge services such as doctor consultations, treatment plans, medical data storage, prescription management, and more with the help of big data processing. Apart from concierge services, it also provides customers with a health and fitness policy management application through which they can get the maximum benefit from their coverage plans. They deal with digital products that set standards in terms of…


  1. Problem definition and solution requirements
  2. Datasets + code
  3. Keyword and sentence vectors and data structure
  4. High-level design architecture
  5. Docker containerization and Elasticsearch installation [setup]
  6. Index data [code]
  7. Search [code]
  8. Deployment [code]
  9. Extensions to the solution

1. Problem definition and solution requirements

We want to build a simple search engine that, given a repository of questions, returns the most relevant answers in decreasing order of relevance.
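The ranking requirement can be sketched without any search infrastructure. The example below scores each stored question against the query with bag-of-words cosine similarity and returns the results in decreasing order of score. This is a simplified stand-in, not the Elasticsearch and sentence-vector approach listed in the outline; the `search` helper and the sample questions are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop words)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, questions, top_k=3):
    """Return up to top_k (score, question) pairs, most relevant first."""
    q_vec = Counter(tokenize(query))
    scored = [(cosine(q_vec, Counter(tokenize(q))), q) for q in questions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical question repository.
questions = [
    "how do i install docker",
    "how do i index data in elasticsearch",
    "what is a sentence vector",
]
results = search("index data elasticsearch", questions)
```

Swapping the term-count vectors for sentence embeddings would keep the same ranking logic while matching on meaning rather than exact words.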

What are APIs?

In very simple terms, an API call is a function call. Put another way, you are calling a function that most likely lives on a different machine, that is, a server. The output is typically returned in either JSON or XML format.
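To show what handling those two output formats looks like, here is a small sketch that parses the same record from a JSON body and from an XML body using only the standard library. The payloads are hypothetical strings standing in for what a server might return; no real endpoint is called.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical response bodies for the same record in both formats.
json_body = '{"user": "alice", "plan": "basic"}'
xml_body = "<response><user>alice</user><plan>basic</plan></response>"


def parse_json(body):
    """Decode a JSON response body into a Python dict."""
    return json.loads(body)


def parse_xml(body):
    """Decode a flat XML response body into a dict of tag -> text."""
    root = ET.fromstring(body)
    return {child.tag: child.text for child in root}


from_json = parse_json(json_body)
from_xml = parse_xml(xml_body)
```

Either way, the caller ends up with an ordinary data structure, which is what makes a remote API feel like a local function call.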

Rana singh

Leadership belief /Analyst(AI)
