Scala is a general-purpose programming language. Scala source code is compiled to Java bytecode and runs on a Java Virtual Machine (JVM), so Java libraries can be used directly from Scala. A major reason demand for Scala has risen sharply in recent years is Apache Spark. Let’s discuss what Spark is in the context of Big Data.

Data that can fit on a local computer, on a scale of 0–32 GB depending on RAM. …


Time series forecasting is an important area of machine learning that is often neglected. It is important because there are so many prediction problems that involve a time component. These problems are neglected because it is this time component that makes time series problems more difficult to handle.

Many factors are involved in the prediction: physical versus psychological factors, rational and irrational behavior, and so on. All these aspects combine to make share prices volatile and very difficult to predict with a high degree of accuracy. Using…

Ninjacart, a six-year-old company, is India’s largest B2B fresh produce supply chain platform, solving one of the toughest supply chain problems of our country through technology. It sources fresh produce from farmers and delivers it to businesses through a just-in-time supply chain, moving around 200 tonnes of fresh produce from farm to retailer in less than 12 hours every day. Ninjacart connects vegetable and fruit farmers directly with businesses. On one end, it helps farmers get better prices and ensures consistent demand; on the other, it helps retailers source fresh vegetables at competitive prices directly from farmers. They optimize effectively at a…


Feature selection is the process of reducing the number of input variables when developing a predictive model. Adding redundant variables reduces the generalization capability of the model and may also reduce the overall accuracy of a classifier. It is desirable to reduce the number of input variables to both reduce the computational cost of modeling and, in some cases, to improve the performance of the model.
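To make the idea of a redundant variable concrete, here is a minimal sketch of one simple filter approach: drop any feature that is almost perfectly correlated with a feature already kept. The function names (`pearson`, `drop_redundant`), the 0.95 threshold, and the toy columns are assumptions made for illustration; this is only one of many possible feature selection strategies.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (std_x * std_y)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept.

    `features` maps a column name to its list of values. The threshold is an
    illustrative default, not a recommended value.
    """
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (made up): height in inches duplicates height in centimetres.
toy = {
    "height_cm": [150, 160, 170, 180],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "age": [30, 22, 41, 35],
}
print(drop_redundant(toy))  # ['height_cm', 'age']
```

The result depends on column order, because the first feature of a correlated pair is the one that is kept.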

The goal of feature selection in machine learning is to find the best set of features that…

FE for ML

Feature engineering, also known as feature creation, is the process of constructing new features from existing data to train a machine learning model. Typically, feature engineering is a drawn-out manual process that relies on domain knowledge, intuition, and data manipulation. This process can be extremely tedious, and the final features are limited by both human subjectivity and time. Automated feature engineering aims to help the data scientist by automatically creating many candidate features from a dataset, from which the best can be selected and used for training.

Fortunately, featuretools is exactly the solution we are looking for. This open-source…



Word embedding is one of the most popular representations of document vocabulary. It is capable of capturing the context of a word in a document, semantic and syntactic similarity, relation with other words, etc.

Word2Vec is one of the most popular techniques to learn word embeddings using a shallow neural network. It was developed by Tomas Mikolov in 2013 at Google.

After playing around with GloVe, you will quickly find that certain words in your training data…

The biggest problem with deep learning is overfitting. A deep neural network has many hidden layers and therefore many trainable weights, and with so many weights the main risk is overfitting. You always have to regularize to avoid it. In classical machine learning, L1 and L2 regularization are used extensively for this purpose. Let's discuss some of the parameters used in deep learning to control overfitting.


1. Dropout layers & Regularization:

Dropout is a general concept used for regularization.

Dropout rate: the probability that a neuron is inactive (dropped out) in a given…
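As a rough illustration of what the dropout rate controls, the sketch below applies "inverted" dropout to a plain list of activations using only the standard library. The function name `dropout` and its parameters are assumptions for this example; deep learning frameworks provide their own dropout layers, and this is not taken from any of them.

```python
import random


def dropout(activations, rate, training=True, rng=None):
    """Inverted dropout over a list of activations.

    `rate` is the probability that each unit is dropped (set to 0).
    Surviving units are scaled by 1 / (1 - rate) so the expected
    activation stays the same. At inference time nothing is dropped.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    if rng is None:
        rng = random.Random()
    scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else a * scale for a in activations]


# With rate=0.5, roughly half of the units become 0 and the rest are doubled.
hidden = [1.0] * 10
print(dropout(hidden, rate=0.5, rng=random.Random(42)))
print(dropout(hidden, rate=0.5, training=False))  # unchanged at inference
```

A higher rate drops more units per training step, which regularizes more strongly but also leaves less signal to learn from.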


Ottonova is a consumer-centric, “digital-first” insurance company based in Munich, Germany, that provides health insurance for patients and employees. The company develops an AI-based chat app platform where customers can register and access clinic insurance coverage plans for medical services. It offers automated, AI-based concierge services such as doctor consultations, treatment plans, medical data storage, prescription management, and more, with the help of big data processing. Apart from concierge services, it also offers customers a health and fitness policy management application that helps them get the maximum benefit from their coverage plans. They deal with digital products that set standards in terms of…


1. Problem definition and solution requirements

We want to build a simple search engine that, given a repository of questions, returns the most relevant answers in decreasing order of relevance.
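One way to picture this requirement is a small ranking function over a list of stored questions. The sketch below scores each question against the query with TF-IDF weights and cosine similarity, then returns matches from most to least relevant. TF-IDF is an assumption chosen for illustration, not necessarily the approach the article goes on to use, and the function names and sample questions are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(questions):
    """Return smoothed IDF weights and one TF-IDF vector per question."""
    docs = [Counter(tokenize(q)) for q in questions]
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, questions, top_k=3):
    """Return up to top_k (question, score) pairs, most relevant first."""
    idf, vectors = build_index(questions)
    counts = Counter(tokenize(query))
    # Query terms that never occur in the repository carry no weight.
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(cosine(q_vec, v), q) for v, q in zip(vectors, questions)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(q, s) for s, q in scored[:top_k] if s > 0]


# Hypothetical repository of questions.
repo = [
    "How do I reverse a list in Python?",
    "What is overfitting in deep learning?",
    "How to install Scala on Linux?",
]
for question, score in search("reverse python list", repo):
    print(f"{score:.3f}  {question}")
```

Rebuilding the index on every call keeps the sketch short; a real repository would build it once and reuse it.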

What are APIs?

In very simple terms, an API call is a function call. Put another way, you are calling a function that most likely lives on a different machine, which is simply a server. The output is returned in either JSON or XML format.
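To show what "output in JSON or XML" means in practice, the sketch below parses two hypothetical response bodies into the same Python dictionary. No real server is contacted; the payloads, field names, and the `parse_response` helper are made up for this example.

```python
import json
import xml.etree.ElementTree as ET


def parse_response(body, content_type):
    """Turn an API response body into a dict, based on its content type."""
    if content_type == "application/json":
        return json.loads(body)
    if content_type in ("application/xml", "text/xml"):
        root = ET.fromstring(body)
        # Flatten one level of child elements into key/value pairs.
        return {child.tag: child.text for child in root}
    raise ValueError(f"unsupported content type: {content_type}")


# Hypothetical payloads a server might return for the same request.
json_body = '{"id": "42", "status": "ok"}'
xml_body = "<response><id>42</id><status>ok</status></response>"

print(parse_response(json_body, "application/json"))  # {'id': '42', 'status': 'ok'}
print(parse_response(xml_body, "application/xml"))    # {'id': '42', 'status': 'ok'}
```

From the caller's point of view, both formats carry the same data; only the parsing step differs.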

Rana singh

Leadership belief /Analyst(AI)
