Feature selection is the process of reducing the number of input variables when developing a predictive model. Adding redundant variables reduces the generalization capability of the model and may also reduce the overall accuracy of a classifier. It is desirable to reduce the number of input variables to both reduce the computational cost of modeling and, in some cases, to improve the performance of the model.
The goal of feature selection in machine learning is to find the best set of features that…
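As a minimal sketch of one simple approach, a filter method can score each feature by its absolute Pearson correlation with the target and keep the top k. The toy data and the names `pearson` and `select_top_k` are assumptions made for illustration. This scores relevance only and does not detect features that are redundant with each other.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def select_top_k(features, target, k):
    """Return the names of the k features most correlated with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data: one informative, one noisy, one constant feature
features = {
    "size": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
    "constant": [7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [2.1, 3.9, 6.2, 8.1, 9.8]
selected = select_top_k(features, target, k=1)
```

Here `selected` holds the single feature name that tracks the target most closely.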
Feature engineering, also known as feature creation, is the process of constructing new features from existing data to train a machine learning model. Typically, feature engineering is a drawn-out manual process that relies on domain knowledge, intuition, and data manipulation. This process can be extremely tedious, and the final features are limited by both human subjectivity and time. Automated feature engineering aims to help the data scientist by automatically creating many candidate features from a dataset, from which the best can be selected and used for training.
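To make the idea of automatically creating candidate features concrete, here is a small sketch that builds pairwise products and ratios from numeric columns. The column names and the `generate_candidates` helper are hypothetical. Real automated feature engineering tools use far richer sets of transformations.

```python
from itertools import combinations

def generate_candidates(columns):
    """Build product and ratio features for every pair of numeric columns."""
    candidates = {}
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        candidates[f"{name_a}*{name_b}"] = [a * b for a, b in zip(col_a, col_b)]
        # Skip the ratio when any denominator is zero
        if all(b != 0 for b in col_b):
            candidates[f"{name_a}/{name_b}"] = [a / b for a, b in zip(col_a, col_b)]
    return candidates

# Hypothetical dataset with three numeric columns
columns = {
    "price": [10.0, 20.0],
    "quantity": [2.0, 5.0],
    "discount": [0.0, 1.0],
}
candidates = generate_candidates(columns)
```

The resulting candidates could then be passed to a feature selection step that keeps only the useful ones.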
Word embedding is one of the most popular representations of document vocabulary. It is capable of capturing the context of a word in a document, semantic and syntactic similarity, relation with other words, etc.
Word2Vec is one of the most popular techniques for learning word embeddings using a shallow neural network. It was developed in 2013 by a team led by Tomas Mikolov at Google.
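The skip-gram variant of Word2Vec trains on (center word, context word) pairs taken from a sliding window over the text. The sketch below only generates those pairs and does not train a network. The function name `skipgram_pairs` and the sample sentence are assumptions for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within the window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

sentence = "the quick brown fox".split()
pairs = skipgram_pairs(sentence, window=1)
```

With a window of 1, each word is paired only with its immediate neighbors. A larger window captures broader context and produces more training pairs.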
After playing around with GloVe, you will quickly find that certain words in your training data…
The biggest problem with deep learning is overfitting. A deep neural network has many hidden layers and therefore many trainable weights, which gives it ample capacity to memorize the training data. You should always regularize to avoid overfitting. In classical machine learning, L1 and L2 regularization are used extensively for this purpose. Let's discuss some of the parameters used in deep learning to control overfitting.
Dropout is a general concept used for regularization.
Dropout rate: the probability that a neuron is inactive (dropped out) in a given…
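A minimal sketch of how a dropout rate can be applied at training time, assuming the common "inverted dropout" formulation: each activation is zeroed with probability `rate`, and the survivors are scaled by 1 / (1 - rate) so the expected activation is unchanged. The `dropout` function here is illustrative and is not any framework's API.

```python
import random

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    # rng.random() >= rate is true with probability keep_prob
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]

rng = random.Random(0)  # seeded for reproducibility
activations = [1.0] * 10
dropped = dropout(activations, rate=0.5, rng=rng)
```

At inference time dropout is disabled, which in this sketch corresponds to using a rate of 0.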
Ottonova is a consumer-centric, “digital-first” insurance company based in Munich, Germany, that provides health insurance for patients and employees. The company develops an AI-based chat app platform where customers can register and access clinic insurance coverage plans for medical services. It offers automated AI-based concierge services such as doctor consultations, treatment plans, medical data storage, prescription management, and more, with the help of big data processing. Apart from concierge services, it also offers customers a health and fitness policy management application that helps them get the maximum benefit from their coverage plans. They deal with digital products that set standards in terms of…
We want to build a simple search engine that, given a repository of questions, returns the most relevant answers in decreasing order of relevance.
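One way to sketch such a search engine is to rank the stored questions by TF-IDF cosine similarity to the query. The choice of TF-IDF, the smoothed IDF formula, and the sample questions are assumptions made here and are not necessarily the approach the article takes. In a full system each question would map to its stored answer.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def build_index(questions):
    """Compute smoothed IDF weights and a TF-IDF vector per question."""
    docs = [Counter(tokenize(q)) for q in questions]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)  # document frequency
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def search(query, questions, idf, vectors):
    """Return matching questions, most relevant first."""
    counts = Counter(tokenize(query))
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(cosine(q_vec, vec), q) for q, vec in zip(questions, vectors)]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [q for score, q in ranked if score > 0]

# Hypothetical question repository
questions = [
    "how do i reset my password",
    "how do i change my email address",
    "what payment methods are accepted",
]
idf, vectors = build_index(questions)
results = search("reset password", questions, idf, vectors)
```

Questions that share no terms with the query score zero and are left out of the results.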
What are APIs?
In very simple terms, an API call is a function call. Put another way, you are calling a function that most likely lives on a different machine, which is nothing but a server. The output is typically in either JSON or XML format.
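To illustrate the two output formats, the sketch below parses the same hypothetical response body once as JSON and once as XML using Python's standard library. No network call is made: the payloads and field names are invented for the example and stand in for what a server might return.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical response bodies, as a server might return them
json_body = '{"user": {"id": 7, "name": "Ada"}}'
xml_body = "<user><id>7</id><name>Ada</name></user>"

def parse_json_user(body):
    """Extract the user record from a JSON response body."""
    user = json.loads(body)["user"]
    return {"id": int(user["id"]), "name": user["name"]}

def parse_xml_user(body):
    """Extract the same record from an XML response body."""
    root = ET.fromstring(body)
    return {"id": int(root.findtext("id")), "name": root.findtext("name")}

from_json = parse_json_user(json_body)
from_xml = parse_xml_user(xml_body)
```

Both parsers produce the same dictionary, so the calling code does not need to care which format the server chose.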
“If I had my life to live over again, I would have made a rule to read some poetry and listen to some music at least once every week.”― Charles Darwin
Life exists on the sharp-edged wire of the guitar. Once you jump, its echoes can be heard with immense intangible pleasure. Let's explore this intangible pleasure…
Music is nothing but a sequence of notes (events). The input to the model is therefore a sequence of notes.
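Treating music as a sequence of events, a common way to prepare training data for a sequence model is to slide a fixed-length window over the encoded events and use the next event as the prediction target. The sketch below assumes this next-event setup. The melody, the pitch names, and the `make_training_windows` helper are hypothetical.

```python
def make_training_windows(events, seq_len):
    """Split a sequence into (input window, next event) training examples."""
    inputs, targets = [], []
    for start in range(len(events) - seq_len):
        inputs.append(events[start:start + seq_len])
        targets.append(events[start + seq_len])
    return inputs, targets

# Hypothetical melody, encoded as integer ids before being fed to a model
melody = ["C4", "D4", "E4", "F4", "G4"]
vocab = {event: idx for idx, event in enumerate(sorted(set(melody)))}
encoded = [vocab[event] for event in melody]

inputs, targets = make_training_windows(encoded, seq_len=3)
```

Each input window is paired with the event that immediately follows it, which is what a recurrent model would be trained to predict.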
Some examples of music generated using RNNs are shown below.
Discrete random variable: X is a discrete random variable if its range is countable.
Continuous random variable: A continuous random variable is a random variable that can take uncountably many values, typically any real value within an interval. For example, a random variable measuring the time taken to complete a task is continuous, since the measured time can be any real value within a range.
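The difference can be seen by sampling. In this sketch a die roll stands in for a discrete variable and an exponentially distributed waiting time stands in for a continuous one. The exponential distribution and its mean of 2.0 are assumptions chosen for illustration.

```python
import random

rng = random.Random(42)  # seeded for reproducibility

# Discrete: a die roll can only land on one of six values
die_rolls = [rng.randint(1, 6) for _ in range(1000)]

# Continuous: a waiting time can be any non-negative real number.
# expovariate takes the rate, which is 1 / mean (mean assumed to be 2.0).
wait_times = [rng.expovariate(0.5) for _ in range(1000)]

distinct_rolls = set(die_rolls)
distinct_times = set(wait_times)
```

The die rolls collapse onto at most six distinct values, while the sampled waiting times are spread across a continuum.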