Defining the problem is the foundation of our research procedure. Here we identify the area of concern: a condition to be improved, a difficulty to be eliminated, or a troubling question to be answered.
In this stage, we gather, combine, structure, and organize the data. Data preparation includes pre-processing, profiling, cleansing, validation, and transformation.
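As an illustration of the cleansing, validation, and transformation steps, here is a minimal sketch in plain Python. The records, field names, the 0 to 120 age range, and the median imputation are all assumptions made for the example, not part of any specific pipeline.

```python
from statistics import median

# Hypothetical raw records; an empty string stands for a missing value.
raw_rows = [
    {"id": "1", "age": "34", "city": " Paris "},
    {"id": "2", "age": "", "city": "london"},
    {"id": "2", "age": "", "city": "london"},    # exact duplicate
    {"id": "3", "age": "-5", "city": "Berlin"},  # fails validation
    {"id": "4", "age": "51", "city": "PARIS"},
]


def prepare(rows):
    # Cleansing: drop exact duplicates while keeping the original order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # Transformation: cast types and normalize text.
    parsed = [
        {
            "id": int(r["id"]),
            "age": int(r["age"]) if r["age"] else None,
            "city": r["city"].strip().title(),
        }
        for r in unique
    ]

    # Validation: keep rows whose age is missing or in an assumed plausible range.
    valid = [r for r in parsed if r["age"] is None or 0 <= r["age"] <= 120]

    # Cleansing: impute missing ages with the median of the known ages.
    known = [r["age"] for r in valid if r["age"] is not None]
    fill = median(known)
    for r in valid:
        if r["age"] is None:
            r["age"] = fill
    return valid


clean_rows = prepare(raw_rows)
```

In practice a dataframe library would usually handle these steps; the plain-Python version is used here only to keep the sketch dependency-free.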
Data analysis reduces a large body of data into smaller fragments that make sense. In this stage, we define variables and perform logical, conditional, and statistical operations.
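The idea of defining variables and then applying conditional and statistical operations can be sketched as follows. The sales records, the derived `revenue` variable, and the large-order threshold of 20 units are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical observations: (region, units_sold, unit_price).
sales = [
    ("north", 12, 9.5),
    ("north", 30, 9.0),
    ("south", 7, 11.0),
    ("south", 22, 10.0),
    ("south", 18, 10.5),
]

# Define a derived variable: revenue per observation.
revenue = [(region, units * price) for region, units, price in sales]

# Conditional operation: flag large orders (the threshold is an arbitrary choice).
large_orders = [units >= 20 for _, units, _ in sales]

# Statistical operation: reduce the rows to one summary per region.
by_region = defaultdict(list)
for region, value in revenue:
    by_region[region].append(value)

summary = {
    region: {"n": len(values), "mean": mean(values), "stdev": stdev(values)}
    for region, values in by_region.items()
}
```

Here five rows collapse into two per-region summaries, which is the kind of reduction this stage aims for.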
The features in the data directly influence the result. In this stage, we use domain knowledge to extract features from the data. These features can be used to improve the performance of machine learning algorithms.
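A small sketch of what extracting features with domain knowledge might look like is given below. The transaction records and the chosen features (`is_weekend`, `is_night`, `amount_per_item`) are assumptions for the example; which features matter depends entirely on the problem at hand.

```python
from datetime import datetime

# Hypothetical raw transactions.
transactions = [
    {"timestamp": "2024-03-02 23:15:00", "amount": 120.0, "items": 4},
    {"timestamp": "2024-03-04 09:30:00", "amount": 45.0, "items": 3},
]


def extract_features(tx):
    """Turn one raw transaction into model-ready features."""
    ts = datetime.strptime(tx["timestamp"], "%Y-%m-%d %H:%M:%S")
    return {
        "hour": ts.hour,
        # weekday() returns 0 for Monday through 6 for Sunday.
        "is_weekend": ts.weekday() >= 5,
        # Assumed domain rule: activity between 22:00 and 06:00 counts as night.
        "is_night": ts.hour >= 22 or ts.hour < 6,
        "amount_per_item": tx["amount"] / tx["items"],
    }


features = [extract_features(tx) for tx in transactions]
```

A raw timestamp string carries little signal for most algorithms, whereas derived fields such as the hour or a weekend flag expose patterns a model can learn from.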
Assessing the performance of a machine learning model is an essential step in a predictive modeling pipeline. Once a model is ready, it has to be evaluated to establish its correctness.
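For a binary classifier, evaluation often comes down to comparing predictions against held-out labels. The sketch below computes accuracy, precision, recall, and F1 from scratch; the function name `evaluate` and the label vectors are hypothetical, and the right metric depends on the task.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute common classification metrics from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical held-out labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
```

The labels should come from data the model did not see during training; otherwise the scores overstate how well it generalizes.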
We find the optimal hyperparameters for the machine learning algorithm. Hyperparameters are important because they directly control the behavior of the training algorithm and the model.
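One simple way to search for hyperparameters is a grid search: try each candidate value and keep the one that scores best on validation data. The sketch below tunes the number of neighbours `k` of a tiny one-dimensional nearest-neighbour classifier. The data, the grid, and the single validation split are assumptions made for brevity; cross-validation is the more common choice in practice.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train, valid, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


# Hypothetical (feature, label) pairs; (1.3, 1) is a deliberately noisy point.
train = [
    (0.5, 0), (1.0, 0), (1.3, 1), (1.5, 0), (2.0, 0),
    (2.2, 1), (3.0, 1), (3.5, 1), (4.0, 1),
]
valid = [(1.2, 0), (2.05, 0), (2.9, 1), (3.8, 1)]

# Grid search: score every candidate value of k on the validation set.
grid = [1, 3, 5]
scores = {k: accuracy(train, valid, k) for k in grid}
best_k = max(scores, key=scores.get)
```

On this toy data `k = 1` overfits the noisy point and `k = 5` smooths too much near the class boundary, so the middle value wins, which illustrates how a hyperparameter shapes the model's behavior.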
This stage helps us understand and explain the steps and decisions a machine learning model takes while making predictions. It gives us the ability to question the model's decisions and evaluate its conclusions.
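Permutation importance is one common, model-agnostic way to probe which inputs a model relies on: shuffle one feature at a time and measure how much the score drops. The sketch below is only an illustration of that technique, not necessarily the method used here. The `model` function is a stand-in for a trained model, and the data are hypothetical.

```python
import random


def model(row):
    # Stand-in for a trained model: predicts 1 when the first feature exceeds 0.5.
    return 1 if row[0] > 0.5 else 0


def accuracy(rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(rows, labels, n_repeats=20, seed=0):
    """Average drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical rows of (feature_0, feature_1) with their true labels.
rows = [(0.1, 0.9), (0.2, 0.1), (0.4, 0.8), (0.6, 0.2), (0.8, 0.7), (0.9, 0.3)]
labels = [0, 0, 0, 1, 1, 1]
importances = permutation_importance(rows, labels)
```

Because the stand-in model ignores the second feature, shuffling it never changes the score and its importance is zero, while shuffling the first feature can only hurt. A result like this gives a concrete basis for questioning what a model's decisions depend on.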
This is considered the final stage of the research. We document and present an overview of the problem, the data modeling approach, data analysis reports, findings, and our substantive conclusions.