It sits at the intersection of data mining (sometimes called data extraction) and statistical analysis.
A data modeling project generally follows this workflow:
- Planning: defining the project and expected results;
- Preparation: setting up the data scientists' working environment, their tools, and their access to relevant data and other resources;
- Ingestion: loading the appropriate data into the work environment;
- Exploration: analysis, exploration and visualization of data;
- Modeling: designing, training and validating the models;
- Deployment: putting the models into production.
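
As a rough illustration, the technical steps of this workflow (loading the data, exploration, modeling and deployment) can be sketched in a few lines of Python using only the standard library. Everything here is an assumption made for the example: the inline `RAW_CSV` data, the function names, the choice of a straight-line least-squares model, and the 4/2 train/held-out split. Planning and preparation are organizational steps and are not represented in code.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a real data source.
RAW_CSV = """x,y
1,3.1
2,4.9
3,7.2
4,9.0
5,10.8
6,13.1
"""


def ingest(text):
    """Load the raw data into the working environment as (x, y) floats."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(row["x"]), float(row["y"])) for row in reader]


def explore(rows):
    """Exploration: compute a few summary statistics of the target."""
    ys = [y for _, y in rows]
    return {
        "n": len(rows),
        "mean_y": statistics.mean(ys),
        "stdev_y": statistics.stdev(ys),
    }


def fit_line(rows):
    """Modeling (training): ordinary least squares for y = a * x + b."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def validate(model, rows):
    """Modeling (validation): mean absolute error on held-out rows."""
    a, b = model
    return statistics.mean(abs(y - (a * x + b)) for x, y in rows)


def make_predictor(model):
    """Deployment: wrap the trained model in a callable for production use."""
    a, b = model
    return lambda x: a * x + b


rows = ingest(RAW_CSV)
summary = explore(rows)
train, held_out = rows[:4], rows[4:]   # simple split, chosen for illustration
model = fit_line(train)
error = validate(model, held_out)
predict = make_predictor(model)
```

In a real project each function would be replaced by far heavier machinery (a database or file connector, notebooks and plots, a model-training library, a serving layer), but the order of the calls mirrors the order of the steps listed above.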