Summary

We would like to build a continuous learning algorithm that predicts the execution times of Ansible builds (Playbooks) from historical Ansible build data. As a precursor to developing the algorithm, we are seeking a technologist to develop the continuous learning environment using free versions of Ansible, Splunk and Elasticsearch, and to generate training data from which the algorithm can learn.

Proposal

As part of your proposal please answer the following questions:

What cloud environment will you use to develop the continuous learning environment? Please describe or diagram the system and provide an estimated cost (e.g. EC2 instance costs or Heroku dyno costs) for maintaining the environment for data generation.

Please provide an estimate of hours required to build and configure the environment. Please provide an estimate of hours required to generate the training data.

What will be your strategy/approach for configuring Playbooks based on the specified Galaxy Roles? How will you ensure the generated data provides a variety of Playbook structures (singletons, clusters, single and multi-target builds) for optimal machine learning?

How do you plan to structure the resulting training data?

Please describe your knowledge of and past experience with the technologies required for this project.
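
For illustration only, the variety of Playbook structures asked about above (singletons and clusters) could be enumerated programmatically. The following is a minimal Python sketch, not a required approach. The role names, the file-naming scheme and the `max_cluster_size` parameter are hypothetical; the selected Galaxy Roles are not listed in this posting.

```python
from itertools import combinations

# Hypothetical role names; the posting does not list the selected Galaxy Roles.
ROLES = ["example.nginx", "example.mysql", "example.redis"]


def render_playbook(roles, hosts="all"):
    """Return the YAML text of a one-play Playbook that applies the given roles."""
    lines = ["---", f"- hosts: {hosts}", "  roles:"]
    lines += [f"    - {role}" for role in roles]
    return "\n".join(lines) + "\n"


def build_playbooks(roles, max_cluster_size=2):
    """Map a file name to Playbook text for every singleton and cluster."""
    playbooks = {}
    for size in range(1, max_cluster_size + 1):
        for combo in combinations(roles, size):
            # One role is a singleton; two or more roles form a cluster.
            kind = "singleton" if size == 1 else "cluster"
            short_names = "-".join(role.split(".")[-1] for role in combo)
            playbooks[f"{kind}-{short_names}.yml"] = render_playbook(combo)
    return playbooks
```

With three roles and clusters of up to two roles, this yields three singleton and three cluster Playbooks. A proposal may of course use a different strategy, for example hand-written Playbooks.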

Scope of Work

The selected consultant will be responsible for:

Setting up a cloud environment for data generation (included in this project posting) and continuous learning (for the future project posting).

Setting up and configuring at least one Ansible instance.

Setting up and configuring Splunk.

Installing and configuring the Ansible App for Splunk (used to import Ansible data into Splunk).

Installing and configuring Elasticsearch to access the Ansible data within Splunk.

Setting up multiple hosts upon which Ansible builds can be executed.

Configuring a selected set of publicly available Ansible Roles (from galaxy.ansible.com) into both singleton (single-Role) and cluster (multi-Role) Ansible Playbooks for the purpose of data generation.

Developing a script to execute the resulting Ansible playbooks against single and multiple hosts in order to generate approximately 2,000 rows of test data.

Providing a method for extracting the training data for machine learning (extracted data must be in a flat-file format).
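
As a rough illustration of the execution script in the list above, the following Python sketch plans repeated runs of each Playbook against single-host and multi-host targets and times each run. It is a sketch under stated assumptions, not a prescribed design: the inventory file name `inventory.ini`, the function names and the record fields are hypothetical. Only the `ansible-playbook` options `-i` and `--limit` are taken from the Ansible command line.

```python
import subprocess
import time
from itertools import product


def plan_runs(playbooks, host_groups, repeats):
    """Return one (playbook, limit) pair per planned execution."""
    return [
        (playbook, limit)
        for playbook, limit in product(playbooks, host_groups)
        for _ in range(repeats)
    ]


def build_command(playbook, limit, inventory="inventory.ini"):
    """Build the ansible-playbook command line for one run.

    The inventory file name is a hypothetical default.
    """
    return ["ansible-playbook", "-i", inventory, "--limit", limit, playbook]


def execute(playbook, limit, runner=subprocess.run):
    """Run one Playbook and return a record with its wall-clock duration."""
    started = time.monotonic()
    result = runner(build_command(playbook, limit))
    return {
        "playbook": playbook,
        "limit": limit,
        "return_code": result.returncode,
        "duration_seconds": round(time.monotonic() - started, 3),
    }
```

If one row of test data corresponds to one Playbook run (an assumption; the posting does not define a row), then 10 Playbooks, 4 host groups and 50 repeats would give `len(plan_runs(...)) == 2000`. A limit such as `"host1,host2"` targets multiple hosts in a single run.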

The primary outputs of this project are both the test data and the environment for generating additional test data, which can be accessed by the continuous learning environment.
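
To make the flat-file requirement concrete, here is one possible layout, sketched in Python. The column names are hypothetical, since this posting does not define the training schema, and the sketch assumes the run records have already been retrieved from the environment as dictionaries.

```python
import csv
import io

# Hypothetical column names; the posting does not define the training schema.
FIELDS = ["playbook", "roles", "host_count", "duration_seconds"]


def to_csv(records):
    """Serialize run records to CSV text, joining each list of roles with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = dict(record)
        # Flatten the nested list of roles into a single delimited cell.
        row["roles"] = ";".join(record["roles"])
        writer.writerow(row)
    return buffer.getvalue()
```

Each row then describes one build, with nested values such as the list of Roles flattened into a single delimited column.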

The attached presentation provides additional details around the environment and data requirements and gives additional context to the broader project scope (beyond the environment and data generation scope of this first project). Details relevant to the scope of this Experfy project posting have been highlighted in yellow in the presentation for clarification.

When submitting your proposal, please include an executive summary that describes the key elements and numbers for your approach. When estimating the cost of the environment, please state your assumptions regarding the number and size of hosts, and indicate how much of the proposed cost is due to environment costs.

After the executive summary, the rest of the proposal should address all questions listed in the Proposal section above. The proposal does not need to be in an explicit question-and-answer format, although it can be. What matters is that every question is clearly answered or, where a question cannot be answered, that an explanation is given as to why.

Finally, please make sure all proposals are client-ready.