Non Destructive OpenStack Lifecycle Management with Containerized OpenStack (draft)
Problem Description
Problem Definition
When operating OpenStack-based cloud services, especially on a stable release, it is often important to apply patches to the running OpenStack environment. Because the OpenStack community actively back-ports important enhancements and fixes into the stable branches, an operator needs to apply those patches to the running production environment so that it continues to run a current, stable version.
In addition, most patches do not require all OpenStack services to be updated. An operator wants to apply a patch only to the relevant part of OpenStack and leave the other parts untouched. Finding a way to update or change only the necessary process(es) on a running OpenStack is therefore critical to keeping the production environment healthy and safe.
In many cases, automation can provide update/upgrade capability for OpenStack. However, existing automation tools rarely provide a "rollback" capability, especially per service. Although operators run many sets of tests in a staging environment before applying any change in production, unexpected failures and errors still occur in production because of various environmental dependencies. A per-service rollback capability would therefore be a very helpful tool for operators.
Opportunity/Justification
Operating a production OpenStack cloud in an efficient and scalable way is critical for an organization to achieve the cost benefits and promise that a private cloud strategy entails. For non-production environments and new operators, a bare-metal installation on a few servers is sufficient, and tearing down and rebuilding the configuration is a normal course of business. For larger installations with high-availability expectations, however, that approach results in too much work and at times becomes an insurmountable barrier to keeping the cloud current and usable by its users.
Requirements Specification
Use Cases
For Personas see: http://docs.openstack.org/contributor-guide/ux-ui-guidelines/ux-personas.html
As Rey the Cloud Operator, I would like to:
- easily deploy OpenStack without deep knowledge of OpenStack itself and without environmental dependencies specific to each deployment site
- so that I have a single, unified way to quickly deploy OpenStack on various deployment sites
As Rey the Cloud Operator, after a successful OpenStack deployment, I would like to:
- easily manage various types of OpenStack processes by recognizing each of them as part of discrete, versioned service components
- so that unrelated dependencies do not restrict my ability to upgrade or downgrade components to meet operational needs
- have a single unified method for managing the deployment lifecycle of OpenStack component services
- so that I do not have to manage multiple automation and availability supervisor tools in the course of deploying components to meet operational needs
As Rey the Cloud Operator, after a successful OpenStack deployment, I would like to:
- easily and safely upgrade a running OpenStack, applying an update/upgrade only to the necessary parts of OpenStack without service downtime
- so that I can keep my OpenStack environment as stable as possible with all the recent patches
- so that I can upgrade my OpenStack to a new version to get necessary patches and newly added features
- easily roll back when newly applied patches/updates cause a problem on the running OpenStack service
- so that I can minimize service interruption when a problem occurs with the applied changes
Reference: OpenStack UX Personas.
Usage Scenarios Examples
Scenario 1:
- A critical security patch is made available for Keystone, with imminent damage or information leakage possible
- Operations team identifies numerous clouds (instances of OpenStack) which need to be patched
- Operations team updates the instance of record with the patched Keystone
- Operations team implements a patch deployment routine which systematically brings down and replaces all instances of Keystone with the new version
Scenario 2:
- Operations team is notified that a third-party actor was dependent on pre-upgrade behavior of a recently upgraded Keystone.
- A decision is made to immediately revert the Keystone version until a solution is found.
- Operations team updates the required Keystone version in the deployment record for the affected cloud to the prior version.
- Deployment platform replaces the running Keystone instances with the prior version.
- Operations team verifies that service is restored to the third-party actor.
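The two scenarios can be read as operations on a per-cloud deployment record: Scenario 1 sets a new Keystone version, and Scenario 2 restores the previous one without touching any other service. The sketch below is only an illustration of that idea. The `DeploymentRecord` class, its methods, the cloud name, and the version strings are all hypothetical and are not taken from Kolla, OpenStack-Helm, or any existing tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DeploymentRecord:
    """Hypothetical per-cloud record of the desired version of each service."""

    cloud: str
    versions: Dict[str, str] = field(default_factory=dict)
    history: Dict[str, List[str]] = field(default_factory=dict)

    def set_version(self, service: str, version: str) -> None:
        """Record a new desired version and remember the one it replaces."""
        previous = self.versions.get(service)
        if previous is not None and previous != version:
            self.history.setdefault(service, []).append(previous)
        self.versions[service] = version

    def rollback(self, service: str) -> str:
        """Restore the most recent prior version of a single service."""
        past = self.history.get(service)
        if not past:
            raise ValueError(f"no prior version recorded for {service}")
        self.versions[service] = past.pop()
        return self.versions[service]


# Illustrative values only; the version strings are made up.
record = DeploymentRecord(cloud="cloud-a")
record.set_version("keystone", "10.0.0")
record.set_version("nova", "14.0.0")
record.set_version("keystone", "10.0.1")  # Scenario 1: apply the patch
restored = record.rollback("keystone")    # Scenario 2: revert Keystone only
```

Keeping the history per service, rather than per deployment, is what lets the Keystone revert leave the recorded Nova version unchanged.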
Acceptance Criteria
- Successful completion of this story would provide a coordinated set of projects that implement the functionality to run and upgrade OpenStack in native containers.
- A stable OpenStack deployment can be described in terms of version or build artifact references of all of the required service components, such that OpenStack can be deployed by supplying these component details to a deployment orchestration platform.
- A running OpenStack deployment can have individual components replaced with different versions by supplying component/artifact version details to a deployment orchestration platform.
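One way to picture the last two criteria: if a deployment is described as a mapping from service component to version or build artifact reference, then replacing individual components reduces to comparing the running description with the desired one and acting only on the differences. The following is a minimal sketch under that assumption. The `plan_changes` function and the version strings are hypothetical, and a real orchestration platform would carry far more detail than a flat mapping.

```python
from typing import Dict, Optional, Tuple


def plan_changes(
    running: Dict[str, str], desired: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], str]]:
    """Return only the services whose running version differs from the desired one.

    Each entry maps a service name to (current version or None, target version).
    """
    changes: Dict[str, Tuple[Optional[str], str]] = {}
    for service, target in desired.items():
        current = running.get(service)
        if current != target:
            changes[service] = (current, target)
    return changes


# Illustrative values only; the version strings are made up.
running = {"keystone": "10.0.0", "nova": "14.0.0", "glance": "13.0.0"}
desired = {"keystone": "10.0.1", "nova": "14.0.0", "glance": "13.0.0"}
plan = plan_changes(running, desired)
```

Here `plan` contains a single entry for Keystone, so Nova and Glance would not be touched. Swapping the two arguments yields the corresponding rollback plan.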
Related User Stories
- Rolling Upgrades: https://github.com/openstack/openstack-user-stories/blob/master/user-stories/proposed/rolling-upgrades.rst
- Issue: The Rolling Upgrades user story covers almost everything needed for "update/upgrade without downtime", so this story might be regarded as a subset of it.
Requirements
This section is optional. It might be useful to specify additional requirements that should be considered but may not be apparent through the use cases and usage examples. This information will help the development be aware of any additional known constraints that need to be met for adoption of the newly implemented features/functionality. Use this section to define the functions that must be available or any specific technical requirements that exist in order to successfully support your use case. If there are requirements that are external to OpenStack, note them as such. Please always add a comprehensible description to ensure that people understand your need.
External References
https://wiki.openstack.org/wiki/Kolla
https://launchpad.net/openstack-helm