Find out why Natura migrated its Kubernetes clusters to EKS
Today, Natura is part of the Natura &Co group, alongside Avon, The Body Shop and Aesop. The group is the 4th largest cosmetics company in the world, with operations in more than 110 countries.
Natura uses container technology to isolate and package its applications and, since 2016, has used Kubernetes to orchestrate them. Although Kubernetes provides several benefits for the operation and scalability of applications, administering it at scale is not a trivial task and can become expensive. To understand what drives this complexity, it helps to first look at how the technology is composed.
A Kubernetes cluster contains master nodes and worker nodes. The worker nodes are responsible for running the workloads and use elastic infrastructure that reacts to demand. The master nodes, in turn, are responsible for managing and controlling the entire cluster, and their infrastructure stays fixed for as long as the cluster is in use.
For this reason, master nodes tend to be financially costly to run, given their inelastic nature and the need to maintain at least 3 instances. Furthermore, their internal components make administration complex, requiring highly specialized professionals for version upgrades, high availability, troubleshooting, fault mitigation and management of the infrastructure layer (storage, CPU consumption, memory, I/O, etc.).
Between 2018 and 2019, Natura suffered several problems in its Kubernetes environment, totaling 106 hours of unavailability for applications. In addition to the hours consumed and the cost of the specialists needed to solve these problems, other factors led to a review of the Kubernetes strategy at Natura. One of them was the growth in the number of clusters: in that period Natura already had 9 clusters, and with its plan to expand to multiple countries in different regions of the globe, this number was expected to grow significantly. That called for a technology that simplifies cluster maintenance, version upgrades and other administration tasks.
After a Well-Architected review carried out in partnership with the AWS team, Natura identified an opportunity to reduce costs and increase the stability of its Kubernetes clusters by using Amazon Elastic Kubernetes Service (EKS).
EKS is a Kubernetes service managed by AWS, in which the customer no longer has to worry about managing master nodes. AWS runs the control plane across multiple Availability Zones, detects failures and automatically replaces nodes, enables immediate, on-demand updates and patching, and offers an SLA of 99.95% availability.
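To put the 99.95% figure in perspective, the short calculation below converts an availability percentage into a downtime budget, and converts the 106 hours mentioned earlier into an availability figure. It is only an illustrative sketch with loud assumptions: the year is taken as 365 days, the 106 hours are assumed to be spread over the two full years 2018 and 2019, and the function names are hypothetical. The SLA covers the EKS control plane while the 106 hours refer to application unavailability, so the two numbers are not directly comparable.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(sla_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Maximum downtime, in hours, that still meets the given availability percentage."""
    return period_hours * (1 - sla_percent / 100)


def availability_percent(downtime_hours: float, period_hours: float) -> float:
    """Availability percentage implied by a given amount of downtime over a period."""
    return 100 * (1 - downtime_hours / period_hours)


# Downtime budget implied by a 99.95% availability target over one year.
sla_budget_hours = allowed_downtime_hours(99.95)

# Availability implied by 106 hours of downtime, assuming it is spread
# over two full years (2018 and 2019).
previous_availability = availability_percent(106, 2 * HOURS_PER_YEAR)

print(f"Yearly downtime budget at 99.95%: {sla_budget_hours:.2f} hours")
print(f"Availability implied by 106 hours over two years: {previous_availability:.2f}%")
```

Under these assumptions, a 99.95% target leaves a budget of a little over four hours per year, far below the unavailability reported for the self-managed period.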
In addition, EKS integrates natively with the AWS infrastructure, enabling the use of resources that are already widely used, such as load balancers, security groups and Spot Instances. This simplifies management and interoperability with the other components of the architecture.
According to Marcio Cabreira, Cloud Specialist and SRE at Natura, “With the expansion of our business, consequently our workload of applications running on Kubernetes increased, we realized that managing clusters was starting to demand a lot of effort, we migrated our clusters from Kops to EKS and gained in agility in updates, we reduced security risks and kept our efforts focused on the business and experience of the consultants.”
After proofs of concept with EKS, the team defined its strategy: use Velero (https://velero.io/) to create backups in the source clusters, which were based on kops, and then restore those backups in the new EKS clusters. This made it possible to replicate configurations to the newly created EKS clusters.
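The backup-and-restore flow described above can be sketched as a pair of Velero CLI invocations. The helper below only builds the command lines and does not run them. It is a minimal sketch, not Natura's actual procedure: the backup name, namespaces and kubeconfig context names are hypothetical, and it assumes Velero is installed in both clusters and configured to use the same backup storage location.

```python
from typing import List, Optional


def velero_backup_cmd(backup_name: str,
                      namespaces: Optional[List[str]] = None,
                      kube_context: Optional[str] = None) -> List[str]:
    """Build a `velero backup create` command to run against the source cluster."""
    cmd = ["velero", "backup", "create", backup_name]
    if namespaces:
        # Limit the backup to the listed namespaces (comma-separated).
        cmd += ["--include-namespaces", ",".join(namespaces)]
    if kube_context:
        # Select which kubeconfig context (cluster) the command talks to.
        cmd += ["--kubecontext", kube_context]
    return cmd


def velero_restore_cmd(restore_name: str,
                       backup_name: str,
                       kube_context: Optional[str] = None) -> List[str]:
    """Build a `velero restore create` command to run against the target cluster."""
    cmd = ["velero", "restore", "create", restore_name, "--from-backup", backup_name]
    if kube_context:
        cmd += ["--kubecontext", kube_context]
    return cmd


# Hypothetical names, for illustration only.
backup = velero_backup_cmd("pre-migration", ["team-a", "team-b"], kube_context="kops-source")
restore = velero_restore_cmd("pre-migration-restore", "pre-migration", kube_context="eks-target")

print(" ".join(backup))
print(" ".join(restore))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the context names match real entries in the local kubeconfig.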
For deployment-related artifacts, Argo CD (https://argoproj.github.io/argo-cd/), already in use at Natura, keeps the configurations of deployments, services and ingresses in a Git repository, synchronizing them and maintaining infrastructure-level change control. This made it easier to replicate resources in the new clusters.
With the clusters and their resources created and synchronized, the migration was carried out in waves, starting with the development environments, followed by staging (homologation) and finally production. These waves, aligned with the business, essentially consisted of one activity: updating the applications' DNS records (Route53) to point to the load balancers (Ingress) of the new clusters.
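The DNS cutover step can be illustrated with the change batch that the Route53 API expects. The sketch below only builds the request payload. The record name and load balancer hostname are hypothetical, and the use of a CNAME record with a short TTL is an assumption, since the article does not say which record type was used (an alias record would be an equally plausible choice).

```python
from typing import Any, Dict


def build_dns_cutover(record_name: str, new_lb_hostname: str, ttl: int = 60) -> Dict[str, Any]:
    """Build a Route53 change batch that repoints a record to a new load balancer.

    UPSERT creates the record if it does not exist and overwrites it otherwise,
    so the same payload works for first-time and repeated cutovers.
    """
    return {
        "Comment": f"Point {record_name} to the new cluster ingress",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",  # assumption: CNAME rather than an alias record
                    "TTL": ttl,       # a short TTL lets the change propagate quickly
                    "ResourceRecords": [{"Value": new_lb_hostname}],
                },
            }
        ],
    }


# Hypothetical values, for illustration only.
batch = build_dns_cutover("app.example.com.", "new-ingress.elb.example.com")

# With boto3 installed and credentials configured, the batch would be applied with:
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<hosted zone id>", ChangeBatch=batch)
print(batch["Changes"][0]["ResourceRecordSet"]["Name"])
```

Because the old record is overwritten rather than deleted, rolling back a wave amounts to building the same batch with the previous load balancer hostname.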
Along with the migration, other improvements were implemented, such as fully automating the creation of new environments through DevOps pipelines, which brought agility to the teams and made infrastructure tasks safely repeatable.
“Working in partnership with AWS was of great importance and synergy. We brought a level of maturity to our operation that directly reflects on the experience and quality of our services for consultants and end customers,” says Luciano Beja, Cloud & DevOps Specialist at Natura &Co Latam.
In all, 10 waves were carried out, covering 930 applications and more than 5,000 pods. All of them were migrated successfully and transparently for Natura’s consultants, customers and development team.
Currently, Natura runs high-impact business applications on EKS. Examples include the sales management platform for consultants, which has 1.8 million users; the financial services platform &Co Pay, initially launched for Natura consultants in Brazil; innovation processes; and the e-commerce platform.
From the start of the migration in 2019 until the publication of this article, the new EKS clusters have had no unavailability. This translates into more stability and availability for applications and a better experience for customers.
In addition to these benefits, Natura reduced the cost of its master nodes by 1/3 and the overall cost of its EKS environments by 48%. There are also indirect savings: its specialists can now dedicate themselves to activities that add more value to the business.
According to Renzo, head of Cloud Platform Engineering & DevOps at Natura &Co Latam, “The expansion to other markets, regions, new brands and the strong growth in demand for digital assets combined with the advancement of digitalization and our platform strategy generated an increase of approximately 80% in the amount of microservices environments. This strong growth generated more operation/upgrade complexity and increased our costs. We chose to migrate to EKS to support this growth and with the objective of gaining agility, simplification and cost reduction with the operation of microservices environments.”
Written by Thiago Couto, Solutions Architect at AWS, Luciano Beja, Cloud & DevOps Specialist at Natura, Bruno Emer, Solutions Architect Specialist in Containers at AWS, Marcio Cabrera, Cloud and SRE Specialist at Natura and Renzo Petri, Cloud Platform Engineering & DevOps at Natura.
This article was produced by Amazon Web Services (AWS), a partner of 100 Open Startups, as part of a monthly series on this blog with articles aimed at startups. AWS will also be present throughout the year with special content for startups in all editions of Oiweek and with exclusive opportunities on the 100 Open Startups platform.