• Experience in monitoring production environments (e.g. AWS CloudWatch, Prometheus, Grafana);
• Prior experience in developing metrics and alarms to monitor the health of infrastructure and applications;
• Prior experience with the ELK stack (Elasticsearch, Logstash and Kibana);
• Very good background using public cloud infrastructure (AWS);
• Very good Linux troubleshooting skills;
• Understanding of networking fundamentals (TCP/IP, DHCP, DNS, IP routing, switching, SD-WAN, security and cloud networking services);
• Programming experience in automation using Python;
• Hands-on experience with Docker, Kubernetes, Terraform, Salt;
• Knowledge of RDBMS and Cassandra databases (including query construction);
• Very good customer communication skills (English);
• Experience in managing SaaS application infrastructure with REST-based test automation;
• Experience with MariaDB, ArangoDB, ZooKeeper, RabbitMQ, etcd;
• Experience with and thorough understanding of microservice architecture and the Agile development model;
• Knowledge of build pipelines and related infrastructure (Jenkins, GitHub, CI/CD) would be an added advantage.
We have developed a Cloud Services platform, based on a microservices architecture, that hosts SaaS applications for large enterprises and internet service providers. As we scale our offerings, we are expanding our support capabilities in Europe. This position involves providing the L1/L2 support needed by our customers.
The candidate will be part of the L1 SRE engineering team, whose mission is to monitor and manage cloud operations, monitor overall health, take proactive measures to exceed customer SLAs, plan capacity, and understand both the overall cloud services platform architecture and the end-to-end customer solution needs.
• Operate, maintain and support production systems/applications; ensure that the systems are accessible and available;
• Work in a 24x7 support environment, including night shifts;
• Deploy, maintain and improve the availability and performance of the production environment to ensure high quality through early detection of issues;
• Develop metrics and alarms to monitor the health and security of applications and microservices running in the cloud on AWS infrastructure;
• Ensure systems availability in line with customer SLAs and plan capacity;
• Participate in the change management process, as appropriate;
• Escalate issues to engineering and the L3 SRE team, as appropriate, and participate in communication with customers until the issue is resolved;
• Work with customers to deliver high-quality monitoring, and continuously improve and optimize it;
• Correlate monitoring information from applications and infrastructure to help resolve problems;
• The candidate must possess outstanding problem-solving skills in the diagnosis and resolution of platform issues;
• Participate in the documentation creation process.
Reasons to join us:
Attractive salary and benefits package
We invest in your professional training, including business domain knowledge, and help you grow your professional career.
We encourage creative thinking in an open-minded work environment. Frequently the relaxation rooms are the place where the most ambitious ideas are born.
We are not just professional teams; we are also friends who have fun working together.
If you are an active person and feel motivated by creating and developing software solutions, then this is the place to be. You will not get bored.
Luxoft, a DXC Technology Company (NYSE: DXC), is a digital strategy and software engineering firm providing bespoke technology solutions that drive business change for customers the world over. Luxoft uses technology to enable business transformation, enhance customer experiences, and boost operational efficiency through its strategy, consulting, and engineering services. Luxoft combines engineering excellence with deep industry expertise, specializing in automotive, financial services, travel and hospitality, healthcare, life sciences, media and telecommunications. Luxoft is well known for its consistently high level of delivery and complex project management, its premier digital engineering talent, exceptional client focus, agility, creativity, and remarkable problem-solving capabilities.
For more information, please visit our website.