We are looking for an experienced Site Reliability Engineer for a company specializing in integrated Contact Center solutions.
As an SRE with real-world knowledge of how things should run in production, you bring a developer's approach and background to an operations role. You are responsible for proactively improving and supporting the deployment, resilience, and performance of software and infrastructure to drive the reliability of Broadvoice’s complex platforms. You also focus on automating manual operational work and on designing performance and reliability tests and monitoring frameworks that achieve noiseless alerting across our entire stack.
– Advise, assess, define, implement and support world-class SRE architectures and ways of working;
– Design and guide the implementation and adoption of SRE practices for complex business domains;
– Identify, assess and solve complex business problems in your area of responsibility, where analyzing a situation requires in-depth knowledge of organizational objectives;
– Take on high-responsibility duties such as leadership decision-making and determining objectives and approaches for critical assignments;
– Interact with senior management, negotiating on or influencing significant matters;
– Collaborate in setting strategic direction to establish near-term goals for your area of responsibility;
– Work on project roadmaps for your team, balancing operational support and regular releases with ongoing project work.
– 2+ years of relevant experience as an SRE/DevOps Engineer;
– 3+ years of relevant experience as a Backend Developer or Sysadmin Engineer;
– Solid background in Infrastructure as Code using Terraform;
– Working knowledge of Python, Shell, Ansible and YAML;
– Experience with Docker/Kubernetes deployment, configuration, scaling, and management of containerized applications;
– Strong hands-on experience with the AWS ecosystem and architecture;
– Strong technical, analytical and troubleshooting skills;
– Proficient-to-expert scripting and automation skills, converting manual and maintenance functions into fully orchestrated automation;
– Strong knowledge of and experience with system monitoring techniques and tools that support unattended operations;
– Ability to operate in complex, highly secure and highly available operations environments and to interact with the technology domain experts required to maintain those environments;
– Ability to exercise latitude in decision-making and in determining objectives and approaches for critical assignments;
– Working knowledge of monitoring and automation tools;
– Strong system security awareness and knowledge of OS hardening;
– Excellent communication & interpersonal skills.