Careerbuilder-US: Site Reliability Engineer

This is a full-time position in Atlanta, GA, posted November 2, 2021.

Title: Site Reliability Engineer

As a Senior Site Reliability Engineer, you will work as part of a team that owns the end-to-end availability, reliability, and performance of Exabeam’s cloud offering.

This is a key strategic position as Exabeam moves aggressively to the cloud with a brand-new, truly multi-tenant, cloud-native offering built on the latest technologies.

You will lead and define processes, technologies, and tools to ensure business SLAs are met.

What you will be responsible for in this role!

· Availability: Focus on system reliability; reduce operational impairment, mitigate failures, and minimize downtime.
· Resiliency: Self-healing and self-monitoring through automation, scripting, and Ansible playbooks, plus architecture reviews for higher availability, to reduce MTTR and MTTD.
· Observability: Monitoring and alerting. Ensure the underlying infrastructure is functioning properly so that critical business functions remain up and running.
· Performance: Benchmark the product to ensure it holds up under extreme load, and run chaos testing on the critical platform components.
· Problem solving: Facilitate post-incident RCA, document the findings, and educate the whole team.
· Work in a lean, highly effective organization, making strategic trade-offs in line with priorities when needed.
· Foster a healthy and collaborative culture, in line with Exabeam’s core values.
· Be a champion for SRE practices across the wider engineering organization and work with other engineering managers to grow our culture of automation and reliability.
· Collaborate with engineering teams to understand deployment practices and processes, and work towards iteratively improving releases, scalability, availability, and cost management.

The Background/Experience we’re seeking!

· You have a strong passion for SRE/DevOps and for running highly resilient, automated systems.
· You have 7+ years of experience working in global SRE teams and supporting cloud systems for large, customer-facing production environments (500+ computing nodes).
· You have deep working experience with at least one public cloud (GCP and/or AWS) or private cloud (VMware, OpenStack) and with open source software such as Kubernetes, Prometheus, Grafana, and Kafka.
· Experience with on-call rotations across continents, using a follow-the-sun model, and with handling incident response to ensure high availability.
· Regularly report on availability and incidents to senior management.
· Build a team culture that aims for innovation, high service availability, scalability, and observability.

Job Requirements:

· Coordinate failure analysis of reliability devices
· Utilized for reliability data analysis
· Attend routine reliability team meetings, maintain reliability data
· Improve equipment and plant reliability
· Compare theoretical system reliability metrics to actual reliability metrics
· View the reliability requirements and role in improving reliability
· Improve the reliability tools and methods utilized in increasing plant reliability
· Ensure maintainability and reliability
· Process equipment for reliability capability
· Maintain equipment to increase reliability
· Report reliability test results and failures
· Enhance product reliability and quality
· Use understanding of various reliability tests to assess reliability capability
· Identify product and reliability requirements
· Ensure historical product reliability programs
· Assure reliability performance of products
· Ensure new product reliability and maintainability
· Measure the system’s reliability
· Improve functional performance and reliability

