Site Reliability Engineering Manager (Bellevue, WA) (Remote Eligible)

US Remote Tech Ops-610

Okta is seeking a Site Reliability Engineering (SRE) Manager to lead our Core SRE team.

At Okta, our motto is "Always On", and nowhere do we embrace that more than in Technical Operations. We strive to build the most reliable and performant systems on the planet through the skillful use of automation. We've created an integrated system that securely connects any person via any device to the technologies they need to do their most significant work.

The Core SRE team is at the center of our growing production services at Okta. Your team works directly with TPM/QA and Engineering to automate AWS services across the world. The team also leads our edge networking services and plays a key role in a number of new projects.

The ideal candidate:

  • Has a track record of leading or managing high-performing teams while remaining hands-on.
  • Has production experience with AWS cloud-based infrastructure.
  • Has operated complex custom applications on UNIX/Linux and/or Enterprise Java platforms.
  • Is passionate about automation and uses agile software development methodologies to deliver it.

Job Duties and Responsibilities:

  • Mentor and manage a team of experienced engineers using agile development
  • Partner with recruiting to hire staff in our HQ and remote sites
  • Manage and own delivery of new infrastructure components:
    • Collaborate with TPM, architects and executive management
    • Conduct design and code reviews
    • Partner with Okta security teams
  • Continuously refine monitoring processes, thresholds, and configuration
  • Respond to issues and escalations and participate in a management on-call rotation
  • Work closely with product developers to ensure new features have the proper operational support and maintainability 

Minimum REQUIRED Knowledge, Skills, and Abilities: 

  • Demonstrated track record of leading or managing a team
  • Experience with Amazon Web Services and knowledge of AWS networking technologies (VPC/ELB/WAF)
  • Experience managing Linux systems in production
  • Proficient in at least one scripting language (bash, Perl, Ruby, Python)
  • Experience supporting a complex, multi-tier service running in the cloud
  • Prior experience in a software development, DevOps, or SRE role



Okta, Inc. is a publicly traded identity and access management company based in San Francisco. It provides cloud software that helps companies manage and secure user authentication into modern applica...
