Mattermost provides an open source, enterprise-grade messaging platform that lets teams at many leading organizations collaborate securely and privately from anywhere. With over 10,000 server downloads/month, our customers include Intel, Samsung, Affirm, the US Department of Defense, and more.
Our private cloud messaging platform offers secure, configurable, highly scalable messaging using web, mobile, and desktop applications and provides deep integrations with hundreds of SaaS and on-premises tools and applications.
We value high-impact work, ownership, self-awareness, and a focus on customer success. If these values match who you are, we hope you'll learn more about working at Mattermost.
We are looking for an engineer with demonstrated experience in software development and infrastructure using Kubernetes. You will ensure the reliability and scalability of Mattermost’s new SaaS offering by building tools and deploying infrastructure and automation in Kubernetes.
Responsibilities:
- Build services and tools to ensure the stability of Mattermost’s SaaS offering
- Define infrastructure in code with Terraform and other tools
- Write thoughtful and high-quality code in Go
- Follow our engineering best practices, and ensure alignment with our Leadership Principles
- Provide technical mentorship for fellow engineers
- Develop services to handle automatic recovery from incidents and disasters
- Automate incident or disaster simulations to identify blind spots
- Set technical vision and innovate to be on the forefront of self-healing SaaS services
- Implement, maintain, and tune monitoring and alerting systems
- Deploy applications to and manage Kubernetes clusters
- Participate in our on-call rotation to respond to incidents and resolve problems
Requirements:
- Bachelor's degree in Computer Science or related fields, or significant professional DevOps or SRE experience
- 5+ years of previous experience as a developer or SRE with operational responsibilities
- Understand Kubernetes inside and out
- Proven experience responding on-call to incidents with superior knowledge of incident response processes
- Strong skills and experience working with infrastructure as code tools, such as Terraform
- Familiarity with container systems such as Kubernetes & Docker
- Solid programming skills and experience with or an ability to quickly become proficient in Go
- Ability and willingness to be on-call
Preferences:
- Experience with distributed application systems using HTTP, WebSockets, RPC, pub/sub, etc. at scale
- Open source contributions to related projects
- Knowledge of Grafana and Prometheus
- Comfortable with GitHub, Jira, Jenkins, CircleCI
- Experience working in open source communities