Job Description
<div class="content-intro"><p>Looking for an innovative, high-growth, multi-award-winning company in one of the hottest segments of the security market? Look no further than Veracode! </p>
<p>Veracode is a global leader in Application Risk Management for the AI era. Powered by trillions of lines of code scans and a proprietary AI-generated remediation engine, the Veracode platform is trusted by organizations worldwide to build and maintain secure software from code creation to cloud deployment.</p>
<p><em>Learn more at </em><a href="https://www.veracode.com/"><em>www.veracode.com</em></a><em>, on the </em><a href="https://www.veracode.com/blog"><em>Veracode blog</em></a><em>, and on </em><a href="https://www.linkedin.com/company/veracode/"><em>LinkedIn</em></a><em> and </em><a href="https://twitter.com/Veracode"><em>Twitter</em></a><em>. </em></p>
<p> </p></div><p>We are seeking a skilled Manager, Site Reliability Engineering, to lead the reliability, availability, and operational excellence of Veracode’s production systems. This role focuses on defining and enforcing reliability standards, managing risk in production, and ensuring services meet agreed-upon service levels under real-world load and failure conditions. </p>
<p>The ideal candidate has experience operating large-scale distributed systems in production, driving and implementing SLO-based reliability practices, and partnering with engineering, security, DevOps, and product teams to improve system reliability and developer velocity at the same time. </p>
<p><strong><em>Key Aspects of Role</em></strong> </p>
<ul>
<li>Lead a nine-member global Site Reliability Engineering team.</li>
<li>Set objectives and key results (OKRs) and KPIs, and manage team performance.</li>
<li>Act as the primary point of accountability for reliability concerns that span multiple teams, including DevOps, Security, Database, and Product Engineering, driving alignment and resolution.</li>
<li>Manage the team's on-call schedule and act as the point of escalation for alerts and production incidents.</li>
<li>Create tickets, groom the backlog, and prioritize work in sprints.</li>
<li>Utilize AWS services to design scalable cloud solutions that support critical systems.</li>
<li>Partner with software engineering teams to ensure monitoring and alerting is in place, enabling consistent, scalable, and automated service delivery.</li>
<li>Own the design and enforcement of the organization’s observability strategy, ensuring continuous improvements in reliability, standardization, and observability across the board.</li>
<li>Drive alert hygiene, standardization, and reduction of alert fatigue across the organization. </li>
<li>Lead efforts to automate infrastructure deployment and management using Terraform, Kubernetes, and other cloud-native tools.</li>
<li>Create automated incident response workflows to handle common infrastructure and application issues.</li>
<li>Collaborate with security teams to ensure systems adhere to industry-standard security practices and policies.</li>
<li>Document and train engineering teams on best practices in reliability, scalability, and operational excellence.</li>
<li>Design, operate, and continuously improve on-call and incident response processes to ensure sustainability, appropriate escalation, and reduction of operational toil.</li>
<li>Contribute to incident and process post-mortems.</li>
<li>Ensure uptime, SLAs, and availability of critical platform components through process improvements and automation.</li>
<li>Monitor existing applications and infrastructure while improving current monitoring.</li>
<li>Communicate effectively with project stakeholders and management.</li>
<li>Develop and support processes to maintain uptime, SLAs and availability of critical platform components.</li>
<li>Troubleshoot and resolve production issues related to systems, networks, and applications.</li>
<li>Ensure that our systems and processes adhere to industry-standard security practices and policies. </li>
</ul>
<p><strong>Required Skills/Experience:</strong> </p>
<ul>
<li>Bachelor's Degree in Computer Science, Information Science, Engineering, or related/relevant field or equivalent experience.</li>
<li>2+ years working as a manager or team lead with direct reports.</li>
<li>5+ years working in an SRE, DevOps, Cloud Engineering, or similar role.</li>
<li>Experience with AWS and automation tools like Terraform, CloudFormation, or Ansible.</li>
<li>Hands-on experience deploying, managing, and troubleshooting Kubernetes clusters.</li>
<li>Hands-on proficiency with observability, monitoring, and alerting tools (Datadog, Sumo Logic, Prometheus, Grafana, etc.).</li>
<li>Familiarity with CI/CD pipelines and repository management tools (e.g., GitLab, Jenkins, GitHub).</li>
<li>Strong programming skills for automation (Python, Go, or similar languages).</li>
<li>Solid understanding of infrastructure as code (IaC) and GitOps methodologies.</li>
<li>Strong communication skills with the ability to collaborate effectively across different teams.</li>
<li>Ability to work in an Agile environment.</li>
<li>Proven experience in troubleshooting production environments and improving system reliability.</li>
<li>Experience with on-call/incident management systems such as PagerDuty, VictorOps or OpsGenie. </li>
</ul>
<p><strong><em>Desired Experience:</em></strong> </p>
<ul>
<li>Experience with service meshes (e.g., Istio) to enhance application observability and security.</li>
<li>Familiarity with advanced Kubernetes features (e.g., StatefulSets, Helm, Operators).</li>
<li>Knowledge of database management and migration processes, including RDS and DMS. </li>
</ul>
<p><strong><em>Compensation Transparency</em></strong> </p>
<p>In accordance with U.S. pay transparency laws, Veracode provides compensation transparency for roles based in the United States. Click <a href="https://www.veracode.com/sites/default/files/pdf/veracode-compensation-tranparency.pdf">here</a> to view our compensation ranges by grade. Please note that specific compensation may be influenced by various factors, including the candidate's experience, education, and work location. </p>
<p>Job Grade: Manager </p>
<p><em>Employment opportunities are available to all applicants without regard to race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.</em> </p><div class="content-conclusion"><h3>Fraudulent Recruitment Alert - Be Aware and Stay Informed</h3>
<p class="sa-static-feature-box-section__text mb-2">At Veracode, we prioritize a secure recruitment process. Unfortunately, fake recruitment and job offer scams are on the rise. They aim to deceive candidates through emails and calls to obtain sensitive information.</p>
<p class="sa-static-feature-box-section__text mb-2">Here’s our recruitment promise to you:</p>
<ul>
<li class="sa-static-feature-box-section__text mb-2">Comprehensive Interview Process: We never extend job offers without a comprehensive interview process involving our recruitment team and hiring managers.</li>
<li class="sa-static-feature-box-section__text mb-2">Offer Communications: Our job offers are not sent solely through email, and we will never ask you to pay for your own hardware.</li>
<li class="sa-static-feature-box-section__text mb-2">Email Verification: Recruiting emails from Veracode will always originate from an “@veracode.com” email address.</li>
</ul>
<p class="sa-static-feature-box-section__text mb-5">If you have any doubts about the authenticity of an email, letter, or telephone communication claiming to be from Veracode, please reach out to us at <a href="mailto:careers@veracode.com">careers@veracode.com</a> before taking any further action.</p>
<p> </p></div>