Site Reliability Engineer / Platform Operations Engineer
Job Description
We are looking for an experienced Site Reliability Engineer or Platform Operations Engineer for our client. This is a permanent position that is remote to start, with later relocation to Calgary or Winnipeg. Our client is a global enterprise company with a product that you've likely used.
You Will:
- Own development projects, providing technical guidance and delivering against the Platform & Service Operations Engineering roadmap.
- Design and implement wargames to test our operational response and identify areas of weakness in our platforms.
- Act as the technical and management escalation point for Service Operations Centre (SOC) engineers and during major incidents.
- Troubleshoot, reproduce, and mitigate issues in our production environments.
- Mentor other team members.
- Operate global AWS platforms at scale.
You Have:
- Evidence of strong troubleshooting, problem-solving, and investigative skills
- Experience with AWS or other cloud providers
- Experience developing in Java
- Experience of major incident management and of operating production platforms at scale
- Experience working with distributed web applications
- Experience automating operational tasks and processes using other languages
- Understanding of relational and/or NoSQL data structures
- Experience mentoring/influencing peers
- Ability to identify improvements, weigh risks against benefits, and translate them into technical requirements
Bonus:
- Worked with Ansible, Terraform, Python
- Experience working with serverless platforms or containers
- Experience with ELK and/or Graphite / Prometheus / Grafana
- Experience using tracing tools in production
- Experience in Chaos Engineering / Failure Injection Testing
- Experience of working in an Agile environment
- Experience working in a similar site reliability role
This role offers great perks and a competitive salary. Please apply to the job posting if it matches your career path!
Frequently Asked Questions
Who is hiring?
This role is with Targeted Talent in Halifax.
Is this a remote position?
According to the job description, the position is remote to start, with later relocation to Calgary or Winnipeg.