Posted on

30+ days ago

Description

At TomTom…

You’ll move the world forward. Every day, we create the most innovative mapping and location technologies to shape tomorrow’s mobility for the better.

We are proud to be one team of more than 5,000 unique, curious, passionate problem-solvers spread across the world. We bring out the best in each other. And together, we help the automotive industry, businesses, developers, drivers, citizens and cities move towards a safe, autonomous world that is free of congestion and emissions.

The SRE team at TomTom brings software and system engineering skills together. We code our way out of operational problems, working with internal and external teams to build resilient, scalable and reliable systems in order to deliver services of the highest quality to our customers.

This is a unique chance to solve challenging problems, dig deep into complex systems and troubleshoot them, learn from incidents, work across the stack to drive the reliability of our services, and have a real impact on global mobility every day.


What you’ll do

  • Work with partners to shape the architecture, design, and implementation of new and existing systems and ensure their reliability.
  • Join the Incident Commander rotation for high-priority incidents, getting hands-on when required to improve the time to resolve (TTR).
  • Apply resiliency engineering and drive incident response, analysis and remediation to prevent recurrence.
  • Ensure that critical services have an adequate monitoring and alerting setup and that operational hygiene is applied to guarantee their continuity.
  • Deliver software to improve reliability, performance and scalability across the stack.
  • Collaborate with the team to define the SRE strategy and roadmap.

What you’ll need

  • 10+ years of working experience in a production environment, covering software and system engineering.
  • 5+ years of production experience operating Linux systems on cloud or bare metal, covering infrastructure as code, configuration management and monitoring.
  • Extensive experience designing, developing, operating and troubleshooting mission-critical distributed systems at scale.
  • Knowledge of algorithms and data structures, and proficiency in one or more modern programming languages, such as Java, Go, C++, Scala or Python.
  • Knowledge of Linux systems internals.
  • Knowledge of networking.
  • Excellent written and oral communication skills, ability to collaborate successfully with technical and non-technical stakeholders.
  • Track record of establishing successful mentorship relationships with colleagues, expressing technical leadership without "pulling rank" and role modeling the SRE principles.
  • Business acumen, ability to prioritize high ROI work, strong sense of ownership.

Nice to have

  • Experience working with Kubernetes and Prometheus in production.
  • Experience working with AWS, Azure or a similar cloud environment at scale.

Meet your team

Our team is part of TomTom's core live services. We connect with all DevOps teams to make sure the customer experience is as good as possible during an incident and to minimize the MTTR (mean time to resolve). We also focus on reducing the number of incidents by participating in improvement actions centered on automation and reliability.

Our Site Reliability Engineers (SREs) are a hybrid of software and systems engineers. We code our way out of operational problems. We are responsible for reliability, scalability, and automation while keeping an eye on latency, performance and capacity as well as other KPIs.

Achieve more

We are self-starters who play well with others. Every day, we solve new problems with creativity, meet new people and learn rapidly at our offices around the world. We will invest in your growth and are committed to supporting you. In everything we do, we’re guided by six values: we care, putting our heart into what we do; we build trust, so you can count on us; we create, driven to make a difference; we are confident, but don’t boast; we keep it simple, since life is complex enough; and we have fun, because life’s too short to be boring.

After you apply

Our recruitment team will work hard to give you a meaningful experience throughout the process, no matter the outcome. Your application will be screened closely and you can rest assured that all follow-up actions will be thorough, from assessments and interviews through your onboarding.

TomTom is an equal opportunity employer

We celebrate diversity, thrive on each other’s differences and are committed to creating an inclusive environment at our offices around the world. Naturally, we do not discriminate against any employee or job applicant because of race, religion, color, sexual orientation, gender, gender identity or expression, marital status, disability, national origin, genetics, or age.

Ready to move the world forward?

Source: TomTom