Site Reliability Engineer II CTJ

Job Overview

Location
London, England
Job Type
Full Time Job
Job ID
53640
Date Posted
1 year ago
Recruiter
John Apl

Job Description

The Application Platform and Serverless team is part of the broader Azure organization, with a mission to empower developers to quickly build and manage highly scalable distributed applications on Azure. The team is responsible for some of the most popular, highly paid and fastest growing Azure services, such as Azure App Service (WebApps), Azure Functions, Azure API Management, Azure Logic Apps and the recently announced Azure Static Web Apps. These services cater to thousands of developers across the globe who develop and manage Web, Mobile, API, Event-Driven and IoT applications. Besides serving a multitude of industry sectors, our platform hosts mission-critical applications for many Fortune 500 companies.

Bringing simplicity to our customers creates challenging engineering opportunities in areas like scalability, high density, multi-tenancy, high availability, user experience and developer tools.

With the emergence of Serverless as the next generation of cloud compute and the recent disruptions in container and container orchestration technologies such as Docker and Kubernetes, Cloud Native, Hybrid and Serverless application development and hosting platforms have an exciting future ahead. Come join us and help shape the next-generation Serverless and Application Platform for Azure.

As the services grow (millions of machines power the back end) and the pace keeps increasing (we add new scale units every week), we are looking for engineers who are passionate about managing services that span many continents and time zones, live site excellence, monitoring, and engineering improvements. Our goal is to deliver world-class availability and stability to our customers while keeping the service secure and safe. We always aim to automate solutions to existing problems through code and move on to the next challenge. Experience working in a services environment and/or building developer tools and libraries is highly desired.

Azure App Service

App Service is a platform-as-a-service (PaaS) offering and one of the top services in Azure overall (with the most paying customers of any Azure compute service). The service powers Web Apps, Mobile Apps, API Apps, Logic Apps, Azure Static Web Apps, and Functions, a popular offering for the serverless world.

Azure Functions

Azure Functions is an open source project at the heart of Azure’s serverless platform and is the next generation of cloud compute. It runs in more than 50 regions and handles billions of daily invocations of customer code written in a variety of languages, including JavaScript, Python, C#/F#, Java and PowerShell.

Azure API Management

APIs are the foundation of our connected, digital world. Azure API Management enables companies building APIs to publish, describe, secure, manage and monitor their APIs so that they can focus on providing the functionality their developers want, while we provide the infrastructure they need.

Azure Logic Apps

Our charter is to define what it means to build an integration app with modern connectivity and services, and to be the market leader in this space. Our primary technical focus is integrating various SaaS (Software as a Service) and Enterprise systems and making business-to-business and enterprise application connectivity possible. The Logic Apps product is available today in Azure and offers a great end-to-end experience, including world-class web authoring, management, and monitoring for workflows and orchestrations between applications, data, and services. The Logic Apps service runs at massive scale, processing billions of actions daily.

Responsibilities

We are looking for strong Site Reliability Engineers who are passionate about deploying, automating, maintaining, and monitoring a globally distributed service at large scale. An ideal candidate is a persistent problem solver who can own, triage, investigate and resolve service issues, with an emphasis on communication, documentation, and improving service reliability. The ideal candidate can also work in a collaborative and diverse team environment, both locally and across geographies, and can wear many hats, learn quickly and change direction when needed.

In this role you will help to deliver and support the next set of services, features and developer tools that will define how applications are developed and hosted on Azure. You'll be working with both open source and platform components written in a variety of programming languages, as well as technologies from up and down the entire Azure stack.

This is a unique opportunity to work on large scale distributed systems supporting initiatives that are key to Microsoft's cloud strategy.

In the Azure Application Platform and Serverless team, Site Reliability Engineer duties include actively participating in the on-call rotation (DRI), which can include creating and updating SOPs and Troubleshooting Guides, monitoring systems, mitigating incidents and restoring service, and deep-dive analysis of the root causes of outages. This position is collaborative: it requires working with many partner teams and learning and understanding how to use their technology.

Key responsibilities:

· Support services before they go live through activities such as buildout, stabilization, capacity planning, and internal customer support.

· Maintain services once they are live by supporting external customers and by measuring and monitoring service availability and system health.

· Convert manual operations into repeatable automated processes.

· Improve services by pushing for changes that improve reliability and velocity.

· Practice sustainable incident response and blameless postmortems.

Qualifications

Hands-on experience with distributed systems and services in the cloud.

Proven ability to track complex technical issues in running online services.

Excellent written and oral communication skills.

Experience with .NET-based systems is a plus.

Experience with scripting languages and automation tools is a plus.

Experience using Linux, Docker and Kubernetes is a plus.

Experience using public cloud services including Azure, AWS or Google Cloud.

Security Clearance Requirements: Candidates must be able to meet the Microsoft, customer and/or government security screening requirements for this role. These requirements include, but are not limited to, the following specialized security screenings:

· Citizenship Verification: This position requires verification of US Citizenship to meet federal government security requirements.

· Candidates must have an active TS and be willing to upgrade to TS/SCI (with polygraph) or have an active TS/SCI and be willing to upgrade to TS/SCI (with polygraph). This role will require candidates to maintain the TS/SCI (with polygraph) clearance.

· Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Candidates selected for this position must comply with Federal Executive Order 14042, which mandates COVID-19 vaccination for federal contractors and subcontractors, either by being fully vaccinated before their date of hire or by working with Microsoft to receive an approved religious or medical accommodation.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
