Site Reliability Engineer Python -Java

Company: JPMorgan Chase & Co.
Location: Jersey City
Posted on: June 7, 2021

Job Description:

As a Site Reliability Engineer (SRE), you'll help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation. You'll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment, you'll take the lead on relevant projects, backed by an organization that provides the mentorship you need to learn and grow. As an SRE, you'll focus on running better production applications and systems.

Responsibilities:

  • Design, code, test and deliver software to automate manual operational work
  • Troubleshoot priority incidents, facilitate blameless post-incident evaluations and ensure permanent closure of incidents
  • Automate manual operational work by improving products or software
  • Engage with development teams throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
  • Perform analytics on past data, such as incidents and usage patterns, to predict issues and take proactive steps in support of better service level objectives
  • Design self-healing and resiliency patterns
  • Design performance tests, identify bottlenecks, optimization opportunities and capacity demands, and present solutions for continuous improvement
  • Design best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting
  • Design automated software and product upgrades, change management and release management solutions
  • Split time between operational work and engineering work
  • Coach or manage teams as applicable
  • Build tech and business dashboards using visualization tools such as Grafana or Tableau
  • Participate in the 24x7 support coverage, including weekend shifts as needed

Qualifications:

BS/BA degree or equivalent experience in a software engineering discipline

Proficient in two or more programming languages such as Python, Java (preferred) or Go, covering design, coding, testing and software delivery, with 5 to 8+ years of experience

Proficient in the development of automated tools, systems and services in multiple technology domains

Proficient in one or more infrastructure components such as networking, cloud services, orchestration tools, containerization, compute and storage systems

Proficient in service-level changes to a system and troubleshooting components

Experience in a production support environment

Experience with Splunk, Dynatrace or other monitoring tools

Working knowledge of the Unix/Linux environment

Experience designing and contributing to performance monitoring and capacity management tools

Expert practitioner in one or more technology domains, may be a cross-domain expert, able to solve complex and mission critical problems within a business or across the firm

Excellent debugging and troubleshooting skills

Expertise in Continuous Integration and Continuous Delivery

Proven experience in development and support of REST API interfaces, streaming applications (Spark Streaming and Kafka), SQL and NoSQL databases (specifically Cassandra or HBase), Spring Boot, and distributed caching solutions such as Hazelcast and GemFire

Experience implementing API gateway products such as Apigee, CA-Layer 7 or Mashery; evaluating open source and vendor products; conducting hands-on POCs to prove concepts and products; building distributed systems at Internet scale; and migrating applications to internal and external clouds

Experience with high-volume, mission-critical applications, and building upon messaging and/or event-driven architectures

Experience in engineering solutions for metrics gathering/publishing and event collection/correlation across distributed architectures, automation, monitoring, intelligent alerting, random fault injection (Chaos Engineering), and self-healing.
