NVIDIA is looking for a world-class engineer to join its multifaceted, fast-paced Infrastructure, Planning and Processes organization as a Senior DevOps and SRE Engineer. You will be part of a team that develops and maintains sophisticated build and test environments for a multitude of hardware platforms, including NVIDIA GPUs and Tegra processors, across various operating systems (Windows/Linux/Android). The team works with other business units within NVIDIA Software, such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence, Robotics and Driverless Cars, to meet their infrastructure and systems needs.
What you’ll be doing:
Kubernetes system administration in a DevOps CI/CD environment. Designing and implementing clusters, cluster segmentation, and internal/external networking for 4+ CI/CD deployment environments: dev, test, staging, production.
End-to-end implementation of the Kubernetes architecture (installation, configuration, hardening, networking, sizing, scaling, etc.) to support CI/CD pipelines for GitLab CI/CD and Jenkins. Configuring Kubernetes auto-provisioning and autoscaling of CI/CD job/build agents, runners, and nodes.
Implementing high-availability clusters and disaster recovery solutions.
Applying strong system administration experience with configuration-as-code and infrastructure-as-code tools such as Ansible, Puppet, Chef and Terraform.
Designing and implementing monitoring solutions to gain more insight into application and system health. Implementing critical metrics using various analytics methods and dashboards.
Crafting and developing tools for automating workflows. Using AI techniques to extract useful signals about machines and jobs from the data they generate.
Taking part in prototyping, crafting and developing cloud infrastructure for NVIDIA.
Participating in on-call support and critical issue coverage as an SRE.
What we need to see:
Solid programming background in Python, Go and/or similar scripting languages.
Experience maintaining cloud infrastructure and highly available production environments.
Excellent debugging, problem solving and analytical skills.
Strong understanding of architectural requirements and development processes involved in building reliable, robust, scalable data products and pipelines.
Experience with both SQL (MySQL) and NoSQL (Elasticsearch/MongoDB/Cassandra) databases.
Proficient with configuration management tools like Ansible, Puppet and Chef, and with source code management and binary repository systems like GitLab, GitHub and Artifactory.
Strong background with GitLab, Jenkins and/or other CI/CD systems.
Proficient with Kubernetes administration, Docker and virtualization. Knowledge of standard security methodologies.
Proficient with data analytics/visualization & monitoring tools like Kibana, Grafana, Splunk, Zabbix, Prometheus and/or similar systems.
5+ years of proven experience.
Bachelor’s or Master’s degree in Computer Science, Software Engineering, or equivalent experience.
Ways to stand out from the crowd:
Thrives in a multi-tasking environment with constantly evolving priorities.
Prior experience on a large-scale operations team. Experience using and improving data centers. Expertise with Windows Server infrastructure.
Outstanding interpersonal skills and communication with all levels of management.
Background in computer algorithms and the ability to choose the most suitable algorithms to meet scaling challenges.
Ability to break complex problems down into simple subproblems and reuse available solutions to implement most of them. Ability to design simple systems that work efficiently without needing much support.
With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our exclusive engineering teams are rapidly growing. If you’re a creative and autonomous engineer with a real passion for technology, we want to hear from you.
NVIDIA is a Learning Machine
NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and the metaverse is transforming the world’s largest industries and profoundly impacting society.
Learn more about NVIDIA.