Summary

I have demonstrated ability in distributed systems, cloud computing, and backend/middleware development. Going forward, I intend to continue working on exciting projects in this domain.


Experience

Production Engineering Intern

Meta (Facebook) Bay Area, California (May 2022 - August 2022)

  • Ads Machine Learning PE: Internship with the Feature Platform Engineering team.

Software Engineer

DE Shaw and Co. Hyderabad, India (July 2019 - May 2021)

  • Kafka adoption for incident-response systems: As a monitoring SME, redesigned the event processing engine to support event sourcing with Kafka; raised the maximum event coverage capacity (SLO) to ~2 million events/day.

  • Linux grid job submission infra: Improved fault tolerance of 6 different Apache Mesos masters by 50% with streaming replicas, using the Patroni framework for Postgres in a newly created hypercluster; saved 19% of the monthly error budget of the in-house Mesos PaaS.

  • Release Engineering: Conducted enterprise and DMZ Linux server patching and releases for servers that run the Elasticsearch cluster, the Vault secret store, and core infra services such as DNS, Puppet, and Kerberos.

  • Container registry and universal artifact repository deployment: Deployed and maintained a multi-site, multi-cluster, highly available on-prem Universal Artifact Repository with Anycast routing for low-latency package uploads and downloads. Served as the Directly Responsible Individual for this system, which serves more than 600K artifacts with a three-nines (99.9%) availability SLO.

Systems & Operations Engineering Intern

DE Shaw and Co. Hyderabad, India (May 2018 - June 2018)

  • Systems Opsconsole: Analyzed tickets and system-wide issues reported for better operational efficiency. Built visualizations to provide insights.

  • Designed and developed a web app with a React front end, a Python backend, and DE Shaw proprietary JavaScript libraries.

  • In addition to reducing MTTD (Mean Time To Detect), the project reduced by 19% the MTTR (Mean Time To Repair) for SREs fixing users' NFS home directories, grid job submissions, active sessions, and group memberships.


Education

Georgia Institute of Technology | MS Computer Science (Aug 2021 - Dec 2022)

GPA: 4.0/4.0

Computer Networks, Database Systems Concepts and Design, System Implementation, Computer Vision

College of Engineering Pune (India) | BTech Computer Engineering (2015-2019)

CGPA: 9.09/10 (Honors)

Distributed Systems, Computer Networks, Operating Systems, Data Science, System Administration


Skillset

Languages and frameworks

Python, Go (Golang), Java, C++, SQL, bash, JavaScript, React

Container technologies

Kubernetes, Docker/Podman, Helm, MLOps and orchestration with Kubeflow

Infrastructure technologies

Microservices, Linux administration, Cloud Computing, HTTP/TCP load-balancing, Databases, Kafka, Puppet


Certifications

Kafka

Implementing an Event Log with Kafka | certificate🔗

Getting Started with Apache Kafka | certificate🔗

Cloud Computing and Site Reliability

Designing Infrastructure Deployment on AWS | certificate🔗

Site Reliability Engineering: Measuring and Managing Reliability by Google Cloud | certificate🔗

MLOps (Machine Learning Operations) Fundamentals by Google Cloud | certificate🔗

AWS core services | certificate🔗

Practical Networking | certificate🔗

Kubernetes

Kubernetes for Developers: Integrating Volumes and Using Multi-container Pods | certificate🔗

Getting Started with Kubernetes | certificate🔗

Go

Concurrent Programming with Go | certificate🔗

Go: Getting Started | certificate🔗

Java

Java Fundamentals: Object-oriented Design | certificate🔗

UX

UX for Developers | certificate🔗

Getting started in UX design | certificate🔗

C++

Reading Legacy C++ | certificate🔗

Learn to program with C++ | certificate🔗

