Summary
I have hands-on experience in distributed systems, cloud computing, and backend middleware development, and I intend to keep working on challenging projects in these areas.
Experience
Production Engineering Intern
Meta (Facebook), Bay Area, California (May 2022 - August 2022)
Ads Machine Learning PE: Internship with the Feature Platform Engineering team.
Software Engineer
DE Shaw and Co., Hyderabad, India (July 2019 - May 2021)
Kafka adoption for incident-response systems: As a monitoring SME, redesigned the event processing engine to support event sourcing with Kafka; raised the maximum event coverage capacity (SLO) to ~2 million events/day.
Linux grid job submission infra: Improved the fault tolerance of 6 Apache Mesos masters by 50% with streaming replicas using the Postgres Patroni framework in a newly created hypercluster; saved 19% of the monthly error budget of the in-house Mesos PaaS.
Release Engineering: Conducted patching and releases for enterprise and DMZ Linux servers, including those running the Elasticsearch cluster, the Vault secret store, and core infra services such as DNS, Puppet, and Kerberos.
Container registry and universal artifact repository deployment: Deployed and maintained a multi-site, multi-cluster, highly available on-prem Universal Artifact Repository with Anycast routing for low-latency package uploads and downloads. Directly Responsible Individual for this system, which serves more than 600K artifacts under a three-nines (99.9%) availability SLO.
Systems & Operations Engineering Intern
DE Shaw and Co., Hyderabad, India (May 2018 - June 2018)
Systems Opsconsole: Analyzed reported tickets and system-wide issues to improve operational efficiency, and built visualizations to provide insights.
Designed and developed a web app with a React frontend, a Python backend, and DE Shaw proprietary JavaScript libraries.
In addition to reducing MTTD (Mean Time To Detect), the project reduced by 19% the MTTR (Mean Time To Repair) for SREs fixing users' NFS home directories, grid job submissions, active sessions, and group memberships.
Education
Georgia Institute of Technology | MS Computer Science (Aug 2021 - Dec 2022)
GPA: 4.0/4.0
Computer Networks, Database Systems Concepts and Design, System Implementation, Computer Vision
College of Engineering Pune (India) | BTech Computer Engineering (2015-2019)
CGPA: 9.09/10 (Honors)
Distributed Systems, Computer Networks, Operating Systems, Data Science, System Administration
Skillset
Languages and frameworks
Python, Go (Golang), Java, C++, SQL, Bash, JavaScript, React
Container technologies
Kubernetes, Docker/Podman, Helm, MLOps and orchestration with Kubeflow
Infrastructure technologies
Microservices, Linux administration, Cloud Computing, HTTP/TCP load-balancing, Databases, Kafka, Puppet
Certifications
Kafka
Implementing an Event Log with Kafka | certificate🔗
Getting Started with Apache Kafka | certificate🔗
Cloud Computing and Site Reliability
Designing Infrastructure Deployment on AWS | certificate🔗
Site Reliability Engineering: Measuring and Managing Reliability by Google Cloud | certificate🔗
MLOps (Machine Learning Operations) Fundamentals by Google Cloud | certificate🔗
AWS core services | certificate🔗
Practical Networking | certificate🔗
Kubernetes
Kubernetes for Developers: Integrating Volumes and Using Multi-container Pods | certificate🔗
Getting Started with Kubernetes | certificate🔗
Go
Concurrent Programming with Go | certificate🔗
Go: Getting Started | certificate🔗
Java
Java Fundamentals: Object-oriented Design | certificate🔗
UX
UX for Developers | certificate🔗
Getting started in UX design | certificate🔗
C++
Reading Legacy C++ | certificate🔗
Learn to program with C++ | certificate🔗