Senior Software Engineer

Scalable Backends · Distributed Systems · ML Integration · AWS

Senior Software Engineer with 4+ years of experience specializing in scalable backend systems and ML-driven applications. Proven expertise in designing high-performance APIs, implementing event-driven architectures, and integrating machine learning solutions.

Portrait of Ankush Rai

Experience

  1. ZS — Senior Software Engineer

    Jul 2024 – Jan 2025 · Bengaluru, IN

    • Led migration from a legacy Tableau dashboard to a modern web application using React, FastAPI, and Spring Boot with Redis, reducing latency by 45% and eliminating $100K/year in licensing costs.
    • Built an ML microservice for anomaly detection in clinical system logs and operational metrics using isolation forests and time-series features, reducing triage time by 30%.
    • Developed a real-time notification system with WebSockets and AWS MSK, enabling sub-580ms delivery for alerts used by research and operations teams with 99.9% uptime.
    • Implemented automated CI/CD pipelines with GitLab and Docker, replacing ad-hoc scripts and sequential builds, cutting deployment time by 35% and making releases more reliable.
  2. ZS — Software Engineer

    Mar 2022 – Jun 2024 · Bengaluru, IN

    • Developed a GenAI-powered pipeline to extract KPIs from 10K+ unstructured healthcare reports using prompt chaining and preprocessing, automating 20+ hours of analyst work per week.
    • Optimized RESTful APIs to handle 3x the concurrent load by tuning DB connection pooling, adding Redis caching, and validating stability through JMeter load tests.
    • Integrated Prometheus and Grafana to monitor service health and improve incident response time.
    • Protected HIPAA-compliant pharmaceutical APIs by integrating Okta SSO with JWTs and rate limiting, securing access to 500K+ patient records for research teams.
  3. EY — Software Engineer I

    Oct 2020 – Feb 2022 · Bengaluru, IN

    • Improved backend performance by 35% for a large automotive platform by implementing caching, pagination, and streamlined payload handling across services built with Node.js and Spring Boot.
    • Fixed race conditions in multithreaded services using locks and task queues, improving system stability and reducing transactional errors by 15%.
    • Tuned PostgreSQL performance using indexing and execution plan analysis, cutting query latency in key workflows.
    • Built robust unit and integration test suites with Pytest and Postman, increasing coverage and reducing production issues.
  4. Pantech Solutions — Machine Learning Intern

    Aug 2018 – Nov 2018 · Chennai, IN

    • Developed data validation and preprocessing scripts for real-time stock data pipelines, improving training input quality.
    • Trained and validated ML models (Linear Regression, SVM) using scikit-learn to predict short-term stock trends, applying feature engineering for improved accuracy.
    • Documented edge cases and model drift patterns to improve debugging and retraining workflows.

Education

  1. Santa Clara University

    Santa Clara, CA · March 2025 – March 2027

    Master of Science in Computer Science and Engineering

  2. SRM Institute of Science and Technology

    India · August 2016 – June 2020

    Bachelor of Technology in Computer Science and Engineering

Projects

Personal & Open Source

CFO Copilot: Mini FP&A Agent

Lightweight FP&A assistant built with Streamlit that reads CSV fixtures and answers common finance questions with text summaries, charts, and 2-page PDF exports. Features a rule-based intent classifier with a TF-IDF/embedding fallback, covering Revenue vs Budget, Gross Margin %, Opex, EBITDA, and Cash runway metrics.

  • Python
  • Streamlit
  • RAG
  • sentence-transformers
  • TF-IDF
  • Pandas
  • PDF Export

cf_ai_chat_memory: Persistent AI Chat on Cloudflare

A serverless AI chat assistant built entirely on Cloudflare that remembers conversations. Uses Llama 3.1 8B via Workers AI for intelligent responses, Cloudflare Workers for edge compute, and Cloudflare KV for persistent conversation memory with 24-hour sessions.

  • JavaScript
  • Cloudflare Workers
  • Workers AI
  • Llama 3.1 8B
  • Cloudflare KV
  • Serverless
  • Edge Computing
  • Wrangler

Virtual Memory Management Simulator

Simulates OS virtual memory paging using five page replacement algorithms (FIFO, LRU, LFU, MFU, Random). Models locality of reference with 150 random processes, tracking hit ratios and page swaps.

  • C
  • Operating Systems
  • Algorithms
  • Memory Management
  • Data Structures

PaperTrail: NVIDIA AI Document Intelligence for Nonprofits

Real-time document understanding platform for nonprofits. Parses PDFs with pypdf/pypdfium2 + NVIDIA OCR NIM, embeds chunks with NVIDIA embed-qa-4, retrieves via in-memory vector store, and generates structured JSON with NVIDIA Nemotron showing critical/important insights, deadlines, and AI-generated action steps.

  • Python
  • FastAPI
  • NVIDIA NIM
  • RAG
  • LangChain
  • Vector DB
  • OCR
  • Nemotron

Acoustic Shield: AI Vehicle Emergency Sound Detection

AI-powered audio classification system deployed on AWS that detects vehicle emergency sounds in real time. Uses a wav2vec2 model to classify tire skids, emergency braking, and collision-imminent scenarios with sub-500ms latency.

  • Python
  • AWS SageMaker
  • PyTorch
  • HuggingFace
  • wav2vec2
  • Audio ML
  • S3
  • librosa

GreenSense: Automated Plant Health Monitoring

JavaFX plant monitoring system with an event-driven architecture and a custom EventBus. Implements an environmental simulation engine with automated hydration/thermal regulation, pest detection, and a species-parameterized health model, using SOLID principles and design patterns (Singleton, Observer, Factory).

  • Java
  • JavaFX
  • Maven
  • OOP
  • Design Patterns
  • Event-Driven

Enterprise & Industry Projects

Digital Control Tower

Amgen · ZS Associates

Led development of a centralized monitoring platform that tracks API performance and system health metrics in real time; ML-driven anomaly detection reduces incident response time and prevents ~15% of potential downtime events.

  • Java
  • Spring Boot
  • Python
  • Flask
  • scikit-learn
  • JavaScript
  • ML

Quality Control Schedule Optimization

Pfizer · ZS Associates

Real-time notification system delivering live alerts via Kafka and WebSockets on AWS MSK; achieved sub-580ms latency with 99.9% uptime for critical quality control operations.

  • FastAPI
  • Kafka (MSK)
  • AWS
  • GenAI

Adaptive Onboarding Experience Engine

General Motors · EY

Collaborative filtering recommendation system that personalizes content on user onboarding dashboards, significantly boosting feature adoption and user engagement across GM's digital platforms.

  • Python
  • scikit-learn
  • Pandas
  • Flask

Skills

Languages & Frameworks

Languages: Python, Java, JavaScript/TypeScript, C

Backend: FastAPI, Spring Boot, Flask, Express.js

Frontend: React, Streamlit

AI & Machine Learning

Frameworks: TensorFlow, PyTorch, scikit-learn

GenAI/LLM: RAG, LangChain, Hugging Face, OpenAI API, NVIDIA NIM

NLP: Transformers, Embeddings, TF-IDF, sentence-transformers

Data & Infrastructure

Databases: PostgreSQL, MySQL, Redis, Vector DBs

Streaming: Kafka, AWS MSK, WebSockets

Data Tools: Pandas, NumPy, Matplotlib, Jupyter

Cloud & DevOps

Cloud: AWS (EC2, S3, MSK, Lambda), Azure

DevOps: Docker, GitLab CI/CD, GitHub Actions

Monitoring: Prometheus, Grafana, CloudWatch

Publications

Awards

Contact