Mohit Rajput
Machine Learning Software Engineer | Data Scientist | AI/ML/DL | IIT Roorkee
- Data product management, solution architecture, and end-to-end data/ML solution design (from collection and experimentation to deployment, monitoring, and maintenance).
- MLOps platforms with centralized and decentralized architectures, including microservices-based systems, industrial Edge processing, and hybrid cloud–edge workflows.
- Generative AI, augmented LLMs, agentic AI systems, and MCP Host/Server architectures, with a focus on tool-augmented workflows and prompt engineering.
- Statistical, machine, and deep learning across time series, computer vision, NLP, chatbots, and signal/sensor data for real-world decision-making.
- Reinforcement and unsupervised learning, recommendation systems, and data-centric architecture (data collection, data engineering, and R&D-driven solution design).
Core Responsibilities:
- Lead end-to-end ML solution design for centrifuge operational optimization.
- Own architecture, data pipelines, model deployment on Edge, and integration with industrial systems.
- Collaborate with cross-functional teams including Dianomic Systems and ADM stakeholders.
Experience in this company:
- MCP Host & MCP Server for FogLAMP
- Designed and implemented an LLM-powered MCP Host and FogLAMP MCP server supporting APIs, DB connections, agent persona orchestration, and dynamic tool routing between Excel, FogLAMP, and cloud services.
- Enables enterprise and agentic workflows for edge and cloud AI systems.
- Tags: MCP, MCP Host, API Design, FogLAMP, Edge AI, Agentic Workflows, Registry
- Excel FogLAMP Datalink
- Developed an Excel add-in to fetch FogLAMP data, enabling analysts and operators to explore and visualize edge data directly inside Excel templates.
- Tags: Excel Add-in, Office.js, FogLAMP, Enterprise Integration, Data Visualization
- Centrifuge Breakover Forecasting (EDGE Deployment)
- Designed, trained, and deployed a predictive ML solution on EDGE devices to optimize centrifuge operation and reduce downtime.
- Integrated with PLC and HMI for real-time operator feedback and decision support.
- Tags: Stacked-LSTM, PLC, HMI, FogLAMP, Edge Compute, Postgres, Grafana, GCP
- Solids Buildup Detection (EDGE Deployment)
- Developed a time–frequency domain signal processing and autoencoder-based solution for predictive maintenance of centrifuges.
- Deployed on FogLAMP to monitor solids accumulation using high-frequency vibration signals from multiple sensors (8 kHz streams).
- Tags: Autoencoder, Fourier Analysis Networks, Signal Processing, High-Velocity Data, FogLAMP, Postgres, Grafana, Edge AI
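As an illustration only (not the deployed code), the frequency-domain feature step of such a pipeline could be sketched as below. The function names, band edges, and the 1 kHz test tone are assumptions made for this example, and the autoencoder stage that would consume these features is omitted.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (fine for short windows)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def band_energies(samples, sample_rate, band_edges_hz):
    """Sum squared magnitudes of the bins falling in each [low, high) band."""
    n = len(samples)
    spectrum = magnitude_spectrum(samples)
    energies = []
    for low, high in zip(band_edges_hz, band_edges_hz[1:]):
        energies.append(sum(
            m * m for k, m in enumerate(spectrum)
            if low <= k * sample_rate / n < high
        ))
    return energies


# Hypothetical 10 ms window of an 8 kHz stream containing a 1 kHz tone.
SAMPLE_RATE = 8000
window = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(80)]
features = band_energies(window, SAMPLE_RATE, [0, 500, 1500, 4000])
print(features)  # the middle band (500-1500 Hz) dominates
```

A vector of band energies per window is one plausible input for an autoencoder whose reconstruction error then serves as the anomaly score.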
- Multi-Reference Anomaly Detector
- Researched and planned a multi-reference anomaly detection system leveraging contrastive learning and multi-instance learning for centrifuge operations.
- Supports operator-action and machine-response anomaly detection on vibration and process data.
- Tags: Siamese Networks, Multi-Instance Learning, Contrastive Learning, Fourier Analysis, Vibration Data, Edge AI
- MLOps Infrastructure (Cloud Agnostic)
- Architected and developed a cloud-agnostic, edge-ready MLOps framework on GCP (partially validated on Azure and on-prem) covering ingestion, feature engineering, training, hyperparameter tuning, evaluation, monitoring, and data validation/drift checks.
- Evaluated Postgres vs BigQuery vs InfluxDB; set up Grafana dashboards for operational and model monitoring, with support for multi-environment dev/stage/prod, backfill, health monitoring, and logging.
- Tags: MLOps, GCP, Azure, Grafana, Postgres, BigQuery, InfluxDB, Feature Engineering, Model Training, Drift Detection
- FogLAMP Filter and Rule Development
- Implemented multiple filters and rules for improved data handling, anomaly detection, and sensor health monitoring:
- foglamp-filter-moving-measures
- foglamp-filter-data-drift
- foglamp-rule-sensordrift (vibration sensor disconnect detection)
- Tags: FogLAMP, Edge Compute, Signal Processing, Python, Plugin Development
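The idea behind moving measures and disconnect detection can be sketched in plain Python. This is not the FogLAMP plugin API and not the actual plugin code: the class, the function, the window size, and the threshold are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class MovingMeasures:
    """Rolling mean and standard deviation over the last `window` readings."""

    def __init__(self, window: int):
        self.values = deque(maxlen=window)

    def update(self, reading: float) -> dict:
        self.values.append(reading)
        return {
            "mean": mean(self.values),
            "std": pstdev(self.values),
            "full": len(self.values) == self.values.maxlen,
        }


def looks_disconnected(stats: dict, min_std: float = 1e-3) -> bool:
    """Flag a flat-lined signal: a full window with almost no variation."""
    return stats["full"] and stats["std"] < min_std


tracker = MovingMeasures(window=5)
live = [0.8, -0.6, 0.7, -0.9, 0.5]  # hypothetical vibrating sensor
flat = [0.0, 0.0, 0.0, 0.0, 0.0]    # hypothetical disconnected sensor
flags = [looks_disconnected(tracker.update(v)) for v in live + flat]
print(flags)  # only the last reading, after a full flat window, is flagged
```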
- Data Infrastructure and Monitoring
- Developed system resource and syslog monitoring modules; built improved automated data transfer scripts for reliable movement of operational data.
- Tags: System Monitoring, Syslogs, Automation, Edge Infrastructure
- Azure Data Platform & Medallion Architecture
- Established a Delta Lake on Azure Data Lake with a Medallion Architecture using Azure Data Factory for customer analytics and ML use cases.
- Evaluated DVC, MLFlow, and Databricks for experiment tracking and data/version management.
- Tags: Azure Data Factory, Delta Lake, DVC, MLFlow, Databricks, Data Medallion Architecture
- Plugin Development for Testing
- Created the foglamp-south-datafaker plugin to generate configurable signal patterns for system, pipeline, and MCP orchestration testing.
- Tags: FogLAMP, Edge Compute, Plugin Development, Test Infrastructure
Notable Contributions:
- Defined and delivered an Edge AI and MLOps blueprint for centrifuge optimisation that is reusable across assets and sites.
- Shipped multiple production-grade EDGE deployments (forecasting, anomaly detection, solids buildup) integrated with PLC/HMI and Grafana-based monitoring.
- Introduced MCP Host + FogLAMP MCP server and Excel FogLAMP Datalink, enabling operator-friendly, tool-augmented workflows for industrial data analysis.
- Established Azure Data Platform with Medallion architecture and evaluation frameworks (DVC/MLFlow/Databricks) to standardise experimentation and governance.
Core Responsibilities:
- Build data enrichment and provider matching pipeline.
- Optimize identity resolution at scale.
Experience in this company:
- National Provider Identifier (NPI) Filler services for the missing NPI
- Filled in missing information for healthcare providers across multiple cases, using the data available from the data source.
- Tags: Identity Resolution, Similarity Mapping, Search, Parallel Processing, Big Data, APIs, Redis, Spark
Core Responsibilities:
- Scale Day Ahead Capacity Planning from pilot to multi-country production.
- Develop monitoring, automation, and utilities for faster deployment cycles.
- Work with operations, supply chain, and engineering teams across geographies.
Experience in this company:
- Day Ahead Capacity Planning
- Provides a day-ahead forecast for the last mile to delivery stations, enabling them to plan logistics.
- Cost savings come from minimizing over- and underbooking of logistics.
- Tags: Time Series, Curve Matching, ML, Sagemaker, EC2, Lambda, Event Bridge, S3, Dashboard
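A minimal sketch of how over- and underbooking could be costed when comparing plans. The unit costs and the demand figures are hypothetical placeholders, not values from the project.

```python
def booking_cost(forecast, actual, over_cost=1.0, under_cost=3.0):
    """Total cost of capacity booked from `forecast` against `actual` demand.

    Overbooking pays for unused capacity; underbooking pays for
    last-minute capacity. Both unit costs are hypothetical.
    """
    total = 0.0
    for booked, needed in zip(forecast, actual):
        if booked >= needed:
            total += (booked - needed) * over_cost
        else:
            total += (needed - booked) * under_cost
    return total


actual = [100, 120, 90]      # hypothetical demand per cycle
flat_plan = [110, 110, 110]  # same booking every cycle
day_ahead = [102, 118, 91]   # forecast tracking demand more closely

print(booking_cost(flat_plan, actual))  # 60.0
print(booking_cost(day_ahead, actual))  # 9.0
```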
- Common Utilities Python Lib (v1)
- A simple library of common utilities shared across multiple projects:
- Credential Keeping and Retrieving: This is based on AWS Secret Manager
- SQL Query Executor: Run SQL jobs while capturing and providing the job stats and logging them too.
- Logger: A file and table-based logger wrapper
- S3 Interaction: A wrapper containing functions for S3-based utility.
- Operations: A set of certain classes & functions for Path, Time, & Math-related Operations
- Data Evaluation & Clustering: Functions to evaluate & show performance of Regression & class-based data in much greater detail.
- Data Analysis: Functions to identify missing data with respect to groups, plot the data, etc.
- Document Interaction: ability to connect & retrieve data from Quip
- Tags: AWS Services, DB, SQL, Evaluation, Logging, Analysis, data pipeline
- Capacity Planning Performance Evaluation, Analysis & Dashboard
- Developed and maintained capacity planning evaluation and analysis, robust enough to surface multiple insights and generate plans for the current state, pilots, and areas for improvement.
- Helps evaluate multiple components of delivery across countries, stations, ship methods, and cycles.
- Tags: Time Series, EC2, QuickSight, S3, Excel
- Multiple Reports Creation and Data Pipelines Testing
- Created Reports for Middle Mile Performance Evaluation and tested/adjusted a few data pipelines for reporting.
Notable Contributions:
- Took over and stabilized the project with zero active maintainers.
- Led deployment expansion to multiple countries within 6 months.
- Optimized pipeline runtime and reporting latency for faster planning.
Core Responsibilities:
- Build ML POCs and MVPs for digital transformation in agriculture and supply chain.
- Collaborate with product managers and global business stakeholders.
- Transition pilot projects to other teams after successful MVP delivery.
Experience in this company:
- ML labs v2 setup
- Defining ways to work with ML projects and other applications having ML-related features
- Bringing sustainable practices for accelerated development while maintaining the agility
- Planning phase of data governance, 3rd Party Data Annotators, Portal development for data tagging, selection and versioning, data collection experiment designs and pipelines.
- Estimating Growth Indicator of Chicken (POC)
- Estimating the weight and age of the chicken based on a video recording done using the mobile camera.
- Tags: Agile Experiment Setup & Design, Object Detection, Image Standardizer, Estimator, Ensembling
- Defective Nugget Detection and Accounting (POC then MVP)
- Defective Nugget Detection and Tracking on conveyor belt with output projected on display for manual removal.
- Tags: Jetson Nano, Raspberry Pi, Arduino Uno, Edge device, Object Detection, Tracking, on-premise, real-time
- HR Chatbot (Myco) (MVP)
- Initial product version was launched for Singapore and Australia with capabilities for form filling, ticket creation, ticket checking, and easy data integration from Excel sheets.
- Tags: Rasa, Redis, NLU, NDM, NLG, NLP, Chatbot, excel
- Financial Commentary Generation (POC)
- Developed a template-based table-to-text utility for the initial stage; this approach was deemed the best fit because the data sources were sparse and highly fragmented.
- YOLOv5-v4 training and deployment integration on AWS, suitable to multiple tailored needs.
- Tags: Computer Vision, Sagemaker, ECR, S3, Lambda Function, API Gateway, IAM, CloudWatch
- Computer Vision Utility
- Streamlit-based web application for parsing, labelling, and visualizing annotations, with support for Yolo prediction.
- Works with local, S3, and pCloud storage.
- MiApp feature involving weather data visualization using the GIS data.
- Information Extraction from Ship Invoices (POC)
- Extracted multiple fields from information shared in old and current emails.
- Tags: Text-to-Table, Analytics; Regex, Python Email to DB
Notable Contributions:
- Delivered MVPs that transitioned into internal products.
- Developed first successful internal vision pilot used for later projects.
- Groomed soft and business skills through direct stakeholder interaction.
Core Responsibilities:
- Develop ML-based bot detection frameworks.
- Build and optimize detection pipelines with focus on low FP and adaptive learning.
- Research and implement reinforcement, unsupervised, and semi-supervised solutions.
Experience in this company:
- ICLSSTA
- Solely developed a framework that builds and evaluates unsupervised learning solutions in depth, driven by a single config file.
- Tags: Framework, Clustering, Anomaly, Outlier, Concept Drift Detection, Ensembling, Supports all dimension reduction, anomaly, and clustering algorithms from scikit-learn, Custom algorithm add-ons
- Adaptive Action Taking (AAT)
- Developed a generalized module named Adaptive Action Taking (AAT), which uses reinforcement learning to automatically take action on incoming traffic to a web property.
- Tags: Reinforcement Learning, Multi-Armed Bandit, Smart handling of Business Limits, Self-Adjusting
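For readers unfamiliar with the multi-armed bandit idea, here is a minimal epsilon-greedy sketch. It is not the AAT implementation: the action names, reward values, and parameters are hypothetical, and business-limit handling is left out.

```python
import random


class EpsilonGreedyBandit:
    """Pick the best-known action most of the time, explore otherwise."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}  # running mean reward

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)  # explore
        return max(self.actions, key=lambda a: self.values[a])  # exploit

    def update(self, action, reward):
        self.counts[action] += 1
        n = self.counts[action]
        # Incremental mean: new = old + (reward - old) / n
        self.values[action] += (reward - self.values[action]) / n


# Hypothetical actions on incoming traffic and made-up average rewards.
rewards = {"allow": 0.2, "captcha": 0.8, "block": 0.5}
bandit = EpsilonGreedyBandit(rewards, epsilon=0.1)
for _ in range(500):
    action = bandit.choose()
    bandit.update(action, rewards[action])
print(bandit.values)
```

The incremental mean keeps the per-action state small, which suits a module that has to adjust itself continuously on live traffic.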
- Deep Behaviour Analysis (IDBA)
- Partially developed a module named Intent-based Deep Behaviour Analysis (IDBA), which uses an LSTM to yield encoded features and scores for use with supervised, anomaly, and clustering models.
- Tags: RNN, Auto-Encoder, Semi-Supervised Learning, Sequence Based Detection, Deep Learning
- Threshold Online Reinforcement (ThOR)
- Helped develop a module named Threshold Online Reinforcement (ThOR), which uses online learning to build a probability score distribution based on isotonic regression for action taking.
- Tags: Isotonic Regression, Online Learning, Self-Adjusting, Smart handling of Business Limits
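The isotonic regression building block can be illustrated with a pure-Python Pool Adjacent Violators sketch. This shows only the monotone fit, not ThOR itself or its online-learning loop, and the per-bucket rates are invented for the example.

```python
def isotonic_fit(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical observed bad-traffic rate per score bucket, ordered by score.
observed = [0.1, 0.3, 0.2, 0.6, 0.5, 0.9]
calibrated = isotonic_fit(observed)
print(calibrated)
```

The fitted values never decrease as the score grows, so a single threshold on them maps cleanly to an action.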
- Second Level Module Integration
- Developed and implemented end-to-end machine learning and reinforcement learning based solutions using ICLSSTA-AAT, IDBA-ICLSSTA-AAT, and IDBA-ThOR.
- Tags: Semi-Supervised, Supervised, unsupervised, reinforcement learning, behaviour analysis
- Titan Batch Analyser
- Developed and implemented the Titan Batch Analyser, which uses multiple browser signatures, behaviour-based rules, regular expressions, and databases to generate a suspiciousness score for taking action on traffic.
- Tags: SQL & Python Implementation, ETL, Advance SQL operations, Handling Multiple Datasets
- Rule Scripts
- Developed and implemented multiple rules based on network, device, and browser fingerprints and on visitor behaviour. These were balanced between causal and correlated signals, yielding negligible false positives.
- Tags: Domain knowledge, Behaviour Analysis, Rules development
- Rule Mining
- Created a recommendation system for highlighting possibly bad signatures in incoming traffic.
- Tags: Association Rule Mining, Text Data Pre-processing
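Association rule mining scores item pairs by support and confidence. The sketch below shows that calculation on pairs only; it is not the production system, and the token names, thresholds, and sample requests are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.3, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical signature tokens seen per request.
requests = [
    {"headless", "no_js", "flagged"},
    {"headless", "no_js", "flagged"},
    {"headless", "flagged"},
    {"no_js"},
    {"chrome"},
]
for rule in pair_rules(requests):
    print(rule)
```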
- Developing Dynamic Moving Collective Intelligence which can evolve while working across multiple sid.
- Other Works
- Developed dashboards, visualizations, and analysis spreadsheets to understand overall traffic behaviour. Partially interactive visualization sheets were also developed in Python.
- Tags: SQL and Spreadsheet/Python Implementation, Visualization, Result Sharing
Notable Contributions:
- First author and co-inventor on patented bot detection frameworks:
- “System and method for detecting bots based on iterative clustering and feedback-driven adaptive learning techniques” — US20200099713A1
- “System and method for detecting bots using semi-supervised deep learning techniques” — US20200099714A1
- Contributed IP and research that supported ShieldSquare’s acquisition by Radware.
Here’re some projects I’ve developed
articledoc – Firefox Extension for Medium-to-PDF [demo link]
Developed a Firefox extension that converts Medium articles into a clean, defined PDF template for easier reading, printing, and offline archiving.
Personal Site for Articles and Notes [demo link]
Static site for publishing long-form posts, reading notes, and technical deep-dives across ML, Edge AI, and software engineering topics.
LLM-Powered Gmail Email Agent
Email fetching, filtering, and summarization workflow for Gmail using LangChain and Groq; supports query-based search, smart filtering, and compact summaries of single or multiple emails.
WhatsApp Mart Chatbot (LLM-Powered)
Mart chatbot available over WhatsApp (via Twilio) for product search, filtering, comparison, cart building and management, and checkout/invoice support. Backend pipeline deployed on GCP with Kubernetes Event-Driven Autoscaling (KEDA), with a separate MindsDB pipeline powering advanced recommendation and query scenarios.
mcp-plots – MCP Server for Data Visualization [demo link]
A Model Context Protocol (MCP) server for data visualization that exposes tools to render charts (line, bar, pie, scatter, heatmap, etc.) from structured data. Returns plots as PNG images, base64-encoded images, or Mermaid diagrams; available via Smithery, Glama, PyPI, and the MCP registry.
IP camera (RTSP protocol) to consistent HTML Live Stream with offline Data Sync
Edge Device based CV application for counting of sheets processed in a workshop [demo link]
Sticker Detection and OCR [demo link]
Image to Image Search - Flickr8k [demo link]
The following searchers were created:
1. RCH - Regional Color Histogram
2. PTG - Pre Trained Grouping
3. TAE - Trained Auto Encoder
4. ICG - Image Caption Based Grouping (incomplete)
Image to Image Search - Jewellery [demo link]
Accounting for Pipe Bundles being Exported [demo link]
Face Detection, Face Landmark Detection and Face Recognition Wrappers [demo link]
Human Pose Estimation and Activity Recognition [demo link]
GIS Temperature and Precipitation Time based changes visualization [demo link]
WhatsApp based Chatbot (Hack around way)
Selenium-based web scraping and web application control of Tinder & WhatsApp
Streamlit deployed app for basic Computer Vision applications. [demo link]
Streamlit based threshold adjustment for object detector and annotation visualizer
Supervised and Unsupervised Learning Framework
Easy selection of basic data pre-processing, transformation, dimension reduction/clustering/anomaly algorithm selection or pipeline creation, hyperparameters setting, ensemble, and EDA.
Other projects
- Flying Taxi Business Case, Udacity
- Build a Scalable Data Strategy, Udacity
- Create an Iterative Design Path, Udacity
- Facial Keypoint Detection, Udacity
- Image Captioning, Udacity
- Landmark Detection & Tracking (SLAM), Udacity
- Part of Speech Tagging, Udacity
- Build an Adversarial Game Playing Agent - Isolation game playing agent, Udacity
- Build a Forward-Planning Agent - Cargo route planning, Udacity
- Sudoku solver, Udacity
- Twitter Sentiment Analysis
- Identifying customer segment
- Black Friday: Understanding the customer purchase behaviour, Analytics Vidhya
- Estimation of the audience score of movies, Coursera
- Predicting the way in which exercises were performed, Coursera
- Forecasting bike rental demand in the Capital Bike Sharing Program, Kaggle
- Predicting Survival on the Titanic, Kaggle
Here’re the libraries that I’ve created
1. Algorithmic Machine Learning Exploration and Exploitation Tool (AMLEET)
A personal library comprising a general code base for faster development. It supports the following.
- General: Code Visual in Terminal; operations related to list, dict, datetime, pandas and numpy; Runtime python cmdline support; Git status info; system performance; and many more.
- Logger: Initializing, Method.
- Configuration: Generator; Accepts json, yaml, cmdline, configparser and dict; Merge multiple configs; etc
- Notifications: Support sending Emails and SMS
- Storage: Ability to work with data lakes such as pCloud and S3.
- DB Support: Provides functionality to work with tables and databases. Additionally supports Google BigQuery.
- Computer Vision:
- General
- Work With Streams
- Color Threshold
- Draw on Frame
- Manipulate Frame
- Image Transformation for Transfer
- Video to Images
- Images to Videos or GIF
- Annotation Conversions and Plotting
- Image Pixel Standardizations
- Multi Face Detection
- Multi Face Landmark
- Multi Recognition - best selection
- Multi Pose Estimation
- Training Data Creation
- NLP:
- Text Cleaning
- EDA
- Embedding Creation
- Tables
- Feature Analysis
- Custom Scaling and Transformation
- EDA
- Transformation
- Supervised & Unsupervised Learning
- Wrapper on scikit-learn
- Anomaly and outlier Detection
- Custom Algorithm support
- Framework to support config to select suitable algorithm and params
- Ensembling
- Support for multiple metadata export
- Evaluation
- Regression
- Classification
- Object Detection
- Sections Still in Development
- Redis Integration
- GIS Data Support
- Time Series
- Network information extraction
2. COCO Transformation Utility (CTU) [demo link]
Developed as a part of ML-Labs@Cargill. Awaiting approval to be released as open source.
- Enables modifying your COCO annotations in step with the transformations applied to the images.
- Provides the capability to produce augmented images and annotations.
- Ability to plot multiple annotations on images.
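To show what keeping annotations in step with an image transform means, here is a minimal sketch for a horizontal flip of COCO `[x, y, width, height]` boxes. The function names are hypothetical and this is not the CTU API, which has not been released.

```python
def hflip_coco_bbox(bbox, image_width):
    """Mirror a COCO [x, y, width, height] box for a horizontally flipped image."""
    x, y, w, h = bbox
    # The box's right edge becomes its new left edge, measured from the left.
    return [image_width - x - w, y, w, h]


def hflip_annotations(annotations, image_width):
    """Return copies of COCO annotation dicts with flipped boxes."""
    return [
        {**ann, "bbox": hflip_coco_bbox(ann["bbox"], image_width)}
        for ann in annotations
    ]


# Hypothetical annotation on a 100-pixel-wide image.
annotations = [{"id": 1, "category_id": 3, "bbox": [10, 20, 30, 40]}]
flipped = hflip_annotations(annotations, image_width=100)
print(flipped[0]["bbox"])  # [60, 20, 30, 40]
```

Flipping twice returns the original box, which makes a convenient sanity check for any transform pair of this kind.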
3. mcp-plots – MCP Server for Data Visualization [demo link]
A Model Context Protocol (MCP) server for data visualization that exposes tools to render charts (line, bar, pie, scatter, heatmap, etc.) from tabular data. Supports multiple output formats including PNG images, base64-encoded images, and Mermaid diagram syntax. Published on PyPI and listed on Smithery, Glama, and the MCP registry.
4. SyInfo – Simple System Information Library [demo link]
A simple, well-structured Python library for gathering system information including hardware specifications, network configuration, and real-time system monitoring. Provides analysis utilities for log search, package inventory, and basic diagnostics to support observability and troubleshooting workflows.
- First Author. “System and method for detecting bots based on iterative clustering and feedback-driven adaptive learning techniques”; US20200099713A1; link.
- Second Author. “System and method for detecting bots using semi-supervised deep learning techniques”; US20200099714A1; link.
- First Author. “Electric field and current assisted alignment of CNT inside polymer matrix and its effect on electrical and mechanical properties”, in International Journal for the Science and Technology of Polymers 2016; link.
- ML Languages/Framework
- Python, scikit-learn, TensorFlow, PyTorch, NumPy, pandas, LangChain, OpenCV, Streamlit, Flask, Dask
- Other Language
- R, SAS, Java, C++
- Shell & OS
- Linux/Unix Shell, PowerShell
- Other
- APIs, Google Big Query, Docker, Selenium, Redis, Grafana, HTML, CSS, Jenkins, Apache Spark, Apache Kafka
- Databases
- SQL, NoSQL, Postgres
- Software
- Tableau, Excel, XLMiner, Spyder, Jupyter Notebook, RStudio, Gimp, SolidWorks
- Cloud Computing Services
- Google Cloud Platform, Amazon Web Services, Microsoft Azure, Databricks, AVEVA PI System, Microservices Architecture
- Cloud Data lakes
- S3, Google Cloud Storage (GCS), Azure Data Lake Storage (ADLS), Google Drive, Pcloud, Delta Lake
- Hardware
- Jetson Nano, Raspberry Pi, Arduino Uno, IP Cameras, Sensors, Programmable Logic Controller (PLC), basic electrical devices and circuits
- New
- Postgres, Azure Data Factory, Databricks, Cursor, VS Code, Jupyter Lab, Excel-Addon, MCP, Cloud Functions, Pub/Sub, Compute Engine, FogLAMP/Fledge
- Credit Courses:
- Economics
- Corporate Social Responsibility
- Object Oriented Programming
- Management Concepts and Practices
- Marketing
- Human Resource Management
- Operations
- Financial Management
- Behaviour Psychology
- Hindi [Native]
- English [Professional]
- Reading
- Cooking
- PC Gaming
- Bike Trips
- Travelling