Ashutosh Agrahari

Data Scientist

Siri Information Intelligence & Product Engineering

Apple

India

About Me

I am an experienced Data Science professional with 4+ years in machine learning, statistical modeling, and data-driven product development. At Apple, I developed over 200 on-device Siri capabilities, including Hindi language support, and created Tableau dashboards to analyze Siri India usage, influencing major product launches. I also developed data pipelines for audio data collection and transcription, optimizing processes and improving stakeholder coordination. Previously at NPCI, I implemented fraud detection models and a time-series forecasting model to ensure banking system stability. Skilled in machine learning, anomaly detection, and production-level model deployment, I am passionate about leveraging data science to drive business impact.

Professional Memberships

ACM Student Member: 009585

IEEE Signal Processing Society Affiliate: 101003093

IET, UK Member: 1100747856

Research Interests

Machine Learning

Deep Learning

Natural Language Processing

Computer Vision

Statistical Modeling & Analytics

Data Visualization

Skills

Languages: Python, SQL, Trino, HTML/CSS, JavaScript, Spark (Scala), R

Tools/Technologies: Git, Firebase, Superset, Tableau, Redis, TensorFlow, Visual Studio, Xcode, Anaconda, Jupyter

Soft Skills: Self-reliance, technical curiosity, self-confidence, team spirit, optimism, responsibility, keen observation

Operating Systems: macOS, Windows, Linux

Hobbies: Reading, Yoga, Fitness, Bike-riding, Cooking

Education

Amity University Lucknow Campus

Bachelor of Technology in Computer Science and Engineering (2016 - 2020)

CGPA: 9.39 on a scale of 10

City Montessori School, Gomti Nagar-I

Intermediate (2015) | High School (2013)

Intermediate: 92.50% | High School: 92.60%

Work Experience

Apple (Machine Learning Engineer)

Hyderabad, India, Jun'22 - Present

  • Apple Intelligence: Designed and curated an edge-case dataset to capture linguistic variations across English dialects. Evaluated internal LLM performance after updates and investigated production bugs arising from internal live-on testing.
  • On-device Siri: Developed 200+ on-device capabilities, improving Siri’s model responsiveness by an average of 10% through iterative data analysis, domain-level data modeling, and retraining cycles.
  • Product Release Quality workflow: Collaborated with global teams to evaluate Siri’s performance across locales; automated the asset promotion workflow, reducing engineering effort by 80%. Applied LLM-based clustering to segment large bug sets, cutting manual triage time.
  • Hindi support for Siri Translation: Implemented client-side logic in SiriKit to enable Hindi support in Siri’s machine translation feature, expanding Siri’s capabilities to serve new markets.
  • Ad Hoc Analytics: Analyzed Siri usage patterns in India and surfaced insights through Jupyter and Tableau dashboards, driving the rollout of the English & Hindi UI option within Siri.
  • Audio Data Collection: Led an end-to-end internal user study to enhance Siri’s speech recognition; managed textual data creation, participant coordination, and cross-functional collaboration. Delivered high-quality audio data at a significantly lower cost than industry benchmarks.
  • Speech Audio Transcription: Led the transcription workflow for large-scale speech data; defined views and built scalable data management solutions to ensure transcription quality and efficiency.

National Payments Corporation of India (Data Science Associate)

Hyderabad, India, Nov'20 - Jun'22

  • AML Detection (Collaboration with IIT-H): Built graph-based anomaly detection models using Neo4j to analyze million-scale financial data, identifying fraudulent patterns and enhancing risk detection capabilities.
  • HNDP Forecasting Model: Developed a time-series forecasting model to predict HNDP (monetary reserves) for the Reserve Bank of India and NPCI, helping mitigate systemic risk in the banking sector.
  • IMPS Fraud Detection Model (Collaboration with IISc): Designed and implemented machine learning pipelines on large-scale imbalanced financial transaction data; engineered key features and managed Redis data pipelines for production deployment, improving fraud detection efficiency and saving around 50 lakhs per month.

Vestella (Research Intern)

Seoul, South Korea, May'19 - Jul'19

  • Worked on intelligent traffic systems, recommendation systems, and blockchain-related projects.
  • Developed a hybrid self-adapting recommendation system and an AI system to automatically detect riders not wearing helmets.

CSIR-CDRI (Research Intern)

Lucknow, India, Dec'18 - May'19

  • Worked on data visualization projects.
  • Visualized key insights about the US education system by mining and analyzing relevant US education statistics.

BSNL (Summer Trainee)

Lucknow, India, Jun'18

  • Learned about the behind-the-scenes technologies that make cellular phone calls possible.

Publications

Marvelous Hand: An IoT-Enabled Artificial Intelligence-Based Human-Centric Biosensor Design for Consumer Personal Security Application

Ashutosh Agrahari; Ruchi Agarwal; Pawan Singh; Abhishek Singh Kilak; Deepak Gupta; Ankit Vidyarthi

IEEE Transactions on Consumer Electronics, vol. 70, no. 1, pp. 1063-1070

Prognosticating the effect on Unemployment rate in the post-pandemic India via Time-Series Forecasting and Least Squares Approximation

Ashutosh Agrahari; Pawan Singh; Ankur Veer; Anshuman Singh; Ankit Vidyarthi; Baseem Khan

Pattern Recognition Letters (pp. 172-179), Elsevier.

Smart City Transportation Technologies: Automatic No-Helmet Penalizing System

Ashutosh Agrahari and Dhananjay Singh

Blockchain Technology for Smart Cities. Blockchain Technologies. Springer, Singapore, 2020

Independent Projects

Automatic Alert Generation in response to Gatherings during COVID-19 Pandemic

Trained a CSRNet crowd-density estimation model on the ShanghaiTech dataset.

Developed a system that automatically generates an alert when the number of people in a gathering exceeds a set threshold.

Skin Lesion Segmentation

Trained a deep learning model using FCNet, and also applied conventional image filtering techniques to generate segments.

Image Toonifier

Used various image processing and machine learning approaches to generate a toonified version of a given image.

Sudoku Vision

A Flask-based web application that solves a Sudoku puzzle from its image.

ATPC Tracker

A Windows executable that brings all announcements made on the Amity Placement Portal onto one platform, so that a student can keep an eye on them.

Unlabeled News Topic Classifier

A system that classifies unlabeled news articles using unsupervised methods.

Heart Disease Recognition Web App

A web application that identifies whether a patient is prone to heart disease based on pathological information.

Pneumonia Detection from Chest X-Ray Images

An AI system that detects whether a patient is prone to pneumonia by analysing their chest X-ray image.