About

Bio

I am a Ph.D. student at NYU Data Science, working with the Center for Social Media and Politics, with interests in simulation-based inference and causality. I use probabilistic programming to conduct social network analysis, with a focus on understanding misinformation spread and control.

I hold a Bachelor’s degree in Computer Engineering and have worked at Adobe Research on trending hashtag recommendation (accepted at ISM 2021) and at the European Organisation for Nuclear Research (CERN) on particle physics. I’ve worked on graph neural networks, machine learning for high-energy physics, recommender systems, and machine learning in cybersecurity, and I have a working knowledge of language models and deep convolutional neural networks.

Social Network Simulation

  • Talk at Facebook Research: Probability Org., Nov 2021.

I am building SimPPL, a social network simulator that demonstrates how to combine heterogeneous datasets in a principled manner to create an expressive model of online social networks conditioned on real-world data. It is part of my ongoing research on misinformation control and highlights how such a tool can help us understand the diffusion of information and the evolution of beliefs on platforms like Facebook and Twitter. There is a rich set of downstream applications for such a simulator, including interventions to curb misinformation spread and the causal modeling of online user behavior. I am collaborating with the Torr Vision Group at Oxford on the applications.

Probabilistic Programming and COVID-19 Models

This project offers an exposition of COVID-19 modeling techniques based on the ideas and problem setup highlighted in Wood et al. (2020). We define a generative model corresponding to our intuition about epidemiological modeling using the probabilistic programming framework Pyro, and apply probabilistic inference to draw insights into controlling the COVID-19 pandemic through interventions. In particular, we estimate the confidence intervals for the outbreak parameters to ensure that a predetermined goal is achieved. We are not epidemiologists; the sole aim of this study is to serve as a guide to generative modeling, not to draw inferences about the real-world impact of policy-making for COVID-19.
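To give a flavor of the idea, here is a minimal sketch in plain Python. It is not the project's Pyro code: it assumes a simple deterministic SIR model, a uniform prior over the transmission rate, and rejection sampling as the inference method. The function names, the prior range, and the 10% peak-infection goal are all hypothetical choices made for illustration.

```python
import random


def simulate_peak_infected(beta, gamma=0.1, population=10_000,
                           initial_infected=10, days=200):
    """Run a discrete-time SIR model and return the peak infected fraction.

    beta is the transmission rate and gamma the recovery rate (per day).
    All default values are illustrative assumptions.
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    peak = infected
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - recoveries
        peak = max(peak, infected)
    return peak / population


def infer_safe_betas(threshold=0.1, num_samples=2000, seed=0):
    """Rejection sampling: draw beta from the prior and keep only the draws
    for which the goal (peak infected fraction <= threshold) is achieved."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(num_samples):
        beta = rng.uniform(0.05, 0.5)  # assumed prior over transmission rate
        if simulate_peak_infected(beta) <= threshold:
            accepted.append(beta)
    return sorted(accepted)


def central_interval(sorted_samples, mass=0.9):
    """Return the central interval holding `mass` of the sorted samples."""
    n = len(sorted_samples)
    low = sorted_samples[int((1 - mass) / 2 * n)]
    high = sorted_samples[int((1 + mass) / 2 * n) - 1]
    return low, high


if __name__ == "__main__":
    safe_betas = infer_safe_betas()
    print("accepted samples:", len(safe_betas))
    print("90% interval for beta:", central_interval(safe_betas))
```

The accepted samples approximate the distribution over the transmission rate given that the goal is met, which is the quantity an intervention would need to target. A framework like Pyro replaces the brute-force loop with more efficient inference over richer models.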

Recommender Systems, Graph Neural Nets

  • Paper: Open-domain Trending Hashtag Recommendation for Videos, S. Mehta et al., IEEE International Symposium on Multimedia (2021)
  • Patent: Under review, Adobe Research

Recommendations determine the type, ranking, and placement of most content appearing on our screens, from social networks to e-commerce sites and advertisements. A laser focus on personalization has led to a plethora of issues, from bias and lack of interpretability to filter bubbles and echo chambers. My work at Adobe dealt with a zero-shot prediction problem: building a production-ready, graph attention network-based system and a novel hashtag matching algorithm that, in combination, effectively matched trending hashtags with relevant videos to improve content discovery across all Adobe products. Part of our motivation was to develop a tool to address the content discovery problem exacerbated by poorly designed recommendations. Furthermore, I am using open-source recommendation systems in SimPPL: A Social Network Simulator with Probabilistic Programs to simulate content spread and shilling attacks (bad actors using fake reviews to boost virality) and to stimulate research on detection and control.

ML x Particle Physics, Graph Neural Nets

I’ve also applied core machine learning techniques to advance the natural sciences. I used to work on graph-based approaches to particle track reconstruction (similar to the TrackML Challenge on Kaggle), specifically representing 3D point cloud data as a (lower-dimensional) graph and then training a graph neural network on it, possibly conditioned on additional physical information (metadata). Problems in high-energy physics, and in science in general, prove to be a rich testbed for statistical machine learning and Bayesian inference. It is exciting to see a growing focus on making this area more practical, especially as optimization toolkits and features are released within popular frameworks.
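As a rough illustration of the point-cloud-to-graph step, the sketch below connects detector hits on adjacent layers when they are close in azimuthal angle and in z. This is a common heuristic and not necessarily the construction I used; the hit format `(x, y, z, layer)`, the function name, and both thresholds are assumptions made for the example. A graph neural network would then be trained to classify which candidate edges belong to real tracks.

```python
import math
from itertools import product


def build_hit_graph(hits, max_dphi=0.1, max_dz=50.0):
    """Build candidate track edges from 3D hits.

    hits: list of (x, y, z, layer) tuples, where layer is an integer
    detector-layer index. Returns a list of (i, j) index pairs linking a
    hit on layer L to a hit on layer L + 1.
    """
    # Group hit indices by detector layer.
    by_layer = {}
    for index, (_, _, _, layer) in enumerate(hits):
        by_layer.setdefault(layer, []).append(index)

    edges = []
    for layer in sorted(by_layer):
        inner = by_layer[layer]
        outer = by_layer.get(layer + 1, [])
        for i, j in product(inner, outer):
            xi, yi, zi, _ = hits[i]
            xj, yj, zj, _ = hits[j]
            # Azimuthal angle difference, wrapped into [-pi, pi).
            dphi = math.atan2(yj, xj) - math.atan2(yi, xi)
            dphi = (dphi + math.pi) % (2 * math.pi) - math.pi
            if abs(dphi) <= max_dphi and abs(zj - zi) <= max_dz:
                edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Two toy straight tracks leaving the origin in perpendicular directions.
    toy_hits = [
        (1.0, 0.0, 0.0, 0), (2.0, 0.0, 1.0, 1), (3.0, 0.0, 2.0, 2),
        (0.0, 1.0, 0.0, 0), (0.0, 2.0, -1.0, 1), (0.0, 3.0, -2.0, 2),
    ]
    print(build_hit_graph(toy_hits))
```

Restricting edges to adjacent layers keeps the graph far sparser than a fully connected point cloud, which is what makes the representation lower-dimensional in practice.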

Natural Language Processing, 3D Modeling, Cybersecurity

As part of my Bachelor’s thesis, my teammates and I designed a framework to prototype chatbots with context-based question-answering models based on Jack the Reader (ACL, 2018).

I have also built an automated assessment tool for Blender (3D modeling) assignments, and a dynamic, automated workflow for an award-winning cybersecurity tool, IllusionBlack.

Knowledge Transfer: DJ Unicode

I am passionate about knowledge transfer and actively work with a student-run organisation that I co-founded. Unicode was born of the need for skill development at the grassroots level, as well as the need for a rapport between college freshmen, sophomores, and juniors at universities that don’t offer such opportunities through the course structure. Our aim is to extend the ‘summer-of-code’ workflow to the rest of the year, helping our students build a strong foundational understanding of software development. I’m leading the expansion of our mentorship into teaching math and statistics for machine learning through comprehensive reading groups on standard texts in the subject.

Unicode started in 2017 with 15-20 students separated into 5 teams based on their projects. Today, we are a thriving community of 200+ members, with teams winning hackathons, students receiving international internship offers, multiple selections for Google Summer of Code each year, and alumni at Ivy League universities and FAANG companies in the USA!

Unicode Research

I founded a research arm within Unicode, focused on doing collaborative research in statistics and machine learning, with particular emphasis on AI for social good. This includes extensions of projects by students in our ML Summer Course, and ideas by Unicode students and collaborators. I was joined by Dr. Akash Srivastava from the MIT-IBM AI Lab to help teach the students about deep generative models in a palatable fashion, introducing them to probabilistic machine learning.

Ongoing research projects at Unicode Research include estimating the causal effect of mentorship on student career outcomes, social network analysis using probabilistic machine learning, and other topics in deep generative modeling.

Personal

  • Look up my tech articles published in the Open Source for You (OSFY) Magazine

  • I (like to think that I) am an artist.

  • Apart from cooking and biking, I enjoy participating in hackathons, where you are likely to find me scrounging for food around midnight. I’m partial to a steaming cup of sweet, milky tea (also termed ‘cutting chai’ by the Indian streetside tea stalls).

  • I like to run the occasional marathon.

Sites

Feel free to browse through some of my older posts on Blogger and the (now defunct) CCDev Blog.