Shreyas Rajesh

PhD student, UCLA

About

I’m a PhD student at UCLA advised by Prof. Vwani Roychowdhury. I’m broadly interested in language modeling, memory systems for LLMs, and continual learning.

My current focus is on building memory systems for LLMs. We introduced Generative Semantic Workspaces (GSW) [5] — semantic representations that mirror how humans observe and track actors, roles, and states across evolving situations. Building on this, we showed that GSW can serve as an episodic memory for LLMs [2], outperforming standard RAG on long-document reasoning while using far fewer tokens. Most recently, we developed Panini [1], a non-parametric continual learning framework that uses GSW as its memory store; it achieves state-of-the-art performance on multi-hop question answering benchmarks while being significantly more token-efficient than competing approaches. I’m also interested in applying advances in language modeling to problems in the sciences [3] and in health [4].

I’ve also spent consecutive summers (and a bit more!) at NVIDIA building on-device agents for gamers. There, I worked across the entire stack: curating proprietary tool-use and reasoning data, parameter-efficient fine-tuning of small language models, retrieval mechanisms for system-information QA, and a host of optimizations to keep everything running in a tiny form factor!

Research interests

  • Language modeling and continual learning
  • Memory systems for LLMs
  • Information retrieval
  • Small language models and on-device agents
  • Applications of NLP in scientific and health domains

Papers

[1] Panini: Continual Learning in Token Space via Structured Memory

Under review at ICML 2026

Shreyas Rajesh, Pavan Holur, Mehmet Yigit Turali, Chenda Duan, Vwani Roychowdhury

We propose a continual learning framework in which a fixed base model learns through an external semantic memory, representing documents as Generative Semantic Workspaces — networks of question-answer pairs that enable reasoning-grounded inference with superior performance at reduced token usage.

[2] Beyond Fact Retrieval: Episodic Memory for RAG with Generative Semantic Workspaces

AAAI 2026 Oral
NeurIPS 2025 — Language, Agents and World Models Spotlight

Shreyas Rajesh, Pavan Holur, Chenda Duan, David Chong, Vwani Roychowdhury

We introduce GSW, which equips LLMs with human-like episodic memory through structured semantic representations that track actors, roles, and states across space and time, outperforming standard RAG frameworks on episodic memory tasks.

[3] Embed-Search-Align: DNA sequence alignment using Transformer models

Bioinformatics 2025

Pavan Holur*, Kenneth C Enevoldsen*, Shreyas Rajesh, Lajoyce Mboning, Thalia Georgiou, Louis-S Bouchard, Matteo Pellegrini, Vwani Roychowdhury

We introduce DNARDE, a DNA language model trained with a contrastive objective for sequence alignment, pairing it with vector-store retrieval over nucleotide sequences. DNARDE outperforms prior transformer baselines on alignment and transfers across chromosomes and species.

[4] Customizing Open Source LLMs for Quantitative Medication Attribute Extraction across Heterogeneous EHR Systems

NeurIPS 2025 — GenAI for Health Workshop

Zhe Fei*, Mehmet Yigit Turali*, Shreyas Rajesh*, Xinyang Dai, Huyen Pham, Pavan Holur, Yuhui Zhu, Larissa Mooney, Yih-Ing Hser, Vwani Roychowdhury

We develop a framework using open-source LLMs to standardize opioid use disorder prescriptions from diverse EHRs, enabling consistent analysis across different EHR sources and formats.

[5] Creating an AI Observer: Generative Semantic Workspaces

arXiv

Pavan Holur, Shreyas Rajesh, David Chong, Vwani Roychowdhury

Initial explorations of what an evolving memory system could look like, framed from the perspective of how an ideal AI observer would watch and build a semantic map of an ongoing situation.

* denotes equal contribution.

Reviewing

NeurIPS LAW '25, AAAI '26, ICML '26