Viren Jain

Authored Publications
Sexual dimorphism in the complete connectome of the Drosophila male central nervous system
Stuart Berg
Isabella R Beckett
Marta Costa
Philipp Schlegel
Elizabeth C Marin
Aljoscha Nern
Stephan Preibisch
Wei Qiu
Shin-ya Takemura
Andrew Champion
Reed A. George
Gary Huang
William Katz
Christopher Ordish
Ken Hayworth
Eric Trautman
Vivek Jayaraman
Wyatt Korff
Geoffrey W Meissner
Sandro Romani
Jan Funke
Christopher Knecht
Stephan Saalfeld
Louis Scheffer
Scott Waddell
Gwyneth Card
Carlos Ribeiro
Michael B. Reiser
Harald Hess
Gerry Rubin
Gregory S.X.E. Jefferis
bioRxiv (2026)
Sex differences in behaviour exist across all animals, typically under strong genetic regulation. In Drosophila, fruitless/doublesex transcription factors can identify dimorphic neurons but their organisation into functional circuits remains unclear. We present the connectome of the entire Drosophila male central nervous system. This contains 166,691 neurons spanning the brain and nerve cord, fully proofread and annotated including fruitless/doublesex expression and 11,691 types. We provide the first comprehensive comparison between male and female brain connectomes to synaptic resolution, finding 7,205 isomorphic, 114 dimorphic, 262 male-specific and 69 female-specific types. This resource enables analysis of full sensory-to-motor circuits underlying complex behaviours and the impact of dimorphic elements. Sex-specific/dimorphic neurons are concentrated in higher brain centres while the sensory and motor periphery are largely isomorphic. Within higher centres, male-specific connections are organised into hotspots defined by male-specific neurons or arbours. Numerous circuit switches reroute sensory information to form antagonistic circuits controlling opposing behaviours. (Full author list included with the paper.)
Biological neurons come in many shapes. High-fidelity generative modeling of their varied morphologies is challenging yet underexplored in neuroscience, and crucial for the subfield of connectomics. We introduce MoGen (Neuronal Morphology Generation), a flow matching model to generate high-resolution 3D point clouds of mouse cortex axon and dendrite fragments. This is enabled by an adaptation that injects local geometric context into a scalable latent transformer backbone, allowing for the generation of high-fidelity, realistic samples. To assess MoGen's generation quality, we propose a dedicated evaluation suite with interpretable geometric and topological features tailored to neuronal structures that we validate in a user study. MoGen's practical utility is showcased through controllable generation for visualization via smooth interpolation and a direct downstream application: we augment the training set of a shape plausibility classifier from a production connectomics neuron reconstruction pipeline with millions of generated samples, thereby improving classifier accuracy and reducing the number of remaining split and merge errors by 4.4%. We estimate this can reduce manual proofreading labor by over 157 person-years for reconstruction of a full mouse brain.
CURIE: Evaluating LLMs on multitask long context scientific understanding and reasoning
Hao Cui
Zahra Shamsi
Gowoon Cheon
Xuejian Ma
Shutong Li
Maria Tikhanovskaya
Nayantara Mudur
Paul Raccuglia
Victor V. Albert
Pranesh Srinivasan
Haining Pan
Philippe Faist
Brian Rohr
Ekin Dogus Cubuk
Muratahan Aykol
Amil Merchant
Michael Statt
Drew Purves
Elise Kleeman
Ruth Alcantara
Matthew Abraham
Muqthar Mohammad
Ean Phing VanLee
Chenfei Jiang
Lizzie Dorfman
Eun-Ah Kim
International Conference on Learning Representations (ICLR) (2025)
The core of the scientific problem-solving process involves synthesizing information while applying expert knowledge. Large Language Models (LLMs) have the potential to accelerate this process due to their extensive knowledge across a variety of domains. Recent advancements have also made it possible for LLMs to handle very long "in-context" content. However, existing evaluations of long-context LLMs have focused on assessing their ability to summarize or retrieve information within the given context, primarily in generalist tasks that do not require deep scientific expertise. To facilitate analogous assessments of domain-specific tasks, we introduce the scientific long-Context Understanding and Reasoning Inference Evaluations (CURIE) benchmark. This benchmark provides a set of 8 challenging tasks, derived from around 250 scientific research papers, requiring domain expertise, comprehension of long in-context information, and multi-step reasoning that tests the ability of LLMs to assist scientists in realistic workflows. Tasks in CURIE have been collected from experts in six disciplines - materials science, theoretical condensed matter physics, quantum computing, geospatial analysis, biodiversity, and protein sequencing - covering both experimental and theoretical workflows in science. We evaluate a range of closed and open LLMs on these tasks. Additionally, we propose strategies for task decomposition, which allow for a more nuanced evaluation of the models and facilitate staged multi-step assessments. We hope that insights gained from CURIE can guide the future development of LLMs.
ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish
Alexander Immer
Alex Bo-Yuan Chen
Mariela D. Petkova
Nirmala A. Iyer
Luuk Willem Hesselink
Aparna Dev
Gudrun Ihrke
Woohyun Park
Alyson Petruncio
Aubrey Weigel
Wyatt Korff
Florian Engert
Jeff W. Lichtman
Misha B. Ahrens
International Conference on Learning Representations (ICLR) (2025)
Data-driven benchmarks have led to significant progress in key scientific modeling domains including weather and structural biology. Here, we present the Zebrafish Activity Prediction Benchmark (ZAPBench), which quantitatively measures progress on the problem of predicting cellular-resolution neural activity throughout an entire vertebrate brain. The benchmark is based on a novel dataset containing 4d light-sheet microscopy recordings of more than 70,000 neurons in a larval zebrafish brain, along with motion stabilized and voxel-level cell segmentations of these data that facilitate development of a variety of forecasting methods. Initial results from a selection of time series and volumetric video modeling approaches achieve better performance than naive baseline methods, but also show room for further improvement. The specific brain used in the activity recording is also undergoing synaptic-level anatomical mapping, which will enable future integration of detailed structural information into ZAP forecasting methods.
Light-microscopy-based dense connectomic reconstruction of mammalian brain tissue
Mojtaba R. Tavakoli
Julia Lyudchik
Vitali Vistunou
Nathalie Agudelo Duenas
Jakob Vorlaufer
Christoph Sommer
Caroline Kreuzinger
Barbara de Souza Oliveira
Alban Cenameri
Gaia Novarino
Johann Danzl
Nature (2025)
The information-processing capability of the brain’s cellular network depends on the physical wiring pattern between neurons and their molecular and functional characteristics. Charting neurons and resolving the individual synaptic connections requires volumetric imaging at nanoscale resolution and comprehensive cellular contrast. Light microscopy is uniquely positioned to visualize specific molecules but dense, synapse-level circuit reconstruction by light microscopy has been out of reach due to limitations in resolution, contrast, and volumetric imaging capability. Here we developed light-microscopy based connectomics (LICONN). We integrated hydrogel embedding and expansion with comprehensive deep-learning based segmentation and analysis of connectivity, thus directly incorporating molecular information in synapse-level brain tissue reconstructions. LICONN will allow synapse-level brain tissue phenotyping in biological experiments in a readily adoptable manner.
A petavoxel fragment of human cerebral cortex reconstructed at nanoscale resolution
Alex Shapson-Coe
Daniel R. Berger
Yuelong Wu
Richard L. Schalek
Shuohong Wang
Neha Karlupia
Sven Dorkenwald
Evelina Sjostedt
Dongil Lee
Luke Bailey
Angerica Fitzmaurice
Rohin Kar
Benjamin Field
Hank Wu
Julian Wagner-Carena
David Aley
Joanna Lau
Zudi Lin
Donglai Wei
Hanspeter Pfister
Adi Peleg
Jeff W. Lichtman
Science (2024)
To fully understand how the human brain works, knowledge of its structure at high resolution is needed. Presented here is a computationally intensive reconstruction of the ultrastructure of a cubic millimeter of human temporal cortex that was surgically removed to gain access to an underlying epileptic focus. It contains about 57,000 cells, about 230 millimeters of blood vessels, and about 150 million synapses and comprises 1.4 petabytes. Our analysis showed that glia outnumber neurons 2:1, oligodendrocytes were the most common cell, deep layer excitatory neurons could be classified on the basis of dendritic orientation, and among thousands of weak connections to each neuron, there exist rare powerful axonal inputs of up to 50 synapses. Further studies using this resource may bring valuable insights into the mysteries of the human brain.
Multi-Layered Maps of Neuropil with Segmentation Guided Contrastive Learning
Sven Dorkenwald
Daniel R. Berger
Agnes L. Bodor
Forrest Collman
Casey M. Schneider-Mizell
Nuno Maçarico da Costa
Jeff W. Lichtman
Nature Methods (2023)
Maps of the nervous system that identify individual cells along with their type, subcellular components and connectivity have the potential to elucidate fundamental organizational principles of neural circuits. Nanometer-resolution imaging of brain tissue provides the necessary raw data, but inferring cellular and subcellular annotation layers is challenging. We present segmentation-guided contrastive learning of representations (SegCLR), a self-supervised machine learning technique that produces representations of cells directly from 3D imagery and segmentations. When applied to volumes of human and mouse cortex, SegCLR enables accurate classification of cellular subcompartments and achieves performance equivalent to a supervised approach while requiring 400-fold fewer labeled examples. SegCLR also enables inference of cell types from fragments as small as 10 μm, which enhances the utility of volumes in which many neurites are truncated at boundaries. Finally, SegCLR enables exploration of layer 5 pyramidal cell subtypes and automated large-scale analysis of synaptic partners in mouse visual cortex.
Early machine-learning systems were inspired by neural networks — now AI might allow neuroscientists to get to grips with the brain’s unique complexities.
SyConn2: dense synaptic connectivity inference for volume electron microscopy
Philipp J. Schubert
Sven Dorkenwald
Jonathan Klimesch
Fabian Svara
Andrei Mancu
Hashir Ahmad
Michale S. Fee
Joergen Kornfeld
Nature Methods, 19 (2022), 1367–1370
The ability to acquire ever larger datasets of brain tissue using volume electron microscopy leads to an increasing demand for the automated extraction of connectomic information. We introduce SyConn2, an open-source connectome analysis toolkit, which works with both on-site high-performance compute environments and rentable cloud computing clusters. SyConn2 was tested on connectomic datasets with more than 10 million synapses, provides a web-based visualization interface and makes these data amenable to complex anatomical and neuronal connectivity queries.