Research
Preprints
George Henderson*, Adam Gudys*, Tavor Z. Baharav, Punit Sundaramurthy, Marek Kokot, Peter L. Wang, Sebastian Deorowicz, Allison Carey, Julia Salzman. “Ultra-efficient, unified discovery from microbial sequencing with SPLASH and precise statistical assembly”, 2024.
Roozbeh Dehghannasiri*, George Henderson*, Rob Bierman, Kaitlin Chaung, Tavor Z. Baharav, Peter Wang, Julia Salzman. “Unsupervised reference-free inference reveals unrecognized regulated transcriptomic complexity in human single cells”, 2022.
Journal Papers
Marek Kokot*, Roozbeh Dehghannasiri*, Tavor Z. Baharav, Julia Salzman, Sebastian Deorowicz. “Scalable and unsupervised discovery from raw sequencing reads using SPLASH2”. Nature Biotechnology, 2024. [code]
Tavor Z. Baharav, David Tse, Julia Salzman. “OASIS: An interpretable, finite sample valid alternative to Pearson's X² for scientific discovery”. Proceedings of the National Academy of Sciences (PNAS) direct submission, 2024. [pypi package]
Kaitlin Chaung*, Tavor Z. Baharav*, George Henderson, Ivan Zheludev, Peter L. Wang, Julia Salzman. “SPLASH: A statistical, reference-free genomic algorithm unifies biological discovery”. Cell, 2023. [initial code], [updated SPLASH2 code]
Tavor Z. Baharav, Tze Leung Lai, “Adaptive Data Depth via Multi-Armed Bandits”. Journal of Machine Learning Research (JMLR), 2023. [code]
Vivek Bagaria*, Tavor Z. Baharav*, Govinda M. Kamath*, David N. Tse, “Bandit-Based Monte Carlo Optimization for Nearest Neighbors”. IEEE Journal on Selected Areas in Information Theory, 2021.
Ilai Bistritz, Tavor Z. Baharav, Amir Leshem, Nicholas Bambos, “One for All and All for One: Distributed Learning of Fair Allocations with Multi-player Bandits”. IEEE Journal on Selected Areas in Information Theory, 2021.
Tavor Z. Baharav*, Govinda M. Kamath*, David N. Tse, Ilan Shomorony, “Spectral Jaccard Similarity: A new approach to estimating pairwise sequence alignments”. Cell Patterns, 2020. [code]
Conference Papers
Tavor Z. Baharav*, Ryan Kang*, Colin Sullivan*, Mo Tiwari, Eric S. Luxenberg, David Tse, Mert Pilanci. “Adaptive Sampling for Efficient Softmax Approximation”. Advances in Neural Information Processing Systems (NeurIPS), 2024. [code]
Yifei Wang, Tavor Z. Baharav, Yanjun Han, Jiantao Jiao, David Tse. “Beyond the Best: Estimating Distribution Functionals in Infinite-Armed Bandits”. Advances in Neural Information Processing Systems (NeurIPS), 2022.
Tavor Z. Baharav, Gary Cheng, Mert Pilanci, David Tse. “Approximate Function Evaluation via Multi-Armed Bandits”. International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.
Tavor Z. Baharav, Daniel L. Jiang, Kedarnath Kolluri, Sujay Sanghavi, and Inderjit S. Dhillon. “Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme Classification”. ACM International Conference on Information and Knowledge Management (CIKM), 2021; oral presentation.
Govinda M. Kamath*, Tavor Z. Baharav*, Ilan Shomorony, “Adaptive Learning of Rank-One Models for Efficient Pairwise Sequence Alignment”. Advances in Neural Information Processing Systems (NeurIPS), 2020. [code]
Ilai Bistritz, Tavor Z. Baharav, Amir Leshem, Nicholas Bambos, “My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits”. International Conference on Machine Learning (ICML), 2020.
Tavor Z. Baharav*, Govinda M. Kamath*, David N. Tse, Ilan Shomorony, “Spectral Jaccard Similarity: A new approach to estimating pairwise sequence alignments”. International Conference on Research in Computational Molecular Biology (RECOMB), 2020. [code]
Tavor Z. Baharav, David N. Tse, “Ultra Fast Medoid Identification via Correlated Sequential Halving”. Advances in Neural Information Processing Systems (NeurIPS), 2019. [Poster] [code].
Tavor Baharav, Kangwook Lee, Orhan Ocal, Kannan Ramchandran, “Straggler-proofing massive-scale distributed matrix multiplication with d-dimensional product codes”. International Symposium on Information Theory (ISIT), 2018.
Tavor Baharav, Mohini Bariya, Avideh Zakhor, “In Situ Height and Width Estimation of Sorghum Plants from 2.5d Infrared Images”. Electronic Imaging (EI), 2017.
Miscellaneous
Software
OASIS_stat: Python package installable via pip. Created 2024, based on OASIS: pypi, documentation, GitHub.
SPLASH: optimized package for reference-free genomic analysis. Created 2023: GitHub.
Presentations and talks
“A statistical reference-free genomic algorithm subsumes common workflows and enables novel discovery.” Cold Spring Harbor Laboratory: Biological Data Science meeting 2022. Platform presentation.
“Bandit-based Monte Carlo Optimization.” Cornell ORIE Young Researcher's workshop 2021. Poster.
“Bandit-based Monte Carlo Optimization for Nearest Neighbors.” Baylearn 2020 (Symposium). Poster.
“Adaptive Monte Carlo Optimization: Ultra Fast Medoid Identification via Correlated Sequential Halving.” Baylearn 2019 (Symposium). Poster.
“DAMN fast: DNA Alignment using Multi-armed baNdits: Spectral Jaccard Similarity for long-read alignment.” Intelligent Systems for Molecular Biology (ISMB/ECCB 2019). Poster.
“Ultra Fast Medoid Identification via Correlated Sequential Halving.” 2019 North American School of Information Theory (School). Poster.
Teaching
Information Theory (Stanford, EE276/Stats376a): Winter 2021-2022
Probability & Random Processes (Berkeley EECS 126): TA in Spring 2017, Head TA in Spring 2018
Discrete Math and Probability, Efficient Algorithms and Intractable Problems (Berkeley CS70 and CS170): reader in Fall 2015 and Spring 2016, respectively