I’m a researcher working at the intersection of machine learning and
software engineering. I currently work as a staff software engineer in
Google’s DevAI team, where we build machine learning systems to make
Google developers more productive.
I received a PhD in Computer Science (CS) from MIT under the
supervision of Martin Rinard, an MS in CS from NYU working with Dennis
Shasha, and a BA in Economics from the University of Pennsylvania. I’m
originally from Costa Rica.
[1] Shraddha Barke, Christian Poelitz, Carina Suzana Negreanu, Benjamin
Zorn, José Cambronero, Andrew D Gordon, Vu Le, Elnaz Nouri, Nadia
Polikarpova, Advait Sarkar, et al. 2024. Solving data-centric tasks
using large language models. NAACL 2024. (2024).
[2] Yuzhang Tian, Jianbo Zhao, Haoyu Dong, Junyu Xiong, Shiyu Xia,
Mengyu Zhou, Yun Lin, José Cambronero, Yeye He, Shi Han, et al. 2024.
SpreadsheetLLM: Encoding spreadsheets for large language models.
arXiv preprint arXiv:2407.09025 (to appear at EMNLP 2024). (2024).
[3] Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu,
Gust Verbruggen 2023. CodeFusion: A pre-trained diffusion model for code
generation. Proceedings of the 2023 conference on empirical methods in
natural language processing (Singapore, Dec. 2023), 11697–11708.
[4] Mukul Singh, José Cambronero Sánchez, Sumit Gulwani, Vu Le, Carina
Negreanu, Mohammad Raza, Gust Verbruggen 2023. Cornet: Learning table
formatting rules by example. Proc. VLDB Endow. 16, 10 (Jun. 2023),
2632–2644. DOI:https://doi.org/10.14778/3603581.3603600.
[5] Andrew D Gordon, Carina Negreanu, José Cambronero, Rasika
Chakravarthy, Ian Drosos, Hao Fang, Bhaskar Mitra, Hannah Richardson,
Advait Sarkar, Stephanie Simmons, et al. 2023. Co-audit: Tools to help
humans double-check AI-generated content. arXiv preprint
arXiv:2310.01297 (PLATEAU 2024). (2023).
[6] Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Gust Verbruggen
2023. EmFore: Online learning of email folder classification rules.
Proceedings of the 32nd ACM international conference on information and
knowledge management (New York, NY, USA, 2023), 2280–2290.
[7] Harshit Joshi, Abishai Ebenezer, José Cambronero, Sumit Gulwani,
Aditya Kanade, Vu Le, Ivan Radiček, Gust Verbruggen 2023. FLAME: A small
language model for spreadsheet formulas. arXiv preprint
arXiv:2301.13779 (AAAI 2024). (2023).
[8] José Cambronero, Sumit Gulwani, Vu Le, Daniel Perelman, Arjun
Radhakrishna, Clint Simon, Ashish Tiwari 2023. FlashFill++: Scaling
programming by example by cutting to the chase. Proceedings of the ACM
on Programming Languages. 7, POPL (2023), 952–981.
[9] Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu,
Elnaz Nouri, Mohammad Raza, Gust Verbruggen 2023. FormaT5: Abstention
and examples for conditional table formatting with natural language.
VLDB 2024. (2023).
[10] Tung Phung, José Cambronero, Sumit Gulwani, Tobias Kohn, Rupak
Majumdar, Adish Singla, Gustavo Soares 2023. Generating high-precision
feedback for programming syntax errors using large language models.
EDM 2023. (2023).
[11] Tung Phung, Victor-Alexandru Pădurean, José Cambronero, Sumit
Gulwani, Tobias Kohn, Rupak Majumdar, Adish Singla, Gustavo Soares 2023.
Generative AI for programming education: Benchmarking ChatGPT, GPT-4,
and human tutors. Proceedings of the 2023 ACM conference on
international computing education research - volume 2 (New York, NY,
USA, 2023), 41–42.
[12] Harshit Joshi, José Cambronero Sanchez, Sumit Gulwani, Vu Le, Ivan
Radiček, Gust Verbruggen 2023. Repair is nearly generation: Multilingual
program repair with LLMs. Proceedings of the thirty-seventh AAAI
conference on artificial intelligence and thirty-fifth conference on
innovative applications of artificial intelligence and thirteenth
symposium on educational advances in artificial intelligence (2023).
[13] Ananya Singha, José Cambronero, Sumit Gulwani, Vu Le, Chris Parnin
2023. Tabular representation, noisy operators, and impacts on table
structure understanding tasks in LLMs. arXiv preprint
arXiv:2310.10358 (Table Representation Learning at NeurIPS 2023).
(2023).
[14] Rohan Bavishi, Harshit Joshi, José Cambronero, Anna Fariha, Sumit
Gulwani, Vu Le, Ivan Radiček, Ashish Tiwari 2022. Neurosymbolic repair
for low-code formula languages. Proc. ACM Program. Lang. 6, OOPSLA2
(Oct. 2022). DOI:https://doi.org/10.1145/3563327.
[15] Bram Wasti, José Pablo Cambronero, Benoit Steiner, Hugh Leather,
Aleksandar Zlateski 2022. LoopStack: A lightweight tensor algebra
compiler stack. arXiv preprint arXiv:2205.00618. (2022).
[16] Jialu Zhang, José Cambronero, Sumit Gulwani, Vu Le, Ruzica Piskac,
Gustavo Soares, Gust Verbruggen 2022. Repairing bugs in Python
assignments using large language models. OOPSLA 2024. (2022).
[17] Limor Appelbaum, Alexandra Berg, Jose Pablo Cambronero, Thurston
Hou Yeen Dang, Charles Chuan Jin, Lori Zhang, Steven Kundrot, Matvey
Palchuk, Laura A Evans, Irving D Kaplan, et al. 2021. Development of a
pancreatic cancer prediction model using a multinational medical records
database. American Society of Clinical Oncology.
[18] Fatjon Zogaj, José Pablo Cambronero, Martin C Rinard, Jürgen Cito
2021. Doing more with less: Characterizing dataset downsampling for
AutoML. Proceedings of the VLDB Endowment. 14, 11 (2021), 2059–2072.
[19] Thurston HY Dang, Jose P Cambronero, Martin C Rinard 2021.
Inferring drop-in binary parsers from program executions. arXiv
preprint arXiv:2104.09669. (2021).
[20] Malavika Samak, Jose Pablo Cambronero, Martin C Rinard 2021.
Searching for replacement classes. arXiv preprint
arXiv:2110.05638. (2021).
[21] José P Cambronero, Jürgen Cito, Martin C Rinard 2020. AMS:
Generating AutoML search spaces from weak specifications. Proceedings of
the 28th ACM joint meeting on European software engineering conference
and symposium on the foundations of software engineering (2020),
763–774.
[22] Limor Appelbaum, Jose Pablo Cambronero, Karla Pollick, George
Silva, Jennifer P Stevens, Harvey J Mamon, Irving D Kaplan, Martin
Rinard 2020. Development and validation of a pancreatic cancer
prediction model from electronic health records using machine learning.
American Society of Clinical Oncology.
[23] Limor Appelbaum, José P Cambronero, Jennifer P Stevens, Steven
Horng, Karla Pollick, George Silva, Sebastien Haneuse, Gail Piatkowski,
Nordine Benhaga, Stacey Duey, et al. 2020. Development and validation of
a pancreatic cancer risk model for the general population using
electronic health records: An observational study. European Journal of
Cancer. 143, (2020), 19–30.
[24] José P Cambronero, Thurston HY Dang, Nikos Vasilakis, Jiasi Shen,
Jerry Wu, Martin C Rinard 2019. Active learning for software
engineering. Proceedings of the 2019 ACM SIGPLAN international symposium
on new ideas, new paradigms, and reflections on programming and software
(2019), 62–78.
[25] José P Cambronero, Martin C Rinard 2019. AL: Autogenerating
supervised learning programs. Proceedings of the ACM on Programming
Languages. 3, OOPSLA (2019), 1–28.
[26] José Pablo Cambronero, Jiasi Shen, Jürgen Cito, Elena Glassman,
Martin Rinard 2019. Characterizing developer use of automatically
generated patches. 2019 IEEE symposium on visual languages and
human-centric computing (VL/HCC) (2019), 181–185.
[27] Jose Cambronero, Hongyu Li, Seohyun Kim, Koushik Sen, Satish
Chandra 2019. When deep learning met code search. Proceedings of the
2019 27th ACM joint meeting on European software engineering conference
and symposium on the foundations of software engineering (2019),
964–974.
[28] Jose Cambronero, Phillip Stanley-Marbell, Martin Rinard 2018.
Incremental color quantization for color-vision-deficient observers
using mobile gaming data. arXiv preprint arXiv:1803.08420.
(2018).
[29] José Cambronero, John K Feser, Micah J Smith, Samuel Madden 2017.
Query optimization for dynamic imputation. Proceedings of the VLDB
Endowment. 10, 11 (2017), 1310–1321.