In the wake of surprisingly rapid progress in large language models (LLMs) like GPT-4, some experts have predicted that AI systems will be able to outperform human professionals at virtually all tasks within decades. Other experts are skeptical — they argue that LLMs’ capabilities have been overstated, and expect the technology to make a modest impact before running up against fundamental limitations.
To help build scientific understanding in this area, Open Philanthropy is looking to fund projects that will help us understand the capabilities and impacts of systems built from LLMs.
We are doing this through two separate requests for proposals (RFPs) — one on benchmarking LLM agents, and the other on studying and forecasting the impacts of LLM systems.
Anyone is eligible to apply, including those working in academia, nonprofits, or independently; we are also open to making restricted grants to projects housed within for-profit companies. We will evaluate applications on a rolling basis. See below for more details.
Benchmarking LLM agents
Through this RFP, we aim to fund benchmarks that measure how close LLM agents can get to performing consequential real-world tasks.
LLM agents are very new, and their impact has been limited so far, but well-functioning agents could have much more wide-ranging applications than LLM chatbots like GPT-4 or Claude. By the same token, they could pose more extensive risks than chatbots — executing plans, rather than merely creating them.
We hope to understand these potential outcomes by funding benchmarks that will reliably indicate whether and when LLM agents will be able to impact the world on a very large scale — for example, by replacing or outperforming humans in professions that account for a large share of the labor market.
See this page for the application link and more details on the RFP.
We will also be hosting a 90-minute webinar to answer questions about this RFP on Wednesday, November 29 at 12 PM Pacific / 3 PM Eastern (link to come).
Studying and forecasting the real-world impacts of LLM systems
Through this RFP, we aim to fund a broad array of research projects (aside from benchmarks for LLM agents) that might shed light on what real-world impacts LLM systems could have over the next few years.
Examples of ideas that could make for a strong proposal:
- Conducting randomized controlled trials to measure the extent to which access to LLM products can increase human productivity on real-world tasks.
- Polling members of the public about whether and how much they use LLM products, what tasks they use them for, and how useful they find them to be.
- Eliciting expert forecasts about what LLM systems are likely to be able to do in the near future and what risks they might pose.
See this page for the application link and more details on the RFP, including many additional examples of proposals that might interest us.