Reading Between the Lines: Contract Terms, Text Features, and R&D Outcomes in Biopharmaceuticals
Carolina Biliotti, Filippo Chiarello, Giacomo Marzi, and Massimo Riccaboni. Draft available soon.
Abstract: We examine the relationship between licensing contracts and the success of biopharmaceutical trials. Using a unique dataset of licensing agreements, we combine structured deal characteristics (including company experience, products under development, technologies, and targeted indications) with textual representations of contract language, such as large language model embeddings, topic models, and TF-IDF features. Predictive models indicate that structured deal characteristics are the primary predictors of success in clinical development, while textual features offer complementary information by capturing latent project and partnership attributes. To examine associations between contractual payment structures and outcomes, we apply a time-aware Double Machine Learning procedure with high-dimensional controls and textual covariates. Payments not conditional on development, such as upfront payments, are associated with higher success, whereas conditional payments show more variable relationships, reflecting their contingent nature. In contrast, product-specific deal value is negatively associated with success after adjustment (approximately -0.02 per one-standard-deviation increase), suggesting that larger financial commitments often correspond to inherently riskier projects rather than improved outcomes, consistent with selection effects in licensing markets. Our results suggest that contract terms primarily reflect underlying project quality and risk, rather than directly driving development success. By combining predictive modeling, textual analysis, and de-biased inference, this study advances the understanding of licensing outcomes.
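Illustrative sketch (not the authors' implementation): the abstract does not specify how the time-aware Double Machine Learning procedure is built, so the code below shows one plausible reading under stated assumptions. It uses a partially linear model with a continuous outcome, ridge regressions as nuisance learners, and forward-chaining folds in which nuisance predictions for each time block are trained only on earlier blocks. The function names (`ridge_fit_predict`, `dml_partial_out_time_aware`), the synthetic data, and the simulated effect of 0.5 are all hypothetical and unrelated to the paper's data or estimates.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit a ridge regression (with intercept) and predict on X_test."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    A = Xc.T @ Xc + alpha * np.eye(X_train.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y_train - mu_y))
    return mu_y + (X_test - mu_x) @ beta


def dml_partial_out_time_aware(y, d, X, n_blocks=5, alpha=1.0):
    """Partialling-out DML for y = theta * d + g(X) + e.

    Assumption: rows are sorted by time. Nuisance predictions for block k
    are fitted on blocks 0..k-1 only, so no future deal informs them.
    The first block is used for training only.
    """
    blocks = np.array_split(np.arange(len(y)), n_blocks)
    res_y, res_d = [], []
    for k in range(1, n_blocks):
        train = np.concatenate(blocks[:k])
        test = blocks[k]
        # Residualize the outcome and the treatment on the controls.
        res_y.append(y[test] - ridge_fit_predict(X[train], y[train], X[test], alpha))
        res_d.append(d[test] - ridge_fit_predict(X[train], d[train], X[test], alpha))
    ry = np.concatenate(res_y)
    rd = np.concatenate(res_d)
    # Regress outcome residuals on treatment residuals.
    theta = (rd @ ry) / (rd @ rd)
    # Heteroskedasticity-robust standard error from the orthogonal score.
    psi = rd * (ry - theta * rd)
    se = np.sqrt(np.sum(psi ** 2)) / (rd @ rd)
    return theta, se


# Synthetic example: hypothetical data with a known effect of 0.5.
rng = np.random.default_rng(0)
n, p = 2000, 20
X = rng.normal(size=(n, p))              # stand-in for high-dimensional controls
gamma = rng.normal(size=p) / np.sqrt(p)  # confounding of the treatment
beta = rng.normal(size=p) / np.sqrt(p)   # confounding of the outcome
d = X @ gamma + rng.normal(size=n)       # stand-in for a standardized payment term
true_theta = 0.5
y = true_theta * d + X @ beta + rng.normal(size=n)

theta_hat, se_hat = dml_partial_out_time_aware(y, d, X)
print(f"theta_hat = {theta_hat:.3f}, se = {se_hat:.3f}")
```

In this sketch the coefficient on the residualized treatment plays the role of the adjusted association reported per one-standard-deviation increase. A real application would likely use more flexible nuisance learners and a binary success outcome, neither of which is modeled here.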