The escalating demand for recombinant proteins has spurred the exploration of data science, along with engineering strategies, for selecting and optimising host cell lines. This encompasses comprehensive verification and sequence analysis of the target gene or protein, along with processes such as codon optimisation, vector construction, and clone/host selection, each requiring meticulous consideration of numerous variables. Cambridge Healthtech Institute’s Using Data Science to Maximise Protein Production explores high-throughput expression systems, elucidates data organisation methodologies, outlines data-driven design strategies, and examines ways to reduce the number of experiments, saving time and costs. Learn from seasoned, savvy protein and data scientists who are fostering wider adoption of deep learning models for cell line engineering, protein expression, and production.