
Jacob West-Roberts
Articles
-
Nov 21, 2024 | biorxiv.org | Nishant Jha | Joshua Kravitz | Jacob West-Roberts | Antonio Pedro Camargo
Abstract: Protein sequence similarity search is fundamental to genomics research, but current methods are typically not able to consider crucial genomic context information that can be indicative of protein function, especially in microbial systems. Here we present Gaia (Genomic AI Annotator), a sequence annotation platform that enables rapid, context-aware protein sequence search across genomic datasets.
-
Oct 1, 2024 | biorxiv.org | Andre L Cornman | Jacob West-Roberts | Antonio Pedro Camargo | Simon Roux
Abstract: Biological language model performance depends heavily on pretraining data quality, diversity, and size. While metagenomic datasets feature enormous biological diversity, their utilization as pretraining data has been limited due to challenges in data accessibility, quality filtering and deduplication.
-
Aug 17, 2024 | biorxiv.org | Andre L Cornman | Jacob West-Roberts | Antonio Pedro Camargo | Simon Roux
Abstract: Biological language model performance depends heavily on pretraining data quality, diversity, and size. While metagenomic datasets feature enormous biological diversity, their utilization as pretraining data has been limited due to challenges in data accessibility, quality filtering and deduplication.
-
Jul 16, 2024 | biorxiv.org | Jacob West-Roberts | Nishant Jha | Andre L Cornman | Joshua Kravitz
Abstract: Biological foundation models hold significant promise for deciphering complex biological functions. However, evaluating their performance on functional tasks remains challenging due to the lack of standardized benchmarks encompassing diverse sequences and functions. Existing functional annotations are often scarce, biased, and susceptible to train-test leakage, hindering robust evaluation.