Research

Advancing Agricultural Intelligence through Scientific Rigor. Explore our peer-reviewed publications and technical breakthroughs redefining the frontiers of crop genomics and seed science.

SeedLLM·Rice: A large language model integrated with rice biological knowledge graph

Molecular Plant

Pub Date: 2025-07-07

DOI: 10.1016/j.molp.2025.05.013

Fan Yang, Huanjun Kong, Jie Ying, Zihong Chen, Tao Luo, Wanli Jiang, Zhonghang Yuan, Zhefan Wang, Zhaona Ma, Shikuan Wang, Wanfeng Ma, Xiaoyi Wang, Xiaoying Li, Zhengyin Hu, Xiaodong Ma, Minguo Liu, Xiqing Wang, Fan Chen, Nanqing Dong

Abstract

To address the growing complexity of rice biology research, driven by an explosion of literature and multi-omics data, we introduce SeedLLM·Rice, a 7-billion-parameter LLM trained on 1.4 million rice publications, covering approximately 98.24% of global research output. SeedLLM·Rice achieves 57–88% win rates on rice-specific tasks, outperforming general-purpose models such as GPT-4o and DeepSeek-R1. It is integrated with the Rice Biological Knowledge Graph (RBKG), which consolidates genome annotations and multi-omics data from over 1,800 studies. We also introduce a novel human-centric evaluation framework to assess domain-specific LLM performance. Free access is provided via a web portal. SeedLLM·Rice represents a transformative tool for crop improvement and climate adaptation research.


SeedBench: A Multi-task Benchmark for Evaluating Large Language Models in Seed Science

arXiv - CS - Computation and Language

Pub Date: 2025-05-19

arXiv ID: 2505.13290

Jie Ying, Zihong Chen, Zhefan Wang, Wanli Jiang, Chenyang Wang, Zhonghang Yuan, Haoyang Su, Huanjun Kong, Fan Yang, Nanqing Dong

Abstract

Seed science is the cornerstone of global food security, yet its progress is often hindered by interdisciplinary complexity and a lack of specialized AI support. To bridge the gap between artificial intelligence and agricultural innovation, we introduce SeedBench, the first multi-task benchmark specifically engineered for seed science. Developed in close collaboration with industry experts, SeedBench simulates critical real-world breeding processes to rigorously test the capabilities of large language models (LLMs). Our comprehensive evaluation of 26 leading models, including proprietary, open-source, and domain-specific versions, reveals significant performance disparities in addressing complex gene-trait relationships and specialized breeding tasks. By highlighting these gaps, this research provides a foundation for developing the next generation of LLMs tailored for seed design. It also establishes a standardized evaluation metric that paves the way for high-precision, AI-driven crop improvement and smarter agricultural solutions.
