BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Title: BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Authors: Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu
Published: October 19, 2022
Link: https://arxiv.org/abs/2210.10341
Summary (Generated by Microsoft Copilot):
Introduction: BioGPT is a generative pre-trained Transformer model designed for biomedical text generation and mining, addressing the limitations of BERT-like models in generation tasks.
Challenges: Existing biomedical models such as BioBERT and PubMedBERT excel at understanding tasks but lack generation capability, while general-domain GPT models transfer poorly to biomedical text due to domain shift.
Methods: BioGPT is pre-trained from scratch on 15M PubMed abstracts, using a GPT-2-medium-scale Transformer decoder with a vocabulary learned on the in-domain corpus, and is then fine-tuned on six biomedical NLP tasks with continuous embeddings as prompts.
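Since the method is a standard causal decoder, a minimal usage sketch makes it concrete. It assumes the `microsoft/biogpt` checkpoint distributed through Hugging Face `transformers` corresponds to the paper's released model; the prompt text and decoding settings are illustrative, not the paper's exact configuration.

```python
# Minimal sketch: generating biomedical text with the released BioGPT
# checkpoint, assuming the `microsoft/biogpt` weights on Hugging Face
# match the paper's model. Decoding settings here are illustrative.
import torch
from transformers import BioGptTokenizer, BioGptForCausalLM

tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")
model = BioGptForCausalLM.from_pretrained("microsoft/biogpt")
model.eval()

inputs = tokenizer("COVID-19 is", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=50,
        num_beams=5,        # beam search; the paper's exact settings may differ
        early_stopping=True,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```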
Novelties: BioGPT introduces domain-specific pre-training from scratch, natural-language target sequences in place of structured labels, and learned soft prompts for better task adaptation (see the sketch below).
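The soft-prompt idea, "continuous embeddings" prepended to the input, can be sketched generically. This is an illustrative PyTorch implementation of the technique, not the paper's code; the virtual-token count and hidden size (1024, matching a GPT-2-medium-scale decoder) are assumed values.

```python
# Illustrative sketch of the continuous-embedding (soft prompt) idea:
# a small set of learned vectors is prepended to the token embeddings
# before the Transformer decoder runs. Sizes are assumptions.
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    def __init__(self, n_virtual: int = 9, d_model: int = 1024):
        super().__init__()
        # Learnable prompt vectors, updated during fine-tuning.
        self.prompt = nn.Parameter(torch.randn(n_virtual, d_model) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, d_model)
        batch_size = token_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch_size, -1, -1)
        # Output: (batch, n_virtual + seq_len, d_model)
        return torch.cat([prompt, token_embeds], dim=1)

# Example: prepend the prompt to a batch of 2 sequences of length 16.
soft_prompt = SoftPrompt()
embeds = torch.randn(2, 16, 1024)
print(soft_prompt(embeds).shape)  # torch.Size([2, 25, 1024])
```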
Results: BioGPT outperforms previous models on most tasks, achieving state-of-the-art results in relation extraction, question answering, and document classification.
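The relation extraction results rest on casting extraction as sequence generation: gold triplets are verbalized into natural-language target sentences for the decoder to emit. The template below is a hypothetical illustration of that idea; the paper's actual templates are task-specific and worded differently.

```python
# Hypothetical verbalizer illustrating the natural-language target idea:
# a relation triplet becomes a sentence rather than a structured tag
# sequence. The template wording is an assumption, not the paper's.
def verbalize_triplet(head: str, relation: str, tail: str) -> str:
    return f"the relation between {head} and {tail} is {relation}."

print(verbalize_triplet("aspirin", "inhibitor", "prostaglandin synthase"))
# -> the relation between aspirin and prostaglandin synthase is inhibitor.
```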
Performances: 44.98%, 38.42%, and 40.76% F1 on the BC5CDR, KD-DTI, and DDI end-to-end relation extraction tasks, respectively, and 78.2% accuracy on PubMedQA, a new record at the time of publication.
Limitations: The model size is limited to GPT-2 medium due to computational constraints.
Discussion: Future work includes scaling BioGPT to larger models and datasets for broader applications in biomedical NLP.