BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Title: BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Authors: Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon, Tie-Yan Liu
Published: October 19, 2022
Link: https://arxiv.org/abs/2210.10341
Summary (Generated by Microsoft Copilot):
Introduction: BioGPT is a generative pre-trained Transformer model designed for biomedical text generation and mining, addressing the limitations of BERT-like models in generation tasks.
Challenges: Existing biomedical models such as BioBERT and PubMedBERT excel at understanding tasks but lack generation capability, while general-domain GPT models transfer poorly to biomedical text due to domain shift.
Methods: BioGPT is pre-trained from scratch on 15M PubMed abstracts, using a GPT-2-medium-scale Transformer decoder with a vocabulary learned on the in-domain corpus, and is then fine-tuned on six biomedical NLP tasks with continuous embeddings as prompts.
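Since the method is a standard causal decoder, a minimal usage sketch makes it concrete. It assumes the `microsoft/biogpt` checkpoint distributed through Hugging Face `transformers` corresponds to the paper's released model; the prompt text and decoding settings are illustrative, not the paper's exact configuration.

```python
# Minimal sketch: generating biomedical text with the released BioGPT
# checkpoint, assuming the `microsoft/biogpt` weights on Hugging Face
# match the paper's model. Decoding settings here are illustrative.
import torch
from transformers import BioGptTokenizer, BioGptForCausalLM

tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")
model = BioGptForCausalLM.from_pretrained("microsoft/biogpt")
model.eval()

inputs = tokenizer("COVID-19 is", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_length=50,
        num_beams=5,        # beam search; the paper's exact settings may differ
        early_stopping=True,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```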
Novelties: BioGPT introduces domain-specific pre-training from scratch, natural-language target sequences in place of structured labels, and learned soft prompts for better task adaptation (see the sketch below).
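The soft-prompt idea, "continuous embeddings" prepended to the input, can be sketched generically. This is an illustrative PyTorch implementation of the technique, not the paper's code; the virtual-token count and hidden size (1024, matching a GPT-2-medium-scale decoder) are assumed values.

```python
# Illustrative sketch of the continuous-embedding (soft prompt) idea:
# a small set of learned vectors is prepended to the token embeddings
# before the Transformer decoder runs. Sizes are assumptions.
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    def __init__(self, n_virtual: int = 9, d_model: int = 1024):
        super().__init__()
        # Learnable prompt vectors, updated during fine-tuning.
        self.prompt = nn.Parameter(torch.randn(n_virtual, d_model) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, d_model)
        batch_size = token_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch_size, -1, -1)
        # Output: (batch, n_virtual + seq_len, d_model)
        return torch.cat([prompt, token_embeds], dim=1)

# Example: prepend the prompt to a batch of 2 sequences of length 16.
soft_prompt = SoftPrompt()
embeds = torch.randn(2, 16, 1024)
print(soft_prompt(embeds).shape)  # torch.Size([2, 25, 1024])
```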
Results: BioGPT outperforms previous models on most tasks, achieving state-of-the-art results in relation extraction, question answering, and document classification.
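The relation extraction results rest on casting extraction as sequence generation: gold triplets are verbalized into natural-language target sentences for the decoder to emit. The template below is a hypothetical illustration of that idea; the paper's actual templates are task-specific and worded differently.

```python
# Hypothetical verbalizer illustrating the natural-language target idea:
# a relation triplet becomes a sentence rather than a structured tag
# sequence. The template wording is an assumption, not the paper's.
def verbalize_triplet(head: str, relation: str, tail: str) -> str:
    return f"the relation between {head} and {tail} is {relation}."

print(verbalize_triplet("aspirin", "inhibitor", "prostaglandin synthase"))
# -> the relation between aspirin and prostaglandin synthase is inhibitor.
```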
Performances: 44.98%, 38.42%, and 40.76% F1 on the BC5CDR, KD-DTI, and DDI end-to-end relation extraction tasks, respectively, and 78.2% accuracy on PubMedQA, a new record at the time of publication.
Limitations: The model size is limited to GPT-2 medium due to computational constraints.
Discussion: Future work includes scaling BioGPT to larger models and datasets for broader applications in biomedical NLP.