Knowledge-augmented Methods for Natural Language Processing
English
By (author): Bill Yuchen Lin, Chenguang Zhu, Meng Jiang, Shuohang Wang, Wenhao Yu, Yichong Xu
Over the last few years, natural language processing has seen remarkable progress due to the emergence of larger-scale models, better training techniques, and greater availability of data. Examples of these advancements include GPT-4, ChatGPT, and other pre-trained language models. These models are capable of characterizing linguistic patterns and generating context-aware representations, resulting in high-quality output. However, these models rely solely on input-output pairs during training and therefore struggle to incorporate external world knowledge, such as named entities, their relations, common sense, and domain-specific content. Incorporating knowledge into the training and inference of language models is critical to their ability to represent language accurately. Knowledge is also essential to achieving higher levels of intelligence that cannot be attained through statistical learning of input text patterns alone.

In this book, we review recent developments in natural language processing, focusing on the role of knowledge in language representation. We examine how pre-trained language models like GPT-4 and ChatGPT are limited in their ability to capture external world knowledge, and we explore various approaches to incorporating knowledge into language models. We also discuss the significance of knowledge in enabling higher levels of intelligence that go beyond statistical learning on input text patterns. Overall, this book aims to provide insights into the importance of knowledge in natural language processing and to highlight recent advances in the field.