Join our book community on Discord
https://packt.link/EarlyAccessCommunity
When studying transformer models, we tend to focus on their architecture and the datasets used to train them. This book covers the original Transformer, BERT, RoBERTa, ChatGPT, GPT-4, PaLM, LaMDA, DALL-E, and more. In addition, the book reviews several benchmark tasks and datasets. We have fine-tuned a BERT-like model and trained a RoBERTa tokenizer with the tokenizers library to encode data. In the previous chapter, Chapter 9, Shattering the Black Box with Interpretable Tools, we also shattered the black box and analyzed the inner workings of a transformer model.

However, we did not explore the critical role tokenizers play, nor did we evaluate how they shape the models we build. AI is data-driven. Raffel et al. (2019), like all the authors cited in this book, spent time preparing datasets for transformer models. In this chapter, we will go through some of the issues with tokenizers that can hinder or boost the performance of transformer models.
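As a quick reminder of the kind of tokenizer work this chapter builds on, the following is a minimal sketch, using the Hugging Face tokenizers library, of training a byte-level BPE tokenizer (as we did for RoBERTa) and inspecting how it splits a sentence. The file path, vocabulary size, and sample sentence are illustrative assumptions, not values from this book's experiments.

```python
from tokenizers import ByteLevelBPETokenizer

# Train a byte-level BPE tokenizer from scratch on a plain-text corpus.
# "corpus.txt" is a placeholder path; replace it with your own training file.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus.txt"],
    vocab_size=52000,        # illustrative vocabulary size
    min_frequency=2,         # ignore tokens seen fewer than 2 times
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)

# Encode a sample sentence and inspect the resulting subword tokens.
encoding = tokenizer.encode("Transformers are driven by the data they see.")
print(encoding.tokens)
print(encoding.ids)
```

Running the same sentence through tokenizers trained on different corpora or with different vocabulary sizes produces different subword splits, which is exactly the kind of effect on model behavior this chapter examines.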