You're reading from Building Data-Driven Applications with LlamaIndex A practical guide to retrieval-augmented generation (RAG) to enhance LLM applications

Product type Paperback

Published in May 2024

Publisher Packt

ISBN-13 9781835089507

Length 368 pages

Edition 1st Edition

Languages

Python

Tools

LlamaIndex

Concepts

GPT/LLMs

Author (1):

Andrei Gheorghiu


Table of Contents (18) Chapters

Preface

1. Part 1: Introduction to Generative AI and LlamaIndex

2. Chapter 1: Understanding Large Language Models

3. Chapter 2: LlamaIndex: The Hidden Jewel - An Introduction to the LlamaIndex Ecosystem

4. Part 2: Starting Your First LlamaIndex Project

5. Chapter 3: Kickstarting Your Journey with LlamaIndex

6. Chapter 4: Ingesting Data into Our RAG Workflow

7. Chapter 5: Indexing with LlamaIndex

8. Part 3: Retrieving and Working with Indexed Data

9. Chapter 6: Querying Our Data, Part 1 – Context Retrieval

10. Chapter 7: Querying Our Data, Part 2 – Postprocessing and Response Synthesis

11. Chapter 8: Building Chatbots and Agents with LlamaIndex

12. Part 4: Customization, Prompt Engineering, and Final Words

13. Chapter 9: Customizing and Deploying Our LlamaIndex Project

14. Chapter 10: Prompt Engineering Guidelines and Best Practices

15. Chapter 11: Conclusion and Additional Resources

16. Index

17. Other Books You May Enjoy

Handling documents that contain a mix of text and tabular data

Data is not always simple. Many real-world documents, such as research papers and financial reports, mix unstructured text with structured data in tables. Ingesting such heterogeneous documents presents an additional challenge: we need not only to extract the text but also to identify, parse, and process the tables embedded within it. Sometimes you get tables, sometimes you get text, and sometimes you have to deal with a mix of both.

LlamaIndex provides UnstructuredElementNodeParser to tackle documents that contain both free-form text and tables or other structured elements. It leverages the Unstructured library to analyze the document layout and separate text sections from tables.

This parser works exclusively on HTML files and can extract two types of nodes:

Text nodes: Containing the text chunks
Table nodes: Containing the table data and metadata...
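
To make the distinction between the two node types more concrete, here is a minimal sketch that uses only the Python standard library. It is not how UnstructuredElementNodeParser or the Unstructured library work internally; the TextTableSplitter class, the split_html function, and the dictionary layout of the nodes are all hypothetical names chosen for illustration. The sketch simply walks an HTML string and emits text nodes for free-form content and table nodes for anything inside a table element:

```python
from html.parser import HTMLParser


class TextTableSplitter(HTMLParser):
    """Hypothetical helper (not part of LlamaIndex): splits HTML into
    'text' nodes and 'table' nodes, in document order."""

    def __init__(self):
        super().__init__()
        self.nodes = []     # finished nodes
        self._text = []     # text fragments seen outside any table
        self._rows = None   # rows of the table being read, or None
        self._cell = None   # fragments of the cell being read, or None

    def _flush_text(self):
        # Collapse whitespace and emit a text node if anything is left.
        text = " ".join(" ".join(self._text).split())
        if text:
            self.nodes.append({"type": "text", "content": text})
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._flush_text()          # close the text chunk before the table
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._rows.append([])
        elif tag in ("td", "th") and self._rows:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._rows[-1].append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "table" and self._rows is not None:
            self.nodes.append({"type": "table", "rows": self._rows})
            self._rows = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)     # text inside a table cell
        elif self._rows is None:
            self._text.append(data)     # free-form text outside tables

    def close(self):
        super().close()
        self._flush_text()              # emit any trailing text


def split_html(html):
    parser = TextTableSplitter()
    parser.feed(html)
    parser.close()
    return parser.nodes


sample = """
<p>Revenue grew in both quarters.</p>
<table>
  <tr><th>Quarter</th><th>Revenue</th></tr>
  <tr><td>Q1</td><td>10</td></tr>
  <tr><td>Q2</td><td>12</td></tr>
</table>
<p>Figures are in millions.</p>
"""

nodes = split_html(sample)
for node in nodes:
    print(node)
```

For the sample above, the sketch yields a text node, then a table node holding three rows, then a second text node. A real parser has to do considerably more, such as attaching metadata to each table, which this simplified version deliberately leaves out.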
