Ingesting data with Vertex AI Agent Builder
Vertex AI Agent Builder offers a managed way to ingest data through a service called DataStore. It automatically ingests and chunks documents, and it needs only a few configuration parameters. To create a DataStore, navigate to the Data Stores section of Agent Builder in the Google Cloud console.
Figure 5.5: Create a new version of a custom model
Agent Builder supports several source types, including the following:
Website data
For website data, check that the web pages you want to index are not blocked by robots.txt, and verify domain ownership if you need advanced indexing. Adding structured data such as meta tags and PageMaps can further improve indexing.
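The robots.txt check can be done before you register a site. The following is a minimal sketch that uses Python's standard urllib.robotparser module to test whether a given URL is disallowed. The helper name is_crawlable, the sample rules, and the example.com URLs are illustrative assumptions. The default user agent "*" is also a placeholder, so substitute the user agent of the crawler you care about.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the robots.txt rules allow user_agent to fetch url.

    robots_txt is the raw text of the site's robots.txt file. The "*"
    default for user_agent is a placeholder assumption, not the name of
    any specific crawler.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network call is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration: everything under /private/ is blocked.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_crawlable(SAMPLE_ROBOTS, "https://example.com/docs/intro.html"))      # True
print(is_crawlable(SAMPLE_ROBOTS, "https://example.com/private/report.html"))  # False
```

For a live site, you could instead call set_url() with the site's robots.txt address followed by read() on the parser. The sketch passes the rules in as text so that it runs without network access.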
Unstructured data
Unstructured data, such as HTML, PDF, TXT, PPTX, and DOCX files, can be imported from Cloud Storage using the console, API, or streaming ingestion. File size limits apply, with a limit of 2.5 MB for...