
RAG in Practice: Building AI Apps That Answer From Real Data

Nolan Hart

Rating: 4.8 (2.4k reviews)

Pages: 208

Language: en

Published: 2026

New edition

$5.99


Book introduction

Are you struggling with LLM hallucinations and unreliable answers in your AI applications? Retrieval-augmented generation (RAG) addresses both by grounding the model's answers in documents retrieved at query time, and 'RAG in Practice' shows you how to build production-ready systems that answer from real data. Written by Nolan Hart, this practical guide takes you from raw documents to deployed applications, covering every stage of the RAG pipeline with minimal code and maximum insight.

  • Document ingestion: Turn PDFs, web pages, and messy files into structured, usable data with robust parsing and cleaning pipelines.
  • Chunking strategies: Learn fixed, semantic, and recursive chunking to split documents without losing meaning, preserving context across boundaries.
  • Embeddings and vector search: Understand how to map text to vectors, choose the right embedding model, and select vector databases like FAISS, Chroma, Pinecone, or Weaviate.
  • Hybrid search and reranking: Combine keyword matching, vector similarity, and metadata filters, then refine results with cross-encoders for precise evidence selection.
  • Prompt design for RAG: Structure prompts to inject retrieved context effectively, handle missing or conflicting information, and manage token budgets.
  • Citations and grounding: Build trust with source tracking, citation strategies, and fallback logic that reduces hallucination.
  • Evaluation metrics: Go beyond "looks good" with objective measures for retrieval quality, answer quality, and user trust, using a reusable evaluation harness.
  • Production hardening: Scale your system with caching, cost control, security, and role-based access, plus a debugging playbook for common failures.
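As an illustration of the chunking bullet above, here is a minimal sketch of fixed-size chunking with overlap, so that text near a boundary appears in two neighbouring chunks. It is not taken from the book: the function name `chunk_text`, the character-based window, and the default sizes are all assumptions chosen for the example, and a real pipeline would more likely count tokens than characters.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters.

    Hypothetical helper for illustration; sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is emitted that only repeats the overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(chunk)
```

With `chunk_size=4` and `overlap=2`, each chunk repeats the last two characters of the previous one. Larger overlap preserves more context across boundaries at the cost of storing and embedding more redundant text.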

This book is for intermediate Python developers and AI engineers who have experimented with LLM APIs and want to move from demos to reliable, data-grounded applications. You'll gain a systematic roadmap covering ingestion, chunking, embedding, retrieval, generation, and evaluation, with a focus on architectural patterns that outlast tool churn.
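To make the evaluation stage of that roadmap concrete, the sketch below computes two standard retrieval metrics, recall@k and mean reciprocal rank (MRR), over a small labelled set. It is an assumption-laden illustration rather than the book's harness: the function names, the use of chunk ids as strings, and the `(retrieved, relevant)` pair format are all hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant chunk ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: list[tuple[list[str], set[str]]], k: int = 3) -> dict[str, float]:
    """Average recall@k and MRR over (retrieved ids, relevant ids) pairs."""
    if not runs:
        return {"recall@k": 0.0, "mrr": 0.0}
    n = len(runs)
    return {
        "recall@k": sum(recall_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }


if __name__ == "__main__":
    # Hypothetical labelled queries: ranked retriever output and the ids judged relevant.
    runs = [
        (["a", "b", "c"], {"b"}),
        (["x", "y", "z"], {"q", "x"}),
    ]
    print(evaluate(runs, k=2))
```

These metrics only score the retriever. Answer quality needs separate measures, which is why retrieval and generation are usually evaluated independently.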

Whether you're building a document chatbot, a knowledge assistant, or an AI search tool, 'RAG in Practice' equips you to deliver answers your users can trust.


Table of contents

  1. Author's Note: Building Reliable AI From Real Data (introduction)
  2. Why RAG Matters (part)
  3. Why LLMs Need Retrieval Instead of Relying Only on Model Memory (chapter)
  4. The Illusion of Model Knowledge (section)
  5. When Generation Fails in Production (section)
  6. The Retrieval Imperative (section)
  7. What RAG Is and How the Retrieval-Augmented Pipeline Works (chapter)
  8. Defining RAG Beyond the Acronym (section)
  9. The Five-Stage Pipeline Architecture (section)
  10. Data Flow and Latency Expectations (section)
  11. The Core Components of a RAG System: Data, Embeddings, Retriever, and Generator (chapter)
  12. The Data Layer and Context Window (section)
  13. Retriever and Generator Handshake (section)
  14. Orchestrating the Pipeline (section)
  15. Preparing Knowledge for Retrieval (part)
  16. Document Ingestion: Turning PDFs, Web Pages, Files, and Notes Into Usable Data (chapter)
  17. Handling Messy Real-World Documents (section)
  18. Parsers, Extractors, and Cleaners (section)
  19. Building a Resilient Ingestion Pipeline (section)
  20. Chunking Strategies: How to Split Documents Without Losing Meaning (chapter)
  21. The Cost of Bad Boundaries (section)
  22. Fixed vs. Semantic vs. Recursive Chunking (section)
  23. Overlap, Context Windows, and Token Limits (section)
  24. Metadata, Source Tracking, and Why Document Context Matters (chapter)
  25. What Metadata Actually Does for Retrieval (section)
  26. Source Tracking and Provenance (section)
  27. Designing a Metadata Schema (section)
  28. Embeddings, Vector Search, and Indexing (part)
  29. Embeddings: Turning Text Into Searchable Meaning (chapter)
  30. How Embeddings Map Language to Vectors (section)
  31. Choosing Models and Dimensionality (section)
  32. Normalization and Distance Metrics (section)
  33. Vector Databases: FAISS, Chroma, Pinecone, Weaviate, and Other Options (chapter)
  34. In-Memory vs. Managed vs. Distributed (section)
  35. Trade-Offs: Latency, Cost, and Scale (section)
  36. Making the Right Choice for Your Stack (section)
  37. Indexing Pipelines: Storing, Updating, and Refreshing Knowledge Safely (chapter)
  38. The Indexing Lifecycle (section)
  39. Handling Updates and Deletions (section)
  40. Avoiding Stale Context and Sync Drift (section)
  41. Retrieval Quality and Advanced Search (part)
  42. Similarity Search: Finding the Most Relevant Chunks (chapter)
  43. How Similarity Search Actually Works (section)
  44. Tuning Thresholds and Top-K (section)
  45. When Similarity Search Falls Short (section)
  46. Hybrid Search: Combining Keywords, Vectors, and Metadata Filters (chapter)
  47. The Case for Hybrid Retrieval (section)
  48. Weighting Strategies and Fusion Algorithms (section)
  49. Metadata Filtering and Access Boundaries (section)
  50. Reranking: Choosing the Best Evidence Before Answer Generation (chapter)
  51. Why First-Pass Retrieval Isn't Enough (section)
  52. Cross-Encoders and Reranking Models (section)
  53. Latency vs. Precision Trade-Offs (section)
  54. From Context to Reliable Answers (part)
  55. Prompt Design for RAG: Turning Retrieved Knowledge Into Useful Answers (chapter)
  56. Structuring the RAG Prompt (section)
  57. Handling Missing or Conflicting Context (section)
  58. Token Budgets and Context Window Management (section)
  59. Citations, Source Grounding, and Reducing Hallucination (chapter)
  60. The Anatomy of a Grounded Answer (section)
  61. Citation Strategies and Source Mapping (section)
  62. Fallbacks and "I Don't Know" Logic (section)
  63. RAG Evaluation: Measuring Retrieval Quality, Answer Quality, and User Trust (chapter)
  64. Beyond "Looks Good": Defining RAG Metrics (section)
  65. Measuring Retrieval vs. Generation Quality (section)
  66. Building an Evaluation Harness (section)
  67. Production RAG Systems (part)
  68. Project: Building a Document Chatbot With RAG (chapter)
  69. Project Architecture and Setup (section)
  70. Wiring Ingestion, Retrieval, and Generation (section)
  71. Testing the Full Pipeline (section)
  72. Scaling RAG: Latency, Cost, Caching, Security, and Access Control (chapter)
  73. Caching Strategies and Latency Reduction (section)
  74. Cost Control and Token Economics (section)
  75. Security, PII, and Role-Based Access (section)
  76. Common RAG Failures and How to Debug Them (chapter)
  77. The Debugging Mindset for RAG (section)
  78. Diagnosing Retrieval vs. Generation Failures (section)
  79. Monitoring, Alerts, and Continuous Improvement (section)

Frequently asked questions

What is the main content of the book?

Learn to build production-ready RAG systems that eliminate LLM hallucinations. From ingestion to deployment, this practical guide covers chunking, embeddings...

