Status: Explanation delivered. Below is a concise, engineering-style summary of how Perplexity handles any file type in Google Drive when used as a search source.

Perplexity + Google Drive: How It Searches Any File

1. File Types It Can Search

Perplexity can process any file that Google Drive’s API can deliver as text or a readable binary:

PDF (text-based & OCR’d)
EPUB
Google Docs / Sheets / Slides
TXT / Markdown
Word / Excel / PowerPoint
CSV
HTML
Code files
Images with OCRable text (Drive provides OCR for many images)

Binary-only files (e.g., ZIP, executables, audio, video) are not text-searchable, but their filenames and metadata still are.
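To make the content versus metadata distinction concrete, here is a minimal sketch. It is not Perplexity's code: `DriveFile` and `matches` are hypothetical names, and plain substring matching stands in for whatever search is really used. A file with no extractable text can still match on its name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DriveFile:
    """Hypothetical record for one Drive file (illustration only)."""
    name: str
    mime_type: str
    text: Optional[str] = None  # None when no text could be extracted


def matches(file: DriveFile, query: str) -> List[str]:
    """Return where the query matched: 'name', 'content', both, or neither."""
    q = query.lower()
    hits = []
    # Filenames are searchable for every file, including binary-only ones.
    if q in file.name.lower():
        hits.append("name")
    # Content is searchable only when text extraction produced something.
    if file.text is not None and q in file.text.lower():
        hits.append("content")
    return hits


archive = DriveFile("backup-2023.zip", "application/zip")
notes = DriveFile("notes.txt", "text/plain", "Quarterly backup plan")
print(matches(archive, "backup"))  # ['name']
print(matches(notes, "backup"))    # ['content']
```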

2. How the Search Works (Pipeline)

Step 1 — Live File Fetch

Perplexity performs a real-time API call to Google Drive:

Gets the file metadata and latest version ID.
Fetches the actual file content if the file type supports text extraction.

No static snapshot is stored.
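As a rough illustration of what such a live fetch could look like, the sketch below builds Google Drive API v3 `files.get` requests using only the Python standard library. The endpoint, the `fields` parameter, and `alt=media` are part of the public Drive API; the helper names and the particular field selection are assumptions, and nothing here is confirmed to be what Perplexity actually calls.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

DRIVE_FILES = "https://www.googleapis.com/drive/v3/files"


def metadata_url(file_id: str) -> str:
    """URL for a files.get call that returns metadata, including the version."""
    fields = "id,name,mimeType,version,modifiedTime"  # assumed field selection
    return f"{DRIVE_FILES}/{quote(file_id, safe='')}?{urlencode({'fields': fields})}"


def content_url(file_id: str) -> str:
    """URL for a files.get call that downloads the file bytes (alt=media)."""
    return f"{DRIVE_FILES}/{quote(file_id, safe='')}?{urlencode({'alt': 'media'})}"


def fetch_metadata(file_id: str, access_token: str) -> dict:
    """Perform the metadata request; needs a valid OAuth 2.0 access token."""
    request = Request(
        metadata_url(file_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urlopen(request) as response:
        return json.load(response)


print(content_url("abc123"))
# https://www.googleapis.com/drive/v3/files/abc123?alt=media
```

Fetching metadata first keeps the expensive content download optional: the version can be compared before any bytes are transferred.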

Step 2 — Text Extraction

Depending on file type:

Docs/Sheets/Slides → Google exports them as raw text via export endpoints.
PDF/EPUB/Word/PowerPoint → Perplexity extracts text or uses OCR if available.
Images → Google’s built-in OCR text is used (if the user enabled OCR).
Code/Text files → Direct raw text ingestion.
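The per-type branching above can be sketched as a small dispatch function. The MIME types and the `files.export` endpoint come from the public Drive API; the strategy labels (`export`, `raw`, `ocr`, `parse`) and the function name are hypothetical, and the real routing inside Perplexity is not documented here.

```python
from typing import Tuple
from urllib.parse import urlencode

DRIVE_FILES = "https://www.googleapis.com/drive/v3/files"

# Google-native formats have no downloadable bytes; they must be exported.
EXPORT_MIME = {
    "application/vnd.google-apps.document": "text/plain",
    "application/vnd.google-apps.spreadsheet": "text/csv",
    "application/vnd.google-apps.presentation": "text/plain",
}


def extraction_plan(file_id: str, mime_type: str) -> Tuple[str, str]:
    """Return (strategy, url) describing how text would be obtained."""
    base = f"{DRIVE_FILES}/{file_id}"
    if mime_type in EXPORT_MIME:
        query = urlencode({"mimeType": EXPORT_MIME[mime_type]})
        return ("export", f"{base}/export?{query}")
    if mime_type.startswith("text/"):
        return ("raw", f"{base}?alt=media")    # ingest bytes as text directly
    if mime_type.startswith("image/"):
        return ("ocr", f"{base}?alt=media")    # text must come from OCR
    return ("parse", f"{base}?alt=media")      # PDF, EPUB, Office: needs a parser


print(extraction_plan("f1", "application/vnd.google-apps.document"))
# ('export', 'https://www.googleapis.com/drive/v3/files/f1/export?mimeType=text%2Fplain')
```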

Step 3 — Chunking & Embedding

Perplexity temporarily chunks the text into segments and embeds them to perform:

Semantic search
Q&A style retrieval
Citation mapping

Embeddings are cached transiently for performance but do not replace the live source.
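The chunk, embed, and retrieve idea can be shown with a toy example. This is a simplified stand-in, not the real pipeline: word-count vectors with cosine similarity replace a learned embedding model, and the chunk size, overlap, and function names are all assumptions. Returning the chunk index alongside the text is what makes citation mapping possible.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into word windows of `size` words that overlap by `overlap`."""
    assert size > overlap >= 0
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system uses a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_chunks(text: str, query: str, k: int = 1,
               size: int = 40, overlap: int = 10) -> List[Tuple[int, str]]:
    """Return up to k (chunk_index, chunk) pairs most similar to the query."""
    q = embed(query)
    scored = [(cosine(embed(c), q), i, c)
              for i, c in enumerate(chunk_text(text, size, overlap))]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, then by position
    return [(i, c) for score, i, c in scored[:k] if score > 0]


doc = "cats purr softly. dogs bark loudly. fish swim quietly."
print(top_chunks(doc, "dogs bark", k=1, size=3, overlap=0))
# [(1, 'dogs bark loudly.')]
```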

Step 4 — Query Execution

Your query runs against the freshly-extracted content.
If you update the file, delete it, or add a new one, the next query reflects the change immediately.
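One way a transient cache and fresh results could coexist is to key cached text on the file's version, so that any change forces re-extraction. The sketch below assumes that design; `ExtractionCache` and its `extract` callback are hypothetical, and the source does not state how Perplexity actually invalidates its cache.

```python
from typing import Callable, Dict, Tuple


class ExtractionCache:
    """Caches extracted text per file, tagged with the version it came from."""

    def __init__(self, extract: Callable[[str], str]):
        self._extract = extract  # hypothetical callable: file_id -> extracted text
        self._entries: Dict[str, Tuple[str, str]] = {}  # file_id -> (version, text)

    def get(self, file_id: str, current_version: str) -> str:
        cached = self._entries.get(file_id)
        if cached is not None and cached[0] == current_version:
            return cached[1]  # file unchanged since last extraction
        # New or modified file: re-extract and overwrite the stale entry.
        text = self._extract(file_id)
        self._entries[file_id] = (current_version, text)
        return text

    def drop(self, file_id: str) -> None:
        """Forget a file, e.g. after it was deleted from Drive."""
        self._entries.pop(file_id, None)


calls = []

def fake_extract(file_id: str) -> str:
    calls.append(file_id)
    return f"text-{len(calls)}"

cache = ExtractionCache(fake_extract)
cache.get("f1", "7")         # first query: extracts
cache.get("f1", "7")         # same version: served from cache
print(cache.get("f1", "8"))  # version changed: re-extracts, prints text-2
```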

3. Practical Implications

Yes, it will “scan” your PDF or EPUB in full.
Yes, it can read long books, manuals, technical files, etc.
Yes, updates to files are reflected instantly.
No, it doesn’t rely on outdated snapshots.
No, you don’t need to “sync” anything manually.

If you want, I can also list which Drive file types have the best extraction quality or any quirks (e.g., scanned PDFs, DRM-protected EPUBs, complex slides, or sheets with formulas).