Document metadata

Extract the hidden metadata of a PDF file

View the embedded metadata of a PDF: title, author, subject and keywords, the software that created and produced it, the creation and modification dates, the format version, the page count and whether the file is encrypted. These often reveal more than you would expect about who generated the document and with what. The PDF is read in the browser: it is never uploaded to any server.


How to read a PDF's metadata

  1. Load the PDF

    Drop the file into the drop zone or click to select it. Only PDF files are accepted; the document is read as binary in the browser.

  2. Read the document properties

    See title, author, subject and keywords: the fields the author (or the software) filled in. The title or author often reveals the real file name or the name of the person who wrote it.

  3. Check software and dates

    Creator and Producer show which programs wrote and generated the PDF; the creation and modification dates trace its history. Both are useful clues for analysis and verification.

  4. Assess the technical side

    Format version, estimated page count, encryption and web optimisation complete the picture. Copy the summary if you need it for documentation.
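The first step above (reading the file as binary and accepting only PDFs) could be sketched as follows. This is a minimal illustration, not the tool's actual code: the helper names bytesToBinaryString and looksLikePdf are hypothetical, and searching the first 1024 bytes for the header rather than only the very first bytes is an assumption made for leniency.

```typescript
// Convert raw bytes to a "binary string": one character per byte.
// This keeps byte values intact, so format markers can be searched as text.
function bytesToBinaryString(bytes: Uint8Array): string {
  const parts: string[] = [];
  for (let i = 0; i < bytes.length; i++) {
    parts.push(String.fromCharCode(bytes[i]));
  }
  return parts.join("");
}

// Hypothetical check: does the data carry the "%PDF-" marker near the start?
// Assumption: the header is searched in the first 1024 bytes, because some
// files have a few stray bytes before it.
function looksLikePdf(bytes: Uint8Array): boolean {
  const head = bytesToBinaryString(bytes.subarray(0, 1024));
  return head.indexOf("%PDF-") !== -1;
}
```

In a browser, the bytes would typically come from the dropped file with new Uint8Array(await file.arrayBuffer()), which reads the document locally without any network request.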

What a PDF's metadata contains

A PDF file is not just the page you see: it holds a structure of objects and, usually, an information dictionary (referenced as /Info) with the document metadata. This tool reads the file as binary and extracts the standard fields: Title, Author, Subject, Keywords, Creator (the application that created the content) and Producer (the library or software that generated the final PDF), plus the creation and modification dates in the PDF date format D:YYYYMMDDHHmmSS, optionally followed by a time zone offset.
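As an illustration of how such fields can be pulled out of the raw text, here is a minimal sketch. It assumes the file has already been turned into a one-character-per-byte string, and every name in it (extractInfo, readLiteralString, decodeTextString) is hypothetical. It only handles literal strings written in parentheses, not hex strings, and it takes the first occurrence of each key in the file, so a bookmark's /Title could be picked up by mistake; a full parser would follow the /Info reference in the trailer instead.

```typescript
const INFO_FIELDS = ["Title", "Author", "Subject", "Keywords",
  "Creator", "Producer", "CreationDate", "ModDate"] as const;
type InfoField = typeof INFO_FIELDS[number];
type InfoDict = Partial<Record<InfoField, string>>;

const ESCAPES: Record<string, string> = { n: "\n", r: "\r", t: "\t", b: "\b", f: "\f" };

// Read a PDF literal string whose opening "(" is at index `start`.
// Handles nested parentheses, backslash escapes and octal escapes.
function readLiteralString(src: string, start: number): string | null {
  if (src.charAt(start) !== "(") return null;
  let depth = 0;
  let out = "";
  for (let i = start; i < src.length; i++) {
    const ch = src.charAt(i);
    if (ch === "\\") {
      const octal = /^[0-7]{1,3}/.exec(src.slice(i + 1, i + 4));
      if (octal) {
        out += String.fromCharCode(parseInt(octal[0], 8) & 0xff);
        i += octal[0].length;
      } else {
        const next = src.charAt(i + 1);
        out += ESCAPES[next] ?? next; // "\(" and "\)" fall through as-is
        i += 1;
      }
      continue;
    }
    if (ch === "(") { depth++; if (depth === 1) continue; }
    if (ch === ")") { depth--; if (depth === 0) return out; }
    out += ch;
  }
  return null; // unbalanced parentheses
}

// Text strings starting with the bytes FE FF are UTF-16BE; others are
// treated here as Latin-1 (an approximation of PDFDocEncoding).
function decodeTextString(raw: string): string {
  if (raw.charCodeAt(0) !== 0xfe || raw.charCodeAt(1) !== 0xff) return raw;
  let out = "";
  for (let i = 2; i + 1 < raw.length; i += 2) {
    out += String.fromCharCode((raw.charCodeAt(i) << 8) | raw.charCodeAt(i + 1));
  }
  return out;
}

function extractInfo(src: string): InfoDict {
  const info: InfoDict = {};
  for (const field of INFO_FIELDS) {
    const match = new RegExp("/" + field + "\\s*\\(").exec(src);
    if (!match) continue;
    const raw = readLiteralString(src, match.index + match[0].length - 1);
    if (raw !== null) info[field] = decodeTextString(raw);
  }
  return info;
}
```

Fields that are missing, encrypted or stored inside compressed object streams simply do not appear in the result.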

This data is often more revealing than the author imagines. The Producer tells which library produced the file (and therefore which workflow, which version, sometimes which operating system); the Author can carry a real name even when the content is anonymous; the dates can contradict the document's official story. That is why PDF metadata is a classic starting point in technical analysis and verification, but also a privacy risk when publishing a document without cleaning it.

The tool also estimates technical data: the format version from the %PDF-1.x header, an estimated page count obtained by counting the page objects, the presence of encryption (an /Encrypt dictionary) and whether the PDF is linearised for progressive web loading. Reading is based on scanning the format markers, so it is reliable for most PDFs but remains an estimate for very complex files or for files that store their objects inside compressed streams. Everything happens in the browser: the document is never sent to a server.
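A marker-based estimate of this kind could look like the sketch below. It is an illustration under stated assumptions, not the tool's implementation: the function name inspectStructure and the PdfTechnicalInfo shape are hypothetical, the input is assumed to be the file content as a one-character-per-byte string, and the header and linearisation markers are only searched in the first 1024 characters.

```typescript
interface PdfTechnicalInfo {
  version: string | null;   // e.g. "1.7", or null if no header was found
  estimatedPages: number;   // count of page objects visible as plain text
  encrypted: boolean;       // an /Encrypt key is present
  linearized: boolean;      // a /Linearized key appears near the start
}

function inspectStructure(src: string): PdfTechnicalInfo {
  const head = src.slice(0, 1024);
  const header = /%PDF-(\d\.\d)/.exec(head);
  // Match "/Type /Page" but not "/Type /Pages" (page-tree nodes).
  const pageMatches = src.match(/\/Type\s*\/Page(?![A-Za-z])/g);
  return {
    version: header ? header[1] : null,
    estimatedPages: pageMatches ? pageMatches.length : 0,
    encrypted: /\/Encrypt\b/.test(src),
    linearized: /\/Linearized\b/.test(head),
  };
}
```

Page objects stored inside compressed object streams are invisible to a text scan like this one, which is why the count is only an estimate.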

Glossary

Technical terms used on this page, briefly explained.

/Info dictionary
The PDF section that holds the document metadata: title, author, subject, keywords, software, dates. It is the main source read by this tool.
Creator
The application that created the original content (e.g. a word processor or graphics program) before conversion to PDF.
Producer
The library or software that generated the final PDF file. It often reveals the version and environment of the document's production workflow.
PDF date (D:...)
The date format in PDFs: D: followed by year, month, day, hour, minutes, seconds and an optional timezone. The tool converts it to readable form.
Linearisation
A PDF reorganisation (Fast Web View) that lets the first page show before the whole file is downloaded. It signals web optimisation.
/Encrypt
A dictionary indicating the PDF is encrypted, either with a password required to open it or with usage restrictions (print, copy). Its presence marks a protected document.
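The PDF date format defined in this glossary can be converted to a readable form with a small parser. The sketch below is one possible approach, with a hypothetical function name (parsePdfDate); it assumes that a date without a time zone is treated as UTC, although the format itself leaves the zone unspecified in that case.

```typescript
// Parse a PDF date such as "D:20240131120000+01'00'" into a Date.
// Every component after the year is optional; missing parts default to
// the first month, the first day and midnight.
function parsePdfDate(value: string): Date | null {
  const pattern =
    /^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?/;
  const m = pattern.exec(value.trim());
  if (!m) return null;

  const num = (s: string | undefined, fallback: number): number =>
    s === undefined ? fallback : parseInt(s, 10);

  const utcMs = Date.UTC(
    num(m[1], 0), num(m[2], 1) - 1, num(m[3], 1),
    num(m[4], 0), num(m[5], 0), num(m[6], 0)
  );

  // Assumption: no offset (or "Z") means the time is taken as UTC.
  let offsetMinutes = 0;
  if (m[8] !== undefined) {
    const sign = m[8] === "-" ? -1 : 1;
    offsetMinutes = sign * (num(m[9], 0) * 60 + num(m[10], 0));
  }
  // Local time minus its offset gives the UTC instant.
  return new Date(utcMs - offsetMinutes * 60000);
}
```

The returned Date can then be shown with toISOString() or toLocaleString(), depending on whether a fixed or a local representation is wanted.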

PDF metadata FAQ

Is my PDF uploaded to a server?
No. The file is read as binary in the browser and the metadata is extracted locally. The document never leaves the device and is not sent anywhere. You can verify it in the Network tab of the developer tools, or use the tool offline.
Why do some PDFs show no metadata?
Because not all include it: raw scans, files generated by minimal tools or deliberately cleaned documents may carry no information dictionary. In that case only the technical data derivable from the structure is shown (version, pages, encryption).
Is the page count always exact?
It is a reliable estimate, but not a guaranteed one. It is derived by counting the page objects in the file. For PDFs with unusual structures, or with objects stored inside compressed streams, the value may differ from the real one. For a definitive count, open the document in a PDF reader.
What does the Producer field reveal about my privacy?
It tells which software and version generated the PDF, and sometimes hints at the system or workflow used. Together with Author and the dates, it is one of the things worth checking and, if needed, removing before publishing a document.
Can the tool remove the metadata?
No, it only reads it. Removing it requires tools that rewrite the PDF (many PDF editors have a metadata-cleaning feature, or you can reprint the document as a new PDF). This tool is for seeing what the file is revealing.
Does it work with encrypted or password-protected PDFs?
It detects the presence of encryption and tells you, but if the PDF requires a password to open, the textual metadata may be encrypted and unreadable without that password. The technical structure usually stays inspectable.
Does it also read XMP metadata?
The tool focuses on the standard information dictionary, which covers the most common case. Some PDFs also include XMP metadata (an XML block); full XMP parsing is not the goal of this tool, designed for a quick, readable look.
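For readers curious about the XMP block mentioned in the last answer, the sketch below shows how its presence could be detected in the raw text. This is not something the tool is stated to do: the function name findXmpPacket is hypothetical, and the approach assumes the XMP packet is stored uncompressed, which is common but not guaranteed.

```typescript
// Return the raw XMP packet (the <x:xmpmeta> element) if one is visible
// as plain text in the file content, or null otherwise.
// `src` is assumed to be the file as a one-character-per-byte string.
function findXmpPacket(src: string): string | null {
  const start = src.indexOf("<x:xmpmeta");
  if (start === -1) return null;
  const endTag = "</x:xmpmeta>";
  const end = src.indexOf(endTag, start);
  return end === -1 ? null : src.slice(start, end + endTag.length);
}
```

The returned string is XML and would still need a proper XML parser to be read field by field.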

Who builds these tools?

Maurizio Fonte, senior IT consultant with 20+ years in PHP, Laravel, unmanaged Linux infrastructure, applied cybersecurity and AI/LLM integration. Production backends, legacy code modernization, security audits, custom AI agents and MCP servers: the work behind every tool published here.
