Read Pdf Metadata


The metadata is saved as a file in XMP format. To save the metadata as a template, choose Save Metadata Template from the dialog box menu in the upper right corner, and name the file. (To use the saved metadata in another PDF, open that document and use these instructions to replace or append its metadata.) Metadata might contain the name and login of the author, the creation date, or other interesting details.
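Because a saved XMP file is XML, it can be inspected with a standard XML parser. The following is a minimal Python sketch, not a full XMP reader. The sample packet, the function name `read_xmp_fields`, and the choice of fields are illustrative assumptions; real XMP files may store the same properties as attributes or under other namespaces, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Namespace URIs commonly used in XMP packets.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
}

# Hypothetical sample packet, made up for illustration.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xmp="http://ns.adobe.com/xap/1.0/">
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Sample Report</rdf:li></rdf:Alt></dc:title>
   <dc:creator><rdf:Seq><rdf:li>Jane Doe</rdf:li></rdf:Seq></dc:creator>
   <xmp:CreateDate>2020-01-02T03:04:05Z</xmp:CreateDate>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def read_xmp_fields(xmp_text):
    """Return title, creators, and creation date found in an XMP packet."""
    root = ET.fromstring(xmp_text)
    fields = {}

    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    if title is not None:
        fields["title"] = title.text

    creators = [li.text for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)]
    if creators:
        fields["creators"] = creators

    created = root.find(".//xmp:CreateDate", NS)
    if created is not None:
        fields["created"] = created.text

    return fields


if __name__ == "__main__":
    print(read_xmp_fields(SAMPLE_XMP))
```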


Some image files contain metadata that you can read to determine features of the image. For example, a digital photograph might contain metadata that you can read to determine the make and model of the camera used to capture the image. With GDI+, you can read existing metadata, and you can also write new metadata to image files.

GDI+ stores an individual piece of metadata in a PropertyItem object. You can read the PropertyItems property of an Image object to retrieve all the metadata from a file. The PropertyItems property returns an array of PropertyItem objects.

A PropertyItem object has the following four properties: Id, Value, Len, and Type.

Id

A tag that identifies the metadata item. Some values that can be assigned to Id are shown in the following table.

Hexadecimal value	Description
0x0320	Image title
0x010F	Equipment manufacturer
0x0110	Equipment model
0x9003	ExifDTOriginal
0x829A	Exif exposure time
0x5090	Luminance table
0x5091	Chrominance table

Value

An array of values. The format of the values is determined by the Type property.

Len

The length (in bytes) of the array of values pointed to by the Value property.

Type

The data type of the values in the array pointed to by the Value property. The formats indicated by the Type property values are shown in the following table.

Numeric value	Description
1	A `Byte`
2	An array of `Byte` objects encoded as ASCII
3	A 16-bit integer
4	A 32-bit integer
5	An array of two unsigned 32-bit integers (numerator and denominator) that represent a rational number
6	Not used
7	Undefined
8	Not used
9	`SLong`
10	`SRational`
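To make the role of the Type value concrete, here is a language-neutral sketch in Python of how the raw bytes of a Value array might be interpreted for each type code. This is not the GDI+ API: the function name `decode_property_value` is hypothetical, little-endian byte order is an assumption, and rationals are treated as pairs of 32-bit integers as in the EXIF/TIFF tag types.

```python
import struct


def decode_property_value(type_code, raw):
    """Interpret raw metadata bytes according to a PropertyItem-style type code.

    Assumes little-endian byte order (an assumption, not confirmed by the text).
    """
    if type_code == 1:  # bytes
        return list(raw)
    if type_code == 2:  # ASCII string, usually null-terminated
        return raw.split(b"\x00", 1)[0].decode("ascii")
    if type_code == 3:  # unsigned 16-bit integers
        return list(struct.unpack("<%dH" % (len(raw) // 2), raw))
    if type_code == 4:  # unsigned 32-bit integers
        return list(struct.unpack("<%dI" % (len(raw) // 4), raw))
    if type_code == 5:  # rationals: (numerator, denominator) pairs, unsigned
        nums = struct.unpack("<%dI" % (len(raw) // 4), raw)
        return list(zip(nums[0::2], nums[1::2]))
    if type_code == 7:  # undefined: hand the bytes back untouched
        return bytes(raw)
    if type_code == 9:  # SLong: signed 32-bit integers
        return list(struct.unpack("<%di" % (len(raw) // 4), raw))
    if type_code == 10:  # SRational: signed (numerator, denominator) pairs
        nums = struct.unpack("<%di" % (len(raw) // 4), raw)
        return list(zip(nums[0::2], nums[1::2]))
    raise ValueError("unsupported type code: %r" % type_code)


if __name__ == "__main__":
    # A manufacturer string (Id 0x010F, Type 2) with a made-up value.
    print(decode_property_value(2, b"FakeCo\x00"))
    # An exposure time (Id 0x829A, Type 5) of 1/250 s, also made up.
    print(decode_property_value(5, struct.pack("<II", 1, 250)))
```

Returning rationals as integer pairs rather than floating-point numbers keeps the stored value exact and avoids a division error when a file contains a zero denominator.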

Example

Description

The following code example reads and displays the seven pieces of metadata in the file FakePhoto.jpg. The second (index 1) property item in the list has Id 0x010F (equipment manufacturer) and Type 2 (ASCII-encoded byte array). The code example displays the value of that property item.

The code produces output similar to the following:

Code

Compiling the Code

The preceding example is designed for use with Windows Forms, and it requires PaintEventArgs e, which is a parameter of the Paint event handler. Handle the form's Paint event and paste this code into the Paint event handler. You must replace FakePhoto.jpg with an image name and path valid on your system and import the System.Drawing.Imaging namespace.

6 Answers

The Zend framework includes Zend_Pdf, which makes this really easy:

Limitations: Works only on unencrypted files smaller than 16 MB.


user113292

Don't know about libraries, but a simple way to achieve the same result might be fopening the file and parsing everything that comes after the last 'endstream'.

Try opening a PDF in a text editor; a parser shouldn't take more than five lines.
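A rough Python sketch of that idea follows. It is only an illustration of the approach described in this answer, under several assumptions: the Info dictionary sits uncompressed after the last 'endstream', its values are simple literal strings in parentheses, and the text is Latin-1. PDFs that use object streams, hex strings, UTF-16 text, or encryption will not work with it. The function name and the sample bytes are made up.

```python
import re

# Matches entries such as "/Title (Some text)", allowing backslash escapes.
INFO_ENTRY = re.compile(
    rb"/(Title|Author|Creator|Producer)\s*\(((?:\\.|[^\\)])*)\)"
)


def read_info_after_last_endstream(pdf_bytes):
    """Scan everything after the last 'endstream' for simple Info entries."""
    idx = pdf_bytes.rfind(b"endstream")
    tail = pdf_bytes[idx:] if idx != -1 else pdf_bytes

    info = {}
    for key, value in INFO_ENTRY.findall(tail):
        # Undo the escapes for parentheses and backslashes only.
        value = re.sub(rb"\\([()\\])", rb"\1", value)
        info[key.decode("ascii")] = value.decode("latin-1")
    return info


# Hypothetical, heavily simplified PDF fragment used only for the demo.
SAMPLE_PDF = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Length 5 >>\nstream\nhello\nendstream\nendobj\n"
    b"2 0 obj\n<< /Title (Quarterly Report) /Author (J. Doe \\(editor\\)) >>\nendobj\n"
    b"trailer\n<< /Info 2 0 R >>\n%%EOF\n"
)

if __name__ == "__main__":
    print(read_info_after_last_endstream(SAMPLE_PDF))
```

For a real file, the bytes would come from reading it in binary mode, for example `open(path, "rb").read()`.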

user113292


cbrandolino


PDF Parser does exactly what you want and it's pretty straightforward to use:

You can try it in the demo page.

Alessandro Cosentino

I was looking for the same thing today. And I came across a small PHP class over at http://de77.com/ that offers a quick and dirty solution. You can download the class directly. Output is UTF-8 encoded.

The creator says:

Here’s a PHP class I wrote which can be used to get title & author and a number of pages of any PDF file. It does not use any external application - just pure PHP.

For me, it works! All thanks go solely to the creator of the class ... well, maybe just a little bit of thanks to me too for finding the class ;)

maxpower9000

joan16v


ved uniyalas


You may use PDFtk to extract the page count:
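One way this might look, sketched in Python rather than PHP: PDFtk's `dump_data` operation prints a report that includes a `NumberOfPages:` line, which can be picked out with a regular expression. The function names and the sample report text are assumptions for illustration, and the sketch assumes the `pdftk` binary is on the PATH.

```python
import re
import subprocess


def parse_page_count(dump_data_text):
    """Extract the page count from the text printed by 'pdftk ... dump_data'."""
    match = re.search(r"^NumberOfPages:\s*(\d+)\s*$", dump_data_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def pdf_page_count(path):
    """Run pdftk on the given file and return its page count (or None)."""
    result = subprocess.run(
        ["pdftk", path, "dump_data"],
        capture_output=True,
        text=True,
        check=True,  # raise if pdftk exits with an error
    )
    return parse_page_count(result.stdout)


# Hypothetical excerpt of a dump_data report, used only for the demo.
SAMPLE_REPORT = """InfoBegin
InfoKey: Title
InfoValue: Quarterly Report
NumberOfPages: 12
"""

if __name__ == "__main__":
    print(parse_page_count(SAMPLE_REPORT))
```

Keeping the parsing separate from the subprocess call makes the parser easy to check without PDFtk installed.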


If ImageMagick is available you may also use:

Execute in PHP via shell_exec():
