- Pdf File Metadata
- Bash Read Pdf Metadata
- Sharepoint Read Pdf Metadata
- Extracting Metadata From Pdf File
To save the metadata as a template, choose Save Metadata Template from the dialog box menu in the upper right corner, and name the file. The metadata is stored as a file in XMP format. (To use the saved metadata in another PDF, open the document and use these instructions to replace or append metadata in the document.) Metadata might contain the name and login of the author, the creation date, or other details of interest.
Some image files contain metadata that you can read to determine features of the image. For example, a digital photograph might contain metadata that you can read to determine the make and model of the camera used to capture the image. With GDI+, you can read existing metadata, and you can also write new metadata to image files.
GDI+ stores an individual piece of metadata in a PropertyItem object. You can read the PropertyItems property of an Image object to retrieve all the metadata from a file. The PropertyItems property returns an array of PropertyItem objects.
A PropertyItem object has the following four properties: Id, Value, Len, and Type.
Id
A tag that identifies the metadata item. Some values that can be assigned to Id are shown in the following table.
Hexadecimal value | Description |
---|---|
0x0320 | Image title |
0x010F | Equipment manufacturer |
0x0110 | Equipment model |
0x9003 | ExifDTOriginal |
0x829A | Exif exposure time |
0x5090 | Luminance table |
0x5091 | Chrominance table |
Value
An array of values. The format of the values is determined by the Type property.
Len
The length (in bytes) of the array of values pointed to by the Value property.
Type
The data type of the values in the array pointed to by the Value property. The formats indicated by the Type property values are shown in the following table.
Numeric value | Description |
---|---|
1 | A Byte |
2 | An array of Byte objects encoded as ASCII |
3 | A 16-bit integer |
4 | A 32-bit integer |
5 | An array of pairs of unsigned 32-bit integers; each pair represents a rational number (numerator, then denominator) |
6 | Not used |
7 | Undefined |
8 | Not used |
9 | SLong (a signed 32-bit integer) |
10 | SRational (a pair of signed 32-bit integers: numerator, then denominator) |
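The Type codes in the table above determine how the bytes in Value should be interpreted. The following is a minimal Python sketch of that dispatch, not GDI+ code: `decode_property_value` is a hypothetical helper, and it assumes the raw bytes are little-endian, well-formed for their type, and that rational values are stored as pairs of 32-bit integers, as in EXIF.

```python
import struct

def decode_property_value(type_code, raw):
    """Decode raw property bytes according to the Type codes in the table.

    Hypothetical helper for illustration; assumes little-endian data.
    """
    if type_code == 1:                       # Byte
        return list(raw)
    if type_code == 2:                       # ASCII, usually NUL-terminated
        return raw.split(b"\x00", 1)[0].decode("ascii")
    if type_code == 3:                       # 16-bit unsigned integers
        return list(struct.unpack("<%dH" % (len(raw) // 2), raw))
    if type_code == 4:                       # 32-bit unsigned integers
        return list(struct.unpack("<%dI" % (len(raw) // 4), raw))
    if type_code == 9:                       # SLong: 32-bit signed integers
        return list(struct.unpack("<%di" % (len(raw) // 4), raw))
    if type_code in (5, 10):                 # Rational / SRational
        fmt = "I" if type_code == 5 else "i"
        nums = struct.unpack("<%d%s" % (len(raw) // 4, fmt), raw)
        # Return (numerator, denominator) pairs; no division is performed,
        # so a zero denominator does not raise.
        return list(zip(nums[0::2], nums[1::2]))
    return raw                               # 7 (Undefined) and unused codes
```

For example, an equipment-manufacturer item (Id 0x010F, Type 2) holding `b"Canon\x00"` would decode to the string `Canon`, and an exposure-time item (Id 0x829A, Type 5) would decode to a single numerator/denominator pair.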
Example
Description
The following code example reads and displays the seven pieces of metadata in the file FakePhoto.jpg. The second (index 1) property item in the list has Id 0x010F (equipment manufacturer) and Type 2 (ASCII-encoded byte array). The code example displays the value of that property item. The code produces output similar to the following:
Code
Compiling the Code
The preceding example is designed for use with Windows Forms, and it requires PaintEventArgs e, which is a parameter of the Paint event handler. Handle the form's Paint event and paste this code into the paint event handler. You must replace FakePhoto.jpg with an image name and path valid on your system and import the System.Drawing.Imaging namespace.
I'm trying to read metadata attached to arbitrary PDFs: title, author, subject, and keywords.
Is there a PHP library, preferably open-source, that can read PDF metadata? If so, or if there isn't, how would one use the library (or lack thereof) to extract the metadata?
To be clear, I'm not interested in creating or modifying PDFs or their metadata, and I don't care about the PDF bodies. I've looked at a number of libraries, including FPDF (which everyone seems to recommend), but it appears only to be for PDF creation, not metadata extraction.
user113292
6 Answers
The Zend framework includes Zend_Pdf, which makes this really easy:
Limitations: works only on unencrypted files smaller than 16 MB.
I don't know of any libraries, but a simple way to achieve the same result might be to fopen the file and parse everything that comes after the last 'endstream'.
Try opening a PDF in a text editor; a parser shouldn't take more than five lines.
cbrandolino
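The "parse what follows the last 'endstream'" idea above can be sketched roughly as follows, here in Python rather than PHP. `read_pdf_info` is a hypothetical helper, and the sketch assumes an unencrypted PDF whose information dictionary is stored uncompressed, after the last stream, with plain literal strings in parentheses. Files that use object streams, hex strings, or UTF-16 text will not be handled, and escape sequences inside strings are returned undecoded.

```python
import re

# Keys of the document information dictionary asked about in the question.
INFO_KEYS = ("Title", "Author", "Subject", "Keywords")

def read_pdf_info(data):
    """Naively scan the bytes after the last 'endstream' for Info entries.

    Hypothetical helper; a heuristic scan, not a real PDF parser.
    """
    tail_start = data.rfind(b"endstream")
    tail = data[tail_start:] if tail_start != -1 else data
    info = {}
    for key in INFO_KEYS:
        # Match "/Key (literal string)", allowing backslash-escaped characters.
        pattern = rb"/" + key.encode("ascii") + rb"\s*\(((?:\\.|[^\\()])*)\)"
        match = re.search(pattern, tail, re.DOTALL)
        if match:
            info[key] = match.group(1).decode("latin-1")
    return info
```

To try it on a file, read the bytes first, for example `read_pdf_info(open("some.pdf", "rb").read())`, where the file name is a placeholder.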
PDF Parser does exactly what you want and it's pretty straightforward to use:
You can try it on the demo page.
Alessandro Cosentino
I was looking for the same thing today. And I came across a small PHP class over at http://de77.com/ that offers a quick and dirty solution. You can download the class directly. Output is UTF-8 encoded.
The creator says:
Here’s a PHP class I wrote which can be used to get title & author and a number of pages of any PDF file. It does not use any external application - just pure PHP.
For me, it works! All thanks go solely to the creator of the class ... well, maybe just a little bit of thanks to me too for finding the class ;)
maxpower9000
You may use PDFtk to extract the page count:
If ImageMagick is available you may also use:
Execute in PHP via shell_exec():
maxpower9000
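As a rough illustration of the PDFtk route mentioned above, the sketch below shells out to `pdftk <file> dump_data` and parses its text report. It is written in Python rather than PHP, and `parse_dump_data` and `dump_data` are hypothetical helper names. It assumes the `pdftk` binary is installed and on the PATH, and that the report uses `InfoKey:` / `InfoValue:` line pairs and a `NumberOfPages:` line.

```python
import subprocess

def parse_dump_data(text):
    """Parse the text printed by `pdftk <file> dump_data` (hypothetical helper)."""
    meta = {"info": {}}
    pending_key = None
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue                      # lines such as "InfoBegin"
        value = value.strip()
        if name == "InfoKey":
            pending_key = value           # remember the key until its value arrives
        elif name == "InfoValue" and pending_key is not None:
            meta["info"][pending_key] = value
            pending_key = None
        elif name == "NumberOfPages":
            meta["pages"] = int(value)
    return meta

def dump_data(path):
    # Assumes pdftk is installed; raises CalledProcessError if it fails.
    result = subprocess.run(["pdftk", path, "dump_data"],
                            capture_output=True, text=True, check=True)
    return parse_dump_data(result.stdout)
```

Keeping the parsing separate from the subprocess call means the parser can be checked against a saved report without PDFtk being installed.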