Title: PDF Metadata Mapping
Last modified: August 21, 2016

---

# PDF Metadata Mapping

 *  Resolved [kevinfraser](https://wordpress.org/support/users/kevinfraser/)
 * (@kevinfraser)
 * [13 years ago](https://wordpress.org/support/topic/pdf-metadata-mapping/)
 * David, I love the plugin and it looks very promising, but I seem to have a problem.
 * I either am deficient in understanding how to use the feature, or there is some
   other reason it doesn’t work as the documentation describes. I have carefully
   read and re-read all the documentation I could find that seems relevant. I have
   tried it with a new wordpress installation with only the MLA plugin enabled to
   eliminate any kind of plugin interaction possibility.
 * I have a collection of only PDF files, all authored “purely” in the latest version
   of Adobe Acrobat Pro CS6, so I think the PDF files themselves are as canonical
   as possible. Each has had metadata input using Acrobat. I have confirmed that
   Acrobat shows this metadata both in the first Acrobat ‘Document Info’ screen,
   and echoed to its ‘Additional Metadata’ button/tabs that are supposed to somehow
   use XML to handle both EXIF and IPTC fields. I have also verified the presence
   of the metadata using the independent PDFtk command line tool (q.v.:)
    ( [http://www.pdflabs.com/docs/pdftk-man-page](http://www.pdflabs.com/docs/pdftk-man-page))
 * My point is the metadata is filled in everywhere I can find that it should be
   and I can demonstrate it’s in these PDF files somewhere!
 * I cannot FOR THE LIFE of me figure out any mode or method to get MLA to see
   any of that metadata in any of those PDF attachments, either during upload or
   when using the MLA IPTC/EXIF tab to map it to any standard WordPress field,
   Category, or Tag field.
 * I am intrigued by the ALL_EXIF and ALL_IPTC fields, but I can’t figure out from
   the documentation how I would use them to see this metadata in the PDF files:
   an example expressed as an [MLA-GALLERY] shortcode would be most appreciated.
 * SUMMARY: Demonstrably populated EXIF and/or IPTC metadata fields in a latest-version
   Adobe-created PDF file, but I am seemingly unable to see that metadata using
   the MLA plugin with any method I’ve discovered (so far).
 * Any ideas? I’m looking forward to slapping myself in the forehead over something
   cravenly simple.
 * [http://wordpress.org/extend/plugins/media-library-assistant/](http://wordpress.org/extend/plugins/media-library-assistant/)
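
For readers who want to repeat the PDFtk check described in this post: PDFtk's `dump_data` operation prints the PDF's Info dictionary as `InfoKey:` / `InfoValue:` line pairs. The sketch below, in Python, turns that text into a dictionary so the fields can be compared against what a plugin reports. The sample text is invented for illustration, and the helper name `parse_pdftk_info` is hypothetical. Note that this covers only the Info dictionary, which is stored separately from any XMP packet in the file.

```python
def parse_pdftk_info(dump_text: str) -> dict:
    """Collect InfoKey/InfoValue pairs from `pdftk <file> dump_data` output."""
    info = {}
    key = None
    for line in dump_text.splitlines():
        if line.startswith("InfoKey:"):
            # Remember the key until its matching value line arrives.
            key = line[len("InfoKey:"):].strip()
        elif line.startswith("InfoValue:") and key is not None:
            info[key] = line[len("InfoValue:"):].strip()
            key = None
    return info


# Hypothetical sample in the dump_data layout; not taken from a real file.
SAMPLE_DUMP = """\
InfoBegin
InfoKey: Title
InfoValue: Annual Report
InfoBegin
InfoKey: Keywords
InfoValue: finance; 2013; report
NumberOfPages: 12
"""

if __name__ == "__main__":
    for name, value in parse_pdftk_info(SAMPLE_DUMP).items():
        print(f"{name}: {value}")
```

Lines that are not part of an Info pair, such as `NumberOfPages`, are ignored, so the parser tolerates the rest of the dump.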

Viewing 8 replies - 1 through 8 (of 8 total)

 *  Thread Starter [kevinfraser](https://wordpress.org/support/users/kevinfraser/)
 * (@kevinfraser)
 * [13 years ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3837807)
 * An idea occurred to me that WordPress may not be at the bleeding edge of PDF 
   authoring, so I tried converting one of the files to the Archivable (“/A”) Acrobat
   5 PDF 1.4 format. I verified the presence of the metadata, via Acrobat and also
   the PDFtk command line tool. Same result: no apparent visibility of any metadata
   in the PDF files via the MLA plugin. Is there a way you know of to test the WordPress
   image meta function? That is probably the first place to look. Please let me 
   know if you would like a test PDF file from me.
 *  Plugin Author [David Lingren](https://wordpress.org/support/users/dglingren/)
 * (@dglingren)
 * [13 years ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3837961)
 * Thank you for your interest in the plugin and for posting this question. As you
   suspected, WordPress isn’t up to speed on PDF authoring, and MLA hasn’t addressed
   this issue either.
 * Neither WordPress nor MLA looks for metadata in anything other than image files.
   WordPress 3.6 will extend this to audio and video files, but I haven’t seen anything
   about applying this functionality to PDF files.
 * This is a great idea for an MLA enhancement, and I am finishing up work on a 
   new version. I will have a serious look at how to extract metadata from PDF files
   and make it available in MLA.
 * I will keep you posted on my progress, adding comments to this topic. If you’d
   like to test a pre-release version, send me your e-mail address, using the “Contact
   Us” page at our web site:
 * [Fair Trade Judaica/Contact Us](http://fairtradejudaica.org/our-story/contact-us/)
 * Thanks for your interest, a great suggestion and your patience.
 *  Thread Starter [kevinfraser](https://wordpress.org/support/users/kevinfraser/)
 * (@kevinfraser)
 * [13 years ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3837963)
 * David, thank you for your quick reply!
 * I have also been trying another plugin called MMWW
 * ([http://wordpress.org/plugins/mmww](http://wordpress.org/plugins/mmww))
 * which is apparently also being hampered by WordPress’s (in)ability to find all
   relevant XMP metadata in all PDF files. I got a near-instant reply from MMWW’s
   developer this morning, sent him some test files, and he’s having a look at it.
   He has licensed the Zend_PDF module to redistribute with his plugin. I don’t
   know what your roadmap is, but if you haven’t seen it, maybe this will offer
   some clues?
    ( [http://framework.zend.com/manual/1.12/en/zend.pdf.info.html](http://framework.zend.com/manual/1.12/en/zend.pdf.info.html))
   Perhaps it brings a solution closer and sooner?
 * Ollie is running into exactly the same problem – incomplete metadata retrieval
   from PDFs. Perhaps the two of you can share a solution to this problem?
 * Thanks for your pre-release offer — I’ll drop you my email address via your contact
   link and I’ll be happy to send you real world test files too if you need them.
 *  [OllieJones](https://wordpress.org/support/users/olliejones/)
 * (@olliejones)
 * [13 years ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3837984)
 * Kevin contacted me with a similar problem retrieving XMP metadata from his pdf
   files.
 * It turns out that some PDF files contain _multiple_ XMP metadata stanzas. (Each
   metadata stanza is a well-formed chunk of XML.)
 * The first one is pretty much a stub, but later ones contain the useful metadata.
   That accounts for the failure of [MMWW](http://wordpress.org/plugins/mmww/) (my
   plugin) to read his metadata correctly. My version 1.0.2 corrects it.
 * Don’t hesitate to steal my open-source code if you want it. It doesn’t use the
   Zend framework, but it does use PHP’s XML handling.
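
The multiple-stanza behaviour described above can be illustrated with a short sketch. This is not MMWW's code (that plugin is written in PHP). It is a Python illustration that assumes the XMP packets are stored uncompressed and UTF-8 encoded in the file. It collects every `<x:xmpmeta>` packet and then picks the one with the most XML elements. That selection rule is an assumption made here so a stub packet loses to a populated one; always taking the last packet would be another option. The sample bytes and all function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: packets are stored uncompressed and UTF-8 encoded in the file.
XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def find_xmp_packets(data: bytes) -> list:
    """Return every XMP packet found in the raw file bytes, in file order."""
    return XMP_PACKET.findall(data)


def count_elements(packet: bytes) -> int:
    """Count XML elements in a packet; malformed packets score zero."""
    try:
        return sum(1 for _ in ET.fromstring(packet).iter())
    except ET.ParseError:
        return 0


def richest_packet(data: bytes):
    """Pick the packet with the most elements (a heuristic, not a standard)."""
    packets = find_xmp_packets(data)
    return max(packets, key=count_elements) if packets else None


# Hypothetical file contents: a stub packet followed by a populated one.
RDF_NS = b'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
STUB = b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><rdf:RDF ' + RDF_NS + b'/></x:xmpmeta>'
FULL = (
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><rdf:RDF ' + RDF_NS + b'>'
    b'<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b'<dc:title>Report</dc:title></rdf:Description></rdf:RDF></x:xmpmeta>'
)
SAMPLE_PDF = b"%PDF-1.4\n" + STUB + b"\nother objects\n" + FULL + b"\n%%EOF\n"

if __name__ == "__main__":
    print(len(find_xmp_packets(SAMPLE_PDF)), "packets found")
    print(richest_packet(SAMPLE_PDF).decode("utf-8"))
```

A reader that stops at the first packet would see only the stub here, which matches the symptom reported in this thread.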
 *  Thread Starter [kevinfraser](https://wordpress.org/support/users/kevinfraser/)
 * (@kevinfraser)
 * [13 years ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3837992)
 * This is just cool. Thanks, Ollie! I can’t wait to see what David does with your
   solution, too!
 *  Plugin Author [David Lingren](https://wordpress.org/support/users/dglingren/)
 * (@dglingren)
 * [12 years, 12 months ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3838014)
 * Ollie,
 * I’ve begun looking into adding PDF metadata support, and this will be a great
   help. Thank you for your offer and your generosity!
 * I will post my progress to this topic.
 *  Thread Starter [kevinfraser](https://wordpress.org/support/users/kevinfraser/)
 * (@kevinfraser)
 * [12 years, 12 months ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3838015)
 * David, Ollie’s code to pull metadata strings from XMP in a PDF definitely works!
   Haven’t tested it on anything except PDFs yet, but from looking at it I suspect
   it will also pull metadata from anything containing XMP, including humungous 
   media files — without causing server memory issues. Good approach to that, Ollie!
   Theoretically that will cover every supported media type that Adobe CS6 applications
   can save.
 * One little hack I already stuck in was to locate where Ollie formed the list 
   of returned XMP keywords/tags and change the delimiter from a semicolon to a 
   comma, which at least makes drag and drop input of those tags as a WP Media Library
   item easier.
 * It would really help me if I could just have those tags shoot right into the 
   WP db automagically on import, but this at least gets me closer!
 * Thank you both!
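
Two ideas from this post can be sketched in Python: scanning a very large file for XMP packets without loading it all into memory, and converting a semicolon-delimited keyword string to a comma-delimited one. This is not Ollie's code, and how MMWW actually bounds its memory use is not stated in the thread. The chunked scan below is one common approach, assuming uncompressed packets, and the names `iter_xmp_packets` and `keywords_to_commas` are hypothetical.

```python
import io

START = b"<x:xmpmeta"
END = b"</x:xmpmeta>"


def iter_xmp_packets(stream, chunk_size=64 * 1024):
    """Yield each XMP packet from a binary stream using a small rolling buffer."""
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        while True:
            start = buf.find(START)
            if start == -1:
                # Keep only a tail long enough to hold a split start marker.
                buf = buf[-(len(START) - 1):]
                break
            end = buf.find(END, start)
            if end == -1:
                # Packet is incomplete: drop the bytes before it and read more.
                buf = buf[start:]
                break
            end += len(END)
            yield buf[start:end]
            buf = buf[end:]


def keywords_to_commas(keywords: str) -> str:
    """Turn 'a; b; c' into 'a, b, c', dropping empty entries."""
    parts = [k.strip() for k in keywords.split(";")]
    return ", ".join(k for k in parts if k)


if __name__ == "__main__":
    # Hypothetical data: two tiny packets surrounded by filler bytes.
    p1 = START + b">first" + END
    p2 = START + b">second" + END
    data = b"A" * 100 + p1 + b"B" * 50 + p2 + b"C" * 10
    # A tiny chunk size forces the markers to straddle chunk boundaries.
    for packet in iter_xmp_packets(io.BytesIO(data), chunk_size=7):
        print(packet)
    print(keywords_to_commas("finance; 2013; report"))
```

The buffer grows only while a packet is being read, so memory use tracks the size of the largest packet, not the size of the media file.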
 *  Plugin Author [David Lingren](https://wordpress.org/support/users/dglingren/)
 * (@dglingren)
 * [12 years, 8 months ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3838178)
 * I have released version 1.50, which adds the ability to extract metadata from
   PDF documents.
 * Please let me know if you have any problems with or further questions about this
   new feature. Thanks again for your interest and for your suggestion.


The topic ‘PDF Metadata Mapping’ is closed to new replies.


## Tags

 * [metadata](https://wordpress.org/support/topic-tag/metadata/)
 * [mla](https://wordpress.org/support/topic-tag/mla/)
 * [pdf](https://wordpress.org/support/topic-tag/pdf/)

 * 8 replies
 * 3 participants
 * Last reply from: [David Lingren](https://wordpress.org/support/users/dglingren/)
 * Last activity: [12 years, 8 months ago](https://wordpress.org/support/topic/pdf-metadata-mapping/#post-3838178)
 * Status: resolved