DEV Community
Vincent A. Cicirello


Add Metadata to a PDF Using pdfLaTeX

In last week's post, I explained how LaTeX is my tool of choice for all forms of writing, and I provided a tip on using it to combine multiple pdf files into one, whether or not any of them were originally produced with LaTeX. This post offers another LaTeX-based tip for manipulating pdf files, again regardless of how they were produced: adding or changing the metadata embedded within a pdf. This may also be relevant to web developers who publish some of their site content as pdf files, since search engines crawl and index the content of pdfs, including any embedded metadata.

So you have a pdf file that you want to add metadata to (e.g., author, title, subject, keywords), and for some reason you don't have an easy way to do so from the original source of the pdf. Here's an easy way to do this with pdfLaTeX. If you don't use LaTeX, don't worry. You don't really need to know LaTeX to use this trick, and I have a repository on GitHub with a LaTeX file that you can edit with the details of the pdf and the metadata that you want to add to it. It doesn't matter how the original pdf was produced.


How to Add Metadata to a PDF with pdfLaTeX

Here are the steps to adding metadata to an existing pdf using pdfLaTeX.

Step 0: Install a LaTeX Distribution

If you don't have LaTeX installed on your system already, then you'll need to begin by installing a LaTeX distribution. For example, TeX Live is a good choice.

Step 1: Create a LaTeX Source File

Create a LaTeX source file with a .tex extension, but give it a different name than the pdf you are adding metadata to (pdfLaTeX writes its output to a pdf with the same base name as the source file, so a matching name would overwrite the original). I'll assume the name metadata.tex in the example.

In that metadata.tex file (or whatever you named it), add the following with your favorite text editor.

\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
\usepackage[pdftex,
            pdfauthor={Your name possibly with coauthors goes here},
            pdftitle={Your title goes here},
            pdfsubject={Anything you want in the subject field goes here},
            pdfkeywords={Your keywords go here},
            pdfproducer={pdflatex or whatever you want for producer},
            pdfcreator={pdflatex or whatever you want for creator}]{hyperref}
\pagestyle{empty}
\begin{document}
\includepdf[pages=-]{originalFile.pdf}
\end{document}

In the above example, we're using LaTeX's hyperref package, which has options for specifying the metadata of the pdf. The pdftex option selects hyperref's pdfTeX driver (recent versions of hyperref detect pdfLaTeX automatically, so the option is harmless but usually optional), and the remaining options set any or all of the metadata fields inside the pdf as shown above. In the statement \includepdf[pages=-]{originalFile.pdf}, make sure you change originalFile.pdf to the name of your original pdf.
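As a variation on the example above, the same metadata fields can be set with hyperref's \hypersetup command after the package is loaded, which some people find easier to read and edit than a long option list. The following is a minimal sketch, not a replacement for the original file: it omits the pdftex driver option on the assumption that your hyperref version detects pdfLaTeX automatically, and originalFile.pdf is still a placeholder for your own file.

```latex
\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
% Assumption: a recent hyperref that detects the pdfTeX driver on its own.
\usepackage{hyperref}
% Same metadata fields as in the option-list version, set after loading hyperref.
\hypersetup{
  pdfauthor={Your name possibly with coauthors goes here},
  pdftitle={Your title goes here},
  pdfsubject={Anything you want in the subject field goes here},
  pdfkeywords={Your keywords go here},
  pdfproducer={pdflatex or whatever you want for producer},
  pdfcreator={pdflatex or whatever you want for creator}
}
\pagestyle{empty}
\begin{document}
% Placeholder filename: change it to the name of your original pdf.
\includepdf[pages=-]{originalFile.pdf}
\end{document}
```

Either form should produce the same metadata, so pick whichever you prefer to maintain.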

Step 2: Run pdfLaTeX

You can now use pdfLaTeX to create a pdf with the contents of your original pdf but with your additional metadata. At the command line, in the directory containing the LaTeX source file you created above and your existing pdf, run the following (change metadata.tex to whatever filename you used above):

pdflatex metadata.tex

This will produce a pdf named metadata.pdf, which you can rename as required. Alternatively, you can name the tex file after your desired target file from the start.

Adding Metadata to a Combination of Multiple PDFs

If you want to add metadata while combining multiple pdf files into one, you can combine the above trick for the metadata with the trick from my previous post on using pdfLaTeX to combine multiple pdfs.

For example, your tex file might look something like the following:

\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
\usepackage[pdftex,
            pdfauthor={Your name possibly with coauthors goes here},
            pdftitle={Your title goes here},
            pdfsubject={Anything you want in the subject field goes here},
            pdfkeywords={Your keywords go here},
            pdfproducer={pdflatex or whatever you want for producer},
            pdfcreator={pdflatex or whatever you want for creator}]{hyperref}
\pagestyle{empty}
\begin{document}
\includepdf[pages=-]{file1.pdf}
\includepdf[pages=-]{file2.pdf}
\includepdf[pages=-]{file3.pdf}
\end{document}

The above assumes you are combining the pdfs in their entirety. You can of course also specify page ranges as needed. Finally, run the following command at the command line to generate your combined pdf with your desired metadata (just change the name of the tex file to whatever you named the file):

pdflatex metadata.tex
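If you only want parts of each file, the pages option of \includepdf accepts page ranges. Here is a short sketch of what that might look like. The filenames and page numbers are hypothetical and assume each file has at least that many pages, and only two metadata fields are set to keep the example short.

```latex
\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
\usepackage[pdfauthor={Your name possibly with coauthors goes here},
            pdftitle={Your title goes here}]{hyperref}
\pagestyle{empty}
\begin{document}
% Pages 1 through 3 of the first file.
\includepdf[pages={1-3}]{file1.pdf}
% Page 2, followed by pages 5 through 7, of the second file.
\includepdf[pages={2,5-7}]{file2.pdf}
% Page 4 through the last page of the third file.
\includepdf[pages={4-}]{file3.pdf}
\end{document}
```

Running pdflatex on this file works the same way as in the full-document case.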

GitHub Repository

To get you started, I have a GitHub repository with a LaTeX file that you can download and edit with the details of your pdf and the metadata that you want to add to it.

cicirello / add-pdf-metadata

Add metadata to a pdf using pdflatex

add-pdf-metadata

Add metadata to a pdf using pdflatex regardless of how the original pdf was produced. Here are the steps:

  1. Make sure you have an up-to-date LaTeX system installed, such as TeX Live.
  2. Read the comments in the file AddMetadataToPdf.tex.
  3. Edit the line in that file where indicated with the name of the source pdf that you want to add metadata to.
  4. Run pdflatex AddMetadataToPdf.tex at the command line, which will produce a file named AddMetadataToPdf.pdf with the contents of the original pdf file, but with the addition of your specified metadata.
  5. Change the name of the original pdf if you want to keep it as a backup, or delete the original if you don't.
  6. Rename AddMetadataToPdf.pdf to the name of the original pdf file.
  7. Alternatively, you could rename the original pdf before the above procedure, and then rename AddMetadataToPdf.tex based on how you want the…




Where You Can Find Me

Follow me here on DEV and on GitHub (cicirello), or visit my website:

Vincent A. Cicirello - Professor of Computer Science

Vincent A. Cicirello - Professor of Computer Science at Stockton University - is a researcher in artificial intelligence, evolutionary computation, swarm intelligence, and computational intelligence, with a Ph.D. in Robotics from Carnegie Mellon University. He is an ACM Senior Member, IEEE Senior Member, AAAI Life Member, EAI Distinguished Member, and SIAM Member.

cicirello.org

Top comments (2)

RJ

I'm new to LaTeX and am running into difficulty getting started with your steps using hyperref with pdfLaTeX. Any advice on what I am missing?

Vincent A. Cicirello

What kind of difficulty? Are you getting an error message? If so, what's the error?
