graphs on the web

I recently asked for advice about the best format for posting a lot of graphs on a web site so they could be either read on line or downloaded and printed.  In case this information is more generally useful, I’m summarizing what I learned here.  A well-done graph can convey information much more clearly to the public than words or numbers.  Laying out a graph so it tells a story is hard enough.  But an added complication is that graphs that are most appealing on a computer screen or slide show can be illegible when printed and photocopied.  Lines or bars of different colors and even monochrome patterns or gray shades can all look alike when photocopied.  Paper really needs white backgrounds: even a light gray can hide graphical elements after something is photocopied a few times.  It is a lot of work to produce legible graphs, and having to do everything twice is really frustrating.  So the goal was to produce things once in a format that could either be viewed on line or be downloaded and printed and would look decent either way.

I decided Adobe Acrobat PDFs would be best as they are readable on the web but also downloadable and printable. I produced the graphs in Stata.  They were of two types: (1) Simple line graphs.  The trick here is to use markers and line patterns rather than color to distinguish the lines.   (2) Complex line graphs plotting trends for 30 states on the same axes.  The trick here is to use state initials as markers on the lines.  They are most distinguishable in color, but are legible when printed in monochrome.
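
For readers who script their graphs, both tricks can be sketched in a few lines. This is only an illustration in Python, not what I did in Stata: it writes a monochrome PostScript page by hand, giving each line its own dash pattern and printing a short text label (standing in for state initials) at every data point. The labels "AA" and "BB", the data values, the dash patterns, and the page coordinates are all made up for the example, and it assumes labels contain no parentheses or backslashes, which would need escaping in PostScript.

```python
# Sketch: monochrome trend lines in PostScript, told apart by dash pattern
# and by text markers. All data and layout numbers here are hypothetical.

# Dash arrays for PostScript's setdash operator; "[] 0" means a solid line.
DASH_PATTERNS = ["[] 0", "[6 3] 0", "[2 2] 0", "[8 3 2 3] 0"]


def to_page(points, x_range, y_range, box=(72, 72, 540, 432)):
    """Scale (x, y) data points into a plotting box measured in PostScript points."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    left, bottom, right, top = box
    return [
        (left + (x - x_min) / (x_max - x_min) * (right - left),
         bottom + (y - y_min) / (y_max - y_min) * (top - bottom))
        for x, y in points
    ]


def series_to_ps(label, points, dash):
    """Return PostScript commands for one line, with its label drawn at each point."""
    commands = [f"{dash} setdash", "newpath"]
    for i, (px, py) in enumerate(points):
        op = "moveto" if i == 0 else "lineto"
        commands.append(f"{px:.1f} {py:.1f} {op}")
    commands.append("stroke")
    for px, py in points:
        # Draw the label slightly above the data point so it acts as a marker.
        commands.append(f"{px:.1f} {py + 3:.1f} moveto ({label}) show")
    return commands


def graph_to_ps(series, x_range, y_range):
    """Build a one-page PostScript document; nothing is painted behind the lines,
    so the background stays white."""
    out = ["%!PS", "0 setgray", "1 setlinewidth",
           "/Helvetica findfont 9 scalefont setfont"]
    for i, (label, points) in enumerate(series.items()):
        dash = DASH_PATTERNS[i % len(DASH_PATTERNS)]
        out.extend(series_to_ps(label, to_page(points, x_range, y_range), dash))
    out.append("showpage")
    return "\n".join(out) + "\n"


# Made-up example data: two series measured in three years.
example = {
    "AA": [(1990, 10.0), (1995, 14.0), (2000, 18.0)],
    "BB": [(1990, 6.0), (1995, 7.5), (2000, 9.0)],
}
ps_text = graph_to_ps(example, x_range=(1990, 2000), y_range=(0, 20))
```

Writing `ps_text` to a file ending in `.ps` gives something a PostScript-to-PDF converter should accept. With 30 lines the four dash patterns repeat, so the text labels do most of the distinguishing, which matches the approach for the complex graphs.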

Getting the graphs from Stata into PDF took some tricks.  The suggestion to have Stata generate png (Portable Network Graphics) graphs did not work well.  Stata-produced png files were too fuzzy for the complex graphs to be legible. Also, all png graphs produced by Stata included a blur of gray around every graphical element. The blur just looked a little fuzzy when the graph was imported into Adobe, but it printed as black when the PDF was printed in monochrome, making the graph completely illegible. I ended up producing the PDF files in two ways (for the two types of graphs): (1) For the simple graphs, I generated them as wmf (Windows Metafile) files in Stata, then used my photo manager software (I use ACDSee, which I originally got as a demo with some other software) & Adobe Distiller to batch print the wmf files to PDF.  I printed the graphs two to a page as “contact sheets.” (2) For the complex color graphs, it worked best to have Stata produce ps (PostScript) files that Adobe could import directly as pages. As the ps files cannot be viewed or checked on screen, I saved all these graphs in two formats, so I could both preview and check them and then assemble them into a PDF.

Lessons to remember: (1) Public sociology stuff, if it is used at all, tends to get passed around and photocopied. Always test how something looks when printed in low-resolution monochrome. (2) Producing something that will work on both the web and paper is a set of trade-offs.  If you think about how to make graphs work for both formats from the beginning, it goes better. (3) The different graphics formats work differently in different software, and different types of graphs seemed to do better with different formats.  If you are going to do a big job, test single graphs through the full production cycle before generating all of them. (4) The more recent versions of Adobe Acrobat Pro have capabilities that make these jobs easier, including the ability to assemble a document from multiple files (graphics files plus PDFs with documentation produced by a word processor) and to automatically add page numbers, headers & footers or cover sheets to documents in “batch” mode.  What I did not do, but tested later and found would work, is to assemble a document of graphs in Adobe Acrobat Pro and then, from inside Acrobat, print it to PDF (using the Distiller) with multiple pages per sheet.  (5) Importing something directly into Adobe gives a different result from printing to it with the Distiller.  (6) I’m still producing manuscripts and dull static web pages, not pretty magazine-like layouts.  I’ll learn those skills some other year.
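
Lesson (1) can be partly checked before anything is printed. The sketch below, in Python, converts each color in a palette to an approximate gray level and flags pairs that would come out nearly the same shade. It uses the common Rec. 601 luma weights (0.299, 0.587, 0.114); the palette and the `min_gap` threshold are arbitrary choices for illustration, and a real printer or photocopier will behave differently, so this is no substitute for an actual test print.

```python
from itertools import combinations


def gray_level(rgb):
    """Approximate gray value (0-255) of an (R, G, B) color, using Rec. 601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def hard_to_tell_apart(colors, min_gap=40):
    """Return pairs of named colors whose gray levels differ by less than min_gap.

    min_gap is an illustrative threshold, not a standard.
    """
    clashes = []
    for (name_a, rgb_a), (name_b, rgb_b) in combinations(colors.items(), 2):
        if abs(gray_level(rgb_a) - gray_level(rgb_b)) < min_gap:
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical palette of line colors for a graph.
palette = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "black": (0, 0, 0),
}
clashes = hard_to_tell_apart(palette)
print(clashes)
```

With this palette, red and green are flagged because they land on almost the same gray, and so are blue and black. That is the situation where markers or line patterns have to carry the distinction instead of color.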

I’m not sure anyone will ever want to use the information, but it is up there now.  It is a bunch of graphs on racial trends in imprisonment.  I’d appreciate feedback on content or format, by email or comment.  If you want to see it and have not been able to figure out who I am, drop a comment and I’ll tell you by email.  (FYI I plan to drop the pretense of anonymity after a certain report is released in a few months.)


Author: olderwoman

I'm a sociology professor but not only a sociology professor. It isn't hard to figure out my real name if you want to, but I keep it out of this blog because I don't want my name associated with it in a Google search. Although I never write anything in a public forum like a blog that I'd be ashamed to have associated with my name (and you shouldn't either!), it is illegal for me to use my position as a public employee to advance my religious or political views, and the pseudonym helps to preserve the distinction between my public and private identities. The pseudonym also helps to protect the people I may write about in describing public or semi-public events I've been involved with.

7 thoughts on “graphs on the web”

  1. I appreciate learning from your experience.

    Your story points to the paucity of affordable software that generates high-quality academic graphs. I have heard of high-end packages; reviews don’t sound promising – and very small teaching colleges such as the one where I work are unlikely to purchase it anyway.

    Good graphing software would have a much smaller audience than word processing — but I wish someone in the OpenSource world would develop a passion for high quality graphics software.

  2. I use the emf (Enhanced MetaFile) rather than wmf format. Both have the advantage of being easily imported into editors like Illustrator for further tweaking. I agree that png in Stata is fuzzy and I don’t know enough about graphics to know why that doesn’t work better.

  3. “Your story points to the paucity of affordable software that generates high-quality academic graphs.”

    R is free and can be used to produce very high quality graphs of data. There’s also gnuplot.

  4. E: Is Adobe out of your school’s price range?
    J and K: But you still have to get the graphs into a document or web site. This can be a pain unless you can produce PDFs.
    Because he mostly writes math, my son uses an open source program (probably an open source version of LaTeX, but I forget which) that can generate PDFs. I talked to him about it when I was trying to figure out what would work for my problem, but he told me that it is awkward to import graphics into it, so it did not seem worth the bother of trying to learn it.

    Because I like Stata and its large user community, have invested in learning to use it, and can afford it (it is quite affordable under university site license agreements) I’m not switching, but if I were young and broke I’d invest the time to move into the open source community. However, those of you in that community should remember the learning-curve issues for newbies. If you are just trying to get one job done on a deadline, point and click is the way to go.

  5. There are several separate things here.

    – If you want a simple point and click application that is good for occasionally producing graphs of data, there are a number of free or cheap options available, like Plot for the Mac.

    – R produces PDFs directly, without any need for conversion programs or what have you. (It can output most other graphics formats as well.) Is it really true that Stata can’t produce PDF files?

    – Learning curves: Stata’s learning curve does not seem to me a whole lot flatter than R’s.

  6. Stata’s documentation says it produces PDFs, but apparently not under Windows, per the error message. Re learning curves, I agree Stata has one and that it is particularly steep for Stata graphics, although what I liked when I was learning Stata 10 years ago was that the on-line help and the web user community were (and still are) really helpful. I also learned a lot from a couple of grad students who worked for me. The point is that I’ve already learned Stata and I don’t want to start over. I’m not anti-R; I said that if I were just starting out, I’d probably go to it. But I’m not just starting out.

    Sounds like someone (not me) could/should pull together a set of recommendations for folks like ebogue.
