Writing Complex Tables in Stata

Over at scatterplot, I asked for help with Stata code for automating descriptive tables. I did not fully solve the problem, but I am posting here a sample of the code that worked because, when I searched for help, all I could find were partial syntax outlines that did not give enough information to help me avoid pitfalls. The particular feature of this problem is that it is non-standard: each row and column is a different subset of the data, not a different variable or a different statistic on the same variable, and each table is a different subset of the data.

This code puts each table on a separate page (page breaks between them). Word or WordPerfect can parse the output and turn it into tables with the “convert text to table” option available in both programs. (For some reason, however, Excel did not properly parse the page breaks, viewing them as unrecognized characters. In any event, a spreadsheet is less desirable as a target for this application because spreadsheets treat all columns the same, and in this application there are different numbers of columns and the columns have different widths in each table.)

I have not successfully generated a macro in the word processor to mark and convert the tables. My attempts to do this with “record macro” in both word processors were unsuccessful because the recorded macro did not work the way the program worked during recording. If you know how to write a Word or WordPerfect macro that will select a block of comma-delimited text with varying numbers of rows and columns and convert it to a table with the option to resize columns to fit contents and then right-justify the columns (or, better, right-justify the rightmost n of the columns while left-justifying the leftmost m of the columns), please drop a comment here.
(For you techies out there, I’ll tell you that LaTeX does not appear to be a particularly good solution for this kind of problem because LaTeX does not automatically wrap text in a table unless you pre-specify how wide you want the columns to be. Of course, there may be an add-on out there somewhere that would do it.) Below is a condensed version of the code that worked to generate the tables:

** code to create string variables so the tables will be self-labeling (some are already string)
decode offtype, gen(offtype_t)  // creates a string variable from the labels
levelsof county, local(ctylist)  // this syntax creates a list of all counties (county is a string variable)

** _n is a new line and _page is a new page; this first section basically just labels the output
file open outtables using agencytables.doc, write replace
file write outtables "Breakdown of arrests by agencies within counties." _n
file write outtables "Insert explanatory material here. This will be the first page of the report." _n
file write outtables _n
file write outtables _page

preserve  // save the full data first, because keep if below drops every other county
foreach c in `ctylist' {  // list generated above
    restore, preserve  // reload the full data before subsetting to the next county
    keep if county=="`c'"  // one county at a time

    levelsof offtype_t, local(offlist)  // offense is inside county (does not have to be as it is the same for all)
    levelsof agency, local(aglist)  // agency is inside county: different aglist for each county, string variable
    foreach age in A J {
        local A Adult  // this makes the syntax ``age'' refer to these when the letter is used: NOTE DOUBLE GRAVE TO OPEN AND DOUBLE APOSTROPHE TO CLOSE
        local J Juvenile  // ditto
        file write outtables "`c' County: ``age'' arrests by agency and racial group compared to population" _n  // table caption; DOUBLED MACRO MARKS AGAIN
        file write outtables "Race,Variable,Offense"  // column headers for the IVs that will vary; note no comma at the end
        foreach ag in `aglist' {
            file write outtables "," "`ag'"  // the rest of the column headers are agency names; note comma before
        }  // agency headers
        file write outtables _n  // close agency headers with a new line
        * now do variable loops
        foreach r in White_Hisp Black Native Asian {  // spelling it out controls the order

            * the first five lines are omitted as the next five are more complex and show the strategy better
            * line 6 % of pop
            file write outtables "`r'" "," "% of Pop" "," " "  // these are the string contents of the first three columns, third is blank for population
            qui summ pop00 if race=="`r'" & adultorjuv=="`age'"  // county population
            local totpop = r(mean)  // county pop is merged into all city records & is the same for all
            foreach ag in `aglist' {
                qui summ popcit if race=="`r'" & adultorjuv=="`age'" & agency=="`ag'"  // city population differs by agency
                local citpop = r(mean)
                local val = `citpop'/`totpop'  // proportion city pop is of county pop
                * note that numeric values have to be in parentheses and can be preceded by a format;
                * also note comma before entry is needed (if commas follow entries you end up with an empty column)
                file write outtables "," %4.2f (`val')
            }  // agency
            file write outtables _n  // close agency with new line

            * lines 7-10 % of arrests
            foreach off in `offlist' {
                file write outtables "`r'" "," "% of Arrests" "," "`off'"
                qui summ arrests if race=="`r'" & adultorjuv=="`age'" & offtype_t=="`off'"
                local ctot = r(sum)  // total arrests across all agencies in the county
                foreach ag in `aglist' {
                    qui summ arrests if race=="`r'" & adultorjuv=="`age'" & offtype_t=="`off'" & agency=="`ag'"
                    local atot = r(mean)
                    local apct = `atot'/`ctot'
                    file write outtables "," %4.2f (`apct')
                }  // agency
                file write outtables _n  // close agency
            }  // offense
        }  // race
        file write outtables _page  // so each table will start on a new page
    }  // age
}  // county

file close outtables
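
For anyone who wants to try the pattern without my arrest data, here is a minimal sketch of the same comma-before-entry approach. It assumes Stata's built-in auto dataset; the file name demo_table.txt, the handle demo, and the string variable origin are made up for illustration and are not part of the code above.

```stata
* Sketch only: same write-a-row-per-subset pattern, on the built-in auto data
sysuse auto, clear
decode foreign, gen(origin)          // string variable from the value labels
levelsof origin, local(originlist)   // row subsets
levelsof rep78, local(replist)       // column subsets (missing values are skipped)

file open demo using demo_table.txt, write replace
file write demo "Origin"             // first header; no comma at the end
foreach rep of local replist {
    file write demo "," "rep78=`rep'"    // comma goes before each later header
}
file write demo _n                   // close the header row

foreach o of local originlist {
    file write demo "`o'"            // row label
    foreach rep of local replist {
        qui summ mpg if origin=="`o'" & rep78==`rep'
        file write demo "," %5.1f (r(mean))   // numeric result in parentheses, format in front
    }
    file write demo _n               // close the row
}
file close demo
type demo_table.txt                  // show the comma-delimited result in the Results window
```

Cells with no cars in them should come out as a missing value rather than stopping the loop, which is the same behavior the agency loops above rely on.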

NOTES: (1) This code assumes you know how to use macros, which are delimited by ` and '. (Those should be a grave accent and a straight apostrophe. I have had a devil of a time because my default style sheet keeps changing the characters, and at one point I had to add spaces in the code to keep the HTML editor from turning them into double quotes.) (2) To learn how to use the file open, file write, file close etc. commands in Stata, you have to search on “file write” or “file open” etc. If you just search on “write” or “write tables” you will not find the necessary help files. (3) It would be easy to include special characters at the beginning and end of each table to be used to select the tables, if you can figure out how to write macros in the target program to use those characters to select the beginning and end of the tables. (4) EDIT: I had a LOT of trouble getting the opening grave and closing apostrophe around macros to print properly in my default stylesheet. After repeated edits I think I have them so they mostly show right. But if quote marks are facing the wrong way, that’s the problem.
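
To illustrate notes (1) and (3) together, here is a rough sketch, not something I have used in production. The marker strings @@TABLE_START@@ and @@TABLE_END@@, the handle demo, and the file name marker_demo.txt are all hypothetical; any string that never occurs in your data would do.

```stata
* Sketch only: wrap each table in marker lines a word-processor macro could search for
local A Adult
local J Juvenile
file open demo using marker_demo.txt, write replace
foreach age in A J {
    file write demo "@@TABLE_START@@" _n            // hypothetical start marker
    file write demo "``age'' arrests by agency" _n  // `age' is A or J; ``age'' is Adult or Juvenile
    file write demo "Race,Variable,Offense" _n      // table rows would go here
    file write demo "@@TABLE_END@@" _n              // hypothetical end marker
    file write demo _page                           // next table starts on a new page
}
file close demo
```

The doubled macro marks work because Stata expands the inner `age' first, and then expands the result as the name of another local.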


Author: olderwoman

I'm a sociology professor but not only a sociology professor. It isn't hard to figure out my real name if you want to, but I keep it out of this blog because I don't want my name associated with it in a Google search. Although I never write anything in a public forum like a blog that I'd be ashamed to have associated with my name (and you shouldn't either!), it is illegal for me to use my position as a public employee to advance my religious or political views, and the pseudonym helps to preserve the distinction between my public and private identities. The pseudonym also helps to protect the people I may write about in describing public or semi-public events I've been involved with.

