


The data for this study were collected in two parts. Much of the study is based on research conducted originally by other people or organizations. Other research, particularly in the content analysis, is original work conducted specifically for this report.

For the data aggregated from other researchers, the Project took several steps. First, we tried to determine what data had been collected, and by whom, for the eight media sectors studied. We organized the data into the six primary areas of interest we wanted to examine: content, audience, economics, ownership, newsroom investment and public attitudes. For all data ultimately used, the Project sought and gained permission for its use. Next, the Project studied the data closely to determine where data reinforced each other and where there were apparent contradictions or gaps. In doing so, the Project endeavored to determine the value and validity of each data set. In many cases this involved going back to the sources who collected the research in the first place. Where data conflicted, we have included all relevant sources and tried to explain their differences, either in footnotes or in the narratives. For instance, differences in reported online news usage are likely explained by how survey questions were phrased: there are substantial gaps depending on whether a survey asks whether one ever goes online for news, has done so in the last month, or has done so in the last week.

In analyzing the data for each media sector, we sought insight from experts by having at least three outside readers for each sector chapter. These readers raised questions, offered arguments and questioned data where they saw fit. In a few cases, we sought the advice of our statistical research team at the University of Missouri School of Journalism.

All sources are cited in footnotes or within the narrative, and listed alphabetically in a source bibliography. The data used in the report are also available in more complete tabular form to any users online, where they can view the raw data, sort it on their own and make their own charts and graphs. Our goal was not only to organize the available material into a clear narrative, but also to collect all the public data on journalism in one usable place. In many cases, this required the Project to purchase the use of the data.

For the original content analysis research conducted by the Project, the methodology follows.

Content Analysis Methodology

Sampling and Inclusion

Two distinct categories of media were studied as part of the Project’s 2004 Annual Report on the State of the News Media.

The first, Text-Based Media, included Newspapers, News Magazines, and Internet News Sites. Princeton Survey Research Associates International conducted coding for these media.

The second, Electronic Media, included both Broadcast Network and Cable Network News. ADT Research, publisher of the Tyndall Report, conducted coding for these media.

Each subcategory of media was subject to a specific methodological approach regarding sampling and selection.

Text-Based Media


Newspaper Selection

Individual newspapers were selected to present a meaningful assessment of content widely available to the public. Selections were made on both a geographic and a demographic basis, as well as for diversity of ownership.

First, newspapers were divided into four groups based on daily circulation: more than 750,000; 300,001 to 750,000; 100,001 to 300,000; and 100,000 and under.

For newspapers over 750,000, we selected four papers: USA Today, The Los Angeles Times, The New York Times and The Washington Post. (The Wall Street Journal, which also falls in this circulation category, was excluded as a specialty publication.)

Four newspapers were chosen in each of the remaining three categories. To ensure geographical diversity, each of the four newspapers within a circulation category was selected from a different geographic region of the United States. Regions were defined according to the parameters established by the U.S. Census Bureau.1

The newspapers in circulation groups 2, 3 and 4 were selected through the following process: First, using the Editor and Publisher Yearbook, a list of every daily newspaper in the United States was created. Each newspaper was assigned a random number. After resorting the list by the random number, newspapers were chosen by going down the list until all slots were filled. To be eligible for selection, a newspaper (a) must have a Sunday section, (b) must have its stories indexed in a news database in order to be available to coders and (c) must not be a tabloid. Newspapers not meeting these criteria were skipped over. In addition, an effort was made to ensure a range of owners was included. Those selected were:

Circulation Group 1
The Los Angeles Times
The New York Times
USA Today
The Washington Post

Circulation Group 2
The Arizona Republic (Phoenix)
The Boston Globe
The Detroit Free Press2
The St. Petersburg Times

Circulation Group 3
The Hartford Courant
The Kansas City Star
The Knoxville (Tenn.) News Sentinel
The Las Vegas Review-Journal

Circulation Group 4
The Albany (N.Y.) Times Union
The Corpus Christi (Texas) Caller-Times
The Modesto (Calif.) Bee
The Rockford (Ill.) Register
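
The randomized, eligibility-screened selection procedure described above can be illustrated with a minimal Python sketch. The data structure and field names here are hypothetical; the Project did not publish code:

    import random

    def select_newspapers(papers, regions_needed, slots=4, seed=None):
        """Shuffle the full newspaper list (equivalent to assigning each
        paper a random number and resorting), then walk down the list,
        filling one slot per region with eligible papers only."""
        rng = random.Random(seed)
        ordered = papers[:]
        rng.shuffle(ordered)

        chosen, open_regions = [], set(regions_needed)
        for paper in ordered:
            # Eligibility: Sunday section, indexed in a news database, not a tabloid
            eligible = (paper["has_sunday"] and paper["indexed"]
                        and not paper["tabloid"])
            if eligible and paper["region"] in open_regions:
                chosen.append(paper)
                open_regions.remove(paper["region"])
            if len(chosen) == slots:
                break
        return chosen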

Newspaper Study – Operative Dates 2003
Random sampling was used to select the individual days for the study. By choosing individual days rather than full weeks, we hoped to provide a broader look at news coverage that more accurately represents the entire year. To account for variations related to the different days of the week, the 28 days sampled included four of each day of the week. Dates were chosen from January 1 to October 7, a span of 280 days; October 7 was the last eligible day in order to allow time for coding.
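
A minimal Python sketch of this weekday-stratified draw, using only the standard library (the function and parameter names are illustrative):

    import random
    from datetime import date, timedelta

    def sample_dates(start, end, per_weekday=4, seed=None):
        """Draw `per_weekday` random dates for each day of the week
        (Monday through Sunday) from the inclusive date range."""
        rng = random.Random(seed)
        days = [start + timedelta(n) for n in range((end - start).days + 1)]
        sample = []
        for weekday in range(7):  # 0 = Monday ... 6 = Sunday
            pool = [d for d in days if d.weekday() == weekday]
            sample.extend(rng.sample(pool, per_weekday))
        return sorted(sample)

    # 280 eligible days, 4 draws per weekday -> 28 sample days
    dates = sample_dates(date(2003, 1, 1), date(2003, 10, 7))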

Using these procedures, the following dates were generated and make up the 2003 sample:

January: 8, 16
February: 10, 16, 19, 21, 25
March: 12, 30
April: 15, 19, 25
May: 5, 6, 17, 29, 30
June: 7, 16, 18
July: 8, 20, 24
August: 9, 31
September: 4, 26

Story Procurement, Selection, and Inclusion
Articles were obtained via a combination of electronic databases (Dialog, Factiva, Nexis), supplemented by hard copies of the daily publications.3

All articles with distinct bylines that appeared on the selected newspaper’s front page (Page A1), on the first page of the Local/Metro section, or on the first page of the “soft news” section (e.g., Living, Food, Style, Entertainment, Weekend) were selected for analysis.4

News Magazines
This year the study sought to examine the content of general-interest news magazines. The news magazine sample comprised the three largest weekly general-interest news magazines: Time, Newsweek, and U.S. News & World Report. As of June 30, 2003, all three magazines had circulations above two million.5

News Magazine Study – Operative Dates 20036
To ensure a sample spread throughout the year, the first 40 weeks of 2003 (January 1 to October 7) were divided into four groups of 10 weeks each. One week was chosen at random from each group, resulting in a sample of four weeks.
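
A minimal Python sketch of this grouped draw (the names are illustrative):

    import random

    def sample_weeks(n_weeks=40, group_size=10, seed=None):
        """Divide weeks 1..n_weeks into consecutive groups of
        `group_size` and draw one week at random from each group."""
        rng = random.Random(seed)
        starts = range(1, n_weeks + 1, group_size)
        return [rng.choice(range(s, s + group_size)) for s in starts]

    # Four groups of ten weeks each -> a sample of four issue weeks
    weeks = sample_weeks()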

Using these procedures, the following dates were generated and make up the 2003 sample:

February 17
April 7
June 16
October 6

In addition, the study took a census of all 2003 covers of the three magazines in the sample. Covers of Time and U.S. News & World Report were obtained through the magazines’ Web sites; covers of Newsweek were obtained online.

Story Procurement, Selection, and Inclusion
For each news magazine, all stories with distinct bylines appearing in issues delivered to general mail subscribers in the Washington, D.C., area were included. Stories were procured via a combination of the Nexis database and hard copies of each publication.

Internet News Sites

To select the Internet news sites to be coded, the Nielsen//NetRatings list of the top 20 news sites was consulted to determine the most prominent news sites. The list contained four basic types of sites: news aggregators,7 newspaper sites, network news sites and cable news sites. Two sites were chosen from each category. For aggregators, AOL and Yahoo were selected, as they are the only two aggregators on the top 20 list. For network news sites, two sites were randomly chosen from among ABC, CBS and MSNBC. For cable sites, two were randomly chosen from among CNN, Fox News and MSNBC. MSNBC appeared on both lists because it is the news site for both NBC News and the MSNBC cable channel. For newspapers, the first site was chosen randomly from the four newspapers in circulation group 1 and the second site was chosen randomly from the 12 newspapers in groups 2 through 4.

Via this sample selection, the following sites were included in the 2003 study:

AOL (news section front page)
CBS News
Fox News
The Las Vegas Review-Journal Online
The New York Times Online
Yahoo!

Internet News Sites – Operative Dates 2003
The eligible dates for Web site coding ranged from July 14 to October 7, a period of 86 days. We were unable to acquire archived Web pages, and therefore selected July 14 as the earliest date on which pages could be captured. October 7 was the latest date for collecting pages in order to allow time for coding. Labor Day was excluded from the sample. Five days were selected at random, one of each weekday, to account for variations due to the day of the week.

With these procedures, the following dates were generated and make up the 2003 Internet News Site sample.

July 14
August 5
August 15
September 10
September 25

Story Procurement, Selection and Inclusion

Each site was visited four times on each sample day – 9 a.m., 1 p.m., 5 p.m. and 9 p.m., all Eastern time – to download stories. The order in which the sites were visited was rotated for each capture time. Each download took about 20 minutes to complete.

Each time, the following method was used to determine which stories to capture. From the news home page of each site, we captured two levels of stories. First, all stories at the top of the page explicitly relating to a graphic (a picture of an event or person) were captured as featured stories. Multiple stories explicitly relating to the same graphic were also captured as features. Pages with more than one graphic thus have more than one featured story.

After the featured stories, we included the next three most prominent stories without graphics, starting from the top and moving down. These stories were recorded as nonfeatured.
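
As a rough illustration, this two-level capture rule can be sketched as follows. The page representation and field names are hypothetical; in practice, featured status depended on a story’s explicit tie to a graphic at the top of the page:

    def select_stories(page_items, nonfeatured_count=3):
        """page_items: stories in top-to-bottom order of prominence,
        each a dict with a 'has_graphic' flag marking an explicit
        tie to a graphic."""
        featured = [s for s in page_items if s["has_graphic"]]
        nonfeatured = [s for s in page_items
                       if not s["has_graphic"]][:nonfeatured_count]
        return featured, nonfeatured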

The following rules were put into place in selecting stories:

Text-Based Media Coding Procedures
General practice called for a coder to work through no more than seven days/issues from any news outlet during a coding session. After completing up to seven days/issues from one publication, coders switched to another Text-Based Media outlet, and continued to code up to seven days/issues.

All coding personnel rotated through all circulation groups and publications/sites, with the exception of the designated control publications. A control publication was chosen in each category of Text-Based Media, including one newspaper in each circulation group. Each designated control publication was initially handled by only one coder; that work was then oversampled during intercoder reliability testing.

Working with a standardized codebook and coding rules, coders generally worked through each story in its entirety, beginning with the inventory variables – publication date, story length, placement and origination. Next, they recorded the codes for that same story’s content variables – topics, protagonists, sourcing levels and recurring leads/big stories. (Note: in approximately 10 percent of all cases, the inventory variables were precoded by research assistants; the content variables were then completed by coding staff.)

Intercoder Reliability Testing For Text Media
Intercoder reliability measures the extent to which two coders, operating individually, reach the same coding decisions. The principal coding team for Text Media comprised four individuals, who were trained as a group, augmented by two precoders. One coder was designated as a general control coder and worked offsite for the duration of the project. In addition, one publication in each circulation group was designated as a control source.

At the completion of the general coding process, each coder, working alone and without access to the initial coding decisions, recoded publications originally completed by another coder. Intercoding tests were performed on 5 percent of all cases for inventory variables, and agreement rates exceeded 98 percent for those variables. For the more difficult content variables, 20 percent of all publications/sites were recoded, and intercoder agreement rates were as follows:

Big Story/Recurring Lead = 90%
Protagonist (designation of Individual vs. Institutional Protagonist) = 86%
Topic = 87%
Level I Sourcing (Named and Identified Sources) = 93%
Level II Sourcing (Name/Title Without Relationship Explained) = 91%
Level III Sourcing (Unnamed/Untitled With Relationship Explained) = 91%
Level IV Sourcing (Unnamed, No Explanation of Relationship) = 89%

No significant differences were found to exist on a recurring basis.
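
The rates above are simple percent agreement. For reference, a minimal sketch of that computation:

    def percent_agreement(coder_a, coder_b):
        """Share of cases on which two coders, working independently,
        assigned the same code."""
        matches = sum(a == b for a, b in zip(coder_a, coder_b))
        return 100.0 * matches / len(coder_a)

    # Two coders' topic codes for the same ten stories: 9 matches -> 90.0
    print(percent_agreement([1, 4, 4, 2, 7, 1, 3, 3, 5, 2],
                            [1, 4, 4, 2, 7, 1, 3, 2, 5, 2]))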

Broadcast Network Sample

The ability to make direct comparisons between Newspaper and Broadcast Network findings was a project design goal; thus, the weekday sample dates for those two news categories are identical. Because of pre-emptions and schedule changes, weekend network news broadcasts do not always appear in all markets; thus, Saturday and Sunday broadcast network news programs were excluded from the study. The following dates made up the Broadcast Network sample:

January: 8, 16
February: 10, 16, 19, 21, 25
March: 12, 30
April: 15, 19, 25
May: 5, 6, 17, 29, 30
June: 7, 16, 18
July: 8, 20, 24
August: 9, 31
September: 4, 26

Broadcast Network Morning News Programs
(7 to 7:59 a.m. Eastern time)

ABC — “Good Morning America”
CBS — “The Early Show”
NBC — “Today”

Broadcast Network Evening News Programs
(Full program as broadcast in New York market)

ABC – “World News Tonight”
CBS – “Evening News”
NBC – “Nightly News”
PBS – “NewsHour”

Program Procurement and Story Selection/Inclusion
The morning and evening broadcasts of the three commercial networks were videotaped live in the New York City market. For the evening newscasts, this represents each day’s 6:30 p.m. East Coast feed. PBS supplied the Project with tapes of the “NewsHour.” All programming was available on videotape. No coding relied on secondary sources such as transcripts.

In the mornings, the following content was analyzed: stories read by the newscaster in the half-hourly newsblocks; feature and interview segments outside the newsblocks; and banter between members of the anchor team, except where its purpose was to tease upcoming segments in that day’s program or to promote the network’s programming at a later time. One-fifth (20 percent) of the sample was coded for teasers and promos, which were analyzed separately. Excluded from the analysis were weather segments, local news inserts, commercials, and other content-free editorial matter such as logos, studio shots, openings and closings.

In the evenings the same rules applied, but because the content of the newscasts is less variegated, the rules concerning newsblocks, banter, weather and local news inserts did not apply.

Cable News Sample

Cable News Programming – Operative Dates 2003
Cable coding dates were generated by randomly selecting five days, one of each weekday, from the period of June 1 to October 7, a 129-day span. June 1 was chosen as the start date due to the availability of tapes. October 7 was the last eligible date in order to allow time for coding. Weekend days were excluded because of the variability of the weekend cable schedule.

These criteria generated the following dates, which represent the Cable News sample for 2003:

September: 19, 24

To assess the nature of the 24-hour news cycle as presented on cable news programming, CNN, Fox News and MSNBC were selected because they were the three most viewed cable news channels in 2003. To get a sense of the nature of cable news, programming for the five sample days was coded from 7 a.m. (the beginning of the morning shows) until 11 p.m. (the end of prime time), a 16-hour stretch. This resulted in 240 hours of programming (16 hours × 5 days × 3 channels).

The following cable networks and broadcasts made up the 2003 Cable Network sample:

CNN
“American Morning”
“CNN Live Today”
“Live From … ”
“Inside Politics”
“Wolf Blitzer Reports”
“Lou Dobbs Tonight”
“Live From the Headlines With Anderson Cooper”
“Live From the Headlines With Paula Zahn”
“Larry King Live”
News Break-In/Unscheduled Breaking News
Other Show, Non-News Break-In

Fox News
“Fox and Friends”
“Fox News Live”
“Day Side With Linda Vester”
“Studio B With Shepard Smith”
“Your World With Neil Cavuto”
“The Big Story With John Gibson”
“Special Report with Brit Hume”
“Fox Report with Shepard Smith”
“The O’Reilly Factor”
“Hannity and Colmes”

Program Procurement and Story Selection/Inclusion

Cable news videotape was collected by VMS, a commercial third-party monitoring service. Dubbed copies were sold to PEJ for use for “internal review, analysis or research only.” In a few instances, the Federal Document Clearing House Inc., a nongovernment source, supplied videotape unavailable through VMS. On a few occasions the videotapes ended five minutes short of the hour or started a few minutes late; in those cases, transcripts were consulted as needed. In total, fewer than three hours were missing from the 240-hour sample.

For each 16-hour day, editorial content was broken into individual story items (the television equivalent of newspaper articles). An item was defined by its format, produced as a single program element standing alone, and by its content, covering a discrete story. Each item was measured for its duration and coded for its journalistic content according to an array of variables, including format, topic focus, levels of sourcing, any depiction of a central protagonist and coverage of the year’s overarching major news developments, in particular Iraq-related coverage. One-fifth (20 percent) of the sample was coded for teasers and promos, which were analyzed separately.

Electronic Media Coding Procedures

A team of three coders analyzed television news. No coder analyzed less than 20 percent of each of the three types of programming (morning broadcast, evening broadcast and cable news). To keep track of repetition and freshness of stories during the course of the day, coders were assigned to an entire block of a cable network’s 16 hours of programming.

Since many of the findings are weighted by the time spent on each item, it was essential to measure each item’s duration accurately. Items were documented as rundowns with start times and end times, allowing duration to be computed as the difference between the two. All of the commercial network broadcasts were viewed twice to derive an accurate inventory and accurate item-length statistics. To double-check the accuracy of all long-format items, coders not only designated packages, interviews (external or in-house) and stand-ups, but also documented a story summary in the form of headlines/slugs and the identity of the reporter or interview subject involved. These headlines/slugs were also used to double-check topic and big story coding.
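
A minimal sketch of the duration computation from such a rundown (the timestamp format and slugs are illustrative):

    from datetime import datetime

    def item_durations(rundown, fmt="%H:%M:%S"):
        """Duration in seconds for each (slug, start, end) rundown entry."""
        return {slug: int((datetime.strptime(end, fmt)
                           - datetime.strptime(start, fmt)).total_seconds())
                for slug, start, end in rundown}

    rundown = [("iraq-lead", "18:30:10", "18:33:45"),
               ("economy-pkg", "18:33:45", "18:35:50")]
    print(item_durations(rundown))  # {'iraq-lead': 215, 'economy-pkg': 125}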

Intercoder Reliability Testing For Electronic Media
An intercoder reliability check was performed to test the stability of the protagonist code and the source codes, both named and unnamed. A total of 35 hours of programming (777 different items) was recoded on these variables by a second coder, following the rundowns of start and end times produced by the first coder’s inventory. The code for a single protagonist portrayed as the central character of a story had an 87 percent reliability level. The code for named and identified sources (at least three per story versus two or fewer) had a 92 percent reliability level. The code for unnamed yet identified sources (at least three per story versus two or fewer) had a 94 percent reliability level.

Intercoder Reliability Testing Across Media
The project was designed so that for two specific media subcategories – Newspapers and Broadcast Network News – direct comparisons could be made between findings. Because two distinct coding operations worked on Text-Based and Electronic Media, intercoder reliability testing was conducted between the two groups. PSRA coders, working independently and without access to the original coding decisions, recoded 10 percent of the Broadcast Network News stories completed by each Tyndall Research staff member on the topic and big story/recurring lead variables. For topic, agreement was reached in almost four of every five cases (79 percent); for big story/recurring lead, agreement was found in more than nine of every ten cases (94 percent). These are the only media, and the only two variables, for which cross-media comparisons can be, and are, made.

Data Analysis

For each media subcategory – Newspapers, News Magazines, Internet News Sites, Broadcast Network News and Cable Network News – separate datasets were created and separate tabulations were constructed. There was no aggregation of the data, and in reporting results, all references to “totals” refer only to the specific media subcategory at hand.

For much of this report, the individual news story is the unit of analysis. There are, however, selected variables where it was more informative to present analysis via a measurement of the time/words devoted to particular topics or recurring leads.

Within each universe (cable, newspapers, etc.), each case in the applicable SPSS dataset represents one story. Length is one of the measurements recorded for each case. (For network and cable television, this number represents seconds; for newspaper and news magazines, this number represents word count; for Internet, no volume analysis was applied.)

To create the volumetric tables, the number recorded in each case’s length variable was designated as a weight, and that weight was applied to the case. The resulting weighted dataset was used to produce volumetric tables for selected variables.
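
The same weighting can be reproduced outside SPSS. A minimal sketch in Python with pandas, using hypothetical data:

    import pandas as pd

    # One row per story, as in the per-medium datasets
    # (length = seconds for television, words for print)
    stories = pd.DataFrame({
        "topic":  ["Iraq", "Politics", "Iraq", "Crime"],
        "length": [215, 95, 180, 60],
    })

    # Unweighted table: share of stories per topic
    story_share = stories["topic"].value_counts(normalize=True)

    # Volumetric table: share of total time/words per topic,
    # i.e., each case weighted by its length variable
    volume_share = (stories.groupby("topic")["length"].sum()
                    / stories["length"].sum())
    print(volume_share)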

Statistical Analyses

For most comparisons of how the content and structure of the news vary as a function of the medium being examined, chi-square analyses were used. Chi-square is a nonparametric statistic that examines the relationship between nominal variables, that is, variables identified by “name” rather than by a numeric scale (e.g., CNN, MSNBC and Fox News are values of a nominal variable). As explained by the scholars Daniel Riffe, Stephen Lacy and Frederick Fico:

“The chi-square test of statistical significance is based on the assumption that the randomly sampled data describe, within sampling error, the population’s proportions of cases falling into the categorical values of the variables being tested.

“Chi-square starts with the assumption that there is in the population only random association between the two variables, and that any sample finding to the contrary is merely a sampling artifact.”8

This is called the “null hypothesis” and refers to the situation where there is no relationship between the two variables examined.

Riffe and Lacy continue, “For each cell in the table linking the two variables, chi-square calculates the theoretical expected proportions based on this null relationship. The empirically obtained data are then compared cell by cell with the expected null-relationship proportions. Specifically, the absolute value of the differences between the observed and expected values in each cell goes into the computation of the chi-square statistic. Therefore, the chi-square statistic is large when the differences between the empirical and theoretical cell frequencies are large, and small when the empirically obtained data more closely resemble the pattern of the null relationship.

“This chi-square statistic has known values that permit a researcher to reject the null hypothesis (no relationship between the variables) at the standard 95 percent and 99 percent levels of probability.”9
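
Readers who wish to reproduce such a test can do so with standard tools. A minimal sketch using Python’s scipy library, with hypothetical counts rather than the study’s data:

    from scipy.stats import chi2_contingency

    # Observed counts: stories cross-tabulated by channel (rows)
    # and topic (columns); both variables are nominal
    observed = [[120,  80, 40],   # CNN
                [ 90, 110, 30],   # Fox News
                [100,  95, 35]]   # MSNBC

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
    # Reject the null hypothesis of no association at the 95 percent
    # level when p < 0.05 (at the 99 percent level when p < 0.01)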


1. U.S. Census Bureau; see the bureau’s Web site for a map of the regions.

2. The Detroit Free Press is bound to The Detroit News by a Joint Operating Agreement (JOA). Weekend editions reflect combined resources of both newspapers.

3. For the Nashville Tennessean, it was not possible to capture wire copy for the June – October dates.

4. All Page One articles from The Rockford (Ill.) Register were removed from the newspaper database before analysis. This was dictated by the front-page format of the Register, which differed from all other newspapers in this study. No complete articles are found on Page A1 of The Register; rather, abridged stories are presented, referring the reader to the full account, found on other pages throughout that day’s paper. Articles published on the lead pages of the Metro/Local and the Style/Living sections of The Register are included in this analysis.
For The New York Times, the following daily selections were made regarding the applicable “soft news” section: Monday and Tuesday – Arts & Culture; Wednesday – Dining In; Thursday – House and Home; Friday – Escapes; Saturday – Arts & Ideas; Sunday – Styles.

5. For circulation figures, see Advertising Age.

6. These represent the publication date as reported on the cover of each news magazine; the newsstand appearance of these issues occurred about one week before the publication dates.

7. Aggregators do not create news articles. Instead, they post material from wires and other news outlets.

8. Riffe, D., Lacy, S., & Fico, F. (1998). Analyzing Media Messages: Using Quantitative Content Analysis in Research. Mahwah, N.J.: Lawrence Erlbaum, pp. 167-168.

9. Ibid.