Methodology
The data for this study were collected in two parts. Much
of the study is based on research conducted originally by
other people or organizations. Other research, particularly
in the content analysis, is original work conducted specifically
for this report.
For the data that were aggregated from other researchers,
the Project took several steps. First, we tried to determine
what data had been collected and by whom for the eight media
sectors studied. We organized the data into the six primary
areas of interest we wanted to examine: content, audience,
economics, ownership, newsroom investment and public attitudes.
For all data ultimately used, the Project sought and gained
permission for its use. Next, the Project studied the data
closely to determine where data reinforced each other and
where there were apparent contradictions or gaps. In doing
so, the Project endeavored to determine the value and validity
of each data set. This in many cases involved going back to
the sources who collected the research in the first place.
Where data conflicted, we have included all relevant sources
and tried to explain their differences, either in footnotes
or in the narratives. For instance, the differences in online
news usage are likely explained by how survey questions were
phrased. For example, there are substantial gaps depending
on whether a survey asks whether one ever goes online for
news, has done so in the last month or in the last week.
In analyzing the data for each media sector, we sought insight
from experts by having at least
three outside readers for each sector chapter. These readers
raised questions, offered arguments and questioned data where
they saw fit. In a few cases, we sought the advice of our
statistical research team at the University of Missouri School
of Journalism.
All sources are cited in footnotes or within the narrative,
and listed alphabetically in a source
bibliography. The data used in the report are also available
in more complete tabular form to any users online, where they
can view the raw data, sort it on their own and make
their own charts and graphs. Our goal was not only to
organize the available material into a clear narrative, but
also to collect all the public data on journalism in one usable
place. In many cases, this involved the Project having to
purchase the use of the data.
For the original content analysis research conducted by
the Project, the methodology follows.
Content Analysis Methodology
Sampling and Inclusion
There are two distinct categories of media that were studied
as part of the Project's 2004 Annual Report on the State of
the News Media.
The first, Text-Based Media, included Newspapers, News Magazines,
and Internet News Sites. Princeton Survey Research Associates
International conducted coding for these media.
The second, Electronic Media, included both Broadcast Network
and Cable Network News. ADT Research, publisher of the Tyndall
Report, conducted coding for these media.
Each subcategory of media was subject to a specific methodological
approach to sampling and selection.
Text-Based Media
Newspapers
Newspaper Selection
Individual newspapers were selected to present a meaningful
assessment of content widely available to the public. Selections
were made on both a geographic and a demographic basis, as
well as diversity of ownership.
First, newspapers were divided into four groups based
on daily circulation size: more than 750,000; 300,001 to 750,000;
100,001 to 300,000; and 100,000 and under.
For newspapers over 750,000, we selected four papers: USA
Today, The Los Angeles Times, The New York Times and The Washington
Post. (The Wall Street Journal, which also falls in this circulation
category, was excluded as a specialty publication.)
Four newspapers were chosen in each of the remaining three
categories. To ensure geographical diversity, each of the
four newspapers within a circulation category was selected
from a different geographic region of the United States. Regions
were defined according to the parameters established by the
U.S. Census Bureau.
The newspapers in circulation groups 2, 3 and 4 were selected
through the following process: First, using the Editor and
Publisher Yearbook, a list of every daily newspaper in the
United States was created. Each newspaper was assigned a random
number. After resorting the list by the random number, newspapers
were chosen by going down the list until all slots were filled.
To be eligible for selection, a newspaper (a) must have a
Sunday section, (b) must have its stories indexed in a news
database in order to be available to coders and (c) must not
be a tabloid. Newspapers not meeting these criteria were skipped
over. In addition, an effort was made to ensure a range of
owners was included. Those selected were:
Circulation Group 1
The Los Angeles Times
The New York Times
USA Today
The Washington Post
Circulation Group 2
The Arizona Republic (Phoenix)
The Boston Globe
The Detroit Free Press
The St. Petersburg Times
Circulation Group 3
The Hartford Courant
The Kansas City Star
The Knoxville (Tenn.) News Sentinel
The Las Vegas Review-Journal
Circulation Group 4
The Albany (N.Y.) Times Union
The Corpus Christi (Texas) Caller-Times
The Modesto (Calif.) Bee
The Rockford (Ill.) Register
Newspaper Study - Operative Dates 2003
Random sampling was used to select a sample of individual
days for the study. By choosing individual days rather than
actual weeks, we hoped to provide a broader look at news coverage
that more accurately represents the entire year. In order
to account for variations related to the different days of
the week, the 28 days that were sampled included 4 of each
day of the week. Dates were chosen from January 1 to October
7, a span of 280 days. October 7 was the last eligible day
in order to allow time for coding.
Using these procedures, the following dates were generated
and make up the 2003 sample:
January-8, 16
February-10, 16, 19, 21, 25
March-12, 30
April-15, 19, 25
May-5, 6, 17, 29, 30
June-7, 16, 18
July-8, 20, 24
August-9, 31
September-4, 26
October-6
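The day-sampling procedure described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the Project's actual procedure: the function name `sample_days_by_weekday`, the seed, and the use of Python's `random` module are all assumptions. It draws four dates for each day of the week from the January 1 to October 7 window.

```python
import random
from collections import Counter
from datetime import date, timedelta


def sample_days_by_weekday(start, end, per_weekday, seed=None):
    """Draw `per_weekday` random dates for each day of the week in [start, end]."""
    rng = random.Random(seed)  # seed is only for reproducibility of this sketch
    n_days = (end - start).days + 1
    all_days = [start + timedelta(days=i) for i in range(n_days)]
    chosen = []
    for weekday in range(7):  # 0 = Monday ... 6 = Sunday
        pool = [d for d in all_days if d.weekday() == weekday]
        chosen.extend(rng.sample(pool, per_weekday))
    return sorted(chosen)


# The 2003 newspaper window: January 1 through October 7.
START, END = date(2003, 1, 1), date(2003, 10, 7)
window_length = (END - START).days + 1  # number of eligible days

sample = sample_days_by_weekday(START, END, per_weekday=4, seed=2003)
weekday_counts = Counter(d.weekday() for d in sample)
```

Any run of this sketch yields 28 distinct dates with four falling on each day of the week; the specific dates depend on the seed and will not match the published list.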
Story Procurement, Selection, and Inclusion
Articles were obtained via a combination of electronic databases
(Dialog, Factiva, Nexis), supplemented by hard copies of daily
publications.
All articles with distinct bylines that appeared on the selected
newspaper's front page (Page A1), on the first page of the
Local/Metro Section, or on the first page of the "soft
news" section (e.g., Living, Food, Style, Entertainment,
Weekend) were selected for analysis.
News Magazines
This year the study sought to examine content
of general interest news magazines. The news magazine sample
comprised the three largest weekly general interest news magazines:
Time, Newsweek, and U.S. News and World Report. As of June
30, 2003, all three magazines had circulations above two million.
News Magazine Study - Operative Dates 2003
In order to ensure a sample that was spread throughout
the year, the first 40 weeks of 2003 (January 1 to October
7) were divided into 4 groups of 10 weeks. One week was chosen
at random from each group, which resulted in a sample of 4
weeks.
Using these procedures, the following dates were generated
and make up the 2003 sample:
February 17
April 7
June 16
October 6
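The week-selection step can be sketched as follows. The helper names are hypothetical, and the sketch assumes that weeks are counted as consecutive seven-day blocks starting January 1, which the report does not state explicitly. Under that assumption, the four published dates fall in groups 1 through 4 respectively.

```python
import random
from datetime import date


def sample_one_week_per_group(n_weeks=40, group_size=10, seed=None):
    """Pick one week number at random from each consecutive block of weeks."""
    rng = random.Random(seed)
    picks = []
    for first in range(1, n_weeks + 1, group_size):
        picks.append(rng.randint(first, first + group_size - 1))
    return picks


def week_number(day, year_start=date(2003, 1, 1)):
    """1-based index of the seven-day block (counted from January 1) containing `day`."""
    return (day - year_start).days // 7 + 1


weeks = sample_one_week_per_group(seed=1)

# Dates listed in the report, mapped to their ten-week group (assumed week counting).
published = [date(2003, 2, 17), date(2003, 4, 7), date(2003, 6, 16), date(2003, 10, 6)]
published_groups = [(week_number(d) - 1) // 10 + 1 for d in published]
```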
In addition, the study took a census of all 2003 covers of
the three magazines in the sample. Covers of Time were obtained
through its Web site, http://www.time.com/time/magazine/coversearch/.
Covers for Newsweek were obtained through Newscom.com,
http://prn.newscom.com. Covers for U.S. News & World
Report were obtained through its Web site, http://www.usnews.com/usnews/issue/archive.htm.
Story Procurement, Selection, and Inclusion
For each news magazine, all stories with distinct bylines
appearing in issues delivered to general mail subscribers
in the Washington, D.C., area were included. Stories were
procured via a combination of the Nexis database and hard
copies of each publication.
Internet News Sites
In order to select the Internet news sites to be coded, the
Nielsen//NetRatings top 20 news sites list was consulted to
determine the most prominent news sites. The list contained
four basic types of sites: news aggregators,
newspaper sites, network news sites and cable news sites.
Two sites were chosen for each of these categories. For aggregators,
AOL and Yahoo were selected as they are the only two aggregators
in the top 20 list. For network news sites, two sites were randomly
chosen from among ABC, CBS, and MSNBC. For cable sites, two
were randomly chosen from among CNN, Fox News, and MSNBC.
MSNBC appeared in both lists because it is the news site for
both NBC News and the MSNBC cable channel. For newspapers,
the first site was chosen randomly from the four newspapers
in circulation group 1 and the second site was chosen randomly
from the 12 newspapers in groups 2 through 4.
Using this sample selection, the following sites were
included in the 2003 study:
AOL (news section front page)
CBS News (www.cbsnews.com)
CNN (www.cnn.com)
Fox News (www.foxnews.com)
The Las Vegas Review-Journal Online (www.reviewjournal.com)
MSNBC (www.msnbc.com)
The New York Times Online (www.nytimes.com)
Yahoo! (news.yahoo.com)
Internet News Sites - Operative Dates 2003
The eligible dates for Web site coding ranged from July
14 to October 7, a period of 86 days. We were unable to acquire
past Web pages, and therefore selected July 14 as the earliest
available date to capture Web pages. October 7 was the latest
date to collect pages to allow time for coding. Labor Day
was excluded from the sample. Five days were selected at random,
one of each weekday in order to account for variations due
to the day of the week.
With these procedures, the following dates were generated
and make up the 2003 Internet News Site sample:
July 14
August 5
August 15
September 10
September 25
Story Procurement, Selection and Inclusion
Each site was visited four times on each day - 9 a.m., 1
p.m., 5 p.m. and 9 p.m., all Eastern time, to download stories.
The order in which the sites were visited was rotated for
each capture time. Each download took about 20 minutes to
complete.
Each time, the following method was used to determine which
stories to capture. From the news home page of each of the
sites, we captured two levels of stories. First, all stories
at the top of the page explicitly relating to a graphic picture
- event or person - were captured as featured stories. Multiple
stories explicitly relating to the same graphic were also
captured as features. Pages with more than one graphic have
more than one featured story.
After the featured stories, we included the next three most
prominent stories without graphics starting from the top and
moving down. These stories were recorded as nonfeatured.
The following rules were put into place in selecting stories:
- Video, audio, charts, maps, background/archival information,
  news tickers, chat and polls were omitted from the sample.
- Any headline that linked to an outside Web site
  was also omitted. (However, stories attributed to other
  outlets but present on the site being studied were counted.)
- Links to secondary stories about the same topic were
  counted as unique, nonfeatured stories.
- A graphic attached to a nonstory item (i.e., video, audio,
  charts, maps, background/archival information, "complete
  coverage," chat and polls) was not counted as a story.
- If there were no stories associated with a graphic, then
  only the top three stories were coded and none were considered
  featured.
- If there was no graphic present, then no story was considered
  as featured, and the top three stories were counted as
  nonfeatured.
- When news headlines with the same font and type size
  appeared in side-by-side columns, stories were prioritized
  in a left-to-right, line-by-line zigzag pattern.
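A minimal sketch of the capture rules follows. It covers only the core logic (omitting nonstory items and outside links, treating graphic-linked stories as featured, and taking the next three stories as nonfeatured). The `Item` record, its field names and the sample headlines are hypothetical; the actual capture was done by hand, and the page is assumed to be listed already in order of prominence.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Item:
    headline: str
    kind: str = "story"            # "story", or a nonstory kind such as "video" or "poll"
    graphic: Optional[str] = None  # id of the graphic the item is tied to, if any
    external: bool = False         # True when the headline links to an outside site


def select_stories(page: List[Item], n_nonfeatured: int = 3) -> Tuple[List[Item], List[Item]]:
    """Split a home page (items in order of prominence) into featured and nonfeatured stories."""
    # Nonstory items and links to outside sites are never captured.
    eligible = [i for i in page if i.kind == "story" and not i.external]
    # Every story explicitly tied to a graphic is featured.
    featured = [i for i in eligible if i.graphic is not None]
    # The next most prominent stories without graphics are nonfeatured.
    nonfeatured = [i for i in eligible if i.graphic is None][:n_nonfeatured]
    return featured, nonfeatured


page = [
    Item("Storm hits coast", graphic="g1"),
    Item("Storm: evacuation routes", graphic="g1"),
    Item("Storm video", kind="video", graphic="g1"),
    Item("Senate vote"),
    Item("Wire story on partner site", external=True),
    Item("Markets close higher"),
    Item("Local election results"),
    Item("Sports roundup"),
]
featured, nonfeatured = select_stories(page)
```

When no story is tied to a graphic, `featured` comes back empty and only the top three stories are returned, which matches the last two rules above.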
Text-Based Media Coding Procedures
General practice called for a coder to work through no
more than seven days/issues from any news outlet during a
coding session. After completing up to seven days/issues from
one publication, coders switched to another Text-Based Media
outlet, and continued to code up to seven days/issues.
All coding personnel rotated through all circulation groups
and publications/sites, with the exception of the designated control
publications. A control publication was chosen in each category
of Text Media, including one newspaper in each circulation
group. The designated control publication was initially handled
by only one coder. That work was then oversampled during intercoder
reliability testing.
Working with a standardized codebook and coding rules, coders
generally worked through each story in its entirety, beginning
with the inventory variables - publication date, story length,
placement and origination. Next, they recorded the codes for
that same story's content variables - topics, protagonists,
sourcing levels and recurring leads/big stories. (Note: in
approximately 10 percent of all cases, the inventory variables
were precoded by research assistants; the content variables
were then completed by coding staff.)
Intercoder Reliability Testing For Text Media
Intercoder reliability measures the extent to which two coders,
operating individually, reach the same coding decisions. The
principal coding team for Text Media comprised four individuals,
who were trained as a group, augmented by two precoders. One
coder was designated as a general control coder and worked
offsite for the duration of the project. In addition, one
publication in each circulation group was designated as a
control source.
At the completion of the general coding process, each coder,
working alone and without access to the initial coding decisions,
recoded publications originally completed by another coder.
Intercoding tests were performed on 5 percent of all cases
for inventory variables, and agreement rates exceeded 98 percent
for those variables. For the more difficult content variables,
20 percent of all publications/sites were recoded, and intercoder
agreement rates were as follows:
Big Story/Recurring Lead = 90%
Protagonist (designation of Individual vs. Institutional Protagonist) = 86%
Topic = 87%
Level I Sourcing (Named and Identified Sources) = 93%
Level II Sourcing (Name/Title Without Relationship Explained) = 91%
Level III Sourcing (Unnamed/Untitled With Relationship Explained) = 91%
Level IV Sourcing (Unnamed, No Explanation of Relationship) = 89%
No significant differences were found to exist on a recurring
basis.
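An agreement rate of this kind can be computed as the share of recoded cases on which the second coder's decision matches the first. The sketch below assumes simple percent agreement, which is how the figures above read; the report does not name the exact statistic, and the topic codes shown are invented for illustration.

```python
def percent_agreement(first_pass, second_pass):
    """Percentage of cases on which two coders assigned the same code."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both coders must have coded the same cases")
    if not first_pass:
        raise ValueError("at least one case is required")
    matches = sum(a == b for a, b in zip(first_pass, second_pass))
    return 100.0 * matches / len(first_pass)


# Hypothetical topic codes for ten stories, coded twice.
original = ["politics", "crime", "business", "politics", "sports",
            "crime", "health", "politics", "business", "crime"]
recode = ["politics", "crime", "business", "foreign", "sports",
          "crime", "health", "politics", "business", "crime"]

rate = percent_agreement(original, recode)  # 9 of the 10 codes match
```

Simple agreement does not correct for matches that would occur by chance, which is worth keeping in mind for variables with only a few categories.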
Broadcast Network Sample
The ability to make direct comparisons between Newspaper
and Broadcast Network findings was a project design goal;
thus, the weekday sample dates for those two news categories
are identical. Because of pre-emptions and schedule changes,
weekend network news broadcasts do not always appear in all
markets; thus, Saturday and Sunday broadcast network news
programs were excluded from the study. The following dates
made up the Broadcast Network sample:
January-8, 16
February-10, 16, 19, 21, 25
March-12, 30
April-15, 19, 25
May-5, 6, 17, 29, 30
June-7, 16, 18
July-8, 20, 24
August-9, 31
September-4, 26
October-6
Broadcast Network Morning News Programs
(7 to 7:59 a.m. Eastern time)
ABC - "Good Morning America"
CBS - "The Early Show"
NBC - "Today"
Broadcast Network Evening News Programs
(Full program as broadcast in New York market)
ABC - "World News Tonight"
CBS - "Evening News"
NBC - "Nightly News"
PBS - "NewsHour"
Program Procurement and Story Selection/Inclusion
The morning and evening broadcasts of the three commercial
networks were videotaped live in the New York City market.
For the evening newscasts, this represents each day's 6:30
p.m. East Coast feed. PBS supplied the Project with tapes
of the "NewsHour." All programming was available
on videotape. No coding relied on secondary sources such as
transcripts.
In the mornings, the following content was analyzed: stories
read by the newscaster in the half-hourly newsblocks; feature
and interview segments outside of the newsblocks; banter between
members of the anchor team whose import was other than to
tease upcoming segments in that day's program or to promote
the network's programming at some later time. One-fifth, 20
percent, of the sample was coded for teasers and promos and
was analyzed separately. Excluded from the analysis were the
content of the weather, local news inserts, commercials, and
other content-free editorial matter such as logos, studio
shots, openings and closings.
In the evenings the same rules applied, but because the content
of the newscasts is less variegated, concerns about newsblocks,
banter, weather and local news inserts were not applicable.
Cable News Sample
Cable News Programming - Operative Dates 2003
Cable coding dates were generated by randomly selecting
five days, one of each weekday, from the period of June 1
to October 7, a 129-day span. June 1 was chosen as the start
date due to the availability of tapes. October 7 was the last
eligible date in order to allow time for coding. Weekend days
were excluded because of the variability of the weekend cable
schedule.
Following these criteria, the following dates were generated
and represent the Cable News sample for 2003:
June-16
July-15
August-21
September-19, 24
In order to assess the nature of the 24-hour news cycle as
presented on cable news programming, CNN, Fox News and MSNBC
were selected because they were the three most viewed cable
news channels in 2003. To get a sense of the nature of cable
news, programming for the five days was coded from 7 a.m.
(the beginning of the morning shows) until 11 p.m. (the end
of prime time), a 16-hour stretch of programming. Across three
channels and five days, this resulted in 240 hours of programming.
Following these criteria, the following cable networks and
broadcasts made up the 2003 Cable Network sample.
CNN
"American Morning"
"CNN Live Today"
"Live From"
"Inside Politics"
"Crossfire"
"Wolf Blitzer Reports"
"Lou Dobbs Tonight"
"Live From the Headlines With Anderson Cooper"
"Live From the Headlines With Paula Zahn"
"Larry King Live"
"NewsNight"
News Break-In/Unscheduled Breaking News
Other Show, Non-News Break-In
FOX
"Fox and Friends"
"Fox News Live"
"Day Side With Linda Vester"
"Studio B With Shepard Smith"
"Your World With Neil Cavuto"
"The Big Story With John Gibson"
"Special Report With Brit Hume"
"Fox Report With Shepard Smith"
"The O'Reilly Factor"
"Hannity and Colmes"
Program Procurement and Story Selection/Inclusion
Cable news videotape was collected by VMS, a commercial third-party
monitoring service. Dubbed copies were sold to PEJ for use
for "internal review, analysis or research only."
In a few instances, the Federal Document Clearing House Inc.,
a nongovernment source, supplied videotape unavailable through
VMS. On a few occasions the videotapes ended five minutes
short of an hour or started a few minutes late. Here, transcripts
were consulted when needed. In total, fewer than three hours
were missing out of the total 240-hour sample.
For each 16-hour day, editorial content was broken into individual
story items (the television equivalent of newspaper articles).
An item was defined by its format, produced as a single program
element standing alone, and by its content, covering a discrete
story. Each item was measured for its duration and coded for
its journalistic content according to an array of variables
including format, topic focus, levels of sourcing, any depiction
of a central protagonist and coverage of the year's overarching
major news developments, in particular Iraq-related coverage.
One-fifth, 20 percent, of the sample was coded for teasers
and promos and analyzed separately.
Electronic Media Coding Procedures
A team of three coders analyzed television news. No one coder
analyzed less than 20 percent of each of the three types of programming
(morning broadcast, evening broadcast and cable news). In
order to keep track of repetition and freshness of stories
during the course of the day, coders were assigned to an entire
bloc of a cable network's 16 hours of programming.
Since many of the findings are weighted by time spent on
each item, it was essential to ensure accurate measurement
of the duration of items. They were documented as rundowns
with start times and end times, allowing for the duration
to be computed as the difference between the two. All of the
commercial network broadcasts were viewed twice to derive
an accurate inventory and item length statistics. To double
check the accuracy of all long-format items, coders not only
designated packages, interviews (external or in-house) and
stand-ups, but also documented a story summary in the form
of headlines/slugs and the identity of the reporter or interview
subject involved. These headlines/slugs were also used to
double check topic and big story coding.
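The duration calculation described above amounts to subtracting each item's start time from its end time. A minimal sketch, assuming a rundown stored as clock times in `HH:MM:SS` form (the format, the function name and the sample rundown are illustrative, not taken from the coders' actual logs):

```python
from datetime import datetime


def duration_seconds(start, end, fmt="%H:%M:%S"):
    """Length of an item in seconds, computed as end time minus start time."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    seconds = int(delta.total_seconds())
    if seconds < 0:
        raise ValueError("end time precedes start time")
    return seconds


# Hypothetical rundown for the opening of an evening newscast.
rundown = [
    ("18:30:00", "18:32:15", "package"),
    ("18:32:15", "18:32:45", "anchor read"),
    ("18:32:45", "18:36:00", "interview"),
]
durations = [duration_seconds(start, end) for start, end, _ in rundown]
total = sum(durations)
```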
Intercoder Reliability Testing For Electronic Media
An intercoder reliability check was performed to test the
stability of the protagonist code and the source codes, both
named and unnamed. A total of 35 hours of programming (777
different items) was recoded on these variables by a second
coder, following the rundowns of start times and end times
produced by the first coder's inventory. The code for a single
protagonist being portrayed as the central character of a
story had an 87 percent reliability level. The code for named
and identified sources (at least three per story versus two
or fewer) had a 92 percent reliability level. The code for
unnamed yet identified sources (at least three per story versus
two or fewer) had a 94 percent reliability level.
Intercoder Reliability Testing Across Media
The project was designed so that findings for two specific media
subcategories - Newspapers and Broadcast Network News - could be
compared directly. Because
there were two distinct coding operations that worked on Text-Based
and Electronic Media, intercoder reliability testing was conducted
between the two groups. PSRA coders, working independently
and without access to original coding decisions, recoded 10
percent of the Broadcast Network News stories completed by
each Tyndall Research staff member on topic and big story/recurring
lead variables. For topic, agreement was reached in almost
four of every five cases (79 percent); for big story/recurring
lead, agreement was found in more than nine of every ten cases
(94 percent). These are the only media and the only two variables
where cross-media comparisons can be and are made.
Data Analysis
For each media subcategory - Newspapers, News Magazines,
Internet News Sites, Broadcast Network News and Cable Network
News - separate datasets were created, and separate tabulations
were constructed. There was no aggregation of the data, and
in reporting results, all references to "totals" refer
only to the specific media subcategory at hand.
For much of this report, the individual news story is the
unit of analysis. There are, however, selected variables where
it was more informative to present analysis via a measurement
of the time/words devoted to particular topics or recurring
leads.
Within each universe (cable, newspapers, etc.), each case
in the applicable SPSS dataset represents one story. Length
is one of the measurements recorded for each case. (For network
and cable television, this number represents seconds; for
newspaper and news magazines, this number represents word
count; for Internet, no volume analysis was applied.)
To create the volumetric tables, each case was selected
and the number recorded in the length variable was designated
as a weight. Then, that individual weight was applied to each
individual case. The resulting weighted dataset was used in
the production of volumetric tables for selected variables.
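The effect of the weighting can be illustrated with a small sketch. The Project worked in SPSS; the Python below is only an analogue, and the `tabulate` function, the field names and the four sample stories are hypothetical. Each story counts once in the unweighted table and counts in proportion to its length in the volumetric table.

```python
from collections import defaultdict


def tabulate(cases, variable, weight=None):
    """Percentage distribution of `variable`, optionally weighting each case."""
    totals = defaultdict(float)
    for case in cases:
        # Unweighted: each story counts as 1. Weighted: it counts as its length.
        totals[case[variable]] += case[weight] if weight else 1
    grand_total = sum(totals.values())
    return {value: 100.0 * t / grand_total for value, t in totals.items()}


# Hypothetical stories; length is in seconds (television) or words (print).
stories = [
    {"topic": "government", "length": 150},
    {"topic": "crime", "length": 25},
    {"topic": "crime", "length": 25},
    {"topic": "foreign", "length": 300},
]

by_story = tabulate(stories, "topic")
by_volume = tabulate(stories, "topic", weight="length")
```

In this example crime accounts for half of the stories but only 10 percent of the volume, which is the kind of difference the volumetric tables are meant to surface.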
Statistical Analyses
For most comparisons of how the content and structure of the
news vary as a function of which medium is being examined,
chi-square analyses were used. Chi-square is a nonparametric
statistic that examines the relationship between nominal variables,
that is, variables that are identified by "name"
and are not on a numeric scale (e.g., cable channel, whose values
are CNN, MSNBC and Fox News). As explained by scholars Daniel Riffe,
Stephen Lacy and Frederick Fico:
"The chi-square test of statistical significance is
based on the assumption that the randomly sampled data have
appropriately described, within sampling error, the population's
proportions of cases falling into the categorical values of
the variables being tested.
"Chi-square starts with the assumption that there is
in the population only random association between the two
variables, and that any sample finding to the contrary is
merely a sampling artifact."
This is called the "null hypothesis" and refers
to the situation where there
is no relationship between the two variables examined.
Riffe, Lacy and Fico continue, "For each cell in the table
linking the two variables, chi-square calculates the theoretical
expected proportions based on this null relationship. The
empirically obtained data are then compared cell by cell with
the expected null-relationship proportions. Specifically,
the absolute value of the differences between the observed
and expected values in each cell goes into the computation
of the chi-square statistic. Therefore, the chi-square statistic
is large when the differences between the empirical and theoretical
cell frequencies is large, and small when the empirically
obtained data more closely resemble the pattern of the null
relationship.
"This chi-square statistic has known values that permit
a researcher to reject the null hypothesis (no relationship
between the variables) at the standard 95 percent and 99 percent
levels of probability."
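As an illustration of the test described above, the sketch below computes the standard Pearson chi-square statistic, which sums the squared difference between observed and expected counts in each cell, divided by the expected count. The table of counts is invented, and the function is a plain-Python stand-in rather than the software used for the report. The two critical values are the standard table values for two degrees of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated (null hypothesis).
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Standard critical values for 2 degrees of freedom.
CRITICAL_95 = 5.991
CRITICAL_99 = 9.210

# Hypothetical counts: rows are three channels, columns are two story types.
observed = [
    [30, 20],
    [20, 30],
    [25, 25],
]
statistic, df = chi_square(observed)
reject_at_95 = statistic > CRITICAL_95
```

Here the statistic is 4.0 on two degrees of freedom, below the 95 percent threshold, so these invented counts would not justify rejecting the null hypothesis.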