									                                                       Metacognitive Monitoring 1


Running head: METACOGNITIVE MONITORING




  The Effects of Persuasive and Expository Text on Metacognitive Monitoring and Control

           Daniel L. Dinsmore, Sandra M. Loughlin, and Meghan M. Parkinson

                                 University of Maryland

                                             Abstract

This investigation examined metacognitive processes across two text types (persuasive and

expository). We also considered the effects of think aloud and three expertise levels

(acclimation, competence, and proficiency) for scrolling (i.e., moving back and forth in text) and

calibration (i.e., difference between confidence and performance). Participants were

undergraduates enrolled in either human development (n = 38) or government/politics courses (n

= 38), and practicing attorneys (n = 4). Participants read two passages on judicial review

presented via computer, and trace data on scrolling behaviors were logged during reading.

Additionally, a calibration measure was completed after reading. Think-alouds were coded for

metacognitive utterances. Data were analyzed via non-parametric bootstrapping. Significant

differences between text type were found for scrolling, calibration, and utterance categories.

There was no significant difference for think aloud condition on scrolling or calibration. Only

scrolling was statistically different for expertise level. However, median differences revealed

interesting trends between expertise groups requiring further investigation.

    The Effects of Persuasive and Expository Text on Metacognitive Monitoring and Control

       Students, particularly those at the undergraduate level, are often required to read,

evaluate, and use information presented in text. Too often, it is assumed that undergraduates are

competent readers able to use text type effectively to learn essential content. However, this

assumption has recently been called into question (e.g., Fox, Dinsmore, Maggioni, & Alexander,

2009). Specifically, Fox et al. (2009) found that undergraduates enrolled in a research methods

course were able to recall only limited information from course-related texts and did not display

the strategic processing expected of competent readers. One possible explanation for those

reported shortfalls was that these undergraduates were poor at monitoring and controlling their

cognitive processes, particularly with regard to comprehension (Wiley, Griffin, & Thiede, 2005).

These problems with monitoring and control may revolve around the students’ inability to use

prior knowledge (e.g., Shapiro, 2008), calibrate their learning (i.e., monitoring the relation

between confidence and performance; e.g., Dunlosky, Serra, Matvey, & Rawson, 2005), set

goals, or activate appropriate strategies (e.g., Aleven, McLaren, Roll, & Koedinger, 2006). The

present study was compelled by these concerns and by the goal of understanding how

presumably competent readers engage with texts cognitively and metacognitively.

       As we move into this investigation, we are aided by the fact that the research on

metacognition represents a mature line of inquiry. In particular, there is an extensive literature on

the relation between metacognition and text (e.g., Wiley, Griffin, & Thiede, 2005). However,

despite the richness and diversity in this line of research, several problems and gaps persist.

Specifically, the literature on metacognitive monitoring and control has relied primarily on short

segments of text rather than extended discourse and has not considered the potential effects of

text type or genre on monitoring or control processes. Further, we considered the possibility that

certain measures of metacognition in the literature may actually influence metacognitive

processing. Finally, we found limited consideration given to the levels of readers’ expertise in a

domain and their subsequent metacognitive processing. The aim of the present research is to

address these gaps.

       For the purpose of this investigation, metacognition is defined as "thinking about

thinking" (Miller, Kessel, & Flavell, 1970, p. 613), which encompasses four key components

(Flavell, 1979): metacognitive knowledge, metacognitive experiences, cognitive goals, and

strategy activation. Metacognitive knowledge refers to knowledge or beliefs that guide the

course of mental operations at the person, task, or strategy level, while metacognitive

experiences are the cognitive or affective experiences that pertain to a mental operation.

Cognitive goals refer to cognitive or metacognitive goals that direct cognitive or metacognitive

activity. Finally, strategies are cognitive actions that are evoked to monitor (metacognitive

strategies) or make (cognitive strategies) progress toward a goal.

       There are a number of studies of metacognition that have examined the difference between

individuals’ confidence and their performance (i.e., calibration) with tasks involving the

memorization of word pairs (e.g., Thiede & Dunlosky, 1994) or general knowledge questions

(Dahl, Allwood, & Hagberg, 2009). However, these studies have infrequently considered the

effect of topic or domain on the outcomes reported, particularly as it relates to academic domains

(Parkinson & Dinsmore, in preparation). Although this research provides insights into

monitoring and control processes, it sheds little light on what might be occurring within the

minds of undergraduates reading challenging texts from which they are expected to learn new

and complex content. Further, those monitoring studies that have used connected discourse

typically utilize expository texts such as Encarta (e.g., Moos & Azevedo, 2008).

       Expository text is characterized as non-fiction reading material in which the intent is to

inform or explain (Williams, Stafford, Lauer, Hall, & Pollini, 2009). Although students often

read expository texts, the aforementioned studies have not been designed to establish how that

particular type of text, relative to other forms, may affect metacognitive monitoring and control. For

that reason, we have chosen to compare participants’ metacognitive processing with two text

types (i.e., expository and persuasive text).

       Persuasive text is defined as text in which an author argues a point of view in order to

change a reader’s knowledge, beliefs, or interest (Kamalski, Sanders, & Lentz, 2002; Murphy,

Long, Holleran, & Esterly, 2003). Our interest in persuasive text for this study comes from the

finding that such text can be influential in sparking students’ interest and deepening their

knowledge (e.g., Buehl, Alexander, Murphy, & Sperl, 2001; Carrell & Connor, 1991). This may

be especially true for two-sided refutational text in which competing views on an issue are

presented, although to the advantage of one view over the other (Allen, 1991). We expected that

by presenting participants with two different texts on judicial review, we might uncover

differences in their metacognitive monitoring and control.

       Whether the text is expository or persuasive, it is still necessary to find some viable

method for unearthing typically covert mental processes. This is no easy task and is made more

difficult because the measurements themselves may in fact disrupt these mental processes. This

has long presented a problem for metacognitive researchers, who have attempted a variety of

metacognitive measures. In their review of the metacognition literature, Dinsmore et al. (2008)

identified six types of measures in the contemporary literature: self-report, observation, think-

aloud, interviews, and performance ratings. Measures of metacognition should be chosen based

on their utility in uncovering these mental processes without disrupting them. Previous measures,

such as performance ratings (i.e., calibration) and observational measures such as rereading (as

measured by the number of times participants scroll backward through text; e.g., Johnson-

Glenberg, 2005) and help seeking (operationalized as soliciting help from an outside source; e.g.,

Aleven & Koedinger, 2002) have not shown evidence of disrupting covert mental processes.

       Of particular interest here was the effect think-aloud protocols may have on other

measures of monitoring and control commonly used (i.e., calibration and observational

measures). This issue has become more salient because there has been an increasing use of think-

aloud methodology in the metacognition literature (Dinsmore et al., 2008). In a think-aloud

protocol, participants are asked to perform a task while continuously reporting thoughts that

occur during the task (Ericsson & Simon, 1984). Further, Ericsson and Simon conjectured that these

thoughts emanate from working memory. By positioning these concurrent verbalizations in

working memory, think-aloud protocol should only elicit verbalizations about deliberately

enacted strategies, not automated skills (e.g., decoding in reading).

       However, the question of whether the think-aloud protocol affects processing is far from

resolved. Veenman, Elshout, and Groen (1993) investigated this issue with measures of

regulatory processing in a discovery learning situation. They found no significant differences in

their measures of regulatory processing (as measured by student performance relevant to

strategic processing) between the think-aloud and no think-aloud conditions. However, they did

find that time on task differed significantly between the two groups. Because the think-aloud

protocol took significantly longer, it is quite possible that it placed higher demands on

participants’ working memory. These higher demands may limit the amount of strategic

processing one is able to engage in. Conversely, the task may take longer because the protocol

itself elicits more strategic processing from the participant.

       Although there is no direct empirical evidence that think-aloud protocol affects strategic

processing, there is evidence that it negatively affects learning outcomes. Karahasanović, Hinkel,

Sjøberg, and Thomas (2009) concluded in their study that think-aloud protocol impacted not only

reading time, but also negatively impacted participants’ posttest scores. Further, in a descriptive

study, Greatorex and Süto (2008) found wide variation in participants’ comments about their

experience with the think-aloud protocol.

       Given the increased demands on participants’ time, the variability of participants’

descriptions of their experience with the think-aloud protocol, and the negative impact on

learning outcomes, it seems likely that some aspect of metacognitive

monitoring and control would be affected by the think-aloud protocol. This study addresses this

concern by comparing measurements of metacognitive monitoring and

control not expected to affect covert mental processing (i.e., scrollbacks, calibration, and help

seeking) in a think-aloud and no think-aloud condition. We would expect that the think-aloud

protocol would elicit more instances of metacognitive monitoring and control due to the fact that

it attempts to make normally covert processes overt.

       Finally, the present research addresses how metacognitive monitoring and control change

with expertise in a particular domain, a relation that has received minimal consideration in the

literature. With a few notable exceptions, most studies of metacognition have not considered the

effect of expertise on metacognitive processes (e.g., de Bruin, Rikers, & Schmidt, 2007). Rather,

they have investigated single populations (i.e., readers who have similar levels of expertise

relative to the content of the text) or have not addressed the issue of expertise at all (e.g., Rhodes

& Castel, 2008). This research paradigm is problematic because the literature predicts

differential processes for individuals at varying levels of expertise within a domain. For example,

Alexander’s Model of Domain Learning (MDL; Alexander, 1997) hypothesizes that levels of

expertise (i.e., acclimation, competence, and proficiency) result from the differential confluence

of knowledge, interest, and strategies; a confluence that likely has implications for metacognitive

monitoring and control processes. For instance, it is probable that individuals at higher levels of

expertise are more knowledgeable about and invested in issues relevant to their domain, and thus

likely engage in different patterns of metacognitive monitoring and control, particularly with

respect to calibration, than do novices while reading the same text.

        In the current study, this relation was addressed by targeting pools of participants at

varying levels of expertise in government and politics, the domain in which our task was situated

(i.e., the texts utilized for this study were on the topic of judicial review). The first pool was

composed of undergraduates in a human development course, whom we predicted to have low prior

knowledge and interest in the domain (i.e., acclimation). We also recruited undergraduates

enrolled in a government and politics course, whom we expected to demonstrate moderate levels

of prior knowledge and interest in government and politics (i.e., competence). Lastly, we

included practicing attorneys for our expert group, predicting that they would articulate high

levels of prior knowledge and interest in government and politics. Moreover, their professional

status indicated their level of expertise in the domain. It was our expectation that these groups

would differentially monitor and control their reading behaviors.

                                               Method

Participants

        The participants for this study were recruited from three different pools. The first pool

consisted of undergraduates at a large mid-Atlantic university in the United States enrolled in

two sections of an introductory human development course. For the students enrolled in the

human development course (n = 38) the average age was 21.16. Participants in this first pool

were 52.63% female and 76.32% Caucasian. The average GPA for this first pool was 3.25 and

they had completed an average of 80.68 cumulative college credits. These participants came

from a variety of academic majors.

        The second pool consisted of undergraduates at the same university who were enrolled in

an upper-level government and politics course. For the students enrolled in the government and

politics course (n = 38) the average age was 20.34. Participants in this second pool were 39.47%

female and 60.53% Caucasian. The average GPA for this second pool was 3.30 and they had

completed an average of 74.47 cumulative college credits. Of the participants from the

government and politics class, 71.05% were government and politics majors.

        The third pool consisted of practicing attorneys from the mid-Atlantic region of the

United States. For the practicing attorneys (n = 4) the average age was 28.5. Participants in this

third pool were all male and 75.00% Caucasian.

Materials

        The materials for this study were all computerized. The materials in the computer

environment consisted of two text passages and a glossary, as well as the measures for the study.

        Text passages. The texts for this study consisted of an expository passage and a two-sided

refutational passage. The topic for these texts was judicial review. Currently, there is some

debate over the use (i.e., the overuse) of judicial review that is referred to as judicial activism.

Each of the two passages was adapted so that they were of similar length and difficulty. These

passages were presented in a text box with scroll arrows on the right hand side. Three lines of

text were clearly visible at a time. Lines of text both above and below the target text were in light

gray.

       The expository passage (Appendix A) was adapted from a Microsoft Encarta entry on the

judicial branch (Microsoft, 2008). This passage described the role of the judicial branch and did

not contain any argument relating to judicial review or judicial activism. The expository passage

was 1,111 words and was 79 lines long. The Flesch Reading Ease for this passage was 44.5 and

the Flesch-Kincaid Grade Level was 12.4.
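
       As an illustration only, the two readability indices reported for the passages can be

computed from word, sentence, and syllable counts using the standard Flesch formulas. The

manuscript does not report sentence or syllable counts, so the function names and the counts in

the usage note are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate U.S. school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical counts, not taken from either passage in the study.
    print(flesch_reading_ease(words=100, sentences=5, syllables=150))
    print(flesch_kincaid_grade(words=100, sentences=5, syllables=150))
```

Both indices depend only on average sentence length and average syllables per word, so two

passages of similar length can still differ in score, as the two passages here do.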

       The two-sided refutational passage (Appendix B) was adapted from two sources. The

first source was from a transcript of a speech given by then Attorney General Alberto Gonzales

at the American Enterprise Institute on January 17, 2007, entitled "Democracy and the Third

Branch" (Gonzales, 2007). Gonzales argued in his speech that the judicial branch should

exercise extreme caution when declaring executive and legislative actions unconstitutional.

The second source was an article written by Clint Bolick, a member of the Cato Institute,

which appeared in the Wall Street Journal on April 3, 2007, entitled "A Cheer for Judicial

Activism" (Bolick, 2007). Bolick argued in the article that the judiciary must do everything

possible to ensure that the government does not infringe on individuals' civil liberties. These two

sources were woven together to create a two-sided refutational text in which Gonzales's

arguments restricting the use of judicial activism were refuted by Bolick's arguments for the

judiciary to do everything possible to ensure individuals' liberties. The two-sided refutational

passage was 1,213 words and was 81 lines long. The Flesch Reading Ease for this passage was

39.2 and the Flesch-Kincaid Grade Level was 14.2.

       Glossary. The glossary consisted of a glossary of terms as well as biographical

information on both Alberto Gonzales and Clint Bolick. The glossary of terms listed keywords

from each of the texts and gave their definitions. The definitions for these terms were adapted

from the Merriam-Webster Online Dictionary (Merriam-Webster, 2008). Sample terms included:

judicial activism, Alexander Hamilton, James Madison, tyranny, Constitutional law, deference,

and judicial review. The brief biographies for both Alberto Gonzales and Clint Bolick were each

less than three hundred words.

Measures

       The measures for this study were also all computerized. The measures for the study

include: demographics, prior knowledge, topic interest, passage knowledge, and calibration.

       Demographics. The demographics questionnaire had students report their sex, age, and

ethnicity (using the United States Census Bureau categories). The undergraduates were

also asked to report their academic major, cumulative college credits completed, and their

cumulative grade point average (based on a four-point scale).

       Prior knowledge. The prior knowledge test measured participants' prior knowledge on the

topic of the judicial review process. The measure consisted of sixteen multiple-choice items

based on information in both the expository and persuasive passages. All the prior knowledge

questions came from the two passages (eight from the expository passage and eight from the

persuasive passage).

       The responses for the multiple-choice items were scored using a targeted response model

(Alexander, Murphy, & Kulikowich, 1998). In this way, differentiation between those immersed

in the topic or domain and those not immersed in the topic or domain could be made. An

example of one of the multiple choice items appears below.

Appellate jurisdiction is exercised by __________.

       a. the United States courts of appeals (4)

       b. the Supreme Court (2)

       c. the President (0)

       d. trial courts (1)

       Each answer choice corresponded to one of the following categories: in-topic correct

responses, in-topic incorrect responses, in-domain incorrect responses, and popular lore

responses. In this case, the answer choice "the United States courts of appeals" was the in-topic

correct response and was scored a 4. The answer choice "the Supreme Court" was the in-topic

incorrect response and was scored a 2; it was incorrect but within the topic of judicial review.

The answer choice "trial courts" was the in-domain incorrect response and was scored a 1.

Although the trial courts fall within the domain of government and politics, they have no role in

the topic of judicial review. The answer choice "the President" was a "popular lore" answer and

was scored a 0. This response was one that someone with little to no domain knowledge might

choose.
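
       The targeted response scoring described above can be sketched as a simple lookup from

answer option to point value. This is a minimal illustration, not the scoring software used in

the study: the names are hypothetical, the key reproduces only the sample item, and scoring an

unanswered item as 0 is an assumption.

```python
# Hypothetical key for the sample item; point values follow the text above:
# in-topic correct = 4, in-topic incorrect = 2, in-domain incorrect = 1,
# popular lore = 0.
SAMPLE_ITEM_KEY = {"a": 4, "b": 2, "c": 0, "d": 1}


def score_response(item_key, choice):
    """Return the points for one chosen option (assumed 0 if unanswered)."""
    return item_key.get(choice, 0)


def score_test(item_keys, choices):
    """Sum the targeted-response points across all items of a test."""
    if len(item_keys) != len(choices):
        raise ValueError("one choice is required per item")
    return sum(score_response(key, c) for key, c in zip(item_keys, choices))


if __name__ == "__main__":
    print(score_response(SAMPLE_ITEM_KEY, "b"))  # the Supreme Court
    print(score_test([SAMPLE_ITEM_KEY, SAMPLE_ITEM_KEY], ["a", "d"]))
```

Because partial credit is graded by closeness to the topic, two participants who both answer

incorrectly can still receive different scores.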

       The Cronbach's alpha for the prior knowledge measure was 0.59. Although lower than

the suggested alpha for experimental measures of 0.70, the depressed alpha in this case may

represent participants' fragmentary knowledge on the topic of judicial review. Bernardi (1994)

suggests that alpha is partially dependent on the sample chosen. In this case, it is quite possible

that the weak correlation between items may have been due to the fact that the participants

(particularly the human development undergraduates) may have some declarative knowledge

(i.e., statements or propositions about a domain) but that this knowledge may not be principled

(i.e., overarching conceptualizations in a domain).
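
       For readers unfamiliar with the statistic, Cronbach's alpha can be computed from a

participants-by-items score matrix using the standard formula, alpha = k/(k - 1) * (1 - sum of

item variances / variance of total scores). The sketch below is illustrative only; the function

name and the toy data are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per participant)."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two participants and two items")
    k = len(scores[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Toy data: two items that rank three participants identically.
    print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
```

When items correlate weakly, the item variances account for most of the total-score variance

and alpha falls, which is consistent with the fragmentary-knowledge interpretation offered above.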

       Topic interest. Topic interest was assessed by having participants report their level of

interest for ten items related to the judicial branch. These items included how interested they

were in: checks and balances, historic court decisions, judges and justices, the Constitution, and

the founding fathers. The participants were asked to respond to these ten items by making a slash

on a 100-pixel line with "not interested" and "very interested" at opposite poles. The Cronbach's

alpha for this scale was 0.90. An example item for the topic interest scale appears below.


   Governmental systems of checks and balances




 not interested                                             very interested



       Passage knowledge. For each passage, knowledge was assessed immediately following

each passage. These passage knowledge questions related directly to information presented in the

passage and were similar to the prior knowledge questions both in the wording of the questions

and the response format. However, the particular questions that appeared after each passage

could only be answered from the passage the participants had just read. There were eight

questions per passage.

       Cronbach’s alpha for this scale was 0.39. As discussed above, these were the same

questions as the prior knowledge test. The lower alpha when these items are presented after

reading a passage actually presents an interesting picture. One possibility is the differential

ability of participants to learn from text, thereby weakening further the correlations between

items for this sample. Regardless, since these posttest items were taken directly from the

passage, the validity for the scale outweighs concerns about the reliability as reported by

Cronbach's alpha.

       Calibration. Immediately following each passage knowledge question, participants were

asked to rate their confidence in the answer to the preceding passage knowledge question. The

participants were asked to respond to the calibration items by "clicking on the line indicating

how confident you would be in the accuracy of your response to the following questions." The

Cronbach's alpha for the confidence scales was 0.89. A sample item for calibration appears

below.


Appellate jurisdiction is exercised by __________.




       0%                                                      100%
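
       The calibration indices analyzed later (absolute accuracy and bias) are not defined by

formula in this section. A minimal sketch under common definitions follows: bias is taken as

the mean signed difference between confidence and performance, and absolute accuracy as the

mean absolute difference (some treatments square the difference instead). Rescaling both

measures to a 0-1 range, and all names used, are assumptions made for illustration.

```python
def rescale(value, maximum):
    """Map a raw score (e.g., 0-100 confidence or 0-4 points) onto 0-1."""
    return value / maximum


def calibration(confidence, performance):
    """Return (absolute_accuracy, bias) for paired 0-1 scores.

    Assumed definitions: bias is the mean of (confidence - performance);
    absolute accuracy is the mean of |confidence - performance|.
    """
    if len(confidence) != len(performance) or not confidence:
        raise ValueError("need equal-length, non-empty lists")
    diffs = [c - p for c, p in zip(confidence, performance)]
    bias = sum(diffs) / len(diffs)
    absolute_accuracy = sum(abs(d) for d in diffs) / len(diffs)
    return absolute_accuracy, bias


if __name__ == "__main__":
    # Hypothetical participant: confidence in percent, item scores out of 4.
    conf = [rescale(c, 100) for c in (75, 25)]
    perf = [rescale(p, 4) for p in (4, 0)]
    print(calibration(conf, perf))
```

Under these definitions, a positive bias indicates overconfidence and a negative bias

underconfidence, while absolute accuracy ignores the direction of the error.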


Trace Data

         In addition to the measures described above, we collected trace data in the form of

logfiles for scrollbacks and help seeking. We also collected trace data in the form of audiotapes

for the think-aloud protocol.

         Scrollbacks. Scrollbacks were operationalized as the number of times a participant

scrolled backward through the text by three or more lines, similar to a study conducted by

Johnson-Glenberg (2005). Trace data were collected on participants' navigation patterns through

the text to give us a count of the total number of scrollbacks for each passage for each

participant. We also tracked the amount of time the participants spent on each

portion of the text (i.e., the three-line segments).
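
       One way the scrollback count might be derived from a navigation log is sketched below.

The log format is hypothetical (a time-ordered list of the line number at the top of the

reading window), and treating an uninterrupted run of backward moves as a single scrollback

is an assumption; the manuscript specifies only the three-line threshold.

```python
def count_scrollbacks(positions, min_lines=3):
    """Count backward runs of at least `min_lines` lines.

    `positions` is a time-ordered list of the top visible line number
    after each scroll event (a hypothetical log format).
    """
    count = 0
    run = 0  # lines moved backward in the current uninterrupted run
    for prev, cur in zip(positions, positions[1:]):
        if cur < prev:
            run += prev - cur
        elif cur > prev:
            # A forward move ends the backward run; count it if long enough.
            if run >= min_lines:
                count += 1
            run = 0
    # A backward run may still be open when the log ends.
    if run >= min_lines:
        count += 1
    return count


if __name__ == "__main__":
    print(count_scrollbacks([1, 2, 3, 4, 5, 4, 3, 2, 3, 4]))
```

Short regressions of one or two lines are ignored under this rule, so the count reflects

deliberate rereading rather than small adjustments of the reading window.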

         Help seeking. Help seeking was operationalized as the number of times a participant

accessed the glossary terms or biographies. Access to the glossary was either access to the terms

or the biographies. Trace data were collected to give us the total number of times for each

passage that the participants accessed the glossary.

         Think aloud. Participants in the think-aloud condition were asked to think aloud while

reading each of the two passages (see the procedures section for more information on the think-

aloud protocol). The 35 think alouds were transcribed into text files by the first and third authors.

These transcripts were then coded for instances of metacognitive monitoring and control by the

first and second authors. Using Flavell's (1979) conception of metacognition, transcripts were

coded for instances of metacognitive knowledge (MK), metacognitive experiences (ME), goals

(G), and the activation of strategies (AS). During coding, goals and the activation of strategies

were combined into a single code (G/AS). Definitions of these three codes and examples for each

appear in Table 1. The level of inter-rater reliability for a randomly selected 20% of the think

alouds was 90.66%. Differences between these codes were resolved through conference. This

level of inter-rater reliability was considered acceptable, and the first author coded the remainder

of the think-aloud transcripts using this coding scheme.
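
       The reliability figure above reads as a simple percentage of agreement between two

coders, although the manuscript does not name the index. Assuming that interpretation, the

computation can be sketched as follows; the function name and the example codes are

hypothetical.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of utterances given the same code by both raters."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need equal-length, non-empty code lists")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


if __name__ == "__main__":
    # Hypothetical codes for four utterances (MK, ME, G/AS as in the text).
    rater_1 = ["MK", "ME", "G/AS", "ME"]
    rater_2 = ["MK", "ME", "G/AS", "MK"]
    print(percent_agreement(rater_1, rater_2))
```

Percent agreement does not correct for agreement expected by chance, which matters more

when one code dominates the transcripts.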

Procedure

       All participants were treated according to APA (5th Edition) guidelines and completed a

consent form before participating. The experiment was conducted on four PCs in a laboratory

running Internet Explorer 7.0. Data were sent from the PCs to a secure external Apache server

running on a UNIX platform. The experiment was administered by the first, second, and third

authors. Both the order of passages (i.e., expository and persuasive) and the think-aloud

condition were counterbalanced in a Latin-squares design.

       No think-aloud condition. Participants were seated at one of four computer workstations

in the laboratory. At the beginning of the experiment participants were instructed, "For all of the

measures, if you don't know the answer, please take your best guess." Participants completed the

demographic, prior knowledge, and topic interest measures. According to the Latin-squares design,

participants in the no think-aloud condition either had the expository passage first or the

persuasive passage first. Participants were told that they were going to answer questions after

reading the passages. As the participants read the text, they scrolled up or down through the text

until they reached the end of the text. By clicking the "continue" button at the end of the passage,

students were directed to the passage knowledge measure. Immediately after the recognition

items, participants completed the confidence scales for each question (calibration). Following the

calibration items, participants completed beliefs and passage interest measures for that particular

passage. Participants then repeated the same procedure for the second passage. Following the

experiment participants were debriefed.

       Think-aloud condition. The procedure for the think-aloud condition was identical to the

no think-aloud condition, except for the following additions. Before the first passage subjects

were given instructions for the think-aloud protocol and given a short practice passage. The

protocol for the think aloud is included in Appendix C. The practice passage was about

mosquitoes and was adapted from a popularly written science article by Marston Bates (1975).

Once participants felt comfortable reading aloud, they then read either the expository or

persuasive text first. Before each passage participants were instructed, "As you read this text,

please say out loud what you are thinking and doing." Participants could choose to read aloud or

not. If participants were silent for more than 30 seconds the experimenter prompted the subjects

again to please say out loud what they were thinking or doing. This procedure was repeated for

the second passage.

                                      Results and Discussion

       Results for each of the three research hypotheses (i.e., textual influences on

metacognitive monitoring and control, effects of the think-aloud protocol on metacognitive

monitoring and control, and the influence of domain expertise on metacognitive monitoring and

control) are presented and briefly discussed.

       Due to internet and power failures during data collection, data from four participants were

lost. The following analyses used the 76 remaining participants (n = 36 for the think-aloud

condition and n = 40 for the no think-aloud condition). In addition, one think aloud was unusable

due to poor tape quality. Given the circumstances, these data can be considered missing at

random and not a participant effect. All presented analyses use data from the remaining 76

participants and 35 think-aloud transcripts unless otherwise noted.

       Means and standard deviations for the metacognitive monitoring and control variables

(i.e., scrollbacks, help seeking, absolute accuracy, and bias) appear in Table 2 across passages

and think-aloud conditions. The data in Table 2 provide evidence that the number of help seeking

behaviors that this sample engaged in was very limited. Due to the very low prevalence of help

seeking in this investigation (i.e., three participants), we have excluded it from further analyses.

       Since the data collected during this investigation consisted of both trace data (i.e., count

data) and difference scores (i.e., calibration), inferential analysis on the means of these variables

was considered inappropriate. Since these frequency data and difference scores should not be

considered to follow a normal distribution, we chose to use a non-parametric bootstrap

technique. Bootstrap has been identified as a good technique to test non-parametric data (Efron

& Tibshirani, 1993), such as frequency and difference scores in this investigation. For all of the

following tests we used the bootstrapping technique to resample (N=5000) from the participants

in our study (n=76). The re-sample created a distribution in which we calculated the median

(Med) along with a 95% confidence interval at the 2.5 (P2.5) and 97.5 (P97.5) percentiles. This

allowed us to test null hypotheses that differences between passages, conditions, groups, or

interactions were zero at α = 0.05.
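
The resampling procedure described above can be sketched as follows. This is a minimal illustration and not the analysis code used in this investigation: the function names, the default seed, and the choice of the mean as the statistic computed on each resample are assumptions, since the statistic is not specified here.

```python
import random
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted values."""
    if not sorted_values:
        raise ValueError("no values to summarize")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def bootstrap_summary(scores, statistic=statistics.mean,
                      n_resamples=5000, seed=0):
    """Resample participants with replacement and summarize the statistic.

    Returns the median (Med) of the bootstrap distribution together with
    its 2.5th and 97.5th percentiles. The default statistic (the mean) is
    an assumption made for this sketch.
    """
    rng = random.Random(seed)
    n = len(scores)
    dist = sorted(
        statistic([scores[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    return {
        "Med": percentile(dist, 50),
        "P2.5": percentile(dist, 2.5),
        "P97.5": percentile(dist, 97.5),
    }


def excludes_zero(summary):
    """True when the 95% percentile interval does not contain zero."""
    return summary["P2.5"] > 0 or summary["P97.5"] < 0
```

Under this sketch, a null hypothesis of zero difference would be rejected at α = 0.05 when `excludes_zero` returns `True` for a set of difference scores.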

Textual Influences on Metacognitive Monitoring and Control

       We compared the monitoring and control variables (i.e., scrollbacks, absolute accuracy,

and bias) between the expository and persuasive passages. In addition to looking at differences

between these measures, we also examined the think-aloud data from participants in the think-

aloud condition (n=35).

         Scrollbacks. We began by testing to see how many times participants scrolled backward

through the expository passage and the persuasive passage (a between-passages test). Figure 1

displays the medians for both passages (0.76 for the expository passage and 1.00 for the

persuasive passage). The median difference between these two passages was -0.25. This suggests

that overall scrollbacks were used 0.25 more times during the persuasive passage than the

expository passage. This difference was not significant (Med = -0.25, P2.5 = -0.82, P97.5 =

0.35).

         The lack of difference between passages may mask the difference within individuals

between the passages. Specifically, we calculated the difference scores for each individual on

scrollbacks between the passages in order to investigate whether individuals used scrollbacks

more often for the expository or persuasive passage. This is in effect a within-subjects repeated

measures test using bootstrapping. First, we tested the value of the absolute difference (by

participant) in the usage of scrollbacks between the passages by subtracting the number of

scrollbacks in the persuasive passage from the number of scrollbacks in the expository passage.

The median of the resample from the bootstrap test was 0.99. This indicates that a participant at

the 50th percentile of difference scores had a difference in scrollback usage between the passages

of 0.99, regardless of which passage they used the greater number of scrollbacks for. This

median difference was significantly different than zero (Med = 0.99, P2.5 = 0.72, P97.5 = 1.32).

         Specifically, to test our hypothesis that the persuasive passage would elicit more evidence

of metacognitive monitoring and control, we examined the directionality of these difference

scores (i.e., did the individuals use more scrollbacks for the persuasive passage?). Here we tested

the value of the signed difference (retaining the positive or negative value of the difference

score) in participants’ usage of scrollbacks between the passages. The median difference was

-0.24. This indicates that a participant at the 50th percentile of signed difference scores used 0.24

more scrollbacks for the persuasive text than the expository text. However, this was not a

significant difference (Med = -0.24, P2.5 = -0.61, P97.5 = 0.14). This evidence suggests that there

is in fact a main effect for scrollbacks between passages, but that the directionality (i.e., which

passage was greater) was non-significant in this sample.
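
The distinction between the absolute and the signed difference scores can be illustrated with a small sketch. The counts below are hypothetical values chosen for illustration, not data from this investigation, and the function name is an assumption. Signed scores are taken as expository minus persuasive, matching the sign convention of the medians reported above.

```python
def difference_scores(expository, persuasive):
    """Return (signed, absolute) per-participant difference scores.

    Signed scores are expository minus persuasive, so a negative value
    means more scrollbacks on the persuasive passage.
    """
    if len(expository) != len(persuasive):
        raise ValueError("each participant needs a count for both passages")
    signed = [e - p for e, p in zip(expository, persuasive)]
    absolute = [abs(d) for d in signed]
    return signed, absolute


# Hypothetical scrollback counts for four participants (not study data).
expository = [3, 0, 2, 0]
persuasive = [0, 3, 0, 2]

signed, absolute = difference_scores(expository, persuasive)
mean_signed = sum(signed) / len(signed)        # 0.0: directions cancel out
mean_absolute = sum(absolute) / len(absolute)  # 2.5: individuals still differ
```

In this toy example every participant scrolls back more on one passage than on the other, yet the signed scores cancel. This is the pattern that would produce a non-zero absolute difference alongside a non-significant signed difference.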

       Calibration. To test participants’ calibration between passages, we calculated both

absolute accuracy and bias. We used a similar procedure to the one Nietfeld, Cao, and Osborne

(2005) used. For absolute accuracy we calculated the difference between participants’ overall confidence

on the multiple-choice items (on a 100-pixel scale) and their corresponding performance on

those posttest multiple-choice items (percent correct on the posttest). Since we used a targeted

response model, we divided the scores across the eight items by 32 (maximum possible score on

all eight items) instead of the total number of items, as Nietfeld et al. (2005) did. We then took

the absolute value of these differences to get an absolute accuracy score for each individual. For

bias, we used the same procedure except that we retained the signed value of the difference score

between confidence and performance to see if the participants were over- or under-confident.
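
The two calibration scores can be sketched as follows. This is an illustration under stated assumptions rather than the scoring code used here: overall confidence is taken to be the mean of the item-level confidence ratings on a 0 to 100 scale, performance is taken to be the points earned as a percentage of the 32-point maximum, and the function and parameter names are hypothetical.

```python
def calibration_scores(confidences, item_points, max_points=32):
    """Return (absolute_accuracy, bias) for one participant on one passage.

    confidences: item-level confidence ratings on a 0-100 scale
                 (averaging them is an assumption of this sketch).
    item_points: points earned on each multiple-choice item.
    max_points:  maximum total score across all items (32 for eight items).
    """
    if not confidences or not item_points:
        raise ValueError("confidence ratings and item scores are required")
    mean_confidence = sum(confidences) / len(confidences)
    performance = 100.0 * sum(item_points) / max_points
    bias = mean_confidence - performance  # positive means overconfident
    return abs(bias), bias


# Hypothetical participant: confident at 80 on every item, 24 of 32 points.
absolute_accuracy, bias = calibration_scores([80] * 8, [4, 4, 4, 4, 2, 2, 2, 2])
# performance is 75.0, so bias is 5.0 (overconfident) and absolute accuracy 5.0
```

A positive bias indicates overconfidence and a negative bias indicates under-confidence, while absolute accuracy ignores the direction of the miscalibration.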

       We hypothesized that participants would be better calibrated for the persuasive passage

than the expository passage. Figure 2 presents the absolute accuracy and bias scores between the

passages. Lower difference scores indicate that participants were better calibrated. Medians for

absolute accuracy were 11.41 for the expository passage and 11.45 for the persuasive passage.

The median difference between the two passages for these participants on absolute accuracy was

-0.075. This indicates that a participant with an absolute difference score at the 50th percentile

was more closely calibrated on the expository passage than the persuasive passage by 0.075

points, regardless of whether they were overconfident or under-confident on the passages. This

difference was not significant (Med = -0.075, P2.5 = -3.88, P97.5 = 3.97).

       Further, we predicted that participants would be overconfident for the expository passage,

but not so for the persuasive passage. Figure 2 shows that, in fact, participants were

overconfident for the expository passage (Med = 3.97) and under-confident for the persuasive

passage (Med = -4.63). This indicates that a participant with a signed difference score at the 50th

percentile was overconfident on the expository passage by 3.97 points (confidence was higher

than performance) and under-confident on the persuasive passage by 4.63 points (performance

was higher than confidence). The median difference between the passages was significant (Med

= -8.59, P2.5 = -14.74, P97.5 = -2.39). This evidence suggests that while absolute accuracy did not

differ between the two passages, the direction of participants’ miscalibration (i.e., over- or

under-confidence) did.

       Think aloud. Figure 3 presents the differences in median number of utterances for

metacognitive knowledge, metacognitive experiences, and goals/activation of strategies within

the 35 participants in the think-aloud condition. Again, we tested the differences within

individuals by subtracting the number of metacognitive knowledge utterances in the persuasive

passage from the metacognitive knowledge utterances in the expository passage. The medians

for metacognitive knowledge utterances were 3.14 and 1.23 for the expository and persuasive

passages respectively. The median difference between passages at the individual level for

metacognitive knowledge was 1.91. This indicates that a participant with a metacognitive

knowledge difference score at the 50th percentile made 1.91 more metacognitive knowledge

utterances in the expository passage than the persuasive passage. This difference was significant

(Med = 1.91, P2.5 = 0.91, P97.5 = 3.09).

         The medians for metacognitive experience utterances were 2.71 and 5.60 for the

expository and persuasive passages respectively. The median difference between passages was

-2.77. This indicates that a participant with a metacognitive experience difference score at the 50th

percentile made 2.77 more metacognitive experience utterances in the persuasive passage than

the expository passage. This difference was also significant (Med = -2.77, P2.5 = -4.52, P97.5 =

-1.11).

         The medians for goals/activation of strategies were 1.54 and 2.11 for the expository and

persuasive passages respectively. The median difference between passages was -0.57. This

indicates that a participant with a goals/activation difference score at the 50th percentile made

0.57 more goals/activation of strategy utterances in the persuasive passage than the expository

passage. This difference was also significant (Med = -0.57, P2.5 = -1.11, P97.5 = -0.29). Overall,

this evidence suggests that type of text may affect both how much metacognitive monitoring

and control participants exhibit and which types of metacognitive monitoring and control they

exhibit.

Effects of the Think-Aloud Protocol on Metacognitive Monitoring and Control

         Next, we turn to an examination of the differences between the think-aloud and no think-

aloud groups in regards to scrollbacks and calibration. We predicted that think aloud would elicit

greater metacognitive monitoring and control. To test the hypotheses about the difference

between the think-aloud and no think-aloud conditions, we again relied on the non-parametric

bootstrap. For the following analyses differences in scrollbacks and calibration were examined

for passages within participants (i.e., difference scores) between the two groups (essentially a

repeated measures test of passage effects using the think-aloud groups as the between-subjects

effect).

           Scrollbacks. To test the hypothesis that participants in the think-aloud group would

demonstrate more metacognitive monitoring and control via scrollbacks and calibration, we

conducted a bootstrap test with a null hypothesis that the difference in the medians of each group

equaled zero. Figure 4 presents these data for scrollbacks. First, we looked to see if there were

differences in the absolute (unsigned) difference in scrollbacks between passages for each

participant. The absolute median difference between scrollbacks for each individual on the

passages was 0.64 for both the think-aloud and no-think aloud groups. The difference between

these two medians was not significant (Med = 0.00, P2.5 = -0.98, P97.5 = 0.51). This indicates that

a participant at the 50th percentile of the think-aloud group had the same absolute difference in

scrollbacks between passages as a participant at the 50th percentile of the no think-aloud group.

           Further, we examined if the groups scrolled back differently for one passage versus the

other by retaining the signed difference. For the think-aloud group, the median difference in

scrollbacks was -0.36, indicating that a participant in the 50th percentile of the think-aloud group

scrolled back 0.36 times more often in the persuasive passage than the expository passage. For

the no think-aloud group, the median difference in scrollbacks was -0.13, indicating that a

participant in the 50th percentile of the no think-aloud group scrolled 0.13 times more often in the

persuasive passage than the expository passage. The difference between the two groups on

scrollbacks between the passages was not significant (Med = -0.23, P2.5 = -0.98, P97.5 = 0.52).

These tests indicate that there were no between-subjects effects for the think-aloud condition.
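
A between-groups comparison of this kind can be sketched as below. It is a minimal illustration, not the analysis code used here: each group is resampled independently with replacement, the difference between the two group medians is recorded for every resample, and the percentile interval of those differences is checked against zero. The function name, the seed, and the example values are assumptions. In this investigation the per-participant inputs would be the difference scores between passages.

```python
import random
import statistics


def bootstrap_group_difference(group_a, group_b, statistic=statistics.median,
                               n_resamples=5000, seed=0):
    """Return (Med, P2.5, P97.5) of statistic(group_a) - statistic(group_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group independently, keeping its original size.
        a = rng.choices(group_a, k=len(group_a))
        b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistic(a) - statistic(b))
    diffs.sort()

    def pct(q):
        # Linear-interpolation percentile of the sorted differences.
        pos = (len(diffs) - 1) * q / 100.0
        lo = int(pos)
        hi = min(lo + 1, len(diffs) - 1)
        return diffs[lo] + (diffs[hi] - diffs[lo]) * (pos - lo)

    return pct(50), pct(2.5), pct(97.5)


# Hypothetical difference scores for two groups (not study data).
med, p_low, p_high = bootstrap_group_difference(
    [10, 11, 12, 13, 14], [0, 1, 2, 3, 4], n_resamples=1000, seed=2
)
group_effect = p_low > 0 or p_high < 0  # interval excludes zero
```

When the interval from `p_low` to `p_high` contains zero, as in the think-aloud comparisons reported above, the null hypothesis of no difference between groups is retained.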

           Calibration. To test the hypothesis that the think-aloud group would be more closely

calibrated than the no think-aloud group, we conducted a bootstrap test with a null hypothesis

that the difference between groups was zero. This is the between-subjects effects (think-aloud

condition) for the repeated measures (i.e., the passages) for calibration. Results for both absolute

accuracy and bias are presented in Figure 5. The first examination was of differences in the

groups’ absolute accuracy (the unsigned difference between confidence and performance). There

were no significant differences in absolute accuracy between the two groups for either the

expository passage (Med = -0.33, P2.5 = -4.17, P97.5 = 3.34) or the persuasive passage (Med =

1.31, P2.5 = -2.71, P97.5 = 5.28).

        The second examination was for bias (the signed difference between confidence and

performance). The difference between the two groups (i.e., think-aloud and no think-aloud) for

the expository passage was 0.94. This means that a participant at the 50th percentile of the think-

aloud group was more under-confident compared to a participant at the 50th percentile of the no

think-aloud group, though this was not statistically significant (Med = 0.94, P2.5 = -5.01, P97.5 =

7.09). These differences were also not significant for the persuasive passage with a median

difference between the two groups of 0.89. This indicates that a participant in the 50th percentile

of the think-aloud group was more under-confident than a participant at the 50th percentile of the

no think-aloud group (Med = 0.89, P2.5 = -5.42, P97.5 = 6.79). These tests indicate that there were

no significant differences in the between-subjects effects for think-aloud condition.

Effect of Domain Expertise on Metacognitive Monitoring and Control

        The participants for the study were chosen specifically because their various levels of

expertise were hypothesized to differ. Although the undergraduates (i.e., from the human

development and government and politics classes) are similar to each other in terms of their GPA

and cumulative college credits completed, they differed in both their prior knowledge and

interest in the judicial review process. Table 3 and Figure 6 show the differences in mean levels

of prior knowledge and topic interest across the three participant pools (which were continuous,

normally distributed data). As we would expect, there is a clear increase in both prior knowledge

and topic interest from the human development undergraduates (those in acclimation),

government and politics undergraduates (those in competence), to the practicing attorneys (those

in expertise).

        An omnibus ANOVA test indicated that there were significant differences in both prior

knowledge (F = 12.72, df = 2, p < 0.01) and topic interest (F = 9.59, df = 2, p < 0.01) between

these three groups. Contrasts (i.e., Fisher's LSD) indicated that there were also significant

differences between the human development undergraduates and the government and politics

undergraduates in both prior knowledge (Mdif = 6.46, SE = 1.54, p < 0.01) and topic interest (Mdif

= 14.46, SE = 4.08, p < 0.01). There were also significant differences between the human

development undergraduates and the practicing attorneys in both prior knowledge (Mdif = 12.85,

SE = 3.50, p < 0.01) and topic interest (Mdif = 29.33, SE = 8.92, p < 0.01). However, significant

differences were not found between the government and politics undergraduates and the

practicing attorneys in either prior knowledge (Mdif = 6.39, SE = 3.50, p = 0.072) or topic interest

(Mdif = 14.86, SE = 8.96, p = 0.10). We contend that these differences were not

detected due to the small sample size of the practicing attorneys combined with the conservative

nature of contrasts such as Fisher's LSD.

        Since significant differences were found between the two undergraduate participant pools

and we were able to obtain large enough samples, an examination of the differences between

these two groups in regards to scrollbacks and calibration was undertaken. To test the hypotheses

about the difference between the human development undergraduates (i.e., those in acclimation)

and the government and politics undergraduates (i.e., those in competence), we again relied on

the non-parametric bootstrap. For the following analyses differences in scrollbacks and

calibration were examined for passages within participants (i.e., difference scores) between the

two groups (essentially a repeated measures test of passage effects using the developmental

groupings as the between-subjects effect).

       Scrollbacks. To test the hypothesis that participants in the acclimation group would use

scrollbacks more often in the expository passage than the persuasive passage, whereas the

participants in the competence group would use scrollbacks more for the persuasive passage than

the expository passage, we conducted a bootstrap test with a null hypothesis that the difference in

the medians of each group equaled zero. Figure 7 presents these data for scrollbacks. First, we

looked to see if there were differences in the absolute (unsigned) difference in scrollbacks

between passages for each participant. The absolute median difference between scrollbacks for

each individual on the passages was 0.57 for the acclimation group and 1.38 for the competence

group. The difference between these two medians was significant (Med = -0.80, P2.5 = -1.38,

P97.5 = -0.26). This indicates that a participant at the 50th percentile of the competence group used

0.80 more scrollbacks in one passage versus the other, regardless of which passage they used the

greater number for.

       Further, we examined which of these passages had more scrollbacks in each of these

groups by retaining the signed difference. For the acclimation group, the median difference in

scrollbacks was 0.29, indicating that a participant in the 50th percentile of the acclimation group

scrolled back 0.29 times more often in the expository passage than the persuasive passage. For

the competence group, the median difference in scrollbacks was -0.84, indicating that a

participant in the 50th percentile of the competence group scrolled 0.84 times more often in the

persuasive passage than the expository passage. The difference between the two groups on

scrollbacks between the passages was significant (Med = 1.08, P2.5 = 0.33, P97.5 = 1.73). This

indicates that in addition to a repeated measures effect for passages (i.e., textual effects of

metacognitive monitoring and control), there is also a between-subjects effect of developmental

level.

         Calibration. To test the hypothesis that the group in competence would be more closely

calibrated than the group in acclimation, we conducted a bootstrap test with a null hypothesis

that the difference between groups was zero. This is the between-subjects effects (developmental

group) for the repeated measures (i.e., the passages) for calibration. Results for both absolute

accuracy and bias are presented in Figure 8. First, we looked to see if there were differences in

their absolute accuracy (the unsigned difference between confidence and performance). There

were no significant differences in absolute accuracy between the two groups for either the

expository passage (Med = -1.74, P2.5 = -5.48, P97.5 = 1.86) or the persuasive passage (Med =

0.00, P2.5 = -4.05, P97.5 = 4.09).

         For bias, slight differences between the groups began to emerge, particularly for the

persuasive passage. The difference between the two groups (i.e., acclimation and competence)

for the expository passage was 0.71. This means that a participant at the 50th percentile of the

competence group was slightly more overconfident compared to a participant at the 50th

percentile of the acclimation group, though this was not statistically significant (Med = 0.71, P2.5

= -5.28, P97.5 = 6.59). However, for the persuasive passage this difference was greater. In fact,

for our sample, a participant in the 50th percentile of the acclimation group was more

underconfident (by 5.13 points) than a participant at the 50th percentile of the competence group,

although this difference was not significant (Med = 5.13, P2.5 = -1.18, P97.5 = 11.55).

                                             Conclusions

        To our knowledge, this study was the first to investigate metacognitive monitoring and

control between expository and persuasive text. Previous evidence of persuasion on knowledge

and interest (Buehl et al., 2001) spurred us to investigate learners’ strategic processing with

persuasive text. In addition, it was important to us that measures of metacognitive monitoring

and control did not change or elicit different levels (quantity or quality) of participants’ mental

processing. Moreover, by sampling from participant pools which we hypothesized would have

varying levels of expertise (i.e., acclimation, competence, and proficiency) we were able to

examine these differences among participants of different familiarity with the domain.

        Of the results presented above, the most surprising to us was the very limited use of the

help-seeking feature (i.e., the glossary) by the participants of all expertise levels in this

investigation. Given the active lines of research dealing with help seeking in the literature (e.g.,

Aleven & Koedinger, 2002), we expected participants to use help seeking in at least one of the

two passages. Two reasons may underlie the limited use of help seeking here. One, it may be a

reflection of the differences in task environment, and two, it may be a reflection of the

participants' motivation.

        First, unlike Aleven and Koedinger’s work (which primarily deals with well-structured

tasks such as solving geometry problems), the task environment here was an ill-structured task,

comprehending text. Since participants were not required to find “an answer” to a problem, but

rather try to comprehend the passage, the participants may have been unaware that they needed

to seek help (a monitoring problem). Additionally, the accessibility of the help seeking feature

may make a difference in their probability of using the feature. For example, if the environment

(such as a cognitive tutor) prompts students with a help-seeking option, they may be more likely

to examine these features. In this investigation, participants were told the help-seeking feature

was available, but were not prompted during the task to use this feature.

       Second, participants’ motivation may have played a role in the limited use of help

seeking within this study (a control problem). Participants may know they do not understand a

term, but lack the interest or need to comprehend the passage to actually seek help. This finding

is particularly helpful in attempts to structure environments (computerized and otherwise) that

encourage participants to monitor and control their mental processes. However, with these results

in mind, we caution that if participants are prompted to seek help, this does not mean that they

will be able or willing to seek help on their own. This is particularly salient in the literature

dealing with metacognition and self-regulated learning since a large percentage of studies use

some form of prompting (Dinsmore et al., 2008).

Textual Influences on Metacognitive Monitoring and Control

       Rereading differed across conditions, as we found evidence that participants used

scrollbacks differently across the two passages, but that the participants did not necessarily use

scrollbacks more for the persuasive passage than for the expository passage. One possible

explanation may relate to working memory demands, while the other is a limitation with the

choice of measure in this investigation. If, as Kellogg (2001) found, persuasive text

places greater demands on working memory than expository text, this may explain why some

participants reread more for the persuasive text and others reread more for the expository text. In

order to deal with higher demands on working memory, one might need to use strategies, such as

rereading to deal with the higher demands of the persuasive passage. On the other hand, it is also

possible that the high demands on working memory make monitoring and control more costly,

causing one to reread less of the persuasive passage. We suspect that some of these issues will be

clarified as we examine rereading among the domain expertise groups. It would be our

contention that prior knowledge and interest of the individual may help explain these findings in

regards to rereading.

       An alternative explanation may involve the limitations of using scrollbacks to measure

rereading. We were only able to detect when participants scrolled back more than three lines.

Since we had collected think-aloud data for some individuals, an inspection of these transcripts

revealed that participants reported rereading more times than they had scrolled back (i.e., going

back one or two lines). We will continue to examine participants' strategic moves through text

with measures more fine-tuned than scrollbacks. For example, we are hoping that using eye-

tracking methodology will help us examine strategic moves through different types of text.

       While absolute accuracy did not differ between passages as we expected, the difference

in bias (i.e., overconfidence and under-confidence) was significant. We can forward two possible

explanations for this finding. First, the difference in bias may relate to participants' relative

familiarity in reading expository and persuasive passages. A large majority of the participants in

the study were university undergraduates who read mostly expository texts (i.e., textbooks) for

their classes. Familiarity with this type of text may increase their confidence to levels beyond

their actual performance. Conversely, their relative unfamiliarity with persuasive texts

(especially in the classroom environment) may make them less confident in their ability to

comprehend the text. As we collect more data for practicing attorneys, we hypothesize that their

familiarity with legal briefs (a type of persuasive text) may moderate their bias scores.

       Overall, metacognitive experience and goals/activation of strategies were higher for the

persuasive passage, while metacognitive knowledge was higher for the expository passage. This

finding makes sense to us, since the expository passage was primarily a collection of declarative

facts (e.g., “Since Marbury v. Madison, about 150 federal laws have been struck down in whole

or in part, along with about 1000 state laws and more than 100 municipal ordinances.”). Making

connections to their prior knowledge (e.g., “I knew that”, “I didn’t know that”) was the main

metacognitive monitoring activity during this expository passage. In the persuasive passage, by

contrast, participants had to evaluate both comprehension and agreement (e.g., “I don’t

understand that”, “I agree with that”) in order to analyze the arguments being presented in the

passage (e.g., “Gonzales, arguing against judicial activism, states that courts should be very

careful in taking the step of declaring that a law or agency action is unconstitutional.”).

Interestingly, there were more utterances of goals/activation of strategies in the persuasive

passage. This may indicate increased engagement with the text, especially since the participants

had to evaluate both comprehension and agreement more closely in the persuasive text than the

expository text. This finding supports the explanation that perhaps scrollbacks are not

fine-grained enough to differentiate the activation of strategies in these two types of texts.

Effects of the Think-Aloud Protocol on Metacognitive Monitoring and Control

       In line with previous studies (e.g., Veenman et al., 1993), we did not find significant

differences between the think-aloud and no think-aloud conditions in metacognitive monitoring

and control as measured by scrollbacks and calibration.

However, this does not mean that differences do not exist. It may be the case that our ability to

detect these differences was limited by our measures. For example, rereading was

operationalized as scrolling back through more than three lines of text. It is possible, as we stated

above, that participants looked back one or two lines more often in one of the conditions, but that

this difference was undetectable in our data.

       Although we found no significant differences here, we are still unsure that the think-

aloud protocol has no impact on metacognitive monitoring and control, especially given the

evidence that this protocol significantly affects participant outcomes (Karahasanović, Hinkel,

Sjøberg, and Thomas, 2009). Since we have other data on participant outcomes in addition to the

multiple-choice items, we plan to investigate whether this is the case in this study as well.

         Additionally, as Greatorex and Süto (2008) reported in their descriptive study, participants

reported varied experiences with the think-aloud protocol. An examination of Table 4 shows that

while the means for scrollbacks are similar for the two conditions, the standard deviation for the

think-aloud condition was higher. In fact Box’s Test (which tests the equality of the covariance

matrices between groups) was significant (F = 3.07, df = 3, 1559607, p < 0.05). This finding is in

line with what Greatorex and Süto (2008) found in their descriptive study. We can forward two

explanations for this difference in variance between the groups. The literature suggests that the

directions (specifically whether they chose to read out loud or not) may have primed some

participants to engage in certain behaviors, strategic and otherwise (Bannert & Mengelkamp,

2008).

Effect of Domain Expertise on Metacognitive Monitoring and Control

         For the third question, a between-subjects effect of developmental level, we found

significant differences between the groups' rereading behavior, but not their calibration. This

question is one of central importance, as studies comparing metacognition at different levels of

expertise are limited in the contemporary literature (Dinsmore et al., 2008). First, we found that

the government and politics students reread more for the persuasive passage than the expository

passage. This clarifies the findings from the within-subjects effects of the passages above. In

fact, it was interesting that unlike the trend for all the participants together, the human

development undergraduates as a group reread more for the expository text than the

persuasive text. Not only were these participants likely more unfamiliar with persuasive text in

the classroom environment, they were as a group more unfamiliar with the topic. We contend

that they were probably able to engage with the expository text more easily because it required

less prior knowledge to comprehend. Conversely, it is likely that more prior knowledge would be

necessary to understand and engage with the arguments presented for and against judicial

activism, which would subsequently impact ease of comprehension.

       We did not find significant differences for calibration, which is in line with previous

research showing that novices and experts do not necessarily differ in how well calibrated they

are to a task (Lichtenstein & Fischhoff, 1980). Both groups (i.e., acclimation and competence)

were overconfident for the expository passage and underconfident for the persuasive passage.

Overall, these participants actually seemed to be fairly well calibrated. The median participant

was miscalibrated by only about 11 points. We were surprised that most participants, particularly

those in acclimation, were so accurate.
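
As an illustration of the two calibration indices discussed here, the following minimal sketch computes bias (signed confidence minus performance) and absolute accuracy (the unsigned difference) and then takes the median across participants. It assumes that confidence and performance are expressed on the same 0-100 scale; the function name, variable names, and data are hypothetical and are not taken from this study.

```python
from statistics import median


def calibration_scores(confidence, performance):
    """Return (bias, absolute_accuracy) for one participant.

    Assumption for this sketch: both values are on the same 0-100 scale.
    Bias is signed (positive = overconfident, negative = underconfident);
    absolute accuracy is the size of the miscalibration regardless of sign.
    """
    bias = confidence - performance
    return bias, abs(bias)


# Hypothetical (confidence, performance) pairs; NOT data from this study.
participants = [(80, 70), (60, 75), (90, 85), (50, 70)]

scores = [calibration_scores(c, p) for c, p in participants]

# Medians are used rather than means, mirroring the median-based reporting above.
median_bias = median(b for b, _ in scores)
median_absolute_accuracy = median(a for _, a in scores)

print(median_bias, median_absolute_accuracy)
```

Because bias keeps its sign, overconfidence on one passage and underconfidence on another can cancel out in a summary value, which is why absolute accuracy is computed alongside it.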

       Considering metacognition through a developmental theory of expertise, such as the

MDL, is crucial. When assigning course texts, students' prior knowledge and interest should be

considered as these factors indicate competence within a domain and the subsequent ease with

which students can engage with particular types of text. As Fox et al. (2009) have cautioned,

undergraduates are not as apt to learn from assigned texts as instructors often assume. Difficulty

learning from text often stems from poor metacognitive monitoring and control (Wiley, Griffin,

& Thiede, 2005). To further investigate the findings reported in this study, it may be

necessary to collect more data from practicing attorneys, so we can examine differences among

all levels of expertise, including those demonstrating proficiency.

                                         Author Note

       We would like to thank Emily Fox for her help in adapting the passages. We would also

like to thank the members of the Disciplined Reading and Learning Research Laboratory for

their helpful comments and feedback on this manuscript.

                                           References

Aleven, V. A., & Koedinger, K. R. (2002). An effective metacognitive strategy: Learning by

       doing and explaining with a computer-based Cognitive Tutor. Cognitive Science, 26, 147-

       179.

Aleven, V. A., McLaren, B., Roll, I., & Koedinger, K. R. (2006). Toward meta-cognitive

       tutoring: A model of help seeking with a Cognitive Tutor. International Journal of

       Artificial Intelligence in Education, 16, 101-128.

Alexander, P. A. (1997). Mapping the multidimensional nature of domain learning: The interplay

       of cognitive, motivational, and strategic forces. In M. L. Maehr & P. R. Pintrich (Eds.),

       Advances in motivation and achievement (Vol. 10, pp. 213–250). Greenwich, CT: JAI

       Press.

Alexander, P. A., Murphy, P. K., & Kulikowich, J. M. (1998). What responses to domain-

       specific analogy problems reveal about emerging competence: A new perspective on an

       old acquaintance. Journal of Educational Psychology, 90, 397-406.

Allen, M. (1991). Meta-analysis comparing the persuasiveness of one-sided and two-sided

       messages. Western Journal of Speech Communication, 55, 390-404.

Bannert, M., & Mengelkamp, C. (2008). Assessment of metacognitive skills by means of

       instruction to think aloud and reflect when prompted. Does the verbalization affect

       learning? Metacognition and Learning, 3, 39-58.

Bates, M. (1975). The lady lives on blood. In A. Ternes (Ed.), Ants, Indians, and little dinosaurs

       (pp. 74-82). New York: Charles Scribner’s Sons.

Bernardi, R. A. (1994). Validating research results when Cronbach's alpha is below .70: A

       methodological procedure. Educational and Psychological Measurement, 54, 766-775.

Bolick, C. (2007). A cheer for judicial activism. Retrieved January 21, 2008, from

       http://www.cato.org/pub_display.php?pub_id=8168.

Buehl, M. M., Alexander, P. A., Murphy, P. K., & Sperl, C. T. (2001). Profiling persuasion: The

       role of beliefs, knowledge, and interest in the processing of persuasive texts that vary by

       argument structure. Journal of Literacy Research, 33, 269-301.

Carrell, P. L., & Connor, U. (1991). Reading and writing descriptive and persuasive texts. The

       Modern Language Journal, 75, 314-324.

Dahl, M., Allwood, C. M., & Hagberg, B. (2009). The realism in older people's confidence

       judgments of answers to general knowledge questions. Psychology and Aging, 24, 234-

       238.

de Bruin, A. B. H., Rikers, R. M. J. P., & Schmidt, H. G. (2007). Improving metacomprehension

       accuracy and self-regulation in cognitive skill acquisition: The effect of learner expertise.

       European Journal of Cognitive Psychology, 19, 671-688.

Dinsmore, D. L., Alexander, P. A., & Loughlin, S. M. (2008). Focusing the conceptual lens on

       metacognition, self-regulation, and self-regulated learning. Educational Psychology

       Review, 20, 391-409.

Dunlosky, J., Serra, M. J., Matvey, G., & Rawson, K. A. (2005). Second-order judgments about

       judgments of learning. Journal of General Psychology, 132, 335-346.

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman &

       Hall.

Ericsson, K. A., & Simon, H. A. (1984). Protocol analysis: Verbal reports as data. Cambridge,

       MA: MIT Press.

Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive–

       developmental inquiry. American Psychologist, 34, 906–911.

Fox, E., Dinsmore, D. L., Maggioni, L., & Alexander, P. A. (2009, April). Factors associated

       with undergraduates’ success in reading and learning from course texts. Paper presented

       at the annual meeting of the American Educational Research Association, San Diego,

       CA.

Gonzales, A. (2007). Speech to the American Enterprise Institute. Retrieved January 21, 2008,

       from http://jurist.law.pitt.edu/paperchase/2007/01/gonzales-disparages-judicial.php.

Greatorex, J., & Süto, W. M. I. (2008). What do GCSE examiners think of 'thinking aloud'?

       Findings from an exploratory study. Educational Research, 50, 319-331.

Johnson-Glenberg, M. C. (2005). Web-based training of metacognitive strategies for text

       comprehension: Focus on poor comprehenders. Reading and Writing, 18, 755-786.

Kamalski, J., Sanders, T., & Lentz, L. (2008). Coherence marking, prior knowledge, and

       comprehension of informative and persuasive texts: Sorting things out. Discourse

       Processes, 45, 323-345.

Karahasanović, A., Hinkel, U. N., Sjøberg, D. I. K., & Thomas, R. (2009). Comparing of

       feedback-collection and think-aloud methods in program comprehension studies.

       Behaviour & Information Technology, 28, 139-164.

Kellogg, R. T. (2001). Competition for working memory among writing processes. American

       Journal of Psychology, 114, 175-191.

Lichtenstein, S., & Fischhoff, B. (1980). Training for calibration. Organizational Behavior and

       Human Performance, 26, 149-171.

Microsoft. (2008). The judicial branch. Retrieved January 21, 2008, from

       http://encarta.msn.com/encyclopedia_761595623/Judicial_Branch.html

Miller, P. H., Kessel, F. S., & Flavell, J. H. (1970). Thinking about people thinking about people

       thinking about...: A study of social–cognitive development. Child Development, 41, 613–

       623.

Moos, D. C., & Azevedo, R. (2008). Self-regulated learning with hypermedia: The role of prior

       domain knowledge. Contemporary Educational Psychology, 33, 270-298.

Murphy, P. K., Long, J. F., Holleran, T. A., & Esterly, E. (2003). Persuasion online or on paper:

       A new take on an old issue. Learning and Instruction, 13, 511-532.

Nietfeld, J. L., Cao, L., & Osborne, J. W. (2005). Metacognitive monitoring accuracy and student

       performance in the postsecondary classroom. Journal of Experimental Education, 74, 7-

       28.

Parkinson, M. M., & Dinsmore, D. L. (in preparation). Calibrating calibration.

Rhodes, M. G., & Castel, A. D. (2008). Metacognition and part-set cuing: Can interference

       be predicted at retrieval? Memory & Cognition, 36, 1429-1438.

Shapiro, A. M. (2008). Hypermedia design as learner scaffolding. Educational Technology

       Research and Development, 56, 29-44.

Thiede, K. W., & Dunlosky, J. (1994). Delaying students' metacognitive monitoring improves

       their accuracy in predicting their recognition performance. Journal of Educational

       Psychology, 86, 290-302.

Veenman, M. V. J., Elshout, J. J., & Groen, M. G. M. (1993). Thinking aloud: Does it affect

       regulatory processes in learning? Tijdschrift voor Onderwijsresearch, 18, 322-330.

Wiley, J., Griffin, T. D., & Thiede, K. W. (2005). Putting the comprehension in

       metacomprehension. Journal of General Psychology, 132, 408-428.

Williams, J. P., Stafford, K. B., Lauer, K. D., Hall, K. M., & Pollini, S. (2009). Embedding reading

       comprehension training in content-area instruction. Journal of Educational Psychology,

       101, 1-20.

                                              Appendix A

                                           Expository Passage

       The Judicial Branch, the portion of the United States national government that decides

cases arising under federal laws and under the Constitution of the United States. The judicial

branch interprets laws that have been passed by the legislative branch (Congress) and approved

by the president of the United States, who leads the executive branch.

       Article III of the Constitution vests the judicial power in “one supreme Court, and in such

inferior courts as the Congress may from time to time establish.” This means that apart from the

Supreme Court, the organization of the judicial branch is left in the hands of Congress.

Beginning with the Judiciary Act of 1789, Congress created several types of courts and other

judicial organizations, which now include lower courts, specialized courts, and administrative

offices to help run the judicial system.

       Federal courts have a leading role in interpreting laws, rules, and other government

actions, and determining whether they conform to the Constitution. This function of judicial

review was asserted in 1803 by Chief Justice John Marshall in the case of Marbury v. Madison.

Judicial review includes both interpreting the law and judging cases. First, in Marshall’s words,

“it is emphatically the province and duty of the judicial department to say what the law is.” This

need to explain the law stems from the fact that the Constitution and many laws include vague

words or phrases. The ambiguity of the Constitution’s 14th Amendment, for example, makes it

one of the most important sources of cases argued before the Supreme Court. The amendment

guarantees citizens “due process of law” and “equal protection of the laws.” The meaning of

these phrases is unclear, leading to protracted court battles over the application of the 14th

Amendment to groups such as racial minorities, women, people with disabilities, and legal and

illegal aliens. Confusion and disagreement over the amendment have thrust the courts into

disputes over affirmative action, abortion, sexual preferences, welfare benefits, and the rights of

the disabled.

       Striking down laws or practices that violate the Constitution is another function of

judicial review. Although the Court voided few laws during its first hundred years, it proved

much more willing to take such strong steps in the 20th century. Since Marbury v. Madison,

about 150 federal laws have been struck down in whole or in part, along with about 1000 state

laws and more than 100 municipal ordinances.

       The courts do not always have the final say in settling issues of legal interpretation.

Working together, Congress and the states can compel the courts to accept a legal principle by

amending the Constitution. After the Supreme Court ruled that income taxes were

unconstitutional in Pollock v. Farmers’ Loan & Trust Co. in 1895, for example, Congress and the

states ratified the 16th Amendment in 1913 to permit such taxes. Amending the Constitution is

difficult and is usually time consuming, however.

       The president and members of Congress have their own ideas of what the Constitution

permits, and on occasion they may try to impede or simply ignore the courts’ decisions.

       The president of the United States appoints federal judges, but these appointments are subject to

approval by the Senate. Once confirmed by the Senate, federal judges have appointments for life

or until they choose to retire. Federal judges can be removed from their positions only if they are

convicted of impeachable offenses by the Senate, but this has happened on only a few occasions.

The life-long appointments of federal judges make it easier for the judiciary to stay removed

from political pressure. The long terms mean that presidential appointees to federal courts will

have an influence that lasts for decades, so the Senate closely scrutinizes many appointments,

and sometimes blocks them altogether.

         The federal courts—which include district courts, courts of appeal, and the Supreme

Court—handle only a small part of the legal cases in the United States. Most cases involve state

and local laws, so they are tried in state and local courts rather than federal courts. Despite its

relatively narrow jurisdiction, the caseload of the federal court system usually increases every

year. To cope with the rapidly rising volume of work, Congress has repeatedly expanded the

number of lower federal courts and judges.

         Most federal cases start out in the district courts, which are trial courts—courts that hear

testimony about the facts of a case. There are about 90 district courts, including one or more in

each state, one in the District of Columbia, one in Puerto Rico, and three territorial courts with

jurisdiction over Guam, the Virgin Islands of the United States, and other U.S. territories. Each

district is assigned from 2 to 28 judges, and there are about 650 district court judges in all. Each

year the district courts handle more than 250,000 civil cases and more than 45,000 criminal

cases, but only a tiny percentage of the civil and criminal cases actually go to trial.

       After a district court hears the facts of a case and issues a decision, the decision can be appealed

to the second tier in the judicial branch, the courts of appeals. The appeals courts can consider

only questions of law and legal interpretation, and in nearly all cases must accept the lower

court’s factual findings. An appeals court cannot, for example, consider whether the physical

evidence in a case was enough to prove a person was guilty. Instead, the appeals court might

consider whether the district court followed appropriate rules in accepting evidence during the

trial.

          The federal appeals courts system was created in 1891 to assist the Supreme Court with

its workload. About 50,000 such appeals are filed every year. For appeals purposes, the United

States is divided into 12 judicial areas called circuits, each with an appeals court containing from

6 to 28 judges. Every state, territory, and the District of Columbia belongs to an appeals circuit.

An additional appeals court, the Court of Appeals for the Federal Circuit, has nationwide

jurisdiction over major federal questions.

          Decisions of the appeals courts are final, unless the U.S. Supreme Court agrees to hear a

further appeal. In district courts, most cases are heard by a single judge. In the appeals courts,

cases are usually heard by a panel of three or more judges. When all of the court’s panels of

judges sit together to hear a case the court is said to be sitting en banc.

          The United States Supreme Court is the highest court of the country. It consists of nine

judges called justices, including a chief justice and eight associate justices. This number has

remained steady for decades and now seems fixed, although in the 19th century the Court’s size

varied.

                                            Appendix B

                                        Persuasive Passage

       Judicial activism has always been a subject of argument, but is now getting more

attention, particularly due to recent court decisions, such as Hamdan v. Rumsfeld. In this case, a

federal court decided that the Executive Branch could not hold certain suspects without trial

indefinitely. Judicial activism is viewed by its critics, such as Alberto Gonzales, as “the judiciary

overstepping the bounds set by the Constitution.” On the other hand, supporters of a strong,

active judiciary, such as Clint Bolick, feel that recent cases in which judges have been described

as "activist" are actually examples of the judiciary upholding its constitutional role and

protecting the rights of individuals. Both sides base their supporting arguments on historical

grounds, on checks and balances of the Constitution, and on citizens’ rights.

       Historical references are used as support both by critics and by those in favor of judicial

activism. Gonzales uses the writers of the U.S. Constitution to support his argument against

activism, saying that he does not believe those who wrote the Constitution ever intended that

judges or courts would take on the role of making policy. He refers to Alexander Hamilton's

statement in the Federalist Papers in which Hamilton says that the judicial branch of the

government will have the least power to endanger political rights because of the limited nature of

the functions assigned to it in the Constitution. Bolick uses similar but more compelling

historical references to make his case in favor of a stronger role for the courts. He argues that

judicial review, the power to invalidate unconstitutional laws, was essential to the type of

government established by our Constitution. He quotes James Madison, another writer of the

Constitution, who argued that one role of the judicial branch will be to guard our individual

rights from possible violation by the executive or legislative branches of government. For

example, courts have found that certain anti-abortion legislation made by states violates the 14th

amendment of the Constitution, which protects the "right to privacy." Therefore, many state laws

regarding abortion have been deemed unconstitutional by the courts and thrown out. So the

function of judicial review given to the courts by the Constitution actually gives them great

power as the guardian of the constitutional rights of every citizen.

       The checks and balances of the three branches of government are also used to both

criticize and support an active role for the judiciary branch. The writers of the U.S. Constitution

envisioned three separate but equal branches of the federal government. The checks and balances

of the Constitution ensure that no one branch of government or person has too much power.

Gonzales, arguing against judicial activism, states that courts should be very careful in taking the

step of declaring that a law or agency action is unconstitutional. He says that lawmakers and

Executive Branch officials have sworn to uphold the Constitution, just as judges do. Courts that

too easily use the Constitution as a way to strike down the actions of the other branches may not

be allowing the legislature and the President to exercise their proper constitutional roles.

However, Bolick raises the counter-argument that the courts are well equipped to second-guess

lawmakers’ decisions that may be made too hastily or for the wrong reasons and that do not take

into account all of the possible Constitutional issues. If legislators carefully considered the merits

and constitutionality of legislation, then Gonzales's arguments might have merit. But our

legislators rarely even read the complex bills they pass, which all too often are written to please

outside interests, such as lobbyists who may have special interests or big business at heart.

Judges, by contrast, look carefully at the competing evidence presented by both sides, as they

should. If the courts did not check whether laws or decisions by the executive branch are actually

in line with the Constitution, they would not be carrying out their own constitutional role. This

would undo our checks and balances system and allow the legislative and executive branches to

have too much power.

       Protection of citizens’ rights is another issue used both to criticize and to support an

active role for judges and the courts. Gonzales agrees that the courts must protect people from

situations where the wishes of the majority might go against an individual’s constitutional rights.

But he says that it is far more important to guard against the situation of having activist judges

who undermine the right of the people to govern themselves. We elect lawmakers and our

president and we have the right to expect that they will express the will of the majority – that is

their job. And if they do not, we have the power to select different representatives and a different

president in the next election. But when power is held by a few judges who are not elected and

who can overturn the actions of our elected officials, we face a far greater danger. Yet, in posing

this argument, Gonzales fails to take into account the other side of the problem, individual rights.

Bolick says that the situation of unelected judges overriding the strong and clearly expressed

wishes of a majority of the voters is extremely rare. A far greater problem is that judges do not

take enough care to protect individual rights. The courts are much more likely to presume that

laws and government actions are constitutional, making it much harder for individuals to prove

that their rights have been violated. Even worse, courts have decided that the Constitution does

not protect some very important individual rights against the interference of the government,

including some related to the protections and privileges that go with being a citizen. So not only

are courts ignoring legislation that is unconstitutional, they are interpreting the Constitution in a

way that lets the government override the rights of individual citizens.

       Gonzales concludes that if the people have decided they favor your policy goals at the

ballot box, then you get a chance to set policy and make laws. He says that the party that controls

Congress and has the votes to enact laws supporting their policies should be free to do so without

contradiction from activist judges who disagree with those laws on political grounds. Bolick

shifts the argument away from the narrow issue of politics. He argues instead that the importance

of judicial activism revolves around the minority rights that are the essential element of the

Constitution and our democracy. He says that a court gavel can be David's hammer against the

Goliath of big government. Among our governmental institutions, courts alone are designed to

protect the individual against the power of the majority, and against special interest groups with

too much influence. We all have a stake in seeing that the judiciary does protect us, for as

government expands with new demands, such as Homeland Security, our freedom depends on

the willingness of courts to keep the government in line. For better or worse, the courts are the

last line of defense against the government running roughshod over individual liberties. When

judges swear allegiance to the Constitution, they must be aware of the danger of going beyond

the proper bounds of their judicial power, but even more so of the greater danger of not using it

enough.

                                            Appendix C

                               Protocol for Think-Aloud Condition

Instructions for Think-aloud Protocol

"In this investigation, we are interested in what you think and do while you read a text. What we

want you to do is say what you are thinking and doing out loud. You can decide for yourself

whether you would like to read the text silently or out loud, or do some of both. Do whatever

feels most natural to you. We are only interested in what you are thinking or doing as you read.

For example, if you are going back to reread, please say that's what you are doing. If something

in the text reminds you of prior experiences or things you already know, let us know. If you are

thinking that you don't understand something, please say that, too. There are no right or wrong

things to say here, just whatever is going through your head as you read. If you are quiet for a

period of time, I'll ask you to say what you're thinking. Do you have any questions?"



Instructions for Practice Passage

"So that you can get comfortable with thinking aloud while you read, I'm going to give you a

practice passage to read first. This is just a practice, and I won't be recording what you say. You

can take your time and get used to how it feels. So, what I want you to do now is read the

passage and say what you're thinking and doing out loud."

Table 1

Codes Used for Think-Aloud Transcripts

Code                Description                           Example

Metacognitive       Knowledge or beliefs that affect      "Wow, I never knew that."
Knowledge (MK)      the course of mental operations       "Judicial activism, I'm pretty sure I
                    about a person, task, or strategy.    know what that is."

Metacognitive       Cognitive or affective experiences    "I'm being distracted by noise
Experience (ME)     that pertain to a mental operation.   outside."
                                                          "Ok, I didn't understand that part."

Goals and           Realizing through a                   "I'll just start that paragraph over."
Activation of       metacognitive experience and          "I'm going back, to re-read
Strategies (G/AS)   planning to evoke a strategy and      something."
                    evidence of those strategies.

Table 2

Descriptive Statistics of Metacognitive Monitoring and Control Across Think-Aloud

Conditions and Across Passages

                     Min.     Max.     Mean (SD)

Scrollbacks          0.00     8.00     1.76 (2.04)

Help Seeking         0.00     3.00     0.13 (0.47)

Absolute Accuracy    2.00    64.00    22.89 (13.50)

Bias               -49.88    25.00    -8.57 (15.26)

Table 3

Descriptive Statistics of the Three Participant Groups on Prior Knowledge and Interest in

the Judicial Review Process

                        Prior Knowledge                      Topic Interest
               Min.     Max. Mean (SD)             Min.     Max.      Mean (SD)

HDU            23.00    58.00    41.65 (7.00)      12.60    85.20      45.70 (16.21)


GPU            35.00    59.00    48.11 (6.53)       9.90    87.10      60.16 (17.52)


PA             52.00    57.00    54.50 (2.08)      53.30    97.50      75.03 (18.63)

Note. HDU = human development undergraduates, GPU = government and politics

undergraduates, PA = practicing attorneys

Table 4

Descriptive Statistics of Metacognitive Monitoring and Control Between Think-Aloud

Conditions and Across Passages

                              Think aloud                         No think aloud
                     Min.     Max.     Mean (SD)         Min.     Max.     Mean (SD)

Scrollbacks          0.00     8.00     2.03 (2.40)       0.00     6.00     1.53 (1.65)

Help Seeking         0.00     2.00     0.56 (0.33)       0.00     3.00     0.13 (0.56)

Absolute Accuracy    2.00    50.00    23.36 (11.73)      4.00    64.00    22.47 (15.06)

Bias               -49.88    25.00    -8.57 (18.25)    -30.88    16.13    -8.58 (12.20)

Figure 1

Median Number of Scrollbacks by Passage

[Figure: bar chart. Y-axis: Number of Scrollbacks (0 to 1.2). X-axis: Expository, Persuasive.]

Figure 2

Median Calibration Scores for Absolute Accuracy and Bias Between the Expository and
Persuasive Passages

[Figure: bar chart. Y-axis: Calibration (Confidence-Performance), -6 to 14. X-axis: Absolute Accuracy, Bias. Series: Expository, Persuasive. An asterisk (*) appears above the Bias bars.]

Figure 3

Differences in Think-Aloud Utterances for the Expository and Persuasive Passages

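[Figure: bar chart. Y-axis: Median Number of Utterances (0 to 6). X-axis: MK, ME, G/AS. Series: Expository, Persuasive. An asterisk (*) appears above each of the three categories.]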


Note: MK = Metacognitive Knowledge, ME = Metacognitive Experience, G/AS =
Goals/Activation of Strategies

Figure 4

Absolute and Signed Difference in Number of Scrollbacks Between the Expository and
Persuasive Passages Among Think-Aloud and No Think-Aloud Conditions

[Figure: bar chart. Y-axis: Difference in Scrollbacks (Expository-Persuasive), -0.6 to 0.8. X-axis: Absolute, Signed. Series: Think Aloud, No Think Aloud.]


Figure 5

Absolute Accuracy and Bias of the Expository and Persuasive Passages Among the Think-Aloud
and No Think-Aloud Conditions

[Bar chart. Y-axis: Calibration (Confidence-Performance), -6 to 14. X-axis categories: Absolute Accuracy (Expository), Absolute Accuracy (Persuasive), Bias (Expository), Bias (Persuasive). Legend: Think Aloud, No Think Aloud.]


Figure 6

Differences in Prior Knowledge and Topic Interest Among the Three Participant Groups

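[Bar chart. Y-axis: Average Score (0 to 100). X-axis categories: Prior Knowledge, Topic Interest. Legend: HD, GP, PA. Bar labels, left to right: Prior Knowledge 41.65, 48.11, 54.5; Topic Interest 45.7, 60.16, 75.03. Asterisks appear above bars in both categories.]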

Note. HD = Human Development Undergraduates, GP = Government and Politics
Undergraduates, PA = Practicing Attorneys


Figure 7

Absolute and Signed Difference in Number of Scrollbacks Between the Expository and
Persuasive Passages Among Human Development and Government and Politics Undergraduates

[Bar chart. Y-axis: Difference in Scrollbacks (Expository-Persuasive), -1 to 1.5. X-axis categories: Absolute, Signed. Legend: HD, GP. Asterisks appear above both categories.]

Note. HD = Human Development Undergraduates, GP = Government and Politics
Undergraduates


Figure 8

Absolute Accuracy and Bias of the Expository and Persuasive Passages Among Human
Development and Government and Politics Undergraduates

[Bar chart. Y-axis: Calibration (Confidence-Performance), -10 to 15. X-axis categories: Absolute Accuracy (Expository), Absolute Accuracy (Persuasive), Bias (Expository), Bias (Persuasive). Legend: HD, GP.]

Note. HD = Human Development Undergraduates, GP = Government and Politics
Undergraduates

								