Long-Term Memory: Storage (Information Processing Theory)

Introduction

This section concerns the storage of information in long-term memory (LTM). Although our understanding of LTM remains limited, because the brain cannot be observed directly, research has yielded a reasonably consistent picture of the storage processes involved.

The characterisation of LTM in this section describes a structure in which knowledge is represented as locations, or nodes, within interconnected networks. Note the resemblance between these cognitive networks and the neural networks discussed earlier in this course. In discussing networks, we focus primarily on declarative and procedural knowledge. Conditional knowledge is covered in section seven of this course, along with the metacognitive activities that monitor and direct cognitive processing. It is assumed that most knowledge is stored in LTM in verbal codes, though the role of imagery is also addressed towards the end of this section.

Propositions

The Nature of Propositions

A proposition is the smallest unit of information that can be judged true or false. Propositions are the basic units of knowledge and meaning in LTM (Anderson, 1990; Kosslyn, 1984; Norman & Rumelhart, 1975). Each of the following is a proposition:

The Declaration of Independence was signed in 1776.
Aunt Frieda dislikes turnips.
I am good at mathematics.
The main characters are introduced early in a story.

These exemplars of propositions may be evaluated for their veracity. Note, however, that individuals may dissent in their judgements. Carlos may opine that he is deficient in mathematics, whilst his instructor may deem him quite proficient.

The precise nature of propositions remains somewhat obscure. Whilst they may be conceived as sentences, it is more probable that they embody the meanings of sentences (Anderson, 1990). Research lends credence to the notion that we store information in memory as propositions, rather than as complete sentences. Kintsch (1974) furnished participants with sentences of equivalent length, yet varying in the number of propositions contained therein. The greater the number of propositions within a sentence, the longer the duration required for participants to comprehend it. This suggests that, albeit students may generate the sentence, “The Declaration of Independence was signed in 1776,” what they are most likely to have stored in memory is a proposition encapsulating only the salient information (Declaration of Independence: signed—1776). Save for certain exceptions (e.g., committing a poem to memory), it appears that individuals typically store meanings rather than verbatim wordings.

Propositions form networks composed of individual nodes or locations. Nodes may be thought of as individual words, although their exact nature is unknown and is probably abstract. For instance, students taking a history course are likely to have a “history class” network comprising nodes such as “book,” “teacher,” “location,” “name of the student sitting to their left,” and so forth.

Propositional Networks

Propositions are formulated in accordance with a set of rules. Researchers are in disagreement as to which rules constitute this set, yet they generally concur that rules amalgamate nodes into propositions and, in turn, propositions into higher-order structures or networks, which are assemblages of interrelated propositions.

Anderson’s ACT theory (Anderson, 1990, 1993, 1996, 2000; Anderson et al., 2004; Anderson, Reder, & Lebiere, 1996) posits an ACT-R (Adaptive Control of Thought-Rational) network model of LTM, exhibiting a propositional structure. ACT-R is a model of cognitive architecture endeavouring to explicate how all components of the mind co-operate to engender coherent cognition (Anderson et al., 2004). A proposition is formed by combining two nodes with a subject–predicate link, or association; one node constituting the subject and another the predicate. Exemplars include (implied information in parentheses): “Fred (is) rich” and “Shopping (takes) time.” A second type of association is the relation–argument link, wherein the relation is a verb (in meaning) and the argument is the recipient of the relation or that which is affected by the relation. Examples are “eat cake” and “solve puzzles.” Relation arguments may serve as subjects or predicates to construct complex propositions. Exemplars include “Fred eat(s) cake,” and “solv(ing) puzzles (takes) time.”

Propositions are interrelated when they share a common element. Common elements enable individuals to resolve problems, cope with environmental demands, draw analogies, and so forth. Absent common elements, transfer would not transpire; all knowledge would be stored in isolation, and information processing would be protracted. One would fail to recognise that knowledge pertinent to one domain is also relevant to other domains.

The figure 'Sample propositional network' illustrates an instance of a propositional network. The common element is “cat”, being a component of the propositions, “The cat walked across the front lawn,” and “The cat caught a mouse.” One may surmise that the former proposition is linked with other propositions pertaining to one’s dwelling, whilst the latter is linked with propositions concerning mice.
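The way a shared element ties propositions together can be sketched in code. The following minimal Python sketch assumes that each proposition is a simple (subject, relation, object) triple, which is an illustrative simplification rather than ACT-R's own notation. The third proposition and the helper names `index_by_node` and `shared_nodes` are hypothetical.

```python
from collections import defaultdict

# Each proposition is a (subject, relation, object) triple. The triple
# format is an assumption made for illustration, not ACT-R notation.
propositions = [
    ("cat", "walked across", "front lawn"),
    ("cat", "caught", "mouse"),
    ("front lawn", "belongs to", "house"),  # hypothetical extra proposition
]


def index_by_node(props):
    """Map every node to the propositions in which it appears."""
    index = defaultdict(list)
    for prop in props:
        subject, _relation, obj = prop
        index[subject].append(prop)
        index[obj].append(prop)
    return index


def shared_nodes(index):
    """Return the nodes that link two or more propositions."""
    return sorted(node for node, props in index.items() if len(props) > 1)


index = index_by_node(propositions)
print(shared_nodes(index))  # prints ['cat', 'front lawn']
```

Here “cat” joins the two propositions from the text, and the hypothetical “front lawn” proposition shows how the first of them could in turn connect to knowledge about one's home.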

Evidence suggests that propositions are organised in hierarchical structures. Collins and Quillian (1969) showed that individuals store information at the highest level of generality. For example, the LTM network for “animal” would have stored at its highest level such facts as “moves” and “eats.” Beneath this category would come subcategories such as “birds” and “fish.” Stored under “birds” are “has wings,” “can fly,” and “has feathers” (although exceptions exist: chickens are birds, yet they do not fly). The fact that birds eat and move is not stored at the level of “bird”, because that information is stored at the higher level of “animal.” Collins and Quillian found that retrieval times increased with the distance separating concepts in memory.
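The hierarchical account can be pictured with a small lookup routine. This is a minimal sketch, assuming a strict hierarchy in which each concept has one parent and each property is stored only at its most general level. The dictionaries `PARENT` and `PROPERTIES` and the function `levels_to_property` are hypothetical names, and the number of levels climbed is only a stand-in for retrieval time.

```python
# Toy version of the "animal" hierarchy from the text. Each concept has a
# single parent; properties are stored only at their most general level.
PARENT = {"bird": "animal", "fish": "animal"}
PROPERTIES = {
    "animal": {"moves", "eats"},
    "bird": {"has wings", "can fly", "has feathers"},
}


def levels_to_property(concept, prop):
    """Count the levels climbed before `prop` is found; None if absent."""
    levels = 0
    node = concept
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return levels
        node = PARENT.get(node)  # move up to the superordinate concept
        levels += 1
    return None


print(levels_to_property("bird", "has wings"))  # prints 0
print(levels_to_property("bird", "eats"))       # prints 1
```

In this sketch, verifying that a bird eats requires one more step than verifying that it has wings, which mirrors the longer retrieval times reported for more distant concepts.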

The notion of hierarchical organisation has been modified by research showing that information is not always stored hierarchically. Thus, “collie” is closer to “mammal” than to “animal” in an animal hierarchy, yet individuals are quicker to agree that a collie is an animal than to agree that it is a mammal (Rips, Shoben, & Smith, 1973).

Furthermore, familiar information may be stored both with its concept and at the highest level of generality (Anderson, 1990). If one has a bird feeder and often watches birds eating, one might have “eat” stored with both “birds” and “animals.” This finding does not detract from the central idea that propositions are organised and interconnected. Although some knowledge may be hierarchically organised, much information is probably organised less systematically in propositional networks.

The Storage of Knowledge

Declarative Knowledge

Declarative knowledge (knowing that something is the case) includes facts, beliefs, opinions, generalisations, theories, hypotheses, and attitudes about oneself, others, and world events (Gupta & Cohen, 2002; Paris et al., 1983). It is acquired when a new proposition is stored in LTM, usually in a related propositional network (Anderson, 1990). ACT theory posits that declarative knowledge is represented in chunks comprising the basic information together with related categories (Anderson, 1996; Anderson, Reder, & Lebiere, 1996).

The storage process operates as follows. First, the learner receives new information, such as when a teacher makes a statement or the learner reads a sentence. Next, the new information is translated into one or more propositions in the learner's working memory (WM). At the same time, related propositions in LTM are cued. The new propositions are associated with the related propositions in WM through the process of spreading activation (discussed later in this section). At this point, learners might generate additional propositions. Finally, all the new propositions (those received and those generated by the learner) are stored together in LTM (Hayes-Roth & Thorndyke, 1979).

The figure 'Storage of declarative cognizance' illustrates this process. Suppose a teacher is presenting a unit on the U.S. Constitution and says to the class, “The Vice President of the United States serves as President of the Senate but does not vote unless there is a tie.” This statement may cue other propositional knowledge stored in students' memories relating to the Vice President (e.g., elected with the President, becomes President upon the President's death or resignation, can be impeached for crimes of treason) and the Senate (e.g., 100 members, two elected from each state, 6-year terms). Putting these propositions together, the students should infer that the Vice President would cast a vote if 50 senators voted for a bill and 50 voted against it.

Storage problems may arise when students have no existing propositions with which to link new information. Students who are unfamiliar with the U.S. Constitution and do not know what a constitution is will draw a blank when they first hear the term. Information that lacks conceptual meaning can be stored in LTM, but students learn better when new information is related to something they already know. Showing students a facsimile of the U.S. Constitution, or relating it to a topic they have already studied (e.g., the Declaration of Independence), gives them a referent to link with the new information.

Even when students have studied related material, they may not automatically link it with new information. Often the links must be made explicit. When discussing the function of the Vice President in the Senate, teachers could remind students of the composition of the U.S. Senate and of the Vice President's other roles. Propositions sharing a common element are linked in LTM only if they are active in WM at the same time. This point explains why students might fail to see how new material relates to old, even when the link is obvious to the teacher. Instruction that best establishes propositional networks in learners' minds incorporates review, organisation of material, and reminders of things they know but are not currently thinking about.

As with many memory processes, meaningfulness, organisation, and elaboration facilitate the storage of information in memory. Meaningfulness is important because meaningful information can be readily associated with existing information in memory. Consequently, less rehearsal is needed, which saves space and time in WM. The students discussed in the opening scenario are having difficulty making algebra meaningful, and their teachers express frustration at not teaching the content in a meaningful fashion.

A study by Bransford and Johnson (1972) provides a dramatic illustration of the role of meaningfulness in storage and comprehension. Consider the following passage:

The procedure is actually quite simple. First you arrange things into different groups. Of course, one pile may be sufficient depending on how much there is to do. If you have to go somewhere else due to lack of facilities that is the next step, otherwise you are pretty well set. It is important not to overdo things. That is, it is better to do too few things at once than too many. In the short run this may not seem important, but complications can easily arise. A mistake can be expensive as well. At first the whole procedure will seem complicated. Soon, however, it will become just another facet of life. It is difficult to foresee any end to the necessity for this task in the immediate future, but then one never can tell. After the procedure is completed one arranges the materials into different groups again. Then they can be put into their appropriate places. Eventually they will be used once more and the whole cycle will then have to be repeated. However, that is part of life. (p. 722)

Without prior knowledge, this passage is difficult to comprehend and store in memory, because it is hard to relate to existing knowledge. However, knowing that it is about “washing clothes” makes remembering and comprehension easier. Bransford and Johnson found that students who knew the topic recalled approximately twice as much as those who did not. The importance of meaningfulness in learning has been demonstrated in numerous other studies (Anderson, 1990; Chiesi, Spilich, & Voss, 1979; Spilich, Vesonder, Chiesi, & Voss, 1979).

Organisation facilitates storage, for well-organised material is easier to relate to pre-existing memory networks than poorly organised material (Anderson, 1990). To the extent that material can be organised into a hierarchical arrangement, it furnishes a ready structure to be accepted into the LTM. Absent an existing LTM network, creating a new LTM network is easier with well-organised information than with poorly organised information.

Elaboration, the process of adding information to material to be learned, improves storage because, by elaborating information, learners may be able to relate it to something they know. Through spreading activation, the elaborated material may be quickly linked with information in memory. For example, a teacher might be discussing Mount Etna, the volcano. Students who can elaborate that knowledge by relating it to their own knowledge of volcanoes (e.g., Mount St. Helens) will be able to associate the new and old information in memory and better retain the new material.

Spreading Activation

Spreading activation explains how new information is linked to knowledge in LTM (Anderson, 1983, 1984, 1990, 2000; Collins & Loftus, 1975). The basic underlying principles are as follows (Anderson, 1984):

Human knowledge can be represented as a network of nodes, where nodes correspond to concepts and links to associations amongst these concepts.
The nodes within this network can exist in various states, corresponding to their levels of activation. More active nodes are processed “better.”
Activation can spread along these network paths by a mechanism whereby nodes can cause their neighbouring nodes to become active. (p. 61)

Anderson (1990) cites the example of an individual presented with the word “dog.” This word is associatively linked with such other concepts in the individual's LTM as “bone,” “cat,” and “meat.” In turn, each of these concepts is linked to other concepts. The activation of “dog” in LTM will spread beyond “dog” to linked concepts, with the spread lessening for concepts farther removed from “dog.”
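The “dog” example can be turned into a small simulation. What follows is a minimal sketch, not Anderson's model: it assumes activation starts at 1.0, is multiplied by a fixed `decay` factor at each link, and stops spreading once the amount passed on falls below a `threshold`. Those parameters, the second-order links in `LINKS`, and the function name `spread_activation` are all illustrative assumptions.

```python
from collections import deque

# First-order links follow the "dog" example in the text; the second-order
# links ("mouse", "dinner") are hypothetical additions for illustration.
LINKS = {
    "dog": ["bone", "cat", "meat"],
    "cat": ["mouse"],
    "meat": ["dinner"],
}


def spread_activation(source, links, decay=0.5, threshold=0.1):
    """Spread activation outward from `source`, weakening at every link."""
    activation = {source: 1.0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        passed = activation[node] * decay  # amount handed to each neighbour
        if passed < threshold:
            continue  # too weak to activate anything further
        for neighbour in links.get(node, []):
            if neighbour not in activation:  # keep the first (nearest) value
                activation[neighbour] = passed
                queue.append(neighbour)
    return activation


levels = spread_activation("dog", LINKS)
print(levels["cat"], levels["mouse"])  # prints 0.5 0.25
```

Concepts one link from “dog” end up more active than those two links away, which is the lessening of spread with distance described above.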

Experimental support for the existence of spreading activation was obtained by Meyer and Schvaneveldt (1971). These investigators employed a reaction time task that presented participants with two strings of letters, and asked them to decide whether both were words. Words associatively linked (“bread,” “butter”) were recognised faster than words not linked (“nurse,” “butter”).

Spreading activation results in a larger portion of LTM being activated than just the knowledge immediately associated with the content of WM. Activated information stays in LTM unless it is deliberately accessed, but it is more readily accessible to WM. Spreading activation also facilitates the transfer of knowledge to different domains. Transfer depends on propositional networks in LTM being activated by the same cue, so that students recognise that the knowledge is applicable in those domains.

Schemas

Propositional networks represent small pieces of knowledge. Schemas (or schemata) are large networks that represent the structure of objects, persons, and events (Anderson, 1990). Structure is represented with a series of “slots,” each of which corresponds to an attribute. In the schema for houses, some slots (and their values) might be as follows: material (wood, brick), contents (rooms), and function (human dwelling). Schemas are hierarchical; they are joined to superordinate ideas (building) and subordinate ones (roof).
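The slot structure of a schema maps naturally onto a small data structure. The sketch below is one possible representation, assuming that each slot holds a list of typical values and that hierarchy is recorded simply as the names of superordinate and subordinate concepts. The class `Schema` and the helper `typical_values` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Schema:
    """A schema as a set of named slots plus links up and down a hierarchy."""
    name: str
    slots: Dict[str, List[str]] = field(default_factory=dict)
    superordinate: Optional[str] = None
    subordinates: List[str] = field(default_factory=list)


# The house schema from the text, with its slots and typical values.
house = Schema(
    name="house",
    slots={
        "material": ["wood", "brick"],
        "contents": ["rooms"],
        "function": ["human dwelling"],
    },
    superordinate="building",
    subordinates=["roof"],
)


def typical_values(schema, slot):
    """Return the typical values for a slot, or an empty list if unknown."""
    return schema.slots.get(slot, [])


print(typical_values(house, "material"))  # prints ['wood', 'brick']
```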

Brewer and Treyens (1981) found research support for the underlying nature of schemas. Individuals were asked to wait in an office for a brief period, after which they were brought into a room where they wrote down everything they could recall about the office. Recall reflected the strong influence of a schema for an office. Participants correctly recalled that the office had a desk and a chair (typical attributes) but not that it contained a skull (a non-typical attribute). Books are a typical attribute of offices; although the office had no books, many participants incorrectly recalled books.

Schemas are important during teaching and for transfer (Matlin, 2009). Once students learn a schema, teachers can activate this knowledge when they teach any content to which the schema is applicable. Suppose an instructor teaches a general schema for describing geographical formations (e.g., mountain, volcano, glacier, river). The schema might contain the following attributes: height, material, and activity. Once students learn the schema, they can use it to categorise new formations they study. In so doing, they would create new schemata for the various formations.

Procedural Knowledge

Procedural knowledge, or knowledge of how to perform cognitive activities (Anderson, 1990; Gupta & Cohen, 2002; Hunt, 1989; Paris et al., 1983), is central to much school learning. We use procedural knowledge to solve mathematical problems, summarise information, skim passages, and perform laboratory techniques.

Procedural knowledge may be stored as verbal codes and images, in much the same way as declarative knowledge. ACT theory posits that procedural knowledge is stored as a production system (Anderson, 1996; Anderson, Reder, & Lebiere, 1996). A production system (or production) is a network of condition–action sequences (rules), in which the condition is the set of circumstances that activates the system and the action is the set of activities that occurs (Anderson, 1990; Andre, 1986; see next section). Production systems seem conceptually similar to neural networks.

Production Systems and Connectionist Models

Production systems and connectionist models offer paradigms for examining how cognitive learning processes operate (Anderson, 1996, 2000; Smith, 1996). Connectionist models represent a relatively new perspective on cognitive learning. To date, little research on connectionist models bears on education. Further information on connectionist models is available in other sources (Bourne, 1992; Farnham-Diggory, 1992; Matlin, 2009; Siegler, 1989).

Production Systems

ACT, an activation theory, specifies that a production system (or production) is a network of condition–action sequences (rules), in which the condition is the set of circumstances that activates the system and the action is the set of activities that follows (Anderson, 1990, 1996, 2000; Anderson, Reder, & Lebiere, 1996; Andre, 1986). A production consists of 'if–then' statements: the 'if' statements (the condition) include the goal and test statements, and the 'then' statements are the actions. For example:

IF I observe two numbers and they are to be summed,
THEN determine which is the greater, commence with that number, and enumerate up to the next. (Farnham-Diggory, 1992, p. 113)

Although productions are forms of procedural knowledge to which conditions (conditional knowledge) may be affixed, they also incorporate declarative knowledge.
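The addition production above can be written as a condition–action rule in code. This is a minimal sketch under stated assumptions: the state is a plain dictionary with hypothetical keys `goal` and `numbers`, the numbers are non-negative integers, and the closing phrase of the rule is read as counting on from the larger number by the size of the smaller one. The function name `count_on_production` is likewise an illustrative choice.

```python
def count_on_production(state):
    """IF two numbers are to be summed, THEN count on from the larger one.

    Returns the sum, or None when the condition is not met and the
    production therefore does not fire.
    """
    # Condition: the goal statement and the test statement.
    numbers = state.get("numbers", [])
    if state.get("goal") != "add" or len(numbers) != 2:
        return None

    # Action: start with the larger number and count up one step at a
    # time, once for each unit of the smaller number.
    larger, smaller = max(numbers), min(numbers)
    total = larger
    for _ in range(smaller):
        total += 1
    return total


print(count_on_production({"goal": "add", "numbers": [3, 5]}))  # prints 8
```

The condition part relies on declarative knowledge (which number is larger), which illustrates the remark above that productions also incorporate declarative knowledge.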

Learning procedures for performing skills often occurs gradually (J. Anderson, 1982). Initially, learners represent a sequence of actions in terms of declarative knowledge. Each step in the sequence is represented as a proposition. Learners gradually drop individual cues and integrate the separate steps into a continuous sequence of actions. For instance, children learning to add a column of numbers are likely at first to perform each step deliberately, perhaps even saying it aloud. As they become more skilful, adding becomes part of an automatic, smooth sequence that occurs rapidly and without deliberate, conscious attention. Automaticity is a central feature of many cognitive processes (e.g., attention, retrieval) (Moors & De Houwer, 2006). When processes become automatic, the processing system can devote itself to more complex parts of tasks.

A cardinal constraint on skill learning resides in the size limitation of WM (Baddeley, 2001). Procedures would be learned more expeditiously were WM capable of simultaneously holding all the declarative knowledge propositions. Inasmuch as it cannot, students must combine propositions deliberately and periodically pause to reflect (e.g., 'What am I to do next?'). WM contains insufficient capacity to construct extensive procedures in the nascent stages of learning. As propositions are amalgamated into diminutive procedures, the latter are stored in WM concurrently with other propositions. In this manner, larger productions are gradually constructed.

These notions explicate why skill learning advances more swiftly when students are capable of performing the prerequisite skills (i.e., when they attain automaticity). When the latter exist as well-established productions, they are activated in WM contemporaneously with new propositions to be integrated. In learning to resolve long-division problems, students who comprehend how to multiply simply recall the procedure when requisite; it need not be learned in conjunction with the other steps in long division. Albeit this does not appear to constitute the problem in the opening scenario, learning algebra poses a challenge for students with fundamental skill deficiencies (e.g., addition, multiplication), inasmuch as even simple algebra problems become arduous to answer correctly. Children with reading disabilities seem to lack the capacity to effectively process and store information simultaneously (de Jong, 1998).

In some instances, specifying the steps in detail is difficult. For example, creative thinking may not follow the same sequence for each student. Teachers can model creative thinking that includes such self-questions as, 'Are there any other possibilities?' Whenever steps can be specified, teacher demonstrations of the steps in a procedure, followed by student practice, are effective (Rosenthal & Zimmerman, 1978).

One impediment to the learning of procedures resides in the possibility that students might regard them as inflexible sequences to be followed irrespective of their appropriateness. Gestalt psychologists demonstrated how functional fixedness, or an inflexible approach to a problem, impedes problem-solving (Duncker, 1945). Adamantly adhering to a sequence whilst learning may facilitate its acquisition, but learners also require an understanding of the circumstances under which alternative methods are more efficient.

Occasionally, students overlearn skill procedures to the point that they eschew the use of alternative, simpler procedures. Concurrently, there exist few, if any, alternatives for numerous procedures that students learn (e.g., decoding words, adding numbers, ascertaining subject–verb agreement). Overlearning these skills to the point of automatic production becomes an asset to students and renders it easier to learn new skills (e.g., drawing inferences, composing term papers) that necessitate mastery of these basic skills.

One might contend that teaching problem-solving or inference skills to students deficient in basic mathematical facts and decoding skills, respectively, is of little avail. Research evinces that a poor grasp of basic number facts is correlated with diminished performance on complex arithmetic tasks (Romberg & Carpenter, 1986), and slow decoding is associated with poor comprehension (Calfee & Drum, 1986; Perfetti & Lesgold, 1979). Not only is skill learning affected, but self-efficacy suffers as well.

Practice is indispensable for the instatement of basic procedural knowledge (Lesgold, 1984). In the early stages of learning, students necessitate corrective feedback highlighting the portions of the procedure they implemented correctly and those requiring modification. Oftentimes, students learn certain parts of a procedure but not others. As students acquire skill, teachers can underscore their progress in solving problems more swiftly or more accurately.

Transfer of procedural knowledge transpires when the knowledge is linked in LTM with disparate content. Transfer is aided by having students apply the procedures to the disparate content and altering the procedures as necessary. General problem-solving strategies are applicable to varied academic content. Students learn about their generality by applying them to different subjects (e.g., reading, mathematics).

Productions are pertinent to cognitive learning, but several issues warrant address. ACT theory posits a single set of cognitive processes to account for diverse phenomena (Matlin, 2009). This view conflicts with other cognitive perspectives that delineate different processes depending on the type of learning (Shuell, 1986). Rumelhart and Norman (1978) identified three types of learning. Accretion involves encoding new information in terms of existing schemata; restructuring (schema creation) is the process of forming new schemata; and tuning (schema evolution) refers to the gradual modification and refinement of schemata that occurs when employing them in diverse contexts. These entail varying amounts of practice: much for tuning and less for accretion and restructuring.

ACT is essentially a computer program designed to simulate learning in a coherent manner. As such, it may not address the range of factors implicated in human learning. One issue concerns how individuals discern which production to employ in a given situation, particularly if situations lend themselves to the employment of different productions. Productions may be ordered in terms of likelihood, but a means for deciding what production is best given the circumstance must be available. Also of concern is the issue of how productions are altered. For instance, if a production does not function effectively, do learners discard it, modify it, or retain it whilst seeking further evidence? What is the mechanism for deciding when and how productions are changed?

Another concern pertains to Anderson’s (1983, 1990) claim that productions begin as declarative knowledge. This assumption seems too strong given evidence that this sequence is not always followed (Hunt, 1989). Because representing skill procedures as pieces of declarative knowledge is essentially a way station along the road to mastery, one might question whether students should learn the individual steps. The individual steps will eventually fall into disuse, so time may be better spent allowing students to practise them. Providing students with a list of steps to which they can refer as they gradually develop a procedure facilitates learning and enhances self-efficacy (Schunk, 1995).

Finally, one might question whether production systems, as generally described, are nothing more than elaborate stimulus-response (S-R) associations (Mayer, 1992). Propositions (bits of procedural knowledge) become linked in memory such that when one piece is cued, others are also activated. Anderson (1983) acknowledged the associationist nature of productions but argued that they are more advanced than simple S-R associations because they incorporate goals. In support of this point, ACT associations are analogous to neural network connections. Perhaps, as is the case with behaviourist theories, ACT can explain performance better than it can explain learning. These and other questions (e.g., the role of motivation) need to be addressed by research and related to the learning of academic skills to better establish the usefulness of productions in education.

Connectionist Models

A line of recent theorising regarding complex cognitive processes involves connectionist models (or connectionism, but not to be confounded with Thorndike’s connectionism discussed earlier in the course; Baddeley, 1998; Farnham-Diggory, 1992; Smith, 1996). Like productions, connectionist models represent computer simulations of learning processes. These models link learning to neural system processing wherein impulses fire across synapses to forge connections. The assumption is that higher-order cognitive processes are formed by connecting a large number of basic elements such as neurons (Anderson, 1990, 2000; Anderson, Reder, & Lebiere, 1996; Bourne, 1992). Connectionist models encompass distributed representations of knowledge (i.e., spread out over a wide network), parallel processing (many operations occur at once), and interactions amongst large numbers of simple processing units (Siegler, 1989). Connections may be at different stages of activation (Smith, 1996) and linked to input into the system, output, or one or more in-between layers.

Rumelhart and McClelland (1986) described a system of parallel distributed processing (PDP). This model is useful for making categorical judgements about information in memory. These authors furnished an example involving two gangs and information regarding gang members, including age, education, marital status, and occupation. In memory, the similar characteristics of each individual are linked. For example, Members 2 and 5 would be linked if they were both about the same age, married, and engaged in similar gang activities. To retrieve information about Member 2, one could activate the memory unit with the person’s name, which in turn would activate other memory units. The pattern created through this spread of activation corresponds to the memory representation for the individual. Borowsky and Besner (2006) described a PDP model for making lexical decisions (e.g., deciding whether a stimulus is a word).
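The retrieval idea in the gang example can be pictured loosely in code. The sketch below is not a PDP network (it has no connection weights and no learning); it only illustrates, under simplifying assumptions, how activating a name unit could activate the attribute units linked to it and, through them, other members who share those attributes. The member records in `MEMBERS` are hypothetical data, and `retrieve` is an illustrative function name.

```python
# Hypothetical member records; the attribute values are invented purely
# for illustration and are not taken from Rumelhart and McClelland.
MEMBERS = {
    "Member 2": {"age": "30s", "marital": "married", "activity": "bookie"},
    "Member 5": {"age": "30s", "marital": "married", "activity": "bookie"},
    "Member 7": {"age": "20s", "marital": "single", "activity": "burglar"},
}


def retrieve(name, members):
    """Activate a name unit and return the pattern it evokes.

    The pattern is the set of attribute units linked to the name; other
    members are activated in proportion to the units they share with it.
    """
    pattern = set(members[name].items())
    related = {}
    for other, attributes in members.items():
        if other == name:
            continue
        overlap = len(pattern & set(attributes.items()))
        if overlap:
            related[other] = overlap  # number of shared attribute units
    return pattern, related


pattern, related = retrieve("Member 2", MEMBERS)
print(related)  # prints {'Member 5': 3}
```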

Connectionist units bear some similarity to productions in that both involve memory activation and linked ideas. Concurrently, differences exist. In connectionist models all units are alike, whereas productions contain conditions and actions. Units are differentiated in terms of pattern and degree of activation. Another difference concerns rules. Productions are governed by rules. Connectionism has no set rules. Neurons 'know' how to activate patterns; after the fact one may furnish a rule as a label for the sequence (e.g., rules for naming patterns activated; Farnham-Diggory, 1992).

One problem for the connectionist approach is explaining how the system knows which of the many units in memory to activate and how these multiple activations become linked in integrated sequences. This process seems straightforward in the case of well-established patterns; for example, neurons know how to react to a ringing telephone, a cold wind, and a teacher announcing, 'Everyone pay attention!' With less-established patterns the activations may be problematic. One might also ask how neurons become self-activating in the first place. This question is important because it helps to explain the role of connections in learning and memory. Although the notion of connections seems plausible and grounded in what is known about neurological functioning, to date this model has been more useful in explaining perception than in explaining learning and problem-solving (Mayer, 1992). The latter applications require considerable research.