V. Learning Ch. 8. (See relevant text and
readings for this topic from course schedule) The behaviorist (and other) views of learning as a
modification in behavior based on experience.
The basic models here describe
learning mechanisms that allow an organism to learn what things go
together in the world (or what are reliable signals of impending events) and
to learn to repeat behaviors that "work" in leading to
pleasure (which often results from the satisfaction of basic drives and thus
has adaptive value). Not all behavior or learning fits these basic models,
but they are nonetheless of great use.
1. Pavlov and classical conditioning. Conditioned and unconditioned stimuli and
responses. Learning what goes with what in the world--to respond to a new
stimulus in a manner similar to how you responded to an original one. We do
this because the new one (the CS) is a signal that the original one is about to
appear. Stimulus generalization and stimulus discrimination. An important way
of learning about the regularities in the world--what is associated with what.
Pavlov's work showed that learning can happen at low levels of the nervous
system.
2. Thorndike's law of effect and its later modification by Skinner into
instrumental learning. Reward based learning. Role of punishment. Extinction
and resistance to extinction. Impact of different schedules of
reinforcement--strong impact on how easy or hard it is to extinguish behavior.
Stimulus generalization and stimulus discrimination. Shaping or successive
approximation. Real world applications of instrumental learning principles.
3. Long term potentiation as a mechanism of learning. Experiments with Aplysia
and classical conditioning (neural model of learning).
4. Modern cognitive "reinterpretation" of what is learned--> a
more central & representational view of learning that modifies some of the
overgeneralized or overextended claims of behaviorism. Issues such as
belongingness, informational value of stimuli as signals of impending events,
(contingency). Remember the smart apes!
5. What is the usefulness/functionality of the Behaviorist view of learning--how
has it been applied/can it be applied? How might you use both of these basic
models of learning? Which do you think is the more useful or important? Can we
separate the usefulness from the ideology? In particular, how does learning
couple with motivation (homeostatic control) to make animals adaptive to their
changing environments?
A. Audition (Ch. 5): normal range 20-20,000 Hz; transduction from physical signal into neural signal occurs in the ear at the basilar membrane with the membrane vibrating against the hair cells. Know the parts of the ear.
Two theories: Frequency theory states that the frequency of the action potentials tells the actual frequency in the real world (problem is that we could only hear to 1000 Hz since neurons can only fire about 1000 times per second). Place theory says that where on the basilar membrane hair cells are depolarized (firing) tells the frequency of the sound in the real world (different parts of the basilar membrane have different points of maximum vibration and this gets sharpened via lateral inhibition).
Recall that we do not hear all sounds equally well. We are more sensitive to the central range than the low or high ends, and that is the range where most speech sounds occur. This effect decreases with volume (remember the "loudness control" issue in your stereo.)
B. Vision (Ch. 5): (low level stuff); how do you focus on objects? Change the shape of your lens. Recall the responses to looking at something close: (1) the lens gets fat, (2) the eyes converge (point in--cross-eyed), and (3) the pupil gets smaller. Also remember what the eye looks like: where are the lens, pupil, and retina? Where does vision (or at least the portion of the process that goes on in the eye) really take place? (answer: retina)
Retina: Rods and Cones: rods are very sensitive to light; cones are sensitive to color and have high acuity (acuity = how sharply you can see something). These two different systems lead to the duplex theory of vision: The center of the retina (fovea) is good for processing fine detail and color. There are mostly cones here, and each cone connects to one ganglion cell, which connects to one cell in the visual cortex. Because of this the fovea has a very direct path to the brain. The peripheral part of the retina is predominantly rods. Because of this you have poor color vision in the periphery (very few cones), but you are very sensitive to light (consider looking at stars at night--you cannot see them if you look directly at them--i.e., with your fovea, but you can see them by looking to the side of them using the more peripheral retina). The peripheral system is however not sharp (low acuity).
Other things to remember about vision: very wide intensity range from candle at 12 miles to noonday sun (ratio of intensities is 1:10,000,000,000,000); we can also see things that are very small (1 second of arc, which is 1/60 of 1/60 of 1/360 of a circle--very small indeed--under optimal conditions a bar the thickness of a thumb at 15 miles!).
Vision and the brain: Lateral (or sideways) inhibition: (originally studied with Limulus, the horseshoe crab) this occurs when cells that are next to each other inhibit their neighbors; this is good for detecting changes in the visual world (such as borders or edges). The visual system (and other sensory systems as well) seems to be specialized for picking up changes rather than steady state stimulation. Remember the lateral inhibition lab that dealt with this issue and what it showed/how it works.
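To see why lateral inhibition sharpens borders, here is a minimal sketch of a single row of cells. It is only an illustration: the function name, the inhibition weight of 0.1, and the light levels are all assumptions chosen to make the arithmetic easy, not values from the lab or the text.

```python
def lateral_inhibition(intensities, inhibition=0.1):
    """Return each cell's output after subtracting a fraction of its neighbors' input.

    The inhibition weight (0.1) is an illustrative assumption, not a measured value.
    """
    outputs = []
    last = len(intensities) - 1
    for i, level in enumerate(intensities):
        # Cells at the ends of the row have only one real neighbor; assume the
        # missing neighbor receives the same light as the end cell itself.
        left = intensities[i - 1] if i > 0 else level
        right = intensities[i + 1] if i < last else level
        outputs.append(level - inhibition * (left + right))
    return outputs


# A dark region (10) meets a bright region (50); the edge lies between cells 3 and 4.
stimulus = [10, 10, 10, 10, 50, 50, 50, 50]
response = lateral_inhibition(stimulus)
for light, out in zip(stimulus, response):
    print(f"input {light:>2}  output {out:5.1f}")
```

In this sketch the dark cell next to the edge responds less than the other dark cells (it is inhibited by a bright neighbor), and the bright cell next to the edge responds more than the other bright cells (it is inhibited less by its dark neighbor). The border is exaggerated while the uniform regions stay flat.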
Feature detectors--a major mechanism of bottom up perception: Cortical Simple cells (Hubel & Wiesel): Simple cortical cells respond to bars of light (or darkness) surrounded by darkness (or light; see above) in a particular orientation. They seem to be made up of a line of circularly oriented retinal ganglion cells connecting to a brain cell. These are good examples of feature detectors--detectors that "look out" at the world for particular, organized features rather than simply responding to the level of light (or sound). Remember the study by Riggs, Ratliff, Cornsweet & Cornsweet where they presented the word BEER and held it on the same place on the retina (i.e., they did not let the subjects move their eyes to different parts of the word). The word faded along feature lines, e.g., PEEP, BFFR, PEER etc.
C. Perception (Ch. 6): Top down processing: Remember the Muller-Lyer illusion and the top-down explanation for it. One factor involved in that explanation (and in many other aspects of our perceptual processing) is our having a perceptual constancy, in this case, size constancy. This is the tendency to see a stimulus as having a constant size even though the retinal image varies greatly depending on how far away it is. There are other constancies as well. Remember the "we are not from Missouri" argument--that the eye-brain does not take a photograph of the world, but rather actively extracts information from it and reaches a decision about the stimulus. Top-down processes allow this to happen more effectively but this introduces the possibility of the wrong conclusion being reached. Think about top down effects found in the illusion lab. An overriding issue is that in order to perceive the world correctly we have to potentially distort it--top down perception is necessary! Review text treatment of depth perception and the type of cues (binocular and monocular) that accomplish it. Binocular cues include convergence (the two eyes converge on the same point/object) and monocular cues include linear perspective (far objects cast a smaller retinal image than near ones--picture railroad tracks running off to the horizon), relative size (also far objects cast a smaller retinal image than near ones), and interposition (far objects are usually blocked by near ones). These depth cues also show the importance of top-down processing.
II. Memory (Chapter 7) Remember the model from class. I won't be drawing it--you should know it by now! Three stages: Sensory store; STM; LTM. Loss can occur at each stage. Also, there is rehearsal at the STM level. Finally, there is encoding into and retrieval from LTM.
Memory was first studied experimentally by Ebbinghaus using nonsense syllables. Recall his experiments and what he found. (Findings in summary: Retention decreased rapidly at first (rapid forgetting) and then the rate of loss of information slowed. Also, spaced practice tended to reduce the amount of information forgotten, and the phenomenon of overlearning helped retention. Also, remember his retention measure: savings.)
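As a reminder of how the savings measure works, here is a small sketch. It assumes the usual textbook definition (the percentage of the original learning effort that is saved at relearning); the function name and the trial counts are made up for illustration and are not Ebbinghaus's data.

```python
def savings_percent(original_trials, relearning_trials):
    """Savings = percentage of the original learning effort saved when relearning.

    'Trials' could equally be minutes of study; only the ratio matters.
    """
    if original_trials <= 0:
        raise ValueError("original_trials must be positive")
    return 100 * (original_trials - relearning_trials) / original_trials


# Hypothetical example: a list took 20 trials to learn and 5 trials to relearn later.
example = savings_percent(20, 5)
print(f"savings = {example:.0f}%")
```

A savings score of 0% means relearning took as long as the original learning (nothing retained); higher scores mean more was retained, even if the person cannot consciously recall the list.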
A. Sensory store: iconic memory (in vision; echoic in audition). This is shown by Sperling's experiments with the 3x4 array of letters. He found that the sensory store (in vision) lasts about a second. He found this by comparing the whole report (subjects told to report the entire array) to the partial report (only report one row). Subjects were better in partial report. If, however, he delayed the report in the partial report condition, performance declined as a function of time. Also evidence from backward masking: If an image is presented in the same area right after a previous image, the second image will interfere with the first--it over-writes the previous image (no double exposures!).
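The logic of the partial report comparison can be sketched as a quick calculation. Since subjects do not know in advance which row will be cued, the proportion they get right on the cued row is taken as an estimate of the proportion of the whole array that was available. The scores below are hypothetical numbers for illustration, not Sperling's actual data, and the function name is an assumption.

```python
ROWS, COLUMNS = 3, 4
ARRAY_SIZE = ROWS * COLUMNS  # the 3x4 array holds 12 letters


def estimated_letters_available(correct_in_cued_row, row_length=COLUMNS, array_size=ARRAY_SIZE):
    """Scale performance on one randomly cued row up to the whole array."""
    return (correct_in_cued_row / row_length) * array_size


# Hypothetical scores: 4.5 letters in whole report, 3 of 4 letters in partial report.
whole_report_letters = 4.5
partial_estimate = estimated_letters_available(3)
print(f"whole report: {whole_report_letters} letters reported")
print(f"partial report estimate: {partial_estimate:.0f} of {ARRAY_SIZE} letters available")
```

With these made-up numbers, the partial report implies far more letters were briefly available than subjects could name in whole report, which is the pattern that points to a large but fast-fading sensory store.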
B. STM: Limited capacity/limited store (think of mental multiplication as an example of the limited nature). About 7 ± 2 slots in STM. This was the work of George Miller. You can beat this limitation by chunking the information into more meaningful units. Consider the 15 letters: FBI CIA TWA IBM NBA. You can remember all 15 because they form meaningful chunks of information. However, recall that Baddeley presented evidence that modifies this idea. He showed that more digits could be held if they had shorter names because subjects could go through the list fast and thus get back to the beginning of the list before it is lost! (Remember how Baddeley interpreted this rehearsal as an articulatory loop where everything that could be pronounced (rehearsed) in a second or two could be held. His view had this working along with a visual-spatial sketchpad for more visual-spatial information and a top level central executive to organize/control immediate memory.) Also recall the experiment by Chase (done right here at CMU) on increasing digit span. He trained a runner, S.F., to recall up to 86 digits. S.F. chunked the digits into running times and came up with a retrieval structure. The same technique was taught to a second subject, who learned to recall over one hundred digits! How? Chunking and retrieval structures (that allow direct input to LTM).
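The chunking example can be made concrete with a few lines of code. This is just an illustrative sketch: the helper name and the fixed chunk size of 3 are assumptions that happen to fit this particular letter string.

```python
STM_SLOTS = 7  # Miller's "about 7 ± 2"; 7 is used here as a rough midpoint


def chunk(letters, size=3):
    """Split a letter string into consecutive groups of `size` letters."""
    return [letters[i:i + size] for i in range(0, len(letters), size)]


letters = "FBICIATWAIBMNBA"
chunks = chunk(letters)

print(f"{len(letters)} separate letters -> exceeds {STM_SLOTS} slots: {len(letters) > STM_SLOTS}")
print(f"{len(chunks)} chunks {chunks} -> exceeds {STM_SLOTS} slots: {len(chunks) > STM_SLOTS}")
```

The same 15 letters overflow STM when treated as separate items but fit comfortably as 5 familiar units. Note that the grouping only helps because each group is already meaningful in LTM; an arbitrary split into threes would not do the same job.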
Note: The sensory store takes in a massive amount of information, but it's limited in time (i.e., the info fades rapidly). STM can hold less info, but for longer periods of time (especially if you rehearse). So, think of the sensory store as having little capacity limitation but a very tight temporal (time) limitation, and STM as having a very limited capacity but less of a temporal limitation than sensory store.
Short term nature of STM: Peterson & Peterson vs. Waugh & Norman on how material is "lost" from STM. P & P argued for decay by preventing rehearsal (count backwards by 3). They found evidence that STM lasts ~20 sec. However, the counting may have occupied some of the STM slots, suggesting interference rather than passive decay of material from STM. W & N argued for interference by having a different amount of time and different numbers of intervening digits between two digits that needed to be remembered (see your notes). Results of W & N suggest that interference plays a major role, but there is still a role for decay. Also remember the Sternberg experiment on how we determine if something is in our STM (lab & lecture): we seem to scan our STM serially (this is suggested by an increase in reaction time when the set size increases), and exhaustively (equal times for positive and negative items). This yielded a generally useful technique for investigating cognitive phenomena via the behavioral measure of reaction times (relating the pattern of reaction times to models of the underlying cognitive processes that produce such a pattern of reaction times). This indirect method solves the challenge of building a scientific understanding of processes that are not themselves directly observable.
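The reasoning from reaction-time patterns to a model of STM scanning can be sketched as follows. The two models make different predictions, and the data pattern tells you which one fits. The base time (400 ms) and per-item scan time (40 ms) are illustrative assumptions, not the values from the lab, and the function name is hypothetical.

```python
def predicted_rt(set_size, probe_present, exhaustive=True, base_ms=400, scan_ms=40):
    """Predicted reaction time (ms) for a serial scan of the memory set.

    exhaustive=True : every item is checked before responding.
    exhaustive=False: self-terminating scan that stops when the probe is found,
                      so a present probe needs (n + 1) / 2 comparisons on average.
    """
    if exhaustive or not probe_present:
        comparisons = set_size
    else:
        comparisons = (set_size + 1) / 2
    return base_ms + scan_ms * comparisons


for n in (1, 2, 4, 6):
    print(
        f"set size {n}: "
        f"exhaustive yes/no = {predicted_rt(n, True):.0f}/{predicted_rt(n, False):.0f} ms, "
        f"self-terminating yes/no = "
        f"{predicted_rt(n, True, exhaustive=False):.0f}/{predicted_rt(n, False, exhaustive=False):.0f} ms"
    )
```

In this sketch both models predict that reaction time rises with set size (serial scanning), but only the exhaustive model predicts the same increase for "yes" and "no" trials. The self-terminating model predicts a shallower increase for "yes" trials, which is not what Sternberg found.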
Relation between STM and LTM: There is an alternative to the STM box in our model or at least an alternative way of thinking about it. We can think of it as the activated (or "lit up") nodes of information in LTM instead of as a “place” where we hold items retrieved from LTM (or inputted from the outside). This suggests that STM may be more accurately called working memory, meaning the "activated" part of LTM or the set of things we are conscious of at any one time.
C. LTM: Seems to be of unlimited capacity. How do we store things? (1) articulatory loop type rehearsal (simple but poor method) (2) elaborative rehearsal (gives you many retrieval pathways) (3) spaced practice (not massed--i.e. NOT CRAMMING!) (4) organize information (5) encoding context (remember the studies of Godden & Baddeley with the divers and also the lab on word contexts). The key is to have multiple retrieval pathways.
Types of LTM: (1) Episodic (memory of specific episodes) vs. Semantic (general info about the world); e.g., what you had for breakfast vs. knowing what "breakfast" means. (2) Declarative (things you can recall and tell--"knowing that") vs. Procedural (skills--"knowing how").
Amnesics: anterograde (can't form new memories--cannot remember things from the trauma forward in time; this holds for declarative info, while procedural skills can still be learned) and retrograde (cannot remember things from the trauma backward in time--they forget the past). Some research suggests that anterograde amnesics don't have the ability to form new declarative memories but do have procedural memory capability (can learn new skills).
Semantic Memory: (1) Collins & Quillian (nodes and spreading activation): information in memory is stored in a connected hierarchy, and you store at the highest point where the info relates to everything below it (recall the example of "bird"--you store the fact that it breathes high in the hierarchy--at the "animal" level). (2) The Anderson ACT model: information can be stored as propositions (i.e., simple assertions). Activation spreads to connected items. Remember the experiment where subjects were given sentences like "The doctor is in the bank"; subjects were fastest to classify correct relations if the activation did not have too many irrelevant pathways to spread over (and be "diluted"). These ideas are also supported by the Meyer & Schvaneveldt experiment on semantic priming (remember the lab). M & S presented subjects with 2 words and had them determine whether the second word was a word or not. If the first word (e.g., NURSE) was related to the second word (e.g., DOCTOR), subjects were faster to respond to the second word.
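The Collins & Quillian idea of storing each fact at the highest level where it applies can be sketched as a tiny lookup. The three-node hierarchy and the property lists below are illustrative assumptions (only "bird", "animal", and "breathes" come from the example above), and the function name is hypothetical.

```python
# Each concept points to the category directly above it.
IS_A = {"canary": "bird", "bird": "animal"}

# Each property is stored once, at the highest node it is true of.
PROPERTIES = {
    "canary": {"can sing"},
    "bird": {"has wings"},
    "animal": {"breathes"},
}


def levels_to_verify(concept, prop):
    """Count how many levels up the hierarchy you must climb to find the property.

    Returns None if the property is not stored anywhere above the concept.
    """
    levels = 0
    node = concept
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return levels
        node = IS_A.get(node)  # move up one level (None at the top)
        levels += 1
    return None


for prop in ("can sing", "has wings", "breathes"):
    print(f"A canary {prop}: found {levels_to_verify('canary', prop)} level(s) up")
```

The model's prediction is that verification time grows with the number of levels climbed, so "a canary breathes" should take longer to verify than "a canary can sing".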
Eyewitness Testimony: as evidence for active processing in memory. Loftus: the form of the question is important; e.g., "How fast were the cars going when they __into each other?" The word in the blank ("bumped" vs. "smashed") influences the subjects' response and their responses 1 week later to the question "did you see broken glass?" These results indicate that information in LTM can be distorted (i.e., non-veridical, or not reflecting what actually happened) under the influence of higher-level processes, in this case, expectations engendered by the form of the question. Just as active top-down processes can result in perceptual distortion, so too can they result in memory distortion. Remember the in-class demo of this distortion of eye-witness reporting of the incident with the two women confronting Gary.
III. Thinking (Ch. 8) What are the codes that we think in? (Note: these correspond to memory codes but can also be looked at as the modalities of thought.) Is there one code or are there multiple codes? (E.g., verbal codes only or verbal and visual codes stored?) This was the research of Brooks, who presented block letters and sentences. The visual/spatial task (block letters) was harder when the response was made visually/spatially (pointing at Y or N), as opposed to verbally (saying "yes" or "no"). The verbal task was slightly harder when the response was verbal as opposed to visual/spatial. (Recall the class demo.) These results suggest we have different codes in memory. Also, the work of Shepard and his colleagues (Metzler and Cooper) with mental rotation, finding that the mental operation is analogous to the physical operation. Similarly with the Kosslyn island scanning experiment. All argue for a spatial quality to some of the mental operations involved in thought.
Older work on problem solving: The Zeigarnik effect--we tend to remember interrupted problems. Problem-solving set. The Luchins water jar experiment. People who solved a series of problems using one method tended to over apply that method to new similar appearing problems even when other methods were easier or where the learned method no longer could solve the problem. Functional fixedness (Duncker's candle experiment) is a similar phenomenon with the over-learned use of objects/tools. One viewpoint is that these help more than they hurt--that while we often look "dumb" when responding from a mental or problem-solving set or being limited in seeing new uses for familiar items, we are really "smart" in that we learn from and generalize from our experience and thus save vast amounts of time most of the time (in not having to figure out how to do things from scratch).
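The problem-solving set in the water jar task can be illustrated with a short sketch. It assumes the usual textbook form of the task: three jars A, B, C, a target amount, and a practiced method of B - A - 2C. The jar sizes below are illustrative values in the style of the task, and the function names are hypothetical.

```python
def set_method(a, b, c):
    """The over-learned method: fill B, pour off A once and C twice."""
    return b - a - 2 * c


def simpler_methods(a, c, target):
    """Shorter solutions that a person stuck in the set tends to overlook."""
    found = []
    if a - c == target:
        found.append("A - C")
    if a + c == target:
        found.append("A + C")
    return found


# (A, B, C, target) with illustrative jar sizes.
problems = [
    (21, 127, 3, 100),  # training: only the long method works
    (23, 49, 3, 20),    # critical: long method works, but A - C is easier
    (15, 39, 3, 18),    # critical: long method works, but A + C is easier
    (28, 76, 3, 25),    # long method fails; only A - C works
]

for a, b, c, target in problems:
    long_works = set_method(a, b, c) == target
    print(f"A={a} B={b} C={c} target={target}: "
          f"B-A-2C works={long_works}, simpler={simpler_methods(a, c, target)}")
```

The last two kinds of problem are the ones that reveal the set: on the critical problems people keep using the long method even though a two-jar solution exists, and on the final type they may fail altogether because the practiced method no longer gives the target.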
New work on problem-solving, much done at CMU, starting with Herbert Simon & Allen Newell. Viewing problem-solving as an external process--search through a problem space. Characteristics of a problem space: start, goal, nodes (states of knowledge) and links which connect the nodes. Move from node to node by the application of move operators.
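The pieces of a problem space (start, goal, nodes, links, move operators) can be sketched with a toy number puzzle. Everything specific here is an assumption for illustration: the puzzle (get from 1 to 10 using "add 1" and "double"), the function name, and the choice of breadth-first search, which is just one systematic way to search and is not claimed to be how people or GPS do it.

```python
from collections import deque


def search_problem_space(start, goal, operators, limit=100):
    """Breadth-first search: returns the shortest list of operator names, or None.

    States above `limit` are ignored so this toy space stays finite.
    """
    frontier = deque([(start, [])])  # (node, moves taken to reach it)
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for name, apply_move in operators.items():
            next_state = apply_move(state)  # following a link to a new node
            if next_state not in visited and 0 <= next_state <= limit:
                visited.add(next_state)
                frontier.append((next_state, path + [name]))
    return None


# Move operators for the toy puzzle.
operators = {
    "add 1": lambda n: n + 1,
    "double": lambda n: n * 2,
}

solution = search_problem_space(start=1, goal=10, operators=operators)
print(solution)
```

Even this tiny space shows why exhaustive search does not scale: the number of nodes to examine grows quickly with each extra move, which is the motivation for heuristics.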
Types of problems were delineated in the form of dichotomies. These included knowledge rich vs. knowledge lean problems (based on how much knowledge is required to solve them), large problem space problems vs. small problem space problems (depending on how large the potential search space is), insight vs. non-insight problems (depending on how suddenly the solution “appears”; e.g., the mutilated checkerboard), and well-defined vs. ill-defined problems (depending on whether there is a recognizable and clearly defined “correct” answer or not). First artificial intelligence programs (LT (the logic theorist) followed by GPS (the general problem solver)). GPS incorporates means-ends analysis, a heuristic technique based on attempting to move through a problem space from start to goal by picking move operators that will move you closer, testing for their applicability to the current situation (node), and adopting subgoals if there is no move that will take you to the goal in one step. Think about issues of artificial intelligence. Subgoaling is a powerful technique (heuristic). Be able to define algorithms and heuristics. Examples of heuristics: hill-climbing (cannot do detour problems where you have to move away from the goal in order to get to it), working backward from the goal (lily-pond example), fractionation (breaking the problem into smaller pieces (subgoals) that are more solvable), practicing to automate move-making, incubation (leaving the problem for a while and thinking about something else as a way of getting out of a rut you might be locked into--backing up in the problem space). We need heuristics because many problems have an impossibly large search space and cannot be solved by exhaustive search or trial and error.
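Why hill-climbing fails on detour problems can be shown with a very small made-up problem space. The node names, links, and distance values are all hypothetical; the only point is that the one available move from the start leads away from the goal.

```python
# Straight-line "distance to goal" for each node (the hill-climber's only guide).
DISTANCE = {"S": 2, "D": 3, "E": 1, "G": 0}

# Links between nodes: from S the only move is the detour through D.
LINKS = {"S": ["D"], "D": ["S", "E"], "E": ["D", "G"], "G": ["E"]}


def hill_climb(start, goal="G"):
    """Always take the neighbor closest to the goal; stop if no move gets closer.

    Returns (path taken, whether the goal was reached).
    """
    state = start
    path = [state]
    while state != goal:
        best = min(LINKS[state], key=DISTANCE.get)
        if DISTANCE[best] >= DISTANCE[state]:
            return path, False  # stuck: every available move leads away from the goal
        state = best
        path.append(state)
    return path, True


print(hill_climb("S"))  # must first move away from the goal, so it never starts
print(hill_climb("D"))  # once past the detour, every step gets closer
```

From S the hill-climber refuses the only move because it increases the distance to the goal, even though S -> D -> E -> G is a perfectly good solution. A method that tolerates temporary setbacks (for example, setting a subgoal of reaching D) gets through.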
Importance of representation for problem-solving. Good representations allow people to exhibit more effective, expert-like problem-solving earlier because they lighten the processing load, making moving/planning moves easier. Remember work with Tower of Hanoi and isomorphs (structurally similar problems) that showed large differences in difficulty depending on the representation of the problem. Expertise -- building up lots of chunks in long term memory and using good representations (chess). Practice, again, makes the performance of tasks automatic and lessens the demands on attention/memory. (Think of the Stroop task here.) How does problem solving interact with our cognitive architecture?
Evidence for non-conscious problem solving. Also, expertise was discussed. Think about the ten year rule and work with chess masters.