Bob Kentridge 1995
Comparative Psychology:Lecture 6.
Edward Thorndike, puzzle-boxes and the law of effect.
You may recall that we left comparative psychology's
development as an empirical response to Darwin's 'Descent of
Man' very much on the verge of becoming a science. The
importance of the problem and the practical difficulties had been
recognised and, by the end of the century, serious efforts were
being made to produce objective tests of animal intelligence. The
focus of this work was now America where the publication of
William James' 'Principles of Psychology' (1890) inspired a
growing number of graduate-students. One, Edward Thorndike,
attempted to develop some of the anecdotes on the mechanical
problem solving ability of cats and dogs collected by George
Romanes into an objective experimental method. Thorndike
devised a number of wooden crates which required various
combinations of latches, levers, strings and treadles to open them.
A dog or a cat would be put in one of these 'puzzle-boxes' and,
sooner or later, would manage to escape from it.
Thorndike's initial aim was to show that the anecdotal
achievements of cats and dogs could be replicated in controlled,
standardised circumstances; however, he soon realised that he
could now measure animal intelligence using this equipment. His
method was to set an animal the same task repeatedly, each time
measuring the time it took to solve it. Thorndike could then
compare these 'learning-curves' across different situations and
different species.
Thorndike was particularly interested in discovering whether his
animals could learn their tasks through imitation or observation.
He compared the learning curves of cats who had been given the
opportunity of observing others escaping from a box with those
who had never seen the box being solved and found no difference
in their rate of learning. He obtained the same null result with
dogs and, even when he showed the animals the methods of
opening a box by placing their paws on the appropriate levers and
so on, he found no improvement. He fell back on a much simpler
trial and error explanation of learning. Occasionally, quite by
chance, an animal performs an action which frees it from the box.
When the animal finds itself in the same position again it is more
likely to perform the same action again. The reward of being
freed from the box somehow strengthens an association between a
stimulus, being in a certain position in the box, and an appropriate
action. Reward acts to strengthen stimulus-response associations.
The animal learns to solve the puzzle-box not by reflecting on
possible actions and really puzzling its way out of it but by a quite
mechanical development of actions originally made by chance. By
1910 Thorndike had formalised this notion into a 'law' of
psychology - the law of effect. In full it reads:
"Of several responses made to the same situation those which are
accompanied or closely followed by satisfaction to the animal will,
other things being equal, be more firmly connected with the
situation, so that, when it recurs, they will be more likely to recur;
those which are accompanied or closely followed by discomfort to
the animal will, other things being equal, have their connections to
the situation weakened, so that, when it recurs, they will be less
likely to occur. The greater the satisfaction or discomfort, the
greater the strengthening or weakening of the bond."
It is worth quoting in full, first because it essentially drove
comparative psychology in North America and Europe for fifty
years and second because Thorndike maintained that, in
combination with the law of exercise, the notion that associations
are strengthened by use and weakened with disuse, and the concept
of instinct, the law of effect could explain all of human behaviour
in terms of the development of myriads of stimulus-response
associations.
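The law of effect can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Thorndike's own procedure: the number of possible actions, the size of the strengthening and weakening steps, and the rule that an action is emitted with probability proportional to its connection strength are all hypothetical choices made for illustration.

```python
import random


def simulate_puzzle_box(n_trials=30, n_actions=5, correct=0,
                        satisfaction=1.0, discomfort=0.1, seed=0):
    """Simulate repeated trials in one puzzle-box.

    Returns (attempts_per_trial, strengths). All parameter values are
    illustrative assumptions, not figures from Thorndike's experiments.
    """
    rng = random.Random(seed)
    # One connection strength per action for the single situation
    # "being in the box".
    strengths = [1.0] * n_actions
    attempts_per_trial = []
    for _ in range(n_trials):
        attempts = 0
        while True:
            attempts += 1
            # An action is emitted with probability proportional to the
            # strength of its connection with the situation.
            action = rng.choices(range(n_actions), weights=strengths)[0]
            if action == correct:
                # Followed by satisfaction (escape): connection strengthened.
                strengths[action] += satisfaction
                break
            # Followed by discomfort (still confined): connection weakened,
            # but never below a small floor so the action stays possible.
            strengths[action] = max(0.05, strengths[action] - discomfort)
        attempts_per_trial.append(attempts)
    return attempts_per_trial, strengths


if __name__ == "__main__":
    attempts, strengths = simulate_puzzle_box()
    print("attempts per trial:", attempts)
    print("final connection strengths:", strengths)
```

Plotting the attempts against trial number gives a rough analogue of a learning-curve: the successful action is at first produced only by chance, and its growing connection strength makes the number of attempts tend to fall over trials, with no 'insight' anywhere in the model.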
It is worth briefly comparing trial and error learning with
classical conditioning. In classical conditioning a neutral stimulus
becomes associated with part of a reflex (either the US or the UR).
In trial and error learning no reflex is involved. A reinforcing or
punishing event (a type of stimulus) alters the strength of
association between a neutral stimulus and a quite arbitrary
response. The response is not to any part of a reflex.
J.B. Watson, learning and the lab rat.
The position that human behaviour could be explained entirely in
terms of reflexes, stimulus-response associations, and the effects
of reinforcers upon them entirely excluding 'mental' terms like
desires, goals and so on was taken up by John Broadus Watson in
his 1914 book 'Behavior: An Introduction to Comparative
Psychology'. Watson had also been involved in the introduction
of the most favoured subject in comparative psychology - the
laboratory rat. One of his early jobs which he used to fund his
Ph.D. was as a caretaker, one of whose duties was to look after
laboratory rats used in studies intended to mimic 'real-life'
learning tasks such as navigating complex mazes (a scale-model of
the Hampton-Court maze!). Watson became adept at taming rats
and found he could train rats to open a puzzle-box like
Thorndike's for a small food-reward. He also studied maze-
learning but simplified the task dramatically. One type of maze is
simply a long straight alley with food at the end. Watson found
that once the animal was well trained at running this 'maze' it did
so almost automatically. Once started by the stimulus of the maze
its behaviour becomes a series of associations between
movements (or their kinaesthetic consequences) rather than
stimuli in the outside world. This is made plain by shortening the
alleyway - the well-trained rats now run straight into the end
wall. This was known as the kerplunk experiment. The
development of well-controlled behavioural techniques by Watson
also allowed him to explore animals' sensory abilities, for example
their abilities to discriminate between similar stimuli,
experimentally.
Watson's theoretical position was even more extreme than
Thorndike's - he would have no place for mentalistic concepts like
pleasure or distress in his explanations of behaviour. He
essentially rejected the law of effect, denying that pleasure or
discomfort caused stimulus-response associations to be learned.
For Watson, all that was important was the frequency of
occurrence of stimulus-response pairings. Reinforcers might cause
some responses to occur more often in the presence of particular
stimuli, but they did not act directly to cause their learning.
Watson could therefore reject the notion that some mental traces
of stimuli and responses needed to be retained in an animal's mind
until a reinforcer caused an association between them to be
strengthened, which is a rather mentalistic consequence of the law
of effect.
Human Behaviour and Little Albert.
Watson became an extremely influential force in American
Psychology, publishing his second book 'Psychology from the
Standpoint of a Behaviorist' in 1919. His rejection of mentalism
was total. He felt that thought was explicable as subvocalisation
and that speech was simply another behaviour which might be
learned by the law of effect. In 'Psychology from the Standpoint
of a Behaviorist' he addresses a number of practical human
problems such as education, the development of emotional
reaction and the effects of factors like alcohol or drugs on human
performance. He even suggests that thought processes might be
investigated by monitoring movements in the larynx.
Watson believed that mental illness was the result of 'habit
distortion' which might be caused by fortuitous learning of
inappropriate associations which then go on to influence a
person's behaviour so that it becomes ever more abnormal.
Watson tested part of this hypothesis on a baby in the hospital in
which he worked. The baby, 'little Albert', apparently showed no
particular fears or phobias about anything apart from sudden loud
sounds. For example, when Watson placed a tame white rat in
little Albert's lap the child happily played with the animal. On a
subsequent occasion Watson placed the rat in Albert's lap and his
assistant made a loud noise by striking a large steel bar directly
behind Albert's head. One week later Albert was subjected to the
same experience. After this, when Albert was shown the rat he
began to fret, appearing anxious. Similar reactions were produced
by other furry objects (a fur coat). Watson was keen to use this as
evidence for the behavioural basis of phobias; however,
apparently Albert's reactions to the rat were quite mild.
Nevertheless, one of the most widespread applications of
conditioning has been in the treatment of phobias and other
behaviour problems and the case of Little Albert is often cited as
the first experiment in this field.
The Fall and Rise of J.B. Watson.
Shortly after his experiment on Little Albert Watson became
romantically involved with one of his research assistants - Rosalie
Rayner. At the time such behaviour was not tolerated in
American academia and Watson was eventually forced to retire
from research. He soon, however, found gainful employment with
the J. Walter Thompson advertising agency where, using
techniques from his behavioural psychology, he showed that
people's preferences between rival products were not based on
their sensory qualities but on their associations. He went on to
develop the selling of products like Maxwell House Coffee, Pond's
Cold Cream, Johnson's Baby Powder and Odorono (one of the first
deodorants). By 1924 he was one of the four vice-presidents of
this very successful agency.
Cognition and learning in the 1930s.
In the 1920s behaviourism began to wane in popularity somewhat.
A number of studies in the Berkeley laboratory of Edward Tolman
appeared both to show flaws in the law of effect and require
mental representations in their explanation. For example, rats
were allowed to explore a maze in which there were three routes
of different lengths between the starting position and the goal.
The rats' behaviour when the maze was blocked implied that they
must have some sort of mental map of the maze. The rats prefer
the routes according to their shortness, so, when the maze is
blocked at point A, stopping them using the shortest route, they
will choose the second shortest route. When, however, the maze is
blocked at point B the rat does not retrace its steps and use
route 2, which would be predicted according to the law of effect,
but rather uses route 3. The rat must be recognising that block B
will stop it using route 2 by using some memory of the layout
of the maze. Tolman's group also showed that animals could use
knowledge they gained learning a maze by running to navigate it
swimming and that unexpected changes in the quality of reward
could weaken learning even though the animal was still rewarded.
This result was developed further by Crespi who, in 1942, showed
that unexpected decreases in reward quantity caused rats
temporarily to run a maze more slowly than normal while
unexpected increases caused a temporary elevation in running
speed - effects Crespi referred to as depression and elation!
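The logic of the blocked-maze argument can be made concrete with a small sketch. The layout below is an assumption reconstructed from the description above, not a record of the actual apparatus: routes 1 and 2 are taken to share a final common segment, block A sits on the part used only by route 1, and block B sits on the shared segment. The segment names, relative lengths and function names are all hypothetical.

```python
# Hypothetical maze layout: each route has a relative length and a set of
# named segments. Routes 1 and 2 are assumed to share a final segment.
ROUTES = {
    "route 1": {"length": 1, "segments": {"direct", "common"}},
    "route 2": {"length": 2, "segments": {"detour", "common"}},
    "route 3": {"length": 3, "segments": {"long"}},
}

# Where each block is assumed to sit.
BLOCKS = {"A": "direct", "B": "common"}


def map_based_choice(block):
    """Choose the shortest route that the block leaves open.

    This stands in for an animal consulting a 'mental map' of the maze.
    """
    blocked_segment = BLOCKS[block]
    open_routes = [name for name, route in ROUTES.items()
                   if blocked_segment not in route["segments"]]
    return min(open_routes, key=lambda name: ROUTES[name]["length"])


def habit_based_choice(block):
    """Choose the next most preferred route after the one that failed.

    This stands in for the law-of-effect prediction: the block is met
    while running route 1, so the animal falls back on its next strongest
    habit without regard to where the block lies. The block argument is
    therefore deliberately ignored.
    """
    by_preference = sorted(ROUTES, key=lambda name: ROUTES[name]["length"])
    return by_preference[1]


if __name__ == "__main__":
    for block in ("A", "B"):
        print(block, map_based_choice(block), habit_based_choice(block))
```

Under these assumptions the two accounts agree for block A (both pick route 2) and differ only for block B, where the map-based chooser goes straight to route 3. That difference is what makes block B the informative test.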
At the same time as this work was appearing in the USA the
Polish psychologists Konorski and Miller began the first cognitive
analyses of classical conditioning - the forerunners of the work of
Rescorla, Wagner, Dickinson and Mackintosh which I described
earlier. In Germany Wolfgang Koehler was studying insight and
observation as mechanisms of learning in chimps. All of this work
was quite problematic for behaviourism.
Operant Conditioning.
In 1938 Burrhus Frederic Skinner published what was arguably
the most influential work on animal behaviour of the century 'The
Behavior of Organisms'. In the interim it had been shown that
Tolman's results were sensitive to factors like the openness of his
maze - if the rats could not see stimuli outside the maze they did
not make appropriate choices when it was blocked, suggesting
that they may have learned many stimulus-response associations
in different parts of the maze, perhaps in sequence, rather than
having internalised a map of it. Skinner resurrected the law of
effect in more starkly behavioural terms and provided a
technology which allowed sequences of behaviour produced over
a long time to be studied objectively - a great improvement on the
individual learning trials of Watson and Thorndike.
Skinner's formulation of operant conditioning was based around
the contingencies between three types of event he termed the
discriminative stimulus (SD), the operant response (R) and the
reinforcing stimulus (SR). The operant response is some
behaviour which, if it is followed by a reinforcing stimulus, comes
to occur more frequently. The discriminative stimulus is another
neutral stimulus which is present when the contingency between
the operant response and reinforcement is true - it serves to
discriminate the conditions under which the operant response will
be made. Skinner did not believe that operant conditioning was
the result of stimulus-response learning - for Skinner the basic
association in operant conditioning was between the operant
response and the reinforcer; the discriminative stimulus served to
signal when this association would be acted upon.
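The three-term contingency can be written down as a simple rule: a response (R) is followed by the reinforcing stimulus (SR) only while the discriminative stimulus (SD) is present. The sketch below is a minimal illustration under assumed names; the class, its fields and the use of a light as the SD are hypothetical, and it models only the contingency itself, not any learning by the animal.

```python
from dataclasses import dataclass, field


@dataclass
class OperantChamber:
    """Hypothetical chamber enforcing a simple SD-R-SR contingency."""
    light_on: bool = False                       # discriminative stimulus (SD)
    presses: list = field(default_factory=list)  # log of every response (R)
    reinforcers: int = 0                         # reinforcers delivered (SR)

    def press_lever(self, time_s):
        """Record a lever press and apply the contingency.

        The press is reinforced only if the light is on at that moment.
        Returns True when a reinforcer is delivered.
        """
        reinforced = self.light_on
        # Every press is logged with the state of the light and its outcome.
        self.presses.append((time_s, self.light_on, reinforced))
        if reinforced:
            self.reinforcers += 1
        return reinforced


if __name__ == "__main__":
    chamber = OperantChamber()
    print(chamber.press_lever(1.0))   # light off: press recorded, no reinforcer
    chamber.light_on = True
    print(chamber.press_lever(2.5))   # light on: press is reinforced
    print(chamber.presses)
```

The point of the design is that the light never produces the press and never delivers food itself; it only marks when the response-reinforcer relation holds, which is the role Skinner gave the discriminative stimulus.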
Skinner's great technological contribution was the operant test
chamber or Skinner box. The interior of a Skinner box typically
contains a lever which an animal can press, a stimulus light and a
place in which a reinforcer like food can be delivered. The
animal's presses on the lever can be detected and recorded and a
contingency between these presses, the state of the stimulus light
and the delivery of reinforcement can be set up, all automatically.
The lever allows the experimenter to measure the production of
an operant - lever-pressing. The stimulus light serves as a
discriminative stimulus and the food as a reinforcer. It is also
possible to deliver other reinforcers such as water or to deliver
punishers like electric shock through the floor of the chamber.
Other types of response can be measured - nose-poking at a
moving panel, or hopping on a treadle - both often used when
testing birds rather than rats. It is also common to use more than
one lever, discriminative stimulus or type of reinforcer in a
Skinner box. The flexibility of this technology produced a flood of
work on operant-conditioning for nearly 50 years (some is still
going on today, but not nearly as much as at the time of the peak
of interest in the 50s and 60s). We will discuss the techniques
and results of operant conditioning experiments next week and
use them to try and understand what is learned in instrumental
learning.
Sources.
As usual I've drawn heavily on Bob Boakes' 'From Darwin to
Behaviourism' for the historical material. The classic 'Theories of
learning.' by G.H. Bower and E.R. Hilgard (my copy is the 5th edition
from 1981, Englewood Cliffs, NJ: Prentice-Hall - the first edition was
published in 1948) has very good discussions of the differences
between Thorndike's and Watson's theoretical positions and the status of
Tolman's work. There is also a little in the introductory chapter of
Schwartz about Thorndike's early work.