FAQ

  1. What is an Ardweeno?
  2. Isn’t this really an electrical circuit?
  3. Isn’t the Arduino running a program?
  4. Doesn’t the Roomba have artificial intelligence of its own?
  5. I get how it emits behavior, but then how does it learn?
  6. I think I follow what you are saying here but got hung up on the phrase “only what is happening at each synapse.” I know what a synapse is (the gap between neurons where the charge travels from one to another, right?), but not sure what you mean by synapse in that sentence — that correlation between two events?
  7. As I understand your model, you arbitrarily assigned synaptic strengths to your “neurons” from zero to 100.
  8. So if you started with predetermined strengths, what got modified neurally when the light went on in your experiment?
  9. You state that when synapses change strength, learning occurs.  And my point, question really, was doesn’t learning strengthen the synapses?
  10. Are you saying that the synapse works better when “learning” has been accomplished?
  11. Seems like we need to know what causes synapses to change strength.   My uneducated guess is that learning must do that and that that is how neural pathways are formed and strengthened.
  12. If I follow this, what the program does is count the number of positive correlations between a stimulus and a behavior, and once they go above a certain number, that constitutes enough of a correlation to be considered a meaningful association and the device will now respond that way all the time, otherwise it is considered too random and therefore the device will continue to move about randomly?
  13. What do you mean in the presentation when you say, with respect to the matrix, that the positive neuron changes the synapses?
  14. If you had 20 different Roombas with this program in it and handed them to 20 different people, wouldn’t they end up “learning” different things depending on the whims of the people “teaching” them as to what response and stimulus they decide to repeat the most?
  15. Here’s a link to a Ping-Pong playing robot. Can yours do that?
    http://www.designboom.com/technology/kuka-ping-pong-timo-boll-03-11-2014/
  16. How is synaptic efficacy strengthened in the real, i.e., uncontrolled world?  Which is to say, how is learning expressed at the neurobiological level, not as observed changes in behavior, but in the brain?  
  17. This is perhaps a philosophical or ideological question/matter:  It strikes me that your model of human learning as presented is completely behavioral.  For me, while I think the behavioral approach to learning has value, I do not think it is the whole story. I think that the behaviorists were incorporating cognition into their operant models. Consequences don’t have a straightforward connection to particular learned behaviors.  For example, being punished may not stop a teenager from being on the computer all the time; it might instead lead to rebellious behavior, more secrecy, and greater time spent on a computer.

What is an Ardweeno?

The Arduino is the green circuit board with all the wires sitting on top of the Roomba.  It is a “microcontroller,” which means it is a small computer that talks to the outside world through pins on the board.

Isn’t this really an electrical circuit?

Yes, I have put together an electrical circuit that includes a light sensor, a mini microphone, and an LED beam interrupt (at the base of the funnel) that detects when I drop a candy through it.  These constitute the sensors of the (incredibly minimal) sensory system of my “artificial organism.” They provide Robby with information about (a very few) events happening in the world around him.

Isn’t the Arduino running a program?

Yes, the Arduino runs a program I wrote.  The main function of the program, however, is to mathematically model a simple nervous system, that is, a set of interconnected neurons.  This is done by using matrices to represent neurons and their connections.  One matrix is the model of the nervous system, where each row is an individual behavior neuron.  Each element of another matrix (the stimulus matrix) goes from 0 to 1 whenever a sensory neuron “fires” (as it would, say, when the light is on, or a candy drops through the funnel, or I honk the horn).  The elements of the nervous system matrix indicate the “synaptic efficacy” of the connection between each sensory neuron and each behavior neuron.  The greater the synaptic efficacy, i.e., the larger the number in that matrix position, the more the firing of the sensory neuron contributes to the firing of the behavior neuron.

When the program multiplies these two matrices together (which is what it’s doing most of the time), the effect is to sum up the inputs (some of which may be negative, i.e., inhibitory) on each neuron and “fire” those that exceed a threshold value.  When a behavior neuron fires, the Arduino then simply commands the Roomba to execute the appropriate behavior (turn left, go forward, etc.)
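
In rough outline, that multiply-and-threshold step looks something like the sketch below. The array sizes, names, and threshold value are illustrative only, not the actual Matheta code:

    // Illustrative sketch of the multiply-and-threshold step; the real
    // Matheta code, matrix sizes, and threshold differ.
    const int NUM_SENSORY  = 3;   // e.g., light, microphone, candy-drop beam
    const int NUM_BEHAVIOR = 4;   // e.g., forward, left, right, stop

    float efficacy[NUM_BEHAVIOR][NUM_SENSORY]; // nervous system matrix: rows are behavior neurons
    float stimulus[NUM_SENSORY];               // 1 while a sensory neuron is firing, else 0

    void fireBehavior(int b) { /* command the Roomba to execute behavior b */ }

    void updateNeurons(float threshold) {
      for (int b = 0; b < NUM_BEHAVIOR; b++) {
        float input = 0.0;
        for (int s = 0; s < NUM_SENSORY; s++) {
          input += efficacy[b][s] * stimulus[s]; // excitatory or inhibitory contribution
        }
        if (input > threshold) {
          fireBehavior(b);                       // the behavior neuron "fires"
        }
      }
    }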

Doesn’t the Roomba have artificial intelligence of its own?

The Roomba has been hacked. Its native AI is no longer operative. The Arduino tells the Roomba what to do by sending commands specified by the Roomba’s ROI (Roomba Open Interface). These are low-level commands such as “move both wheels together at a specified rate,” “move both wheels at different rates” (in order to make the Roomba turn), etc.
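
For the curious, here is a minimal sketch of what those commands look like from the Arduino side. The opcodes follow iRobot’s published Open Interface documentation, but baud rates and command sets vary by model, so treat this as an outline rather than a drop-in:

    // Minimal sketch of commanding the Roomba over serial. Opcodes follow
    // iRobot's published Open Interface documentation (Start = 128,
    // Full = 132, Drive = 137); verify against the spec for your model.
    void roombaInit() {
      Serial.begin(57600);   // a common ROI baud rate; model-dependent
      Serial.write(128);     // Start the Open Interface session
      Serial.write(132);     // Full mode: complete control, native AI bypassed
    }

    // Drive: velocity in mm/s (negative = backward), turning radius in mm
    // (the spec reserves 0x8000 as the special "drive straight" radius).
    void drive(int velocity, int radius) {
      Serial.write(137);                     // Drive opcode
      Serial.write((velocity >> 8) & 0xFF);  // velocity high byte
      Serial.write(velocity & 0xFF);         // velocity low byte
      Serial.write((radius >> 8) & 0xFF);    // radius high byte
      Serial.write(radius & 0xFF);           // radius low byte
    }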

I get how it emits behavior, but then how does it learn?

The ONLY other thing the program on the Arduino does is change the “synaptic efficacy” values in the nervous system matrix, depending on rules based only on WHAT IS HAPPENING AT EACH SYNAPSE.  This last point is important because it is what most distinguishes this from a program anyone could write that would simulate (rather than replicate or model) a nervous system, that is, a program that says “If x happens, then do y.”

That’s why I say that everything you see follows from the actions of the nervous system, why Robby is indeed an artificial organism only with wires instead of physical neurons, and light sensors instead of eyes, etc.

The most important point being that if you agree that the organism is LEARNING in a way that replicates what is known about animal learning, then it would seem to suggest (strongly, I believe) that the rules that determine the operation of these artificial neurons must have at least SOME resemblance to the rules that determine the functioning of REAL neurons.  And if that is true, why can’t we, given present computer capabilities, replicate the functioning (and hence, learning capability) of any brain, including our own?

I think I follow what you are saying here but got hung up on the phrase “only what is happening at each synapse.” I know what a synapse is (the gap between neurons where the charge travels from one to another, right?), but not sure what you mean by synapse in that sentence — that correlation between two events?

The synapse is the gap between the pre- and postsynaptic neuron.  However, the transmission is chemical, not electrical.  When an action potential (the electrical signal generated when a neuron fires) reaches the synapse, it causes a rapid release of a bolus of chemicals across the synaptic cleft.  These neurotransmitters bind with receptors on the surface of the postsynaptic neuron, where they have an excitatory or inhibitory effect, i.e., making it more or less likely to fire.

This mechanism seems to be inherently capable of modifications in efficacy, i.e., changes in the amount of neurotransmitter released and/or in the response of the postsynaptic receptors will produce corresponding changes in the magnitude of the excitatory or inhibitory effect of the firing of the presynaptic neuron on the postsynaptic one. (We take advantage of this fact when we use psychotropic medications.)

By “only what is happening at the synapse” I mean that’s all the program concerns itself with (beyond the mechanics of seeing if buttons have been pressed, outputting behaviors, etc.). The program looks at each synapse and asks “Did the postsynaptic neuron fire recently?” “Did the presynaptic neuron fire?”  Depending on the answers to these questions, it applies a formulaic increase or decrease to the efficacy of the synapse. The result of that process is that IF a reliable association between stimulus, behavior, and reward EXISTS, it will cause a permanent change in synaptic efficacy and, as a result, a permanent change in the behavior of the organism (unless environmental contingencies change).  Thus, the association or contingency has been detected, recorded, and used to change behavior. The observation of that change in behavior is what psychologists call learning (or conditioning, precisely to avoid the confusion engendered by the term learning, which means, to most of us, studying some book).
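
A stripped-down sketch of that synapse-local bookkeeping follows. The names and constants here are hypothetical; the actual rules are more involved:

    // Illustrative synapse-local update; the real rules and constants differ.
    struct Synapse {
      float efficacy;          // the number stored in the nervous system matrix
      bool  preFired;          // did the presynaptic neuron fire this cycle?
      bool  postFiredRecently; // did the postsynaptic neuron fire recently?
    };

    void updateSynapse(Synapse &s) {
      // Only local questions are asked; no correlations are computed anywhere.
      if (s.preFired && s.postFiredRecently) {
        s.efficacy += 2.0;     // formulaic increase when the two firings coincide
      } else if (s.preFired && !s.postFiredRecently) {
        s.efficacy -= 0.5;     // formulaic decrease when they fail to coincide
      }
    }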

As I understand your model, you arbitrarily assigned synaptic strengths to your “neurons” from zero to 100.

I do assign synaptic strengths in the matrix, but not really “arbitrarily” and not to neurons.  A value of zero in the nervous system matrix means there is no connection, no synapse, between that particular presynaptic neuron and the neuron represented by that row in the matrix.  A one means there is a very weak positive (i.e., excitatory) connection, or synapse, while minus one means a weak inhibitory connection. On the other hand, a value of 100 means that a stimulus-response connection exists because of the high-efficacy synapse between the pre- and postsynaptic neurons:  EVERY time that presynaptic neuron fires, the postsynaptic one will then fire.  This is exactly what we would expect for a reflex:  every time you get the stimulus, you get the response (think of meat powder on the dog’s tongue and salivation; it’s all built in as a result of natural selection). Thus, where the numbers in the matrix are placed determines the specific nervous system (set of interconnected neurons) being modeled. Put another way, it’s where the numbers are in the matrix that determines whether one neuron is connected to another and the “strength” of that connection.
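
As a concrete (made-up) example, a matrix wired this way would encode one built-in reflex plus a few weak, modifiable connections:

    // Hypothetical nervous system matrix: rows are behavior neurons,
    // columns are sensory neurons (light, horn, candy beam).
    float efficacy[4][3] = {
      {  0,  1,   0 },  // forward:    weak excitatory synapse from the horn
      {  1,  0,   0 },  // turn left:  weak excitatory synapse from the light
      { -1,  0,   0 },  // turn right: weak inhibitory synapse from the light
      {  0,  0, 100 },  // "eat":      reflex; the candy beam ALWAYS fires it
    };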

So if you started with predetermined strengths, what got modified neurally when the light went on in your experiment?

No synapse gets modified when the light goes on.  But all the synapses between the stimulus neuron sensitive to the light and the various operant behaviors (forward, turn left, etc.) DO become eligible for modification.  Whether they actually get modified depends on what else happens, e.g., was an operant emitted?  I can look at the matrix at any time and track changes in the values, i.e., the synaptic efficacies, and see if they are in fact changing in accordance with my rules.  But I can also see that they must be changing, because I can observe the change in behavior that my rules predict.  That is, Robby learns to turn left when the light is on because that’s the (arbitrary, “artificial”) contingency I created (the behavior I rewarded) in order to demonstrate Robby’s learning capability.

You state that when synapses change strength, learning occurs.  And my point, question really, was doesn’t learning strengthen the synapses?

“Learning” is the name scientists give to OBSERVED changes in behavior.  Classical and operant conditioning represent learning phenomena in their most basic, pared-down form.  But they are simply observed phenomena.  We now know (which we didn’t before Pavlov) exactly the PROCEDURES needed to reliably produce changed behaviors in organisms, i.e., “learning”.  And we have a good idea WHY learning occurs: it produces changes in behavior that provide the organism with an increased probability of survival.  But we don’t know HOW it occurs.  What changes in the nervous system to produce the observed changes in behavior?

That’s where my model comes in.  I developed a theory that says learning occurs through modification of synaptic efficacy (this is generally accepted, though not proven).  I specify three rules that describe events that, when they occur AT THE SYNAPSE, produce changes in synaptic efficacy.  For example, I say that if a neuron fires, its synapses become ELIGIBLE for subsequent modification.  So what the program does is implement these rules at each synapse, increasing and decreasing synaptic efficacy in accordance with the rules.  This means that the number in the matrix representing the synaptic efficacy between a stimulus and a response is changed according to the rules.  It may start at 1, then increase to 30, then decrease to 10, etc.

Are you saying that the synapse works better when “learning” has been accomplished?

Your question has got it exactly backwards.  It is generally believed that when synapses change their efficacy, learning occurs.  In classical conditioning we see that a formerly neutral stimulus, e.g., the ringing of a bell, originally has NO effect on salivation. Somehow, as a result of repeated pairings with meat powder, something causes some change in the dog’s nervous system such that afterwards the bell elicits salivation.  Since our model does the same thing, it suggests that perhaps the model is based on some insight into what changes in the nervous system to account for learning (and why.)

We (scientists) assume that every thought, perception, behavior, idea, awareness, etc. that any animal possesses, learns, does, etc. is mediated by neurons.  So any change of any kind, from behavior to cognition, must be the result of some change in the nervous system.  There are two possibilities: when learning occurs, it is the result of a neuron growing a connection to another neuron, or it is the result of a change in the strength (or efficacy) of a connection one neuron already makes with another.  Because of the speed at which learning occurs (learning can, in some cases, be demonstrated to have occurred within seconds), it is generally thought that neuronal growth is an unlikely mechanism to account for learning.  Therefore most neuroscientists believe learning (i.e., any change in a neurally mediated process) must occur through the modification of synaptic efficacy, i.e., the effect of the firing of one neuron on another, downstream neuron must increase or decrease.  Yet, in spite of this belief, even today, to the best of my knowledge, neuroscientists DO NOT KNOW whether synaptic efficacies DO change or how, even in the case of the simplest of all (associative) learning paradigms, classical conditioning!

So I have a theory that says if certain events happen AT THE SYNAPSE, that sequence of events will produce a change in synaptic efficacy, and that change in efficacy will be reflected in, evidenced by, a change in the observable behavior of the organism.  The program mathematically represents individual neurons as rows in a matrix, and the strength of the synaptic connection between one neuron and another is given by the value of the number in the corresponding position in the matrix.  Then I expose this simulated nervous system to external events and show that its behavioral output changes in ways nearly identical to what we would expect to see if we exposed a real animal to the same conditions.

Seems like we need to know what causes synapses to change strength.   My uneducated guess is that learning must do that and that that is how neural pathways are formed and strengthened.

You’ve got it backwards.  It isn’t learning that strengthens synapses.  It’s the increase in synaptic efficacy (produced in Matheta by my rules) that results in changed neural firing patterns, which then produce the observed changes in behavior, i.e., learning.  The learned, observed change in behavior is a consequence of the changes in synaptic efficacy.

I think you are confusing the technical definition of learning, i.e., a change in behavior, with the experience of learning.  You say “learning must (cause synapses to change strength) and that is how neural pathways are formed and strengthened”.  I think you may be saying “Well, when I read a book, I’m learning new things and that must be changing neural pathways.”  That’s true.  But the change in pathways is the result of the associations, connections, “meaning” you are detecting in the words and arguments of the book.  If you just read words that had no meaning (say, made-up nonsense words), you would not “learn” anything and there would be no changes in your synapses.  I hope that helps make the distinction clear.  What I am demonstrating is an evolved capability of synapses that underlies ALL the learning that animals do.

If I follow this, what the program does is count the number of positive correlations between a stimulus and a behavior, and once they go above a certain number, that constitutes enough of a correlation to be considered a meaningful association and the device will now respond that way all the time, otherwise it is considered too random and therefore the device will continue to move about randomly?

No, NOTHING is counted, nothing is calculated. No correlation coefficients, no ratios, etc.  I am modeling neurons and synapses, which are too stupid to CALCULATE associations, but are able to detect, record, and change firing patterns as a result of such associations existing in the pattern of stimulus-induced neuronal firings.

What do you mean in the presentation when you say, with respect to the matrix, that the positive neuron changes the synapses?

It’s probable that you don’t know what that means because I don’t tell you!  What is implied is that the positive neuron has something to do with being fed, and that somehow produces the changes in synaptic efficacy that explain OPERANT conditioning, a much tougher nut to crack (than Classical Conditioning).  Specifically, it is a conceptual challenge to understand how a provided reward, which occurs many seconds AFTER the behavior being rewarded, can somehow “go back in time,” interact with neural events (e.g., the firing of neurons) that happened earlier, and then modify something (neural tracks) so that learning occurs. The Matheta model is based on a mechanism that apparently works. We use Robby, running the neural simulation program, to show that if you expose the simulation to the same sequence of events that result in learning in real animals, once again you see the same changes in behavior (i.e., learning) in our “artificial” organism.  We hope that you are intrigued by this capability.
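
One way to picture such a mechanism (the names and numbers below are illustrative, not the actual Matheta rules): the firing of a neuron leaves a decaying “eligibility” at its synapses, and a reward arriving seconds later strengthens whatever is still eligible. No time travel is required:

    // Illustrative eligibility mechanism; the actual Matheta rules differ.
    struct EligibleSynapse {
      float efficacy;
      float eligibility;  // decaying trace that this synapse was recently active
    };

    void onFire(EligibleSynapse &s)     { s.eligibility = 1.0; }  // firing marks the synapse eligible
    void onTimeStep(EligibleSynapse &s) { s.eligibility *= 0.95; } // the trace fades with time

    void onReward(EligibleSynapse &s) {
      // The delayed reward never "goes back in time"; it simply strengthens
      // synapses in proportion to how recently they were active.
      s.efficacy += 10.0 * s.eligibility;
    }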

If you had 20 different Roombas with this program in it and handed them to 20 different people, wouldn’t they end up “learning” different things depending on the whims of the people “teaching” them as to what response and stimulus they decide to repeat the most?

That is correct.  What Robby will learn depends on what contingencies exist.  What we demonstrated was that one could arbitrarily INVENT a contingency (i.e., “I’ll feed you if you turn left when the light is on”) and that operationalizing that contingency causes Robby to learn and respond in a way that takes advantage of it, so as to get fed. That is to say, the contingency exists, and is detectable by Robby, only if I actually “feed” Robby (almost) every time it emits the behavior I have chosen to reward. Other people could choose other contingencies to reward. (For example, cha-cha when the light is on, go forward after the bump sensor is pressed, etc.) Further, the time it would take Robby to learn is entirely dependent upon the reliability with which the user establishes, or operationalizes, the contingency.  If you make a mistake and reward the wrong behavior, or delay the reward too long, or fail to reward it, etc., Robby’s learning will be affected accordingly.

However, the far more important implication is that contingencies in the real world are REAL.  You will eat if and only if you do the right thing at the right time in the right circumstances.   Animals have evolved to be able to detect those contingencies. Robby replicates that ability.

Here’s a link to a Ping-Pong playing robot. Can yours do that?

http://www.designboom.com/technology/kuka-ping-pong-timo-boll-03-11-2014/

This is a great example of the difference between the Matheta approach and conventional robotics.  The Ping-Pong-playing robot works by some system of sensors (probably cameras) and computers (not shown) that CALCULATE the position and velocity (speed and direction) of the ball, then use basic equations (acceleration = force/mass; distance = velocity × time) to calculate where the ball will be and what the robot needs to do to put the paddle at the right place, at the right time, with the right orientation and velocity.

But when YOU play Ping-Pong, you NEVER calculate ANYTHING.  You do not know the distance to the ball in millimeters, nor its speed in meters per second, nor the three-dimensional vector that describes where it’s headed.  BUT you have evolved to be a GENERAL PURPOSE learning machine.  In order to play, you had to LEARN, through thousands of repetitions, how to gauge the distance, speed, and path of the ball.  You had to learn how to hold the racket, and what happens to the ball when you hit it with any tiny variation in force, direction, etc.  All that knowledge is retained in the trillions of synapses among your billions of neurons, and it is so extensive that you can beat a programmed machine.

Like you, Matheta is a general purpose learning machine; the initial implementation of one, with fewer than a dozen neurons. Programmed robots have been around for a long time now and have years of research and development behind them. I’ve written one program. But which approach do you think will win in the long run, a model of animal (including human) learning, or calculation? And remember, you can play Ping-Pong, swim, play chess, discover hidden laws of the universe, create all kinds of things that never before existed, etc., etc.  All because of your neurons and synapses.

How is synaptic efficacy strengthened in the real, i.e., uncontrolled world?  Which is to say, how is learning expressed at the neurobiological level, not as observed changes in behavior, but in the brain?

Let me modify your question a bit:  “How is synaptic efficacy modified in the real world?”  If by that you are asking how real-world processes modify synaptic efficacy, there are two elements to the answer.  The first is that real-world associations and contingencies exist, and when an organism encounters them they modify synaptic efficacy the same way that constructed contingencies do in a Skinner box (or in my demonstrations).

But if you are asking what biochemical process occurs at the synapse that modifies synaptic efficacies, the answer is, to the best of my knowledge, we don’t know.  I have seen some research that suggests we may be on the verge of understanding how synapses get modified, but I’m pretty sure there is no general consensus on that question.  I do know we’ve known for a long time that protein synthesis has something to do with what’s called “consolidating” memory (which means causing it to endure for a long time), because protein-synthesis-inhibiting antibiotics have been shown to disrupt “consolidation” when given shortly after learning trials.

This is perhaps a philosophical or ideological question/matter:  It strikes me that your model of human learning as presented is completely behavioral.  For me, while I think the behavioral approach to learning has value, I do not think it is the whole story. I think that the behaviorists were incorporating cognition into their operant models. Consequences don’t have a straightforward connection to particular learned behaviors.  For example, being punished may not stop a teenager from being on the computer all the time; it might instead lead to rebellious behavior, more secrecy, and greater time spent on a computer.

I am most definitely NOT claiming this is a simulation of HUMAN learning!  Robby has fewer than a dozen neurons!  You have tens of billions (each with hundreds to hundreds of thousands of synapses, i.e., connections to other neurons).  So the capabilities of humans (including cognition) are way, WAY beyond anything I can presently simulate/replicate.  However, I believe the evidence strongly suggests that even the sophisticated learning of humans is still based on the same principles that underlie my simulation.  Specifically, this model is based on the NEURON’S capability to detect, record (“remember”), and use ASSOCIATIONS (i.e., that two stimuli occur simultaneously all the time [spatial associations] or that one always precedes another [temporal associations]).  And “reward”, although it starts as food, can become almost anything else.  In the teenager example, when he is punished he may feel that his independence is not being respected (or that he is being treated like a child), and he then self-rewards (“I’ll show them they can’t control me.”) when he secretly uses the computer. (BTW, in behaviorism an expected punishment that is not administered is a reward. So when our teenager gets away with secretly using his computer, he would “feel” rewarded.)

So we can’t model humans today, but if we had the money, time, person-hours, etc. that went into Watson, the program that beat the Jeopardy champions, we believe we could come pretty close.