A Logical Calculus of the Ideas Immanent in Nervous Activity


"A Logical Calculus of the Ideas Immanent in Nervous Activity" is a 1943 article written byWarren McCulloch andWalter Pitts.[1] The paper, published in the journalThe Bulletin of Mathematical Biophysics, proposed a mathematical model of the nervous system as a network of simple logical elements, later known as artificial neurons, orMcCulloch-Pitts neurons. These neurons receive inputs, perform a weighted sum, and fire an output signal based on a threshold function. By connecting these units in various configurations, McCulloch and Pitts demonstrated that their model could perform all logical functions.

It is a seminal work in cognitive science, computational neuroscience, computer science, and artificial intelligence. It was a foundational result in automata theory. John von Neumann cited it as a significant result.[2]

Mathematics


The artificial neuron used in the original paper is slightly different from the modern version. They considered neural networks that operate in discrete steps of time $t = 0, 1, \dots$.

The neural network contains a number of neurons. Let the state of a neuron $i$ at time $t$ be $N_i(t)$. The state of a neuron can be either 0 or 1, standing for "not firing" and "firing". Each neuron also has a firing threshold $\theta$, such that it fires if the total input reaches the threshold.

Each neuron can connect to any other neuron (including itself) with positive synapses (excitatory) or negative synapses (inhibitory). That is, each neuron can connect to another neuron with a weight $w$ taking an integer value. A peripheral afferent is a neuron with no incoming synapses.

We can regard each neural network as a directed graph, with the nodes being the neurons and the directed edges being the synapses. A neural network contains a circle (the paper's term for a cycle) if there exists a directed cycle in the graph.

Let $w_{ij}(t)$ be the connection weight from neuron $j$ to neuron $i$ at time $t$. The next state of neuron $i$ is

$$N_i(t+1) = H\left(\sum_{j=1}^{n} w_{ij}(t)\, N_j(t) - \theta_i(t)\right),$$

where $H$ is the Heaviside step function (outputting 1 if the input is greater than or equal to 0, and 0 otherwise).
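
As an illustration, the following Python sketch (a minimal sketch of the update rule above, not code from the paper; the gate weights are one choice among many) implements the synchronous update and shows single units computing elementary logic gates:

```python
# Minimal sketch of a McCulloch-Pitts network (illustrative; not from the paper).
# State update: N_i(t+1) = H(sum_j w_ij * N_j(t) - theta_i),
# with the Heaviside convention H(x) = 1 iff x >= 0, as defined above.

def heaviside(x):
    """H(x) = 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def step(weights, thresholds, state):
    """One synchronous update of all neurons.

    weights[i][j]: integer synapse weight from neuron j to neuron i.
    state[j]:      N_j(t) in {0, 1}.
    Returns the state vector N(t+1).
    """
    return [
        heaviside(sum(w * n for w, n in zip(row, state)) - theta)
        for row, theta in zip(weights, thresholds)
    ]

# Single units computing two-input logic gates (weights are one choice
# among many satisfying the threshold inequalities):
AND = ([[1, 1]], [2])   # fires iff both inputs fire
OR  = ([[1, 1]], [1])   # fires iff at least one input fires
for name, (w, th) in [("AND", AND), ("OR", OR)]:
    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(name, (a, b), "->", step(w, th, [a, b])[0])
```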

Symbolic logic


The paper used, as a logical language for describing neural networks, "Language II" from The Logical Syntax of Language by Rudolf Carnap, with some notations taken from Principia Mathematica by Alfred North Whitehead and Bertrand Russell. Language II covers substantial parts of classical mathematics, including real analysis and portions of set theory.[3]

To describe a neural network with peripheral afferents $N_1, N_2, \dots, N_p$ and non-peripheral afferents $N_{p+1}, N_{p+2}, \dots, N_n$, they considered logical predicates of the form $Pr(N_1, N_2, \dots, N_p, t)$, where $Pr$ is a first-order logic predicate function (a function that outputs a boolean), $N_1, \dots, N_p$ are predicates that take $t$ as an argument, and $t$ is the only free variable in the predicate. Intuitively speaking, $N_1, \dots, N_p$ specify the binary input patterns going into the neural network over all time, and $Pr$ takes these input patterns and constructs an output binary pattern $Pr(N_1, \dots, N_p, 0), Pr(N_1, \dots, N_p, 1), \dots$.

A logical sentence $Pr(N_1, N_2, \dots, N_p, t)$ is realized by a neural network iff there exist a time delay $T \geq 0$, a neuron $i$ in the network, and an initial state for the non-peripheral neurons $N_{p+1}(0), \dots, N_n(0)$, such that for every time $t$, the truth value of the logical sentence equals the state of neuron $i$ at time $t+T$. That is,

$$\forall t = 0, 1, 2, \dots, \quad Pr(N_1, N_2, \dots, N_p, t) = N_i(t+T).$$
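
For example (a simple illustrative case), a neuron $i$ receiving excitatory weights $w_{i1} = w_{i2} = 1$ from two peripheral afferents, with threshold $\theta_i = 2$, satisfies $N_i(t+1) = N_1(t) \wedge N_2(t)$; it therefore realizes the sentence $Pr(N_1, N_2, t) = N_1(t) \wedge N_2(t)$ with delay $T = 1$.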

Equivalence


In the paper, they considered some alternative definitions of artificial neural networks and showed them to be equivalent, that is, neural networks under one definition realize precisely the same logical sentences as neural networks under another definition.

They considered three forms of inhibition: relative inhibition, absolute inhibition, and extinction. The definition above is relative inhibition. By "absolute inhibition" they meant that if any negative synapse fires, then the neuron will not fire, no matter how strong the excitation. By "extinction" they meant that if at time $t$ any inhibitory synapse fires on a neuron $i$, then $\theta_i(t+j) = \theta_i(0) + b_j$ for $j = 1, 2, 3, \dots$, until the next time an inhibitory synapse fires on $i$. It is required that $b_j = 0$ for all sufficiently large $j$.

Theorems 4 and 5 state that these are equivalent.
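
A sketch of the distinction, under the conventions of the code above (our illustration, not the paper's formalism): under absolute inhibition, a single firing inhibitory synapse vetoes the neuron outright, whatever the excitatory total.

```python
def step_absolute(weights, thresholds, state):
    """Synchronous update under *absolute* inhibition: if any firing
    presynaptic neuron connects through a negative synapse, the neuron
    stays silent; otherwise the usual threshold rule applies to the
    excitatory inputs alone."""
    next_state = []
    for row, theta in zip(weights, thresholds):
        vetoed = any(w < 0 and n == 1 for w, n in zip(row, state))
        excite = sum(w * n for w, n in zip(row, state) if w > 0)
        next_state.append(0 if vetoed else int(excite - theta >= 0))
    return next_state
```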

They considered three forms of excitation: spatial summation, temporal summation, and facilitation. The definition above is spatial summation (which they pictured as having multiple synapses placed close together, so that the effects of their firing sum up). By "temporal summation" they meant that the total incoming signal is $\sum_{\tau=0}^{T} \sum_{j=1}^{n} w_{ij}(t)\, N_j(t-\tau)$ for some $T \geq 1$. By "facilitation" they meant the same as extinction, except that $b_j \leq 0$. Theorem 6 states that these are equivalent.
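
A corresponding sketch of temporal summation (again our illustration): the unit pools inputs over a sliding window of the last $T+1$ time steps.

```python
def step_temporal(weights, thresholds, history, T):
    """Temporal summation: neuron i receives
    sum_{tau=0..T} sum_j w_ij * N_j(t - tau),
    where `history` is the list of past state vectors, newest last."""
    window = history[-(T + 1):]
    return [
        int(sum(w * n for past in window
                      for w, n in zip(row, past)) - theta >= 0)
        for row, theta in zip(weights, thresholds)
    ]
```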

They considered neural networks that do not change, and those that change by Hebbian learning. That is, they assumed that at $t = 0$ some excitatory synaptic connections are latent (not yet active). If at any time $t$ both $N_i(t) = 1$ and $N_j(t) = 1$, then any latent excitatory synapse between $i$ and $j$ becomes active. Theorem 7 states that these are equivalent.
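
The alterable-synapse rule can be sketched as follows (our reading of the setup behind Theorem 7; the two-matrix data layout is our own assumption):

```python
def hebbian_step(weights, latent, state):
    """Activate latent excitatory synapses: whenever neurons i and j
    fire at the same time, any latent synapse between them is copied
    into the live weight matrix and stays active thereafter.
    latent[i][j] holds the weight of a not-yet-active synapse, or 0."""
    for i in range(len(latent)):
        for j in range(len(latent[i])):
            if latent[i][j] > 0 and state[i] == 1 and state[j] == 1:
                weights[i][j] += latent[i][j]   # becomes permanently active
                latent[i][j] = 0
    return weights, latent
```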

Logical expressivity


They considered "temporal propositional expressions" (TPE), which arepropositional formulas with one free variablet{\displaystyle t}. For example,N1(t)N2(t)¬N3(t){\displaystyle N_{1}(t)\vee N_{2}(t)\wedge \neg N_{3}(t)} is such an expression. Theorem 1 and 2 together showed that neural nets without circles are equivalent to TPE.

For neural nets with circles (loops), they noted that realizable $Pr$ "may involve reference to past events of an indefinite degree of remoteness". Such nets can encode sentences like "There was some x such that x was a ψ", that is, $(\exists x)(\psi x)$. Theorems 8 to 10 showed that neural nets with circles can encode all of first-order logic with equality, and conversely, any looped neural network is equivalent to a sentence in first-order logic with equality, thus showing that they are equivalent in logical expressiveness.[4]
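
A minimal concrete case (our illustration): a single self-exciting neuron acts as a latch that remembers whether its input ever fired, realizing the existential sentence "$N_1$ fired at some time $\tau \leq t$" with unit delay. Reusing `step()` from the sketch above:

```python
# Neuron 0 stands in for the peripheral afferent N1 (its own row is
# disabled by a high threshold); neuron 1 is the latch, with weight 1
# from N1 and weight 1 from itself, threshold 1:
#   N2(t+1) = H(N1(t) + N2(t) - 1)
weights, thresholds = [[0, 0], [1, 1]], [99, 1]
state = [0, 0]
for n1 in [0, 0, 1, 0, 0]:          # N1 fires exactly once, at t = 2
    state = step(weights, thresholds, [n1, state[1]])
    print(n1, "->", state[1])       # latch output: 0, 0, 1, 1, 1
```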

As a remark, they noted that a neural network, if furnished with a tape, scanners, and write-heads, is equivalent to a Turing machine, and conversely, every Turing machine is equivalent to some such neural network. Thus, these neural networks are equivalent in power to Turing computability, Church's lambda-definability, and Kleene's primitive recursiveness.

Context


Previous work


The paper built upon several previous strands of work.[5][6]

On the symbolic logic side, the paper built on the previous work by Carnap, Whitehead, and Russell. This side was contributed by Walter Pitts, who had a strong proficiency with symbolic logic. Pitts provided mathematical and logical rigor to McCulloch's vague ideas on psychons (atoms of psychological events) and circular causality.[7]

On the neuroscience side, it built on previous work by the mathematical biology research group centered around Nicolas Rashevsky, of which McCulloch was a member. The paper was published in the Bulletin of Mathematical Biophysics, which Rashevsky had founded in 1939. During the late 1930s, Rashevsky's research group was producing papers that had difficulty getting published in other journals at the time, so Rashevsky decided to found a new journal devoted exclusively to mathematical biophysics.[8]

Also in Rashevsky's group was Alston Scott Householder, who in 1941 published an abstract model of the steady-state activity of biological neural networks; in modern language, it is an artificial neural network with a ReLU activation function.[9] In a series of papers, Householder calculated the stable states of very simple networks: a chain, a circle, and a bouquet. Walter Pitts' first two papers formulated a mathematical theory of learning and conditioning, and the next three were mathematical developments of Householder's model.[10]

In 1938, at age 15, Pitts ran away from home in Detroit and arrived at the University of Chicago. He later walked into Rudolf Carnap's office with Carnap's book filled with corrections and suggested improvements, and went on to study under Carnap and attend classes from 1938 to 1943. He wrote several early papers on neuronal network modelling and regularly attended Rashevsky's seminars in theoretical biology, whose attendees included Gerhard von Bonin and Householder. In 1940, von Bonin introduced Jerome Lettvin to McCulloch, and by 1942 both Lettvin and Pitts had moved into McCulloch's home.[11]

McCulloch had been interested in circular causality from studies of causalgia after amputation, the epileptic activity of the surgically isolated brain, and Lorente de Nó's research showing that recurrent neural networks are needed to explain vestibular nystagmus. He had difficulty treating circular causality until Pitts demonstrated how it could be handled with the appropriate mathematical tools of modular arithmetic and symbolic logic.[4][10]

Both authors' affiliation in the article was given as "University of Illinois, College of Medicine, Department of Psychiatry at the Illinois Neuropsychiatric Institute, University of Chicago, Chicago, U.S.A."

Subsequent work


The paper was a foundational result in automata theory, and John von Neumann cited it as a significant result.[2] It led to further work on neural networks and their link to finite automata.[12] Kleene introduced the term "regular" (as in "regular language") in a 1951 technical report, in which he proved that regular languages are exactly what such neural networks can generate, among other results. The term "regular" was meant to be suggestive of the "regularly occurring events" that the neural-net automaton must process and respond to.[13]
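
For instance (our illustration, in the spirit of Kleene's "regular events"), the event "the input fired at both of the two most recent steps" is regular, and a two-unit net, one delay unit feeding one AND unit, represents it:

```python
def fired_last_two(inputs):
    """Fires at time t iff the external input fired at t and at t-1:
    a delay unit remembers the previous input; an AND unit (threshold 2)
    conjoins it with the current input."""
    delayed, outputs = 0, []
    for x in inputs:
        outputs.append(int(x + delayed - 2 >= 0))  # AND unit
        delayed = x                                # delay unit
    return outputs

print(fired_last_two([1, 1, 0, 1, 1, 1]))  # [0, 1, 0, 0, 1, 1]
```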

Marvin Minsky was influenced by McCulloch, built an early neural network machine, SNARC (1951), and wrote a PhD thesis on neural networks (1954).[14]

McCulloch chaired the ten Macy conferences (1946–1953) on "Circular Causal and Feedback Mechanisms in Biological and Social Systems". These conferences were a key event in the beginning of cybernetics and of what later became known as cognitive science. Pitts also attended them.[15]

In the 1943 paper, they described how memories can be formed by a neural network with loops in it, or with alterable synapses, operating over time and implementing the logical universals "there exists" and "for all". This was generalized to spatial objects, such as geometric figures, in their 1947 paper How we know universals.[16] Norbert Wiener found this significant evidence for a general method by which animals recognize objects: scanning a scene through multiple transformations and finding a canonical representation. He hypothesized that this "scanning" activity is clocked by the alpha wave, which he mistakenly thought was tightly regulated at 10 Hz (rather than the 8–13 Hz range that modern research shows).[17]

McCulloch worked with Manuel Blum on how a neural network can be made "logically stable", that is, able to implement a given boolean function even if the activation thresholds of individual neurons are varied.[18]: 64  They were inspired by the problem of how the brain can perform the same functions, such as breathing, under the influence of caffeine or alcohol, which shift activation thresholds across the entire brain.[4]


References

  1. McCulloch, Warren S.; Pitts, Walter (December 1943). "A logical calculus of the ideas immanent in nervous activity". The Bulletin of Mathematical Biophysics. 5 (4): 115–133. doi:10.1007/BF02478259. ISSN 0007-4985.
  2. von Neumann, J. (1951). "The general and logical theory of automata". In L. A. Jeffress (ed.), Cerebral Mechanisms in Behavior; the Hixon Symposium (pp. 1–41). Wiley.
  3. "Rudolf Carnap > G. Logical Syntax of Language (Stanford Encyclopedia of Philosophy)". plato.stanford.edu. Retrieved 2024-10-13.
  4. McCulloch, Warren (1961). "What is a number, that a man may know it, and a man, that he may know a Number" (PDF). General Semantics Bulletin (26 & 27): 7–18.
  5. Abraham, Tara H. (2002). "(Physio)logical circuits: The intellectual origins of the McCulloch-Pitts neural networks". Journal of the History of the Behavioral Sciences. 38 (1): 3–25. doi:10.1002/jhbs.1094. ISSN 0022-5061. PMID 11835218.
  6. Piccinini, Gualtiero (August 2004). "The First Computational Theory of Mind and Brain: A Close Look at McCulloch and Pitts's 'Logical Calculus of Ideas Immanent in Nervous Activity'". Synthese. 141 (2): 175–215. doi:10.1023/B:SYNT.0000043018.52445.3e. ISSN 0039-7857.
  7. Aizawa, Kenneth (September 2012). "Warren McCulloch's Turn to Cybernetics: What Walter Pitts Contributed". Interdisciplinary Science Reviews. 37 (3): 206–217. Bibcode:2012ISRv...37..206A. doi:10.1179/0308018812Z.00000000017. ISSN 0308-0188.
  8. Abraham, Tara H. (2004). "Nicolas Rashevsky's Mathematical Biophysics". Journal of the History of Biology. 37 (2): 333–385. doi:10.1023/B:HIST.0000038267.09413.0d. ISSN 0022-5010.
  9. Householder, Alston S. (June 1941). "A theory of steady-state activity in nerve-fiber networks: I. Definitions and preliminary lemmas". The Bulletin of Mathematical Biophysics. 3 (2): 63–69. doi:10.1007/BF02478220. ISSN 0007-4985.
  10. Schlatter, Mark; Aizawa, Ken (May 2008). "Walter Pitts and 'A Logical Calculus'". Synthese. 162 (2): 235–250. doi:10.1007/s11229-007-9182-9. ISSN 0039-7857.
  11. Smalheiser, Neil R. (December 2000). "Walter Pitts". Perspectives in Biology and Medicine. 43 (2): 217–226. doi:10.1353/pbm.2000.0009. ISSN 1529-8795. PMID 10804586.
  12. Kleene, S. C. (1956). "Representation of Events in Nerve Nets and Finite Automata". In Shannon, C. E.; McCarthy, J. (eds.), Automata Studies (AM-34). Princeton University Press. pp. 3–42. doi:10.1515/9781400882618-002. ISBN 978-1-4008-8261-8.
  13. Kleene, Stephen Cole (December 1951). Representation of Events in Nerve Nets and Finite Automata (PDF) (Research Memorandum). U.S. Air Force / RAND Corporation. Here: p. 46.
  14. Arbib, Michael A. (2000). "Warren McCulloch's Search for the Logic of the Nervous System". Perspectives in Biology and Medicine. 43 (2): 193–216. doi:10.1353/pbm.2000.0001. ISSN 1529-8795. PMID 10804585.
  15. "Summary: The Macy Conferences". asc-cybernetics.org. Retrieved 2024-10-14.
  16. Pitts, Walter; McCulloch, Warren S. (1947-09-01). "How we know universals: the perception of auditory and visual forms". The Bulletin of Mathematical Biophysics. 9 (3): 127–147. doi:10.1007/BF02478291. ISSN 1522-9602. PMID 20262674.
  17. Masani, P. R. (1990). "McCulloch, Pitts and the Evolution of Wiener's Neurophysiological Ideas". In Norbert Wiener 1894–1964. Basel: Birkhäuser. pp. 218–238. doi:10.1007/978-3-0348-9252-0_16. ISBN 978-3-0348-9963-5.
  18. Blum, Manuel (1961). "Properties of a neuron with many inputs". Bionics Symposium: Living Prototypes, the Key to New Technology, 13–15 September 1960. WADD Technical Report 60-600.