<p><em>Introduction to statistical causal inference. Ben Lansdell, 2018-04-03. benlansdell.github.io/statistics/causalinference</em></p>
<h1 id="but-what-is-causality">But what is causality?</h1>
<p>Say we observe a negative relationship between number of apples eaten
per day and heart disease. Does this relationship mean that apples are
protective against disease? Maybe. It is well known that correlation
does not imply causation. Perhaps in this case number of apples eaten
per day correlates with general diet, or general fitness, which instead
are the cause of lower heart disease. Such factors are a source of
confounding. How do we distinguish between these possibilities? A
statistician answers these causal inference questions in two ways: by
considering counterfactuals and interventions.</p>
<p>A counterfactual is simply a potential event that did not occur. A given
patient either does or does not receive the treatment on a given trial.
Whichever event does not occur is the counterfactual. Under a
counterfactual account of causality to claim that a proposed treatment
causes disease remission is to claim that had the patient not received
the treatment then the disease outcome would be different (greater).</p>
<p>Interventionist accounts are similar but focus on the notion of
manipulability. Here to claim that one variable causes another is to
claim that if through intervention one variable is forced to a given
state then a change in the other variable will be observed. Here the
notion of intervention is treated as a primitive and causal
relationships are derived from that.</p>
<p>Thus, had a given patient <em>not</em> had so many apples per day, would their
health be worse? And, if a patient was <em>forced</em> to eat many apples per
day, would their health be better? Here we focus on frameworks that
attempt to answer these questions in the presence of confounding. The
basic idea is that if we observe all the factors that could reasonably
confound the estimate, then we can correct for them.</p>
<h1 id="learning-causal-relationships">Learning causal relationships</h1>
<p>Randomized controlled trials (RCTs) are the gold standard for causal
inference. The idea simply being that if assignment to a treatment group
is randomized then the distribution of covariates in the control and
treatment groups will be identical, and therefore any difference in
outcome between the control and treatment groups can then only be
attributed to the fact that one group received a treatment while the
other did not.</p>
<p>However, sometimes RCTs are difficult, expensive, or unethical to
perform. This motivates considering when causal relationships can be
inferred from observational data alone. In the absence of randomization,
receiving treatment may be correlated with many other factors which
could also impact the outcome. What are conditions in which the effects
of confounding can be mitigated?</p>
<p>Counterfactual outcomes are not observed for individual patients – they
either receive a treatment or do not. This is known as the fundamental
problem of causal inference. As a result often we need to (or in fact
want to) consider aggregate causal effects estimated over a population.
This has two consequences for analysis.</p>
<p>The first is that in considering causal relationships in this aggregate
sense, the timing of pairs of events is often unspecified, vaguely
defined, or implicit in how data is collected. By losing this timing
information it is harder to analyze cases of mutual causation. Thus the
assumption made here is that one variable is the cause of another, or
vice versa, or not at all – there is a directedness to the relationship
over the time window in which observations are made. In addition to
excluding mutual causation from consideration, it is simplest to further
exclude causal loops (e.g. $A \to B \to C \to A$). The
second consequence is that, by considering a population of
subjects/events, it becomes more necessary to allow for probabilistic
causal relationships, in which one variable’s occurrence affects
another’s probability of occurring and the relationship need in no way
be deterministic. These considerations motivate summarizing causal
relationships between a set of variables using directed acyclic graphs
(DAGs), and using a probabilistic framework.</p>
<h1 id="counterfactuals-the-causal-effect-as-difference-in-potential-outcomes">Counterfactuals: the causal effect as difference in potential outcomes</h1>
<p>Measuring causal effects in terms of counterfactuals is a relatively old
idea (as far as statistics goes), dating back to 1923 from work of
Neyman. The Neyman-Rubin causal model provides a framework for reasoning
about causal effects with counterfactuals. In a simple setting, the
model considers two <em>potential outcomes</em>: an outcome when a subject does
receive a treatment, $Y(1)$, and an outcome when a subject does not
receive a treatment, $Y(0)$ (i.e. a control subject). For a given
subject, $i$, the <em>causal effect</em> is the difference in potential
outcomes:
<script type="math/tex">\begin{align}E_i = Y_i(1)-Y_i(0).\end{align}</script></p>
<p>If we let $W_i$ be a treatment random variable then assuming consistency between potential and
observed outcome, $Y_i$, we have:
<script type="math/tex">\begin{align}\label{eq:consistent}
Y_i = W_iY_i(1) + (1-W_i)Y_i(0).\end{align}</script></p>
<p>As an aside, note that the potential outcomes $Y(0)$ and $Y(1)$ are
treated as kinds of hypothetical random variables. In a sense neither is
observed, and they are only related to observation through the
consistency assumption above. This is a somewhat subtle point that is perhaps not well
reflected in the notation. Equations in causal models can have quite
different interpretations to standard statistical models, despite having
similar notation, which is important to be aware of.</p>
<p>Per the <em>fundamental problem of causal inference</em>, only one of these
potential outcomes is ever observed. To get around this, causal effects
can be measured over a population of subjects, some of which receive the
treatment and some of which do not. Over a population we can consider
the <em>average causal effect</em>:
<script type="math/tex">\begin{align}\tau = \mathbb{E}(Y(1)-Y(0)).\end{align}</script></p>
<p>If $W_i$ is assigned to each subject at random then $\tau$ can be
computed directly from the treatment and control subpopulation means. In
randomized cases, $W_i$ is independent from the potential outcomes. If
$W_i$ were not independent from the potential outcomes then the measured
causal effect (difference in means) could simply be a result of this
correlation.</p>
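<p>As a quick numerical illustration (a pure-Python sketch with made-up numbers, not from the text): generate both potential outcomes for every subject, randomize assignment, and observe that the difference in group means recovers the average causal effect.</p>

```python
import random

random.seed(0)

def simulate(n=100_000, tau=2.0):
    """Randomized assignment: the difference in observed group means
    estimates the average causal effect tau."""
    treated, control = [], []
    for _ in range(n):
        y0 = random.gauss(0, 1)    # potential outcome without treatment
        y1 = y0 + tau              # potential outcome with treatment
        if random.random() < 0.5:  # randomized treatment assignment
            treated.append(y1)     # consistency: observe Y(1) if treated
        else:
            control.append(y0)     # ... and Y(0) otherwise
    return sum(treated) / len(treated) - sum(control) / len(control)

print(round(simulate(), 1))  # close to tau = 2.0
```

<p>If assignment instead depended on the potential outcomes, the same difference-in-means estimator would be biased.</p>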
<h2 id="causal-assumptions-for-identifiability">Causal assumptions for identifiability</h2>
<p>Being able to measure a causal effect in an unbiased (unconfounded) way
means the effect is <em>identifiable</em>. Within this counterfactual
framework, this linking of potential outcomes to causal effects relies
on four <em>causal assumptions</em>. Some of these have been alluded to above.
They are:</p>
<ol>
<li>
<p>(<strong>SUTVA</strong>) Stable Unit Treatment Value Assumption. This means:</p>
<ul>
<li>
<p>There is no interference in treatments – one subject receiving
treatment does not affect others’ treatment.</p>
</li>
<li>
<p>There is only one form of treatment.</p>
</li>
</ul>
</li>
<li>
<p>(<strong>Consistency</strong>) This assumption links the hypothetical potential
outcomes to observed data. If we assume consistency then we are
assuming:
<script type="math/tex">\begin{align}Y_i = W_iY_i(1) + (1-W_i)Y_i(0),\end{align}</script> as discussed above.</p>
</li>
<li>
<p>(<strong>No unmeasured confounders/ignorability</strong>) The treatment
assignment is independent of the potential outcomes:
<script type="math/tex">\begin{align}Y(1),Y(0) {\mathrel{\text{$\perp\mkern-10mu\perp$}}}W.\end{align}</script></p>
<p>In most cases of interest both the outcome and treatment variable
are related to a set of observed covariates, $X$. Causal inference
then requires:
<script type="math/tex">\begin{align}Y(1),Y(0) {\mathrel{\text{$\perp\mkern-10mu\perp$}}}W | X.\end{align}</script>
In RCTs this assumption may be reasonable. This says that the
distribution of potential outcomes $(Y(1), Y(0))$ is the same across
treatment levels $W$, conditioned on $X$. In observational settings
often this is the primary assumption that is a road block
to identifiability.</p>
<p>Another way to understand this is as follows. We want to relate
observed quantities to hypothetical potential outcomes. We can do
this if we assume ignorability: <script type="math/tex">% <![CDATA[
\begin{aligned}
\mathbb{E}(Y|W=1)-\mathbb{E}(Y|W=0) &= \mathbb{E}(WY(1)+(1-W)Y(0)|W=1)-\mathbb{E}(WY(1)+(1-W)Y(0)|W=0)\\
&= \mathbb{E}(Y(1)|W=1)-\mathbb{E}(Y(0)|W=0)\\
\text{(ignorability)}\quad &= \mathbb{E}(Y(1) - Y(0))\\
&= \tau\end{aligned} %]]></script></p>
</li>
<li>
<p>(<strong>Positivity</strong>) Additionally, causal inference requires a non-zero
probability of assignment to a treatment group for all subjects:
<script type="math/tex">% <![CDATA[
\begin{align}0 < \mathbb{P}(W_i = 1|X_i = x) < 1, \quad \forall x.\end{align} %]]></script> This is
known as the <em>positivity</em>, or overlap, assumption.</p>
<p>Simply, a causal effect cannot be measured if no subjects receive
the treatment, or they all do.</p>
</li>
</ol>
<h1 id="directed-acyclic-graphs-and-probability-distributions">Directed acyclic graphs and probability distributions</h1>
<p>In a sense the conditional independence between treatment and potential
outcome is the main assumption that requires analysis in the above set
of assumptions. This analysis can be aided by encoding our assumptions
about the relations between different variables in a graph. This section
defines and describes the behavior of these graphs. The following
section contains criteria that can be used to identify sets of variables
that are sufficient to act as controls, that remove the effect of
confounding and hence that satisfy ignorability. These models are types
of graphical models, sometimes known as <em>Bayesian networks</em>, and were
first developed by Pearl in the 1980s.</p>
<p>Here we will consider a set of random variables $\mathcal{X}$ as nodes
on a directed acyclic graph $\mathcal{G}$. Let this graph have edges
$\mathcal{E}$ that represent relations between the variables.
Ignorability requires conditional independence of the outcome from the
treatment variable, so here we will let the directed edges encode
conditional independence assumptions. (A <em>causal Bayesian network</em> has
additional semantics that are discussed in Section [sec:cbn]. For the
moment the directed edges only encode information about conditional
independence.)</p>
<p>First note that the DAG imposes an ordering on the variables
$\mathcal{X}$, from which we can talk about a node’s parents, children,
ancestors or descendants. Note also that any multivariate distribution
can be decomposed into a product of conditional probabilities for any
ordering of the variables:
<script type="math/tex">\begin{align}P(X) = \prod_{j=1}^N P(X_j|\{X_k\}_{k>j}).\end{align}</script></p>
<p>Given this, if we assume that the variables are ordered in a way that
respects the ordering of the DAG, then we will say $\mathcal{X}$ is a
Bayesian network with respect to $\mathcal{G}$ if the joint distribution
over variables $\mathcal{X}$ factors according to:
<script type="math/tex">\begin{align}P(X) = \prod_{j=1}^N P(X_j|\text{Pa}(X_j)),\end{align}</script> where $\text{Pa}(X_j)$
is the parents of node $X_j$. That is, each node $X_j$ is conditionally
independent of its non-descendants given its parents:
<script type="math/tex">\begin{align}P(X_j|\{X_k\}_{k>j}) = P(X_j|\text{Pa}(X_j)).\end{align}</script> This is the <em>Markov
condition</em>, or Markov assumption, for a Bayesian network. A node is
conditionally independent of the entire network given its <em>Markov
blanket</em> – its parents, its children, and its children’s other parents.</p>
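<p>As a concrete sketch (arbitrary made-up probabilities), the factorization and the Markov condition can be checked numerically for the chain $A \to B \to C$:</p>

```python
import itertools

# CPTs for the chain A -> B -> C (binary variables); the numbers are arbitrary.
pA = {0: 0.6, 1: 0.4}
pB_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # pB_A[a][b] = P(B=b|A=a)
pC_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # pC_B[b][c] = P(C=c|B=b)

# Joint distribution via the Markov factorization P(a,b,c) = P(a)P(b|a)P(c|b)
joint = {(a, b, c): pA[a] * pB_A[a][b] * pC_B[b][c]
         for a, b, c in itertools.product([0, 1], repeat=3)}

def marg(fixed):
    """Sum the joint over entries matching the fixed coordinate values."""
    return sum(p for (a, b, c), p in joint.items()
               if all(v == (a, b, c)[i] for i, v in fixed.items()))

# Markov condition here: A and C are independent given B.
for b in (0, 1):
    p_ac = marg({0: 1, 1: b, 2: 1}) / marg({1: b})
    p_a = marg({0: 1, 1: b}) / marg({1: b})
    p_c = marg({1: b, 2: 1}) / marg({1: b})
    assert abs(p_ac - p_a * p_c) < 1e-12
print("A is independent of C given B in the factored joint")
```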
<p>Often also invoked is the <em>faithfulness condition</em>, which is the
condition that the conditional independencies implied by the graph are
the only conditional independencies in the distribution. E.g. assuming
faithfulness in the graph $A \to B$ says that there is in fact a
dependence between $A$ and $B$.</p>
<h2 id="some-types-of-graphs">Some types of graphs</h2>
<p>Some properties of a Bayesian network can be inferred graphically. For
instance three basic components of DAGs are:</p>
<ol>
<li>
<p>Chain: $A \to B \to C$</p>
</li>
<li>
<p>Fork: $A \leftarrow B \to C$</p>
</li>
<li>
<p>Collider (inverted fork): $A \to B \leftarrow C$</p>
</li>
</ol>
<p>These graphs behave differently when conditioning on parts of them.
Compare the fork and the inverted fork.</p>
<ul>
<li>
<p>For the fork, $A$ and $C$ are dependent. Yet when conditioned on
$B$, $A$ and $C$ become independent.</p>
</li>
<li>
<p>The converse is true for the inverted fork. Without conditioning,
$A$ and $C$ are independent. Yet when conditioned on $B$, $A$ and
$C$ become dependent. This may seem a little counter-intuitive. An
example of this phenomenon is if $B$ is determined through tossing
two independent coins, $A$ and $C$:
<script type="math/tex">\begin{align}B = \begin{cases}
1, \quad A=H, C = H;\\
0, \quad \text{else.}
\end{cases}\end{align}</script> By itself, knowing $A$ tells you nothing about $C$.
But knowing $B$ and $A$ together tells you something about $C$.</p>
</li>
</ul>
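<p>This coin example can be simulated directly (a pure-Python sketch):</p>

```python
import random
random.seed(1)

# Two fair coins A and C; B = 1 only when both land heads (the collider).
data = [(a, c, int(a and c))
        for a, c in ((random.random() < 0.5, random.random() < 0.5)
                     for _ in range(100_000))]

# Unconditionally, A tells us nothing about C.
p_c = sum(c for _, c, _ in data) / len(data)
p_c_given_a = sum(c for a, c, _ in data if a) / sum(a for a, _, _ in data)

# Conditioned on B = 0, learning A = heads makes C = heads impossible.
b0 = [(a, c) for a, c, b in data if b == 0]
p_c_b0 = sum(c for _, c in b0) / len(b0)
p_c_b0_a = sum(c for a, c in b0 if a) / max(1, sum(a for a, _ in b0))

print(p_c, p_c_given_a)  # both near 0.5: unconditional independence
print(p_c_b0, p_c_b0_a)  # near 1/3 vs exactly 0: conditional dependence
```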
<p>Note that the chain behaves in the same way as the fork: $A$ and $C$
are dependent, yet when conditioned on $B$ they become independent. The
two structures therefore cannot be distinguished by conditional
independence relations alone.</p>
<h2 id="d-separation">d-separation</h2>
<p>For more complicated graphs, are a given set of variables sufficient
controls to render two nodes conditionally independent? Here the notion
of <em>d-separation</em> is useful.</p>
<p>The d stands for directional. Let $P$ be a path from node $u$ to $v$. A
path is a loop-free, undirected (i.e. all edge directions are ignored)
path between two nodes. Then $P$ is said to be d-separated by a set of
nodes $Z$ if any of the following conditions holds:</p>
<ul>
<li>
<p>$P$ contains a directed chain, $u \cdots \to m \to \cdots v$, such
that the middle node $m$ is in $Z$, or</p>
</li>
<li>
<p>$P$ contains a fork, $u \cdots \leftarrow m \to \cdots v$, such
that the middle node $m$ is in $Z$, or</p>
</li>
<li>
<p>$P$ contains an inverted fork (or collider),
$u \cdots \to m \leftarrow \cdots v$, such that the middle node $m$
is <em>not</em> in $Z$ and no descendant of $m$ is in $Z$.</p>
</li>
</ul>
<p>Nodes $u$ and $v$ are said to be d-separated by $Z$ if all paths between
them are d-separated. If $u$ and $v$ are not d-separated, they are
called d-connected.</p>
<p>We have the result that $X_u$ and $X_v$ being d-separated by $Z$ tells
us that $X_u$ and $X_v$ are conditionally independent given $Z$.</p>
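<p>One standard way to test d-separation programmatically is via the moralized ancestral graph: restrict the DAG to the ancestors of the query nodes, marry co-parents, drop edge directions, delete the conditioning set, and check connectivity. A minimal sketch (the parent-map representation of the DAG is a choice made here, not from the text):</p>

```python
from collections import deque

def ancestors(dag, nodes):
    """All ancestors of `nodes`, including the nodes themselves.
    `dag` maps each node to the set of its parents."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in dag[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(dag, u, v, z):
    """Restrict to the ancestral graph of {u, v} | z, moralize (marry
    co-parents, drop directions), delete z, and test connectivity."""
    keep = ancestors(dag, {u, v} | set(z))
    adj = {n: set() for n in keep}
    for n in keep:
        ps = sorted(dag[n])               # parents (ancestor-closed, so in keep)
        for p in ps:                      # undirected parent edges
            adj[n].add(p); adj[p].add(n)
        for i, a in enumerate(ps):        # marry co-parents
            for b in ps[i + 1:]:
                adj[a].add(b); adj[b].add(a)
    frontier, seen = deque([u]), {u} | set(z)   # z blocks the search
    while frontier:
        for n in adj[frontier.popleft()] - seen:
            if n == v:
                return False
            seen.add(n)
            frontier.append(n)
    return True

# chain A -> B -> C and collider A -> B <- C, as parent maps
chain = {'A': set(), 'B': {'A'}, 'C': {'B'}}
collider = {'A': set(), 'B': {'A', 'C'}, 'C': set()}
print(d_separated(chain, 'A', 'C', {'B'}))     # True: conditioning blocks
print(d_separated(collider, 'A', 'C', set()))  # True: unconditioned collider blocks
print(d_separated(collider, 'A', 'C', {'B'}))  # False: conditioning opens the path
```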
<h2 id="markov-equivalence-classes">Markov equivalence classes</h2>
<p>Note that a DAG may prescribe a factorization of the probability
distribution, but the converse is not true. That is, knowing a
factorization of the joint distribution does not always imply a unique
DAG. Instead it prescribes a <em>Markov equivalence class</em> of DAGs. This
means that if we want to think of the directed edges as representing
causal relationships then knowing a joint distribution factorization
does not always provide a unique graph of causal relationships. This
limits what we can learn about causal relationships from a joint
(observational) distribution alone.</p>
<p>Two graphs are Markov equivalent iff they share the same conditional
independencies. Equally, they are Markov equivalent iff they have the
same d-separations. That is, if $u$ and $v$ are d-separated by $C$ in
$\mathcal{G}_1$ then they are d-separated by $C$ in $\mathcal{G}_2$, and
vice versa. Some examples of DAGs that are Markov equivalent are shown
in Figure [fig:dags].</p>
<p><img src="./../images/dags.svg" alt="Examples of DAGs in the same Markov equivalence class.<span
data-label="fig:dags"></span>" /></p>
<p>In fact a simple graphical rule tells us if two DAGs are in the same
Markov equivalence class. The <em>skeleton</em> of a network is its
underlying undirected graph, obtained by ignoring edge directions. Two
DAGs are in the same equivalence class (observationally equivalent) if
they have the same skeleton and the same set of ‘v-structures’ – the
same set of two converging arrows whose tails are not connected by an
arrow.</p>
<h1 id="controlling-for-confounders">Controlling for confounders</h1>
<p>Now we know some of the behavior of Bayesian networks we can return to
the question of identifying variables that can be controlled for to
remove confounding. This means we want to identify variables $X$ such
that ignorability holds:
<script type="math/tex">\begin{align}Y(1),Y(0) {\mathrel{\text{$\perp\mkern-10mu\perp$}}}W | X.\end{align}</script></p>
<p>Note that the observed outcome is of the form $Y = W Y(1) + (1-W)Y(0)$,
which induces a conditional dependence between $W$ and $Y$ – the
corresponding DAG will have a directed edge from $W$ to $Y$.
Ignorability requires essentially that any <em>other</em> paths from $W$ to $Y$
are blocked (i.e. controlled for, conditioned on). Which choices of $X$
achieve this? Three such criteria are identified below, stated without
proof. An example of each is shown in Figure [fig:criteria].</p>
<h2 id="backdoor-criterion">Backdoor criterion</h2>
<p>If a set of variables $X$ satisfy the following conditions:</p>
<ol>
<li>
<p>$X$ blocks every path from $W$ to $Y$ that has an arrow into $W$
(blocks the back door), and</p>
</li>
<li>
<p>No node in $X$ is a descendant of $W$.</p>
</li>
</ol>
<p>then $X$ satisfies the backdoor criterion with respect to nodes $W$ and
$Y$.</p>
<h2 id="disjunctive-cause-criterion">Disjunctive cause criterion</h2>
<p>Sometimes simpler to apply than the backdoor criterion, which can
involve analyzing the entire DAG, is the disjunctive cause criterion. It
is simply:</p>
<ul>
<li>Control for all causes of the treatment variable, of the outcome
variable (that are not descendants of the treatment), or of both.</li>
</ul>
<p>Sometimes this is an easier set to identify than other (potentially
smaller) sets that satisfy the backdoor criterion.</p>
<h2 id="frontdoor-criterion">Frontdoor criterion</h2>
<p>If a set of variables $Z$ satisfy the following conditions:</p>
<ol>
<li>
<p>$Z$ blocks all directed paths from $X_i$ to $X_j$, and</p>
</li>
<li>
<p>there is no backdoor path from $X_i$ to $Z$, and</p>
</li>
<li>
<p>all backdoor paths from $Z$ to $X_j$ are blocked by $X_i$</p>
</li>
</ol>
<p>then $Z$ satisfies the frontdoor criterion with respect to nodes $X_i$
and $X_j$.</p>
<p><img src="../../images/dags_criteria.svg" alt="Three criteria through which conditioning on $Z$ will render the
effect of $X$ on $Y$ identifiable.<span
data-label="fig:criteria"></span>" /></p>
<h1 id="some-common-methods">Some common methods</h1>
<p>Once a set of variables to control for has been identified, how do we
actually use this knowledge to identify causal effects? In theory, if we
observe controls $X$ then we can measure the causal effect from:
<script type="math/tex">\begin{align}\tau = \mathbb{E}(\mathbb{E}(Y|W=1,X)-\mathbb{E}(Y|W=0,X)).\end{align}</script></p>
<p>In practice however this requires a lot of data to get reliable
estimates of each conditional expectation. In biomedical/social science
settings this is often an issue. Generally each conditional expectation
has to be estimated parametrically to capture the dependence on $X$.
This introduces bias through choice of model, etc. Thus methods that can
estimate causal effects without this modeling are attractive. One way of
doing this is to match the distribution of the confounders $X$ between
the control and treatment groups, thereby making treatment independent
of the covariates and the data more like what is produced in a
randomized controlled trial. This balancing of distributions among the
control and treatment groups is achieved through sampling subjects in
different ways.</p>
<h2 id="matching">Matching</h2>
<p>The basic idea of matching is as follows. For each condition $W= 1$ and
$W=0$ there are only a finite number of samples:
<script type="math/tex">\begin{align}\{y_i^{w=0}, x_i^{w=0}\}_{i=1}^{I_0} \text{ and } \{y_i^{w=1}, x_i^{w=1}\}_{i=1}^{I_1}.\end{align}</script>
Matching simply pairs one sample in the treatment group with one sample
in the control group whose control covariates are close:
<script type="math/tex">\begin{align}(y_i^{w=0}, x_i^{w=0})\leftrightarrow (y_j^{w=1}, x_j^{w=1}), \quad x_i^{w=0}\sim x_j^{w=1}.\end{align}</script>
Since $X$ then has roughly the same distribution in the two treatment
groups, the dependence of the outcome on $X$ does not need to be
modeled. This allows the above causal effect expectation to be
approximated.</p>
<p>Choices must be made about the metric that is used to decide when two
points are similar. And choices must be made about how to deal with
different treatment and control population sizes. One possibility is to
discard all samples for which no match is made. Another possibility is
to match one sample in the treatment group to more than one sample in
the control group.</p>
<p>A common approach is to match each sample in the treatment group. This then estimates
what is known as the <em>causal effect of treatment on the treated</em>, often
a quantity of interest. If we let $C(i)$ represent the sample index in
the control population that is matched to sample $i$ in the treatment
population then the causal effect is estimated from:
<script type="math/tex">\begin{align}\tau \approx \frac{1}{I_1}\sum_{i=1}^{I_1} y_i^{w=1} - y_{C(i)}^{w=0}.\end{align}</script></p>
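<p>A minimal sketch of matching on the treated, with a single confounder and made-up functional forms (the true effect here is 2, and the naive difference in means is biased upward):</p>

```python
import random
random.seed(2)

# x drives both treatment probability and outcome (confounding).
treated, control = [], []
for _ in range(2000):
    x = random.uniform(0, 1)
    w = random.random() < 0.2 + 0.6 * x   # confounded assignment
    y = 2.0 * w + 3.0 * x                 # true causal effect = 2
    (treated if w else control).append((x, y))

# Naive difference in means picks up the confounding bias.
naive = (sum(y for _, y in treated) / len(treated)
         - sum(y for _, y in control) / len(control))

# Match each treated unit to the control unit with the closest x.
matched = sum(y - min(control, key=lambda c: abs(c[0] - x))[1]
              for x, y in treated) / len(treated)

print(f"naive {naive:.2f}, matched {matched:.2f}")  # matched is near 2
```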
<p>Matching can be performed on all covariates, or just covariates that are
identified as confounders, according to the backdoor or other criterion.
Note that matching does not remove the need for ignorability –
unmeasured confounders can still affect the analysis, thus $X$ still
must satisfy the backdoor criterion.</p>
<h2 id="propensity-score-matching">Propensity score matching</h2>
<p>Matching directly on controls $X$ can be difficult if $X$ is
high-dimensional. Instead, we can match on what is called the propensity
score, which is the probability of being treated given a set of
controls: <script type="math/tex">\begin{align}\pi(X) = P(W = 1| X).\end{align}</script></p>
<p>Matching on $\pi(X)$ has the same effect as matching on $X$ directly.
This is because subjects at the same propensity level have, by
definition, the same probability of being assigned to the treatment
group. Thus, for these subjects, treatment assignment is randomized
(independent of $X$). In this way the distribution of $X$ in treatment
and control groups are made to be the same.</p>
<p>The propensity score is known, by definition, in randomized
controlled trials. It has to be estimated in observational studies. But
since it
only involves observed data $X$ and $W$ this is straightforward. For
example, one can use logistic regression.</p>
<p>Again, propensity score matching still requires the ignorability
assumption with controls $X$. Without it, even if the distribution of
$X$ is balanced between control and treatment groups, unobserved
confounders can still be different amongst control and treatment.</p>
<h2 id="inverse-probability-of-treatment-weighting">Inverse probability of treatment weighting</h2>
<p>Instead of matching on propensity score, which may discard some samples,
we can simply reweight each subject by the inverse of its probability of
receiving treatment – known as the inverse probability of treatment
weighting (IPTW). This matches one unit in a treatment group with a
certain number of ‘pseudo-units’ in the control group at a rate
proportional to the relative probability of receiving treatment at a
given level in $X$. In this way balance is achieved across levels.</p>
<p>This is a type of importance sampling.</p>
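<p>A sketch with a single binary confounder and the propensity score assumed known (a Hájek-style weighted estimator; the numbers are made up and the true effect is 2):</p>

```python
import random
random.seed(3)

pi = {0: 0.2, 1: 0.8}           # P(W=1 | X=x), assumed known here
num1 = den1 = num0 = den0 = 0.0
for _ in range(100_000):
    x = int(random.random() < 0.5)
    w = int(random.random() < pi[x])
    y = 2.0 * w + 3.0 * x       # true causal effect = 2
    if w:
        num1 += y / pi[x]        # weight treated units by 1/pi(x)
        den1 += 1 / pi[x]
    else:
        num0 += y / (1 - pi[x])  # weight controls by 1/(1 - pi(x))
        den0 += 1 / (1 - pi[x])

tau_iptw = num1 / den1 - num0 / den0
print(round(tau_iptw, 2))  # near 2.0
```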
<h1 id="sec:cbn">Causal Bayesian networks</h1>
<p>This is the framework developed most significantly by Pearl. A causal model is a Bayesian network along with a
mechanism to determine how the model will respond to intervention. Now,
rather than using the notion of potential outcomes and counterfactuals,
causal effects are measured as the result of intervention. In addition
to parents/children, we also think of the directed edges in the DAG as
representing causal relationships, meaning a node’s parents and children
are also its causes and effects.</p>
<p>The <em>causal Markov condition</em> is the condition that all nodes are
independent of their non-effects, given their direct causes. In the
event that the structure of a Bayesian network accurately depicts
causality, this is equivalent to the Markov condition. However, a
network may accurately embody the Markov condition without depicting
causality, in which case it should not be assumed to embody the causal
Markov condition.</p>
<h2 id="interventions-and-causal-effects">Interventions and causal effects</h2>
<p>An intervention on a single variable is denoted ${\text{do}}(X_i = y)$.
Intervening on a variable removes the edges to that variable from its
parents and forces the variable to take on a specific value:
<script type="math/tex">P(x_i|{\text{Pa}}_{X_i}=\mathbf{x_i}) = \delta(x_i = y)</script>. The
interventional joint distribution, $P_{X_i=y}$, is then defined as:
<script type="math/tex">\begin{align}P_{X_i=y}(\mathbf{x}) = \prod_{j\ne i}^N P(x_j | {\text{Pa}}_{X_j} = \mathbf{x}_j)\delta(x_i = y),\end{align}</script>
also abbreviated to $P_{X_i}$. Expectations under interventions then
take the form:
<script type="math/tex">\begin{align}\mathbb{E}(X_j|{\text{do}}(X_i = y)) = \int x_j P_{X_i=y}(x_j)\,dx_j = \mathbb{E}_{X_i=y}(X_j).\end{align}</script>
The idea of intervention is shown in Figure [fig:inter].</p>
<p><img src="../../images/dags_intervene.svg" alt="Intervening on $X$ changes the graph and underlying distribution.<span
data-label="fig:inter"></span>" /></p>
<p>Now, given the ability to intervene, the average causal effect between
an outcome variable $X_j$ and a binary variable $X_i$ can be defined as:
<script type="math/tex">\begin{align}\tau = \mathbb{E}(X_j|{\text{do}}(X_i = 1)) - \mathbb{E}(X_j|{\text{do}}(X_i = 0)).\end{align}</script></p>
<p>In general, the ‘do’ conditional is different to standard probabilistic
conditioning. However criteria exist under which the interventional
conditional distribution coincides with the probabilistic conditional
distribution. The causal effect from node $X_i$ to $X_j$ can be inferred
for conditional distributions that satisfy these criteria. These are
actually the same criteria identified above in the counterfactual
framework when searching for controls that provide ignorability. The
interventional and counterfactual frameworks thus are compatible with
one another. Pearl argues the interventional framework subsumes the
older counterfactual framework.</p>
<p>For instance, if $S_{ij}$ satisfy the backdoor criteria with respect to
$X_i\to X_j$ then we can relate the interventional and observational
expectations as follows: <script type="math/tex">% <![CDATA[
\begin{aligned}
\nonumber \mathbb{E}(X_j|{\text{do}}(X_i = y)) &= \int x_j P_{X_i = y}(x_j)\,dx_j\\
\nonumber &= \int\int x_j P_{X_i = y}(x_j|\mathbf{s}_{ij})P_{X_i = y}(\mathbf{s}_{ij})\,dx_j d\mathbf{s}_{ij}\\
\nonumber &= \int\int x_j P(x_j|\mathbf{s}_{ij}, X_i = y)P(\mathbf{s}_{ij})\,dx_j d\mathbf{s}_{ij} \\
\label{eq:doce}&= \mathbb{E}\left(\mathbb{E}(X_j|\mathbf{S}_{ij}, X_i = y)\right),\end{aligned} %]]></script>
from which a causal effect can be measured.</p>
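<p>Numerically, the last line is just an average of stratum means weighted by the marginal of $\mathbf{S}$. A sketch with a single binary confounder (made-up parameters; the true effect is 2, while the naive conditional difference is roughly 4.4):</p>

```python
import random
random.seed(4)

# Confounder S -> X and S -> Y; S satisfies the backdoor criterion for X -> Y.
def sample():
    s = int(random.random() < 0.5)
    x = int(random.random() < (0.9 if s else 0.1))  # confounded treatment
    y = 2 * x + 3 * s + random.gauss(0, 0.1)        # true effect of x is 2
    return s, x, y

data = [sample() for _ in range(100_000)]

def mean(vals):
    return sum(vals) / len(vals)

# Naive conditional difference (confounded)
naive = (mean([y for _, x, y in data if x == 1])
         - mean([y for _, x, y in data if x == 0]))

# Backdoor adjustment: average E(Y | S=s, X=x) over the marginal of S
def adjusted(x):
    return sum(mean([y for s_, x_, y in data if s_ == s and x_ == x])
               * mean([1.0 if s_ == s else 0.0 for s_, _, _ in data])
               for s in (0, 1))

print(round(naive, 1), round(adjusted(1) - adjusted(0), 1))  # biased vs ~2.0
```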
<h1 id="structural-equation-models">Structural equation models</h1>
<p>The above frameworks are non-parametric, dealing simply with
factorizations of joint distributions. The parametric form of a causal
Bayesian network is the structural equation model (SEM). Each node is
described by: <script type="math/tex">\begin{align}X_j = f_j(\text{Pa}(X_j), \epsilon_j; \theta_j),\end{align}</script> for
some independent noise variable $\epsilon_j$, and parameters $\theta_j$.</p>
<p>Note that the equality here is of a different nature to an algebraic
equality. It conveys assignment rather than comparison. (Similar to the
difference between = and == in programming languages.) Some authors use
$\leftarrow$ instead of = to communicate this difference. This means
that structural equation models have an invariance property that
standard statistical models do not: the SEM is robust to intervention.
The model should describe the data equally well regardless of whether it
comes from observation or interventional experiments.</p>
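<p>A sketch of this invariance with a hypothetical two-equation SEM: the regression slope of $Y$ on $X$ is the same whether the data are observational or generated by intervening on $X$, because $Y$'s assignment equation is untouched by the intervention:</p>

```python
import random
random.seed(5)

# Two-equation SEM:  X := eps_X ;  Y := 2*X + eps_Y.
# ':=' is assignment: do(X = x) replaces X's equation only.
def run(do_x=None):
    x = random.gauss(0, 1) if do_x is None else do_x
    y = 2 * x + random.gauss(0, 1)   # Y's mechanism is invariant
    return x, y

def slope(pairs):
    """Ordinary least-squares slope of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    return (sum((x - mx) * (y - my) for x, y in pairs)
            / sum((x - mx) ** 2 for x, _ in pairs))

obs = [run() for _ in range(50_000)]                      # observational
inter = [run(do_x=random.gauss(0, 1)) for _ in range(50_000)]  # interventional
print(round(slope(obs), 1), round(slope(inter), 1))  # both near 2.0
```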
<h1 id="some-further-reading">Some further reading</h1>
<p>An overview of the counterfactual framework can be found in the short
Coursera course. The interventionist framework of Pearl is described in
his influential 2000 book. A more modern treatment, based on structural
equation models, which in some sense subsume the above two frameworks
can be found in Peters et al 2017.</p>
<ul>
<li>
<p>“A Crash Course in Causality: Inferring Causal Effects from
Observational Data” Coursera course. By Jason Roy.
<a href="https://www.coursera.org/learn/crash-course-in-causality/">www.coursera.org/learn/crash-course-in-causality/</a></p>
</li>
<li>
<p>“Causality: Models, Reasoning and Inference” Judea Pearl, 2000.</p>
</li>
<li>
<p>“Elements of Causal Inference: Foundations and Learning Algorithms”
Jonas Peters, Dominik Janzing and Bernhard Schölkopf, 2017.</p>
</li>
</ul>
<p><em>MathJax, Jekyll and github pages. Ben Lansdell, 2016-06-27. benlansdell.github.io/computing/mathjax</em></p>
<p>Integrating MathJax with Jekyll is a very convenient way of typesetting mathematics in a blog hosted on github pages. There are a few guides online, which were (almost) helpful in achieving this on a github hosted site. The steps, as of September 2016, are:</p>
<p>Ensure the markdown engine is set to <code class="highlighter-rouge">kramdown</code> in <code class="highlighter-rouge">_config.yml</code>. This is now the <a href="https://help.github.com/articles/updating-your-markdown-processor-to-kramdown/">only supported markdown processor</a> on github pages, so this should be set anyway.</p>
<p>Include a new file in <code class="highlighter-rouge">_includes</code> named <code class="highlighter-rouge">_mathjax_support.html</code> (a clever idea from <a href="http://haixing-hu.github.io/programming/2013/09/20/how-to-use-mathjax-in-jekyll-generated-github-pages/">here</a>):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code><script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: {
equationNumbers: {
autoNumber: "AMS"
}
},
tex2jax: {
inlineMath: [ ['$', '$'] ],
displayMath: [ ['$$', '$$'] ],
processEscapes: true,
}
});
MathJax.Hub.Register.MessageHook("Math Processing Error",function (message) {
alert("Math Processing Error: "+message[1]);
});
MathJax.Hub.Register.MessageHook("TeX Jax - parse error",function (message) {
alert("Math Processing Error: "+message[1]);
});
</script>
<script type="text/javascript" async
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
</code></pre></div></div>
<p>The bottom two hooks alert the user/writer about math and tex errors.</p>
<p>Importantly, in contrast to older guides online, note the https in the MathJax CDN. Unencrypted access to the CDN is a security risk and now will either not render in some browsers (didn’t work in Chrome for me), or will issue warnings in other browsers (Firefox). See the MathJax <a href="http://docs.mathjax.org/en/latest/start.html#secure-access-to-the-cdn">documentation</a> for more information.</p>
<p>Next, include in the <code class="highlighter-rouge">&lt;head&gt;</code> of <code class="highlighter-rouge">_layouts/default.html</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{% if page.use_math %}
{% include mathjax_support.html %}
{% endif %}
</code></pre></div></div>
<p>Now to include $\LaTeX$ in a post you just need to set the variable <code class="highlighter-rouge">use_math: true</code> in the YAML front-matter of the page/post! Enclose inline formulas in <code class="highlighter-rouge">$</code>s and display formulas in <code class="highlighter-rouge">$$</code>s. For instance,</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$$
K(a,b) = \int \mathcal{D}x(t) \exp(2\pi i S[x]/\hbar)
$$
</code></pre></div></div>
<p>produces:</p>
<script type="math/tex; mode=display">K(a,b) = \int \mathcal{D}x(t) \exp(2\pi i S[x]/\hbar)</script>
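<p>Putting this together, the front matter of a math-enabled post might look like the following (the title and category here are just placeholders):</p>

```yaml
---
layout: post
title: "A post with equations"   # placeholder title
categories: statistics
use_math: true
---
```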
<h2 id="alignment">Alignment</h2>
<p>Note that any equations requiring alignment (use of the ampersand <code class="highlighter-rouge">&amp;</code>) need some care. The solution I found was to wrap any of these elements in <code class="highlighter-rouge">&lt;div&gt;</code>s.</p>
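<p>For example, wrapping an aligned display in a plain HTML <code class="highlighter-rouge">&lt;div&gt;</code> keeps kramdown from processing the ampersands:</p>

```html
<div>
$$
\begin{align}
(a+b)^2 &= (a+b)(a+b) \\
        &= a^2 + 2ab + b^2
\end{align}
$$
</div>
```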
<h2 id="changing-typeset-fontsize">Changing typeset fontsize</h2>
<p>Add the following to <code class="highlighter-rouge">MathJax.Hub.Config</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CommonHTML: {
scale: 85
}
</code></pre></div></div>
<h2 id="some-references">Some references:</h2>
<ul>
<li>http://cwoebker.com/posts/latex-math-magic – no longer seems to work</li>
<li>http://haixing-hu.github.io/programming/2013/09/20/how-to-use-mathjax-in-jekyll-generated-github-pages/</li>
<li>MathJax guide: http://docs.mathjax.org/en/latest/tex.html</li>
<li>MathJax details: http://docs.mathjax.org/en/latest/advanced/model.html</li>
</ul>
<p>On the relation between maximum likelihood and KL divergence (2016-06-26) http://benlansdell.github.io/statistics/likelihood</p>
<p>In this post I describe some of the theory of maximum likelihood estimation (MLE), highlighting its relation to information theory. In a later post I will develop the theory of maximum entropy models, also drawing connections to information theory, hoping to clarify the relation between MLE and MaxEnt.</p>
<p>Maximum likelihood was developed and advocated by many figures throughout the history of mathematics (see [1] for a nice overview). In a sense it was first considered in a significant way by Lagrange, but it was also considered by Bernoulli, Laplace, and Gauss, among others. Indeed, what was known as the ‘Gaussian method’ involved maximum a posteriori estimation of a model with normally distributed errors and a uniform prior, resulting in what is now known as the method of least squares. However, its theory and use were advanced most strongly by Fisher in the 1920s and 30s. Fisher worked for many years to demonstrate conditions needed for both the consistency and efficiency of MLE. While his later results have stood up to scrutiny, the theory, as it stands, does not possess quite the generality he sought. Nonetheless, it remains a cornerstone of contemporary statistics.</p>
<h2 id="maximum-likelihood-estimation">Maximum likelihood estimation</h2>
<p>Much of statistics relies on identifying models of data that are, in some sense, close to our observations. Indeed, in many cases it seems sensible that we seek models that are the closest to our observations. Maximum likelihood provides one principle by which we may identify these closest distributions. It has many appealing properties that make it an appropriate measure, and is a broadly applicable method. As we will see, its simplicity is somewhat deceiving.</p>
<p>The theory is easiest to describe in a discrete setting, which we will address first. Let</p>
<script type="math/tex; mode=display">x = (x_1, x_2, \cdots, x_N)</script>
<p>describe $N$ observations drawn from a discrete probability distribution. Each draw $x_n\in\mathcal{X}$ is taken from an alphabet of $M$ characters, $\mathcal{X}=(a_1, \dots, a_M)$. Let $p_m$ denote the probability of drawing character $m$ in any one draw, and let $f_m$ denote the number of times character $m$ is observed in the $N$ draws. Note that we’re just describing a multinomial distribution having $M$ parameters $p_m$.</p>
<p>Given our observations, how should we estimate the multinomial parameters $\mathbf{p}$? The principle of maximum likelihood states simply that we take parameters that result in our observations having highest probability, when compared with all other possible choices of parameters. If we assume that each draw is independently and identically distributed (i.i.d.) then this is</p>
<div>
$$
\begin{align}
\hat{\mathbf{p}}_{MLE} & = \text{argmax}_{p\in \mathcal{P}} \prod_{n=1}^Np_{x_n}\\
& = \text{argmax}_{p\in\mathcal{P}}\log\left(\prod_{n=1}^Np_{x_n} \right) \\
& =\text{argmax}_{p\in\mathcal{P}}\sum_{n=1}^N\log \left( p_{x_n} \right) \\
& =\text{argmax}_{p\in\mathcal{P}}\sum_{m=1}^M f_m \log \left( p_m \right)
\end{align}
$$
</div>
<p>For many reasons, some of which will become clear here, expressing the maximization problem in terms of logarithms is the natural choice, so the last line above is one we will be optimizing. (As a brief aside, note the step taken to reach the last line appears a trivial manipulation, but if we were to write out what was happening in a general probability space, it is roughly analogous to the pushforward change of variables:</p>
<script type="math/tex; mode=display">\begin{align*}
\int_\mathbb{R} \log (p(x)) dF(x) = \int_\Omega \log(p(X(\omega))) d\mu(\omega)\end{align*}</script>
<p>we make when shifting between expectations in terms of a measure $\mu$ and a distribution function $F$. The LHS being given by a Lebesgue-Stieltjes integral, the RHS by a Lebesgue integral.)</p>
<p>The problem is constrained by the fact that $\sum_m p_m = 1$ and $p_m\ge 0$ for all $m$. This constrained optimization problem can be solved using Lagrange multipliers. Recall this involves augmenting our objective function with our constraints</p>
<script type="math/tex; mode=display">\text{argmax}_{p\in\mathcal{P}} \sum_{m=1}^M f_m \log \left(p_m\right) - \lambda \left(\sum_{m=1}^M p_m - 1\right) + \sum_{m=1}^M\mu_m p_m = \text{argmax}_{p\in\mathcal{P}} \mathcal{\tilde{L}}(\mathbf{p}, \mathbf{f})</script>
<p>We set the partial derivative of the Lagrangian $\mathcal{\tilde{L}}$ taken with respect to $p_m$ to zero to obtain</p>
<script type="math/tex; mode=display">p_m = \frac{f_m}{\lambda - \mu_m}</script>
<p>Since the optimum has each $p_m > 0$, complementary slackness gives $\mu_m = 0$, and the normalization constraint then forces $\lambda = N$, so that $p_m = f_m/N$. We find that the optimum occurs, not surprisingly, at the empirical frequencies. Thus the MLE estimates for our multinomial distribution are given by the empirical distribution.</p>
<h2 id="kl-divergence">KL-divergence</h2>
<p>I mentioned earlier that we would like some measure of closeness, and would like to find distributions that are ‘close’ to our observations. What measure of closeness have we just minimized? In the discrete case, we have just minimized the relative entropy, or Kullback-Leibler divergence, between the empirical distribution and the model.</p>
<p>Recall the empirical distribution is</p>
<script type="math/tex; mode=display">\hat{q}_m = \frac{1}{N}\sum_{n=1}^N\mathbf{I}(x_n = a_m) = \frac{f_m}{N}</script>
<p>so that:</p>
<div>
$$
\begin{align*}
\hat{\mathbf{p}}_{MLE}
& = \text{argmax}_{\mathbf{p}\in\mathcal{P}} \sum_{m=1}^M f_m \log \left( {p_m} \right) \\
& = \text{argmin}_{\mathbf{p}\in\mathcal{P}} \sum_{m=1}^M \hat{q}_m \log \left( \hat{q}_m \right) - \hat{q}_m \log \left( {p_m} \right) \\
& = \text{argmin}_{\mathbf{p}\in\mathcal{P}} \sum_{m=1}^M \hat{q}_m \log \left( \frac{\hat{q}_m}{p_m} \right) \\
& = \text{argmin}_{\mathbf{p}\in\mathcal{P}} D_{KL}( \hat{\mathbf{q}} \,\|\, \mathbf{p}) \\
\end{align*}
$$
</div>
<p>where we have taken as convention $0\log(0) = 0$. <em>Thus in the discrete case, at least, maximizing likelihood corresponds to minimizing the KL divergence.</em></p>
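<p>This equivalence is easy to verify numerically. The following sketch (using numpy and scipy; the alphabet size and probabilities are chosen arbitrarily for illustration) maximizes the likelihood over the simplex and checks the result against the empirical distribution and the KL divergence:</p>

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import entropy  # entropy(q, p) computes D_KL(q || p)

rng = np.random.default_rng(0)
true_p = np.array([0.5, 0.3, 0.2])            # illustrative 3-letter alphabet
x = rng.choice(3, size=2000, p=true_p)
f = np.bincount(x, minlength=3)               # counts f_m
q_hat = f / f.sum()                           # empirical distribution

# maximize sum_m f_m log p_m over the probability simplex
res = minimize(lambda p: -np.sum(f * np.log(p)),
               x0=np.ones(3) / 3,
               bounds=[(1e-9, 1)] * 3,
               constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1}])

print(np.round(res.x, 3), np.round(q_hat, 3))  # the MLE is the empirical distribution
print(entropy(q_hat, res.x))                   # and D_KL(q_hat || p_MLE) is ~0
```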
<p>We have discussed only the case where $\mathbf{p}$ is estimated non-parameterically. What if we instead have some probability model described by a parameter $\theta$? The optimization problem now becomes</p>
<div>
$$
\begin{align}\hat{\theta}_{MLE} & = \text{argmax}_{\theta\in\Theta}\mathcal{L}(\theta|\mathbf{f})\\
& = \text{argmax}_{\theta\in\Theta}\sum_{m=1}^M f_m \log \left(p_m(\theta)\right)\\\end{align}
$$
</div>
<p>which is typically found by solving $\frac{\partial}{\partial \theta}\mathcal{L}(\theta|\mathbf{f}) = 0$. The result is the same however – we minimize the difference between the empirical distribution and the parametric model.</p>
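<p>As a concrete sketch of the parametric case (the Binomial model and its parameter value are chosen arbitrarily): take $\mathcal{X} = \{0,\dots,M-1\}$ with $p_m(\theta)$ the Binomial$(M-1,\theta)$ pmf, for which the MLE has the closed form $\hat{\theta}=\bar{x}/(M-1)$:</p>

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

rng = np.random.default_rng(1)
M = 6                                          # alphabet 0..5
x = rng.binomial(M - 1, 0.35, size=2000)       # draws from the model
f = np.bincount(x, minlength=M)                # counts f_m

# maximize sum_m f_m log p_m(theta) for the Binomial(M-1, theta) model
def neg_ll(theta):
    return -np.sum(f * binom.logpmf(np.arange(M), M - 1, theta))

res = minimize_scalar(neg_ll, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(res.x, x.mean() / (M - 1))               # numerical and closed-form MLE agree
```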
<p>In the continuous case things are not so straightforward. We try the same argument for a real-valued random variable, $N$ observations $X^N = \{x_1, \dots, x_N\}$, and a parametric model $f(x; \theta)$. If we denote by</p>
<script type="math/tex; mode=display">p_D(x) = \frac{1}{N}\sum_{i=1}^{N}\delta(x-x_i)</script>
<p>the ‘empirical density’, then trying the same argument gives:</p>
<div>
$$
\begin{align*}
\hat{\theta}_{MLE}
& = \text{argmax}_{\theta} \sum_{i=1}^N \log \left( f(x_i;\theta) \right) \\
\text{(??)} \qquad & = \text{argmax}_{\theta} \int p_D(x) \log \left( {f(x;\theta)} \right) \,dx\\
\text{(??)} \qquad & = \text{argmin}_{\theta} \int p_D(x) \log \left( \frac{p_D(x)}{f(x;\theta)} \right) \,dx\\
& = \text{argmin}_{\theta} D_{KL}( p_D \,\|\, f(\cdot\,; \theta)) \\
\end{align*}
$$
</div>
<p>However, the question-marked lines don’t quite make sense – it’s unclear what $\log p_D(x)$ means in a continuous setting. For such a line to make sense the empirical distribution would have to be absolutely continuous, so that $p_D$ was actually a density.</p>
<h2 id="consistency">Consistency</h2>
<p>That said, the intuition about MLE and min. $D_{KL}$ does carry through to the continuous setting in the following sense. Let the data be generated by a model given by $f(x; \theta^*)$ and let</p>
<script type="math/tex; mode=display">M_N(\theta) = \frac{1}{N}\sum_{i=1}^N \log(f(x_i;\theta))</script>
<p>denote the quantity we maximize under MLE. We note that this is an approximation to the expectation</p>
<script type="math/tex; mode=display">M_N(\theta) \approx \mathbb{E}_{\theta^*} \log(f(X;\theta))</script>
<p>and that, through the law of large numbers, this indeed converges to</p>
<script type="math/tex; mode=display">\lim_{N\to\infty} M_N(\theta) =M(\theta) = \mathbb{E}_{\theta^*} \log(f(X;\theta)).</script>
<p>But maximizing $M(\theta)$ is the same as minimizing $D_{KL}(f(\theta^*)\|f(\theta))$ since:</p>
<div>
$$
\max_{\theta} M(\theta) = \max_{\theta} \mathbb{E}_{\theta^*} \log \left(\frac{f(X;\theta)}{f(X;\theta^*)}\right) = \min_{\theta} D_{KL}(f(\theta^*)\|f(\theta))
$$
</div>
<p>Note that since $D_{KL} \ge 0$ and $D_{KL}(f\|g) = 0 \iff f = g$ then, under regularity conditions not specified here, this result implies the consistency of the MLE – that $\hat{\theta}_N \to \theta^*$ in probability as $N\to\infty$.</p>
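<p>This convergence is easy to see numerically; in the sketch below (a Gaussian location model with arbitrary values) the MLE is the sample mean, and its error shrinks like $1/\sqrt{N}$:</p>

```python
import numpy as np

rng = np.random.default_rng(2)
theta_star = 2.0
for N in [10, 1000, 100000]:
    x = rng.normal(theta_star, 1.0, size=N)
    # for the Gaussian location model the MLE is the sample mean
    print(N, x.mean())
```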
<h2 id="a-reference">A reference</h2>
<ol>
<li>“The Epic Story of Maximum Likelihood” Stephen M. Stigler. 2007.</li>
</ol>
<p>Installing NEURON with python and Neuronvisio (2016-06-17) http://benlansdell.github.io/computing/neurovisio</p>
<p>Since it took some time, I’m going to describe the steps I took to install NEURON with support for python and the 3D visualization tool neuronvisio. I’m running Ubuntu 14.04, python 2.7.</p>
<p>First, I installed the neuron python package found at http://neuralensemble.org/people/eilifmuller/software.html. Note that this installs NEURON 7.1, which is not the latest version.
I also had to install libreadline-dev to get nrnivmodl (the hoc compiler) working:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install libreadline-dev
</code></pre></div></div>
<p>Then install the prerequisites for neuronvisio:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install python-qt4 python-matplotlib python-setuptools python-tables mayavi2 python-pip
</code></pre></div></div>
<p>Install neuronvisio</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd ~/python/
git clone git://github.com/mattions/neuronvisio.git
cd neuronvisio
python setup.py install
</code></pre></div></div>
<p>Add ~/python to your $PYTHONPATH, and add ~/python/neuronvisio/bin/neuronvisio (or links thereto) to your path.
The final change (the strangest one, and the one that’s making me write this down) is as follows. The preceding steps should get NEURON and neuronvisio working in python. I could run simulations, plot the 3D structures, etc. The one thing I was unable to do was select segments in the 3D visualization window. I would receive the error:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Pick() takes exactly 4 arguments (2 given)
</code></pre></div></div>
<p>The workaround I found was to edit the following file in mayavi:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo vim /usr/lib/python2.7/dist-packages/mayavi/core/mouse_pick_dispatcher.py
</code></pre></div></div>
<p>so that line 168 changes from</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>picker.pick((x, y, 0), self.scene.scene.renderer)
</code></pre></div></div>
<p>to</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>picker.pick(x, y, 0, self.scene.scene.renderer)
</code></pre></div></div>
<p>It’s annoying that such a strange workaround was necessary. However, a similar fix was suggested here (https://github.com/enthought/mayavi/issues/21), so perhaps I wasn’t alone.</p>
<p>Path integrals and SDEs in neuroscience – part two (2016-01-30) http://benlansdell.github.io/statistics/sdesII</p>
<p>In the <a href="http://benlansdell.github.io/statistics/sdes/">previous post</a> we defined path integrals through a simple ‘time-slicing’ approach, and used them to compute moments of simple stochastic DEs. In this follow-up post we will examine how expansions can be used to approximate moments, how we can use the moment generating functional to compute probability densities, and how these methods may be helpful in some cases in neuroscience.</p>
<h3 id="sub:perturbation">Perturbative approaches</h3>
<p>For a general, non-linear SDE the series will not terminate and must be
truncated at some point. It is then necessary to determine which terms
will contribute to the sum, and to include these terms up to a given order.
We mention briefly three such possibilities, though do not discuss them
in any detail. One possibility arises when some terms in $S_{I}$
($v_{mn}\int x^{n}\tilde{x}^{m}$, $m\ge2$) are small. Then we can simply
let each such vertex contribute a small parameter $\alpha$ and perform
an expansion in orders of $\alpha$ (known as a ‘weak coupling expansion’
<sup id="fnref:3"><a href="#fn:3" class="footnote">1</a></sup>).</p>
<p>Another option is to perform a weak noise, or loop, expansion. Here we
scale the entire exponent in the MGF by some factor $h$</p>
<script type="math/tex; mode=display">Z=\int\mathcal{D}x(t)\mathcal{D}\tilde{x}(t)e^{-\frac{1}{h}(S-\int\tilde{J}x-\int J\tilde{x})}</script>
<p>Then each vertex of $S_{I}$ gains a factor of $1/h$ and each edge of
$S_{F}$ gains a factor $h$ which implies we can expand in powers of $h$.
In performing this expansion, if we let $E$ denote the number of
external edges of a diagram, $I$ the number of internal edges and $V$
the number of vertices then each connected graph has a factor of
<script type="math/tex">h^{I+E-V}</script> and, in fact, it can be shown by induction that:
<script type="math/tex">L=I-V+1</script> where $L$ is the number of <em>loops</em> the diagram contains.
Thus, each graph collects a factor of $h^{E+L-1}$. This allows us to
order the expansion in terms of the number of loops in each diagram.
Diagrams which contain no loops are trees, or classical diagrams. Such
diagrams form the basis of the <em>semi-classical</em> approximation.</p>
<p>This expansion is of course only valid when the contribution of the
higher loop number diagrams is smaller than that of the lower loop
number diagrams. The <em>Ginzburg criterion</em> says when this expansion is
indeed valid.</p>
<h2 id="some-other-examples">Some other examples</h2>
<p>We present two further examples which demonstrate how these methods are
used.</p>
<h3 id="example-1">Example 1</h3>
<p>A simple extension of the OU process so that it is now <em>mean-reverting</em>
(to something not zero, as in the previous case) is the SDE</p>
<script type="math/tex; mode=display">\dot{x}(t)+a(b+x(t))-\sqrt{D}\eta(t)=0.</script>
<p>This problem is obviously very similar to the above problem and is of
course solved almost identically. This time the action of the process is</p>
<script type="math/tex; mode=display">S=\int\left[\tilde{x}(t)(\dot{x}(t)+a(b+x(t)))+\tilde{x}(t)y\delta(t-t_{0})-\frac{D}{2}\tilde{x}^{2}(t)\right]\, dt</script>
<p>such that the free action is as before:</p>
<script type="math/tex; mode=display">S_{F}=\int\tilde{x}(t)\left[\dot{x}(t)+ax(t))\right]\, dt</script>
<p>(the
linear, homogeneous part, for which a Green’s function can be calculated)
and the interacting action is:</p>
<script type="math/tex; mode=display">S_{I}=\int\left[\tilde{x}(t)\left(y\delta(t-t_{0})+ba\right)-\frac{D}{2}\tilde{x}^{2}(t)\right]\, dt.</script>
<p>The only details that change are thus that the vertex linear in
$\tilde{x}(t)$ changes from</p>
<script type="math/tex; mode=display">\int\tilde{x}(t)y\delta(t-t_{0})dt\to\int\tilde{x}(t)\left(y\delta(t-t_{0})+ba\right)dt,</script>
<p>which adds an extra term to the expression for the mean:</p>
<script type="math/tex; mode=display">\langle x(t)\rangle=H(t-t_{0})\left(ye^{-a(t-t_{0})}-b\left(1-e^{-a(t-t_{0})}\right)\right).</script>
<p>Since only this internal vertex is affected, and the second order vertex
($D\tilde{x}(t)^{2}/2$) is unaffected, the solution for the second-order
cumulant will in fact be the same as our original example:</p>
<script type="math/tex; mode=display">\langle x(t)x(s)\rangle_{C}=D\frac{e^{-a|t-s|}-e^{-a(t+s-2t_{0})}}{2a}</script>
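<p>These moments can be checked directly with a Monte Carlo (Euler-Maruyama) simulation of the SDE; the following sketch uses arbitrary illustrative parameter values, and obtains the theoretical mean and variance by solving the deterministic moment equations of this SDE:</p>

```python
import numpy as np

# dx = -a(b + x)dt + sqrt(D)dW, x(t0) = y, with arbitrary illustrative parameters
a, b, D, y = 1.0, 0.5, 0.2, 1.0
dt, T, n_paths = 1e-3, 2.0, 20000
rng = np.random.default_rng(0)

x = np.full(n_paths, y)
for _ in range(int(T / dt)):
    x += -a * (b + x) * dt + np.sqrt(D * dt) * rng.standard_normal(n_paths)

# exact moments of this linear SDE (solving dm/dt = -a(b + m), etc.)
mean_theory = y * np.exp(-a * T) - b * (1 - np.exp(-a * T))
var_theory = D * (1 - np.exp(-2 * a * T)) / (2 * a)
print(x.mean(), mean_theory)
print(x.var(), var_theory)
```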
<h3 id="example-2">Example 2</h3>
<p>Consider the harmonic oscillator with noise:</p>
<script type="math/tex; mode=display">\ddot{x}+2\gamma\dot{x}+\omega^{2}x=\sqrt{D}\eta(t)</script>
<p>where $\eta$ is
a white noise process. Subject to initial conditions $x(0)=x_{0}$ and
$\dot{x}(0)=v_{0}$ ($t_{0}=0$). The action for this process is</p>
<script type="math/tex; mode=display">S=\int dt\,\left(\tilde{x}\left[\ddot{x}+2\gamma\dot{x}+\omega^{2}x+v_{0}\delta(t)+x_{0}\delta'(t)\right]-\frac{D}{2}\tilde{x}^{2}\right)</script>
<p>which we will split into free and interacting components:</p>
<div>
$$\begin{aligned}
S_{F} & = \int dt\,\left(\tilde{x}\left[\ddot{x}+2\gamma\dot{x}+\omega^{2}x\right]\right)\\
S_{I} & = \int dt\,\left(\tilde{x}\left[v_{0}\delta(t)+x_{0}\delta'(t)\right]-\frac{D}{2}\tilde{x}^{2}\right)\end{aligned}$$
</div>
<p>The free action gives the propagator as the Green’s function:</p>
<script type="math/tex; mode=display">\left(\frac{d^{2}}{dt^{2}}+2\gamma\frac{d}{dt}+\omega^{2}\right)G(t,t')=\delta(t-t')</script>
<p>which can be shown to be</p>
<script type="math/tex; mode=display">G(t,t')=\frac{1}{\omega_{1}}H(t-t')e^{-\gamma(t-t')}\sin[\omega_{1}(t-t')],</script>
<p>for $\omega_{1}=\sqrt{\omega^{2}-\gamma^{2}}$. Once $G$ is determined
the mean and covariance can be immediately calculated through the
diagrams and calculations of Figure 1.</p>
<figure class="center" style="width:500px">
<img src="../../images/feynman3.png" alt="img txt" />
<figcaption>Figure 1. Computation of the first and second cumulants of the damped harmonic oscillator
driven by Gaussian white noise. Diagrams (with internal vertices labeled
adjacent to diagram), and the equivalent integral to evaluate to obtain
each cumulant.
</figcaption>
</figure>
<p>We find, as expected, the following mean and covariance:</p>
<div>
$$\begin{aligned}
\langle x(t)\rangle & = \int[\delta(t')v_{0}+\delta'(t')x_{0}]G(t,t')dt'\\
& = v_{0}G(t,0)+x_{0}G'(t,0)\\
& = e^{-\gamma t}\left(\frac{\gamma x_{0}+v_{0}}{\omega_{1}}\sin[\omega_{1}t]+\cos[\omega_{1}t]\right)\end{aligned}$$
</div>
<p>and (assuming $t_{1}<t_{2}$)</p>
<div>
$$\begin{aligned}
\langle x(t_{1})x(t_{2})\rangle_{C} & = D\int G(t_{1},t)G(t_{2},t)dt\\
& = \frac{D}{\omega_{1}^{2}}\int_{0}^{\infty}e^{-\gamma(t_{1}+t_{2}-2t)}H(t_{1}-t)H(t_{2}-t)\sin[\omega_{1}(t_{1}-t)]\sin[\omega_{1}(t_{2}-t)]dt\\
& = \frac{D{\rm e}^{-\gamma\,{\it t_{1}}-\gamma\,{\it t_{2}}}}{4\omega_{1}^{2}\omega^{2}\gamma}\left({\rm e}^{2\,\gamma\,{\it t_{1}}}\left[\cos\left(\omega_{1}\,{\it t_{1}}\right)\cos\left(\omega_{1}\,{\it t_{2}}\right)\omega_{1}^{2}+\cos\left(\omega_{1}\,{\it t_{1}}\right)\sin\left(\omega_{1}\,{\it t_{2}}\right)\gamma\,\omega_{1}-\cos\left(\omega_{1}\,{\it t_{2}}\right)\sin\left(\omega_{1}\,{\it t_{1}}\right)\gamma\,\omega_{1}\right]\right.\\
& + {\rm e}^{2\,\gamma\,{\it t_{1}}}\sin\left(\omega_{1}\,{\it t_{1}}\right)\sin\left(\omega_{1}\,{\it t_{2}}\right)\omega_{1}^{2}-\cos\left(\omega_{1}\,{\it t_{1}}\right)\cos\left(\omega_{1}\,{\it t_{2}}\right)\omega_{1}^{2}-\omega_{1}\,\cos\left(\omega_{1}\,{\it t_{1}}\right)\sin\left(\omega_{1}\,{\it t_{2}}\right)\gamma\\
& + \left.\omega_{1}\,\sin\left(\omega_{1}\,{\it t_{1}}\right)\cos\left(\omega_{1}\,{\it t_{2}}\right)\gamma-2\,\gamma^{2}\sin\left(\omega_{1}\,{\it t_{1}}\right)\sin\left(\omega_{1}\,{\it t_{2}}\right)-\sin\left(\omega_{1}\,{\it t_{1}}\right)\sin\left(\omega_{1}\,{\it t_{2}}\right)\omega^{2}\right)\end{aligned}$$
</div>
<p>which, with some rearrangement, simplifies to the variance<sup id="fnref:4"><a href="#fn:4" class="footnote">2</a></sup>:</p>
<script type="math/tex; mode=display">\langle x(t)^{2}\rangle_{C}=\frac{D}{4\gamma\omega^{2}}\left[1-\exp(-2\gamma t)\left\{ 1+\frac{\gamma}{\omega_{1}}\left(\sin(2\omega_{1}t)+\frac{2\gamma}{\omega_{1}}\sin^{2}(\omega_{1}t)\right)\right\} \right].</script>
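<p>This variance can likewise be checked by simulating the noisy oscillator as a first-order system in $(x,\dot{x})$ with Euler-Maruyama; again the parameter values below are arbitrary:</p>

```python
import numpy as np

# x'' + 2*gamma*x' + omega**2 * x = sqrt(D)*eta, with x(0) = v(0) = 0
gamma, omega, D = 0.5, 2.0, 1.0
dt, T, n_paths = 1e-3, 1.0, 20000
rng = np.random.default_rng(0)

x = np.zeros(n_paths)
v = np.zeros(n_paths)
for _ in range(int(T / dt)):
    dW = np.sqrt(dt) * rng.standard_normal(n_paths)
    # explicit Euler step for (x, v); noise enters the velocity equation
    x, v = x + v * dt, v + (-2 * gamma * v - omega**2 * x) * dt + np.sqrt(D) * dW

w1 = np.sqrt(omega**2 - gamma**2)
var_theory = D / (4 * gamma * omega**2) * (
    1 - np.exp(-2 * gamma * T) * (1 + (gamma / w1) * (np.sin(2 * w1 * T)
        + (2 * gamma / w1) * np.sin(w1 * T) ** 2)))
print(x.var(), var_theory)
```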
<h2 id="connection-to-fokker-planck-equation">Connection to Fokker-Planck Equation</h2>
<p>So far we have considered the moment generating functional, and the
probability density functional $P[x(t)]$, however often of interest is
the probability density $p(x,t)$. This can be computed from the above
framework with the following derivation.</p>
<p>Let $U(x_{1},t_{1}|x_{0},t_{0})$ be the transition probability between a
start point $x_{0},t_{0}$ to $x_{1},t_{1}$, then</p>
<div>
$$\begin{aligned}
U(x_{1},t_{1}|x_{0},t_{0}) & = \int\mathcal{D}x(t)\delta(x(t_{1})-x_{1})P[x(t)]\\
& = \frac{1}{2\pi i}\int d\lambda\int\mathcal{D}x(t)e^{\lambda(x(t_{1})-x_{1})}P[x(t)]\\
& = \frac{1}{2\pi i}\int d\lambda e^{-\lambda(x_{1}-x_{0})}Z_{CM}(\lambda)\end{aligned}$$
</div>
<p>where $Z_{CM}$ gives the moments of $x(t_{1})-x_{0}$ given
$x(t_{0})=x_{0}$</p>
<script type="math/tex; mode=display">Z_{CM}=\int\mathcal{D}xe^{\lambda(x(t_{1})-x_{0})}P[x(t)]</script>
<p>Using the
following two relations:</p>
<div>
$$\begin{aligned}
Z_{CM}(\lambda) & = 1+\sum_{n=1}^{\infty}\frac{\lambda^{n}}{n!}\langle(x(t_{1})-x_{0})^{n}\rangle_{x(t_{0})=x_{0}}\\
\frac{1}{2\pi i}\int d\lambda\, e^{-\lambda(x_{1}-x_{0})}\lambda^{n} & = \left(-\frac{\partial}{\partial x_{1}}\right)^{n}\delta(x_{1}-x_{0})\end{aligned}$$
</div>
<p>then $U$ becomes</p>
<script type="math/tex; mode=display">U(x_{1},t_{1}|x_{0},t_{0})=\left(1+\sum_{n=1}^{\infty}\frac{1}{n!}\left(-\frac{\partial}{\partial x_{1}}\right)^{n}\langle(x(t_{1})-x_{0})^{n}\rangle_{x(t_{0})=x_{0}}\right)\delta(x_{1}-x_{0}).</script>
<p>From here we can derive a relation for $p(x,t)$:</p>
<div>
$$\begin{aligned}
p(y,t+\Delta t) & = \int U(y,t+\Delta t|y',t)p(y',t)\, dy'\\
& = \int\left(1+\sum_{n=1}^{\infty}\frac{1}{n!}\left(-\frac{\partial}{\partial y}\right)^{n}\langle(x(t+\Delta t)-y')^{n}\rangle_{x(t)=y'}\right)\delta(y-y')p(y',t)\, dy'\\
& = \left(1+\sum_{n=1}^{\infty}\frac{1}{n!}\left(-\frac{\partial}{\partial y}\right)^{n}\langle(x(t+\Delta t)-y)^{n}\rangle_{x(t)=y}\right)p(y,t)\end{aligned}$$
</div>
<p>and thus a PDE for $p(x,t)$:</p>
<div>
$$\begin{aligned}
\frac{\partial p(y,t)}{\partial t}\Delta t & = \sum_{n=1}^{\infty}\frac{1}{n!}\left(-\frac{\partial}{\partial y}\right)^{n}\langle(x(t+\Delta t)-y)^{n}\rangle_{x(t)=y}p(y,t)+O(\Delta t^{2})\\
\frac{\partial p(y,t)}{\partial t} & = \sum_{n=1}^{\infty}\frac{1}{n!}\left(-\frac{\partial}{\partial y}\right)^{n}D_{n}(y,t)p(y,t)\end{aligned}$$
</div>
<p>as $\Delta t\to0$. This is the Kramers-Moyal expansion, where
the $D_{n}$ are</p>
<p><script type="math/tex">D_{n}(y,t)=\lim_{\Delta t\to0}\left.\frac{\langle(x(t+\Delta t)-y)^{n}\rangle}{\Delta t}\right|_{x(t)=y}</script>
and are computed from the SDE. For example, for the Ito process</p>
<p><script type="math/tex">dx=f(x,t)dt+g(x,t)dB_{t}</script> we can compute $D_{1}(y,t)=f(y,t)$ and
$D_{2}(y,t)=g(y,t)^{2}$, $D_{n}=0$ for $n>2$. Hence the PDE becomes a
Fokker-Planck equation</p>
<p><script type="math/tex">\frac{\partial p(y,t)}{\partial t}=\left(-\frac{\partial}{\partial y}D_{1}(y,t)+\frac{1}{2}\frac{\partial^{2}}{\partial y^{2}}D_{2}(y,t)\right)p(y,t)</script>
We can then compute $p(x,t)=U(x,t|0,0)$ as</p>
<div>
$$\begin{aligned}
p(x,t) & = \frac{1}{2\pi i}\int d\lambda\, e^{-\lambda x}Z_{CM}(\lambda)\\
& = \frac{1}{2\pi i}\int d\lambda\, e^{-\lambda x}\exp\left[\sum_{n=1}\frac{1}{n!}\lambda^{n}\langle x(t)^{n}\rangle_{C}\right]\end{aligned}$$
</div>
<p>For OU, we know the cumulants hence</p>
<script type="math/tex; mode=display">p(x,t)=\sqrt{\frac{a}{\pi D(1-e^{-2a(t-t_{0})})}}\exp\left(\frac{-a(x-ye^{-a(t-t_{0})})^{2}}{D(1-e^{-2a(t-t_{0})})}\right)</script>
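<p>This is simply a Gaussian density with mean $ye^{-a(t-t_{0})}$ and variance $D(1-e^{-2a(t-t_{0})})/2a$, which is easy to confirm numerically (parameter values below are arbitrary):</p>

```python
import numpy as np
from scipy.stats import norm

a, D, y, t0, t = 1.0, 0.2, 1.0, 0.0, 1.5    # arbitrary illustrative values
tau = t - t0
mean = y * np.exp(-a * tau)
var = D * (1 - np.exp(-2 * a * tau)) / (2 * a)

xs = np.linspace(-2, 3, 11)
# the density from the post, evaluated on a grid
p = np.sqrt(a / (np.pi * D * (1 - np.exp(-2 * a * tau)))) \
    * np.exp(-a * (xs - mean) ** 2 / (D * (1 - np.exp(-2 * a * tau))))
print(np.allclose(p, norm.pdf(xs, loc=mean, scale=np.sqrt(var))))  # True
```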
<h1 id="statistical-mechanics-of-the-neocortex">Statistical mechanics of the neocortex</h1>
<p>Having spent some time on how path integrals can be used as
calculation devices for studying stochastic DEs, we now turn to some
specific examples of their use in neuroscience.</p>
<h2 id="neural-field-models">Neural field models</h2>
<p>A neural field model represents a continuum approximation to neural
activity (particularly in models of cortex). They are often expressed as
integro-differential equations:</p>
<script type="math/tex; mode=display">dU=\left[-U+\int_{-\infty}^{\infty}w(x-y)F(U(y,t))dy\right]dt</script>
<p>where
$U=U(x,t)$ may be either the mean firing rate or a measure of synaptic
input at position $x$ and time $t$. The function $w(x,y)=w(|x-y|)$ is a
weighting function often taken to represent the synaptic weight as a
function of distance from $x$. $F(U)$ is a measure of the firing rate as
a function of inputs. For tractability, $F$ may often be taken to be a
heaviside function, or a sigmoid curve. It is called a field because
each continuous point $x$ is assigned a value $U$, instead of modelling
the activity of individual neurons. A number of spatio-temporal pattern
forming systems may be studied in the context of these models. The
formation of ocular-dominance columns, geometric hallucinations,
persistent ‘bump models’ of activity associated with working memory, and
perceptual switching in optical illusions are all examples of pattern
formation that can be modelled by such a theory. Refer to Bressloff 2012
for a comprehensive review.</p>
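<p>A minimal deterministic simulation of such a model can be sketched as follows; the Mexican-hat weight function, sigmoidal firing rate, and all parameter values here are invented purely for illustration:</p>

```python
import numpy as np

# dU/dt = -U + integral of w(x - y) F(U(y, t)) dy, discretized on a 1D grid
n, L, dt, steps = 200, 10.0, 0.01, 500
xgrid = np.linspace(-L / 2, L / 2, n)
dx = xgrid[1] - xgrid[0]

# local excitation with broader lateral inhibition ("Mexican hat")
dist = np.abs(xgrid[:, None] - xgrid[None, :])
w = np.exp(-dist**2) - 0.5 * np.exp(-dist**2 / 4)

def F(u):                                  # sigmoidal firing-rate function
    return 1.0 / (1.0 + np.exp(-5.0 * (u - 0.2)))

U = 0.5 * np.exp(-xgrid**2)                # localized initial bump
for _ in range(steps):
    U = U + dt * (-U + (w @ F(U)) * dx)    # forward-Euler step

print(U.min(), U.max())                    # activity remains bounded
```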
<p>The addition of additive noise to the above model:</p>
<script type="math/tex; mode=display">dU=\left[-U+\int_{-\infty}^{\infty}w(x-y)F(U(y,t))dy\right]dt+g(U)dW(x,t)\label{eq:neuralfield}</script>
<p>for $dW(x,t)$ a white noise process has been studied by Bressloff
from both a path integral approach, and by studying a
perturbation expansion of the resulting master equation more directly.
We describe briefly how the path integral approach is formulated, and
the results that can be computed as a result. More details are found in
Bressloff 2009.</p>
<p>As in the derivations of Section 2, the stochastic neural field equation above is
discretized in both time and space to give:</p>
<script type="math/tex; mode=display">U_{i+1,m}-U_{i,m}=\left[-U_{i,m}+\Delta d\sum_{n}w_{mn}F(U_{i,n})\right]\Delta t+\frac{\sqrt{\Delta t}}{\sqrt{\Delta d}}g(U_{i,m})dW_{i,m}+\Phi_{m}\delta_{i,0}</script>
<p>for initial condition function $\Phi(x)=U(x,0).$ Where each noise
process is a zero-mean, delta correlated process:</p>
<script type="math/tex; mode=display">\langle dW_{i,m}\rangle=0,\quad\langle dW_{i,m}dW_{j,n}\rangle=\delta_{i,j}\delta_{m,n}.</script>
<p>Let $U$ and $W$ represent vectors with components $U_{i,m}$ and
$W_{i,m}$ such that we can write down the probability density function
conditioned on a particular realization of $W$:</p>
<script type="math/tex; mode=display">P(U|W)=\prod_{m}\prod_{i=1}^{N}\delta\left(U_{i+1,m}-U_{i,m}+\left[U_{i,m}-\Delta d\sum_{n}w_{mn}F(U_{i,n})\right]\Delta t-\frac{\sqrt{\Delta t}}{\sqrt{\Delta d}}g(U_{i,m})dW_{i,m}-\Phi_{m}\delta_{i,0}\right)</script>
<p>where we again use the Fourier representation of the delta function:</p>
<script type="math/tex; mode=display">P(U|W)=\int\prod_{m}\prod_{i=1}^{N}\frac{d\tilde{U}_{i,m}}{2\pi}\exp\left[-i\tilde{U}_{i,m}\left(U_{i+1,m}-U_{i,m}+\left[U_{i,m}-\Delta d\sum_{n}w_{mn}F(U_{i,n})\right]\Delta t-\frac{\sqrt{\Delta t}}{\sqrt{\Delta d}}g(U_{i,m})dW_{i,m}-\Phi_{m}\delta_{i,0}\right)\right].</script>
<p>Knowing the density for the random vector $W$ we can write the
probability of a vector $U$:</p>
<script type="math/tex; mode=display">P(U)=\int\prod_{m}\prod_{i=1}^{N}\frac{d\tilde{U}_{i,m}}{2\pi}\exp\left[-i\tilde{U}_{i,m}\left(U_{i+1,m}-U_{i,m}+\left[U_{i,m}-\Delta d\sum_{n}w_{mn}F(U_{i,n})\right]\Delta t-\Phi_{m}\delta_{i,0}\right)-\frac{\Delta t}{2\Delta d}g^{2}(U_{i,m})\tilde{U}_{i,m}^{2}\right].</script>
<p>Taking the continuum limit gives the density:</p>
<script type="math/tex; mode=display">P[U]=\int\mathcal{D}\tilde{U}e^{-S[U,\tilde{U}]},</script>
<p>for action</p>
<script type="math/tex; mode=display">S[U,\tilde{U}]=\int dx\int_{0}^{T}dt\left(\tilde{U}\left[U_{t}(x,t)+U(x,t)-\int w(x-y)F(U(y,t))dy-\Phi(x)\delta(t)\right]-\frac{1}{2}\tilde{U}^{2}g^{2}(U(x,t))\right).</script>
<p>Given the action, the moment generating functional and propagator can be
defined as previously. In linear cases the moments can be computed
exactly.</p>
<h3 id="the-weak-noise-expansion">The weak-noise expansion</h3>
<p>Suppose the noise term is scaled by a small parameter, $g(U)\to\sigma g(U)$
with $\sigma\ll1$. (For instance, in the case of a Langevin approximation
to the master equation, it is the case that $\sigma\approx1/N$ for $N$
the number of neurons.) Rescaling variables
$\tilde{U}\to\tilde{U}/\sigma^{2}$ and
$\tilde{J}\to\tilde{J}/\sigma^{2}$ then the generating functional
becomes:</p>
<script type="math/tex; mode=display">Z=\int\mathcal{D}U\mathcal{D}\tilde{U}e^{-\frac{1}{\sigma^{2}}S[U,\tilde{U}]}e^{\frac{1}{\sigma^{2}}\int dx\int_{0}^{T}dt[\tilde{U}J+\tilde{J}U]},</script>
<p>which can be thought of in terms of the loop expansion described in
the earlier section on perturbative approaches. Performing the expansion in orders of
$\sigma$ allows for a ‘semi-classical’ expansion to be performed. The
corrections to the deterministic equations take the form</p>
<script type="math/tex; mode=display">\frac{\partial v}{\partial t}=-v(x,t)+\int w(x-y)F(v(y,t))dy+\frac{\sigma^{2}}{2}\int w(x-y)C(x,y,t)F''(v(y,t))dy+O(\sigma^{4})</script>
<p>for $C(x,y,t)$ the second-order cumulant (covariance) function. The
expression for $C(x,y,t)$ is derived and studied in more detail in Buice
<em>et al</em> 2010.</p>
<h2 id="mean-field-wilson-cowan-equations-and-corrections">Mean-field Wilson-Cowan equations and corrections</h2>
<p>Another approach using path integrals has been extensively studied by
Buice and Cowan (Buice 2007; see also Bressloff
2009). Here, we envision a network of neurons, each of which exists
in one of two or three states, depending on the time scales of
interest relative to the time scales of the neurons being studied. Each
neuron in the network is modeled as a Markov process which transitions
between active and quiescent states (and a refractory state, if relevant).</p>
<p>For the two state model, assume that each neuron in the network creates
spikes and that these spikes have an impact on the network dynamics for
an exponentially distributed time given by a decay rate $\alpha.$ Let
$n_{i}$ denote the number of ‘active’ spikes at a given time for neuron
$i$ and let $\mathbf{n}$ denote the state of all neurons at a given
time. We assume that neuron $i$ becomes active at a rate given by the
function</p>
<script type="math/tex; mode=display">f\left(\sum_{j}w_{ij}n_{j}+I\right)</script>
<p>for some firing rate function
$f$ and some external input $I$. Then the master equation for the state
of the system is:</p>
<script type="math/tex; mode=display">\frac{dP(\mathbf{n},t)}{dt}=\sum_{i}\left\{\alpha(n_{i}+1)P(\mathbf{n}_{i+},t)-\alpha n_{i}P(\mathbf{n},t)+f\left(\sum_{j}w_{ij}n_{j}+I\right)\left[P(\mathbf{n}_{i-},t)-P(\mathbf{n},t)\right]\right\}</script>
<p>where we denote by $\mathbf{n}_{i\pm}$ the state of the network
$\mathbf{n}$ with one more or one fewer active spike in neuron $i$. The
assumption is made that each neuron is identical and that the weight
function $w_{ij}=w_{|i-j|}$, that is, it only depends on the distance
between the two neurons. Of interest is the mean activity of neuron $i$:</p>
<script type="math/tex; mode=display">a_{i}(t)=\langle n_{i}(t)\rangle.</script>
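<p>This master equation can be simulated exactly with Gillespie’s algorithm. The sketch below is illustrative only – the parameters and the constant rate function are my own choices, not taken from Buice and Cowan – and reads the rate of neuron $i$ as $f$ applied to its summed input:</p>

```python
import numpy as np

def gillespie(w, f, I, alpha, T, n0, rng):
    """Exact stochastic simulation of the two-state master equation:
    neuron i gains an active spike at rate f((w @ n)[i] + I) and
    loses one at rate alpha * n[i]."""
    n = np.array(n0, dtype=float)
    N = n.size
    t = 0.0
    while t < T:
        rates = np.concatenate([alpha * n, f(w @ n + I)])
        total = rates.sum()
        if total <= 0:
            break
        t += rng.exponential(1.0 / total)  # time to next event
        k = rng.choice(2 * N, p=rates / total)
        if k < N:
            n[k] -= 1.0       # a spike decays
        else:
            n[k - N] += 1.0   # neuron k - N fires
    return n

# Uncoupled sanity check: with w = 0 and a constant rate f = lam, each
# neuron's count equilibrates to a Poisson distribution with mean lam / alpha.
rng = np.random.default_rng(0)
N, lam, alpha = 100, 2.0, 1.0
n = gillespie(np.zeros((N, N)), lambda u: np.full(u.shape, lam), 0.0,
              alpha, 20.0, np.zeros(N), rng)
```

<p>In the uncoupled case the simulated mean activity settles near $\lambda/\alpha$, the fixed point of the corresponding rate equation.</p>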
<p>Using an operator representation it is possible to derive a stochastic
field theory in the continuum limit ($N\to\infty$ and
$n_{i}(t)\to n(x,t)$) to give moments of $n_{i}(t)$ in terms of the
interaction between two fields $\varphi(x,t)$ and
$\tilde{\varphi}(x,t)$. The details of the derivation are contained in
the Appendices of Buice and Cowan 2007. These fields can be
related to quantities of interest through</p>
<script type="math/tex; mode=display">a(x,t)=\langle n(x,t)\rangle=\langle\varphi(x,t)\rangle</script>
<p>and</p>
<script type="math/tex; mode=display">\langle n(x_{1},t_{1})n(x_{2},t_{2})\rangle=\langle\varphi(x_{1},t_{1})\varphi(x_{2},t_{2})\rangle+\langle\varphi(x_{1},t_{1})\tilde{\varphi}(x_{2},t_{2})\rangle a(x_{2},t_{2})</script>
<p>for $t_{1}>t_{2}$. The propagator, as before, is</p>
<script type="math/tex; mode=display">G(x_{1},t_{1};x_{2},t_{2})=\langle\varphi(x_{1},t_{1})\tilde{\varphi}(x_{2},t_{2})\rangle</script>
<p>and the generating function is given by</p>
<script type="math/tex; mode=display">Z[J,\tilde{J}]=\int\mathcal{D}\varphi\mathcal{D}\tilde{\varphi}e^{-S[\varphi,\tilde{\varphi}]+J\tilde{\varphi}+\tilde{J}\varphi}.</script>
<p>For the master equation above the action is given by:</p>
<script type="math/tex; mode=display">S[\varphi,\tilde{\varphi}]=\int dx\int_{0}^{T}dt\left(\tilde{\varphi}\partial_{t}\varphi+\alpha\tilde{\varphi}\varphi-\tilde{\varphi}f\left(w\star[\tilde{\varphi}\varphi+\varphi]+I\right)\right)-\int dx\,\bar{n}(x)\tilde{\varphi}(x,0),</script>
<p>for convolution $\star$ and initial condition $\bar{n}(x).$ The
action can now be divided into a free and an interacting part,
and perturbation expansions can be performed.</p>
<p>The loop expansion provides a useful ordering and, as described in
Section 2, amounts to organizing Feynman diagrams by the number of loops
contained in them. The zeroth order, mean-field theory, corresponds to
Feynman diagrams containing zero loops; such diagrams are called tree
diagrams. It can be shown that the dynamics of this expansion obey:
<script type="math/tex">\partial_{t}a_{0}(x,t)+\alpha a_{0}(x,t)-f(w\star a_{0}(x,t)+I)=0</script>
which is a simple form of the well-known Wilson-Cowan equations.
That these equations are recovered as the zeroth order expansion of the
continuum limit of the master equation gives confidence that the
higher-order terms of the expansion will indeed correspond to relevant
dynamics. The ‘one-loop’ correction is given by:</p>
<script type="math/tex; mode=display">\partial_{t}a_{1}(x,t)+\alpha a_{1}(x,t)-f(w\star a_{1}(x,t)+I)+h\mathcal{N}(a_{1},\Delta)=0</script>
<p>for</p>
<script type="math/tex; mode=display">\mathcal{N}(a,\Delta)=\int dx_{1}dx_{2}dx'dt'dx''f^{(2)}(x,t)w(x-x_{1})w(x-x_{2})f^{(1)}(x',t')w(x'-x'')\Delta(x_{1}-x',t-t')\Delta(x_{2}-x'',t-t')a(x'',t')</script>
<p>and $\Delta$ the ‘tree-level’ propagator.</p>
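<p>The zeroth-order (tree-level) Wilson-Cowan equation above can be integrated numerically with a forward-Euler scheme. This sketch assumes an illustrative Gaussian coupling kernel and sigmoidal rate function on a periodic domain – none of these choices come from the text:</p>

```python
import numpy as np

def simulate_wilson_cowan(alpha=1.0, I=0.0, L=10.0, nx=128, T=20.0, dt=0.01):
    """Forward-Euler integration of the mean-field (tree-level) dynamics
    da/dt = -alpha * a + f(w * a + I) on a ring of circumference L."""
    x = np.linspace(0.0, L, nx, endpoint=False)
    dx = x[1] - x[0]
    # ring distance between grid points, Gaussian coupling footprint
    d = np.abs(x[:, None] - x[None, :])
    d = np.minimum(d, L - d)
    w = np.exp(-d**2 / 2.0) * dx                    # kernel, with quadrature weight
    f = lambda u: 1.0 / (1.0 + np.exp(-(u - 1.0)))  # sigmoidal rate function
    a = 0.1 * np.ones(nx)
    for _ in range(round(T / dt)):
        a = a + dt * (-alpha * a + f(w @ a + I))
    return a
```

<p>With a homogeneous initial condition the activity relaxes to a spatially uniform fixed point $a^{*}$ satisfying $\alpha a^{*}=f(Wa^{*}+I)$, where $W$ is the integral of the kernel.</p>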
<h1 id="summary">Summary</h1>
<p>We have described how path integrals can be used to compute moments and
densities of a stochastic differential equation, and how they can be
used to perform perturbation expansions around ‘mean-field’, or
classical, solutions.</p>
<p>Path integral methods, once one is accustomed to their use, can provide
a quick and intuitive way of solving particular problems. However, it is
worth highlighting that there are few examples of problems which can be
solved with path integral methods but not with other, perhaps more
standard, methods. Thus, while they are a powerful and general tool,
their utility is often countered by the fact that, for many problems,
simpler solution techniques exist.</p>
<p>Further, it should be highlighted that the path integral as it was
defined here – as a limit of finite-dimensional integration
($\int\prod_{i}^{N}dx_{i}\to\int\mathcal{D}x(t)$) – does not result in a
valid measure. In some cases the Wiener measure may equivalently be
used, but in other cases the path integral as formulated by Feynman
remains a mathematically unjustified entity.</p>
<p>With these caveats in mind, their principal benefit may instead
come from the intuition that they bring to novel mathematical and
physical problems. When unsure how to proceed, having many different
ways of approaching a problem can only be beneficial. Indeed, in 1965
Feynman said in his Nobel acceptance lecture: “Theories of the known,
which are described by different physical ideas may be equivalent in all
their predictions and are hence scientifically indistinguishable.
However, they are not psychologically identical when trying to move from
that base into the unknown. For different views suggest different kinds
of modifications which might be made and hence are not equivalent in the
hypotheses one generates from them in one’s attempt to understand what
is not yet understood.”</p>
<div class="footnotes">
<ol>
<li id="fn:3">
<p><em>e.g.</em> in QED this coupling is related to the charge of the electron
($e$): $\alpha\approx1/137=\text{fine structure constant}$ <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>The expression for the mean and variance I was able to verify (pp.
83–85 of Gitterman 2005). The expression for the covariance I was
unable to locate in another source to verify; that it reduces to the
correct variance is encouraging, however. <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Ben Lansdellben dot lansdell at gmail dot comIn the previous post we defined path integrals through a simple ‘time-slicing’ approach. And used them to compute moments of simple stochastic DEs. In this follow-up post we will examine how expansions can be used to approximate moments, how we can use the moment generating functional to compute probability densities, and how these methods may be helpful in some cases in neuroscience.Path integrals and SDEs in neuroscience2016-01-30T00:00:00+00:002016-01-30T00:00:00+00:00http://benlansdell.github.io/statistics/sdes<h2 id="introduction">Introduction</h2>
<p>The path integral was first considered by Wiener in the 1930s in his study of diffusion and Brownian motion. It was later co-opted by Dirac and by Richard Feynman in Lagrangian formulations of quantum mechanics. They provide a quite general and powerful approach to tackling problems not just in quantum field theory but in stochastic differential equations more generally. There is an associated learning curve to being able to make use of path integral methods, however, and for many problems simpler solution techniques exist.</p>
<p>Nonetheless, it is interesting to think about their application to neuroscience. In the following two posts I will describe how path integrals can be defined and used to solve simple SDEs, and why such ‘statistical mechanics’ tools may be useful in studying the brain. The following largely follows material from the paper “Path Integral Methods for Stochastic Differential Equations”, by Carson Chow and Michael Buice, 2012. This post will assume some familiarity with probability and stochastic processes.</p>
<h2 id="statistical-mechanics-in-the-brain">Statistical mechanics in the brain</h2>
<p>The cerebral cortex is the outermost layer of the mammalian brain. In a human brain the <em>neocortex</em> consists of approximately 30 billion neurons. Of all parts of the human brain, its neural activity is the most correlated with our <em>higher-order</em> behaviour: language, self-control, learning, attention, memory, planning. Lesion and stroke studies make clear that the cortex has significant functional localization. Despite this localization, however, individual neurons from different regions of cortex generally require expert training to distinguish – these differences in functionality appear to arise largely from differences in connectivity.</p>
<p>The interplay between structural homogeneity and functional heterogeneity of different cortical regions poses significant challenges to obtaining quantitative models of the large-scale activity of the cortex. Since structured neural activity is observed on spatial scales involving thousands to billions of neurons, and given that this activity is associated with particular functions and pathologies, dynamical models of large-scale cortical networks are necessary for an understanding of these functions and dysfunctions. Examples of large-scale activity include wave-like activity during development, bump models of working memory, avalanches in awake and sleeping states, and pathological oscillations responsible for epileptic seizures.</p>
<p>A particular challenge to building such models is noise: it is well known that significant neural variability at both the individual and population level exists in response to repeated stimuli. The spike trains of individual cortical neurons are in general very noisy, such that their firing is often well approximated by a Poisson process. The primary source of cell-intrinsic noise is fluctuation in ion channel activity, arising from the finite number of ion channels opening and closing, while the primary source of extrinsic noise is uncorrelated synaptic input – a neuron may have thousands of synapses whose inputs often carry no meaningful, correlated signal. Population responses are similarly highly variable. Models of cortical networks must account for this variability, or demonstrate that it is irrelevant to the particular questions being asked.</p>
<p>Methods from statistical mechanics lend themselves well to modelling both of these factors – statistical, but meaningful, connectivity, and noisy, but meaningful, neural responses to stimulus – in networks with large numbers of neurons. With these thoughts in mind, let’s see how path integrals may be used to study SDEs relevant to tackling the above issues.</p>
<h2 id="a-path-integral-representation-of-stochastic-differential-equations">A path integral representation of stochastic differential equations</h2>
<p>We will begin by describing in some detail how they are constructed and manipulated. In general, we would like to study SDEs that may be of the form:</p>
<script type="math/tex; mode=display">\frac{d\mathbf{x}}{dt}=\mathbf{f}(\mathbf{x})+\mathbf{g}(\mathbf{x})\mathbf{\eta}(t)</script>
<p>for some noise process $\eta(t).$ Such a process may be characterized either by its probability density function (pdf, $p(x,t)$) or, equivalently, by its <em>moment hierarchy</em></p>
<script type="math/tex; mode=display">\langle x(t)\rangle,\quad\langle x(t)x(t')\rangle,\dots</script>
<p>A generic SDE in the above form may be studied as either a Langevin equation, or can be written as a Fokker-Planck equation, but perturbation methods in either of these forms may be difficult to apply. The path integral approach provides more mechanical methods for performing particular types of perturbation expansions. In the following sections we will derive a path integral formulation of a moment generating functional of an SDE, using the Ornstein-Uhlenbeck process as an example. This will be used to demonstrate the use of perturbation techniques using Feynman diagrams. We will also derive the pdf $p(x,t)$ of such a process.</p>
<h2 id="path-integrals">Path integrals</h2>
<p>A path integral, loosely, is an integral in which the domain of
integration is not a subset of a finite dimensional space (say
$\mathbb{R}^{n}$) but instead an infinite dimensional function space.
For instance, if we can define the probability density associated with a
particular realization of a random trajectory according to a given SDE,
then the probability that a particle travels from a point $\mathbf{a}$
to a point $\mathbf{b}$ can be computed by marginalizing (summing) over
all paths connecting these two points, subject to a suitable
normalization. Before taking this further it is useful to review some relevant concepts.</p>
<h3 id="moment-generating-functions">Moment generating functions</h3>
<p>The moment generating function (MGF) forms a crucial component to this
framework. Recall that for a single random variable $X$, the <em>moments</em>
($\langle X^{n}\rangle=\int x^{n}P(x)\, dx$) are obtained from the MGF
<script type="math/tex; mode=display">Z(\lambda)=\langle e^{\lambda x}\rangle=\int e^{\lambda x}P(x)\, dx</script>
<p>by taking derivatives</p>
<script type="math/tex; mode=display">\langle X^{n}\rangle=\left.\frac{1}{Z(0)}\frac{d^{n}}{d\lambda^{n}}Z(\lambda)\right|_{\lambda=0},</script>
<p>and that the MGF contains all information about RV $X$, as an alternative to studying the pdf directly.</p>
<p>In a similar fashion we can define <script type="math/tex">W(\lambda)=\log Z(\lambda),</script> so that</p>
<script type="math/tex; mode=display">\langle X^{n}\rangle_{C}=\frac{d^{n}}{d\lambda^{n}}\left.W(\lambda)\right|_{\lambda=0}</script>
<p>are the <em>cumulants</em> of RV $X$.</p>
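<p>These relations are easy to verify numerically: estimate $Z(\lambda)$ from samples and differentiate by finite differences. A small sketch for $X\sim N(2,1)$ (an arbitrary choice of example):</p>

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.0, size=100_000)  # samples of X ~ N(a = 2, sigma^2 = 1)

def Z(lam):
    """Sample estimate of the MGF <exp(lam X)>; normalized, so Z(0) = 1."""
    return np.mean(np.exp(lam * x))

eps = 1e-3
m1 = (Z(eps) - Z(-eps)) / (2 * eps)            # central difference: <X>
m2 = (Z(eps) - 2 * Z(0.0) + Z(-eps)) / eps**2  # second difference: <X^2>
```

<p>Because the same samples enter every evaluation of $Z$, the finite-difference noise largely cancels, and the estimates land close to $\langle X\rangle=2$ and $\langle X^{2}\rangle=a^{2}+\sigma^{2}=5$.</p>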
<p>For an $n$-dimensional random variable $\mathbf{x}=(x_{1},\dots,x_{n})$, the generating function is</p>
<script type="math/tex; mode=display">Z(\mathbf{\lambda})=\langle e^{\mathbf{\lambda}\cdot\mathbf{x}}\rangle=\int\prod_{i=1}^{n}dx_{i}e^{\mathbf{\lambda}\cdot\mathbf{x}}P(\mathbf{x})</script>
<p>for $\lambda=(\lambda_{1},\dots,\lambda_{n})$. Here, the $k$-th order moments are obtained via</p>
<script type="math/tex; mode=display">\left\langle \prod_{i=1}^{k}x_{(i)}\right\rangle =\left.\frac{1}{Z(0)}\prod_{i=1}^{k}\frac{\partial}{\partial\lambda_{(i)}}Z(\lambda)\right|_{\lambda=0}.</script>
<p>And, as before, the cumulant generating function is $W(\lambda)=\log Z(\lambda)$.</p>
<h3 id="stochastic-processes">Stochastic processes</h3>
<p>Instead of considering random variables in $n$ dimensions, we can
consider ‘infinite dimensional’ random variables through a time-slicing
limiting process. That is, we identify with each $x_{i}$ in $\mathbf{x}$
a time $t=ih$ such that $x_{i}=x(ih)$, and we let total time $T=nh$,
thereby splitting the interval $[0,T]$ into $n$ segments of length $h.$
From here, leaving any questions of convergence, etc, aside for the time
being, we can take the limit $n\to\infty$ (with $h=T/n$) such that
$x_{i}\to x(ih)=x(t)$, $\lambda_{i}\to\lambda(t)$ and
$P(\mathbf{x})\to P[x(t)]=\exp(-S[x(t)])$ for some functional $S[x]$
that we will call the <em>action</em>. Thus we envision that to compute the
MGF, instead of summing over all points in $\mathbb{R}^{n}$
$\left(\int\prod_{i=1}^{n}dx_{i}\right)$, we are instead summing over
all paths using a differential denoted $\int\mathcal{D}x(t)$:</p>
<script type="math/tex; mode=display">Z[\lambda]=\int\mathcal{D}x(t)\, e^{-S[x]+\int\lambda(t)x(t)\, dt}.</script>
<p>From this formula, moments can now be obtained via</p>
<script type="math/tex; mode=display">\left\langle \prod_{i=1}^{k}x(t_{(i)})\right\rangle =\frac{1}{Z[0]}\left.\prod_{i=1}^{k}\frac{\delta}{\delta\lambda(t_{(i)})}Z[\lambda]\right|_{\lambda(t)=0},</script>
<p>with the cumulant generating functional again being</p>
<script type="math/tex; mode=display">W[\lambda]=\log(Z[\lambda]).</script>
<h3 id="generic-gaussian-processes">Generic Gaussian processes</h3>
<p>The most important random process we consider is the Gaussian. Recall
that in one dimension the RV $X\sim N(a,\sigma^{2})$ has MGF</p>
<script type="math/tex; mode=display">Z(\lambda)=\int_{-\infty}^{\infty}\exp\left[\frac{-(x-a)^{2}}{2\sigma^{2}}+\lambda x\right]\, dx=\sqrt{2\pi}\sigma\exp(\lambda a+\lambda^{2}\sigma^{2}/2),</script>
<p>which is obtained by a ‘completing the square’ manipulation, and has cumulant GF</p>
<script type="math/tex; mode=display">W(\lambda)=\lambda a+\frac{1}{2}\lambda^{2}\sigma^{2}+\log(Z(0)),</script>
<p>so that the cumulants are
$\langle x\rangle_{C}=a,\langle x^{2}\rangle_{C}=\text{var}{X}=\sigma^{2}$ and
$\langle x^{k}\rangle_{C}=0$ for all $k>2$.</p>
<p>The $n$ dimensional Gaussian RV $X\sim N(0,K)$, with covariance matrix $K$, has MGF</p>
<script type="math/tex; mode=display">Z(\lambda)=\int_{-\infty}^{\infty}e^{-\frac{1}{2}\sum_{jk}x_{j}K_{jk}^{-1}x_{k}+\sum_{j}\lambda_{j}x_{j}}\, dx</script>
<p>This integral can also be evaluated exactly. Indeed, since $K$ is
symmetric positive definite (and so is $K^{-1}$), we can diagonalise in
orthonormal coordinates, making each dimension independent, and allowing
the integration to be performed one dimension at a time. This provides
<script type="math/tex">Z(\lambda)=[(2\pi)^{n}\det(K)]^{1/2}e^{\frac{1}{2}\sum_{jk}\lambda_{j}K_{jk}\lambda_{k}}.</script>
In an analogous fashion, through the same limiting process described
above, the infinite dimensional case is</p>
<script type="math/tex; mode=display">Z[\lambda]=\int\mathcal{D}x(t)e^{-\frac{1}{2}\int x(s)K^{-1}(s,t)x(t)dsdt+\int\lambda(t)x(t)dt}=Z[0]e^{\frac{1}{2}\int\lambda(s)K(s,t)\lambda(t)dsdt}.</script>
<p>Importantly for perturbation techniques, higher order (centered) moments
of multivariate Gaussian random variables can be expressed simply as a
sum of products of their second moments. This result is known as Wick’s
theorem:</p>
<div>
$$
\left\langle \prod_{i=1}^{k} x_{(i)} \right\rangle = \begin{cases} 0, & k\text{ odd}\\
\sum_{\sigma\in A}K_{\sigma(1)\sigma(2)}K_{\sigma(3)\sigma(4)}\cdots K_{\sigma(k-1)\sigma(k)}, & k\text{ even}\end{cases}
$$
</div>
<p>for $A=\{\text{all pairings of }x_{(i)}\}$. Only even
moments are non-zero. Note that this means that the covariance matrix
$K$ is the key to determining all higher order moments. Wick’s theorem lies at the heart of calculations utilizing Feynman diagrams.</p>
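<p>Wick’s theorem can be checked mechanically by enumerating pairings. A small, self-contained sketch (the helper names are mine):</p>

```python
def pairings(items):
    """Yield every way to partition a sequence into unordered pairs."""
    if not items:
        yield ()
        return
    first, rest = items[0], items[1:]
    for k in range(len(rest)):
        for p in pairings(rest[:k] + rest[k + 1:]):
            yield ((first, rest[k]),) + p

def wick_moment(K, idx):
    """<x_{i1} ... x_{ik}> for a zero-mean Gaussian with covariance K:
    the sum over all pairings of products of K entries; odd moments vanish."""
    if len(idx) % 2:
        return 0.0
    total = 0.0
    for pairing in pairings(tuple(idx)):
        term = 1.0
        for i, j in pairing:
            term *= K[i][j]
        total += term
    return total
```

<p>For a scalar unit-variance Gaussian this reproduces $\langle x^{4}\rangle=3\sigma^{4}=3$, and the number of pairings of $2k$ terms is $(2k-1)!!$ – e.g. 3 pairings of 4 terms and 15 pairings of 6.</p>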
<h2 id="applications-to-sdes">Applications to SDEs</h2>
<p>The previous construction for generic Gaussian processes can be adapted to construct a moment generating functional for generic SDEs of the form</p>
<script type="math/tex; mode=display">\frac{dx}{dt}=f(x,t)+g(x)\eta(t)+y\delta(t-t_{0}),</script>
<p>for $t\in[0,T]$. The process involves the same time-slicing approach, in which the above SDE is discretized in time steps $h$</p>
<script type="math/tex; mode=display">x_{i+1}-x_{i}=f_{i}(x_{i})h+g_{i}(x_{i})w_{i}\sqrt{h}+y\delta_{i,0}</script>
<p>under the Ito interpretation. We assume that each $w_{i}$ is Gaussian with $\langle w_{i}\rangle=0$ and
$\langle w_{i}w_{j}\rangle=\delta_{ij}$, such that $w_{i}$ describes a
Gaussian white noise process. Then the PDF of $\mathbf{x}$ given a
particular instantiation of a random walk $\{w_{i}\}$ is</p>
<script type="math/tex; mode=display">P[x|w;y]=\prod_{i=0}^{N}\delta[x_{i+1}-x_{i}-f_{i}(x_{i})h-g_{i}(x_{i})w_{i}\sqrt{h}-y\delta_{i,0}].</script>
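<p>Before continuing the derivation, note that this time-sliced update rule is precisely the Euler–Maruyama scheme and can be simulated directly. A minimal sketch, using the OU drift and constant noise amplitude that appear as the example later in the post (the parameter values are my own):</p>

```python
import numpy as np

def euler_maruyama(f, g, x0, T, h, rng):
    """Ito time-slicing of dx/dt = f(x) + g(x) eta(t):
    x_{i+1} = x_i + f(x_i) h + g(x_i) w_i sqrt(h), with w_i ~ N(0, 1)."""
    n = round(T / h)
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        x[i + 1] = x[i] + f(x[i]) * h + g(x[i]) * np.sqrt(h) * rng.standard_normal()
    return x

# One sample path of dx/dt = -a x + sqrt(D) eta(t) with a = 1, D = 0.5
rng = np.random.default_rng(0)
path = euler_maruyama(lambda x: -x, lambda x: np.sqrt(0.5),
                      x0=1.0, T=10.0, h=0.01, rng=rng)
```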
<p>Taking the Fourier transform of the PDF gives:</p>
<script type="math/tex; mode=display">P[x|w;y]=\int\prod_{j=0}^{N}\frac{dk_{j}}{2\pi}e^{-i\sum_{j}k_{j}(x_{j+1}-x_{j}-f_{j}(x_{j})h-g_{j}(x_{j})w_{j}\sqrt{h}-y\delta_{j,0})}</script>
<p>where we’ve made use of the fact that the Dirac delta function has
Fourier transform:</p>
<script type="math/tex; mode=display">\mathcal{F}\{\delta(x-x_{0});x\to k\}=\frac{1}{2\pi}e^{-ix_{0}k}.</script>
<p>Marginalizing over all random trajectories $\{w_{i}\}$ and evaluating the
resulting Gaussian integral gives:</p>
<script type="math/tex; mode=display">P[x|y]=\int\prod_{j=0}^{N}\frac{dk_{j}}{2\pi}e^{-\sum_{j}(ik_{j})\left(\frac{x_{j+1}-x_{j}}{h}-f_{j}(x_{j})-y\delta_{j,0}/h\right)h+\sum_{j}\frac{1}{2}g_{j}^{2}(x_{j})(ik_{j})^{2}h}</script>
<p>Again we take the continuum limit by letting $h\to0$ with $N=T/h$, and
by replacing $ik_{j}$ with $\tilde{x}(t)$ and
${\displaystyle \frac{x_{j+1}-x_{j}}{h}}$ with $\dot{x}(t)$:</p>
<script type="math/tex; mode=display">P[x(t)|y,t_{0}]=\int\mathcal{D}\tilde{x}(t)e^{-\int[\tilde{x}(t)(\dot{x}(t)-f(x(t),t)-y\delta(t-t_{0}))-\frac{1}{2}\tilde{x}^{2}g^{2}(x(t),t)]dt}.</script>
<p>The function $\tilde{x}(t)$ arises from the wave numbers
$k_{j}$; thus we can write down a moment generating functional over both
the position variable and its conjugate:</p>
<script type="math/tex; mode=display">Z[J,\tilde{J}]=\int\mathcal{D}x(t)\mathcal{D}\tilde{x}(t)e^{-S[x,\tilde{x}]+\int\tilde{J}x\, dt+\int J\tilde{x}\, dt}</script>
<p>More generally, instead of $g(x)\eta(t)$ with $\eta(t)$ a white noise
process, an SDE driven by a noise process with cumulant generating functional $W[\lambda(t)]$
will have the PDF:</p>
<div>
$$\begin{aligned}
P[x(t)|y,t_{0}] & = \int\mathcal{D}\eta(t)\delta[\dot{x}(t)-f(x,t)-\eta(t)-y\delta(t-t_{0})]e^{-S[\eta(t)]}\\
& = \int\mathcal{D}\eta(t)\mathcal{D}\tilde{x}(t)e^{-\int\tilde{x}(t)(\dot{x}(t)-f(x,t)-y\delta(t-t_{0}))\, dt+W[\tilde{x}(t)]}\end{aligned}$$
</div>
<p>If $\eta(t)$ is delta correlated
($\langle\eta(t)\eta(t')\rangle=\delta(t-t')$) then $W[\tilde{x}(t)]$
can be Taylor expanded in both $x(t)$ and $\tilde{x}(t)$:</p>
<script type="math/tex; mode=display">W[\tilde{x}(t)]=\sum_{n=1,m=0}^{\infty}\frac{v_{nm}}{n!}\int\tilde{x}^{n}(t)x^{m}(t)\, dt.</script>
<p>Note that the summation over $n$ starts at one because
$W[0]=\log(Z[0])=0$.</p>
<h2 id="the-ornstein-uhlenbeck-process">The Ornstein-Uhlenbeck process</h2>
<p>As an example, consider the Ornstein-Uhlenbeck process</p>
<script type="math/tex; mode=display">\dot{x}(t)+ax(t)-\sqrt{D}\eta(t)=0</script>
<p>which has the action</p>
<script type="math/tex; mode=display">S[x,\tilde{x}]=\int\left[\tilde{x}(t)(\dot{x}(t)+ax(t)-y\delta(t-t_{0}))-\frac{D}{2}\tilde{x}^{2}(t)\right]\, dt.</script>
<p>The moments could be found immediately, since the action is quadratic in
$\tilde{x}(t)$; however, we instead demonstrate how to study the
problem through a perturbation expansion. In this case the perturbation
series will truncate to the exact, and already known, solution. The idea is to
break the action into a ‘free’ and an ‘interacting’ component. The
break the action into a ‘free’ and ‘interacting’ component. The
terminology comes from quantum field theory in which free terms
typically represent a particle without any interaction with a field or
potential, and would have a quadratic action. The free action can
therefore be evaluated exactly, and the interaction term can be
expressed as an ‘asymptotic series’ around this solution. Let the action
be written</p>
<div>
$$\begin{aligned}
S & = S_{F}+S_{I}\\
& = \int\tilde{x}(t)\left[\dot{x}(t)+ax(t)\right]\, dt-\int\left[\tilde{x}(t)y\delta(t-t_{0})+\frac{D}{2}\tilde{x}^{2}(t)\right]\, dt\end{aligned}$$
</div>
<p>We define the function $G$, known as the linear response function,
correlator, or propagator, to be the Green’s function of the linear
differential operator corresponding to the free action:</p>
<script type="math/tex; mode=display">\left(\frac{d}{dt}+a\right)G(t,t')=\delta(t-t')</script>
<p>Note that $G(t,t')$
is in fact exactly equivalent to $K(t,t')$ from the generic Gaussian
stochastic process derived previously. Note also that, in general, the
‘inverse’ of a Green’s function $G(t,t')$ is an integral operator
satisfying:</p>
<script type="math/tex; mode=display">\int dt''G^{-1}(t,t'')G(t'',t')=\delta(t-t'),</script>
<p>for some $G^{-1}(t,t')$. Comparing with the defining equation
$\left(\frac{d}{dt}+a\right)G(t,t')=\delta(t-t')$ shows that the
following choice of $G^{-1}$ is indeed such an inverse:</p>
<script type="math/tex; mode=display">G^{-1}(t,t')=\left(\frac{d}{dt}+a\right)\delta(t-t').</script>
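<p>The propagator can be sanity-checked numerically: convolving $G(t,t')=H(t-t')e^{-a(t-t')}$ (derived below) against a drive $s(t)$ should solve $\dot{x}+ax=s$. A sketch with the illustrative choices $a=1$ and $s(t)=\sin t$, for which the convolution has the closed form $(\sin t-\cos t+e^{-t})/2$:</p>

```python
import numpy as np

a = 1.0
T, n = 10.0, 2001
t = np.linspace(0.0, T, n)
dt = t[1] - t[0]

def G(t1, t2):
    """Green's function of d/dt + a: H(t - t') exp(-a (t - t'))."""
    return np.where(t1 >= t2, np.exp(-a * (t1 - t2)), 0.0)

s = np.sin(t)  # an arbitrary test drive
# x(t) = int_0^T G(t, t') s(t') dt' by the trapezoidal rule on the grid
integrand = G(t[:, None], t[None, :]) * s[None, :]
x = dt * (integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1]))

# closed-form solution of x' + x = sin(t) with x(0) = 0
x_exact = 0.5 * (np.sin(t) - np.cos(t) + np.exp(-t))
```

<p>The quadrature agrees with the closed form up to the $O(h)$ error incurred at the discontinuity of $G$ along $t'=t$.</p>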
<p>The free
generating functional, then, is</p>
<script type="math/tex; mode=display">Z_{F}[J,\tilde{J}]=\int\mathcal{D}x(t)\mathcal{D}\tilde{x}(t)e^{-\int dtdt'\tilde{x}(t)G^{-1}(t,t')x(t')+\int\tilde{x}(t)J(t)\, dt+\int x(t)\tilde{J}(t)\, dt}.</script>
<p>So, analogous to the multivariate Gaussian case, we can evaluate this
integral exactly to obtain:</p>
<script type="math/tex; mode=display">Z_{F}[J,\tilde{J}]=e^{\int\tilde{J}(t)G(t,t')J(t')\, dtdt'}.</script>
<p>For the OU process we can in fact solve the linear differential equation
for the Green’s function $G$: <script type="math/tex">G(t,t')=H(t-t')e^{-a(t-t')},</script> for $H$ the Heaviside step function. The <em>free</em>
<em>moments</em> are then given by</p>
<script type="math/tex; mode=display">\left\langle \prod_{ij}x(t_{i})\tilde{x}(t_{j})\right\rangle _{F}=\left.\prod_{ij}\frac{\delta}{\delta\tilde{J}(t_{i})}\frac{\delta}{\delta J(t_{j})}e^{\int\tilde{J}(t)G(t,t')J(t')\, dtdt'}\right|_{J=\tilde{J}=0}.</script>
<p>Importantly, note that</p>
<script type="math/tex; mode=display">\left\langle x(t_{1})\tilde{x}(t_{2})\right\rangle _{F}=\left.\frac{\delta}{\delta\tilde{J}(t_{1})}\frac{\delta}{\delta J(t_{2})}e^{\int\tilde{J}(t)G(t,t')J(t')\, dtdt'}\right|_{J=\tilde{J}=0}=G(t_{1},t_{2})</script>
<p>and
$\langle\tilde{x}(t_{1})\tilde{x}(t_{2})\rangle_{F}=\langle x(t_{1})x(t_{2})\rangle_{F}=0$.</p>
<p>Since the only non-zero second order moments are those in which an
$x(t)$ is paired with an $\tilde{x}(t’)$ then Wick’s theorem means that
all non-zero higher order <em>free moments</em> must have equal numbers of
$x$’s as $\tilde{x}$’s. This is important in performing the expansions
below.</p>
<h3 id="using-feynman-diagrams">Using Feynman diagrams</h3>
<p>We have split the action into, loosely, linear and non-linear parts,
$S=S_{F}+S_{I}$, so that the MGF can be written:</p>
<div>
$$\begin{aligned}
Z[J,\tilde{J}] & = \int\mathcal{D}x(t)\mathcal{D}\tilde{x}(t)e^{-S_{F}-S_{I}+\int\tilde{J}x+\int J\tilde{x}}\\
& = \int\mathcal{D}x(t)\mathcal{D}\tilde{x}(t)P_{F}[x(t),\tilde{x}(t)]e^{-S_{I}+\int\tilde{J}x+\int J\tilde{x}}\\
& = \int\mathcal{D}x(t)\mathcal{D}\tilde{x}(t)P_{F}[x(t),\tilde{x}(t)]\sum_{n=0}^{\infty}\frac{1}{n!}(-S_{I}+\int\tilde{J}x+\int J\tilde{x})^{n}\\
& = \sum_{n=0}^{\infty}\frac{1}{n!}\left\langle \mu^{n}\right\rangle _{F}\end{aligned}$$
</div>
<p>with $\mu=-S_{I}+\int\tilde{J}x\, dt+\int J\tilde{x}\, dt$.</p>
<p>We have now expressed the MGF in terms of a sum of free moments, which
we know how to evaluate. To proceed, expand $S_{I}$:</p>
<script type="math/tex; mode=display">S_{I}=\sum_{m\ge0,n\ge0}V_{mn}=\sum_{m\ge0,n\ge0}v_{mn}\int x^{m}\tilde{x}^{n}\, dt.</script>
<p>In evaluating the expression for $Z$, there exists a diagrammatic way to
visualize each term that we need to consider for a desired moment.
Recall that the only free moments that are going to be non-zero are the
ones containing equal numbers of $x(t)$ and $\tilde{x}(t)$ terms. Wick’s
theorem then expresses these moments as the sum of the product of all
possible pairings between the $x(t)$ and $\tilde{x}(t)$ terms. Thus each
term of the multinomial expansion</p>
<script type="math/tex; mode=display">\left\langle \left(\sum_{n\ge0,m\ge0}v_{mn}\int x^{m}\tilde{x}^{n}\, dt+\int\tilde{J}xdt+\int J\tilde{x}dt\right)^{n}\right\rangle _{F}</script>
<p>can be thought of in terms of these pairings. The idea is that with each
$V_{mn}$ in $S_{I}$ we associate an <em>internal vertex</em> having $m$
entering edges and $n$ exiting edges. The $\int{J}\tilde{x}$ and
$\int\tilde{J}{x}$ terms contribute, respectively, entering and exiting
<em>external vertices.</em> Edges connecting vertices then correspond to a
pairing between an $x(t)$ and $\tilde{x}(t)$. Finally, since</p>
<script type="math/tex; mode=display">\left\langle \prod_{i=1}^{N}\prod_{j=1}^{M}x(t_{i})\tilde{x}(t_{j})\right\rangle =\frac{1}{Z[0,0]}\left.\prod_{i=1}^{N}\prod_{j=1}^{M}\frac{\delta}{\delta\tilde{J}(t_{i})}\frac{\delta}{\delta J(t_{j})}Z\right|_{J=\tilde{J}=0}</script>
<p>then only the terms in the expansion for $Z$ having $N$ entering and $M$
exiting external vertices (and thus $N$ and $M$ auxiliary terms) will
contribute to that moment. These terms are represented by <em>Feynman
diagrams</em>, which are graphs composed of a combination of these vertices
in which each of the $N$ entering external vertices is connected (paired
with), possibly through a number of the internal vertices, the $M$ exiting
external vertices. Moments can be simply computed by writing down all possible
diagrams with the requisite number of external vertices.</p>
<p>As an example, the coupling between external vertex
$\int\tilde{J}x\, dt$ and internal vertex
$\int\delta(t-t_{0})y\tilde{x}(t)\, dt$ in $Z$ can be evaluated as:</p>
<div>
$$\begin{aligned}
Z & = \left\langle \int dtdt'\,\tilde{J}(t)x(t)y\delta(t'-t_{0})\tilde{x}(t')\right\rangle _{F}+\text{all other terms}\\
& = \int dtdt'\,\tilde{J}(t)y\delta(t'-t_{0})\left\langle x(t)\tilde{x}(t')\right\rangle _{F}+\text{all other terms}\\
& = \int dt\, y\tilde{J}(t)G(t,t_{0})+\text{all other terms}.\end{aligned}$$
</div>
<p>But this is best explained diagrammatically. In our case we have:</p>
<script type="math/tex; mode=display">S_{I}=-\int dt\, y\delta(t-t_{0})\tilde{x}(t)-\int dt\,\frac{D}{2}\tilde{x}^{2}(t),</script>
<p>and the relevant vertices are illustrated in Figure 1. The process for
then computing the first and second moment for the OU process is
illustrated in Figure 2. We can see that each term will be written as an
integral involving the auxiliary functions $J$, $\tilde{J}$ and the
propagator $G$. In general, each vertex in each diagram is assigned
temporal index $t_{k}$.</p>
<figure class="center" style="width:300px">
<img src="../../images/feynman1.png" alt="img txt" />
<figcaption>Figure 1. Vertices involved in evaluating moments of example OU process. First
two vertices are internal vertices and are a part of the interacting
action $S_{I}$, the next two vertices are external vertices associated
with an auxiliary variable $J$, $\tilde{J}$. Each edge of a Feynman
diagram contributes a propagator $G(t,t')$.
</figcaption>
</figure>
<figure class="center" style="width:500px">
<img src="../../images/feynman2.png" alt="img txt" />
<figcaption>Figure 2. Computation of the first and second cumulants using Feynman diagrams. The mean
is given by the functional derivative with respect to one auxiliary function
$\tilde{J}$, evaluated at zero. The only non-zero term is
represented by a diagram containing one exiting vertex, and no entering
vertex. In this case the only diagram possible is composed of the
internal vertex representing the initial condition paired with the
exiting vertex. Evaluating the free moment and taking the functional
derivative of this term gives the mean in terms of $G(t,t')$. In a
similar fashion, the second cumulant is also calculated.
</figcaption>
</figure>
<p>For the OU process, in fact, only a finite number of diagrams need be
considered, and the exact mean and covariance can be determined. This is a result of the
linearity of the SDE: a linear SDE can be written to have no $x$ terms
in $S_{I}$, which means all internal vertices have no entering edges and
that all moments in $x$ must correspond to a finite number of diagrams
(in contrast to internal vertices with both entering and exiting edges
which can then be combined in an infinite number of ways). In this case,
from Figure 2, the mean and covariance are given by:</p>
<script type="math/tex; mode=display">\langle x(t)\rangle=yH(t-t_{0})e^{-a(t-t_{0})}</script>
<p>and</p>
<script type="math/tex; mode=display">\langle x(t)x(s)\rangle_{C}=\frac{D}{2a}\left(e^{-a|t-s|}-e^{-a(t+s-2t_{0})}\right).</script>
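<p>These cumulants can be checked against a direct ensemble simulation of the OU process. A sketch with illustrative parameters and $t_{0}=0$, comparing the sample mean with the propagator result $ye^{-a(t-t_{0})}$ and the long-time variance with the standard OU value $D/2a$:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
a, D, y = 1.0, 0.5, 1.0          # decay rate, noise strength, initial kick
h, steps, paths = 0.01, 1000, 20_000
x = np.full(paths, y)            # every path starts at x(t0) = y
for _ in range(steps):           # Euler-Maruyama, vectorized over paths
    x = x - a * x * h + np.sqrt(D * h) * rng.standard_normal(paths)

t = steps * h                    # t = 10, many decay times after t0
mean_pred = y * np.exp(-a * t)   # ~ 0: the initial condition has decayed
var_pred = D / (2 * a)           # stationary variance, 0.25
```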
<h2 id="in-summary">In summary</h2>
<p>We’ve seen how to construct a path integral formulation of a generic SDE, and how to use Feynman diagrams to perform perturbation expansions for the solution. In a <a href="http://benlansdell.github.io/statistics/sdesII/">follow-up post</a> we will consider more examples of how they can be used.</p>Ben Lansdellben dot lansdell at gmail dot comIntroduction