{"title": "A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data", "book": "Advances in Neural Information Processing Systems", "page_first": 1932, "page_last": 1940, "abstract": "Point processes are popular models of neural spiking behavior as they provide a statistical distribution over temporal sequences of spikes and help to reveal the complexities underlying a series of recorded action potentials.  However, the most common neural point process models, the Poisson process and the gamma renewal process, do not capture interactions and correlations that are critical to modeling populations of neurons.  We develop a novel model based on a determinantal point process over latent embeddings of neurons that effectively captures and helps visualize complex inhibitory and competitive interaction.  We show that this model is a natural extension of the popular generalized linear model to sets of interacting neurons.   The model is extended to incorporate gain control or divisive normalization, and the modulation of neural spiking based on periodic phenomena.  Applied to neural spike recordings from the rat hippocampus, we see that the model captures inhibitory relationships, a dichotomy of classes of neurons, and a periodic modulation by the theta rhythm known to be present in the data.", "full_text": "A Determinantal Point Process Latent Variable\nModel for Inhibition in Neural Spiking Data\n\nJasper Snoek\u2217\nHarvard University\n\njsnoek@seas.harvard.edu\n\nRyan P. Adams\nHarvard University\n\nrpa@seas.harvard.edu\n\nRichard S. Zemel\nUniversity of Toronto\n\nzemel@cs.toronto.edu\n\nAbstract\n\nPoint processes are popular models of neural spiking behavior as they provide a\nstatistical distribution over temporal sequences of spikes and help to reveal the\ncomplexities underlying a series of recorded action potentials. 
However, the most\ncommon neural point process models, the Poisson process and the gamma renewal\nprocess, do not capture interactions and correlations that are critical to modeling\npopulations of neurons. We develop a novel model based on a determinantal point\nprocess over latent embeddings of neurons that effectively captures and helps\nvisualize complex inhibitory and competitive interaction. We show that this model\nis a natural extension of the popular generalized linear model to sets of interacting\nneurons. The model is extended to incorporate gain control or divisive normalization,\nand the modulation of neural spiking based on periodic phenomena. Applied\nto neural spike recordings from the rat hippocampus, we see that the model\ncaptures inhibitory relationships, a dichotomy of classes of neurons, and a periodic\nmodulation by the theta rhythm known to be present in the data.\n\n1 Introduction\nStatistical models of neural spike recordings have greatly facilitated the study of both intra-neuron\nspiking behavior and the interaction between populations of neurons. Although these models are\noften not mechanistic by design, the analysis of their parameters fit to physiological data can help\nelucidate the underlying biological structure and causes behind neural activity. Point processes in\nparticular are popular for modeling neural spiking behavior as they provide statistical distributions\nover temporal sequences of spikes and help to reveal the complexities underlying a series of noisy\nmeasured action potentials (see, e.g., Brown (2005)). Significant effort has been focused on\naddressing the inadequacies of the standard homogeneous Poisson process to model the highly non-stationary\nstimulus-dependent spiking behavior of neurons. The generalized linear model (GLM) is a widely\naccepted extension for which the instantaneous spiking probability can be conditioned on spiking\nhistory or some external covariate. 
These models in general, however, do not incorporate the known\ncomplex instantaneous interactions between pairs or sets of neurons. Pillow et al. (2008)\ndemonstrated how the incorporation of simple pairwise connections into the GLM can capture correlated\nspiking activity and result in a superior model of physiological data. Indeed, Schneidman et al.\n(2006) observe that even weak pairwise correlations are sufficient to explain much of the collective\nbehavior of neural populations. In this paper, we develop a point process over spikes from\ncollections of neurons that explicitly models anti-correlation to capture the inhibitive and competitive\nrelationships known to exist between neurons throughout the brain.\n\n\u2217Research was performed while at the University of Toronto.\n\nAlthough the incorporation of pairwise inhibition in statistical models is challenging, we\ndemonstrate how complex nonlinear pairwise inhibition between neurons can be modeled explicitly and\ntractably using a determinantal point process (DPP). As a starting point, we show how a collection\nof independent Poisson processes, which is easily extended to a collection of GLMs, can be jointly\nmodeled in the context of a DPP. This is naturally extended to include dependencies between the\nindividual processes and the resulting model is particularly well suited to capturing anti-correlation or\ninhibition. The Poisson spike rate of each neuron is used to model individual spiking behavior, while\npairwise inhibition is introduced to model competition between neurons. The reader familiar with\nMarkov random fields can consider the output of each generalized linear model in our approach to\nbe analogous to a unary potential while the DPP captures pairwise interaction. 
Although inhibitory,\nnegative pairwise potentials render the use of Markov random fields intractable in general; in\ncontrast, the DPP provides a more tractable and elegant model of pairwise inhibition. Given neural\nspiking data from a collection of neurons and corresponding stimuli, we learn a latent embedding\nof neurons such that nearby neurons in the latent space inhibit one another as enforced by a DPP\nover the kernel between latent embeddings. Not only does this overcome a modeling shortcoming of\nstandard point processes applied to spiking data but it provides an interpretable model for studying\nthe inhibitive and competitive properties of sets of neurons. We demonstrate how divisive\nnormalization is easily incorporated into our model and a learned periodic modulation of individual neuron\nspiking is added to model the influence on individual neurons of periodic phenomena such as theta\nor gamma rhythms.\nThe model is empirically validated in Section 4, first on three simulated examples to show the\ninfluence of its various components and then using spike recordings from a collection of neurons in\nthe hippocampus of an awake behaving rat. We show that the model learns a latent embedding of\nneurons that is consistent with the previously observed inhibitory relationship between interneurons\nand pyramidal cells. The inferred periodic component of approximately 4 Hz matches the\nfrequency of the theta rhythm observed in these data and its learned influence on individual neurons is\nagain consistent with the dichotomy of neurons.\n2 Background\n2.1 Generalized Linear Models for Neuron Spiking\nA standard starting point for modeling single neuron spiking data is the homogeneous Poisson\nprocess, for which the instantaneous probability of spiking is determined by a scalar rate or intensity\nparameter. 
The generalized linear model (Brillinger, 1988; Chornoboy et al., 1988; Paninski, 2004;\nTruccolo et al., 2005) is a framework that extends this to allow inhomogeneity by conditioning the\nspike rate on a time varying external input or stimulus. Specifically, in the GLM the rate parameter\nresults from applying a nonlinear warping (such as the exponential function) to a linear weighting\nof the inputs. Paninski (2004) showed that one can analyze recorded spike data by finding the\nmaximum likelihood estimate of the parameters of the GLM, and thereby study the dependence of the\nspiking on external input. Truccolo et al. (2005) extended this to analyze the dependence of a\nneuron\u2019s spiking behavior on its past spiking history, ensemble activity and stimuli. Pillow et al. (2008)\ndemonstrated that the model of individual neuron spiking activity was significantly improved by\nincluding coupling filters from other neurons with correlated spiking activity in the GLM. Although\nit is prevalent in the literature, there are fundamental limitations to the GLM\u2019s ability to model real\nneural spiking patterns. The GLM cannot model the joint probability of multiple neurons spiking\nsimultaneously and thus lacks a direct dependence between the spiking of multiple neurons. Instead,\nthe coupled GLM relies on an assumption that pairs of neurons are conditionally independent given\nthe previous time step. However, empirical evidence, for example from neural recordings from the\nrat hippocampus (Harris et al., 2003), suggests that one can better predict the spiking of an\nindividual neuron by taking into account the simultaneous spiking of other neurons. In the following, we\nshow how to express multiple GLMs as a determinantal point process, enabling complex inhibitory\ninteractions between neurons. 
This new model enables a rich set of interactions between neurons\nand allows them to be embedded in an easily-visualized latent space.\n2.2 Determinantal Point Processes\nThe determinantal point process is an elegant distribution over configurations of points in space that\ntractably models repulsive interactions. Many natural phenomena are DPP distributed including\nfermions in quantum mechanics and the eigenvalues of random matrices. For an in-depth survey,\nsee Hough et al. (2006); see Kulesza and Taskar (2012) for an overview of their development within\nmachine learning. A point process provides a distribution over subsets of a space $\\mathcal{S}$. A\ndeterminantal point process models the probability density (or mass function, as appropriate) for a subset\nof points, $S \\subseteq \\mathcal{S}$, as being proportional to the determinant of a corresponding positive semi-definite\ngram matrix $K_S$, i.e., $p(S) \\propto |K_S|$. In the L-ensemble construction that we limit ourselves to here,\nthis gram matrix arises from the application of a positive semi-definite kernel function to the set $S$.\nKernel functions typically capture a notion of similarity and so the determinant is maximized when\nthe similarity between points, represented as the entries in $K_S$, is minimized. As the joint probability\nis higher when the points in $S$ are distant from one another, this encourages repulsion or inhibition\nbetween points. Intuitively, if one point $i$ is observed, then another point $j$ with high similarity, as\ncaptured by a large entry $[K_S]_{ij}$ of $K_S$, will become less likely to be observed under the model. It\nis important to clarify here that $K_S$ can be any positive semi-definite matrix over some set of\ninputs corresponding to the points in the set, but it is not the empirical covariance between the points\nthemselves. 
Conversely, $K_S$ encodes a measure of anti-correlation between points in the process.\nTherefore, we refer hereafter to $K_S$ as the kernel or gram matrix.\n3 Methods\n3.1 Modeling inter-Neuron Inhibition with Determinantal Point Processes\nWe are interested in modelling the spikes on $N$ neurons during an interval of time $T$. We will\nassume that time has been discretized into $T$ bins of duration $\\delta$. In our formulation here, we assume\nthat all interaction across time occurs due to the GLM and that the determinantal point process\nonly modulates the inter-neuron inhibition within a single time slice. This corresponds to a Poisson\nassumption for the marginal of each neuron taken by itself.\nIn our formulation, we associate each neuron, $n$, with a $D$-dimensional latent vector $y_n \\in \\mathbb{R}^D$ and\ntake our space to be the set of these vectors, i.e., $\\mathcal{S} = \\{y_1, y_2, \\cdots, y_N\\}$. At a high level, we use an\nL-ensemble determinantal point process to model which neurons spike in time $t$ via a subset $S_t \\subset \\mathcal{S}$:\n\n$$\\Pr(S_t \\mid \\{y_n\\}_{n=1}^N) = \\frac{|K_{S_t}|}{|K_{\\mathcal{S}} + I_N|}. \\tag{1}$$\n\nHere the entries of the matrix $K_{\\mathcal{S}}$ arise from a kernel function $k_\\theta(\\cdot,\\cdot)$ applied to the values $\\{y_n\\}_{n=1}^N$\nso that $[K_{\\mathcal{S}}]_{n,n'} = k_\\theta(y_n, y_{n'})$. The kernel function, governed by hyperparameters $\\theta$, measures the\ndegree of dependence between two neurons as a function of their latent vectors. In our empirical\nanalysis we choose a kernel function that measures this dependence based on the Euclidean distance\nbetween latent vectors such that neurons that are closer in the latent space will inhibit each other\nmore. In the remainder of this section, we will expand this to add stimulus dependence.\nAs the determinant of a diagonal matrix is simply the product of the diagonal entries, when $K_{\\mathcal{S}}$\nis diagonal the DPP has the property that it is simply the joint probability of $N$ independent\n(discretized) Poisson processes. Thus in the case of independent neurons with Poisson spiking we can\nwrite $K_{\\mathcal{S}}$ as a diagonal matrix where the diagonal entries are the individual Poisson intensity\nparameters, $K_{\\mathcal{S}} = \\operatorname{diag}(\\lambda_1, \\lambda_2, \\cdots, \\lambda_N)$. Through conditioning the diagonal elements on some external\ninput, this elegant property allows us to express the joint probability of $N$ independent GLMs in\nthe context of the DPP. This is the starting point of our model, which we will combine with a full\ncovariance matrix over the latent variables to include interaction between neurons.\nFollowing Zou and Adams (2012), we express the marginal preference for a neuron firing over\nothers, thus including the neuron in the subset $S$, with a \u201cprior kernel\u201d that modulates the covariance.\nAssuming that $k_\\theta(y, y) = 1$, this kernel has the form\n\n$$[K_{\\mathcal{S}}]_{n,n'} = k_\\theta(y_n, y_{n'})\\, \\delta \\sqrt{\\lambda_n} \\sqrt{\\lambda_{n'}}, \\tag{2}$$\n\nwhere $n, n' \\in \\mathcal{S}$ and $\\lambda_n$ is the intensity measure of the Poisson process for the individual spiking\nbehavior of neuron $n$. We can use these intensities to modulate the DPP with a GLM by allowing\nthe $\\lambda_n$ to depend on a weighted time-varying stimulus. We denote the stimulus at time $t$ by a\nvector $x_t \\in \\mathbb{R}^K$ and neuron-specific weights as $w_n \\in \\mathbb{R}^K$, leading to instantaneous rates:\n\n$$\\lambda_n^{(t)} = \\exp\\{x_t^\\top w_n\\}. \\tag{3}$$\n\nThis leads to a stimulus dependent kernel for the DPP L-ensemble:\n\n$$[K_{\\mathcal{S}}^{(t)}]_{n,n'} = k_\\theta(y_n, y_{n'})\\, \\delta \\exp\\left\\{ \\tfrac{1}{2} x_t^\\top (w_n + w_{n'}) \\right\\}. \\tag{4}$$\n\nIt is convenient to denote the diagonal matrix $\\Pi^{(t)} = \\operatorname{diag}\\big(\\sqrt{\\lambda_1^{(t)}}, \\sqrt{\\lambda_2^{(t)}}, \\cdots, \\sqrt{\\lambda_N^{(t)}}\\big)$, as well as\nthe $S_t$-restricted submatrix $\\Pi^{(t)}_{S_t}$, where $S_t$ indexes the rows of $\\Pi$ corresponding to the subset of\nneurons that spiked at time $t$. We can now write the joint probability of the spike history as\n\n$$\\Pr(\\{S_t\\}_{t=1}^T \\mid \\{w_n, y_n\\}_{n=1}^N, \\{x_t\\}_{t=1}^T, \\theta) = \\prod_{t=1}^{T} \\frac{|\\delta\\, \\Pi^{(t)}_{S_t} K_{S_t} \\Pi^{(t)}_{S_t}|}{|\\delta\\, \\Pi^{(t)}_{\\mathcal{S}} K_{\\mathcal{S}} \\Pi^{(t)}_{\\mathcal{S}} + I_N|}. \\tag{5}$$\n\nThe generalized linear model now modulates the marginal rates, while the determinantal point\nprocess induces inhibition. This is similar to unary versus pairwise potentials in a Markov random field.\nNote also that as the influence of the DPP goes to zero, $K_{\\mathcal{S}}$ tends toward the identity matrix and\nthe probability of neuron $n$ firing becomes (for $\\delta \\ll 1$) $\\delta \\lambda_n^{(t)}$, which recovers the basic GLM. The\nlatent embeddings $y_n$ and weights $w_n$ can now be learned so that the appropriate balance is found\nbetween stimulus dependence and inhibition due to, e.g., overlapping receptive fields.\n3.2 Learning\nWe learn the model parameters $\\{w_n, y_n\\}_{n=1}^N$ from data by maximizing the likelihood in Equation 5.\nThis optimization is performed using stochastic gradient descent on mini-batches of time slices.\nThe computational complexity of learning the model is asymptotically dominated by the cost of\ncomputing the determinants in the likelihood, which are $O(N^3)$ in this model. This was not a\nlimiting factor in this work, as we model a population of 31 neurons. Fitting this model for 31\nneurons in Section 4.3 with approximately eighty thousand time bins requires approximately three\nhours using a single core of a typical desktop computer. The cubic scaling of determinants in this\nmodel will not be a realistic limiting factor until it is possible to record from tens of\nthousands of neurons simultaneously. 
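To make one factor of Equation 5 concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The function names det and slice_log_prob and the toy inputs mentioned afterwards are assumptions introduced for illustration. The sketch builds the L-ensemble matrix delta * Pi K Pi, takes the determinant of the submatrix for the neurons that spiked, and normalizes by the determinant of the full matrix plus the identity. The hand-written elimination makes the cubic cost discussed above visible.

```python
import math


def det(matrix):
    '''Determinant by Gaussian elimination with partial pivoting, O(N^3).'''
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def slice_log_prob(spiked, rates, kernel, delta):
    '''Log-probability of one time slice, i.e. one factor of Equation 5.

    spiked: indices of the neurons that spiked in this slice (S_t)
    rates:  intensities lambda_n^(t) for all N neurons
    kernel: N x N matrix of k_theta(y_n, y_m), ones on the diagonal
    delta:  bin duration
    '''
    n = len(rates)
    root = [math.sqrt(r) for r in rates]
    # ens = delta * Pi K Pi, the stimulus-dependent L-ensemble matrix.
    ens = [[delta * root[i] * kernel[i][j] * root[j] for j in range(n)]
           for i in range(n)]
    numer = det([[ens[i][j] for j in spiked] for i in spiked])
    denom = det([[ens[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                 for i in range(n)])
    if numer <= 0.0:
        return float('-inf')  # this configuration has zero probability
    return math.log(numer) - math.log(denom)
```

With an identity kernel the slice probability factorizes into independent per-neuron terms with firing probability delta * lambda_n / (1 + delta * lambda_n), which is close to delta * lambda_n for small bins, in line with the GLM limit described above. Summing the exponentiated result over every subset of neurons gives one, which is a convenient sanity check for any reimplementation.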
Nevertheless, at these extremes there are promising methods\nfor scaling the DPP using low rank approximations of $K_{\\mathcal{S}}$ (Affandi et al., 2013) or expressing them\nin the dual representation when using a linear covariance (Kulesza and Taskar, 2011).\n3.3 Gain and Contrast Normalization\nThere is increasing evidence that neural responses are normalized or scaled by a common factor such\nas the summed activations across a pool of neurons (Carandini and Heeger, 2012). Many\ncomputational models of neural activity include divisive normalization as an important component\n(Wainwright et al., 2002). Such normalization can be captured in our model through scaling the individual\nneuron spiking rates by a stimulus-dependent multiplicative constant $\\nu_t > 0$:\n\n$$\\Pr(S_t \\mid \\{w_n, y_n\\}_{n=1}^N, x_t, \\theta, \\nu_t) = \\frac{|\\nu_t \\delta\\, \\Pi^{(t)}_{S_t} K_{S_t} \\Pi^{(t)}_{S_t}|}{|\\nu_t \\delta\\, \\Pi^{(t)}_{\\mathcal{S}} K_{\\mathcal{S}} \\Pi^{(t)}_{\\mathcal{S}} + I_N|}, \\tag{6}$$\n\nwhere $\\nu_t = \\exp\\{x_t^\\top w_\\nu\\}$. We learn these parameters $w_\\nu$ jointly with the other model parameters.\n3.4 Modeling the Influence of Periodic Phenomena\nNeuronal spiking is known to be heavily influenced by periodic phenomena. For example, in our\nempirical analysis in Section 4.3 we apply the model to the spiking of neurons in the hippocampus\nof behaving rats. Csicsvari et al. (1999) observe that the theta rhythm plays a significant role in\ndetermining the spiking behavior of the neurons in these data, with neurons spiking in phase with\nthe 4 Hz periodic signal. Thus, the firing patterns of neurons that fire in phase can be expected to\nbe highly correlated while those which fire out of phase will be strongly anti-correlated. In order to\nincorporate the dependence on a periodic signal into our model, we add to $\\lambda_n^{(t)}$ a periodic term that\nmodulates the individual neuron spiking rates with a frequency $f$, a phase $\\varphi$, and a neuron-specific\namplitude or scaling factor $\\rho_n$,\n\n$$\\lambda_n^{(t)} = \\exp\\left\\{ x_t^\\top w_n + \\rho_n \\sin(f t + \\varphi) \\right\\}, \\tag{7}$$\n\nwhere $t$ is the time at which the spikes occurred. Note that if desired one can easily manipulate\nEquation 7 to have each of the neurons modulated by an individual frequency, $a_i$, and offset $b_i$.\nAlternatively, we can create a mixture of $J$ periodic components, modeling for example the influence\nof the theta and gamma rhythms, by adding a sum over components,\n\n$$\\lambda_n^{(t)} = \\exp\\left\\{ x_t^\\top w_n + \\sum_{j=1}^{J} \\rho_{jn} \\sin(f_j t + \\varphi_j) \\right\\}. \\tag{8}$$\n\n[Figure 1 panels: (a) Sliding Bar, (b) Random Spiking, (c) Gain Control. Axes: Order in 1D Retina against Latent Value in (a) and (b), and against Gain Weight in (c).]\n\nFigure 1: Results of the simulated moving bar experiment (1a) compared to independent spiking behavior (1b).\nNote that in 1a the model puts neighboring neurons within the unit length scale while it puts others at least one\nlength scale apart. 1c demonstrates the weights, $w_\\nu$, of the gain component learned if up to 5x random gain is\nadded to the stimulus at retina locations 6-12.\n\n4 Experiments\nIn this section we present an empirical analysis of the model developed in this paper. We first\nevaluate the model on a set of simulated experiments to examine its ability to capture inhibition in\nthe latent variables while learning the stimulus weights and gain normalization. We then train the\nmodel on recorded rat hippocampal data and evaluate its ability to capture the properties of groups of\ninteracting neurons. 
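Before turning to the experimental details, the rate model of Equations 3, 7 and 8 and the gain term of Section 3.3 can be illustrated with a small sketch in plain Python. This is an illustration under stated assumptions rather than the authors' code: the names spike_rate and gain and the tuple layout (rho, freq, phase) are hypothetical, and freq is treated as an angular frequency in radians per unit time because the equations write sin(f t + phi). If f is meant in Hz, a factor of 2 pi would be needed.

```python
import math


def spike_rate(x_t, w_n, t, components=()):
    '''Rate of Equation 8; a single component gives Equation 7,
    and no components gives the plain GLM rate of Equation 3.

    x_t:        stimulus vector at time t
    w_n:        stimulus weights of neuron n
    components: iterable of (rho, freq, phase) tuples (assumed layout)
    '''
    drive = sum(x * w for x, w in zip(x_t, w_n))
    periodic = sum(rho * math.sin(freq * t + phase)
                   for rho, freq, phase in components)
    return math.exp(drive + periodic)


def gain(x_t, w_nu):
    '''Stimulus-dependent gain nu_t = exp(x_t . w_nu) of Section 3.3.'''
    return math.exp(sum(x * w for x, w in zip(x_t, w_nu)))
```

Because the gain multiplies the whole L-ensemble matrix in Equation 6, applying it is equivalent to replacing each rate lambda_n by nu_t * lambda_n before the kernel is formed.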
In all experiments we compute $K_{\\mathcal{S}}$ with the Mat\u00e9rn 5/2 kernel (see Rasmussen\nand Williams (2006) for an overview) with a fixed unit length scale (which determines the overall\nscaling of the latent space).\n4.1 Simulated Moving Bar\nWe first consider an example simulated problem where twelve neurons are configured in order along\na one dimensional retinotopic map and evaluate the ability of the DPP to learn latent representations\nthat reflect their inhibitive properties. Each neuron has a receptive field of a single pixel and the\nneurons are stimulated by a three pixel wide moving bar. The bar is slid one pixel at each time step\nfrom the first to last neuron, and this is repeated twenty times. Of the three neighboring neurons\nexposed to the bar, all receive high spike intensity but due to neural inhibition, only the middle one\nspikes. A small amount of random background stimulus is added as well, causing some neurons to\nspike without being stimulated by the moving bar. We train the DPP specified above on the resulting\nspike trains, using the stimulus of each neuron as the Poisson intensity measure and visualize the\none-dimensional latent representation, $y$, for each neuron. This is compared to the case where all\nneurons receive random stimulus and spike randomly and independently when the stimulus is above\na threshold. The resulting learned latent values for the neurons are displayed in Figure 1. We see\nin Figure 1a that the DPP prefers neighboring neurons to be close in the latent space, because they\ncompete when the moving bar stimulates them. To demonstrate the effect of the gain and contrast\nnormalization we now add random gain of up to 5x to the stimulus only at retina locations 6-12 and\nretrain the model while learning the gain component. 
In Figure 1c we see that the model learns to\nuse the gain component to normalize these inputs.\n4.2 Digits Data\nNow we use a second simulated experiment to examine the ability of the model to capture structure\nencoding inhibitory interactions in the latent representation while learning the stimulus dependent\nprobability of spiking from data. This experiment includes thirty simulated neurons, each with a\ntwo dimensional latent representation, i.e., $N = 30$, $y_n \\in \\mathbb{R}^2$. The stimuli are 16\u00d716 images of\nhandwritten digits from the MNIST data set, presented sequentially, one per \u201ctime slice\u201d. In the\ndata, each of the thirty neurons is specialized to one digit class, with three neurons per digit. When\na digit is presented, two neurons fire among the three: one that fires with probability one, and one\nof the remaining two fires with uniform probability. Thus, we expect three neurons to have strong\nprobability of firing when the stimulus contains their preferred digit; however, one of the neurons\ndoes not spike due to competition with another neuron. We expect the model to learn this inhibition\nby moving the neurons close together in the latent space. Examining the learned stimulus weights\nand latent embeddings, shown in Figures 2a and 2b respectively, we see that this is indeed the\ncase. This scenario highlights a major shortcoming of the coupled GLM. For each of the inhibitory\npairs of neurons, both will simultaneously receive strong stimulus but the conditional independence\nassumption will not hold; past spiking behavior cannot indicate that only one can spike.\n\n[Figure 2 panels: (a) Stimulus Weights, (b) 2D Latent Embedding.]\n\nFigure 2: Results of the digits experiment. A visualization of the neuron specific weights $w_n$ (2a) and latent\nembedding (2b) learned by the DPP. In (2b) each blue number indicates the position of the neuron that always\nfires for that specific digit, and the red and green numbers indicate the neurons that respond to that digit but\ninhibit each other. We observe in (2b) that inhibitory pairs of neurons, the red and green pairs, are placed\nextremely close to each other in the DPP\u2019s learned latent space while neurons that spike simultaneously (the blue\nand either red or green) are distant. This scenario emphasizes the benefit of having an inhibitory dependence\nbetween neurons. The coupled GLM cannot model this scenario well because both neurons of the inhibitory\npair receive strong stimulus but there is no indication from past spiking behavior which neuron will spike.\n\n4.3 Hippocampus Data\nAs a final experiment, we empirically evaluate the proposed model on multichannel recordings from\nlayer CA1 of the right dorsal hippocampus of awake behaving rats (Mizuseki et al., 2009; Csicsvari\net al., 1999). The data consist of spikes recorded from 31 neurons across four shanks during open\nfield tasks as well as the synchronized positions of two LEDs on the rat\u2019s head. The extracted positions\nand orientations of the rat\u2019s head are binned into twenty-five discrete location and twelve orientation\nbins which are input to the model as the stimuli. Approximately twenty-seven minutes of spike\nrecording data was divided into time slices of 20 ms. The data are hypothesized to consist of spiking\noriginating from two classes of neurons, pyramidal cells and interneurons (Csicsvari et al., 1999),\nwhich are largely separable by their firing rates. Csicsvari et al. (1999) found that interneurons fire\nat a rate of 14 \u00b1 1.43 Hz and pyramidal cells at 1.4 \u00b1 0.01 Hz. Interneurons are known to inhibit\npyramidal cells, so we expect interesting inhibitory interactions and anti-correlated spiking between\nthe pyramidal cells. In our qualitative analysis we visualize the data by the firing rates of the\nneurons to see if the model learns this dichotomy.\n\n[Figure 3 panels: (a) Kernel Matrix, $K_{\\mathcal{S}}$; (b) Stimulus Weights, $w_n$; (c) $w_\\nu$; (d) $w_{n=3}$. Axis labels: Neuron Index, Stimulus Index, Location Grid, Orientations.]\n\nFigure 3: Visualizations of the parameters learned by the DPP on the Hippocampal data. Figure 3a shows a\nvisualization of the kernel matrix $K_{\\mathcal{S}}$. Dark colored entries of $K_{\\mathcal{S}}$ indicate a strong pairwise inhibition while\nlighter ones indicate no inhibition. The low frequency neurons, pyramidal cells, are strongly anti-correlated\nwhich is consistent with the notion that they are inhibited by a common source such as an interneuron. Figure 3b\nshows the (normalized) weights, $w_n$, learned from the stimulus feature vectors, which consist of concatenated\nlocation and orientation bins, to each neuron\u2019s Poisson spike rate $\\lambda_n^{(t)}$. An interesting observation is that the\ntwo highest frequency neurons, interneurons, have little dependence on any particular stimulus and are strongly\nanti-correlated with a large group of low frequency pyramidal cells. 3c shows the weights, $w_\\nu$, to the gain\ncontrol, $\\nu$, and 3d shows a visualization of the stimulus weights for a single neuron $n = 3$ organized by\nlocation and orientation bins. In 3a and 3b the neurons are ordered by their firing rates. In 3d we see that the\nneuron is stimulated heavily by a specific location and orientation.\n\n[Figure 4 panels: (a) Latent embedding of neurons, (b) Latent embedding of neurons (zoomed). Color scale: Spike Rate (Hz).]\n\nFigure 4: A visualization of the two dimensional latent embeddings, $y_n$, learned for each neuron. Figure 4b\nshows 4a zoomed in on the middle of the figure. Each dot indicates the latent value of a neuron. The color\nof the dots represents the empirical spiking rate of the neuron, the number indicates the depth of the neuron\naccording to its position along the shank, from 0 (shallow) to 7 (deep), and the letter denotes which of four\ndistinct shanks the neuron\u2019s spiking was read from. We observe that the higher frequency interneurons are\nplaced distant from each other but in a configuration such that they inhibit the low frequency pyramidal cells.\n\n[Figure 5 panels: (a) Single periodic component, (b) Two component mixture, (c) (Csicsvari et al., 1999). Axis labels: Time (seconds), 4 Hz Phase. Legend: Low Spike Rate (Pyr), High Spike Rate (Int).]\n\nFigure 5: A visualization of the periodic component learned by our model. In 5a, the neurons share a single\nlearned periodic frequency and offset but each learns an individual scaling factor $\\rho_n$, and 5b shows the average\ninfluence of the two component mixture on the high and low spike rate neurons. In 5c we provide a reproduction\nfrom (Csicsvari et al., 1999) for comparison. In 5a the neurons are colored by firing rate from light (high) to\ndark (low). Note that the model learns a frequency that is consistent with the approximately 4 Hz theta rhythm\nand there is a dichotomy in the learned amplitudes, $\\rho$, that is consistent with the influence of the theta rhythm\non pyramidal cells and interneurons.\n\nModel | Valid Log Likelihood | Train Log Likelihood\nOnly Latent | \u22123.79 | \u22123.68\nOnly Stimulus | \u22123.17 | \u22123.29\nStimulus + Periodic + Latent | \u22123.07 | \u22122.91\nStimulus + Gain + Periodic | \u22123.04 | \u22122.92\nStimulus + Gain | \u22122.95 | \u22122.84\nStimulus + Periodic + Gain + Latent | \u22122.74 | \u22122.63\nStimulus + 2\u00d7Periodic + Gain + Latent | \u22122.07 | \u22121.96\n\nTable 1: Model log likelihood on the held out validation set and training set for various combinations of\ncomponents. We found the algorithm to be extremely stable. Each model configuration was run 5 times with\ndifferent random initializations and the variance of the results was within $10^{-8}$.\n\nFigures 3, 4 and 5a show visualizations of the parameters learned by the model with a single periodic\ncomponent according to Equation 7. Figure 3 shows the kernel matrix $K_{\\mathcal{S}}$ corresponding to the\nlatent embeddings in Figure 4 and the stimulus and gain control weights learned by the model. In\nFigure 4 we see the two dimensional embeddings, $y_n$, learned for each neuron by the same model.\nIn Figure 5 we see the periodic components learned for individual neurons on the hippocampal\ndata according to Equation 7 when the frequency term $f$ and offset $\\varphi$ are shared across neurons.\nHowever, the scaling terms $\\rho_n$ are learned for each neuron, so the neurons can each determine the\ninfluence of the periodic component on their spiking behavior. Although the parameters are all\nrandomly initialized at the start of learning, the single frequency signal learned is of approximately\n4 Hz which is consistent with the theta rhythm that Mizuseki et al. (2009) empirically observed in\nthese data. In Figures 5a and 5b we see that each neuron\u2019s amplitude component depends strongly\non the neuron\u2019s firing rate. This is also consistent with the observations of Csicsvari et al. 
(1999)\nthat interneurons and pyramidal cells are modulated by the theta rhythm at different amplitudes. We\nfind a strong similarity between the periodic influence learned by our two component model (5b) and\nthat in the reproduced figure (5c) from Csicsvari et al. (1999).\nIn Table 1 we present the log likelihood of the training data and withheld validation data under\nvariants of our model after learning the model parameters. The validation data consists of the last\nfull minute of recording which is 3,000 consecutive 20 ms time slices. We see that the likelihood of\nthe validation data under our model increases as each additional component is added. Interestingly,\nadding a second component to the periodic mixture greatly increases the model log likelihood.\nFinally, we conduct a leave-one-neuron-out prediction experiment on the validation data to compare\nthe proposed model to the coupled GLM. A spike is predicted if it increases the likelihood under\nthe model and the accuracy is averaged over all neurons and time slices in the validation set. We\ncompare GLMs with the periodic component, gain, stimulus and coupling filters to our DPP with the\nlatent component. The models did not differ significantly in the correct prediction of when neurons\nwould not spike, i.e., both were 99% correct. However, the DPP predicted 21% of spikes correctly\nwhile the GLM predicted only 5.5% correctly. This may be counterintuitive, as one may not expect a\nmodel for inhibitory interactions to improve prediction of when spikes do occur. However, the GLM\npredicts almost no spikes (483 spikes of a possible 92,969), possibly due to its inability to capture\nhigher order inhibitory structure. 
As an example scenario, in a one-of-N neuron firing case the GLM\nmay prefer to predict that nothing fires (rather than incorrectly predict multiple spikes) whereas the\nDPP can actually condition on the behavior of the other neurons to determine which neuron fired.\n\n5 Conclusion\n\nIn this paper we presented a novel model for neural spiking data from populations of neurons that is\ndesigned to capture the inhibitory interactions between neurons. The model is empirically validated\non simulated experiments and rat hippocampal neural spike recordings. In an analysis of the model\nparameters fit to the hippocampus data, we see that it indeed learns known structure and interactions\nbetween neurons. The model is able to accurately capture the known interaction between a\ndichotomy of neuron classes, and the learned frequency component reflects the true modulation of these\nneurons by the theta rhythm.\nThere are numerous possible extensions that would be interesting to explore. A defining feature of\nthe DPP is an ability to model inhibitory relationships in a neural population; excitatory connections\nbetween neurons are modeled through the lack of inhibition. Excitatory relationships could be\nmodeled by incorporating an additional process, such as a Gaussian process, but integrating the\ntwo processes would require some care. Also, a limitation of the current approach is that time\nslices are modeled independently. Thus, neurons are not influenced by their own or others\u2019 spiking\nhistory. The DPP could be extended to include not only spikes from the current time slice but also\nneighboring time slices. This will present computational challenges, however, as the computational cost of the DPP grows with\nthe number of spikes. Finally, we see from Table 1 that the gain modulation and periodic\ncomponent are essential to model the hippocampal data. 
An interesting alternative to the periodic\nmodulation of individual neuron spiking probabilities would be to have the latent representation\nof neurons itself be modulated by a periodic component. This would thus change the inhibitory\nrelationships to be a function of the theta rhythm, for example, rather than static in time.\n\nReferences\n\nEmery N. Brown. Theory of point processes for neural systems. In Methods and Models in Neurophysics, chapter 14, pages 691\u2013726. 2005.\n\nJ. W. Pillow, J. Shlens, L. Paninski, A. Sher, A. M. Litke, E. J. Chichilnisky, and E. P. Simoncelli. Spatio-temporal correlations and visual signaling in a complete neuronal population. Nature, 454(7206):995\u2013999, August 2008.\n\nElad Schneidman, Michael J. Berry, Ronen Segev, and William Bialek. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature, 440(7087):1007\u20131012, April 2006.\n\nDavid R. Brillinger. Maximum likelihood analysis of spike trains of interacting nerve cells. Biological Cybernetics, 59(3):189\u2013200, August 1988.\n\nE. S. Chornoboy, L. P. Schramm, and A. F. Karr. Maximum likelihood identification of neural point process systems. Biological Cybernetics, 59(3):265\u2013275, 1988.\n\nLiam Paninski. Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15(4):243\u2013262, 2004.\n\nW. Truccolo, U. T. Eden, M. R. Fellows, J. P. Donoghue, and E. N. Brown. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2):1074, 2005.\n\nK. D. Harris, J. Csicsvari, H. Hirase, G. Dragoi, and G. Buzs\u00e1ki. Organization of cell assemblies in the hippocampus. Nature, 424:552\u2013555, 2003.\n\nJ. Ben Hough, Manjunath Krishnapur, Yuval Peres, and B\u00e1lint Vir\u00e1g. Determinantal processes and independence. 
Probability Surveys, 3:206\u2013229, 2006.\n\nAlex Kulesza and Ben Taskar. Determinantal point processes for machine learning. Foundations and Trends in Machine Learning, 5(2\u20133), 2012.\n\nJames Zou and Ryan P. Adams. Priors for diversity in generative latent variable models. In Advances in Neural Information Processing Systems, 2012.\n\nRaja H. Affandi, Alex Kulesza, Emily Fox, and Ben Taskar. Nystr\u00f6m approximation for large-scale determinantal processes. In Artificial Intelligence and Statistics, 2013.\n\nAlex Kulesza and Ben Taskar. Structured determinantal point processes. In Advances in Neural Information Processing Systems, 2011.\n\nMatteo Carandini and David J. Heeger. Normalization as a canonical neural computation. Nature Reviews Neuroscience, 13(1):51\u201362, January 2012.\n\nMartin J. Wainwright, Odelia Schwartz, and Eero P. Simoncelli. Natural image statistics and divisive normalization: Modeling nonlinearity and adaptation in cortical neurons. In R. Rao, B. Olshausen, and M. Lewicki, editors, Probabilistic Models of the Brain: Perception and Neural Function, chapter 10, pages 203\u2013222. MIT Press, February 2002.\n\nJ. Csicsvari, H. Hirase, A. Czurk\u00f3, A. Mamiya, and G. Buzs\u00e1ki. Oscillatory coupling of hippocampal pyramidal cells and interneurons in the behaving rat. The Journal of Neuroscience, 19(1):274\u2013287, January 1999.\n\nCarl E. Rasmussen and Christopher Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.\n\nKenji Mizuseki, Anton Sirota, Eva Pastalkova, and Gy\u00f6rgy Buzs\u00e1ki. Theta oscillations provide temporal windows for local circuit computation in the entorhinal-hippocampal loop. 
Neuron, 64(2):267\u2013280, October 2009.\n", "award": [], "sourceid": 984, "authors": [{"given_name": "Jasper", "family_name": "Snoek", "institution": "University of Toronto"}, {"given_name": "Richard", "family_name": "Zemel", "institution": "University of Toronto"}, {"given_name": "Ryan", "family_name": "Adams", "institution": "Harvard University"}]}