{"title": "Synergy and Redundancy among Brain Cells of Behaving Monkeys", "book": "Advances in Neural Information Processing Systems", "page_first": 111, "page_last": 117, "abstract": null, "full_text": "Synergy and redundancy among brain cells of behaving monkeys \n\nItay Gat* \n\nInstitute of Computer Science and Center for Neural Computation \nThe Hebrew University, Jerusalem 91904, Israel \n\nNaftali Tishby\u2020 \n\nNEC Research Institute \n4 Independence Way \nPrinceton NJ 08540 \n\nAbstract \n\nDetermining the relationship between the activity of a single nerve cell and that of an entire population is a fundamental question that bears on the basic neural computation paradigms. In this paper we apply an information theoretic approach to quantify the level of cooperative activity among cells in a behavioral context. It is possible to discriminate between synergetic activity of the cells and redundant activity, depending on the difference between the information they provide when measured jointly and the information they provide independently. We define a synergy value that is positive in the first case and negative in the second, and show that the synergy value can be measured by detecting the behavioral mode of the animal from simultaneously recorded activity of the cells. We observe that among cortical cells positive synergy can be found, while cells from the basal ganglia, active during the same task, do not exhibit similar synergetic activity. \n\n* {itay,tishby}@cs.huji.ac.il \n\u2020 Permanent address: Institute of Computer Science and Center for Neural Computation, The Hebrew University, Jerusalem 91904, Israel. \n\n1 Introduction \n\nMeasuring ways by which several neurons in the brain participate in a specific computational task can shed light on fundamental neural information processing mechanisms. While it is unlikely that complete information from any macroscopic neural tissue will ever be available, some interesting insight can be obtained from simultaneously recorded cells in the cortex of behaving animals. The question we address in this study is the level of synergy, or the level of cooperation, among brain cells, as determined by the information they provide about the observed behavior of the animal. \n\n1.1 The experimental data \n\nWe analyze simultaneously recorded units from behaving monkeys during a delayed response behavioral experiment. The data were collected at the high brain function laboratory of the Hadassah Medical School of the Hebrew University [1, 2]. In this task the monkey had to remember the location of a visual stimulus and respond by touching that location after a delay of 1-32 sec. Correct responses were rewarded by a drop of juice. In one set of recordings six micro-electrodes were inserted simultaneously into the frontal or prefrontal cortex [1, 3]. In another set of experiments the same behavioral paradigm was used and recordings were taken from the striatum, which is the first station in the basal ganglia (sub-cortical ganglia) [2]. The cells recorded in the striatum were the tonically active neurons [2], which are known to be the cholinergic inter-neurons of the striatum. These cells are known to respond to reward. \n\nThe monkeys were trained to perform the task in two alternating modes, \"Go\" and \"No-Go\" [1]. 
Both sets of behavioral modes can be detected from the recorded spike trains using several statistical modeling techniques that include Hidden Markov Models (HMM) and Post Stimulus Time Histograms (PSTH). The details of these detection methods are reported elsewhere [4, 5]. For this paper it is important to know that we can significantly detect the correct behavior; for example, in the \"Go\" vs. \"No-Go\" case correct detection is achieved about 90% of the time, where the chance level is 50% and the monkey's average performance is 95% correct on this task. \n\n2 Theoretical background \n\nOur measure of synergy level among cells is information theoretic and was recently proposed by Brenner et al. [6] for analysis of spikes generated by a single neuron. This is the first application of this measure to quantify cooperativity among neurons. \n\n2.1 Synergy and redundancy \n\nA fundamental quantity in information theory is the mutual information between two random variables $X$ and $Y$. It is defined as the cross-entropy (Kullback-Leibler divergence) between the joint distribution of the variables, $p(x, y)$, and the product of the marginal distributions $p(x)p(y)$. As such it measures the statistical dependence of the variables $X$ and $Y$. It is symmetric in $X$ and $Y$ and has the following familiar relations to their entropies [7]: \n\n$$I(X; Y) = D_{KL}[P(X, Y) \\| P(X) P(Y)] = \\sum_{x, y} P(x, y) \\log \\frac{P(x, y)}{P(x) P(y)}$$ \n$$= H(X) + H(Y) - H(X, Y) = H(X) - H(X|Y) = H(Y) - H(Y|X). \\qquad (1)$$ \n\nWhen given three random variables $X_1$, $X_2$ and $Y$, one can consider the mutual information between the joint variables $(X_1, X_2)$ and the variable $Y$, $I(X_1, X_2; Y)$ (notice the position of the semicolon), as well as the mutual informations $I(X_1; Y)$ and $I(X_2; Y)$. 
Similarly, one can consider the mutual information between $X_1$ and $X_2$ conditioned on a given value of $Y = y$, $I_y(X_1; X_2) \\equiv I(X_1; X_2|y) = D_{KL}[P(X_1, X_2|y) \\| P(X_1|y) P(X_2|y)]$, as well as its average, the conditional mutual information, \n\n$$I(X_1; X_2|Y) = \\sum_y p(y) I_y(X_1; X_2).$$ \n\nFollowing Brenner et al. [6] we define the synergy level of $X_1$ and $X_2$ with respect to the variable $Y$ as \n\n$$Syn_Y(X_1, X_2) = I(X_1, X_2; Y) - (I(X_1; Y) + I(X_2; Y)), \\qquad (2)$$ \n\nwith the natural generalization to more than two variables $X$. This expression can be rewritten in terms of entropies and conditional information as follows: \n\n$$Syn_Y(X_1, X_2) = H(X_1, X_2) - H(X_1, X_2|Y) - \\big( (H(X_1) - H(X_1|Y)) + (H(X_2) - H(X_2|Y)) \\big)$$ \n$$= \\underbrace{H(X_1|Y) + H(X_2|Y) - H(X_1, X_2|Y)}_{\\text{depends on } Y} + \\underbrace{H(X_1, X_2) - (H(X_1) + H(X_2))}_{\\text{independent of } Y}. \\qquad (3)$$ \n\nThe first group of terms equals $I(X_1; X_2|Y)$ and the second equals $-I(X_1; X_2)$, so that $Syn_Y(X_1, X_2) = I(X_1; X_2|Y) - I(X_1; X_2)$. \n\nWhen the variables exhibit a positive synergy value with respect to the variable $Y$, they jointly provide more information on $Y$ than when considered independently, as expected in synergetic cases. Negative synergy values correspond to redundancy: the variables do not provide independent information about $Y$. Zero synergy value is obtained when the variables are independent of $Y$ or when there is no change in their dependence when conditioned on $Y$. We claim that this is a useful measure of cooperativity among neurons in a given computational task. \n\nIt is clear from Eq. (3) that \n\n$$I_y(X_1; X_2) = I(X_1; X_2) \\;\\; \\forall y \\in Y \\;\\Rightarrow\\; Syn_Y(X_1, X_2) = 0, \\qquad (4)$$ \n\nsince in that case $\\sum_y P(y) I_y(X_1; X_2) = I(X_1; X_2)$. \n\nIn other words, the synergy value is nonzero only if the statistical dependence, and hence the mutual information between the variables, is affected by the value of $Y$. 
It is positive when the mutual information increases, on average, when conditioned on $Y$, and negative if this conditional mutual information decreases. Notice that the value of synergy can be both positive and negative since information, unlike entropy, is not sub-additive in the $X$ variables. \n\n3 Synergy among neurons \n\nOur measure of synergy among the units is based on the ability to detect the behavioral mode from the recorded activity, as we discuss below. As discussed above, synergy among neurons is possible only if their statistical dependence changes with time. An important case where synergy is not expected is pure \"population coding\" [8]. In this case the cells are expected to fire independently, each with its own fixed tuning curve. Our synergy value can thus be used to test whether the recorded units are indeed participating in a pure population code of this kind, as hypothesized for certain motor cortical activity. \n\nTheoretical models of the cortex that clearly predict nonzero synergy include attractor neural networks (ANN) [9] and synfire chain models (SFC) [3]. Both these models predict changes in the collective activity patterns, as neurons move between attractors in the ANN case, or when different synfire chains of activity are born or disappear in the SFC case. To the extent that such changes in the collective activity depend on behavior, nonzero synergy values can be detected. It remains an interesting theoretical challenge to estimate the quantitative synergy values for such models and compare them to observed quantities. 
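As a hedged illustration of the sign convention (a minimal sketch that is not part of the original analysis), the synergy value of Eq. (2) can be computed exactly for small discrete distributions. The two toy distributions below are assumptions chosen for clarity: in the first, Y is the XOR of two independent binary cells; in the second, both cells simply copy Y. All helper names (marginal, entropy, mutual_information, synergy, synergy_from_dependence) are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def marginal(joint, keep):
    # Sum a joint distribution {(x1, x2, y): p} over the coordinates not in keep.
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return dict(out)


def entropy(dist):
    # Shannon entropy in bits.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def mutual_information(joint, a, b):
    # I(A; B) = H(A) + H(B) - H(A, B); a and b are tuples of coordinate indices.
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))


def synergy(joint):
    # Eq. (2): Syn_Y(X1, X2) = I(X1, X2; Y) - (I(X1; Y) + I(X2; Y)).
    return (mutual_information(joint, (0, 1), (2,))
            - mutual_information(joint, (0,), (2,))
            - mutual_information(joint, (1,), (2,)))


def synergy_from_dependence(joint):
    # Equivalent form: I(X1; X2 | Y) - I(X1; X2), with
    # I(X1; X2 | Y) = H(X1, Y) + H(X2, Y) - H(X1, X2, Y) - H(Y).
    h = lambda keep: entropy(marginal(joint, keep))
    cond_mi = h((0, 2)) + h((1, 2)) - h((0, 1, 2)) - h((2,))
    return cond_mi - mutual_information(joint, (0,), (1,))


# Toy case 1 (assumed): Y = X1 XOR X2 with independent uniform binary cells.
xor_joint = {(a, b, a ^ b): 0.25 for a, b in itertools.product((0, 1), repeat=2)}
# Toy case 2 (assumed): both cells copy a uniform binary Y.
copy_joint = {(y, y, y): 0.5 for y in (0, 1)}

for name, joint in (('xor', xor_joint), ('copy', copy_joint)):
    print(name, synergy(joint), synergy_from_dependence(joint))
```

Under these assumptions the XOR case gives a synergy of +1 bit and the copy case gives -1 bit (redundancy), and the two ways of computing the value agree.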
\n\n3.1 Time-dependent cross correlations \n\nIn our previous studies [4] we demonstrated, using hidden Markov models of the activity, that the pairwise cross-correlations in the same data can change significantly with time, depending on the underlying collective state of activity. These states, revealed by the hidden Markov model, in turn depend on the behavior and enable its prediction. Dramatic and fast changes in the cross-correlation of cells have also been shown by others [10]. This finding indicates directly that the statistical dependence of the neurons can change (rapidly) with time, in a way correlated with behavior. This clearly suggests that nonzero synergy should be observed among these cortical units, relative to this behavior. In the present study this theoretical hypothesis is verified. \n\n3.2 Redundancy cases \n\nIf, on the other hand, the conditioned mutual information equals zero for all behavioral modes, i.e. $I_y(X_1; X_2) = 0 \\;\\; \\forall y \\in Y$, while $I(X_1; X_2) > 0$, we expect to get negative synergy, or redundancy among the cells, with respect to the behavior variable $Y$. We observed clear redundancy in another part of the brain, the basal ganglia, during the same experiment, when the behavior was the pre-reward and post-reward activity. In this case different cells provide exactly the same information, which yields negative synergy values. \n\n4 Experimental results \n\n4.1 Synergy measurement in practice \n\nTo evaluate the synergy value among different cells, it is necessary to estimate the conditional distribution $p(y|x)$, where $y$ is the current behavior and $x$ represents a single trial of spike trains of the considered cells. 
Estimating this probability, however, requires an underlying statistical model, or a representation of the spike trains. Otherwise there is never enough data, since cortical spike trains are never exactly reproducible. In this work we choose the rate representation, which is the simplest to evaluate. The estimation of $p(y|x)$ goes as follows: \n\n\u2022 For each of the $M$ behavioral modes $(y_1, y_2, \\ldots, y_M)$ collect spike train samples (the training data set). \n\n\u2022 Using the training sample, construct a Post Stimulus Time Histogram (PSTH), i.e. the rate as a function of time, for each behavioral mode. \n\n\u2022 Given a spike train outside of the training set, compute its probability of being the result of each of the $M$ modes. \n\n\u2022 The spike train is considered correctly classified if the most probable mode is in fact the true behavioral mode, and incorrectly classified otherwise. \n\n\u2022 The fraction of correct classifications, for all spike trains of a given behavioral mode $y_i$, is taken as the estimate of $p(y_i|x)$ and denoted $P_{c_i}$, where $c_i$ is the identity of the cells used in the computation. \n\nFor the case of only two categories of behavior and for a uniform distribution of the different categories, the value of the entropy $H(Y)$ is the same for all combinations of cells, and is simply $H(Y) = -\\sum_y p(y) \\log_2 p(y) = \\log_2 2 = 1$. Using $I(X; Y) = H(Y) - H(Y|X)$ in Eq. (2), the full expression (in bits) for the synergy value can thus be written as follows: \n\n$$Syn_Y(c_1, c_2) = \\Big[ 1 - \\sum_x p(x) \\Big( -\\sum_y P_{c_1, c_2} \\log_2 P_{c_1, c_2} \\Big) \\Big] - \\Big[ 2 - \\sum_x p(x) \\Big( -\\sum_y P_{c_1} \\log_2 P_{c_1} \\Big) - \\sum_x p(x) \\Big( -\\sum_y P_{c_2} \\log_2 P_{c_2} \\Big) \\Big]. \\qquad (5)$$ \n\nIf the first bracket (the joint information) is larger than the second (the sum of the single-cell information values), there is (positive) synergy, and vice versa for redundancy. However, there is one very important caveat. 
As we saw, the computation of the mutual information is not done exactly, and what one really computes is only a lower bound. If the bound is tighter for the multiple cell calculation, the method could falsely infer positive synergy, and if the bound is tighter for the single cell computation, the method could falsely infer negative synergy. In previous works we have shown that the method we use for this estimation is quite reasonable and robust [5]; therefore, we believe that we have even a conservative (i.e. less positive) estimate of synergy. \n\n4.2 Observed synergy values \n\nIn the first set of experiments we tried to detect the behavioral mode during the delay period of correct trials. In this case the two types of behavior were the \"Go\" and the \"No-Go\" modes described in the introduction. An example of this detection problem is given in figure 1A. In this figure there are 100 examples of multi-electrode recording of spike trains during the delay period. On the left is the \"Go mode\" data and on the right the \"No-Go mode\" data, for two cells. On the lower part there is an example of two single spike trains that need to be classified by the mode models. \n\n[Figure 1 image not reproduced. Panel A: \"Go mode\" and \"No-Go mode\" rasters with single trials no. 1 and no. 2; panel B: \"Pre-reward\" and \"Post-reward\" rasters with a single trial.] \n\nFigure 1: Raster displays of simultaneously recorded cells in the 2 different areas; in each area there were 2 behavioral modes. \n\nTable 1 gives some examples of detection results obtained by using 2 cells independently, and by using their joint combination. It can be seen that the synergy is positive and significant. We examined 19 recording sessions of the same behavioral modes for two different animals and evaluated the synergy value. In 18 out of the 19 sessions there was at least one example of significant positive synergy among the cells. \n\nFor comparison we analyzed another set of experiments in which the data were recorded from the striatum in the basal ganglia. An example of this detection is shown in figure 1B. The behavioral modes were the \"pre-reward\" vs. the \"post-reward\" periods. 
Nine recording sessions for the two different monkeys were examined using the same detection technique. Although the detection results improve when the number of cells increases, in none of these recordings was a positive synergy value found. For most of the data the synergy value was close to zero, i.e. the mutual information of two cells jointly was close to the sum of the mutual information of the independent cells, as expected when the cells exhibit (conditionally) independent activity. \n\nThe prevailing difference between the synergy measurements in the cortex and in the tonically active neurons (TANs) of the basal ganglia is also strengthened by the different mechanisms underlying those cells. The TANs are assumed to be global mediators of information in the striatum, a relatively simple task, whereas the information processed in the frontal cortex in this task is believed to be much more collective and complicated. Here we suggest a first handle for quantitative detection of such different neuronal activities. \n\nAcknowledgments \n\nSpecial thanks are due to Moshe Abeles for his encouragement and support, and to William Bialek for suggesting the idea to look for synergy among cortical cells. We would also like to thank A. Raz, Hagai Bergman, and Eilon Vaadia for sharing their data with us. The research at the Hebrew University was supported in part by a grant from the United States Israeli Binational Science Foundation (BSF). \n\nTable 1: Examples of synergy among cortical neurons. For each example the mutual information of each cell separately is given together with the mutual information of the pair. In parentheses the matching detection probability (average over $p(y|x)$) is also given. 
The last column gives the percentage of increase from the mutual information of the single cells to the mutual information of the pair. The table gives only those pairs for which the percentage was larger than 20% and the detection rate higher than 60%. \n\nSession  Cells  Cell 1         Cell 2         Both cells     Syn (%) \nb116b    5,6    0.068 (64.84)  0.083 (66.80)  0.209 (76.17)  38 \nbl21b    1,4    0.201 (73.74)  0.118 (69.70)  0.497 (87.88)  56 \nbl21b    3,4    0.082 (66.67)  0.118 (69.70)  0.240 (77.78)  20 \nbl26b    0,3    0.062 (62.63)  0.077 (66.16)  0.198 (75.25)  42 \nbl26b    1,2    0.030 (60.10)  0.051 (63.13)  0.148 (72.22)  82 \ncl77b    2,3    0.054 (62.74)  0.013 (61.50)  0.081 (68.01)  20 \ncr38b    0,2    0.074 (65.93)  0.058 (63.19)  0.160 (73.08)  21 \ncr38b    0,4    0.074 (65.93)  0.042 (62.09)  0.144 (71.98)  24 \ncr38b    3,4    0.051 (62.09)  0.042 (62.09)  0.111 (69.23)  20 \ncr43b    0,1    0.070 (65.00)  0.063 (64.44)  0.181 (74.44)  36 \n\nReferences \n\n[1] M. Abeles, E. Vaadia, H. Bergman, Firing patterns of single units in the prefrontal cortex and neural-network models, Network 1 (1990). \n\n[2] A. Raz et al., Neuronal synchronization of tonically active neurons in the striatum of normal and parkinsonian primates, J. Neurophysiol. 76:2083-2088 (1996). \n\n[3] M. Abeles, Corticonics (Cambridge University Press, 1991). \n\n[4] I. Gat, N. Tishby and M. Abeles, Hidden Markov modeling of simultaneously recorded cells in the associative cortex of behaving monkeys, Network 8:297-322 (1997). \n\n[5] I. Gat, N. Tishby, Comparative study of different supervised detection methods of simultaneously recorded spike trains, in preparation. \n\n[6] N. Brenner, S.P. Strong, R. Koberle, W. Bialek, and R. 
de Ruyter van Steveninck, The Economy of Impulses and the Stiffness of Spike Trains, NEC Research Institute Technical Note (1998). \n\n[7] T.M. Cover and J.A. Thomas, Elements of Information Theory (Wiley, NY, 1991). \n\n[8] A.P. Georgopoulos, A.B. Schwartz, R.E. Kettner, Neuronal Population Coding of Movement Direction, Science 233:1416-1419 (1986). \n\n[9] D.J. Amit, Modeling Brain Function (Cambridge University Press, 1989). \n\n[10] E. Ahissar et al., Dependence of Cortical Plasticity on Correlated Activity of Single Neurons and on Behavioral Context, Science 257:1412-1415 (1992). \n\n", "award": [], "sourceid": 1611, "authors": [{"given_name": "Itay", "family_name": "Gat", "institution": null}, {"given_name": "Naftali", "family_name": "Tishby", "institution": null}]}