{"title": "Adaptive Synchronization of Neural and Physical Oscillators", "book": "Advances in Neural Information Processing Systems", "page_first": 109, "page_last": 116, "abstract": null, "full_text": "Adaptive Synchronization of Neural and Physical Oscillators \n\nKenji Doya \n\nUniversity of California, San Diego \n\nLa Jolla, CA 92093-0322, USA \n\nShuji Yoshizawa \n\nUniversity of Tokyo \n\nBunkyo-ku, Tokyo 113, Japan \n\nAbstract \n\nAnimal locomotion patterns are controlled by recurrent neural networks called central pattern generators (CPGs). Although a CPG can oscillate autonomously, its rhythm and phase must be well coordinated with the state of the physical system using sensory inputs. In this paper we propose a learning algorithm for synchronizing neural and physical oscillators with specific phase relationships. Sensory input connections are modified by the correlation between cellular activities and input signals. Simulations show that the learning rule can be used for setting sensory feedback connections to a CPG as well as coupling connections between CPGs. \n\n1  CENTRAL AND SENSORY MECHANISMS IN LOCOMOTION CONTROL \n\nPatterns of animal locomotion, such as walking, swimming, and flying, are generated by recurrent neural networks that are located in segmental ganglia of invertebrates and spinal cords of vertebrates (Barnes and Gladden, 1985). These networks can produce basic rhythms of locomotion without sensory inputs and are called central pattern generators (CPGs). The physical systems of locomotion, such as legs, fins, and wings combined with physical environments, have their own oscillatory characteristics. Therefore, in order to realize efficient locomotion, the frequency and the phase of oscillation of a CPG must be well coordinated with the state of the physical system. 
For example, the bursting patterns of motoneurons that drive a leg muscle must be coordinated with the configuration of the leg, its contact with the ground, and the state of other legs. \n\nThe oscillation pattern of a CPG is largely affected by proprioceptive inputs. It has been shown in crayfish (Siller et al., 1986) and lamprey (Grillner et al., 1990) that the oscillation of a CPG is entrained by cyclic stimuli to stretch sensory neurons over a wide range of frequencies. Both negative and positive feedback pathways are found in those systems. Elucidation of the function of the sensory inputs to CPGs requires computational studies of neural and physical dynamical systems. Algorithms for the learning of rhythmic patterns in recurrent neural networks have been derived by Doya and Yoshizawa (1989), Pearlmutter (1989), and Williams and Zipser (1989). In this paper we propose a learning algorithm for synchronizing a neural oscillator to rhythmic input signals with a specific phase relationship. \n\nIt is well known that coupling between nonlinear oscillators can entrain their frequencies. The relative phase between oscillators is determined by the parameters of coupling and the difference of their intrinsic frequencies. For example, either in-phase or anti-phase oscillation results from symmetric coupling between neural oscillators with similar intrinsic frequencies (Kawato and Suzuki, 1980). Efficient locomotion involves subtle phase relationships between physical variables and motor commands. Accordingly, our goal is to derive a learning algorithm that can finely tune the sensory input connections by which the relative phase between physical and neural oscillators is kept at a specific value required by the task. 
\n\n2  LEARNING OF SYNCHRONIZATION \n\nWe will deal with the following continuous-time model of a CPG network. \n\n$$\\tau_i \\frac{d}{dt} x_i(t) = -x_i(t) + \\sum_{j=1}^{C} w_{ij} g_j(x_j(t)) + \\sum_{k=1}^{S} v_{ik} y_k(t), \\qquad (1)$$ \n\nwhere $x_i(t)$ and $g_i(x_i(t))$ $(i = 1, \\ldots, C)$ represent the states and the outputs of CPG neurons and $y_k(t)$ $(k = 1, \\ldots, S)$ represents sensory inputs. We assume that the connection weights $W = \\{w_{ij}\\}$ are already established so that the network oscillates without sensory inputs. The goal of learning is to find the input connection weights $V = \\{v_{ik}\\}$ that make the network state $x(t) = (x_1(t), \\ldots, x_C(t))^{\\top}$ entrained to the input signal $y(t) = (y_1(t), \\ldots, y_S(t))^{\\top}$ with a specific relative phase. \n\n2.1  AN OBJECTIVE FUNCTION FOR PHASE-LOCKING \n\nThe standard way to derive a learning algorithm is to find an objective function to be minimized. If we can approximate the waveforms of $x_i(t)$ and $y_k(t)$ by sine waves, a linear relationship \n\n$$x(t) = P y(t)$$ \n\nspecifies a phase-locked oscillation of $x(t)$ and $y(t)$. For example, if we have $y_1 = \\sin \\omega t$ and $y_2 = \\cos \\omega t$, then the matrix $P = \\begin{pmatrix} 1 & 1 \\\\ 1 & \\sqrt{3} \\end{pmatrix}$ specifies $x_1 = \\sqrt{2} \\sin(\\omega t + \\pi/4)$ and $x_2 = 2 \\sin(\\omega t + \\pi/3)$. Even when the waveforms are not sinusoidal, minimization of an objective function \n\n$$E(t) = \\frac{1}{2} \\|x(t) - P y(t)\\|^2 = \\frac{1}{2} \\sum_{i=1}^{C} \\Big\\{ x_i(t) - \\sum_{k=1}^{S} p_{ik} y_k(t) \\Big\\}^2 \\qquad (2)$$ \n\ndetermines a specific relative phase between $x(t)$ and $y(t)$. Thus we call $P = \\{p_{ik}\\}$ a phase-lock matrix. \n\n2.2  LEARNING PROCEDURE \n\nUsing the above objective function, we will derive a learning procedure for phase-locked oscillation of $x(t)$ and $y(t)$. First, an appropriate phase-lock matrix $P$ is identified while the relative phase between $x(t)$ and $y(t)$ changes gradually in time. 
Then, a feedback mechanism can be applied so that the network state $x(t)$ is kept close to the target waveform $P y(t)$. \n\nSuppose we actually have an appropriate phase relationship between $x(t)$ and $y(t)$. Then the phase-lock matrix $P$ can be obtained by gradient descent on $E(t)$ with respect to $p_{ik}$ as follows (Widrow and Stearns, 1985). \n\n$$\\frac{d}{dt} p_{ik} = -\\eta \\frac{\\partial E(t)}{\\partial p_{ik}} = \\eta \\Big\\{ x_i(t) - \\sum_{j=1}^{S} p_{ij} y_j(t) \\Big\\} y_k(t). \\qquad (3)$$ \n\nIf the coupling between $x(t)$ and $y(t)$ is weak enough, their relative phase changes in time unless their intrinsic frequencies are exactly equal and the systems are completely noiseless. By modulating the learning coefficient $\\eta$ by some performance index of the total system, for example, the speed of locomotion, it is possible to obtain a matrix $P$ that satisfies the requirement of the task. \n\nOnce a phase-lock matrix is derived, we can keep $x(t)$ close to $P y(t)$ using the gradient of $E(t)$ with respect to the network state \n\n$$\\frac{\\partial E(t)}{\\partial x_i(t)} = x_i(t) - \\sum_{k=1}^{S} p_{ik} y_k(t).$$ \n\nThe simplest feedback algorithm is to add this term to the CPG dynamics as follows. \n\n$$\\tau_i \\frac{d}{dt} x_i(t) = -x_i(t) + \\sum_{j=1}^{C} w_{ij} g_j(x_j(t)) - \\sigma \\Big\\{ x_i(t) - \\sum_{k=1}^{S} p_{ik} y_k(t) \\Big\\}.$$ \n\nThe feedback gain $\\sigma$ $(> 0)$ must be set small enough so that the feedback term does not destroy the intrinsic oscillation of the CPG. In that case, by neglecting the small additional decay term $\\sigma x_i(t)$, we have \n\n$$\\tau_i \\frac{d}{dt} x_i(t) = -x_i(t) + \\sum_{j=1}^{C} w_{ij} g_j(x_j(t)) + \\sum_{k=1}^{S} \\sigma p_{ik} y_k(t), \\qquad (4)$$ \n\nwhich is equivalent to equation (1) with input weights $v_{ik} = \\sigma p_{ik}$. \n\n3  DELAYED SYNCHRONIZATION \n\nWe tested the above learning scheme on a delayed synchronization task: finding coupling weights between neural oscillators so that they synchronize with a specific time delay. 
We used the following coupled CPG model. \n\n$$\\tau \\frac{d}{dt} x_i^n(t) = -x_i^n(t) + \\sum_{j=1}^{C} w_{ij}^n y_j^n(t) + \\sigma \\sum_{k=1}^{C} p_{ik}^n y_k^{3-n}(t), \\qquad (5)$$ \n\n$$y_i^n(t) = g(x_i^n(t)), \\quad (i = 1, \\ldots, C),$$ \n\nwhere superscripts denote the indices of two CPGs $(n = 1, 2)$. The goal of learning was to synchronize the waveforms $y_1^1(t)$ and $y_1^2(t)$ with a time delay $\\Delta T$. We used \n\n$$z(t) = -|y_1^1(t - \\Delta T) - y_1^2(t)|$$ \n\nas the performance index. The learning coefficient $\\eta$ of equation (3) was modulated by the deviation of $z(t)$ from its running average $\\bar{z}(t)$ using the following equations. \n\n$$\\eta(t) = \\eta_0 \\{z(t) - \\bar{z}(t)\\}, \\qquad \\tau_a \\frac{d}{dt} \\bar{z}(t) = -\\bar{z}(t) + z(t). \\qquad (6)$$ \n\nFigure 1: Learning of delayed synchronization of neural oscillators. The dotted and solid curves represent $y_1^1(t)$ and $y_1^2(t)$ respectively. a: without coupling. b: $\\Delta T = 0.0$. c: $\\Delta T = 1.0$. d: $\\Delta T = 2.0$. e: $\\Delta T = 3.0$. \n\nFirst, two CPGs were trained independently to oscillate with sinusoidal waveforms of period $T_1 = 4.0$ and $T_2 = 5.0$ using continuous-time back-propagation learning (Doya and Yoshizawa, 1989). Each CPG was composed of two neurons $(C = 2)$ with time constants $\\tau = 1.0$ and output functions $g(\\cdot) = \\tanh(\\cdot)$. Instead of following the two-step procedure described in the previous section, the network dynamics (5) and the learning equations (3) and (6) were simulated concurrently with parameters $\\sigma = 0.1$, $\\eta_0 = 0.2$, and $\\tau_a = 20.0$. \n\nFigure 1 a shows the oscillation of two CPGs without coupling. 
Figures 1 b through e show the phase-locked waveforms after learning for 200 time units with different desired delay times. \n\n4  ZERO-LEGGED LOCOMOTION \n\nNext we applied the learning rule to the simplest locomotion system that involves a critical phase-lock between the state of the physical system and the motor command: a zero-legged locomotion system as shown in Figure 2 a. \n\nThe physical system is composed of a wheel and a weight that moves back and forth on a track fixed radially in the wheel. It rolls on the ground by changing its balance with the displacement of the weight. In order to move the wheel in a given direction, the weight must be moved at a specific phase with the rotation angle of the wheel. The motion equations are shown in the Appendix. \n\nFirst, a CPG network was trained to oscillate with a sinusoidal waveform of period $T = 1.0$ (Doya and Yoshizawa, 1989). The network consisted of one output and two hidden units $(C = 3)$ with time constants $\\tau_i = 0.2$ and output functions $g_i(\\cdot) = \\tanh(\\cdot)$. Next, the output of the CPG was used to drive the weight with a force $f = f_{\\max} g_1(x_1(t))$. The position $r$ and the velocity $\\dot{r}$ of the weight, the rotation angle $(\\cos\\theta, \\sin\\theta)$, and the angular velocity $\\dot{\\theta}$ of the wheel were used as sensory feedback inputs $y_k(t)$ $(k = 1, \\ldots, 5)$ after scaling to $[-1, 1]$. \n\nIn order to eliminate the effect of biases in $x(t)$ and $y(t)$, we used the following learning equations. \n\n$$\\frac{d}{dt} p_{ik} = \\eta \\Big\\{ (x_i(t) - \\bar{x}_i(t)) - \\sum_{j=1}^{S} p_{ij} (y_j(t) - \\bar{y}_j(t)) \\Big\\} (y_k(t) - \\bar{y}_k(t)),$$ \n\n$$\\tau_x \\frac{d}{dt} \\bar{x}_i(t) = -\\bar{x}_i(t) + x_i(t), \\qquad \\tau_y \\frac{d}{dt} \\bar{y}_k(t) = -\\bar{y}_k(t) + y_k(t). \\qquad (7)$$ \n\nThe rotation speed of the wheel was employed as the performance index $z(t)$ after smoothing by the following equation. \n\n$$\\tau_s \\frac{d}{dt} z(t) = -z(t) + \\dot{\\theta}(t).$$ 
\n\nThe learning coefficient $\\eta$ was modulated by equations (6). The time constants were $\\tau_x = 4.0$, $\\tau_y = 1.0$, $\\tau_s = 1.0$, and $\\tau_a = 4.0$. Each training run was started from a random configuration of the wheel and was finished after ten seconds. \n\n[Figure 2: panels a, b, and c; traces labeled pos, vel, cos, sin, and rot plotted over time from 0.0 to 6.0.] \n\nFigure 2: Learning of zero-legged locomotion. \n\nFigure 2 b is an example of the motion of the wheel without sensory feedback. The rhythms of the CPG and the physical system were not entrained to each other and the wheel wandered left and right. Figure 2 c shows an example of the wheel motion after 40 runs of training with parameters $\\eta_0 = 0.1$ and $\\sigma = 0.2$. At first, the oscillation of the CPG was slowed down by the sensory inputs and then accelerated with the rotation of the wheel in the right direction. \n\nWe compared the patterns of sensory input connections made after learning with wheels of different sizes. Table 1 shows the connection weights to the output unit. The positive connection from $\\sin\\theta$ forces the weight to the right-hand side of the wheel and stabilizes clockwise rotation. The negative connection from $\\cos\\theta$ with smaller radius speeds up the rhythm of the CPG when the wheel rotates too fast and the weight is lifted up. 
The positive input from $r$ with larger radius makes the weight stickier to both ends of the track and slows down the rhythm of the CPG. \n\nTable 1: Sensory input weights to the output unit ($p_{1k}$; $k = 1, \\ldots, 5$). \n\nradius | $r$ | $\\dot{r}$ | $\\cos\\theta$ | $\\sin\\theta$ | $\\dot{\\theta}$ \n2 cm | 0.15 | -0.53 | -1.35 | 1.32 | 0.07 \n4 cm | 0.28 | -0.55 | -1.09 | 1.22 | 0.01 \n6 cm | 0.67 | -0.21 | -0.41 | 0.98 | 0.00 \n8 cm | 0.70 | -0.33 | -0.40 | 0.92 | 0.03 \n10 cm | 0.90 | -0.12 | -0.30 | 0.93 | -0.02 \n\n5  DISCUSSION \n\nThe architectures of CPGs in lower vertebrates and invertebrates are supposed to be determined by genetic information. Nevertheless, the way an animal utilizes the sensory inputs must be adaptive to the characteristics of the physical environments and the changing dimensions of its body parts. \n\nBack-propagation through forward models of physical systems can also be applied to the learning of sensory feedback (Jordan and Jacobs, 1990). However, learning of nonlinear dynamics of locomotion systems is a difficult task; moreover, multi-layer back-propagation is not appropriate as a biological model of learning. The learning rule (7) is similar to the covariance learning rule (Sejnowski and Stanton, 1990), which is a biological model of long term potentiation of synapses. \n\nAcknowledgements \n\nThe authors thank Allen Selverston, Peter Rowat, and those who gave comments on our poster at the NIPS Conference. This work was partly supported by grants from the Ministry of Education, Culture, and Science of Japan. \n\nReferences \n\nBarnes, W. J. P. & Gladden, M. H. (1985) Feedback and Motor Control in Invertebrates and Vertebrates. Beckenham, Britain: Croom Helm. \nDoya, K. & Yoshizawa, S. (1989) Adaptive neural oscillator using continuous-time back-propagation learning. Neural Networks, 2, 375-386. \nGrillner, S. & Matsushima, T. 
(1991) The neural network underlying locomotion in Lamprey-Synaptic and cellular mechanisms. Neuron, 7(July), 1-15. \nJordan, M. I. & Jacobs, R. A. (1990) Learning to control an unstable system with forward modeling. In Touretzky, D. S. (ed.), Advances in Neural Information Processing Systems 2. San Mateo, CA: Morgan Kaufmann. \nKawato, M. & Suzuki, R. (1980) Two coupled neural oscillators as a model of the circadian pacemaker. Journal of Theoretical Biology, 86, 547-575. \nPearlmutter, B. A. (1989) Learning state space trajectories in recurrent neural networks. Neural Computation, 1, 263-269. \nSejnowski, T. J. & Stanton, P. K. (1990) Covariance storage in the Hippocampus. In Zornetzer, S. F. et al. (eds.), An Introduction to Neural and Electronic Networks, 365-377. San Diego, CA: Academic Press. \nSiller, K. T., Skorupski, P., Elson, R. C., & Bush, M. H. (1986) Two identified afferent neurones entrain a central locomotor rhythm generator. Nature, 323, 440-443. \nWidrow, B. & Stearns, S. D. (1985) Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice Hall. \nWilliams, R. J. & Zipser, D. (1989) A learning algorithm for continually running fully recurrent neural networks. Neural Computation, 1, 270-280. \n\nAppendix \n\nThe dynamics of the zero-legged locomotion system: \n\n$$m \\ddot{r} = f_0 \\Big( 1 + \\frac{m R^2 \\sin^2\\theta}{I_0} \\Big) - m g_c \\Big( \\cos\\theta + \\frac{m R \\sin^2\\theta \\, (r + R\\cos\\theta)}{I_0} \\Big) + m R \\sin\\theta \\, \\frac{(\\nu + 2 m \\dot{r} (r + R\\cos\\theta)) \\dot{\\theta}}{I_0} + m r \\dot{\\theta}^2,$$ \n\n$$I_0 \\ddot{\\theta} = -f_0 R \\sin\\theta + m g_c \\sin\\theta \\, (r + R\\cos\\theta) - (\\nu + 2 m \\dot{r} (r + R\\cos\\theta)) \\dot{\\theta},$$ \n\n$$f_0 = f_{\\max} g(x_1(t)) - u r^3 - \\mu \\dot{r},$$ \n\n$$I_0 = I + M R^2 + m (r + R\\cos\\theta)^2.$$ \n\nParameters: the masses of the weight $m = 0.2$ [kg] and the wheel $M = 0.8$ [kg]; the radius of the wheel $R = 0.02$ through $0.1$ [m]; the inertial moment of the wheel $I = \\frac{1}{2} M R^2$; the maximum force to the weight $f_{\\max} = 5$ [N]; the stiffness of the limiter of the weight $u = 20/R^3$ [N/m^3]; the damping coefficients of the weight motion $\\mu = 0.2/R$ [N/(m/s)] and the wheel rotation $\\nu = 0.05(M + m)R$ [N/(rad/s)]. \n\n", "award": [], "sourceid": 537, "authors": [{"given_name": "Kenji", "family_name": "Doya", "institution": null}, {"given_name": "Shuji", "family_name": "Yoshizawa", "institution": null}]}