{"title": "A Computational Basis for Phonology", "book": "Advances in Neural Information Processing Systems", "page_first": 372, "page_last": 379, "abstract": null, "full_text": "372 \n\nTouretzky and Wheeler \n\nA Computational Basis for Phonology \n\nDavid S. Touretzky \nSchool of Computer Science \nCarnegie Mellon University \nPittsburgh, PA 15213 \n\nDeirdre W. Wheeler \nDepartment of Linguistics \nUniversity of Pittsburgh \nPittsburgh, PA 15260 \n\nABSTRACT \n\nThe phonological structure of human languages is intricate, yet highly constrained. Through a combination of connectionist modeling and linguistic analysis, we are attempting to develop a computational basis for the nature of phonology. We present a connectionist architecture that performs multiple simultaneous insertion, deletion, and mutation operations on sequences of phonemes, and introduce a novel additional primitive, clustering. Clustering provides an interesting alternative to both iterative and relaxation accounts of assimilation processes such as vowel harmony. Our resulting model is efficient because it processes utterances entirely in parallel using only feed-forward circuitry. \n\n1  INTRODUCTION \nPhonological phenomena can be quite complex, but human phonological behavior is also highly constrained. Many operations that are easily learned by a perceptron-like sequence mapping network are excluded from real languages. For example, as Pinker and Prince (1988) point out in their critique of the Rumelhart and McClelland (1986) verb learning model, human languages never reverse the sequence of segments in a word, but this is an easy mapping for a network to learn.  
On the other hand, we note that some phonological processes that are relatively common in human languages, such as vowel harmony, appear difficult for a sequence-mapping architecture to learn. Why are only certain types of sequence operations found in human languages, and not others? We suggest that this is a reflection of the limitations of an underlying, genetically-determined, specialized computing architecture. We are searching for this architecture. \n\nOur work was initially inspired by George Lakoff's theory of cognitive phonology (Lakoff, 1988, 1989), which is in turn a development of the ideas of John Goldsmith (to appear). Lakoff proposes a three-level representation scheme. The M (morphophonemic) level represents the underlying form of an utterance, the P (phonemic) level is an intermediate form, and the F (phonetic) level is the derived surface form. \n\nLakoff uses a combination of inter-level mapping rules and intra-level well-formedness conditions to specify the relationships between P- and F-level representations and the M-level input. In a connectionist implementation, the computations performed by the mapping rules are straightforward, but we find the well-formedness conditions troubling. Goldsmith's proposal was that phonology is a goal-directed constraint satisfaction system that operates via parallel relaxation. He cites Smolensky's harmony theory (footnote 1). Lakoff has adopted this appeal to harmony theory in his description of how well-formedness conditions could work. \n\nIn our model, we further develop the Goldsmith and Lakoff mapping scheme, but we reject harmony-based well-formedness conditions for several reasons. First, harmony theory involves simulated annealing search.  
The timing constraints of real nervous systems rule out simulated annealing. Second, it is not clear how to construct an energy function for a connectionist network that performs complex discrete phonological operations. Finally, there is our desire to explain why certain types of processes occur in human languages and others do not. Harmony theory alone is too unconstrained for this purpose. \n\nWe have implemented a model called M3P (for \"Many Maps\" Model of Phonology) that allows us to account for virtually all of the phenomena in (Lakoff, 1989) using a tightly-constrained, purely-feedforward computing scheme. In the next section we describe the mapping matrix architecture that is the heart of M3P. Next we give an example of an iterative process, Yawelmani vowel harmony (footnote 2), which Lakoff models with a P-level well-formedness condition. Such a condition would have to be implemented by relaxation search for a \"minimum energy state\" in the P-level representation, which we wish to avoid. Finally, we present our alternative approach to vowel harmony, using a novel clustering mechanism that eliminates the need for relaxation. \n\n2  THE MAPPING MATRIX ARCHITECTURE \nFigure 1 is an overview of our \"many maps\" model. M-P constructions compute how to go from the M-level representation of an utterance to the P-level representation. The derivation is described as a set of explicit changes to the M-level string. M-P constructions read the segments in the M-level buffer and write the changes, phrased as mutation, deletion, and insertion requests, into slots of a buffer called P-deriv. The M-level and P-deriv buffers are then read by the M-P mapping matrix, which produces the P-level representation as its output.  
The process is repeated at the next level, with P-F constructions writing changes into an F-deriv buffer, and a P-F map deriving an F-level representation. A final step called \"canonicalization\" cleans up the representations of the individual segments. \n\n1 Smolensky's \"harmony theory\" should not be confused with the linguistic phenomenon of \"vowel harmony.\" \n2 Yawelmani is a dialect of Yokuts, an American Indian language from California. Our Yawelmani data is drawn from Kenstowicz and Kisseberth (1979), as is Lakoff's. \n\n[Figure 1 diagram: M-Level Buffer, M-P Constructions, P-Level Buffer, P-F Constructions, F-Level Buffer, Canonicalization, Surface Phonetic Representation.] \n\nFigure 1: Overview of the \"many maps\" model. \n\nFigure 2 shows the effect of an M-P construction that breaks up CCC consonant clusters by inserting a vowel after the first consonant, producing CiCC. The input in this case is the Yawelmani word /?ugnhin/ \"drinks\", and the desired insertion is indicated in P-deriv. The mapping matrix derives the P-level representation right-justified in the buffer, with no segment gaps or collisions. It can do this even when multiple simultaneous insertions and deletions are being performed. But it cannot perform arbitrary sequence manipulations, such as reversing all the segments of an utterance. Further details of the matrix architecture are given in (Touretzky, 1989) and (Wheeler and Touretzky, 1989). \n\n3  ITERATIVE PHENOMENA \nSeveral types of phonological processes operate on groups of adjacent segments, often by making them more similar to an immediately preceding (or following) trigger segment. Vowel harmony and voicing assimilation are two examples.  
In Yawelmani, vowel harmony takes the following form: an [\u03b1high] vowel that is preceded by an [\u03b1high] round vowel becomes round and back. In the form /do:s+al/ \"might report\", the non-round, back vowel /a/ is [-high], as is the preceding round vowel /o/. Therefore the /a/ becomes round, yielding the surface form [do:sol]. Similarly, in /dub+hin/ \"leads by the hand\", the [+high] vowel /i/ is preceded by the [+high] round vowel /u/, so the /i/ becomes round and back, giving [dubhun]. In /bok'+hin/ \"finds\", the /i/ does not undergo harmony because it differs in height from the preceding vowel. \n\n[Figure 2 diagram: M-level buffer, P-deriv buffer with mut, del, and ins rows, M-P mapping matrix, and P-level buffer.] \n\nFigure 2: Performing an insertion via the M-P mapping matrix. \n\nHarmony is described as an iterative process because it can apply to entire sequences of vowels, as in the following derivation: \n\n/t'ul+sit+hin/    \"burns for\" \n/t'ul+sut+hin/    harmony on second vowel \n/t'ul+sut+hun/    harmony on third vowel \n\nIn Yawelmani we saw an epenthesis process that inserts a high vowel /i/ to break up lengthy consonant clusters. Epenthetic vowels may either undergo or block harmony. With the word /logw+xa/ \"let's pulverize\", epenthesis inserts an /i/ to break up the /gwx/ cluster, producing /logiw+xa/. Now the /a/ is preceded by a [+high, -round] vowel, so harmony does not apply, whereas in /do:s+al/, which has the same sequence of underlying vowels, it did. This is an instance of epenthesis blocking harmony. In other environments the epenthetic vowel may itself undergo harmony.  
For example: \n\n/?ugn+hin/    \"drinks\" \n/?uginhin/    epenthesis \n/?ugunhin/    harmony on epenthetic vowel \n/?ugunhun/    harmony on third vowel \n\nThe standard generative phonology analysis of harmony utilizes the following rule, applying after epenthesis, that is supposed to iterate through the utterance from left to right, changing one vowel at a time: \n\n[+syll, \u03b1high] \u2192 [+round, +back] / [+syll, +round, \u03b1high] C0 ___ \n\nLakoff offers an alternative account of epenthesis and harmony that eliminates iteration. He states epenthesis as an M-P construction: \n\nM:   C       C   {C,#} \n     |       | \nP:  [ ]  i  [ ] \n\nThe harmony rule is stated as a P-level well-formedness condition that applies simultaneously throughout the buffer: \n\nP:  If [+syll, +round, \u03b1high] C0 X, \n    then if X = [+syll, \u03b1high], then X = [+round, +back]. \n\nStarting with /?ugn+hin/ at M-level, Lakoff's model would settle into a representation of /?ugunhun/ at P-level. We repeat the crucial point that this representation is not derived by sequential application of rules; it is merely licensed by one application of epenthesis and two of harmony. The actual computation of the P-level representation would be performed by a parallel relaxation process, perhaps using simulated annealing, that somehow determines the sequence that best satisfies all applicable constraints at P-level. \n\n4  THE CLUSTERING MECHANISM \nOur account of vowel harmony must differ from Lakoff's because we do not wish to rely on relaxation in our model. Instead, we introduce special clustering circuitry to recognize sequences of segments that share certain properties. The clustering idea is meant to be analogous to perceptual grouping in vision.  
Sequences of adjacent visually-similar objects are naturally perceived as a whole. A similar mechanism operating on phonological sequences, although unprecedented in linguistic theory, does not appear implausible. Crucial to our model is the principle that perceived sequences may be operated on as a unit. This allows us to avoid iteration and give a fully-parallel account of vowel harmony. \n\nThe clustering mechanism is controlled by a small number of language-specific parameters. The rule shown below is the P-F clustering rule for Yawelmani. Cluster type [+syllabic] indicates that the rule looks only at vowels. (This is implemented by an additional mapping matrix that extracts the vowel projection of the P-level buffer. The clustering mechanism actually looks at the output of this matrix rather than at the P-level buffer directly.) The trigger of a cluster is a round vowel of a given height, and the elements are the subsequent adjacent vowels of matching height. Application of the rule causes elements (but not triggers) to undergo a change; in this case, they become round and back. \n\nYawelmani vowel harmony - P-F mapping: \n\nCluster type:  [+syllabic] \nTrigger:       [+round, \u03b1high] \nElement:       [\u03b1high] \nChange:        [+round, +back] \n\nThe following hypothetical vowel sequence illustrates the application of this clustering rule. Consonants are omitted for clarity: \n\n          1  2  3  4  5  6  7  8  9 \n          i  u  i  i  e  o  o  a  i \ntrigger:     +           + \nelement:        +  +        +  + \n\nThe second vowel is round, so it's a trigger. Since the third and fourth vowels match it in height, they become elements. The fifth vowel is [-high], so it is not included in the cluster.  
The sixth vowel triggers a new cluster because it's round; it is also [-high]. The seventh and eighth vowels are also [-high], so they can be elements, but the ninth vowel is excluded from the cluster because it is [+high]. Note that vowel 7 is an element, but it also meets the specification for a trigger. Given a choice, our model prefers to mark segments as elements rather than triggers because only elements undergo the specified change. The distinction is moot in Yawelmani, where triggers are already round and back, but it matters in other languages; see (Wheeler and Touretzky, 1989) for details. \n\nFigures 2 and 3 together show the derivation of the Yawelmani word [?ugunhun] from the underlying form /?ugn+hin/. In Figure 2 an M-P construction inserted a high vowel. In Figure 3 the P-F clustering circuitry has examined the P-level buffer and marked the triggers and elements. Segments that were marked as elements then have the change [+round, +back] written into their corresponding mutation slots in F-deriv. Finally, the P-F mapping matrix produces the sequence /?ugunhun/ as the F-level representation of the utterance. \n\n5  DISCUSSION \nWe could not justify the extra circuitry required for clustering if it were suitable only for Yawelmani vowel harmony. In fact, the same mechanism handles a variety of other iterative phenomena, including Slovak and Gidabal vowel shortening, Icelandic umlaut, and Russian voicing assimilation. The full mechanism has some additional parameters beyond those covered in the discussion of Yawelmani. For example, clustering may proceed from right-to-left (as is the case in Russian) instead of from left-to-right. Also, clusters may be of either bounded or unbounded length. Bounded clusters are required for alternation processes, such as Gidabal shortening.  
They cover exactly two segments: a trigger and one element. We are making a deliberate analogy here with metrical phonology (stress systems), where unbounded feet may be of arbitrary length, but bounded feet always contain exactly two syllables. No language has strictly trisyllabic feet. We predict a similar constraint will hold for iterative phenomena when they are reformulated in parallel clustering terms, i.e., no language requires bounded-length clusters with more than one element. \n\n[Figure 3 diagram: P-level buffer, clustering trigger and element marks, F-deriv buffer, P-F mapping matrix, and F-level buffer.] \n\nFigure 3: Clustering applied to Yawelmani vowel harmony. \n\nOur model makes many other predictions of constraints on human phonology, based on limitations of the highly-structured \"many maps\" architecture. We are attempting to verify these predictions, and also to extend the model to additional aspects of phonological behavior, such as syllabification and stress. \n\nAcknowledgements \n\nThis research was supported by a contract from Hughes Research Laboratories, by the Office of Naval Research under contract number N00014-86-K-0678, and by National Science Foundation grant EET-8716324. We thank George Lakoff for encouragement and support, John Goldsmith for helpful correspondence, and Gillette Elvgren III for implementing the simulations. \n\nReferences \n\nGoldsmith, J. (to appear) Phonology as an intelligent system. To appear in a festschrift for Leila Gleitman, edited by D. Napoli and J. Kegl. \n\nKenstowicz, M., and Kisseberth, C.  
(1979) Generative Phonology: Description and Theory. San Diego, CA: Academic Press. \n\nLakoff, G. (1988) A suggestion for a linguistics with connectionist foundations. In D. S. Touretzky, G. E. Hinton, and T. J. Sejnowski (eds.), Proceedings of the 1988 Connectionist Models Summer School, pp. 301-314. San Mateo, CA: Morgan Kaufmann. \n\nLakoff, G. (1989) Cognitive phonology. Draft of paper presented at the UC-Berkeley Workshop on Constraints vs. Rules, May 1989. \n\nPinker, S., and Prince, A. (1988) On language and connectionism: analysis of a parallel distributed processing model of language acquisition. In S. Pinker & J. Mehler (eds.), Connections and Symbols. Cambridge, Massachusetts: MIT Press. \n\nRumelhart, D. E., and McClelland, J. L. (1986) On learning the past tenses of English verbs. In J. L. McClelland and D. E. Rumelhart (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, volume 2. Cambridge, Massachusetts: MIT Press. \n\nSmolensky, P. (1986) Information processing in dynamical systems: foundations of harmony theory. In D. E. Rumelhart and J. L. McClelland (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, volume 1. Cambridge, Massachusetts: MIT Press. \n\nTouretzky, D. S. (1989) Toward a connectionist phonology: the \"many maps\" approach to sequence manipulation. Proceedings of the Eleventh Annual Conference of the Cognitive Science Society, pp. 188-195. Hillsdale, NJ: Erlbaum. \n\nWheeler, D. W., and Touretzky, D. S. (1989) A connectionist implementation of cognitive phonology. Technical report CMU-CS-89-144, Carnegie Mellon University, School of Computer Science. To appear in G. Lakoff and L. Hyman (eds.), Proceedings of the UC-Berkeley Phonology Workshop on Constraints vs. Rules. University of Chicago Press. 
\n\n\f", "award": [], "sourceid": 268, "authors": [{"given_name": "David", "family_name": "Touretzky", "institution": null}, {"given_name": "Deirdre", "family_name": "Wheeler", "institution": null}]}