{"title": "Analog Neural Networks as Decoders", "book": "Advances in Neural Information Processing Systems", "page_first": 585, "page_last": 588, "abstract": null, "full_text": "Analog Neural Networks as Decoders \n\nRuth Erlanson\u00b7 \nDept. of Electrical Engineering \nCalifornia Institute of Technology \nPasadena, CA 91125 \n\nYaser Abu-Mostafa \nDept. of Electrical Engineering \nCalifornia Institute of Technology \nPasadena, CA 91125 \n\nAbstract \n\nAnalog neural networks with feedback can be used to implement l((cid:173)\nWinner-Take-All (KWTA) networks. In turn, KWTA networks can be \nused as decoders of a class of nonlinear error-correcting codes. By in(cid:173)\nterconnecting such KWTA networks, we can construct decoders capable \nof decoding more powerful codes. We consider several families of inter(cid:173)\nconnected KWTA networks, analyze their performance in terms of coding \ntheory metrics, and consider the feasibility of embedding such networks in \nVLSI technologies. \n\n1 \n\nINTRODUCTION: THE K-WINNER-TAKE-ALL \nNETWORK \n\nWe have previously demonstrated the use of a continuous Hopfield neural network \nas a K-Winner-Take-All (KWTA) network [Majani et al., 1989, Erlanson and Abu(cid:173)\nMostafa, 1988}. Given an input of N real numbers, such a network will converge \nto a vector of K positive one components and (N - K) negative one components, \nwith the positive positions indicating the K largest input components. In addition, \nwe have shown that the (~) such vectors are the only stable states of the system. \nOne application of the KWTA network is the analog decoding of error-correcting \ncodes [Majani et al., 1989, Platt and Hopfield, 1986]. Here, a known set of vectors \n(the codewords) are transmitted over a noisy channel. At the receiver's end of the \nchannel, the initial vector must be reconstructed from the noisy vector. 
\n\n\u2022 currently at: Hughes Network Systems, 10790 Roselle St., San Diego, CA 92121 \n\n585 \n\n\f586 \n\nErlanson and Abu-Mostafa \n\nIf we select our codewords to be the (Z) vectors with J( positive one components \nand (N - K) negative one components, then the K'VTA neural network will perform \nthis decoding task. Furthermore, the network decodes from the noisy analog vector \nto a binary codeword (so no information is lost in quantization of the noisy vector). \nAlso, we have shown [Majani et al., 1989] that the K\"VTA network will perform the \noptimal decoding, maximum likelihood decoding (MLD), if we assume noise where \nthe probability of a large noise spike is less than the probability of a small noise spike \n(such as additive white Gaussian noise). For this type of noise, an MLD outputs \nthe codeword closest to the noisy received vector. Hence, the most straightforward \nimplementation of MLD would involve the comparison of the noisy vector to all the \ncodewords. For large codes, this method is computationally impractical. \n\nTwo important parameters of any code are its rate and minimum distance. The \nrate, or amount of information transmitted per bit sent over the channel, of this \ncode is good (asymptotically approaches 1). The minimum distance of a code is \nthe Hamming distance between the two closest codewords in the code. The mini(cid:173)\nmum distance determines the error-correcting capabilities of a code. The minimum \ndistance of the KWTA code is 2. \n\nIn our previous work, we have found that the K'VTA network performs optimal \ndecoding of a nonlinear code. However, the small minimum distance of this code \nlimited the system's usefulness. \n\n2 \n\nINTERCONNECTED KWTA NETWORKS \n\nIn order to look for more useful code-decoder pairs, we have considered intercon(cid:173)\nnected K\"VTA networks. 'Ve have found two interesting families of codes: \n\n2.1 THE HYPERCUBE FAMILY \n\nA decoder for this family of codes has m = ni nodes. 
We label the nodes (x_1, x_2, ..., x_i) with each x_j in {1, 2, ..., n}. KWTA constraints are placed on sets of n nodes which differ in only one index. For example, {1, 1, 1, ..., 1}, {2, 1, 1, ..., 1}, {3, 1, 1, ..., 1}, ..., {n, 1, 1, ..., 1} are the nodes in one KWTA constraint. \n\nFor a two-dimensional system (i = 2) the nodes can be laid out in an array where the KWTA constraints will be along the rows and columns of the array. For the code associated with the two-dimensional system, we find that \n\nrate \u2248 1 - (3 log n)/(2n). \n\nThe minimum distance of this code is 4. Experimental results show that the decoder is nearly optimal. \n\nIn general, for an i-dimensional code, the minimum distance is 2i. The rate of these codes can be bounded only very roughly. \n\nWe also consider implementing these decoders on an integrated circuit. Because of the high level of interconnectivity of these decoders and the simple processing required at each node (or neuron), we assume that the interconnections will dictate the chip's size. Using a standard model for VLSI area complexity, we determine that the circuit area scales as the square of the network size. Feature sizes of current mainstream technologies suggest that we could construct systems with 22^2 = 484 (2-dimensional), 6^3 = 216 (3-dimensional) and 5^4 = 625 (4-dimensional) nodes. Thus, nontrivial systems could be constructed with current VLSI technology. \n\n2.2 NET-GENERATED CODES \n\nThis family uses combinatorial nets to specify the nodes in the KWTA constraints. A net on n^2 points consists of parallel classes: each class partitions the n^2 points into n disjoint lines, each containing n points. Two lines from different classes intersect at exactly one point. \n\nIf we impose a KWTA constraint on the points of each line, a net can be used to generate a family of code-decoder pairs. 
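The constraint structure of the hypercube family can be enumerated in a few lines (an illustrative sketch; hypercube_constraints is our name, and no claim is made about the paper's own software):

```python
from itertools import product

def hypercube_constraints(n, i):
    # Nodes are labeled by i-tuples over {1, ..., n}; one KWTA constraint
    # groups the n nodes whose labels agree in all but one index.
    constraints = []
    for axis in range(i):
        for fixed in product(range(1, n + 1), repeat=i - 1):
            group = [fixed[:axis] + (v,) + fixed[axis:] for v in range(1, n + 1)]
            constraints.append(group)
    return constraints

cons = hypercube_constraints(3, 2)  # i = 2: rows and columns of a 3x3 array
print(len(cons))   # -> 6
print(cons[0])     # -> [(1, 1), (2, 1), (3, 1)]
```

For i = 2 this reproduces the row and column constraints described above; each of the m = n^i nodes appears in exactly i constraints.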
If n is an integer power of a prime number, we can use a projective plane to generate a net with (n + 1) classes. For example, in Table 1 we have the projective plane of order 2 (n = 2). A projective plane has n^2 + n + 1 points and n^2 + n + 1 lines, where each line has n + 1 points and any 2 lines intersect in exactly one point. \n\nTable 1: Projective Plane of Order 2. Points are numbered for clarity. \n\npoints:  1 2 3 4 5 6 7 \nline 1:  1 1 . 1 . . . \nline 2:  . 1 1 . 1 . . \nline 3:  . . 1 1 . 1 . \nline 4:  . . . 1 1 . 1 \nline 5:  1 . . . 1 1 . \nline 6:  . 1 . . . 1 1 \nline 7:  1 . 1 . . . 1 \n\nWe can generate a net of 3 (i.e., n + 1) classes in the following way: Pick one line of the projective plane. Without loss of generality, we select the first line. Eliminate the points in that line from all the lines in the projective plane, as shown in Table 2. Renumber the remaining n^2 + n + 1 - (n + 1) = n^2 points. These are the points of the net. The first class of the net is composed of the reduced lines which previously contained the first point (old label 1) of the projective plane. In our example, this class contains two lines: L1 consists of points 2 and 3, and L2 consists of points 1 and 4. The remaining classes of the net are formed in a corresponding manner from the other points of the first line of the projective plane. \n\nIf we use all (n + 1) classes to specify KWTA constraints, the nodes are overconstrained and the network has no stable states. We can obtain n different codes by using 1, 2, ..., up to n classes to specify constraints. (The code constructed with two classes is identical to the two-dimensional code in Section 2.1!) Experimentally, we have found that these decoders perform near-optimal decoding on their corresponding code. A code constructed with i classes has a minimum distance of at least 2i. 
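The construction just described can be checked mechanically. A short sketch, assuming the cyclic representation of the order-2 projective plane (the line {1, 2, 4} and its shifts, which reproduces the L1 and L2 given above; FANO and net_from_plane are our illustrative names):

```python
# Projective plane of order n = 2: seven points, seven lines, each line
# with n + 1 = 3 points, any two lines meeting in exactly one point.
FANO = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {5, 6, 1}, {6, 7, 2}, {7, 1, 3}]

def net_from_plane(lines):
    # Delete the first line's points from every other line, renumber the
    # n^2 surviving points 1..n^2, and group the reduced lines into
    # parallel classes -- one class per deleted point.
    deleted = set(lines[0])
    survivors = sorted(set().union(*lines) - deleted)
    relabel = {old: new for new, old in enumerate(survivors, start=1)}
    classes = {p: [] for p in deleted}
    for line in lines[1:]:
        # Each remaining line meets the deleted line in exactly one point;
        # that point determines the class of the reduced line.
        p = (line & deleted).pop()
        classes[p].append(sorted(relabel[q] for q in line - deleted))
    return classes

classes = net_from_plane(FANO)
print(classes[1])  # -> [[2, 3], [1, 4]], i.e. L1 = {2, 3} and L2 = {1, 4}
```

Each class indeed partitions the n^2 = 4 net points into lines of n = 2 points, as the definition of a net requires.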
Thus, a code of size n^2 (i.e., the codewords contain n^2 bits) can be constructed with minimum distance up to 2n. The rate of these codes in general can be bounded only roughly. \n\nWe found that we could embed the decoder with a classes in an integrated circuit with width proportional to an^3, or area proportional to the cube of the number of processors. In a typical VLSI process, one could implement systems with 484 (a = 2, n = 22), 81 (a = 3, n = 9) or 64 (a = 4, n = 8) nodes. \n\n3 SUMMARY \n\nWe have simulated and analyzed analog neural networks which perform near-optimal decoding of certain families of nonlinear codes. Furthermore, we have shown that nontrivial implementations could be constructed. This work is discussed in more detail in [Erlanson, 1991]. \n\nReferences \n\nE. Majani, R. Erlanson and Y.S. Abu-Mostafa, \"On the K-Winners-Take-All Feedback Network,\" Advances in Neural Information Processing Systems, D. Touretzky (ed.), Vol. 1, pp. 634-642, 1989. \n\nR. Erlanson and Y.S. Abu-Mostafa, \"Using an Analog Neural Network for Decoding,\" Proceedings of the 1988 Connectionist Models Summer School, D. Touretzky, G. Hinton, T. Sejnowski (eds.), pp. 186-190, 1988. \n\nJ.C. Platt and J.J. Hopfield, \"Analog decoding using neural networks,\" AIP Conference Proceedings #151, Neural Networks for Computing, J. Denker (ed.), pp. 364-369, 1986. \n\nR. Erlanson, \"Soft-Decision Decoding of a Family of Nonlinear Codes Using a Neural Network,\" Ph.D. Thesis, California Institute of Technology, 1991. \n\nTable 2: Constructing a Net from a Projective Plane. \n\nprojective plane's points:         1  2  3  4  5  6  7 \n(first line's points eliminated)   \\  \\  .  \\  .  .  . \nL1:                                .  .  .  .  1  1  . \nL2:                                .  .  1  .  .  .  1 \nnet's points:                      .  .  1  .  2  3  4 \n", "award": [], "sourceid": 399, "authors": [{"given_name": "Ruth", "family_name": "Erlanson", "institution": null}, {"given_name": "Yaser", "family_name": "Abu-Mostafa", "institution": null}]}