{"title": "Application of Neural Network Methodology to the Modelling of the Yield Strength in a Steel Rolling Plate Mill", "book": "Advances in Neural Information Processing Systems", "page_first": 698, "page_last": 705, "abstract": null, "full_text": "Application of Neural Network Methodology to \nthe Modelling of the Yield Strength in a Steel \nRolling Plate Mill \n\nAh Chung Tsoi \nDepartment of Electrical Engineering \nUniversity of Queensland, \nSt Lucia, Queensland 4072, \nAustralia. \n\nAbstract \n\nIn this paper, the use of a tree based neural network, viz. MARS (Friedman, 1991), for \nthe modelling of the yield strength of a steel rolling plate mill is described. \nThe inputs to the time series model are temperature, strain, strain rate, \nand interpass time, and the output is the corresponding yield stress. It \nis found that the MARS-based model reveals which variables have a nonlinear \nand significant functional dependence. The results are compared with \nthose obtained by using a Kalman filter based online tuning method and \nother classification methods, e.g., CART, C4.5, Bayesian classification. It \nis found that the MARS-based method consistently outperforms the other \nmethods. \n\n1 Introduction \n\nHot rolling of steel slabs into flat plates is a common process in a steel mill. This \ntechnology has been in use for many years. The process of rolling hot slabs into \nplates is relatively well understood [see, e.g., Underwood, 1950]. But with \nintense international market competition, there is more and more demand on the \nquality of the finished plates. This demand for quality fuels the search for a better \nunderstanding of the underlying mechanisms of the transformation of hot slabs \ninto plates, and a better control of the parameters involved. 
It is hoped that a better \nunderstanding of the controlling parameters will lead to better settings \nof the process controls, and ultimately to a better quality final \nproduct. \n698 \n\n\fANN Modelling of a Steel  Rolling Plate  Mill \n\n699 \n\nIn this paper, we consider the problem of modelling the plate yield stress in a \nhot steel rolling plate mill. Rolling is a process of plastic deformation, and its \nobjective is achieved by subjecting the material to forces of such a magnitude that \nthe resulting stresses produce a permanent change of shape. Apart from the obvious \ndependence on the materials used, the characteristics of the material undergoing \nplastic deformation are described by stress, strain and temperature, if the rolling \nis performed on hot slabs. In addition, the interpass time, i.e., the time between \npasses of the slab through the rollers (an indirect measure of the rolling velocity), \ndirectly influences the metallurgical structure of the metal during rolling. \n\nThere is considerable evidence that the yield stress is also dependent on the strain \nrate. In fact, it is observed that as the strain rate increases, the initial yield point \nincreases appreciably, but after an extension is achieved, the effect of strain rate on \nthe yield stress is very much reduced [see, e.g., Underwood, 1950]. \n\nThe effect of temperature on the yield stress is important. It is shown that the \nresistance to deformation increases with a decrease in temperature. The resistance \nto deformation versus temperature diagram shows a \"hump\" in the curve, which \ncorresponds to the temperature at which the structure of the material changes fundamentally \n[see, e.g., Underwood, 1950, Hodgson & Collinson, 1990]. 
\nUsing, e.g., an energy method, it is possible to formulate a theoretical model of the \ndependence of deformation resistance on temperature, strain, strain rate, and velocity \n(indirectly, the interpass time). One may then validate the theoretical model by \nperforming a rolling experiment on a piece of material, perhaps under laboratory \nconditions [see, e.g., Horihata, Motomura, 1988, for consideration of a three roller \nsystem]. \nIt is difficult to apply the derived theoretical model to a practical situation. Firstly, \nin a practical process, the measurements of strain and strain rate are \nnot accurate. Secondly, one cannot possibly perform a rolling experiment on each \nnew piece of material to be rolled. Thus, though the theoretical model may serve as \na guide to our understanding of the process, it is not suitable for controller design \npurposes. \n\nThere are empirical models relating the resistance to deformation to temperature, \nstrain and strain rate [see, e.g., Underwood, 1950, for an account of older models]. \nThese models are often obtained by fitting the observed data to a general data \nmodel. \n\nThe following model has been found useful in fitting the observed practical data: \n\n$$k_m = a \\epsilon^{b} \\sinh^{-1}(c \\dot{\\epsilon} \\exp(d/T)^{f}) \\qquad (1)$$ \n\nwhere $k_m$ is the yield stress, $\\epsilon$ is the strain, $\\dot{\\epsilon}$ is the corresponding strain rate, \nand $T$ is the temperature; a, b, c, d and f are unknown constants. It is claimed \nthat this model will give a good prediction of the yield stress, especially at lower \ntemperatures, and for thin plate passes [Hodgson & Collinson, 1990]. \nThis model does not always give good predictions over all temperatures, as mill \nconditions vary with time and the model is only \"tuned\" on a limited set of data. 
\n\n\f700 \n\nTsoi \n\nIn order to overcome this problem, McFarlane, Telford, and Petersen [1991] have \nexperimented with a recursive model based on the Kalman filter in control theory \n(see, e.g., Anderson, Moore, [1980]) to update the parameters a, b, c, d, f in the \nabove model. To better describe the material behaviour at different temperatures, \nthe model explicitly incorporates two separate sub-models with a temperature dependence: \n\n1. Full crystallisation ($T < T_{upper}$) \n\n$$k_m = a \\epsilon^{b} \\sinh^{-1}(c \\dot{\\epsilon} \\exp(d/T)^{f}) \\qquad (2)$$ \n\nThe constants a, b, c, d, f are model coefficients. \n\n2. Partial recrystallisation ($T_{lower} \\le T \\le T_{upper}$) \n\n$$k_m = a (\\epsilon + \\epsilon^{*})^{b} \\sinh^{-1}(c \\dot{\\epsilon} \\exp(d/T)^{f}) \\qquad (3)$$ \n$$t_{0.5} = j (\\lambda_{i-1} \\epsilon_{i-1} + \\epsilon_i)^{g} f_1(q(T_{i-1}, T_i) h) \\qquad (4)$$ \n$$\\lambda_i = f_2(t, t_{0.5}) \\qquad (5)$$ \n\nwhere $\\lambda$ is the fractional retained strain; $\\epsilon^{*}$, expressed as a Taylor series expansion \nof $\\lambda_{i-1} \\epsilon_{i-1}$, is the retained strain; $t$ is the interpass time; $t_{0.5}$ is the 50% recrystallisation \ntime; $q(T_{i-1}, T_i)$ is a prescribed nonlinear function of $T_{i-1}$ and $T_i$; $f_1(.)$ and \n$f_2(.)$ are pre-specified nonlinear functions; $i$ is the roll pass number; j, h, g are \nmodel coefficients; $T_{upper}$ is an experimentally determined temperature at which the \nmaterial undergoes a permanent change in structure; and $T_{lower}$ is a temperature \nbelow which the material does not exhibit any plastic behaviour. \n\nModel coefficients a, b, c, d, f, g, h, j are either estimated in a batch mode (i.e., all \nthe past data are assumed to be available simultaneously) or adapted recursively \non-line (i.e., only a limited number of the past data are available) using a Kalman \nfilter algorithm in order to provide the best model predictions [McFarlane, Telford, \nPetersen, 1991]. 
\nIt is noted that these models are motivated by the desire to fit a nonlinear model \nof a special type, i.e., one which has an inverse hyperbolic sine function. But, since \nthe basic operation is data fitting, i.e., fitting a model to the set of given data, it \nis possible to consider more general nonlinear models. These models may not have \nany ready interpretation in metallurgical terms, but they may fit the given data set \nbetter, in the sense that they may give a better \nprediction of the output. \n\nIt has been shown (see, e.g., Hornik et al., 1989) that a class of artificial neural \nnetworks, viz., a multilayer perceptron with a single hidden layer, can approximate \nany arbitrary input output function to an arbitrary degree of accuracy. Thus it \nis reasonable to experiment with different classes of artificial neural network or \ninduction tree structures for fitting the set of given data, and to examine which \nstructure gives the best performance. \n\nThe structure of the paper is as follows: in section 2, a brief review of a special \nclass of neural networks is given. In section 3, results of applying the neural network \nmodel to the plate mill data are given. \n\n2 A Tree Based Neural Network model \n\nFriedman [1991] introduced a new class of neural network architecture which is \ncalled MARS (Multivariate Adaptive Regression Splines). This class of methods \ncan be interpreted as a tree of neurons, in which each leaf of the tree consists of a \nneuron. The model of the neuron may be a piecewise linear polynomial, or a cubic \npolynomial, with the knot as a variable. 
In view of the lack of space, we refer \nthe interested reader to Friedman's paper [1991] for details of this method. \n\n3 Results \n\nMARS has been applied to the plate mill data. We have used the data in the \nfollowing manner. \n\nWe concatenate different runs of the plate mill into a single time series. This consists \nof 2877 points corresponding to 180 individual plates with approximately 16 passes \non each plate. There are 4 independent variables, viz., interpass time, temperature, \nstrain, and strain rate. The desired output variable is the yield stress. \n\nA plot of the individual variables, viz., temperature, strain, strain rate, interpass \ntime and stress, versus time reveals that the variables can vary rather considerably \nover the entire time series. In addition, plots of stress versus temperature, stress \nversus strain, stress versus strain rate and stress versus interpass time reveal that \nthe functional dependence could be highly nonlinear. \nWe have chosen to use an additive model (Friedman [1991]), instead of the more \ngeneral multivariate model, as this will allow us to observe any possible nonlinear \nfunctional dependencies of the output as a function of the inputs: \n\n$$k_m = \\sum_{i=1}^{4} k_i f_i(x_i) \\qquad (6)$$ \n\nwhere $x_i$, i = 1, 2, 3, 4 are the input variables, $k_i$, i = 1, 2, 3, 4 are gains, and $f_i$, i = 1, 2, 3, 4 are piecewise nonlinear polynomial \nmodels found by MARS. \n\nThe results are as follows: \n\nBoth the piecewise linear polynomial and the piecewise cubic polynomial are used \nto study this set of data. It is found that the cubic polynomial gives a better fit than \nthe linear polynomial. Figure 1(a) shows the error plot between the estimated \noutput from a cubic spline fit and the training data. It is observed that the error \nis very small. The maximum error is about -0.07. 
Figure 1(b) shows the plot of the \npredicted yield stress and the original yield stress over the set of training data. \n\nThese figures indicate that the cubic polynomial fit has captured most of the variation \nof the data. It is interesting to note that in this model, the interpass time \nplays no significant part. This feature may be a peculiar aspect of this set of data \npoints. It is not true in general. \n\nFigure 1: (a) The prediction error on the training data set (b) The prediction and \nthe training data set superimposed \n\nIt is found that the strain rate has the most influence on the yield stress, followed by \ntemperature, and then by strain. The model, once obtained, can be used to \npredict the yield stress from a given set of temperature, strain, and strain rate values. \n\nFigure 2(a) shows the prediction error between the yield stress and the predicted \nyield stress on a set of testing data, i.e., data which is not used to train the model, \nand Figure 2(b) shows a plot of the predicted value of yield stress superimposed on \nthe original yield stress. \nIt is observed that the prediction on the set of testing data is reasonable. This \nindicates that the MARS model has captured most of the dynamics underlying the \noriginal training data, and is capable of extending this captured knowledge onto a \nset of hitherto unseen data. \n\nFigure 2: (a) The prediction error on the testing data set (b) The prediction and \nthe testing data set superimposed \n\n4 Comparison with the results obtained by conventional approaches \n\nIn order to compare the artificial neural network approach to more conventional \nmethods for model tuning, the same data set was processed using: \n\n1. A MARS model with cubic polynomials \n2. An inverse hyperbolic sine law model using least squares batch parameter tuning \n3. An inverse hyperbolic sine law model using recursive least squares tuning \n4. CART based classification [Breiman et al., 1984] \n5. C4.5 based method [Quinlan, 1986, 1987] \n6. Bayesian classification [Buntine, 1990] \n\nIn each case, we used a training data set of 78 plates (1242 passes) and a testing data \nset of 16 plates (252 passes). In the cases of the CART, C4.5, and Bayesian classification \nmethods, the yield stress variable is divided equally into 10 classes, and this is used \nas the desired output instead of the original real values. 
\nThe comparison of the results between MARS and the Kalman filter based approach \nis shown in the following table \n\nB11 \nmean% \n-.64 \nmean abs%  4.61 \nstd % \n\n6.26  5.11 \n\nB12  A11  A12  C11  C12 \n-0.2  4.5 \n1.69 \n-.64 \n4.22  4.61 \n5.3 \n3.5 \n4.9 \n4.7 \n\n2.38 \n5.3 \n6.26  6.25 \n\nwhere \nB11 = Batch Tuning: tuning the model (forgetting factors = 1 in adaptation) on the \ntraining data \nB12 = Batch Tuning: running the tuned model on the testing data \nA11 = Adaptation: on the training data \nA12 = Adaptation: on the testing data \nC11 = MARS on the training data \nC12 = MARS on the testing data, \nand mean% = mean((k_meas - k_pred)/k_meas), \nmean abs% = mean(abs((k_meas - k_pred)/k_meas)), \nstd% = stdev((k_meas - k_pred)/k_meas); where mean and stdev stand for the mean \nand the standard deviation respectively, and k_meas, k_pred represent the measured \nand predicted values of the yield stress respectively. \n\nIt is found that the MARS based model performs extremely well compared with the \nother methods. The standard deviation of the prediction errors in the MARS model \nis considerably less than the corresponding standard deviation of prediction errors \nin a Kalman filter type batch or online tuning model on the testing data set. \n\nWe have also compared MARS with both the CART based method and the C4.5 \nbased method. As both CART and C4.5 operate only on an output category, \nrather than a continuous output value, it is necessary to convert the yield stress \ninto a categorical variable. We have chosen to divide the yield stress equally \ninto 10 classes. With this modification, the CART and C4.5 methods are readily \napplicable. \nThe following table summarises the results of this comparison. 
The values given are \nthe percentage prediction errors on the testing data set for the various methods. \nIn the case of MARS, we have converted the prediction error from a continuous \nvariable into the corresponding classes as used in the CART and C4.5 methods. \n\nMethod      Bayes   CART    C4.5    MARS \nError (%)   65.4    12.99   16.14   6.2 \n\nIt is found that the MARS model is more consistent in predicting the output classes \nthan the CART method, the C4.5 based method, or the Bayesian classifier. \nThe fact that the MARS model performs better than the CART model can be seen \nas a confirmation that the MARS model is a generalisation of the CART model \n(see Friedman [1991]). But it is rather surprising to see that the MARS model \noutperforms a Bayesian classifier. \n\nThe results are similar over a number of other typical data sets, e.g., when the \ninterpass time variable becomes significant. \n\n5 Conclusion \n\nIt is found that MARS can be applied to model the plate mill data with very good \naccuracy. In terms of predictive power on unseen data, it performs better than \nthe more traditional methods, e.g., Kalman filter batch or online tuning methods, \nCART, C4.5 or the Bayesian classifier. \nIt is almost impossible to convert the MARS model into one given in section 1. The \nHodgson-Collinson model places a breakpoint at a temperature of 925 deg C, while \nin the MARS model, the temperature breakpoints are found to be at 1017 deg C \nand 1129 deg C respectively. Hence it is difficult to convert the MARS model into \nthose given by the Hodgson-Collinson model or the Kalman filter type models, or vice \nversa. 
\n\nA possible improvement to the current MARS technique would be to restrict the \nbreakpoints, so that they must lie within a temperature region where microstructural \nchanges are known to occur. \n\n6 Acknowledgement \n\nThe author acknowledges the assistance given by the staff at the BHP Melbourne \nResearch Laboratory in providing the data, as well as in providing the background \nmaterial in this paper. He specially thanks Dr D McFarlane for giving his time generously \nin assisting in the understanding of the more traditional approaches, and also \nfor providing the results on the Kalman filtering approach. Also, he is indebted \nto Dr W Buntine, RIACS, NASA, Ames Research Center for providing an early \nversion of the induction tree based programs. \n\n7 References \n\nAnderson, B.D.O., Moore, J.B., (1980). Optimal Filtering. Prentice Hall, Englewood \nCliffs, NJ. \nBreiman, L., Friedman, J., Olshen, R.A., Stone, C.J., (1984). Classification and \nRegression Trees. Wadsworth, Belmont, CA. \nBuntine, W, (1990). A Theory of Learning Classification Rules. PhD Thesis submitted \nto the University of Technology, Sydney. \nFriedman, J, (1991). \"Multivariate Adaptive Regression Splines\". Ann Stat. to \nappear. (Also, the implication of the paper on neural network models was presented \norally in the 1990 NIPS Conference) \nHodgson, Collinson, (1990). Manuscript under preparation (authors are with BHP \nResearch Lab., Melbourne, Australia). \nHorihata, M, Motomura, M, (1988). \"Theoretical analysis of 3-roll Rolling Process \nby the energy method\". Trans of the Iron and Steel Institute of Japan, 28:6, 434-439. \nHornik, K., Stinchcombe, M., White, H., (1989). \"Multilayer Feedforward Networks \nare Universal Approximators\". Neural Networks, 2, 359-366. 
\nMcFarlane, D, Telford, A, Petersen, I, (1991). Manuscript under preparation. \nQuinlan, R. (1986). \"Induction of Decision Trees\". Machine Learning. 1, 81-106. \nQuinlan, R. (1987). \"Simplifying Decision Trees\". International J Man-Machine \nStudies. 27, 221-234. \nUnderwood, L R, (1950). The Rolling of Metals. Chapman & Hall, London. \n", "award": [], "sourceid": 482, "authors": [{"given_name": "Ah", "family_name": "Tsoi", "institution": null}]}