NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Reviewer 1
Summary
=======
The authors propose to use an already known method as a pooling function for time series. The idea is to leverage an integral transform, the *path signature*, to map a discretized curve to a real-valued sequence. Truncation leads to a vectorized representation which is then used in practice. The proposed transformation is differentiable and can therefore be integrated into models trained via backpropagation in an end-to-end manner. As the transformation consumes one dimension, namely time, stacking it requires reintroducing a time-series-like structure in the output; one way of doing so is discussed by the authors. Finally, the approach is evaluated on synthetic datasets in the context of (1) learning a generative model, (2) supervised learning of a system hyperparameter, and (3) reinforcement learning.

Originality
===========
Applying functional transforms as pooling is not a new idea. However, using path signatures as introduced in this work may be beneficial for certain applications.

Quality
=======
The overall notation is good and easy to follow. A little more effort could have been spent on introducing the path signature; e.g., the notion of a tensor product is ambiguous and should therefore be explicitly defined. There are two major issues which have to be specifically addressed: (1) The lack of motivation. There are many ways of computing signature-like representations of curves; indeed, every spectral representation is one. I do not see the *selling point* of the path signature. Are there striking benefits over other transformations? Related to this, the related work section is strikingly short. Are there really no other approaches that use some sort of functional transform as an intermediate differentiable function within the deep learning framework? (2) There is no conclusion! Given that the motivation is largely assumed to be granted, I would at least expect a conclusion summarizing the contributions and insights.
Clarity
=======
The overall clarity is good. The figures are well executed and contribute to the reader's understanding.

Significance
============
The authors do not outline and discuss the theoretical benefits of the path signature (although they emphasize the rich theoretical body behind it). In this context the contribution boils down to applying an already known functional transform as an intermediate layer in a deep model and showing that it can outperform alternatives on non-benchmark synthetic datasets. From my perspective this amounts to a rather low contribution/significance.
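For concreteness, the transform described in the summary above can be sketched as follows. This is a minimal depth-2 illustration of a truncated path signature for a piecewise-linear path, written from the standard definition via Chen's identity; it is not the authors' implementation, and the function name `signature_depth2` is my own.

```python
import numpy as np

def signature_depth2(path):
    """Truncated (depth-2) path signature of a piecewise-linear path.

    path: array of shape (n_points, d), the discretized curve.
    Returns a flat vector of length d + d*d: the depth-1 terms
    (total increment) followed by the depth-2 iterated integrals.
    """
    increments = np.diff(path, axis=0)   # (n-1, d) segment increments
    s1 = increments.sum(axis=0)          # depth-1: total increment
    d = path.shape[1]
    s2 = np.zeros((d, d))
    running = np.zeros(d)                # increment accumulated so far
    for dx in increments:
        # Chen's identity for appending one linear segment:
        # cross term with the path so far, plus the segment's
        # own (1/2) dx (x) dx contribution.
        s2 += np.outer(running, dx) + 0.5 * np.outer(dx, dx)
        running += dx
    return np.concatenate([s1, s2.ravel()])
```

Every operation here is differentiable in the input path, which is the property the authors exploit to place the transform inside a network trained by backpropagation; the output is also invariant to reparametrization, e.g. sampling extra points along the same line segment leaves the result unchanged.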
Reviewer 2
The paper presents an interesting approach that allows signatures to be used as a layer anywhere within a neural network. The development of the method is well justified by both logical and theoretical arguments and is technically sound. Simple as it is, the proposed framework revises the traditional feature-transformation-based usage of signatures and enriches the body of NN research. Moreover, the discussion about inverting signatures and the generative model provides further insight.
Reviewer 3
Originality: This paper proposes a method to use the signature transform as a layer of neural networks. If this is the first work which successfully integrates the signature transform into deep learning, the novelty is high. The previous studies and the motivation of the work are well introduced.

Quality: The way the signature transform is integrated into neural networks is technically sound. This seems to be early work and the experiments are not very extensive; they basically show "this works". Therefore, it is not very clear whether the proposed method is meaningfully better in real-world applications. It would be great to have better explanations of what types of problems can be solved better with the proposed method than with existing ones, confirmed by experiments (i.e., more things like Figure 6 and Table 1).

Clarity: The writing looks good to me. The experimental section is too brief and could be improved. The paper needs to have conclusions. It was not clear how the sigmas in the two equations in Section 6 relate to each other.

Significance: I think this is important work. If the signature transform can be used as a layer and it has unique and important characteristics which are not trivial to represent by other means, the significance of the work is large.