Paper ID: 683
Title: Unsupervised Structure Learning of Stochastic And-Or Grammars
Reviews

Submitted by Assigned_Reviewer_6

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper presents a method for learning the structure of stochastic And-Or grammars. The paper suggests that this "generalizes" previous work on structure learning, but it is actually a special case, which is what makes the problem tractable. That is a reasonable point on its own, and the paper makes a nice contribution, so I would not try to argue that the problem is more general than the structure learning problem faced in NLP. The basic algorithm is sensible and performs well when compared against other methods for inducing grammars.

The experimental evaluation could be improved by adding comparisons to more methods. In particular, the method of Stolcke and Omohundro (reference 14) seems applicable to both tasks and would be a valuable comparison point.
Q2: Please summarize your review in 1-2 sentences
This is a solid paper presenting a method for learning the structure of a particular class of grammars.

Submitted by Assigned_Reviewer_7

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper presents a method for learning both the structure and the parameters of stochastic AND-OR grammars. Such grammars contain AND rules and OR rules and can be used to represent compositional and recursive phenomena, including natural language and events. The authors present a nice method for unsupervised structure learning of these grammars by introducing new AND-OR fragments in successive steps and measuring the likelihood and prior gains under their model.
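To make the flavor of this kind of structure learning concrete, here is a minimal toy sketch. It is NOT the paper's algorithm: it greedily introduces a new And-node for the most frequent adjacent symbol pair (a crude count-based surrogate for a likelihood-gain criterion), and it omits the Or-node side of the fragments entirely. All names (`learn_structure`, `most_frequent_pair`, the `AND0` labels) are illustrative inventions.

```python
from collections import Counter

def most_frequent_pair(corpus):
    """Count adjacent symbol pairs across all sequences and return the
    most frequent one as ((a, b), count), or None if there are no pairs."""
    counts = Counter()
    for seq in corpus:
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    return counts.most_common(1)[0] if counts else None

def learn_structure(corpus, min_count=2):
    """Greedily replace the most frequent adjacent pair with a new
    And-node until no pair is frequent enough. A toy stand-in for
    fragment-by-fragment grammar growth, not the paper's method."""
    corpus = [list(seq) for seq in corpus]
    rules = {}
    next_id = 0
    while True:
        best = most_frequent_pair(corpus)
        if best is None or best[1] < min_count:
            break
        (a, b), _ = best
        nt = f"AND{next_id}"
        next_id += 1
        rules[nt] = (a, b)
        # Rewrite every occurrence of the pair with the new nonterminal.
        new_corpus = []
        for seq in corpus:
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                    out.append(nt)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            new_corpus.append(out)
        corpus = new_corpus
    return rules, corpus

rules, reduced = learn_structure([
    ["the", "dog", "runs"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "runs"],
])
```

The actual method additionally groups alternative symbols under Or-nodes within each fragment and scores candidates by posterior (likelihood plus prior) gain rather than raw frequency.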

The authors present experiments on two tasks: learning event grammars and learning image grammars. In both they achieve results that are competitive with prior art.

I liked the overall paper as it seems to be a tractable way of learning stochastic grammars that can be modeled using AND and OR rules. My criticism of the paper stems from the following observations:

1) The authors do not mention how tractable the learning algorithm is. Will it scale to thousands of datapoints?

2) I would have liked to have seen experiments on natural language sentences, as natural language is the most obvious application of such grammars. Would it even be possible to learn with the presented methods on, for example, the Penn Treebank dataset, on which previous work (say, Klein and Manning) has focused?
Q2: Please summarize your review in 1-2 sentences
This paper presents a way of estimating the structure and parameters of stochastic AND-OR grammars and presents nice results on two tasks; I would have liked to see more experiments, especially on natural language data.

Submitted by Assigned_Reviewer_8

Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper proposes a new approach to unsupervised learning of the
structure of AND-OR grammars. In contrast to previous approaches, the
present one induces unified AND-OR fragments, rather than searching
separately for AND and OR operations. The value of a proposed new
AND-OR fragment can be computed efficiently using a set of sufficient
statistics. The approach is evaluated on two tasks: parsing
events and parsing visual objects.
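As a purely illustrative example (not taken from the paper) of what such sufficient statistics might look like: once adjacent-pair co-occurrence counts are tabulated, a candidate fragment combining two Or-groups can be scored from the counts alone, without re-parsing the corpus. The score below is an invented log-count surrogate, not the paper's actual gain measure.

```python
from collections import Counter
from math import log

def pair_counts(corpus):
    """Sufficient statistics: adjacent-pair co-occurrence counts."""
    counts = Counter()
    for seq in corpus:
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    return counts

def fragment_score(counts, group_a, group_b):
    """Illustrative surrogate gain for an And-Or fragment whose And-node
    combines Or-group group_a with Or-group group_b: the total count of
    the covered pairs, weighted by a log term (NOT the paper's measure)."""
    n = sum(counts[(a, b)] for a in group_a for b in group_b)
    return n * log(1 + n)

counts = pair_counts([
    ["a1", "b1"], ["a1", "b2"], ["a2", "b1"], ["a2", "b2"], ["a1", "b1"],
])
score = fragment_score(counts, {"a1", "a2"}, {"b1", "b2"})
```

The point is only that evaluating a candidate fragment touches a small table of counts, which is why the search can be efficient.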

This is a generally good paper. The problem is important. The
approach is novel and well-motivated. The algorithm appears
technically sound, but I was not able to check the supplemental
derivations. A few things weren't clear to me, which could just
reflect my own limited time for reading, but I would urge the authors
to mark these points more clearly: First, it wasn't clear whether the
"surrogate measure" of the likelihood gain was an approximate or an
exact notion. Second, it wasn't clear whether the approach could
extend to learning fragments with arity greater than 2, or whether it
could only tractably learn grammars in "Chomsky normal form".

My main concerns with the paper focus on the experiments. I had
trouble understanding most aspects of them: exactly what was done, how
the examples were represented, and what was learned. I got the
impression that the paper's ideal audience is a reader who has
followed, studied intimately, and preferably re-implemented all of the
recent visual-grammar papers coming out of S. C. Zhu's group. To such
a reader, the experimental section would probably have been more
understandable and the results more useful. But for outsiders, it was
less clear. I understand that the short NIPS format has its
limitations. But fortunately NIPS permits supplementary materials,
and if the paper is accepted, I would urge the authors to include in
their supplement many more details and concrete illustrations of their
experiments.

I would especially like to have seen (or to see) examples of any
interesting structure discovered in the learned grammars. The paper
has a great setup about the importance of grammars and structure
learning, but then, given these expectations, it is a bit of a letdown
to see the results presented purely quantitatively. I agree
with the authors that those quantitative results are satisfactory on
their own terms. But they are not very exciting or illuminating. I
would have found the paper more compelling if the authors could show
that interesting structure is learned, and that more interesting
structure is learned by this approach relative to competing
approaches.
Q2: Please summarize your review in 1-2 sentences
This is an interesting paper on unsupervised learning of AND-OR grammars. While I liked it, I had trouble following the experiments and interpreting the results, not being familiar with a lot of prior work (mostly from Zhu's UCLA group) that it seemed to build on heavily. If accepted, the authors should include supplementary material with more details and illustrations of how the experiments worked and what interesting structure was learned.
Author Feedback

Q1:Author rebuttal: Please respond to any concerns raised in the reviews. There are no constraints on how you want to argue your case, except for the fact that your text should be limited to a maximum of 6000 characters. Note however that reviewers and area chairs are very busy and may not read long vague rebuttals. It is in your own interest to be concise and to the point.
We thank the three reviewers for their helpful comments and suggestions.

Review 1:
We plan to perform more comprehensive comparisons as you suggested, and we are indeed in the process of extending approaches such as Stolcke and Omohundro's to learning general And-Or grammars. That said, the approaches we compared against in the paper are representative of many previous approaches in that they learn And-nodes and Or-nodes separately.

Review 2:
Regarding scalability:
The algorithm runs reasonably fast. Our prototype implementation finishes in under 5 minutes on a desktop machine with 5000 training samples from our synthetic image dataset. We will discuss this in the paper.

Regarding experiments on language data:
Because of the space limit, we decided to focus on the problems that And-Or grammars were originally proposed for, i.e., image and video parsing. We will present experiments on learning natural language grammars in the extended version of the paper.

Review 3:
Regarding the two questions:
1. The measure is exact in terms of the Viterbi likelihood under the assumption that the grammar is unambiguous.
2. The current algorithm is capable of learning And-Or fragments with more than two Or-nodes.
We will clarify these points in the paper.

Regarding the presentation of the experiments, we will incorporate more details and illustrative examples into the supplementary materials as you suggested.