This pair of papers evaluates the prediction of product sales based on both linguistic and quantitative aspects of online content. In the Pryzant et al. paper, the authors looked specifically at product descriptions. They obtained more than 93,000 health and chocolate product descriptions from the website Rakuten in order to evaluate these descriptions linguistically, expanding on previous studies that examined only summary statistics. Using a neural network that controls for confounding features (pricing, brand loyalty, etc.), they identify a set of words and writing styles that have a high impact on sales outcomes. In contrast, the Hu et al. paper examines the influence of online reviews. Rather than performing a linguistic analysis, they examine features such as the quality of reviewers and the age of an item (number of reviews).
I really enjoyed reading through the Pryzant paper. The thorough explanation of the neural network mathematics helped to make it clear what the authors were doing, and the experiments section was clear and well-written. My biggest criticism of the paper is that, if you strip away all of this explanation, it doesn’t feel like the authors did all that much. They extended a neural network to meet their feature selection goals, tokenized two different datasets, ran the model, and reported a few results. This area of research is certainly not my area of expertise, but this feels like a single-research-question workshop paper or class project. The class project my group is building has 3 (arguably 4) distinct research goals.
Beyond that, the authors don’t spend much time discussing the lack of general cultural applicability of their findings. They note the extensibility of the project to a general lexicon near the very end of the conclusion, and that’s about it. There is no indication of how these results apply to any language or culture other than Japanese and Japan. Additionally, their “seasonality” result seems to me to be too close to some of the confounding variables that the authors wanted to eliminate. Is there really that big a difference between marketing a product with “free shipping!” in the description and marketing a seasonal item with “great Christmas gift!” in the same place?
Two stylistic criticisms of the Hu paper: (1) I think it could have been better organized by grouping the hypothesis and result of each research question together, rather than having separate hypothesis and result sections (and I feel the same way about our class project final report). I frequently found myself paging back and forth between the results, the hypotheses, and the background that led to those hypotheses. (2) I was intrigued by the tabular related work approach. I can see it being useful in well-developed fields and for survey papers. However, in more recent and novel research, this approach makes it difficult to understand the novelty of the work performed by the authors. It’s more a list of past results than an explanation of the authors’ contributions.