Notes on Text Mining and Analytics - 6 (TMA Notes 6)

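This series of "Text Mining and Analytics" posts has six parts and records my notes from the Coursera course Text Mining and Analytics by ChengXiang Zhai, UIUC. I also hope it is a way to share and discuss with others.
Original post: https://kyzhang.me/2018/05/06/text-mining-week6/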

Text-based prediction

  • Latent Aspect Rating Analysis (LARA): LARA is a generative model for inferring the ratings of latent aspects. It is composed of two stages: aspect segmentation and latent rating regression. The latent aspect weights are not necessarily equal; they are inferred using maximum likelihood.

  • NetPLSA has an additional term in its objective function that penalizes cases where neighboring nodes are assigned different topic coverage. In this way NetPLSA uses both the text and the network structure to mine topics.

  • In the objective function of NetPLSA, increasing λ makes neighboring nodes have more similar topic coverage.

  • Contextual Probabilistic Latent Semantic Analysis (CPLSA) can be used to discover temporal trends of topics in text and to reveal how the coverage of topics in different locations evolves over time.

  • To measure the causality between two time series, the Granger causality test is often used.
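
For the Granger bullet above, here is a minimal sketch of the idea in plain Python. It compares a model that predicts y from its own past with a model that also sees the past of x, and turns the drop in residual error into an F statistic. The helper names (`solve`, `rss`, `granger_f`), the lag of 1, and the synthetic series are all my own assumptions for illustration, not something from the course.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rss(X, y):
    """Residual sum of squares of the least-squares fit y ~ X."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def granger_f(x, y, lags=1):
    """F statistic for the hypothesis 'x Granger-causes y'."""
    target = y[lags:]
    restricted, unrestricted = [], []
    for t in range(lags, len(y)):
        own = [y[t - k] for k in range(1, lags + 1)]    # past values of y
        other = [x[t - k] for k in range(1, lags + 1)]  # past values of x
        restricted.append([1.0] + own)
        unrestricted.append([1.0] + own + other)
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    dof = len(target) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / dof)

# Synthetic data (made up): y depends on the previous value of x, not vice versa.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0]
for t in range(1, 200):
    y.append(0.8 * x[t - 1] + rng.gauss(0, 0.5))

f_xy = granger_f(x, y)  # past x helps predict y, so this should be large
f_yx = granger_f(y, x)  # past y should not help predict x
```

A large F statistic only says that the past of one series helps predict the other, which is weaker than true causation. In practice I would probably reach for a library routine such as `grangercausalitytests` in statsmodels rather than hand-rolled regression.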
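
For the NetPLSA bullets above, a small sketch of how I understand the regularized objective: the PLSA log-likelihood is combined with a penalty on the squared difference in topic coverage between connected nodes, and λ trades the two off. I write it here as a quantity to maximize. The function names, the edge weights, and the two toy documents are assumptions made up for illustration.

```python
import math

def log_likelihood(docs, coverage, topics):
    """PLSA log-likelihood: sum_d sum_w c(w,d) * log sum_j p(theta_j|d) p(w|theta_j)."""
    total = 0.0
    for d, counts in docs.items():
        for w, c in counts.items():
            p = sum(coverage[d][j] * topics[j][w] for j in range(len(topics)))
            total += c * math.log(p)
    return total

def network_penalty(edges, coverage):
    """0.5 * sum over edges of weight * squared difference in topic coverage."""
    return 0.5 * sum(
        w * sum((pu - pv) ** 2 for pu, pv in zip(coverage[u], coverage[v]))
        for u, v, w in edges
    )

def netplsa_objective(docs, edges, coverage, topics, lam):
    """(1 - lam) * log-likelihood - lam * network penalty (to be maximized)."""
    return ((1 - lam) * log_likelihood(docs, coverage, topics)
            - lam * network_penalty(edges, coverage))

# Toy data (hypothetical): two connected nodes, two topics, two words.
topics = [{"a": 0.9, "b": 0.1}, {"a": 0.1, "b": 0.9}]
docs = {"d1": {"a": 3, "b": 1}, "d2": {"a": 1, "b": 3}}
edges = [("d1", "d2", 1.0)]

fit_text = {"d1": [0.9, 0.1], "d2": [0.1, 0.9]}  # fits each document's words
smooth = {"d1": [0.5, 0.5], "d2": [0.5, 0.5]}    # identical coverage on neighbors

# With lam = 0 only the text matters; with a large lam the network term dominates.
prefers_text = (netplsa_objective(docs, edges, fit_text, topics, 0.0)
                > netplsa_objective(docs, edges, smooth, topics, 0.0))
prefers_smooth = (netplsa_objective(docs, edges, smooth, topics, 0.9)
                  > netplsa_objective(docs, edges, fit_text, topics, 0.9))
```

In this toy case the text-fitting coverage scores higher when λ = 0, while the smooth coverage scores higher when λ = 0.9, which matches the note that increasing λ pushes neighbors toward similar topic coverage.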
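
For the LARA bullet above, a rough sketch of the prediction side of latent rating regression as I understand it: after aspect segmentation, each aspect gets a rating from the sentiment weights of the words assigned to it, and the overall rating is a weighted sum of those aspect ratings. The aspect names, word weights, and aspect weights below are invented toy values (LARA would infer them from data), and I use raw word counts for simplicity.

```python
def aspect_rating(word_counts, word_weights):
    """Aspect rating = sum over words of count * sentiment weight."""
    return sum(count * word_weights.get(word, 0.0)
               for word, count in word_counts.items())

def overall_rating(aspect_ratings, aspect_weights):
    """Expected overall rating = weighted sum of the latent aspect ratings."""
    return sum(aspect_weights[a] * r for a, r in aspect_ratings.items())

# Stage 1 output (aspect segmentation): word counts per aspect for one review.
segments = {
    "location": {"convenient": 1},
    "service": {"friendly": 2, "slow": 1},
}

# Hypothetical model parameters; in LARA these are latent and estimated.
word_weights = {
    "location": {"convenient": 4.0, "far": -2.0},
    "service": {"friendly": 2.0, "slow": -1.0},
}
aspect_weights = {"location": 0.25, "service": 0.75}  # not equal across aspects

# Stage 2 (latent rating regression), prediction direction only.
ratings = {a: aspect_rating(segments[a], word_weights[a]) for a in segments}
predicted = overall_rating(ratings, aspect_weights)
```

The sketch only runs the model forward. The interesting part of LARA is the reverse direction: given the review text and the observed overall rating, infer the aspect ratings and aspect weights that best explain them.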

Course summary

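The six-week course went by quickly. Looking back, there are many points worth understanding in more depth. The course slides recommend readings for the more important ones, and I should dig into them on my own later.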

Topics Covered in This Course
Key High-Level Take-Away Messages

Reference

[1] C. Zhai and S. Massung, Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining. ACM and Morgan & Claypool Publishers, 2016. Chapters 18 & 19.
[2] Hongning Wang, Yue Lu, and ChengXiang Zhai, Latent aspect rating analysis on review text data: a rating regression approach. In Proceedings of ACM KDD 2010, pp. 783-792, 2010. doi: 10.1145/1835804.1835903
[3] Hongning Wang, Yue Lu, and ChengXiang Zhai. 2011. Latent aspect rating analysis without aspect keyword supervision. In Proceedings of ACM KDD 2011, pp. 618-626. doi: 10.1145/2020408.2020505
[4] ChengXiang Zhai, Atulya Velivelli, and Bei Yu. A cross-collection mixture model for comparative text mining. In Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining (KDD 2004). ACM, New York, NY, USA, 743-748. doi: 10.1145/1014052.1014150
[5] Qiaozhu Mei, Contextual Text Mining, Ph.D. Thesis, University of Illinois at Urbana-Champaign, 2009.
[6] Hyun Duk Kim, Malu Castellanos, Meichun Hsu, ChengXiang Zhai, Thomas Rietz, and Daniel Diermeier. Mining causal topics in text data: Iterative topic modeling with time series feedback. In Proceedings of the 22nd ACM international conference on information & knowledge management (CIKM 2013). ACM, New York, NY, USA, 885-890. doi: 10.1145/2505515.2505612
[7] Noah Smith, Text-Driven Forecasting. Retrieved on May 31, 2015 from http://www.cs.cmu.edu/~nasmith/papers/smith.whitepaper10.pdf