Text Mining with RapidMiner

This chapter introduces the text mining capabilities of RapidMiner through a use case: mining hotel reviews from TripAdvisor.com, a popular web portal. We demonstrate basic text mining in RapidMiner using its text mining extension. We present two RapidMiner processes, Process01 and Process02, which show how text mining can be combined with association mining and with cluster modeling, respectively. Each process could be built from scratch by inserting the appropriate operators into the process view, but we instead import both from existing model files. At several points in the chapter, we deliberately instruct the reader to take erroneous steps that lead to undesired outcomes. We believe this is a realistic way to learn RapidMiner, since in practice the modeling process frequently involves such missteps, which are corrected later.
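To make the idea of combining text mining with association mining concrete, here is a minimal sketch in Python. It is not the chapter's RapidMiner process: the toy reviews, the stopword list, and the function names `tokenize` and `frequent_pairs` are all assumptions introduced for illustration. Each review is reduced to a set of terms, and pairs of terms that co-occur in a large enough fraction of reviews are reported, which is the first step of association mining.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical toy reviews; NOT taken from the TripAdvisor dataset.
REVIEWS = [
    "Clean room and friendly staff",
    "Friendly staff, great location",
    "Great location but noisy room",
    "Clean room, great location, friendly staff",
]

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"and", "but", "the", "a"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords.

    Returns the set of distinct terms, so each review acts as one
    'transaction' for association mining.
    """
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def frequent_pairs(docs, min_support):
    """Return term pairs whose support is at least min_support.

    Support is the fraction of documents that contain both terms.
    """
    counts = Counter()
    for doc in docs:
        for pair in combinations(sorted(tokenize(doc)), 2):
            counts[pair] += 1
    n = len(docs)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


pairs = frequent_pairs(REVIEWS, min_support=0.75)
for pair, support in sorted(pairs.items()):
    print(pair, support)
```

A full association mining step would go on to derive rules (with confidence values) from such frequent term sets; the sketch stops at pair support to stay short.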

Ertek, G., Tapucu, D., and Arın, I., 2013. Text Mining with RapidMiner. In: Markus Hofmann, Ralf Klinkenberg (Eds.) RapidMiner: Data Mining Use Cases and Business Analytics Applications. Chapman & Hall/CRC Data Mining and Knowledge Discovery Series. Chapman and Hall/CRC.

Note: This is the final draft version of this paper. Please cite this paper (or this final draft) as above.


Download SUPPLEMENT Data
TripAdvisor Dataset
