Summary: |
In the real estate industry, information is an essential good that influences the behaviour of market participants. News articles are one main source of information about the market. For financial markets, and especially for the real estate market, the quantification of text represents a new source for extracting market sentiment. In this study, I examine a newly constructed corpus of news articles on the London real estate market with the help of supervised learning algorithms (i.e. SVM, Maximum Entropy, GLMNET). The corpus comprises more than 100,000 articles covering a period of 11 years (2004-2015). One central issue in this process is the annotation of the documents in the training corpus. Since no annotated news corpus exists for the real estate market, and labelling such a large corpus manually would be costly in several respects, I propose a new method for closing this gap: training the classifiers on real estate related Amazon book reviews, an approach that proves quite promising. I used more than 220,000 reviews for the training process. The results suggest that the book reviews are a good substitute and that classifiers trained on them are able to extract the sentiment from the articles. Graphical results indicate, at least for some of the classifiers, that the underlying market sentiment was extracted. The textual sentiment indicators are also able to improve the performance of different models. I use the textual indicators in a probit model to test whether they have any signalling power about future developments. |