CLICK HERE to download the data as a single zip file titled
This data set includes 216 news articles on 240 wind turbine accidents that occurred between 1980 and 2013. The analysis of this data set and the insights obtained are reported in the following research paper:
Ertek, G., Chi, X., Zhang, A. N., & Asian, S. (2017, December). Text mining analysis of wind turbine accidents: An ontology-based framework. In Big Data (Big Data), 2017 IEEE International Conference on (pp. 3233-3241). IEEE.
How the data was collected and what it contains are explained below.
As of now, the most extensive data available on the Internet on wind turbine accidents is published by the Caithness Windfarm Information Forum (CWIF), a UK-based grassroots organization opposing wind turbine installations.
While the Caithness list is impressive in size, its quality and reliability are open to question, for the following reasons:
* Many of the web links to the news sources are not valid, and some of the accidents appear in multiple lines of the data.
The data available from other online sources, whatever their size, exhibit similar deficiencies.
These problems limit the usefulness of the Caithness data and of other data in research studies. We therefore collected data on wind turbine accidents ourselves, drawing in part on the Caithness data, and we share the collected data on this page (please click the link at the top of the page to download the data).
The data we collected consists of three folders and an MS Excel file.
The folder News.txt contains the accident news, with each news article in a separate text file:
The folder News.doc contains the news, with each news article in a separate MS Word file:
The MS Excel file News.Database.xlsx contains the structured data created from a detailed reading of the accident news texts:
This MS Excel file is the one that was analyzed in our research paper.
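For readers who want to work with the news texts programmatically, the following is a minimal sketch of how the News.txt folder might be loaded in Python once the zip file has been extracted. It rests on several assumptions not stated on this page: that the text files carry a `.txt` extension, that they can be read as UTF-8, and that the folder sits directly under the extraction directory. The function name `load_news_texts` and its parameters are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def load_news_texts(data_root, folder_name="News.txt"):
    """Return a dict mapping each file's base name to its text content.

    data_root is the directory the zip file was extracted into (assumed
    layout); folder_name is the folder of text files described above.
    """
    news_dir = Path(data_root) / folder_name
    texts = {}
    # Only files ending in .txt are read; this extension is an assumption.
    for path in sorted(news_dir.glob("*.txt")):
        # errors="replace" avoids a crash if a file is not valid UTF-8,
        # since the actual encoding of the files is not documented here.
        texts[path.stem] = path.read_text(encoding="utf-8", errors="replace")
    return texts
```

Calling `load_news_texts("path/to/extracted/data")` would return one entry per news file, so `len(...)` of the result gives a quick check of how many articles were found. The structured data in News.Database.xlsx could probably be read in a similar spirit with `pandas.read_excel`, although its sheet and column names are not described on this page.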