Web-Based Text Mining for Extracting Relationships among Policies of Building and the Construction Industry

Indexed by:Conference paper

Date of Publication:2014-09-27

Included Journals:EI, Scopus

Page Number:237-245

Abstract:Web-based data mining is an emerging technology that is increasingly applied in decision support systems across many industries. The objective of this study is to develop a Web-based text mining system, specific to the building and construction industry, for gathering, processing and analyzing web content. An authoritative website in the housing industrialization field is chosen as a case study, which is carried out in four steps. First, a web crawler module is constructed to gather the original web articles. Second, a web content processing module parses the HTML tags and segments the text content of each page. Third, a relational database deployed on a cloud platform stores the processing results. Finally, the Vector Space Model and the TF-IDF algorithm are used to represent the articles and to calculate the relationships among all the articles gathered by the web crawler module. Because the government continuously issues news and policies online, the proposed text mining system makes it possible to identify the key points and trends embodied in these policies in a timely manner. © 2014 American Society of Civil Engineers.
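
The final step of the abstract (Vector Space Model with TF-IDF weighting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy token lists, the specific TF-IDF variant (relative term frequency times log(N/df)) and the use of cosine similarity as the relationship measure are all assumptions made for illustration. The abstract does not state which variant or similarity measure the paper uses.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn token lists into sparse {term: weight} TF-IDF vectors.

    Assumed weighting: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, already-segmented articles (stand-ins for crawler output).
articles = [
    "housing industrialization policy prefabricated building".split(),
    "prefabricated building subsidy policy".split(),
    "green energy standard".split(),
]

vectors = tf_idf_vectors(articles)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"article 0 vs 1: {sim_related:.3f}")
print(f"article 0 vs 2: {sim_unrelated:.3f}")
```

In this sketch, articles that share weighted terms receive a positive similarity, while articles with no terms in common score zero. For Chinese-language pages, a word segmentation step would have to produce the token lists first, as the second step of the abstract describes.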
