The shipbuilding and offshore industry produces various types of data. As the number of ships and offshore structures built over time has increased, an enormous amount of data, so-called big data, has had to be handled. However, it is difficult to handle such big data effectively with existing methods for data storage and processing. Therefore, big data technology needs to be applied to shipyard systems, such as the product lifecycle management system. Meanwhile, the construction of an offshore structure requires a large amount of piping and a correspondingly large number of piping materials. For a shipyard that executes multiple projects at once, it is not easy to correlate so many piping materials. A piping designer must check all materials based on his or her understanding of the design characteristics. However, depending on the designer's level of experience, it can be difficult to handle such a large amount of data manually; as a result, errors can occur in the piping design. In this study, a big data framework applicable to shipyards is proposed and used for analysis. Specifically, an association analysis of the piping materials of an offshore structure is performed on the big data framework, which can process a large amount of data, to assist piping designers. As an application, the material data of one offshore structure were analyzed, and the applicability of the proposed approach was evaluated based on the results. We believe this study can help piping designers.

1. Introduction
1.1. Research background

Big data and its applications are among the fastest-growing technologies in several industries. Big data can be defined as a set of computer technologies that can store, process, and manage far more data than could be handled before. With this next-generation technology and architecture, data can be easily gathered and analyzed. Big data is often characterized by three words that start with "V" (Beyer 2011). The first is volume, which refers to the large amount of data. The second is velocity, which refers to how fast the data are processed; ideally, processing is fast enough to obtain analysis results in real time, without delay. The third is variety, which refers to the many types of data that can be processed, including unstructured data. Recently, two more words have been added because three words alone cannot define the characteristics of big data: veracity and value. Veracity concerns selecting high-quality data from all available data (Villanova University 2018), and value is the main reason why people use big data technology (Mauro et al. 2013). Table 1 briefly describes the characteristics of big data.
