Abstract
In 2021, Tengizchevroil (TCO) began a major standardization of business processes, data, and technology in several areas, particularly asset management and materials management, in line with Chevron’s enterprise program. The digital solutions introduced under this program aim to boost competitive performance, increase reliability, prevent failures, manage risk, and minimize the lifecycle cost of surface assets. Successful implementation of these solutions depends on high-quality data management, which was achieved by automating data clean-up and enrichment procedures with Computer Vision (CV) and Large Language Model (LLM) technologies.
The primary challenge was the large amount of data missing from the existing ERP system. To address this historical issue, CV and LLM technologies were employed to reduce costs and improve quality. The project was complicated by the need to convert scanned files (PDFs) into machine-readable JSON using a CV tool. The resulting deeply nested JSON files were then parsed into structured datasets, and a reliable LLM was used to allocate the records to the corresponding objects in the ERP system.
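The parsing step can be illustrated with a minimal sketch. It assumes the CV tool returns nested dictionaries and lists; the sample payload, field names, and the `flatten` helper are hypothetical and are not taken from the project itself.

```python
import json
from typing import Any, Dict


def flatten(node: Any, prefix: str = "") -> Dict[str, Any]:
    """Recursively flatten nested dicts/lists into one flat row.

    Keys are joined with dots; list positions appear as [index].
    """
    flat: Dict[str, Any] = {}
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        # Leaf value: store it under the accumulated path.
        flat[prefix] = node
    return flat


# Hypothetical CV output for one scanned datasheet (illustrative only).
sample = json.loads("""
{
  "document": {
    "tag": "P-101",
    "fields": [
      {"name": "Manufacturer", "value": "ACME"},
      {"name": "Rated power", "value": "75 kW"}
    ]
  }
}
""")

row = flatten(sample)
for path, value in row.items():
    print(path, "=", value)
```

Each flattened row can then be loaded into a tabular dataset, where a later matching step (in this work, an LLM) assigns the values to ERP objects.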
Fifteen AI models were trained before suitable ones were identified. As a result, approximately 40,000 rows of data were gathered to enrich the relevant attributes and characteristics of surface assets and materials. This approach saved about 100 working days for a team of eight data analysts with engineering backgrounds, who each typically process 50 attributes per day. Considerable research effort went into obtaining the first results, which have now appeared in the ERP system.
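The reported saving can be checked with simple arithmetic. The sketch below assumes that one row corresponds to one attribute value and that the rate of 50 attributes per day applies to each analyst; both are assumptions made here to reconcile the figures, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
rows_enriched = 40_000
analysts = 8
attributes_per_analyst_per_day = 50  # assumed to be a per-analyst rate

# Manual throughput of the whole team per working day.
team_throughput = analysts * attributes_per_analyst_per_day

# Working days the team would need to process the same volume by hand.
working_days_saved = rows_enriched / team_throughput

print(team_throughput, working_days_saved)
```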