Recent advances in computer-aided chemical process design and synthesis rely on surrogate modelling and superstructure optimization. Conventionally, first-principles physical models or data-driven models replace the original rigorous models so that a specific unit operation can be optimized and selected. Despite its achievements, this strategy is inefficient for complex process flowsheets, such as utility and refrigeration systems, in which a large number of unit operations are tightly coupled by recycle streams. To address this problem, this work proposes an integrated data-driven modelling and optimization framework. The framework first constructs a hybrid machine-learning-based surrogate model that automatically reduces the system dimensionality and captures the nonlinearity of the underlying chemical process. An efficient optimization algorithm, specifically an evolutionary algorithm, is then embedded to identify the optimal solution of this surrogate model. Finally, the quality and accuracy of the estimated optimal solution are validated against the rigorous process model. By iterating these steps, optimal operating conditions for the entire process flowsheet are identified efficiently. The novel CryoMan Cascade cycle system for large-scale liquefied natural gas manufacturing is used as the case study. The framework is demonstrated to be superior to the rigorous-model-based optimization approach in time efficiency, solution quality, and flexibility.
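The loop described above (fit a surrogate, search it with an evolutionary algorithm, validate the candidate against the rigorous model, repeat) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the method of this work: `rigorous_model` is a hypothetical cheap stand-in for a flowsheet simulation, the surrogate is a plain least-squares separable quadratic in place of the hybrid machine learning model, no dimensionality reduction is performed, and the evolutionary search is a basic elitist scheme with Gaussian mutation.

```python
import random

def rigorous_model(x):
    """Hypothetical stand-in for an expensive flowsheet simulation (lower is better)."""
    return sum((xi - 0.3) ** 2 for xi in x) + 0.05 * x[0] * x[1]

def features(x):
    """Separable quadratic basis: 1, x_i, x_i^2 (an assumed, deliberately simple surrogate form)."""
    return [1.0] + list(x) + [xi * xi for xi in x]

def solve(a, b):
    """Solve the linear system a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w

def fit_surrogate(samples, ridge=1e-8):
    """Least-squares fit of the quadratic basis to (x, y) samples via the normal equations."""
    rows = [features(x) for x, _ in samples]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, (_, y) in zip(rows, samples)) for i in range(k)]
    w = solve(ata, aty)
    return lambda x: sum(wi * fi for wi, fi in zip(w, features(x)))

def evolve(objective, dim, rng, pop_size=30, generations=40, sigma=0.1):
    """Elitist evolutionary search with Gaussian mutation on the box [0, 1]^dim."""
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [[min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in p]
                    for p in pop]
        # Keep the best pop_size individuals among parents and children.
        pop = sorted(pop + children, key=objective)[:pop_size]
    return pop[0]

def optimize_flowsheet(dim=3, n_initial=20, n_iterations=10, seed=0):
    """Iterate: fit surrogate, optimize it, validate the candidate on the rigorous model."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_initial):
        x = [rng.random() for _ in range(dim)]
        samples.append((x, rigorous_model(x)))
    for _ in range(n_iterations):
        surrogate = fit_surrogate(samples)
        candidate = evolve(surrogate, dim, rng)
        # Validation step: the rigorous result is added to the training data.
        samples.append((candidate, rigorous_model(candidate)))
    return min(samples, key=lambda s: s[1])

if __name__ == "__main__":
    best_x, best_y = optimize_flowsheet()
    print(best_x, best_y)
```

In this sketch the rigorous model is called only once per iteration, while the evolutionary search spends its many evaluations on the cheap surrogate. That is the source of the time saving the framework targets, although the actual surrogate and search settings used in this work may differ.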