Although the Bayesian inference approach possesses promising features, its application to large-scale complex systems faces challenges. One obstacle is the high computational cost. In general, Monte Carlo-based analysis is adopted to acquire the dynamic responses of parameterized model samples, which may incur prohibitive computational cost when the dimension of the finite element model is high. In some special cases, the computational issue can be resolved by asymptotically approximating the posterior PDF with the so-called most probable value [18]. This technique, however, is only valid for simplified cases [19]. For large-scale problems, one school of thought is to reduce the cost of every single run in the Monte Carlo analysis by employing order-reduced models [20–23]. Another school of thought is to build surrogate models that mimic the behavior of the original finite element model through generic input–output relations, such as response surface models [24,25], artificial neural networks [26], and Kriging predictors [27]. One issue with these approaches is that the error of response prediction, e.g., the modal truncation error in the order-reduced approach, may become considerable compared with the actual response deviation between the measurement and the finite element model prediction. Many researchers have instead examined the sampling procedure itself and suggested Markov chain Monte Carlo (MCMC) for analysis acceleration [28–30]. A Markov chain containing a reduced number of samples is generated using, for example, the Metropolis–Hastings (MH) algorithm and the importance sampling technique. When applying MH MCMC, the proposal PDF must be defined with a proper variance, which fundamentally determines the random sampling behavior over the entire parametric space. The efficiency and accuracy of MCMC depend heavily on the selection of this proposal PDF.
When the posterior PDF is peaked, its peaked region may never be reached if the proposal PDF is too wide (i.e., has a large variance). On the other hand, the Markov chain travels very slowly before reaching the peaked region if the proposal PDF is too narrow [29,31,32]. Even though high efficiency may be attained with a wide proposal PDF, the resulting posterior PDF may not be informative, since it contains only sparsely distributed data points, most of which lie outside the peaked region. Without this essential information, the model parameters identified from such a posterior PDF cannot be considered optimal. Clearly, improving the identification accuracy requires enriching the sparse posterior PDF, an issue that has not yet been addressed in related studies.
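The proposal-variance trade-off described above can be illustrated with a minimal random-walk Metropolis–Hastings sampler. This is a generic sketch, not the implementation used in the cited studies; the sharply peaked one-dimensional Gaussian target and the two proposal standard deviations are illustrative assumptions chosen to exaggerate the effect.

```python
import math
import random

def metropolis_hastings(log_target, x0, proposal_std, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.

    The proposal standard deviation governs the trade-off: too wide and
    most candidates are rejected near a sharp peak, too narrow and the
    chain drifts only slowly toward the peaked region.
    """
    rng = random.Random(seed)
    x, log_p = x0, log_target(x0)
    chain, accepted = [], 0
    for _ in range(n_samples):
        x_new = x + rng.gauss(0.0, proposal_std)   # symmetric random walk
        log_p_new = log_target(x_new)
        # Accept with probability min(1, p(x_new)/p(x)), in log form.
        if math.log(rng.random()) < log_p_new - log_p:
            x, log_p = x_new, log_p_new
            accepted += 1
        chain.append(x)
    return chain, accepted / n_samples

# A sharply peaked 1-D posterior: Gaussian centered at 3 with std 0.05.
log_peak = lambda x: -0.5 * ((x - 3.0) / 0.05) ** 2

# Narrow proposal: high acceptance rate, but slow travel toward the peak.
_, acc_narrow = metropolis_hastings(log_peak, x0=0.0,
                                    proposal_std=0.01, n_samples=5000)
# Wide proposal: large jumps, but most candidates land off the peak
# and are rejected, leaving sparsely distributed accepted samples.
_, acc_wide = metropolis_hastings(log_peak, x0=0.0,
                                  proposal_std=5.0, n_samples=5000)
print(f"acceptance (narrow) = {acc_narrow:.2f}, "
      f"acceptance (wide) = {acc_wide:.2f}")
```

Comparing the two acceptance rates makes the dilemma concrete: neither extreme yields a chain that densely populates the peaked region, which motivates the enrichment of the sparse posterior samples discussed above.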