As modern power systems evolve with the growing penetration of renewable energy and dynamic grid operations, ensuring stability, resilience, and sustainability has become a critical challenge. The resulting unpredictability necessitates advanced monitoring of
power systems to maintain their stability and ensure reliable operation. Phasor Measurement Unit (PMU) driven Wide-Area Monitoring Systems (WAMS) supply real-time information that is essential for strengthening grid stability. However, PMU-measured data are often corrupted by hardware malfunctions, faulty sensors, cyber-attacks, and communication mismatches between the PMU and the phasor data concentrator (PDC), leading to missing values, outliers, and noise that hamper the stability and reliability of the
power system. Distinguishing between ambient and ringdown signals in the PMU data streams, especially in noisy environments, poses substantial challenges. Precise classification is crucial for detecting data patterns and ensuring accurate data recovery.
To tackle these challenges, this work presents a novel machine-learning-based approach that uses an Improved Random Forest (RF) technique for signal classification and an Improved XGBOOST technique to counteract the effects of outliers and up to
55% missing values. The Improved Random Forest technique incorporates 10-fold cross-validation for feature selection and ranking. Cross-validation ensures that optimal features are selected, which makes the Improved Random Forest algorithm not only more accurate but also significantly faster. The Improved XGBOOST approach proceeds in two steps. In the first step, outliers are detected using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, and the identified outliers are converted into missing values. In the second step, both the pre-existing missing values and those generated by the DBSCAN outlier detection are imputed using the XGBOOST algorithm. This two-step scheme enhances data quality by addressing both outliers and missing values, resulting in more accurate recovery of signal components from the degraded PMU measurements. The recovered signal is subsequently fed to the TLS-ESPRIT algorithm to estimate modal damping and frequency parameters. A comprehensive comparative analysis has been carried out using synthetic test signals, real-world measurements collected from the Western Electricity Coordinating Council (WECC), oscillatory signals generated from the IEEE 39-bus system and validated on a Real Time Digital Simulator (RTDS), and two-area system data. The results demonstrate the superior efficacy and robustness of the proposed approach compared with conventional methods.
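
The two-step recovery scheme described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Several choices are assumptions made for illustration only: DBSCAN is applied to sample amplitude alone, the `eps` and `min_samples` values are arbitrary, the test signal is a hypothetical damped sinusoid, and plain linear interpolation stands in for the XGBOOST imputer so that the example runs without external libraries. All function names are hypothetical.

```python
import math


def dbscan_noise_mask(values, eps, min_samples):
    """Flag samples that DBSCAN would label as noise (1-D amplitude, brute force).

    A point is noise if it is not a core point and has no core point
    within eps. Missing samples (None) are ignored.
    """
    idx = [i for i, v in enumerate(values) if v is not None]
    neighbours = {
        i: [j for j in idx if abs(values[i] - values[j]) <= eps] for i in idx
    }
    core = {i for i in idx if len(neighbours[i]) >= min_samples}
    noise = [False] * len(values)
    for i in idx:
        if i not in core and not any(j in core for j in neighbours[i]):
            noise[i] = True
    return noise


def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest valid samples.

    Stand-in for the regression-based imputer (assumption, not XGBOOST).
    """
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = (1 - w) * values[left] + w * values[right]
    return out


def recover(values, eps=0.3, min_samples=5):
    """Step 1: turn DBSCAN noise points into gaps. Step 2: impute every gap."""
    noise = dbscan_noise_mask(values, eps, min_samples)
    gapped = [None if noise[i] else v for i, v in enumerate(values)]
    return interpolate_missing(gapped), noise


# Hypothetical ringdown-like test signal: one damped 0.8 Hz mode, 20 samples/s.
clean = [
    math.exp(-0.1 * k * 0.05) * math.cos(2 * math.pi * 0.8 * k * 0.05)
    for k in range(200)
]
corrupted = list(clean)
corrupted[50] = 5.0            # injected outliers
corrupted[120] = -4.0
for k in (70, 71, 72):         # pre-existing missing samples
    corrupted[k] = None

recovered, noise = recover(corrupted)
flagged = [i for i, is_noise in enumerate(noise) if is_noise]
```

In this sketch, `flagged` holds the indices that step 1 converts to gaps, and `recovered` is the gap-free series that would be passed on to modal estimation. Replacing `interpolate_missing` with a trained regressor would bring the sketch closer to the scheme the abstract describes.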