Advanced tools such as machine learning play a crucial role in exploring the water-energy-carbon nexus of wastewater treatment (WWT). Existing research has mainly constructed energy efficiency models, but these rarely consider comprehensive dimensions or compare pollutant removal types. In this study, we conducted spatial and temporal modeling to predict the energy consumption (EC) of WWT using machine learning approaches. EC (kWh) was the target feature, and the input features covered operational conditions, environmental benefits, and externalities. The optimal spatial model achieved a test R² of 0.8224 with ridge regression, while the optimal temporal model achieved a test R² of 0.7253 with random forest. In addition, the removal amount (10³ kg) fit EC best in the spatial modeling, whereas the discharge concentration (mg/L) fit EC best in the temporal modeling. Notably, treatment volume, the removal of chemical oxygen demand, and the removal of ammonia nitrogen emerged as the most significant factors. These findings point to optimization measures that include exploiting economies of scale and improving aeration. The spatial and temporal dimensions also revealed tailored strategies for influent regulation, technology selection, and effluent standard setting for a specific region and season. The results provide guidance for the operation of existing WWT projects and the design of future ones toward energy saving and carbon neutrality.
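
As a rough illustration of the evaluation described above, the following minimal sketch fits a ridge regression of EC on a single input and computes a test R². It is not the study's pipeline: the actual models use many input features, whereas this sketch assumes one feature (treatment volume) so that the ridge solution has a simple closed form. The function names `fit_ridge_1d` and `r2_score`, the penalty `alpha`, and all numbers are hypothetical and invented for illustration only.

```python
from statistics import fmean


def fit_ridge_1d(x, y, alpha=1.0):
    """Fit y ~ intercept + slope * x with an L2 penalty on the slope only.

    Minimizing sum((y - a - b*x)^2) + alpha * b^2 gives
    b = S_xy / (S_xx + alpha) and a = mean(y) - b * mean(x).
    """
    x_bar, y_bar = fmean(x), fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / (s_xx + alpha)
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values (NOT data from the study): treatment volume vs. EC,
# both in arbitrary units, split into a training and a held-out test set.
volume_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ec_train = [0.9, 1.6, 2.2, 2.9, 3.4, 4.1]
volume_test = [1.5, 3.5, 5.5]
ec_test = [1.3, 2.5, 3.8]

intercept, slope = fit_ridge_1d(volume_train, ec_train, alpha=0.5)
ec_pred = [intercept + slope * v for v in volume_test]
test_r2 = r2_score(ec_test, ec_pred)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, test R^2 = {test_r2:.4f}")
```

The test R² is computed only on samples held out from fitting, which is what makes it a measure of predictive rather than in-sample fit. A multi-feature analysis would more likely rely on an established library implementation of ridge regression and random forest than on this closed form.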