Journal of Climate Research

Clustering of Iranian synoptic stations based on meteorological and geographical parameters

Document Type: Original Article

Authors
1 Water Science and Engineering, Ferdowsi University of Mashhad, Iran
2 College of Agriculture, Ferdowsi University of Mashhad, Iran
3 Associate Professor of Mathematical Statistics, Ferdowsi University of Mashhad, Iran
4 Professor of Water Science and Engineering, Ferdowsi University of Mashhad, Iran
Abstract
Clustering is a tool for dividing data into groups. In general, the number of clusters is chosen so that variation within each group is as small as possible and variation between groups is as large as possible. The study area is Iran. The data comprise longitude, latitude, altitude, average temperature, relative humidity and total monthly rainfall for 420 synoptic stations, from the establishment of each station until 2018. After reviewing, screening and repairing the data, 375 stations remained for analysis. Because the length of the statistical period is an important factor influencing clustering, the stations were divided into three groups by record length: 1-5 years (42 stations), 6-10 years (33 stations) and more than 10 years (300 stations). Seven clustering methods were used: hierarchical (3 variants), separation (2 variants) and Ward (2 variants). Two indices, the cophenetic correlation coefficient and the silhouette width, were used to evaluate the clusterings and select among them. The coding was performed in the R statistical software. Based on these two indices, the best number of clusters and method are: 4 clusters with the middle-axis separation method for stations with 1-5 years of data; 5 clusters with the mean-centered hierarchical method for stations with 6-10 years of data; and 4 clusters with the average-axis separation method for stations with more than 10 years of data. The resulting cluster zones were plotted on a map of Iran using ArcGIS for all three groups.

Keywords: Clustering, Geographical coordinates, Synoptic, Iran.
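
The two selection indices named in the abstract can be illustrated with a minimal sketch. This is not the authors' code (they worked in R); it is a pure-Python illustration under stated assumptions: the six two-dimensional points are made-up toy values rather than station data, the variables are z-scored before computing Euclidean distances, and average linkage stands in for whichever hierarchical variants the study used. All function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column (an assumption: makes mixed units comparable)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def pair(i, j):
    """Canonical (small, large) index pair used as a dictionary key."""
    return (i, j) if i < j else (j, i)

def pairwise(points):
    """Euclidean distance for every unordered pair of points."""
    idx = range(len(points))
    return {(i, j): math.dist(points[i], points[j]) for i, j in combinations(idx, 2)}

def average_linkage(d, n):
    """Agglomerative clustering with average linkage.

    Returns (partitions, coph): partitions[k] lists the clusters present when
    k clusters remain; coph[(i, j)] is the merge height at which points i and
    j first fall into the same cluster (their cophenetic distance).
    """
    clusters = [[i] for i in range(n)]
    partitions, coph = {n: [c[:] for c in clusters]}, {}
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            h = mean(d[pair(i, j)] for i in clusters[a] for j in clusters[b])
            if best is None or h < best[0]:
                best = (h, a, b)
        h, a, b = best
        for i in clusters[a]:
            for j in clusters[b]:
                coph[pair(i, j)] = h
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        partitions[len(clusters)] = [c[:] for c in clusters]
    return partitions, coph

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_silhouette(d, clusters):
    """Average silhouette width of a partition with at least two clusters."""
    widths = []
    for ci, c in enumerate(clusters):
        for i in c:
            if len(c) == 1:          # singletons get width 0 by convention
                widths.append(0.0)
                continue
            a = mean(d[pair(i, j)] for j in c if j != i)
            b = min(mean(d[pair(i, j)] for j in other)
                    for oi, other in enumerate(clusters) if oi != ci)
            m = max(a, b)
            widths.append((b - a) / m if m > 0 else 0.0)
    return mean(widths)

# Made-up toy data: two well-separated groups of three points each.
toy = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
d = pairwise(standardize(toy))
partitions, coph = average_linkage(d, len(toy))

# Cophenetic correlation: how faithfully the tree preserves the distances.
pairs = sorted(d)
ccc = pearson([d[p] for p in pairs], [coph[p] for p in pairs])

# Choose the number of clusters with the largest mean silhouette width.
scores = {k: mean_silhouette(d, partitions[k]) for k in range(2, len(toy))}
best_k = max(scores, key=scores.get)
print("cophenetic correlation:", round(ccc, 3))
print("best number of clusters:", best_k)
```

In a comparison like the one the abstract describes, a cophenetic correlation would be computed for each hierarchical method, and the mean silhouette width for each candidate number of clusters, with the highest-scoring combination retained.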





