Clustering of Iran's synoptic stations based on meteorological and geographical parameters

Article type: Research article

Authors

1 PhD student in Agricultural Meteorology, Ferdowsi University of Mashhad, Iran

2 Ferdowsi University of Mashhad

3 Associate Professor of Mathematical Statistics, Ferdowsi University of Mashhad, Iran

4 Professor of Water Science and Engineering, Ferdowsi University of Mashhad, Iran

Abstract

Clustering is a tool that assigns the available data to distinct groups. The number of clusters is generally chosen so that within-group variation is smallest and between-group variation is largest. The study area is the whole of Iran. Longitude, latitude, elevation, mean temperature, relative humidity and total monthly precipitation from 420 synoptic stations, from the establishment of each station until 2018, were used in this study. After the data were checked, cleaned and repaired, only 375 stations remained for the rest of the study. Because the length of the record is an important factor influencing clustering, the stations were grouped by record length into three classes: less than 5 years (42 stations), 6-10 years (33 stations) and more than 10 years (300 stations). Seven clustering methods, hierarchical (5 variants) and partitioning (2 variants), were used. The cophenetic correlation coefficient and the silhouette width test were used as the two criteria for selecting the clustering method. All coding was done in the R statistical software. Based on the cophenetic coefficient and silhouette indices, the best number of clusters and method are 4 clusters with the median-centered partitioning method for the 1-5-year data, 5 clusters with the mean-centered hierarchical method for the 6-10-year data, and 4 clusters with the mean-centered partitioning method for stations with records longer than 10 years. The clusters for all three classes were mapped onto the geographical map of Iran using ARCGIS software.
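As an illustration of the cophenetic correlation coefficient named in the abstract, the sketch below builds an average-linkage hierarchy on a handful of made-up points and correlates the original pairwise distances with the heights at which each pair is first merged. It is only a minimal sketch: the authors worked in R, the choice of average linkage and Euclidean distance is an assumption, and the `stations` values and all function names are hypothetical, not taken from the study.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cophenetic_correlation(points):
    """Cophenetic correlation of an average-linkage hierarchy (needs >= 3 points)."""
    n = len(points)
    # Original pairwise distances, keyed by (i, j) with i < j.
    dist = {pair: euclidean(points[pair[0]], points[pair[1]])
            for pair in combinations(range(n), 2)}
    clusters = {i: [i] for i in range(n)}  # cluster id -> member indices
    coph = {}  # (i, j) -> height at which i and j first share a cluster

    def linkage(c1, c2):
        # Average linkage: mean distance over all cross-cluster member pairs.
        total = sum(dist[min(i, j), max(i, j)]
                    for i in clusters[c1] for j in clusters[c2])
        return total / (len(clusters[c1]) * len(clusters[c2]))

    next_id = n
    while len(clusters) > 1:
        # Merge the two closest clusters and record the merge height.
        c1, c2 = min(combinations(sorted(clusters), 2),
                     key=lambda pair: linkage(*pair))
        height = linkage(c1, c2)
        for i in clusters[c1]:
            for j in clusters[c2]:
                coph[min(i, j), max(i, j)] = height
        clusters[next_id] = clusters.pop(c1) + clusters.pop(c2)
        next_id += 1

    pairs = sorted(dist)
    return pearson([dist[p] for p in pairs], [coph[p] for p in pairs])


# Hypothetical, already-standardized station features (NOT data from the study).
stations = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
            (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
coph_corr = cophenetic_correlation(stations)
print(round(coph_corr, 3))
```

A value close to 1 means the dendrogram preserves the original distances well, which is how such a coefficient is usually used to compare hierarchical linkage variants.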

Keywords


Article title [English]

Clustering of Iranian synoptic stations based on meteorological and geographical parameters

Authors [English]

  • Vajiheh Mohammadi Sabet 1
  • Mohammad Mousavi Bayegi 2
  • Mehdi Jabari Noghabi 3
  • Kamran Davari 4
1 Water Science and Engineering, Ferdowsi University of Mashhad, Iran
2 College of Agriculture, Ferdowsi University of Mashhad
3 Associate Professor of Mathematical Statistics, Ferdowsi University of Mashhad, Iran
4 Professor of Water Science and Engineering, Ferdowsi University of Mashhad, Iran
Abstract [English]

Clustering is a tool that divides the available data into distinct groups. The number of clusters is generally chosen so that variation within groups is smallest and variation between groups is largest. The study area is the whole of Iran. Longitude, latitude, altitude, mean temperature, relative humidity and total monthly rainfall from 420 synoptic stations, from the establishment of each station until 2018, were used in this study. After the data were reviewed, screened and repaired, only 375 stations remained for the rest of the study. Because the length of the statistical period is an important factor influencing clustering, the stations were divided by record length into three classes: less than 5 years (42 stations), 6-10 years (33 stations) and more than 10 years (300 stations). Seven clustering methods were used: hierarchical (3 variants), partitioning (2 variants) and Ward (2 variants). The cophenetic correlation coefficient and the silhouette width test were the two criteria used to select the clustering method. The coding was performed in the R statistical software. Based on the cophenetic and silhouette indices, the best number of clusters and method are 4 clusters with the median-centered partitioning method for the 1-5-year data, 5 clusters with the mean-centered hierarchical method for the 6-10-year data, and 4 clusters with the mean-centered partitioning method for stations with a statistical period of more than 10 years. The zoning of the clusters is plotted on the geographical map of Iran using ARCGIS software for all three classes.

Keywords: Clustering, Geographical coordinates, Synoptic, Iran.
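The silhouette width criterion mentioned in the abstract can be sketched as follows. For each point, the mean distance to its own cluster (a) is compared with the mean distance to the nearest other cluster (b), giving a score of (b - a) / max(a, b), and the scores are averaged. This is a minimal illustration under stated assumptions, not the authors' R code: the Euclidean distance, the toy `points`, both labelings and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette_width(points, labels):
    """Average silhouette width over all points for a given labeling."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(euclidean(points[i], points[j])
                for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(euclidean(points[i], points[j]) for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical, already-standardized station features (NOT data from the study).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
good = [0, 0, 0, 1, 1, 1]  # labeling that follows the two visible groups
bad = [0, 1, 0, 1, 0, 1]   # labeling that mixes the two groups

print(mean_silhouette_width(points, good) > mean_silhouette_width(points, bad))
```

Comparing the average width across candidate numbers of clusters and across methods is one common way to pick a configuration; the labeling with the higher average separates the groups more cleanly.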






Keywords [English]

  • Clustering
  • Geographical coordinates
  • Synoptic
  • Iran