Clustering of Iranian synoptic stations based on meteorological and geographical parameters

Article type: Research article

Author

Water Science and Engineering, Ferdowsi University of Mashhad, Iran

Abstract

Clustering is a tool that assigns the available data to different groups. In general, the number of clusters is determined on the basis of the smallest within-group variation and the largest between-group variation. The study area is the whole of Iran. Longitude, latitude, elevation, mean temperature, relative humidity and total monthly precipitation from 420 synoptic stations, from the time each station was established until 2018, were used in this study. After the data were reviewed, cleaned and repaired, only 375 stations remained for the rest of the study. Because the length of the statistical record is an important factor in clustering, the stations were grouped by record length into three classes: less than 5 years with 42 stations, 6-10 years with 33 stations, and more than 10 years with 300 stations. Seven clustering methods, hierarchical (5 subsets) and partitioning (2 subsets), were used in this study. The cophenetic correlation coefficient and the silhouette width test were used as the two criteria for selecting the clustering method. The coding was done in the R statistical software. Based on the cophenetic coefficient and silhouette indices, the best number of clusters and method are 4 clusters with the median-based partitioning method for the 1-5-year data, 5 clusters with the mean-based hierarchical method for the 6-10-year data, and 4 clusters with the mean-based partitioning method for stations with a record of more than 10 years. The clusters were mapped onto the geographical map of Iran with ArcGIS software for all three classes.
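As an illustration of the silhouette criterion mentioned above, the following is a minimal sketch of the average silhouette width. It is written in Python with the standard library only, whereas the study itself used R; the function name `silhouette_width` and the toy two-dimensional points are assumptions made for this example and are not taken from the paper.

```python
from math import dist
from statistics import mean


def silhouette_width(points, labels):
    """Average silhouette width of a labelled set of points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical, already standardized feature vectors for six stations.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
compact = silhouette_width(toy_points, [0, 0, 0, 1, 1, 1])
mixed = silhouette_width(toy_points, [0, 1, 0, 1, 0, 1])
print(compact, mixed)
```

A partition whose groups are compact and well separated yields a value close to 1, so one plausible use is to compute this score for each candidate method and number of clusters and keep the combination with the highest value.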

Keywords


Article title [English]

Clustering of Iranian synoptic stations based on meteorological and geographical parameters

Author [English]

  • Vajiheh Mohammadi Sabet
Water Science and Engineering, Ferdowsi University of Mashhad, Iran.
Abstract [English]

Clustering is a tool that partitions data into groups. In general, the number of clusters is chosen to minimize within-group variation and maximize between-group variation. The study area is the whole of Iran. Longitude, latitude, elevation, mean temperature, relative humidity and total monthly precipitation from 420 synoptic stations, from each station's establishment until 2018, were used in this study. After the data were reviewed, screened and repaired, only 375 stations remained for the rest of the analysis. Because the length of the statistical record is an important factor in clustering, the stations were divided into three groups by record length: less than 5 years (42 stations), 6-10 years (33 stations) and more than 10 years (300 stations). Seven clustering methods were used: hierarchical (3 subsets), partitioning (2 subsets) and Ward (2 subsets). The cophenetic correlation coefficient and the silhouette width were the two criteria used to select the clustering method. The coding was done in the R statistical software. Based on the cophenetic and silhouette indices, the best number of clusters and method are 4 clusters with the median-based partitioning method for the 1-5-year data, 5 clusters with the mean-based hierarchical method for the 6-10-year data, and 4 clusters with the mean-based partitioning method for stations with a record of more than 10 years. The clusters were mapped onto the geographical map of Iran with ArcGIS software for all three groups.

Keywords: Clustering, Geographical coordinates, Synoptic, Iran.
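The cophenetic correlation coefficient named in the abstract measures how faithfully a dendrogram preserves the original pairwise distances. The sketch below illustrates the idea in Python with the standard library only, whereas the study used R. It assumes average linkage and Euclidean distance, which is only one of several possible hierarchical variants, and the function names and toy points are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def cophenetic_correlation(points):
    """Cophenetic correlation of an average-linkage dendrogram built on points."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    # Original pairwise distances, keyed by (i, j) with i < j.
    d = {(i, j): dist(points[i], points[j]) for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]
    coph = {}
    while len(clusters) > 1:
        # Find the two clusters with the smallest mean pairwise distance.
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(min(i, j), max(i, j)) for i in clusters[a] for j in clusters[b]]
            height = sum(d[p] for p in pairs) / len(pairs)
            if best is None or height < best[0]:
                best = (height, a, b, pairs)
        height, a, b, pairs = best
        # The merge height is the cophenetic distance of every pair joined here.
        for p in pairs:
            coph[p] = height
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    keys = sorted(d)
    return pearson([d[k] for k in keys], [coph[k] for k in keys])


# Hypothetical, already standardized feature vectors for six stations.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(cophenetic_correlation(toy_points))
```

Values near 1 indicate that the tree distorts the original distances very little, which is why the coefficient can serve to compare hierarchical methods on the same data.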






Keywords [English]

  • Clustering
  • Geographical coordinates
  • Synoptic
  • Iran