In today's data environments, where digitalization is increasingly widespread, the demand for big data appears in new forms from day to day. Big data are complex, voluminous and continuously growing. Thanks to Industry 4.0, data mining, cloud infrastructure systems and fast network capabilities, big data can now be stored easily in many data pools. As a result, big data have gained momentum in many disciplines, the physical and biological sciences among them. In particular, HDFS (Hadoop Distributed File System) and the other Hadoop components are often preferred for working with big data sets. Today, big data are increasingly used in software engineering, food engineering, hardware engineering, health research, coding education and industrial modeling. Because geographic data have been renewed by technological developments, they too now fall within the domain of big data pools; without big data, the digital world and digital geography processes are difficult to comprehend. Spatial analysis has become intertwined with geography and has directed the discipline toward modern technological applications and toward processes and programs based on geography and statistics. In this sense, the main objectives of this study are: to reveal the usage and prevalence of big data using data from plant geography databases; to express the importance of big data; to mention briefly the clustering techniques, technological applications and algorithms used; to compare big data and geographic data in terms of their similarities, differences and areas of use; and to illustrate the Hadoop strategy for big data and the relational connection between its basic components, with examples in hardware language.
The scope of the study covers big data, geographic data, big data components, big data clustering techniques and algorithms, cloud technology applications, big data usage areas and data mining applications. The study was conducted within a theoretical framework. The literature review followed a disciplined sequence: planning, searching for and locating sources, examining them, preparing literature notes and incorporating the literature into the text. A quantitative research method was preferred. The study follows the positivist paradigm, since its research design and methods are driven mainly by a positivist approach; accordingly, a deductive approach was applied in obtaining information. Among the quantitative research types, the study falls under statistical research, so statistical analysis and mathematical modeling are the main data collection and analysis techniques. In the findings, previously prepared sample big geographic data sets were analyzed through the MapReduce relational connection. The data blocks created by the map and reduce computation pass through the key-value, duplicate, key-group and reduction stages. The data are then enriched with schematic representations of MapReduce word counting, after which the sample big data and the big geographic data were evaluated. In the final part of the study, the data, tables, figures and schematic representations obtained in the findings were evaluated and related to the current geography literature and to plant geography research. Global Biodiversity Information Facility (GBIF) data sets were used as the species distribution database for the identified species.
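The key-value, grouping and reduction stages of MapReduce word counting mentioned above can be illustrated with a minimal in-memory sketch. It is not the Hadoop implementation used in the study: the species names are hypothetical sample records rather than GBIF data, and the function names `map_phase`, `shuffle_phase` and `reduce_phase` are assumptions introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical occurrence records, one species name per record (not GBIF data).
records = [
    "Quercus robur",
    "Pinus nigra",
    "Quercus robur",
    "Fagus orientalis",
    "Pinus nigra",
    "Quercus robur",
]


def map_phase(record):
    """Map: emit one (key, value) pair per record, with the value fixed at 1."""
    return (record, 1)


def shuffle_phase(pairs):
    """Shuffle: collect the values of duplicate keys into one group per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key, values):
    """Reduce: collapse each key group into a single count."""
    return (key, sum(values))


pairs = [map_phase(record) for record in records]
groups = shuffle_phase(pairs)
counts = dict(reduce_phase(key, values) for key, values in groups.items())

print(counts)
# {'Quercus robur': 3, 'Pinus nigra': 2, 'Fagus orientalis': 1}
```

In a Hadoop cluster the same three stages would run in parallel over HDFS data blocks; the sketch keeps everything in one process so that each stage is easy to follow.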
In the study, big data cases, big data application areas, and the analytics and dimensions of big data were investigated.
Big data, Geographic data, MapReduce, Plant geography, Data mining