Performing analysis
Spatial analysis is a technique for analyzing spatial data based on the location and shape of geographic objects. Its purpose is to extract and mine the value of spatial information. DataInsights provides both standard spatial analysis and distributed spatial analysis. Standard analysis includes buffer, isoline, isosurface, overlay, Thiessen polygon, and IDW interpolation analysis. Distributed analysis includes aggregate points, summarize attributes, density analysis, reconstruct tracks, and more. In DataInsights, the result of a spatial analysis is a new data table, which can itself be used to create charts and to run further spatial analysis. Data tables produced by spatial analysis are marked with a symbol in the list.
The data formats supported by standard analysis and distributed analysis are shown in the following table:
Data type | Standard spatial analysis | Distributed spatial analysis |
Excel | √ | √ (relational storage must be configured) |
CSV | √ | √ (relational storage must be configured) |
GeoJSON | √ | √ (relational storage must be configured) |
JSON | √ | |
SHP | √ | |
Registered HDFS data | | √ |
Registered HBase data | | √ |
Standard spatial analysis supports data in Excel, CSV, GeoJSON, JSON, and SHP formats. Standard spatial analysis can be used in the following two ways:
DataInsights uses client-side spatial analysis by default. To configure or change the preferred analysis mode, click "Setting" in the upper-right navigation bar and open the "Analysis" tab.
Buffer analysis is a basic GIS analysis method. It automatically builds buffer zones at a specified distance around point, line, or region geometric objects. In DataInsights, after creating a map view, click "Analysis" on the right panel, select "Buffer" under the "Tools" tab, set the parameters, and click "Run" at the bottom to start the analysis. If "Save attribute" is checked, the buffer result contains all the attributes of the original objects (point, line, or region). "Union buffers" means that intersecting buffers are merged in the result.
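The idea of a buffer zone can be illustrated with a minimal sketch for the simplest case, a point. This is not how DataInsights builds buffers internally; the function names `point_buffer` and `buffers_intersect` and the `segments` parameter are hypothetical, and the coordinates are assumed to be planar.

```python
import math


def point_buffer(x, y, distance, segments=32):
    """Approximate the buffer zone of a point as a closed polygon ring.

    Hypothetical helper: `segments` controls how many vertices are used
    to approximate the circle of radius `distance`.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    ring = []
    for i in range(segments):
        angle = 2 * math.pi * i / segments
        ring.append((x + distance * math.cos(angle),
                     y + distance * math.sin(angle)))
    ring.append(ring[0])  # repeat the first vertex to close the ring
    return ring


def buffers_intersect(p, q, distance):
    """Two point buffers of equal radius overlap when the centres are
    closer than twice the buffer distance."""
    return math.dist(p, q) < 2 * distance
```

A check such as `buffers_intersect` is the kind of test that decides which buffers a "Union buffers" style option would merge into one region.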
Isolines are a common way of representing surfaces on a map. Each isoline is a smooth curve that joins adjacent points of equal value. Commonly used isolines include contour lines, isobaths, isotherms, isobars, and precipitation lines.
Parameters:
An isosurface is the region enclosed between two neighboring isolines. Changes across isosurfaces intuitively show how a value such as elevation, temperature, precipitation, pollution, or atmospheric pressure varies between adjacent isolines. The parameters for extracting isosurfaces are the same as those for extracting isolines. The only differences are that the analysis result type is polygon and that the result attribute fields contain minimum and maximum values.
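The relationship between isolines and isosurfaces can be sketched as follows. The sketch assumes that isoline values are spaced by a fixed interval starting from a base value; `interval`, `base`, and both function names are illustrative assumptions and are not taken from the DataInsights parameter list.

```python
import math


def isoline_levels(z_min, z_max, interval, base=0.0):
    """Return the isoline values base + k * interval inside [z_min, z_max].

    `interval` and `base` are assumed parameters for this illustration.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    k = math.ceil((z_min - base) / interval)
    levels = []
    while base + k * interval <= z_max:
        levels.append(base + k * interval)
        k += 1
    return levels


def isosurface_bands(levels):
    """Pair neighbouring isolines into (minimum, maximum) bands, mirroring
    the minimum and maximum attribute values of an isosurface result."""
    return list(zip(levels, levels[1:]))
```

For a surface whose values range from 12 to 47 with an interval of 10, the isolines fall at 20, 30, and 40, and each isosurface polygon is bounded by one pair of neighboring values.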
Overlay analysis extracts new spatial geometric information by processing spatial data. For example, if you need to know the distribution of soil in an administrative area, you can perform overlay analysis on two datasets, the national land use map and the administrative area plan map, to obtain the desired result. Overlay analysis is widely used in resource management, urban construction assessment, land management, agriculture, forestry, animal husbandry, and statistics. In DataInsights, the supported overlay operations include clip, erase, identity, intersect, and union.
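The geometric effect of the overlay operations can be sketched by treating each region as a set of unit grid cells. This is a deliberate simplification for illustration only: real overlay works on vector geometry, and the helper names `cells` and `overlay` are hypothetical. The sketch follows the usual GIS meaning of the operations and covers geometry only; in general, clip and intersect keep the same area but differ in which attributes the result carries.

```python
def cells(x0, y0, x1, y1):
    """Rasterise an axis-aligned rectangle into a set of unit grid cells."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def overlay(source, operand, operation):
    """Return the cells kept by an overlay operation (geometry only)."""
    if operation in ("clip", "intersect"):
        return source & operand   # the part of source inside operand
    if operation == "erase":
        return source - operand   # the part of source outside operand
    if operation == "identity":
        return set(source)        # all of source, split along operand
    if operation == "union":
        return source | operand   # everything covered by either input
    raise ValueError(f"unknown operation: {operation}")


# Illustrative data: a land use patch and an administrative district.
land_use = cells(0, 0, 4, 4)
district = cells(2, 2, 6, 6)
inside_district = overlay(land_use, district, "clip")
```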
Thiessen polygons can be used for qualitative analysis, statistical analysis, and proximity analysis. For example, the properties of a Thiessen polygon can be described by the properties of its discrete point, the data for a Thiessen polygon can be calculated from the data of its discrete point, and it is easy to determine which discrete points are adjacent to a given point. In DataInsights, Thiessen polygon analysis requires point data as input. The result is region data with the same attributes as the point dataset. In addition, "Clip Region" clips the analysis result, so that only the area within the clip region is displayed. For example, a government administrator who is only concerned with the area they govern can set the clip region parameter to display only the results for that area.
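The defining property of Thiessen polygons is that every location inside a polygon is closer to that polygon's point than to any other input point. A minimal sketch of this proximity rule, assuming planar coordinates and a hypothetical `nearest_site` helper, avoids constructing the polygons at all:

```python
import math


def nearest_site(location, sites):
    """Return the id of the site whose Thiessen polygon contains `location`.

    `sites` maps a site id to its (x, y) coordinates. By definition, a
    location belongs to the polygon of its nearest site.
    """
    if not sites:
        raise ValueError("at least one site is required")
    return min(sites, key=lambda site_id: math.dist(location, sites[site_id]))


# Illustrative sites, not real data.
stations = {"A": (0, 0), "B": (10, 0), "C": (0, 10)}
```

Because the result polygon carries the attributes of its point, looking up the nearest site is equivalent to asking which polygon's attributes apply at a location.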
Inverse distance weighted (IDW) interpolation estimates the value of a cell as the weighted average of the sample points around it, on the assumption that nearby points are more similar than distant ones, so closer sample points receive larger weights. A surface is then generated from the estimated cells. In DataInsights, as with isoline extraction, IDW analysis provides the following parameters: the layer to be analyzed, field, power, cell size, and clip region.
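The weighted average behind IDW can be written as a short sketch. It uses the standard weight 1 / d^power for a sample at distance d and considers every sample point; the function name `idw` and the default power of 2 are assumptions for illustration, not DataInsights defaults.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate the value at (x, y) from samples given as (sx, sy, value).

    Each sample is weighted by 1 / distance**power, so nearer samples
    influence the estimate more strongly.
    """
    if not samples:
        raise ValueError("at least one sample point is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            return value  # the cell coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

A larger `power` makes the estimate follow the nearest samples more closely, while a smaller one produces a smoother surface. Evaluating `idw` at the centre of every cell of a grid with the chosen cell size yields the interpolated surface.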
DataInsights supports distributed spatial analysis of CSV, Excel, and GeoJSON data configured with relational storage, as well as registered user-managed HDFS data and HBase data. Before distributed analysis can be used, an administrator needs to configure the distributed analysis server. For details, see: Analysis server configuration.
After the distributed analysis server has been configured, click "Setting" in the upper-right navigation bar, open the "Analysis" tab, enter the server address in the "Distributed analysis server" field, and click "OK" to finish.
Aggregate Points creates an aggregation map from a point dataset. First, it partitions the points on the map using grids or regions. Second, it counts the points in each grid cell or region and takes the count as the statistical value of that cell or region; a weight field of the points can also be used as the statistical value. Finally, it fills the grid cells or regions with graduated colors. DataInsights supports two kinds of aggregation: "Aggregate with Grid" and "Aggregate with Polygon". In the toolbar on the right, choose "Analysis" > "Distributed Tools" > "Aggregate Points" to start a point aggregation analysis job.
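The counting step of grid aggregation can be sketched in a few lines. The sketch assumes square cells on planar coordinates; `aggregate_with_grid` and `weight_index` are hypothetical names, and a distributed job would split this work across many workers instead of looping in one process.

```python
import math
from collections import defaultdict


def aggregate_with_grid(points, cell_size, weight_index=None):
    """Aggregate points into square grid cells.

    `points` is a list of tuples starting with (x, y). When `weight_index`
    is None each point counts as 1; otherwise the value at that tuple
    position is summed as the cell's statistical value.
    """
    totals = defaultdict(float)
    for point in points:
        cell = (math.floor(point[0] / cell_size),
                math.floor(point[1] / cell_size))
        totals[cell] += 1 if weight_index is None else point[weight_index]
    return dict(totals)


# Illustrative points: (x, y, weight).
shops = [(0.5, 0.5, 3), (0.9, 0.1, 2), (1.5, 0.2, 5)]
```

The resulting value per cell is what a graduated color fill would then be based on.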
Summarize Attributes groups the records of the input dataset by the specified fields and calculates statistics on the specified attribute field. You set the group field, the attribute field, and the statistical mode to calculate the summary results, which can then be displayed in charts.
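The grouping described above is essentially a group-by followed by a statistic. A minimal sketch, assuming records are plain dictionaries; the mode names in `STAT_MODES` and the function name are illustrative and may not match the statistical modes DataInsights offers:

```python
from collections import defaultdict
from statistics import mean

# Assumed statistical modes for illustration only.
STAT_MODES = {"count": len, "sum": sum, "min": min, "max": max, "mean": mean}


def summarize_attributes(rows, group_field, attribute_field, mode):
    """Group rows by `group_field` and apply one statistic to `attribute_field`."""
    if mode not in STAT_MODES:
        raise ValueError(f"unsupported statistical mode: {mode}")
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[attribute_field])
    statistic = STAT_MODES[mode]
    return {key: statistic(values) for key, values in groups.items()}


# Illustrative records, not real data.
sales = [
    {"district": "North", "amount": 10},
    {"district": "North", "amount": 30},
    {"district": "South", "amount": 5},
]
```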
Parameters for Summarize Attributes:
Density analysis calculates the per-unit-area density of point or line feature measurements within a specified neighborhood. It intuitively reflects how discrete measurements are distributed over a continuous area. DataInsights currently supports simple point density analysis and kernel density analysis.
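The difference between the two methods can be sketched for a single location. Simple point density treats every point in the neighborhood equally, while kernel density gives nearer points more weight. The quartic kernel used here is an assumption chosen for illustration, since the source does not state which kernel DataInsights uses, and both function names are hypothetical.

```python
import math


def simple_point_density(points, x, y, radius):
    """Points within `radius` of (x, y), divided by the neighbourhood area."""
    count = sum(1 for px, py in points
                if math.hypot(px - x, py - y) <= radius)
    return count / (math.pi * radius ** 2)


def kernel_density(points, x, y, radius):
    """Kernel density at (x, y) using a quartic kernel (an assumed choice).

    A point at the centre contributes most; the contribution falls
    smoothly to zero at distance `radius`.
    """
    total = 0.0
    for px, py in points:
        distance = math.hypot(px - x, py - y)
        if distance < radius:
            total += (3 / math.pi) * (1 - (distance / radius) ** 2) ** 2
    return total / radius ** 2
```

Moving a point closer to the location leaves the simple density unchanged as long as it stays inside the neighborhood, but raises the kernel density, which is why kernel density surfaces look smoother.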
Reconstruct Tracks describes the motion track of an object from its locations at different times. For example, a car uploads its location to a server via GPS at regular intervals while driving. With this uploaded location data, Reconstruct Tracks can construct the car's track over a period of time, so that you can clearly see how the car was moving. Reconstruct Tracks requires the dataset to contain a time field.
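Conceptually, reconstructing tracks means grouping location records by object and ordering each group by time. A minimal sketch, assuming dictionary records with hypothetical field names (`vehicle`, `time`, `x`, `y`) and ISO 8601 timestamps, which sort correctly as strings:

```python
from collections import defaultdict


def reconstruct_tracks(records, id_field="vehicle", time_field="time"):
    """Return {object id: [(x, y), ...]} with positions ordered by time.

    Field names are illustrative; the time field is what makes the
    ordering, and therefore the track, possible.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record[id_field]].append(record)
    tracks = {}
    for object_id, items in grouped.items():
        ordered = sorted(items, key=lambda record: record[time_field])
        tracks[object_id] = [(record["x"], record["y"]) for record in ordered]
    return tracks


# Illustrative GPS uploads, deliberately out of order.
uploads = [
    {"vehicle": "car-1", "time": "2020-01-01T08:10", "x": 2, "y": 2},
    {"vehicle": "car-2", "time": "2020-01-01T08:00", "x": 9, "y": 9},
    {"vehicle": "car-1", "time": "2020-01-01T08:00", "x": 1, "y": 1},
    {"vehicle": "car-1", "time": "2020-01-01T08:20", "x": 3, "y": 1},
]
```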
Overlay in Distributed Tools provides the same function as Overlay in Standard Tools. The difference is that distributed overlay performs well on large amounts of data, and that it supports CSV, Excel, and GeoJSON data configured with relational storage as well as registered HBase data.
Create Buffers in Distributed Tools provides the same function as buffer analysis in Standard Tools. Compared with standard buffer analysis, its advantage is good performance on large amounts of data.
Building a region grid generates a grid region dataset that entirely covers the input region, based on the input region data, the grid width, and the grid height. Every generated grid cell must intersect the region. Given input point data, statistics can be calculated inside each grid cell, and the statistical value is stored in that cell's attributes.
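How a grid can be sized so that it entirely covers a region is shown in the sketch below. As a simplification, the region is represented only by its bounding box, so every generated cell intersects it; for an arbitrary polygon, cells that do not intersect the polygon would additionally have to be discarded. The function name `build_grid` and the tuple layout are hypothetical.

```python
import math


def build_grid(bounds, cell_width, cell_height):
    """Return grid cells (x0, y0, x1, y1) that completely cover `bounds`.

    `bounds` is (min_x, min_y, max_x, max_y). Rounding the number of
    columns and rows up guarantees that the grid covers the whole extent.
    """
    if cell_width <= 0 or cell_height <= 0:
        raise ValueError("cell width and height must be positive")
    min_x, min_y, max_x, max_y = bounds
    columns = math.ceil((max_x - min_x) / cell_width)
    rows = math.ceil((max_y - min_y) / cell_height)
    return [
        (min_x + col * cell_width, min_y + row * cell_height,
         min_x + (col + 1) * cell_width, min_y + (row + 1) * cell_height)
        for row in range(rows)
        for col in range(columns)
    ]
```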
Spatial analysis results are automatically stored in "My Data" as data tables. You can download the results or share them with other iPortal users or applications. For details, see: Sharing analysis results.