GIS Processing
Nearest neighbor searches
- transbigdata.ckdnearest(dfA_origin, dfB_origin, Aname=['lon', 'lat'], Bname=['lon', 'lat'])
Search the nearest points in dfB_origin for dfA_origin, and calculate the distance
- Parameters:
dfA_origin (DataFrame) – DataFrame A
dfB_origin (DataFrame) – DataFrame B
Aname (List) – The longitude and latitude column names in DataFrame A
Bname (List) – The longitude and latitude column names in DataFrame B
- Returns:
gdf – The output DataFrame
- Return type:
DataFrame
- transbigdata.ckdnearest_point(gdA, gdB)
This method matches each point in gdA to its nearest point in gdB, and adds a new column called dist
- Parameters:
gdA (GeoDataFrame) – GeoDataFrame A, point geometry
gdB (GeoDataFrame) – GeoDataFrame B, point geometry
- Returns:
gdf – The output DataFrame
- Return type:
DataFrame
- transbigdata.ckdnearest_line(gdfA, gdfB)
This method searches gdfB for the nearest line to each point in gdfA.
- Parameters:
gdfA (GeoDataFrame) – GeoDataFrame A, point geometry
gdfB (GeoDataFrame) – GeoDataFrame B, linestring geometry
- Returns:
gdf – The points in gdfA, each matched with the nearest linestring in gdfB
- Return type:
DataFrame
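To make the matching rule concrete, here is a minimal pure-Python sketch of a nearest-point search. It is not the library's implementation (the ckd prefix suggests a KD-tree is used internally, which is an assumption here). `nearest_index` is a hypothetical helper that compares planar distances on the raw coordinates and keeps the first candidate on ties.

```python
import math

def nearest_index(point, candidates):
    """Return the position in `candidates` of the point closest to `point`.

    Distances are planar (Euclidean) on the raw coordinates; on ties the
    first candidate wins. Hypothetical helper, for illustration only.
    """
    return min(range(len(candidates)),
               key=lambda i: math.dist(point, candidates[i]))

# Sample (lon, lat) pairs: table A is searched against table B
points_a = [(1, 2), (2, 4), (2, 6), (2, 10), (24, 6), (21, 6), (22, 6)]
points_b = [(1, 3), (2, 5), (2, 2)]

matches = [nearest_index(p, points_b) for p in points_a]
print(matches)
```

A brute-force scan like this costs one distance computation per pair of points, which is why a spatial index is the usual choice for large tables.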
Point to point matching (DataFrame and DataFrame)
In [1]: import transbigdata as tbd
In [2]: import pandas as pd
In [3]: import geopandas as gpd
In [4]: from shapely.geometry import LineString
In [5]: dfA = gpd.GeoDataFrame([[1,2],[2,4],[2,6],
...: [2,10],[24,6],[21,6],
...: [22,6]],columns = ['lon1','lat1'])
...:
In [6]: dfA
Out[6]:
lon1 lat1
0 1 2
1 2 4
2 2 6
3 2 10
4 24 6
5 21 6
6 22 6
In [7]: dfB = gpd.GeoDataFrame([[1,3],[2,5],[2,2]],columns = ['lon','lat'])
In [8]: dfB
Out[8]:
lon lat
0 1 3
1 2 5
2 2 2
Use transbigdata.ckdnearest() to match points to points. If the inputs are two DataFrames without geometry columns, you should specify the lon and lat columns.
In [9]: tbd.ckdnearest(dfA,dfB,Aname=['lon1','lat1'],Bname=['lon','lat'])
Out[9]:
lon1 lat1 index lon lat dist
0 1 2 0 1 3 1.111949e+05
1 2 4 1 2 5 1.111949e+05
2 2 6 1 2 5 1.111949e+05
3 2 10 1 2 5 5.559746e+05
4 24 6 1 2 5 2.437393e+06
5 21 6 1 2 5 2.105798e+06
6 22 6 1 2 5 2.216318e+06
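The dist values above appear to be in meters. The first and fourth rows are consistent with a great-circle (haversine) distance on a sphere of radius 6371 km. That the library uses exactly this formula and radius is an assumption, and `haversine_m` below is a hypothetical helper, not part of transbigdata.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius, in meters

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Row 0: (1, 2) matched with (1, 3), one degree of latitude apart
d_row0 = haversine_m(1, 2, 1, 3)
# Row 3: (2, 10) matched with (2, 5), five degrees of latitude apart
d_row3 = haversine_m(2, 10, 2, 5)
print(d_row0, d_row3)
```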
Point to point searching
Transform DataFrame to GeoDataFrame
In [10]: dfA['geometry'] = gpd.points_from_xy(dfA['lon1'],dfA['lat1'])
In [11]: dfA
Out[11]:
lon1 lat1 geometry
0 1 2 POINT (1.00000 2.00000)
1 2 4 POINT (2.00000 4.00000)
2 2 6 POINT (2.00000 6.00000)
3 2 10 POINT (2.00000 10.00000)
4 24 6 POINT (24.00000 6.00000)
5 21 6 POINT (21.00000 6.00000)
6 22 6 POINT (22.00000 6.00000)
In [12]: dfB['geometry'] = gpd.points_from_xy(dfB['lon'],dfB['lat'])
In [13]: dfB
Out[13]:
lon lat geometry
0 1 3 POINT (1.00000 3.00000)
1 2 5 POINT (2.00000 5.00000)
2 2 2 POINT (2.00000 2.00000)
Use transbigdata.ckdnearest_point() to match points to points.
In [14]: tbd.ckdnearest_point(dfA,dfB)
Out[14]:
lon1 lat1 geometry_x ... lon lat geometry_y
0 1 2 POINT (1.00000 2.00000) ... 1 3 POINT (1.00000 3.00000)
1 2 4 POINT (2.00000 4.00000) ... 2 5 POINT (2.00000 5.00000)
2 2 6 POINT (2.00000 6.00000) ... 2 5 POINT (2.00000 5.00000)
3 2 10 POINT (2.00000 10.00000) ... 2 5 POINT (2.00000 5.00000)
4 24 6 POINT (24.00000 6.00000) ... 2 5 POINT (2.00000 5.00000)
5 21 6 POINT (21.00000 6.00000) ... 2 5 POINT (2.00000 5.00000)
6 22 6 POINT (22.00000 6.00000) ... 2 5 POINT (2.00000 5.00000)
[7 rows x 8 columns]
Point to Line searching (GeoDataFrame and GeoDataFrame)
In this case, table A still contains points, while table B contains linestrings.
In [15]: dfA['geometry'] = gpd.points_from_xy(dfA['lon1'],dfA['lat1'])
In [16]: dfB['geometry'] = [LineString([[1,1],[1.5,2.5],[3.2,4]]),
....: LineString([[1,0],[1.5,0],[4,0]]),
....: LineString([[1,-1],[1.5,-2],[4,-4]])]
....:
In [17]: dfB
Out[17]:
lon lat geometry index
0 1 3 LINESTRING (1.00000 1.00000, 1.50000 2.50000, ... 0
1 2 5 LINESTRING (1.00000 0.00000, 1.50000 0.00000, ... 1
2 2 2 LINESTRING (1.00000 -1.00000, 1.50000 -2.00000... 2
In [18]: tbd.ckdnearest_line(dfA,dfB)
Out[18]:
lon1 lat1 ... lat geometry_y
0 1 2 ... 3 LINESTRING (1.00000 1.00000, 1.50000 2.50000, ...
1 2 4 ... 3 LINESTRING (1.00000 1.00000, 1.50000 2.50000, ...
2 2 6 ... 3 LINESTRING (1.00000 1.00000, 1.50000 2.50000, ...
3 2 10 ... 3 LINESTRING (1.00000 1.00000, 1.50000 2.50000, ...
4 21 6 ... 3 LINESTRING (1.00000 1.00000, 1.50000 2.50000, ...
5 22 6 ... 3 LINESTRING (1.00000 1.00000, 1.50000 2.50000, ...
6 24 6 ... 5 LINESTRING (1.00000 0.00000, 1.50000 0.00000, ...
[7 rows x 8 columns]
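The point-to-line search can be illustrated without any GIS library. The sketch below computes the exact planar distance from a point to each segment of a polyline and picks the closest line. This is an illustration under that assumption only: the library may approximate the search differently, and `point_segment_distance` and `nearest_line_index` are hypothetical helpers.

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1]
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_line_index(p, lines):
    """Index of the polyline in `lines` that lies closest to point p."""
    def line_distance(coords):
        return min(point_segment_distance(p, a, b)
                   for a, b in zip(coords, coords[1:]))
    return min(range(len(lines)), key=lambda i: line_distance(lines[i]))

points = [(1, 2), (2, 4), (2, 6), (2, 10), (24, 6), (21, 6), (22, 6)]
lines = [
    [(1, 1), (1.5, 2.5), (3.2, 4)],
    [(1, 0), (1.5, 0), (4, 0)],
    [(1, -1), (1.5, -2), (4, -4)],
]

nearest = [nearest_line_index(p, lines) for p in points]
print(nearest)
```

With these coordinates only the point (24, 6) is closer to the second line than to the first, which agrees with the matched geometries in the output above.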
Split the line
splitline_with_length can be used to split a line into several sublines, each no longer than a maximum length threshold
- transbigdata.splitline_with_length(Centerline, maxlength=100)
The input is the linestring GeoDataFrame. The length of each split line will be no longer than maxlength
- Parameters:
Centerline (GeoDataFrame) – Linestring geometry
maxlength (number) – The maximum length of each split line
- Returns:
splitedline – The split lines
- Return type:
GeoDataFrame
The following case shows how to split lines into sublines no longer than 100 meters
# Read the line features
import geopandas as gpd
Centerline = gpd.read_file(r'test_lines.json')
Centerline.plot()
# Reproject the lines to a projected coordinate system
Centerline = Centerline.set_crs(epsg=4326, allow_override=True)
Centerline = Centerline.to_crs(epsg=4517)
# Compute the length of each line
Centerline['length'] = Centerline.length
Centerline
index | Id | geometry | length |
---|---|---|---|
0 | 0 | LINESTRING (29554925.232 4882800.694, 29554987... | 285.503444 |
1 | 0 | LINESTRING (29554682.635 4882450.554, 29554773... | 185.482276 |
2 | 0 | LINESTRING (29554987.079 4882521.969, 29555040... | 291.399180 |
3 | 0 | LINESTRING (29554987.079 4882521.969, 29555073... | 248.881529 |
4 | 0 | LINESTRING (29554987.079 4882521.969, 29554969... | 207.571197 |
5 | 0 | LINESTRING (29554773.177 4882288.671, 29554828... | 406.251357 |
6 | 0 | LINESTRING (29554773.177 4882288.671, 29554926... | 158.114403 |
7 | 0 | LINESTRING (29555060.286 4882205.456, 29555082... | 107.426629 |
8 | 0 | LINESTRING (29555040.278 4882235.468, 29555060... | 36.069941 |
9 | 0 | LINESTRING (29555060.286 4882205.456, 29555095... | 176.695446 |
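# Split the lines into segments no longer than 100 meters
import transbigdata as tbd
splitedline = tbd.splitline_with_length(Centerline,maxlength = 100)
# The overall line shape is unchanged after splitting
splitedline.plot()
# But the content is now divided into separate segments
splitedline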
index | geometry | id | length |
---|---|---|---|
0 | LINESTRING (29554925.232 4882800.694, 29554927... | 0 | 100.000000 |
1 | LINESTRING (29554946.894 4882703.068, 29554949... | 0 | 100.000000 |
2 | LINESTRING (29554968.557 4882605.443, 29554970... | 0 | 85.503444 |
0 | LINESTRING (29554682.635 4882450.554, 29554688... | 1 | 100.000000 |
1 | LINESTRING (29554731.449 4882363.277, 29554736... | 1 | 85.482276 |
0 | LINESTRING (29554987.079 4882521.969, 29554989... | 2 | 100.000000 |
1 | LINESTRING (29555005.335 4882423.650, 29555007... | 2 | 100.000000 |
2 | LINESTRING (29555023.592 4882325.331, 29555025... | 2 | 91.399180 |
0 | LINESTRING (29554987.079 4882521.969, 29554993... | 3 | 100.000000 |
1 | LINESTRING (29555042.051 4882438.435, 29555048... | 3 | 99.855617 |
2 | LINESTRING (29555111.265 4882370.450, 29555116... | 3 | 48.881529 |
0 | LINESTRING (29554987.079 4882521.969, 29554985... | 4 | 100.000000 |
1 | LINESTRING (29554973.413 4882422.908, 29554971... | 4 | 99.756943 |
2 | LINESTRING (29554930.341 4882335.023, 29554929... | 4 | 7.571197 |
0 | LINESTRING (29554773.177 4882288.671, 29554777... | 5 | 100.000000 |
1 | LINESTRING (29554816.361 4882198.476, 29554821... | 5 | 99.782969 |
2 | LINESTRING (29554882.199 4882125.314, 29554891... | 5 | 99.745378 |
3 | LINESTRING (29554976.612 4882096.588, 29554987... | 5 | 100.000000 |
4 | LINESTRING (29555076.548 4882100.189, 29555077... | 5 | 6.251357 |
0 | LINESTRING (29554773.177 4882288.671, 29554783... | 6 | 100.000000 |
1 | LINESTRING (29554869.914 4882314.006, 29554876... | 6 | 58.114403 |
0 | LINESTRING (29555060.286 4882205.456, 29555062... | 7 | 100.000000 |
1 | LINESTRING (29555081.239 4882107.675, 29555081... | 7 | 7.426629 |
0 | LINESTRING (29555040.278 4882235.468, 29555042... | 8 | 36.069941 |
0 | LINESTRING (29555060.286 4882205.456, 29555064... | 9 | 100.000000 |
1 | LINESTRING (29555094.981 4882299.244, 29555100... | 9 | 76.419694 |
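The splitting idea can be sketched in plain Python: walk along the polyline and cut it every time the accumulated length reaches the threshold. This is only an illustration of the technique, not the library's code (the table above shows some pieces slightly shorter than 100, so the library's method evidently differs in detail). `split_polyline` and `polyline_length` are hypothetical helpers working on lists of (x, y) tuples in projected units.

```python
import math

def polyline_length(coords):
    """Total planar length of a polyline given as [(x, y), ...]."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def split_polyline(coords, maxlength):
    """Cut a polyline into consecutive pieces no longer than maxlength."""
    pieces = []
    current = [coords[0]]
    remaining = maxlength  # length still available in the current piece
    for x1, y1 in coords[1:]:
        x0, y0 = current[-1]
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg > remaining:
            # Cut point on the segment, `remaining` away from (x0, y0)
            t = remaining / seg
            cut = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            current.append(cut)
            pieces.append(current)
            # Start a new piece at the cut point
            current = [cut]
            x0, y0 = cut
            seg = math.hypot(x1 - x0, y1 - y0)
            remaining = maxlength
        if seg > 0:
            current.append((x1, y1))
            remaining -= seg
    if len(current) > 1:
        pieces.append(current)
    return pieces

# A 250-unit polyline with two bends
line = [(0, 0), (60, 0), (60, 80), (60, 190)]
pieces = split_polyline(line, maxlength=100)
lengths = [polyline_length(p) for p in pieces]
print(lengths)
```

Each piece keeps the original vertices that fall inside it, so the overall shape of the line is preserved and only the final piece is shorter than the threshold.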
Polygon processing
- transbigdata.merge_polygon(data, col)
The input is the GeoDataFrame of polygon geometry and the column name. This function merges the polygons that share the same category in the given column
- Parameters:
data (GeoDataFrame) – The polygon geometry
col (str) – The column name for indicating category
- Returns:
data1 – The merged polygon
- Return type:
GeoDataFrame
- transbigdata.polyon_exterior(data, minarea=0)
The input is the GeoDataFrame of the polygon geometry. The method will construct new polygons by extending the outer boundary of each polygon
- Parameters:
data (GeoDataFrame) – The polygon geometry
minarea (number) – The minimum area. Polygons with a smaller area will be removed
- Returns:
data1 – The processed polygon
- Return type:
GeoDataFrame
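One way to read the description of polyon_exterior is: keep only the outer ring of each polygon (dropping any holes) and discard results whose area is below minarea. The sketch below illustrates that reading in plain Python. The dict layout and the helpers `ring_area` and `exterior_only` are hypothetical, and the library's actual behavior may differ.

```python
def ring_area(ring):
    """Unsigned area of a ring given as [(x, y), ...] (shoelace formula)."""
    twice_area = sum(x0 * y1 - x1 * y0
                     for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]))
    return abs(twice_area) / 2.0

def exterior_only(polygons, minarea=0):
    """Rebuild each polygon from its outer ring and drop the small ones."""
    kept = []
    for poly in polygons:
        exterior = poly["exterior"]
        # Polygons whose outer ring covers less than minarea are removed
        if ring_area(exterior) >= minarea:
            kept.append({"exterior": exterior, "holes": []})
    return kept

polygons = [
    # 10 x 10 square with a 2 x 2 hole
    {"exterior": [(0, 0), (10, 0), (10, 10), (0, 10)],
     "holes": [[(2, 2), (4, 2), (4, 4), (2, 4)]]},
    # 1 x 1 square, below the area threshold used here
    {"exterior": [(20, 0), (21, 0), (21, 1), (20, 1)],
     "holes": []},
]

result = exterior_only(polygons, minarea=5)
print(len(result), ring_area(result[0]["exterior"]))
```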