# Waterbody loss due to urban expansion of large Chinese cities in the last three decades

This study quantitatively assessed waterbody loss due to the urban expansion of large Chinese cities. We first extracted multi-temporal urban boundaries to determine the expansion of cities with a population of over one million from 1990 to 2018. A monthly surface-water dataset was then used to identify surface waterbodies in the study period. Depending on the ratio of surface waterbody area to urban area, cities were further divided into three categories (i.e., water-abundant, water-medium, and water-deficient). Finally, we quantified the rate of waterbody loss and evaluated the spatial and temporal variation of waterbody loss as a function of urban expansion and according to city type.

### GUB dataset

The Global Urban Boundary (GUB) dataset (http://data.ess.tsinghua.edu.cn) was used to determine urban expansion. GUB provides data on built-up areas over 30 years, with a spatial resolution of 30 m. In the GUB dataset, nonurban areas (such as green space and water space) surrounded by artificial impervious areas are filled within the urban boundary and removed by the algorithm, which is consistent with global mapping methods. The continuous urban boundary was demarcated by morphological image processing methods, which have an overall accuracy of over 90%. In this dataset, extensive water and forests are excluded, and the impervious surface within the urban boundaries accounts for about 60% of the total surface area47. Compared with urban boundaries obtained from night-time light, GUB better separates urban areas from surrounding non-urban areas.

### Monthly water body dataset

We selected the JRC Monthly Water History V1.3 dataset (https://global-surface-water.appspot.com/), which is available from the Google Earth Engine, as the basis for representing surface waterbodies48. This data collection, which was produced from images of the Landsat series, contains 442 images of global monthly waterbody area from March 1984 to December 2020. Validation of this dataset confirmed that fewer than 1% of waterbodies were incorrectly detected and fewer than 5% of waterbodies were missed altogether. We chose this dataset because it provides the long-term spatial distribution of waterbodies and masks mountain shadows and urban structures, so that it reflects the real changes in waterbodies.

### Theoretical background

It is well known that cities have high concentrations of population and resources and expand spatially during development. There are many different perspectives on the size of cities, and studies have mostly used urban density and population to characterize them. However, because it is challenging to standardize data sources and quality, there is no unified quantitative standard49. Urban construction has concentrated human activity and brought about changes in land types. Cities are also identified as physical spaces, which can be defined as the built environment50,51. The built environment, which includes structures like buildings, roads, and other artificial constructions, is sometimes referred to as a non-natural environment52.

Rural is the antithesis of urban. As large cities have spread outward in developing regions such as Asia, a transitional fringe has been created by the gradual blurring of the line separating urban and rural areas53. According to McGee, good locations, easy access, and sizable agricultural land all contribute to the development potential of large cities. Thus, between urban and rural areas, there are transitional areas of active spatial morphological change known as desakota33,54. Peri-urban areas, like desakota, are gradually developed and incorporated into the original built-up urban areas during urbanization. The original landscape, which included agricultural land, vegetation, and waterbodies, gradually changes into an urban land use type, i.e., impervious surface, and thus the city continues to expand outwards. Waterbodies, an essential ecological element, have been heavily developed or filled in during urbanization, which may pose serious ecological risks. In this paper, we identified the urban boundaries based on physical space to explore the encroachment on waterbodies during the urbanization of large cities. We determined whether existing waterbodies were transformed into urban waterbodies or encroached upon and whether waterbodies increased during the expansion of urban boundaries, in order to propose strategies for protecting waterbodies in the future.

### Extracting the extent of large Chinese cities from GUB dataset

To characterize urban expansion, we selected GUB data as the source data for delineating urban boundaries. The Chinese administrative unit of the municipality is not exclusively urban but also includes rural areas. In our study, cities were defined as municipal districts, excluding the vast countryside within the administrative boundaries of prefecture-level cities. We identified urban areas based on their physical boundaries from the perspective of remote sensing, which can precisely track urban expansion51.

In this work, we selected 159 cities with a population of over one million in 2018 based on the average annual population of urban districts from the 2019 China City Statistical Yearbook (Fig. S1). Taiwan, Hong Kong, and Macau are omitted. According to the statistics, China had 160 cities with populations exceeding one million in 2018. However, due to the lack of data for the built-up area in 1990, Guang’an was not included in the study. We thus obtained 159 cities from the GUB dataset. Because the administrative boundary contains numerous fragmented patches, we identified the main urban area of each city by jointly considering population and the largest patch within the urban boundaries. Through manual inspection and adjustment of the map, we confirmed that the location of the extracted urban area was consistent with that of the municipal government, and the boundary was extracted for each period. We took the growth area as the expansion area, with the original area being the city at the onset of each period (Fig. S3).

We used the average annual urban growth (AUG) rate to characterize the rate of urban expansion, as is widely done to evaluate urban expansion55,56. It is calculated as

$$\text{AUG} = \left[ \left( \frac{Land_{t1}}{Land_{t0}} \right)^{\frac{1}{t1 - t0}} - 1 \right] \times 100\% ,$$

where \(Land_{t0}\) and \(Land_{t1}\) represent the urban land area at times t0 and t1, and t0 and t1 are the start and end of the given study period.
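As an illustration of the AUG formula, the following minimal Python sketch computes the rate from two urban land areas. The function name and the example areas are hypothetical and are not taken from the study.

```python
def average_annual_urban_growth(land_t0, land_t1, t0, t1):
    """Return the average annual urban growth (AUG) rate in percent per year.

    land_t0, land_t1: urban land area at the start and end of the period
    t0, t1: start and end years of the period
    """
    if land_t0 <= 0:
        raise ValueError("land_t0 must be positive")
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    # Compound annual growth: (Land_t1 / Land_t0)^(1 / (t1 - t0)) - 1
    return ((land_t1 / land_t0) ** (1.0 / (t1 - t0)) - 1.0) * 100.0


# Hypothetical example: an urban area doubling from 100 to 200 km^2 over 1990-2018
aug = average_annual_urban_growth(100.0, 200.0, 1990, 2018)
print(f"AUG = {aug:.2f}% per year")
```

Because AUG is a compound rate, it can be compared between periods of different length, such as the 3-year period 2015–2018 and the 5-year periods before it.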

### Identification of urban waterbodies

Urban waterbodies contain all the components of urban flow networks above the ground and include natural waterbodies such as lakes, rivers, streams, and wetlands and artificial waterbodies such as parks and ponds48. We identified all waterbodies existing within the urban boundary as urban waterbodies. Because of urban expansion, urban waterbodies vary as the urban boundary shifts at different stages. Our study explored how the original waterbodies changed during urban expansion, including whether they were kept as urban waterbodies or encroached upon. To account for the dryness or wetness of individual years, we used the data for 3 years (36 months) around each time point (1990, 1995, 2000, 2005, 2010, 2015, and 2018) to describe the waterbodies. Not all waterbodies could be detected in each month of the year; for example, freezing may prevent waterbodies from being detected. To cover seasonal and permanent waterbodies, we used the waterbody frequency index (WFI), which is calculated as the fraction of waterbody months within the 3 years, to identify stable waterbodies pixel by pixel57. The spatial distribution of waterbodies was then mapped for each period. By comparing the extracted waterbodies with the long-time-series high-resolution remote-sensing images from Google Earth, we found that the extracted waterbodies fit the actual waterbody distribution quite well (Fig. S2). The WFI is calculated as

$$WFI\left( i \right) = \frac{WM\left( i \right)}{DM\left( i \right)}$$

where WFI(i) is the water occurrence for pixel i in the images before and after the given year, and i is the pixel number in the study area. WM(i) is the number of months during which the waterbody is detected in pixel i over the 3 years. DM(i) is the number of months during which data are available for pixel i. If the waterbody frequency index of a pixel is greater than 25%, this pixel is considered a waterbody; otherwise, it is not.
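The per-pixel WFI and the 25% threshold can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the monthly images are assumed to be already loaded as two boolean stacks (water detected, data available), and the function name, array layout, and example values are hypothetical.

```python
import numpy as np


def waterbody_mask(water, valid, threshold=0.25):
    """Compute WFI per pixel and flag pixels whose WFI exceeds the threshold.

    water, valid: boolean arrays of shape (months, rows, cols);
    water marks months with water detected, valid marks months with data.
    """
    water = np.asarray(water, dtype=bool)
    valid = np.asarray(valid, dtype=bool)
    wm = (water & valid).sum(axis=0)  # WM(i): months with water detected
    dm = valid.sum(axis=0)            # DM(i): months with data available
    wfi = np.full(dm.shape, np.nan)   # pixels without any data stay NaN
    np.divide(wm, dm, out=wfi, where=dm > 0)
    return wfi, wfi > threshold       # NaN compares as False


# Hypothetical 36-month stack with three pixels
months = 36
water = np.zeros((months, 1, 3), dtype=bool)
valid = np.zeros((months, 1, 3), dtype=bool)
valid[:, 0, 0] = True
water[:18, 0, 0] = True    # pixel 0: water in 18 of 36 valid months
valid[:24, 0, 1] = True
water[:6, 0, 1] = True     # pixel 1: water in 6 of 24 valid months
wfi, mask = waterbody_mask(water, valid)  # pixel 2 has no valid data
print(wfi, mask)
```

Dividing by the number of months with data, rather than by 36, keeps months lost to ice or missing observations from lowering the index.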

### City classification based on surface water body

Cities with a population of over one million may not be short of waterbodies, but significant differences remain in surface waterbody abundance. Due to large differences in city size, it is inappropriate to use waterbody area as a criterion. Considering the influence of urban expansion, we ranked the 159 cities according to the waterbody fraction (WF), namely the fraction of the original surface water within the urban boundary in 2018. Waterbodies not impacted by urbanization were taken as the original surface waterbody, using the average surface waterbody from 1985 to 1991 as the baseline. We used the natural break method to divide cities into abundant, moderate, and deficient levels (referred to as Type I, Type II, and Type III, respectively) and to evaluate the abundance of waterbodies in cities. The waterbody fraction (WF) is calculated as follows:

$$\text{WF} = \frac{Water_{origin}}{Land_{2018}}$$

where WF is used to judge the waterbody abundance of cities, \(Water_{origin}\) is the original surface waterbody area (based on the years 1985–1991) within the urban boundary of 2018, and \(Land_{2018}\) is the urban land area within the urban boundary of 2018.
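The classification step can be illustrated with a small Python sketch. It assumes that the natural break method corresponds to a Jenks-style split that minimizes the within-class sum of squared deviations; the exhaustive search below is a stand-in chosen for clarity, not necessarily the implementation used in the study, and all names and values are hypothetical.

```python
import numpy as np


def waterbody_fraction(water_origin, land_2018):
    """WF: original surface water area divided by 2018 urban land area."""
    return water_origin / land_2018


def natural_breaks_3(values):
    """Split values into three classes by exhaustive search.

    Returns the upper bounds of the lowest and the middle class that
    minimize the total within-class sum of squared deviations.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    if n < 3:
        raise ValueError("need at least three values")

    def ssd(a):
        return float(((a - a.mean()) ** 2).sum())

    best = None
    for i in range(1, n - 1):        # first class is x[:i]
        for j in range(i + 1, n):    # second class is x[i:j], third is x[j:]
            cost = ssd(x[:i]) + ssd(x[i:j]) + ssd(x[j:])
            if best is None or cost < best[0]:
                best = (cost, x[i - 1], x[j - 1])
    return best[1], best[2]


def classify_city(wf, low_break, high_break):
    """Map a WF value to Type III (deficient), II (moderate), or I (abundant)."""
    if wf <= low_break:
        return "Type III"
    if wf <= high_break:
        return "Type II"
    return "Type I"


# Hypothetical WF values for seven cities
wf_values = [0.010, 0.012, 0.011, 0.050, 0.055, 0.200, 0.220]
low, high = natural_breaks_3(wf_values)
types = [classify_city(v, low, high) for v in wf_values]
print(low, high, types)
```

The exhaustive search is quadratic in the number of cities, which is manageable for 159 values; a dedicated classification library would be preferable for larger inputs.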

### Temporal characteristic of water body loss and gain

To understand the spatial–temporal features of surface water bodies, we used five normalized indicators to compare water body variations between cities during urban expansion from the overall perspective and from the city perspective.

The variation in original natural waterbodies reflects the intensity of natural resource development during urban expansion. We summarized the reduction and preservation of original waterbodies in the urban expansion areas of cities with a population of over one million to represent the encroachment of urban expansion on waterbodies:

$$WL = \frac{\sum NWL_{t0\_t1}\left( i \right)}{\sum W_{t0}\left( i \right)} \times 100\%$$

$$WP = \frac{\sum \left( W_{t0}\left( i \right) - NWL_{t0\_t1}\left( i \right) \right)}{\sum W_{t0}\left( i \right)} \times 100\%$$

where i labels the city within the 159 cities, WL and WP are the fractions of waterbody loss and preservation in the urban expansion areas of all cities, \(NWL_{t0\_t1}\) is the net waterbody loss during the period t0–t1, and \(W_{t0}\) is the natural waterbody in the urban expansion area at time t0.
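A minimal Python sketch of the WL and WP aggregation over cities is given below. The function name and the example areas are hypothetical; the inputs are assumed to be per-city areas measured in the same unit.

```python
def loss_and_preservation(net_loss, water_t0):
    """Return (WL, WP) in percent, aggregated over all cities.

    net_loss: per-city net waterbody loss in the expansion area (NWL)
    water_t0: per-city natural waterbody area in the expansion area at t0
    """
    if len(net_loss) != len(water_t0):
        raise ValueError("inputs must have one entry per city")
    total_water = sum(water_t0)
    if total_water == 0:
        raise ValueError("total waterbody area at t0 is zero")
    wl = sum(net_loss) / total_water * 100.0
    wp = sum(w - l for w, l in zip(water_t0, net_loss)) / total_water * 100.0
    return wl, wp


# Hypothetical areas (km^2) for three cities
wl, wp = loss_and_preservation([2.0, 1.0, 0.0], [10.0, 5.0, 5.0])
print(f"WL = {wl:.1f}%, WP = {wp:.1f}%")
```

By construction the two fractions sum to 100%, so WP carries the same information as WL and serves mainly as the complementary view.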

To estimate the net waterbody loss caused by urban expansion at various stages, we used the standardized indicator, annual average net waterbody loss rate (ANWL), to compare waterbody loss speeds over time. This indicator is independent of the difference in water body abundance and can be compared over time. Waterbody loss is one part of the impact of urbanization; the other is water body gain. We used the same method to evaluate the annual average net water body gain rate (ANWG). The formulas are

$$\text{ANWL} = \frac{NWL_{t0\_t1}}{W_{t0}\left( t1 - t0 \right)} \times 100\%$$

$$\text{ANWG} = \frac{NWG_{t0\_t1}}{W_{t0}\left( t1 - t0 \right)} \times 100\%$$

where NWL and NWG are the net water body loss and gain, respectively, and the other abbreviations are the same as above.
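Since ANWL and ANWG share the same form, a single helper is enough to sketch both. The function name and the example values below are hypothetical.

```python
def annual_average_rate(net_change, water_t0, t0, t1):
    """Return ANWL or ANWG in percent per year.

    net_change: net waterbody loss (for ANWL) or gain (for ANWG) in t0-t1
    water_t0: waterbody area at the start of the period
    """
    if water_t0 <= 0:
        raise ValueError("water_t0 must be positive")
    if t1 <= t0:
        raise ValueError("t1 must be later than t0")
    return net_change / (water_t0 * (t1 - t0)) * 100.0


# Hypothetical city: 10 km^2 of water in 1990, 2 km^2 lost and 0.5 km^2 gained by 1995
anwl = annual_average_rate(2.0, 10.0, 1990, 1995)
anwg = annual_average_rate(0.5, 10.0, 1990, 1995)
print(f"ANWL = {anwl:.1f}% per year, ANWG = {anwg:.1f}% per year")
```

Unlike AUG, this is a simple (non-compound) annual average: the total change relative to the initial area is divided by the number of years.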

Considering the direct impact of urban expansion, we used a normalized indicator, the average net waterbody loss velocity of urban expansion (AWLV), which refers to the amount of waterbody encroachment per unit urban expansion area. It quantifies the time-heterogeneity of waterbody loss due to urban expansion and is calculated as follows:

$$AWLV = \frac{NWL_{t0\_t1}}{Land_{t1} - Land_{t0}}$$

We calculated these indicators for the six expansion periods (1990–1995, 1995–2000, 2000–2005, 2005–2010, 2010–2015, and 2015–2018) (Fig. 3). In the study, if the water body pixel count is zero at the onset of the period, the indicator for the period is abnormal and thus excluded.
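The per-period calculation of AWLV, together with the exclusion rule, might look as follows in Python. This is a sketch under stated assumptions: the data layout (dictionaries keyed by year or by period) and all values are hypothetical, and skipping periods without positive expansion is an added assumption that the text does not state.

```python
PERIODS = [(1990, 1995), (1995, 2000), (2000, 2005),
           (2005, 2010), (2010, 2015), (2015, 2018)]


def awlv_by_period(land, net_loss, water_t0):
    """Return {period: AWLV} for one city.

    land: urban land area keyed by year
    net_loss: net waterbody loss keyed by (t0, t1)
    water_t0: waterbody pixel count at the onset of each period, keyed by (t0, t1)
    """
    result = {}
    for t0, t1 in PERIODS:
        if water_t0[(t0, t1)] == 0:
            continue  # no waterbody at the onset: period excluded
        expansion = land[t1] - land[t0]
        if expansion <= 0:
            continue  # assumption: AWLV is undefined without expansion
        result[(t0, t1)] = net_loss[(t0, t1)] / expansion
    return result


# Hypothetical city: areas in km^2, 5 km^2 of net loss in every period
land = {1990: 100.0, 1995: 150.0, 2000: 200.0, 2005: 260.0,
        2010: 330.0, 2015: 400.0, 2018: 430.0}
net_loss = {p: 5.0 for p in PERIODS}
water_t0 = {p: 1200 for p in PERIODS}
water_t0[(1995, 2000)] = 0  # this period is dropped
awlv = awlv_by_period(land, net_loss, water_t0)
print(awlv)
```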