Background
Urban renewal (urban updating) is a major urban development problem currently faced by countries around the world. On the basis of understanding the current state of a city's social and economic development and analyzing its future prospects, urban renewal addresses the fundamental contradictions of urban development; that is, aged urban districts are scientifically upgraded into new-generation urban districts with complete facilities and functions. Urban renewal, which is essentially the metabolism of city functions, covers city protection, city repair, old city reconstruction, city renovation, new construction and the like. At present, three methods are often adopted: demolition and reconstruction of dilapidated old houses, commercial development of protected historic buildings, and renewal of old districts through comprehensive renovation. These generally follow a market-driven real estate development model. As attention to the overall structure and function of the city grows, the humanistic idea becomes more and more important in urban renewal; however, current renewal practice cannot keep up with this change in thinking.
The American architect Harrison Fraker originally proposed a city construction concept centered on public transportation and comprehensive development, i.e., the public-transportation-oriented land use development model TOD (transit-oriented development). Urban renewal can also be developed on the basis of the TOD concept: the travel behavior and transportation mode choice of residents are guided by systematically coordinating land development, city construction and public transportation, and the renewal strategy is determined according to the travel of residents and the building and planning conditions of the region. Urban renewal based on traffic development and travel demand thus reflects the cooperative relationship between traffic and the city.
In practice, however, traditional techniques for acquiring basic traffic data limit the determination of urban traffic development and travel demand in the following ways: 1) in basic data acquisition, the traditional method relies on household surveys carried out by a large number of people; on one hand, this consumes a great deal of manpower, material and financial resources, and the preparation, execution and data processing of the survey all take a long time; on the other hand, the coverage of the survey is very narrow, reaching no more than 3% of the urban population; 2) in demand analysis and determination, some key data and parameters are given approximate values, so the result of the travel demand analysis is inaccurate. In addition, because the data acquisition period is long, the result of the travel demand analysis lags behind actual conditions. A traffic demand determined in this way cannot reliably guide the urban renewal strategy, and when the deviation of the traffic demand is large it can even cause serious decision errors in urban renewal.
Disclosure of Invention
The purpose of the invention is as follows: in order to solve the practical problems that the traditional determination of city construction and update modes is time-consuming and labor-intensive and is insufficiently coordinated with the urban traffic system, the invention aims to provide a method and a system for determining a city construction and update mode based on multi-source data fusion.
The technical scheme is as follows: in order to achieve the purpose, the invention adopts the technical scheme that:
a city construction and update mode determination method based on multi-source data fusion comprises the following steps:
(1) multi-source data is collected, and the multi-source data comprises real estate residence data, interest point data, enterprise tax payment data, consumption level data and road network data; the real estate home data comprises the number of residential areas of a city, the name of each residential area, the total number of houses, the total using area of the houses, the living area per person and the building year; the interest point data comprises the total number of the interest points, and the name, industry classification and coordinates of each interest point; the enterprise tax payment data comprises the number of enterprises in a city, the name of each enterprise and the annual business amount; the consumption level data comprises the number of shops in a city, the name of each shop and the per-person consumption amount;
(2) multi-source data fusion, comprising: classifying the interest points whose industry classification is office building or residential area as traffic occurrence interest points, and classifying the other interest points as traffic attraction interest points; matching the name of each interest point with the names of the residential areas, the enterprises and the shops respectively to obtain the total house use area, the per-capita living area, the annual business amount of the enterprise and the per-capita consumption amount of each successfully matched interest point; calculating the traffic occurrence number (number of people) of each traffic occurrence interest point by dividing the total house use area of the interest point by the per-capita living area; calculating the traffic attraction number (number of people) of each traffic attraction interest point by dividing the annual business amount of the enterprise of the interest point by the per-capita consumption amount;
(3) determining the urban traffic travel demand and load, comprising: accumulating the traffic occurrence numbers and the traffic attraction numbers of all interest points in each traffic cell to obtain the traffic occurrence number and the traffic attraction number of that traffic cell; multiplying the traffic occurrence number of each traffic cell by the per-capita number of trips to obtain the traffic occurrence amount of the traffic cell, and taking the traffic attraction number of each traffic cell as its traffic attraction amount; calculating a traffic distribution matrix with a double-constraint gravity model from the traffic travel demand and the road impedance calculated from the road network data; assigning the traffic distribution matrix to the road network, and calculating the traffic load of each traffic cell from the assignment result, wherein the traffic load of the nth traffic cell is calculated from the length and the traffic load of each jth road in the nth traffic cell, $j = 1, \dots, J_n$, $J_n$ being the total number of roads in the nth traffic cell;
(4) determining the city construction and update mode according to the rankings of the building year, the traffic load and the traffic demand of each traffic cell.
Preferably, in the step (1), web crawler technology is adopted to crawl real estate home data of cities on the internet.
Preferably, in the step (1), a Baidu Map API is adopted, and interest point data of the city is collected from the Baidu Map webpage.
Preferably, in the step (1), a Dianping (public comment) API is adopted, and consumption level data is collected from the Dianping webpage.
Preferably, in the step (1), QGIS software is adopted to download all road network data of the city, and road impedance of the road in the case of free flow is calculated.
Preferably, in the step (2), the KMP algorithm is used to match the interest point name $NB_{mb}$ with the residential area name $NA_{ma}$, the enterprise name $NC_{mc}$ and the shop name $ND_{md}$ respectively. When the names $NB_{mb}$ and $NA_{ma}$ are matched: if the value calculated by the KMP algorithm for $NB_{mb}$ and $NA_{ma}$ is greater than or equal to the value calculated by the KMP algorithm for $NB_{mb}$ and any other residential area name, and is greater than or equal to 0.85 times $\min(NB_{mb}, NA_{ma})$, the matching is successful; otherwise, the matching fails. When the names $NB_{mb}$ and $NC_{mc}$ are matched: if the value calculated by the KMP algorithm for $NB_{mb}$ and $NC_{mc}$ is greater than or equal to the value calculated for $NB_{mb}$ and any other enterprise name, and is greater than or equal to 0.85 times $\min(NB_{mb}, NC_{mc})$, the matching is successful; otherwise, the matching fails. When the names $NB_{mb}$ and $ND_{md}$ are matched: if the value calculated by the KMP algorithm for $NB_{mb}$ and $ND_{md}$ is greater than or equal to the value calculated for $NB_{mb}$ and any other shop name, and is greater than or equal to 0.85 times $\min(NB_{mb}, ND_{md})$, the matching is successful; otherwise, the matching fails. Here ma, mb, mc and md respectively denote the serial number of a residential area, an interest point, an enterprise and a shop; $\min(NB_{mb}, NA_{ma})$, $\min(NB_{mb}, NC_{mc})$ and $\min(NB_{mb}, ND_{md})$ respectively denote the smaller of the string lengths of $NB_{mb}$ and $NA_{ma}$, of $NB_{mb}$ and $NC_{mc}$, and of $NB_{mb}$ and $ND_{md}$.
Preferably, in step (4), the rule for determining the city construction and update mode includes one or more of the following: if the building year of a traffic cell ranks in the last a% and its traffic load ranks in the top a%, the cell should undergo function-relief ("function untwining") city updating; if the building year of a traffic cell ranks in the last a% and its traffic load ranks in the last a%, the cell should undergo function-enhancement city updating; if the building year of a traffic cell ranks in the top a% and its traffic load ranks in the top a%, the cell should undergo traffic infrastructure construction and optimization; if the building year of a traffic cell ranks in the top a% and its traffic load ranks in the last a%, the cell should undergo improvement of city supporting facilities; if the traffic demand of a traffic cell ranks in the last a%, the cell should undergo improvement of city supporting facilities; if the traffic demand of a traffic cell ranks in the top a%, the cell should undergo urban traffic system optimization; where a is a set threshold.
In another aspect of the present invention, a system for determining an urban construction and update mode based on multi-source data fusion comprises:
the multi-source data acquisition module comprises a real estate home data acquisition unit, an interest point data acquisition unit, an enterprise tax payment data acquisition unit, a consumption level data acquisition unit and a road network data acquisition unit which are respectively used for acquiring real estate home data, interest point data, enterprise tax payment data, consumption level data and road network data; the real estate home data comprises the number of residential areas of a city, the name of each residential area, the total number of houses, the total using area of the houses, the living area per person and the building year; the interest point data comprises the total number of the interest points, and the name, industry classification and coordinates of each interest point; the enterprise tax payment data comprises the number of enterprises in a city, the name of each enterprise and the annual business amount; the consumption level data comprises the number of shops in a city, the name of each shop and the per-person consumption amount;
a multi-source data fusion module comprising: the interest point classifying unit, used for classifying the interest points whose industry classification is office building or residential area as traffic occurrence interest points, and classifying the other interest points as traffic attraction interest points; the multi-source data matching unit, used for matching the interest point name with the residential area names, the enterprise names and the shop names respectively to obtain the total house use area, the per-capita living area, the annual business amount of the enterprise and the per-capita consumption amount of each successfully matched interest point; the traffic occurrence number calculating unit, used for calculating the traffic occurrence number of each traffic occurrence interest point by dividing the total house use area of the interest point by the per-capita living area; the traffic attraction number calculating unit, used for calculating the traffic attraction number of each traffic attraction interest point by dividing the annual business amount of the enterprise of the interest point by the per-capita consumption amount;
the urban traffic trip demand and load determining module is used for calculating the traffic occurrence amount and the traffic attraction amount of each traffic cell from the traffic occurrence numbers and traffic attraction numbers of the interest points; calculating a traffic distribution matrix with a double-constraint gravity model from the traffic travel demand and the road impedance calculated from the road network data; assigning the traffic distribution matrix to the road network, and calculating the traffic load of each traffic cell from the assignment result, wherein the traffic load of the nth traffic cell is calculated from the length and the traffic load of each jth road in the nth traffic cell, $j = 1, \dots, J_n$, $J_n$ being the total number of roads in the nth traffic cell;
and the urban construction and updating mode determining module is used for determining the urban construction and updating mode according to the construction year, the traffic load and the traffic demand sequencing condition of each traffic cell.
Advantageous effects: the method for determining the city construction and update mode based on multi-source data fusion fully considers the interaction and correlated changes among urban land development, city construction and update, and traffic. On one hand, it reduces the time and difficulty of acquiring the relevant data; on the other hand, and more importantly, it improves the timeliness and reliability of the determined city construction and update mode.
Detailed Description
The invention will be further described with reference to the following drawings and specific examples.
As shown in fig. 1, in the method for determining an urban construction and update mode based on multi-source data fusion disclosed in the embodiment of the present invention, a travel load of an urban transportation system is obtained by collecting, fusing and analyzing multi-source data including real estate home data, vectorized road network data, interest point data, enterprise tax payment data, consumption level data, and the like, and an urban construction and update mode is determined based on a traffic load. The method mainly comprises the following steps:
step S1: multi-source data acquisition, specifically comprising the following steps:
step 1A) real estate home data acquisition. Web crawler technology is used to crawl real estate home data of the city from the Internet, for example from Lianjia (https://www.lianjia.com/), 5i5j (https://www.5i5j.com/), or a real estate registration website hosted by the city housing administration department. The data collected include: the number of residential areas MA of the city, the name $NA_{ma}$ of the ma-th residential area, the total number of houses $TA_{ma}$ of the ma-th residential area, the total house use area $AA_{ma}$ of the ma-th residential area, the per-capita living area $AB_{ma}$ of the ma-th residential area, and the building year $YEAR_{ma}$ of the ma-th residential area, where ma is the serial number of the residential area, ma is a natural number, and $MA \ge ma \ge 1$;
step 1B) interest point data acquisition. A Baidu Map API (such as http://api.map.baidu.com/lbsapi/interface) is used to collect the interest point data of the city from the Baidu Map webpage. The data collected include: the total number of interest points MB, the name $NB_{mb}$ of the mb-th interest point, the industry classification $TB_{mb}$ of the mb-th interest point, and the coordinates $(x_{mb}, y_{mb})$ of the mb-th interest point, where mb is the serial number of the interest point, mb is a natural number, and $MB \ge mb \ge 1$;
step 1C) enterprise tax payment data acquisition. The data are obtained directly, or calculated, from the enterprise business information provided by Qichacha (https://www.qichacha.com/) or Tianyancha (https://www.tianyancha.com/). The data collected include: the number of enterprises MC in the city, the name $NC_{mc}$ of the mc-th enterprise, and the annual business amount $TC_{mc}$ of the mc-th enterprise, where mc is the serial number of the enterprise, mc is a natural number, and $MC \ge mc \ge 1$;
step 1D) consumption level data acquisition. A Dianping (public comment) API is used to collect consumption level data from the Dianping webpage. The data collected include: the number of shops MD in the city, the name $ND_{md}$ of the md-th shop, and the per-capita consumption amount $TD_{md}$ of the md-th shop, where md is the serial number of the shop, md is a natural number, and $MD \ge md \ge 1$;
step 1E) road network data acquisition. The range of the city is selected manually with the Openstreetmap-DownloadData function built into the QGIS software, all road network data of the city are downloaded, and the road impedance of each road under free-flow conditions is then calculated with the BPR function of the U.S. Bureau of Public Roads;
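The impedance calculation in step 1E) can be sketched as follows. This is a minimal illustration, assuming the commonly used form of the BPR function, t = t0 * (1 + alpha * (v / c) ** beta), with the conventional defaults alpha = 0.15 and beta = 4; the function name, the units and the sample link are hypothetical and are not taken from the source. Under free flow the volume is zero, so the impedance reduces to the free-flow travel time t0.

```python
def bpr_travel_time(length_km, free_flow_speed_kmh, volume=0.0, capacity=1.0,
                    alpha=0.15, beta=4.0):
    """Link travel time in hours from the BPR volume-delay function.

    alpha and beta are the conventional defaults, used here as assumptions.
    """
    t0 = length_km / free_flow_speed_kmh  # free-flow travel time
    return t0 * (1.0 + alpha * (volume / capacity) ** beta)


# Hypothetical 2 km link with a 40 km/h free-flow speed.
free_flow = bpr_travel_time(2.0, 40.0)                  # volume = 0
at_capacity = bpr_travel_time(2.0, 40.0, 1800, 1800)    # volume = capacity
print(free_flow, at_capacity)
```

With zero volume the second factor equals 1, which is why only the link length and a free-flow speed are needed at this stage; the volume-dependent part becomes relevant in the capacity-restrained assignment of step 3E).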
step S2: the multi-source data fusion comprises the following steps:
step 2A): interest point classification. The interest point data acquired in step 1B) are classified by industry classification $TB_{mb}$: if $TB_{mb}$ is office building or residential area, the mb-th interest point is a traffic occurrence interest point; otherwise, if $TB_{mb}$ is anything other than office building or residential area, the mb-th interest point is a traffic attraction interest point;
step 2B): multi-source data matching. The interest point data acquired in step 1B) and the real estate home data acquired in step 1A) are matched by the names $NB_{mb}$ and $NA_{ma}$: for the mb-th interest point, if the value calculated by the KMP algorithm for the names $NB_{mb}$ and $NA_{ma}$ is greater than or equal to the value calculated by the KMP algorithm for $NB_{mb}$ and any other $NA_{ma}$, and is greater than or equal to 0.85 times $\min(NB_{mb}, NA_{ma})$, the matching is successful; the matched total house use area of the mb-th interest point is then $AA_{MA_{mb}}$ and the matched per-capita living area is $AB_{MA_{mb}}$. The interest point data acquired in step 1B) and the enterprise tax payment data acquired in step 1C) are matched by the names $NB_{mb}$ and $NC_{mc}$: for the mb-th interest point, if the value calculated by the KMP algorithm for the names $NB_{mb}$ and $NC_{mc}$ is greater than or equal to the value calculated by the KMP algorithm for $NB_{mb}$ and any other $NC_{mc}$, and is greater than or equal to 0.85 times $\min(NB_{mb}, NC_{mc})$, the matching is successful; the matched annual business amount of the enterprise of the mb-th interest point is then $TC_{MC_{mb}}$. The interest point data acquired in step 1B) and the consumption level data acquired in step 1D) are matched by the names $NB_{mb}$ and $ND_{md}$: for the mb-th interest point, if the value calculated by the KMP algorithm for the names $NB_{mb}$ and $ND_{md}$ is greater than or equal to the value calculated by the KMP algorithm for $NB_{mb}$ and any other $ND_{md}$, and is greater than or equal to 0.85 times $\min(NB_{mb}, ND_{md})$, the matching is successful; the matched per-capita consumption amount of the mb-th interest point is then $TD_{MD_{mb}}$. Here the subscript $MA_{mb}$ is the serial number of the real estate home data matched with the mb-th interest point, the subscript $MC_{mb}$ is the serial number of the enterprise tax payment data matched with the mb-th interest point, and the subscript $MD_{mb}$ is the serial number of the consumption level data matched with the mb-th interest point; $\min(NB_{mb}, NA_{ma})$, $\min(NB_{mb}, NC_{mc})$ and $\min(NB_{mb}, ND_{md})$ respectively denote the smaller of the string lengths of $NB_{mb}$ and $NA_{ma}$, of $NB_{mb}$ and $NC_{mc}$, and of $NB_{mb}$ and $ND_{md}$;
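The matching rule of step 2B) can be sketched as follows. The text does not say exactly what "the value calculated by the KMP algorithm" is, so this sketch assumes it is the length of the longest prefix of the shorter name that occurs inside the longer name, which a KMP scan yields as its largest state. The helper names and sample names are hypothetical; only the 0.85 ratio and the "best candidate" condition come from the text.

```python
def prefix_table(pattern):
    """KMP failure table: longest proper prefix that is also a suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_match_length(pattern, text):
    """Length of the longest prefix of `pattern` found anywhere in `text`."""
    if not pattern:
        return 0
    table = prefix_table(pattern)
    best = k = 0
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        best = max(best, k)
        if k == len(pattern):  # full match, nothing longer is possible
            return k
    return best


def match_name(poi_name, candidates, ratio=0.85):
    """Index of the best-scoring candidate if it passes the ratio test, else None."""
    best_index, best_score = None, -1
    for index, candidate in enumerate(candidates):
        short, long_ = sorted((poi_name, candidate), key=len)
        score = kmp_match_length(short, long_)
        if score > best_score:
            best_index, best_score = index, score
    if best_index is None:
        return None
    shorter = min(len(poi_name), len(candidates[best_index]))
    if shorter == 0 or best_score < ratio * shorter:
        return None
    return best_index


residential_names = ["Riverside Court", "Sunshine Garden", "Sunrise Plaza"]
print(match_name("Sunshine Garden Phase 2", residential_names))
print(match_name("Central Station", residential_names))
```

The same function would be called three times per interest point, once against the residential area names, once against the enterprise names and once against the shop names, giving the indices written $MA_{mb}$, $MC_{mb}$ and $MD_{mb}$ in the text.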
step 2C): calculating the traffic occurrence number of the traffic occurrence interest points. If the mb-th interest point is classified as a traffic attraction interest point in step 2A), the traffic occurrence number of the mb-th interest point is $P_{mb} = 0$; if the mb-th interest point is classified as a traffic occurrence interest point in step 2A), its traffic occurrence number is calculated: if the interest point is successfully matched in step 2B), the traffic occurrence number of the mb-th interest point is $P_{mb} = AA_{MA_{mb}} / AB_{MA_{mb}}$; if the interest point is not successfully matched in step 2B), the traffic occurrence number $P_{mb}$ of the mb-th interest point is the average of the traffic occurrence numbers of all other successfully matched traffic occurrence interest points;
step 2D): calculating the traffic attraction number of the traffic attraction interest points. If the mb-th interest point is classified as a traffic occurrence interest point in step 2A), the traffic attraction number of the mb-th interest point is $A_{mb} = 0$; if the mb-th interest point is classified as a traffic attraction interest point in step 2A), its traffic attraction number is calculated: if the interest point is successfully matched in step 2B), the traffic attraction number of the mb-th interest point is $A_{mb} = TC_{MC_{mb}} / TD_{MD_{mb}}$; if the interest point is not successfully matched in step 2B), the traffic attraction number $A_{mb}$ of the mb-th interest point is the average of the traffic attraction numbers of all other successfully matched traffic attraction interest points;
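Steps 2C) and 2D) can be sketched together as follows. This is an illustrative sketch only: the dictionary keys, the `kind` labels and the sample figures are hypothetical, and a point whose matched values are missing is treated as "not successfully matched" and receives the mean of the matched points of the same class, as the text describes.

```python
from statistics import mean


def estimate_people(points):
    """Return (P, A): traffic occurrence and traffic attraction numbers per point.

    Each point is a dict with 'kind' ('occurrence' or 'attraction') and, when
    name matching succeeded, the matched values; missing keys mean "unmatched".
    """
    def ratio(point, top, bottom):
        if point.get(top) is None or not point.get(bottom):
            return None
        return point[top] / point[bottom]

    raw = []
    for p in points:
        if p["kind"] == "occurrence":
            raw.append(ratio(p, "floor_area", "area_per_person"))
        else:
            raw.append(ratio(p, "annual_turnover", "spend_per_person"))

    # Fallback for unmatched points: mean over matched points of the same class.
    fallback = {}
    for kind in ("occurrence", "attraction"):
        known = [v for p, v in zip(points, raw) if p["kind"] == kind and v is not None]
        fallback[kind] = mean(known) if known else 0.0

    P, A = [], []
    for p, v in zip(points, raw):
        value = fallback[p["kind"]] if v is None else v
        P.append(value if p["kind"] == "occurrence" else 0.0)
        A.append(value if p["kind"] == "attraction" else 0.0)
    return P, A


sample_points = [
    {"kind": "occurrence", "floor_area": 60000.0, "area_per_person": 30.0},
    {"kind": "occurrence", "floor_area": 45000.0, "area_per_person": 30.0},
    {"kind": "occurrence"},  # unmatched: receives the class mean
    {"kind": "attraction", "annual_turnover": 3_000_000.0, "spend_per_person": 100.0},
    {"kind": "attraction"},  # unmatched: receives the class mean
]
P, A = estimate_people(sample_points)
print(P, A)
```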
step S3: the method for determining the urban traffic travel demand and load comprises the following steps:
step 3A) initializing the traffic travel demand. The initial traffic occurrence number of the nth traffic cell is set to $Pro_n = 0$ and the initial traffic attraction number to $Att_n = 0$, where n is the serial number of the traffic cell, n is a natural number, $N \ge n \ge 1$, and N is the total number of traffic cells;
step 3B) determining the number of travelers. The traffic occurrence number and the traffic attraction number of each interest point are processed in turn: the mb-th interest point lies in traffic cell $n_{mb}$, and its traffic occurrence number $P_{mb}$ and traffic attraction number $A_{mb}$ are accumulated onto the traffic occurrence number $Pro_{n_{mb}}$ and the traffic attraction number $Att_{n_{mb}}$ of the $n_{mb}$-th traffic cell respectively;
step 3C) determining the traffic travel demand. The traffic occurrence amount of the nth traffic cell is $PPro_n = Pro_n \times time$ and its traffic attraction amount is $AAtt_n = Att_n$, where time is the per-capita number of trips;
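Steps 3A) to 3C) amount to a per-cell accumulation followed by a scaling, which can be sketched as below. The sketch assumes each interest point has already been assigned to a traffic cell (the text does not describe how $n_{mb}$ is derived from the coordinates); cells are numbered from 0 here rather than from 1, and the value 2.5 for the per-capita number of trips is purely illustrative.

```python
def zone_demand(points, zone_count, trips_per_person):
    """Aggregate point-level numbers into per-cell demand.

    points: iterable of (zone_index, P_mb, A_mb) tuples, zone_index in 0..zone_count-1.
    Returns (PPro, AAtt): traffic occurrence and attraction amounts per cell.
    """
    pro = [0.0] * zone_count   # step 3A: Pro_n = 0
    att = [0.0] * zone_count   # step 3A: Att_n = 0
    for zone, p_mb, a_mb in points:          # step 3B: accumulate per cell
        pro[zone] += p_mb
        att[zone] += a_mb
    ppro = [x * trips_per_person for x in pro]   # step 3C: PPro_n = Pro_n * time
    aatt = list(att)                              # step 3C: AAtt_n = Att_n
    return ppro, aatt


sample = [(0, 2000.0, 0.0), (0, 0.0, 500.0), (1, 1500.0, 0.0), (1, 0.0, 30000.0)]
ppro, aatt = zone_demand(sample, zone_count=2, trips_per_person=2.5)
print(ppro, aatt)
```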
step 3D) determining the traffic distribution matrix. The traffic distribution matrix is calculated with a double-constraint gravity model from the traffic travel demand obtained in step 3C) and the road impedance obtained in step 1E); for the specific method of the double-constraint gravity model, see pages 66-70 of Traffic Planning, compiled by Wang Wei and Chen Xuewu (People's Communications Press, 2007 edition).
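A doubly constrained gravity model of the kind named in step 3D) can be sketched as follows. This is not the cited textbook's exact formulation: it assumes the common form T_ij = a_i * O_i * b_j * D_j * c_ij ** (-gamma), solved by alternately updating the balancing factors a_i and b_j. The exponent gamma = 2, the impedance matrix and the demand figures are hypothetical. The model also needs total productions to equal total attractions, which the text does not address, so the attractions are rescaled here as an explicit assumption, and all impedances must be positive.

```python
def doubly_constrained_gravity(productions, attractions, impedance,
                               gamma=2.0, max_iter=500, tol=1e-12):
    """Trip matrix whose row sums match productions and column sums match attractions."""
    n = len(productions)
    friction = [[impedance[i][j] ** -gamma for j in range(n)] for i in range(n)]
    a = [1.0] * n
    b = [1.0] * n
    for _ in range(max_iter):
        new_a = [1.0 / sum(b[j] * attractions[j] * friction[i][j] for j in range(n))
                 for i in range(n)]
        b = [1.0 / sum(new_a[i] * productions[i] * friction[i][j] for i in range(n))
             for j in range(n)]
        done = all(abs(x - y) <= tol * abs(x) for x, y in zip(new_a, a))
        a = new_a
        if done:
            break
    return [[a[i] * productions[i] * b[j] * attractions[j] * friction[i][j]
             for j in range(n)] for i in range(n)]


productions = [5000.0, 3750.0, 1250.0]          # PPro_n, hypothetical
raw_attractions = [500.0, 30000.0, 9500.0]      # AAtt_n, hypothetical
scale = sum(productions) / sum(raw_attractions)
targets = [d * scale for d in raw_attractions]  # assumption: rescale to equal totals
impedance = [[5.0, 12.0, 20.0],                 # free-flow minutes, hypothetical
             [12.0, 6.0, 15.0],
             [20.0, 15.0, 4.0]]
trips = doubly_constrained_gravity(productions, targets, impedance)
for row in trips:
    print([round(t, 1) for t in row])
```

The "double constraint" is visible in the result: every row of `trips` sums to the production of its origin cell and every column to the (rescaled) attraction of its destination cell.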
step 3E) traffic assignment. The traffic distribution matrix obtained in step 3D) is assigned to the road network collected in step 1E), using the capacity-restrained multipath assignment method. The result of the traffic assignment is recorded: the length of the jth road in the nth traffic cell and the traffic load of the jth road in the nth traffic cell, where j is the serial number of the road, j is a natural number, $J_n \ge j \ge 1$, and $J_n$ is the total number of roads in the nth traffic cell;
step 3F) calculating the traffic load. The traffic load of the nth traffic cell is determined from the lengths and traffic loads of its $J_n$ roads recorded in step 3E);
Step S4: the method for determining the urban construction and updating mode comprises the following steps:
step 4A) ranking by traffic load, demand and building year. The traffic cells are arranged in descending order of traffic load, and the rank of the nth traffic cell after sorting is $n_{VOC}$; the traffic cells are arranged in descending order of the sum of the traffic occurrence amount $PPro_n$ and the traffic attraction amount $AAtt_n$, and the rank of the nth traffic cell after sorting is $n_{demand}$; the traffic cells are arranged in descending order of the average building year of the residential areas in the traffic cell, and the rank of the nth traffic cell after sorting is $n_{year}$;
step 4B) determining the city construction and update mode. The mode is determined from the rankings of traffic load, traffic demand and building year obtained in step 4A): if the building year of the nth traffic cell ranks in the last 10% and its traffic load ranks in the top 10%, function-relief ("function untwining") city updating should be carried out on the cell; if the building year of the nth traffic cell ranks in the last 10% and its traffic load ranks in the last 10%, function-enhancement city updating should be carried out on the cell; if the building year of the nth traffic cell ranks in the top 10% and its traffic load ranks in the top 10%, traffic infrastructure construction and optimization should be carried out on the cell; if the building year of the nth traffic cell ranks in the top 10% and its traffic load ranks in the last 10%, the city supporting facilities of the cell should be improved; if the traffic demand (the sum of the traffic occurrence amount and the traffic attraction amount) of the nth traffic cell ranks in the last 10%, the city supporting facilities of the cell should be improved; if the traffic demand of the nth traffic cell ranks in the top 10%, urban traffic system optimization should be carried out on the cell.
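The ranking and rule table of step S4 can be sketched as follows. The sketch reads the rules as described above (descending ranks, so the top of the building-year ranking holds the newest cells and the end holds the oldest) and uses the 10% threshold of this embodiment. The mode labels are shorthand for the six modes, and the function names and the ten sample cells are hypothetical.

```python
def rank_descending(values):
    """1-based rank of each cell when values are sorted from largest to smallest."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, cell in enumerate(order, start=1):
        ranks[cell] = position
    return ranks


def renewal_modes(build_year, load, demand, share=0.10):
    """List of update modes for every traffic cell (possibly empty)."""
    n = len(build_year)
    cut = max(1, round(n * share))  # number of cells in the top / last share

    def top(rank):
        return rank <= cut

    def last(rank):
        return rank > n - cut

    facilities = "supporting-facility improvement"
    modes = []
    for y, l, d in zip(rank_descending(build_year), rank_descending(load),
                       rank_descending(demand)):
        cell = []
        if last(y) and top(l):
            cell.append("function-relief renewal")
        if last(y) and last(l):
            cell.append("function-enhancement renewal")
        if top(y) and top(l):
            cell.append("traffic infrastructure construction and optimization")
        if top(y) and last(l):
            cell.append(facilities)
        if last(d) and facilities not in cell:
            cell.append(facilities)
        if top(d):
            cell.append("urban traffic system optimization")
        modes.append(cell)
    return modes


build_year = [1985, 2020, 2001, 2003, 2005, 2007, 2009, 2011, 2013, 1999]
load = [0.95, 0.10, 0.50, 0.45, 0.60, 0.55, 0.40, 0.35, 0.30, 0.65]
demand = [500, 400, 900, 100, 300, 350, 450, 250, 200, 600]
modes = renewal_modes(build_year, load, demand)
for cell, cell_modes in enumerate(modes):
    print(cell, cell_modes)
```

In the sample, cell 0 is the oldest and most heavily loaded, so it falls under function-relief renewal, while cell 1 is the newest and least loaded and falls under supporting-facility improvement; most cells trigger no rule at this threshold.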
The urban construction and updating mode determining system based on multi-source data fusion provided by another embodiment of the invention comprises a multi-source data acquisition module, a multi-source data fusion module, an urban traffic travel demand and load determining module and an urban construction and updating mode determining module. The multi-source data acquisition module comprises a real estate home data acquisition unit, an interest point data acquisition unit, an enterprise tax payment data acquisition unit, a consumption level data acquisition unit and a road network data acquisition unit, which are respectively used for acquiring real estate home data, interest point data, enterprise tax payment data, consumption level data and road network data. The multi-source data fusion module comprises: the interest point classifying unit, used for classifying the interest points whose industry classification is office building or residential area as traffic occurrence interest points, and classifying the other interest points as traffic attraction interest points; the multi-source data matching unit, used for matching the interest point name with the residential area names, the enterprise names and the shop names respectively to obtain the total house use area, the per-capita living area, the annual business amount of the enterprise and the per-capita consumption amount of each successfully matched interest point; the traffic occurrence number calculating unit, used for calculating the traffic occurrence number of each traffic occurrence interest point by dividing the total house use area of the interest point by the per-capita living area; and the traffic attraction number calculating unit, used for calculating the traffic attraction number of each traffic attraction interest point by dividing the annual business amount of the enterprise of the interest point by the per-capita consumption amount.
The urban traffic trip demand and load determining module is used for calculating the traffic occurrence and traffic attraction of the traffic district according to the traffic occurrence and traffic attraction of the interest points; calculating and determining a traffic distribution matrix by adopting a double-constraint gravity model according to the traffic travel demand and the road impedance calculated according to the road network data; carrying out traffic distribution on the traffic distribution matrix on a road network, and calculating the traffic load of each traffic cell according to the distribution result; and the urban construction and updating mode determining module is used for determining the urban construction and updating mode according to the construction year, the traffic load and the traffic demand sequencing condition of each traffic cell. The embodiment of the system for determining the urban construction and update mode based on the multi-source data fusion can be used for executing the embodiment of the method for determining the urban construction and update mode based on the multi-source data fusion, the technical principle, the solved technical problems and the generated technical effects are similar, specific implementation details refer to the embodiment of the method, and details are not repeated here.