CN111932867A - Multisource data-based bus IC card passenger getting-off station derivation method - Google Patents

Multisource data-based bus IC card passenger getting-off station derivation method

Publication number
CN111932867A
Authority
CN
China
Prior art keywords
station
site
bus
card
drop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010559560.6A
Other languages
Chinese (zh)
Other versions
CN111932867B (en)
Inventor
任刚
丁晓澍
朱玉霖
李大韦
凌小静
顾克东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202010559560.6A
Publication of CN111932867A
Application granted
Publication of CN111932867B
Legal status: Active


Abstract

Translated from Chinese

A method for deriving the alighting stations of bus IC card passengers from multi-source data. The method is accurate, effective and practical, can be widely used in bus origin-destination (OD) derivation, and provides a theoretical and methodological basis for analysing passenger flow distribution along a route and for setting departure frequencies. It comprises four steps: (1) Basic data preparation: preprocess the raw data to obtain valid bus GPS data and IC card swipe data, compute the bus arrival timetable, and count the passengers boarding at each station. (2) Derivation from commuting trip chains: identify commuting trip chains, derive the alighting stations of commuters, and count the passengers alighting at each station. (3) Derivation from ordinary trip chains: identify ordinary trip chains, derive alighting stations from the characteristics of each chain, and count the passengers alighting at each station. (4) Derivation from station attraction weights: define how the station attraction weight is calculated, compute the alighting probability of each station from an aggregate perspective, and thereby obtain the number of passengers alighting at each station.

Description

A multi-source-data-based method for deriving the alighting stations of bus IC card passengers

Technical field

The invention belongs to the field of intelligent transportation and relates to a multi-source-data-based method for deriving the alighting stations of bus IC card passengers.

Background

In the era of big data, mining massive datasets with information technology and extracting useful information to support decisions has been applied successfully in many fields. In transportation, as the means of collecting and processing traffic big data mature, obtaining the bus OD matrix from multi-source data and analysing the distribution of bus passenger flow has become a research focus.

Multi-source data include bus GPS data, IC card swipe data and static bus data. Automatic IC card fare collection is now common on buses nationwide, and buses are generally fitted with on-board GPS devices. The resulting IC card swipe data and GPS data are a rich source for studying residents' bus travel. Compared with a traditional manual OD survey, mining the OD matrix from multi-source data offers low collection cost, a short update cycle and high accuracy.

Most bus systems currently charge a flat fare: passengers swipe their card when boarding but not when alighting, so no alighting information is recorded. Deriving the alighting station is therefore both the core of multi-source data mining and its main technical difficulty, and its accuracy strongly affects the derived bus passenger OD results. Previous work falls mainly into two approaches: derivation based on the trip-chain assumption, and calculation of alighting probabilities from station attraction weights. The former mines only a modest share of the multi-source data; the latter, being an aggregate method, cannot recover individual passenger trajectories against which its accuracy could be checked.

Summary of the invention

To address these problems, the invention provides a multi-source-data-based method for deriving the alighting stations of bus IC card passengers. It combines the two commonly used approaches, the trip-chain assumption and the station attraction weight, to improve both the accuracy of the derivation and the share of the data that is used, and it can be widely applied to bus OD derivation. The method comprises the following steps:

(1) Basic data preparation;

Step (1) specifically comprises:

(1.1) Filter records by time and speed: keep the GPS data and IC card data that fall within bus operating hours, and keep the GPS records whose speed is below a threshold;

(1.2) Delete records with missing key fields: delete IC card swipe records that lack the vehicle ID and swipe time fields, and GPS records that lack the latitude/longitude and timestamp fields;

(1.3) Delete duplicate records: group the GPS data by vehicle ID and the IC card swipe data by IC card number, sort each group by time, and where several duplicate records occur within a very short interval keep only the first;

(1.4) Match bus stations: from latitude and longitude, calculate the distance between each bus GPS record and every station on the corresponding line; if the nearest station is less than 50 m away it is the matched station, otherwise the GPS record is deleted;

(1.5) Compute the bus arrival timetable: for each vehicle and each station keep only the earliest GPS record, then adjust the station number order and again delete duplicate records for the same station to obtain the bus arrival timetable;

(1.6) Boarding statistics: count the passengers boarding at each station on the line;

(2) Derive alighting stations from commuting trip chains;

Step (2) specifically comprises:

(2.1) Identify commuters: take one day of bus IC card swipe data and record the IC card numbers that either (a) produced one swipe record in each of the morning and evening peaks, both on the same bus line, or (b) produced two swipe records in each of the morning and evening peaks, with the first and fourth records on the same line and the second and third records on the same line. Select the swipe data of these card numbers and sort them by IC card number and swipe time to form the set of candidate commuting trips;

(2.2) Direct commuting: for a card number with two swipe records in a day, take the boarding station of the first record as the alighting station of the second, and the boarding station of the second record as the alighting station of the first;

(2.3) Commuting with a transfer: for a card number with four swipe records in a day, take the boarding station of the first record as the alighting station of the fourth, the boarding station of the fourth as the alighting station of the first, the boarding station of the second as the alighting station of the third, and the boarding station of the third as the alighting station of the second;

(2.4) Statistics: count the passengers alighting at each station on the line as derived from commuting trip chains;

(3) Derive alighting stations from ordinary trip chains;

Step (3) specifically comprises:

(3.1) Identify ordinary trip chains: after removing the swipe records whose alighting station has already been derived, record the card numbers that produced two or more swipe records within one day, select the swipe data of these card numbers, and sort them by IC card number and swipe time to form the set of candidate ordinary trip chains;

(3.2) Derive alighting stations: for each pair of adjacent records on a card, and for the pair formed by its last and first records, use latitude and longitude to find the station on the line of the earlier swipe that is closest to the boarding station of the later swipe. Check whether this closest station and the later boarding station satisfy three conditions: distance threshold, travel direction and time order. If all three hold, the closest station is the alighting station of the earlier swipe;

(3.3) Statistics: count the passengers alighting at each station on the line as derived from ordinary trip chains;

(4) Derive alighting stations from station attraction weights;

Step (4) specifically comprises:

(4.1) Define the station attraction weight: after removing the swipe records whose alighting station has already been derived, combine the boarding stations with the trip-chain-based alighting results to calculate four indicators for each station: passenger generation intensity, passenger attraction intensity, transfer capacity to metro and public bicycles, and bus transfer capacity. Then calculate the attraction weight of each station;

(4.2) Calculate alighting probabilities: from the attraction weight of each station, calculate the probability of alighting at each station;

(4.3) Calculate alighting counts: from the alighting probability at each station, calculate the number of passengers alighting at each station.

As a further improvement of the invention, the distance between two points in step 1.4 is calculated from latitude and longitude as:

$$S = 2R \arcsin\sqrt{\sin^2\frac{a}{2} + \cos(lat_A)\cos(lat_B)\sin^2\frac{b}{2}} \tag{1}$$

where

S: distance between points A and B, in metres;

lat_A: latitude of point A, converted to radians by multiplying the value in degrees by π/180;

lat_B: latitude of point B;

a: difference in latitude between points A and B;

b: difference in longitude between points A and B;

R: radius of the Earth, generally taken as 6378137 m.

As a further improvement of the invention, the station attraction weight in step 4.1 is calculated as:

$$W_i = k_1 S_i + k_2 E_i + k_3 M_i + k_4 T_i \tag{2}$$

[Four equation images, BDA0002545781830000032 to BDA0002545781830000035, are not reproduced in this text.]

where

W_i: bus attraction weight of station i;

S_i: passenger generation intensity of station i;

E_i: passenger attraction intensity of station i;

M_i: metro and public bicycle transfer capacity of station i;

T_i: bus transfer capacity of station i;

k_1, k_2, k_3, k_4: weights of the four factors, taken as 0.4, 0.3, 0.15 and 0.15 respectively;

N_i: number of passengers boarding at the station corresponding to station i in the up direction;

n: total number of stations on the line;

O_i: number of passengers already known to alight at station i;

m_i: number of metro stations and public bicycle rental points within 100 m of station i;

t_i: number of bus lines passing through station i.
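Equation (2) is a plain weighted sum, which the following minimal Python sketch illustrates. The function name `attraction_weight` is hypothetical, and the four indicators are taken as already computed inputs because their defining equations are not available in this text; only the weights 0.4, 0.3, 0.15 and 0.15 come from the description.

```python
# Weights k1..k4 as stated in the text.
WEIGHTS = (0.4, 0.3, 0.15, 0.15)


def attraction_weight(s_i, e_i, m_i, t_i, k=WEIGHTS):
    """Return W_i = k1*S_i + k2*E_i + k3*M_i + k4*T_i.

    s_i, e_i, m_i, t_i are the four station indicators (generation intensity,
    attraction intensity, metro/bicycle transfer capacity, bus transfer
    capacity). How they are computed is outside this sketch.
    """
    k1, k2, k3, k4 = k
    return k1 * s_i + k2 * e_i + k3 * m_i + k4 * t_i


if __name__ == "__main__":
    # Made-up indicator values, for illustration only.
    print(attraction_weight(0.2, 0.1, 0.5, 0.4))
```

Because the weights sum to 1, a station whose four indicators are all equal to some value receives exactly that value as its weight.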

As a further improvement of the invention, the alighting probability at each station is calculated in step 4.2.

The probability P_ij that a passenger boards at station i and alights at station j is calculated as:

[Equation image BDA0002545781830000041 is not reproduced in this text.]

where

P_ij: probability that a passenger boards at station i and alights at station j;

W_j: bus attraction weight of station j.
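One common way to turn attraction weights into alighting probabilities is to normalise the weights of the stations downstream of the boarding station. The Python sketch below does exactly that. This normalisation is an assumption made for illustration, not the patent's own formula, which is not available in this text; the function name `alighting_probabilities` is likewise hypothetical.

```python
def alighting_probabilities(weights, i):
    """Return a list P where P[j] approximates P_ij for boarding station i.

    ASSUMPTION: P_ij = W_j / sum of W_k over stations k downstream of i,
    and 0 for station i itself and all upstream stations.
    `weights` lists W_0..W_{n-1} in travel order.
    """
    downstream_total = sum(weights[i + 1:])
    if downstream_total == 0:
        # Terminal station, or no attraction downstream: nothing to distribute.
        return [0.0] * len(weights)
    return [w / downstream_total if j > i else 0.0
            for j, w in enumerate(weights)]


if __name__ == "__main__":
    print(alighting_probabilities([0.2, 0.3, 0.1, 0.4], 1))
```

Under this assumption the probabilities for any non-terminal boarding station sum to 1, so every passenger without a derived alighting station is assigned somewhere downstream.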

As a further improvement of the invention, the number of passengers alighting at each station is calculated in step 4.3.

The number of passengers alighting at station i is calculated as:

[Equation image BDA0002545781830000042 is not reproduced in this text.]

where

X_i: total number of passengers alighting at station i;

S_j: number of passengers boarding at station j;

O_j: number of passengers already known to alight at station j;

P_ji: probability that a passenger boards at station j and alights at station i;

O_i: number of passengers already known to alight at station i.
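The aggregation in step 4.3 can be sketched as adding the expected alightings of not-yet-assigned passengers to the alightings already derived from trip chains. The sketch below is one reading of the symbol list, not the patent's formula, which is not available in this text. In particular, `unassigned_boardings[j]` (boardings at station j whose alighting station is still unknown) and the function name are assumptions.

```python
def expected_alightings(unassigned_boardings, known_alightings, prob):
    """Return X where X[i] = O_i + sum over j of U_j * P_ji.

    unassigned_boardings: U_j, boardings at station j with no derived
        alighting station (ASSUMED input, see lead-in).
    known_alightings: O_i, alightings already derived from trip chains.
    prob: matrix with prob[j][i] = probability of boarding at j and
        alighting at i.
    """
    n = len(known_alightings)
    return [
        known_alightings[i]
        + sum(unassigned_boardings[j] * prob[j][i] for j in range(n))
        for i in range(n)
    ]


if __name__ == "__main__":
    u = [10, 4, 0]                      # made-up unassigned boardings
    o = [0, 2, 5]                       # made-up known alightings
    p = [[0.0, 0.5, 0.5],
         [0.0, 0.0, 1.0],
         [0.0, 0.0, 0.0]]
    print(expected_alightings(u, o, p))
```

When each non-terminal row of `prob` sums to 1, the total of the result equals the unassigned boardings plus the known alightings, which is a quick sanity check on any probability matrix fed into it.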

The beneficial effects of the invention are as follows. An accurate bus OD matrix is the basis for analysing the spatio-temporal distribution of bus passenger flow and for bus planning, and the core of deriving that matrix is deriving alighting stations. Using bus GPS and IC card swipe data, the invention proposes a two-stage, multi-source method that combines the trip-chain assumption with station attraction weights. The first stage mines the alighting stations of some passengers from commuting and ordinary trip chains. The second stage takes an aggregate view and applies the station attraction weight method to passengers whose alighting station could not be derived in the first stage; it supplements the first stage, and the first-stage results feed into the attraction weight calculation. The final result is the station-to-station bus OD, from which the distribution of bus passenger flow can be mined to inform bus planning and improve the level of bus service.

Description of drawings

Figure 1 is a flowchart of the method of the invention;

Figure 2 is a flowchart of how the bus arrival timetable is obtained;

Figure 3 is a flowchart of alighting station derivation based on ordinary trip chains;

Figure 4 is a flowchart of the overall alighting station derivation.

Detailed description

The invention is described in further detail below with reference to the drawings and specific embodiments.

The invention provides a multi-source-data-based method for deriving the alighting stations of bus IC card passengers. It combines the two commonly used approaches, the trip-chain assumption and the station attraction weight, to improve both the accuracy of the derivation and the share of the data that is used, and it can be widely applied to bus OD derivation.

As shown in Figure 1, the method comprises the following steps:

(1) Basic data preparation: preprocess the raw data, compute the bus arrival timetable, and count the passengers boarding at each station.

(2) Derivation from commuting trip chains: identify commuting trip chains, derive the alighting stations of commuters, and count the passengers alighting at each station.

(3) Derivation from ordinary trip chains: identify ordinary trip chains, derive alighting stations, and count the passengers alighting at each station.

(4) Derivation from station attraction weights: define how the station attraction weight is calculated, compute the alighting probability of each station, and thereby obtain the number of passengers alighting at each station.

Using bus GPS and IC card swipe data, the invention proposes a two-stage, multi-source method that combines the trip-chain assumption with station attraction weights. The first stage mines the alighting stations of some passengers from commuting and ordinary trip chains. The second stage takes an aggregate view and applies the station attraction weight method to passengers whose alighting station could not be derived in the first stage; it supplements the first stage, and the first-stage results feed into the attraction weight calculation. The final result is the station-to-station bus OD, from which the distribution of bus passenger flow can be mined to inform bus planning and improve the level of bus service.

The invention is described in more detail below.

Step 1: Basic data preparation

The multi-source data consist mainly of IC card swipe data and bus GPS data. The raw data must first be cleaned to remove abnormal and redundant records and improve data quality. To simplify later calculations, the bus arrival timetable is computed from the GPS data and the number of passengers boarding at each station is counted from the IC card swipe data. Basic data preparation comprises the following:

Step 1.1: Filter records by time and speed

Because buses are also dispatched, refuelled and so on, the raw data often contain records outside operating hours. The data used by the invention are bus operation data from Ningbo; based on the overall operating hours of Ningbo's buses the window is set to 5:30-23:00, and only GPS data and IC card data inside this window are kept. In addition, the arrival timetable is computed mainly from GPS data, so the records made while a bus is at a station must be selected from the raw data. At that moment the bus speed should be close to 0; the invention sets the upper speed limit for judging an arrival to 5 km/h and keeps the GPS records whose speed is below 5 km/h.
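The two filters in step 1.1 can be sketched in Python as below. The window 5:30-23:00 and the 5 km/h ceiling come from the text; the record layout (dictionaries with `timestamp`, `swipe_time` and `speed_kmh` keys) and the function names are hypothetical, since the source does not describe the raw data schema.

```python
from datetime import datetime, time

SERVICE_START = time(5, 30)    # operating window stated in the text
SERVICE_END = time(23, 0)
MAX_ARRIVAL_SPEED_KMH = 5.0    # arrival speed ceiling stated in the text


def in_service_hours(ts: datetime) -> bool:
    """True if the time of day of ts lies inside the operating window."""
    return SERVICE_START <= ts.time() <= SERVICE_END


def filter_gps(records):
    """Keep GPS records inside operating hours with speed below the ceiling."""
    return [r for r in records
            if in_service_hours(r["timestamp"])
            and r["speed_kmh"] < MAX_ARRIVAL_SPEED_KMH]


def filter_swipes(records):
    """Keep IC card swipe records inside operating hours."""
    return [r for r in records if in_service_hours(r["swipe_time"])]
```

Whether the window boundaries are inclusive is not stated in the source; the sketch treats both ends as inclusive.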

Step 1.2: Delete records with missing key fields

Records with missing key fields are those whose missing fields cannot be matched and repaired from other data sources. They include IC card swipe records that lack the vehicle ID and swipe time fields, and GPS records that lack the latitude/longitude and timestamp fields. These records are deleted outright.

Step 1.3: Delete duplicate records

Duplicate records are several identical swipe records produced by the same card at the same station within a short time, or several identical GPS records produced by the same bus at the same position within a short time. They may be caused by instability during data collection or transmission. Group the GPS data by vehicle ID and the IC card swipe data by IC card number, sort each group by time, keep only the first swipe record at the same station and the first GPS record at the same position, and delete the remaining duplicates.
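The de-duplication rule in step 1.3 can be sketched with one helper that serves both data sources. The source speaks only of a "short time", so the 60-second default below is an assumption, as are the field names and the function name `drop_duplicates`.

```python
from datetime import datetime, timedelta


def drop_duplicates(records, key_field, place_field, time_field,
                    window=timedelta(seconds=60)):
    """Keep the first record of each burst of repeats.

    Records are sorted by (key, time). A record is dropped when the same key
    (card or vehicle) was already kept at the same place (station or position)
    no more than `window` earlier. ASSUMPTION: the 60 s default is
    illustrative; the source does not give the interval.
    """
    last_kept = {}
    kept = []
    for r in sorted(records, key=lambda r: (r[key_field], r[time_field])):
        k = (r[key_field], r[place_field])
        prev = last_kept.get(k)
        if prev is not None and r[time_field] - prev <= window:
            continue  # repeat of a record that was already kept
        last_kept[k] = r[time_field]
        kept.append(r)
    return kept
```

For swipe data the call would look like `drop_duplicates(swipes, "card_id", "stop_id", "swipe_time")`; for GPS data the key would be the vehicle ID and the place a position field.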

Step 1.4: Match bus stations

To obtain the bus arrival timetable, the bus GPS data must first be matched against the bus station location data so that each GPS record corresponds to a specific station. From latitude and longitude, calculate the distance between each bus GPS record and every station on the corresponding line. If the nearest station is less than 50 m away it is the matched station; otherwise the GPS record is deleted.

The distance between two points is calculated from latitude and longitude as:

$$S = 2R \arcsin\sqrt{\sin^2\frac{a}{2} + \cos(lat_A)\cos(lat_B)\sin^2\frac{b}{2}} \tag{1}$$

where

S: distance between points A and B, in metres;

lat_A: latitude of point A, converted to radians by multiplying the value in degrees by π/180;

lat_B: latitude of point B;

a: difference in latitude between points A and B;

b: difference in longitude between points A and B;

R: radius of the Earth, generally taken as 6378137 m.
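The distance calculation and the 50 m matching rule of step 1.4 can be sketched as follows. The haversine form used here is an assumption consistent with the listed symbols (latitude difference a, longitude difference b, radius R); the function names and the `stops` mapping layout are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # value of R given in the text


def distance_m(lat_a, lon_a, lat_b, lon_b):
    """Great-circle distance in metres between two points given in degrees."""
    lat_a, lon_a, lat_b, lon_b = map(math.radians, (lat_a, lon_a, lat_b, lon_b))
    a = lat_a - lat_b   # latitude difference
    b = lon_a - lon_b   # longitude difference
    h = (math.sin(a / 2) ** 2
         + math.cos(lat_a) * math.cos(lat_b) * math.sin(b / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def match_stop(lat, lon, stops, max_dist_m=50.0):
    """Return the id of the nearest stop if it is closer than max_dist_m.

    `stops` maps stop id -> (lat, lon) for the line the vehicle serves.
    Returns None when no stop is close enough, in which case the GPS record
    would be discarded.
    """
    best_id, best_d = None, float("inf")
    for stop_id, (s_lat, s_lon) in stops.items():
        d = distance_m(lat, lon, s_lat, s_lon)
        if d < best_d:
            best_id, best_d = stop_id, d
    return best_id if best_d < max_dist_m else None
```

A linear scan over one line's stations is enough at this scale; a spatial index would only matter if records had to be matched against every station in the network.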

Step 1.5: Compute the bus arrival timetable

When alighting stations are derived from ordinary trip chains, an important criterion is whether the time order is satisfied, which requires the bus arrival timetable. Within one run, keep only the earliest GPS record for each vehicle at each station, then adjust the station number order and again delete duplicate records for the same station. The timestamp of each remaining GPS record is the arrival time of that bus at that station. Table 1 shows an example of the arrival timetable.

Table 1. Example of a bus arrival timetable

[Table image BDA0002545781830000071 is not reproduced in this text.]
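The core of step 1.5, keeping the earliest matched GPS record per vehicle, run and station, can be sketched as below. The field names (`vehicle_id`, `trip_id`, `stop_seq`, `timestamp`) are hypothetical, and the sketch assumes each record already carries a run identifier; how runs are delimited and how station numbers are "adjusted" is not detailed in the source and is not attempted here.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def arrival_timetable(matched) -> List[Tuple[str, int, int, datetime]]:
    """Return (vehicle_id, trip_id, stop_seq, arrival_time) rows.

    For every (vehicle, run, station) only the earliest timestamp is kept,
    and rows come back sorted by vehicle, run and station sequence.
    """
    earliest: Dict[Tuple[str, int, int], datetime] = {}
    for r in matched:
        key = (r["vehicle_id"], r["trip_id"], r["stop_seq"])
        if key not in earliest or r["timestamp"] < earliest[key]:
            earliest[key] = r["timestamp"]
    return sorted((v, t, s, ts) for (v, t, s), ts in earliest.items())
```

The resulting rows can be turned into a lookup such as `{(vehicle_id, trip_id, stop_seq): arrival_time}` for the time-order check used later with ordinary trip chains.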

Step 1.6: Boarding statistics

From the boarding station information in the IC card swipe data, count the passengers boarding at each station on the line.

Step 2: Derive alighting stations from commuting trip chains

Commuting is the round trip between home and a workplace or school and is an important part of residents' overall travel. Passengers who regularly take the bus in the morning and evening peaks are usually treated as commuters. Their travel is highly regular: the boarding station in the morning peak is taken to be the place of residence, and the boarding station in the evening peak the workplace or school. Since the passenger travels back and forth between the two within a day, the morning-peak alighting station is the evening-peak boarding station, and the evening-peak alighting station is the morning-peak boarding station.

Step 2.1: Identify commuters

Take one day of bus IC card swipe data and record the IC card numbers that either (a) produced one swipe record in each of the morning and evening peaks, both on the same bus line, or (b) produced two swipe records in each of the morning and evening peaks, with the first and fourth records on the same line and the second and third records on the same line. Select the swipe data of these card numbers and sort them by IC card number and swipe time to form the set of candidate commuting trips.
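The two commuter patterns in step 2.1 can be sketched as a classifier over one day of swipes. The peak windows below (07:00-09:00 and 17:00-19:00) are assumptions, because the source does not state them; the field names `card_id`, `swipe_time` and `line`, and the labels `"direct"` and `"transfer"`, are also hypothetical.

```python
from collections import defaultdict
from datetime import datetime, time

AM_PEAK = (time(7, 0), time(9, 0))     # ASSUMED morning peak window
PM_PEAK = (time(17, 0), time(19, 0))   # ASSUMED evening peak window


def _within(window, ts: datetime) -> bool:
    return window[0] <= ts.time() <= window[1]


def classify_commuters(swipes):
    """Return {card_id: "direct" | "transfer"} for cards matching a pattern.

    direct:   exactly 2 swipes, one per peak, both on the same line.
    transfer: exactly 4 swipes, two per peak, lines of records 1 and 4 equal
              and lines of records 2 and 3 equal.
    """
    by_card = defaultdict(list)
    for s in swipes:
        by_card[s["card_id"]].append(s)

    result = {}
    for card, recs in by_card.items():
        recs.sort(key=lambda r: r["swipe_time"])
        am = [r for r in recs if _within(AM_PEAK, r["swipe_time"])]
        pm = [r for r in recs if _within(PM_PEAK, r["swipe_time"])]
        if (len(recs) == 2 and len(am) == 1 and len(pm) == 1
                and am[0]["line"] == pm[0]["line"]):
            result[card] = "direct"
        elif (len(recs) == 4 and len(am) == 2 and len(pm) == 2
                and recs[0]["line"] == recs[3]["line"]
                and recs[1]["line"] == recs[2]["line"]):
            result[card] = "transfer"
    return result
```

Requiring exactly two or four swipes in the day follows the wording of steps 2.2 and 2.3; cards with extra off-peak swipes fall through to the ordinary trip-chain stage.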

Step 2.2: Direct commuting

Passengers who commute on a direct bus do not transfer and usually produce two swipe records on the same bus line in a day. The alighting stations are derived as follows: take the two swipe records of each card, use the boarding station of the first record as the alighting station of the second, and the boarding station of the second record as the alighting station of the first.

Step 2.3: Commuting with a transfer

Passengers who commute with a transfer must transfer in both the morning and evening peaks and produce four or more swipe records in a day. Because two or more transfers within one commuting trip are relatively rare, only a single transfer is considered. The alighting stations are derived as follows: take the four swipe records of each card, use the boarding station of the first record as the alighting station of the fourth, the boarding station of the fourth as the alighting station of the first, the boarding station of the second as the alighting station of the third, and the boarding station of the third as the alighting station of the second. Table 2 shows an example of alighting stations derived from commuting trip chains.

Table 2. Example of alighting stations derived from commuting trip chains

[Table image BDA0002545781830000081 is not reproduced in this text.]
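Both commuter rules pair each record with its mirror image in the day's sequence (1 with 2, or 1 with 4 and 2 with 3), so the inferred alighting stations are simply the boarding stations in reverse order. A minimal sketch, with the hypothetical field name `board_stop` and function name `infer_commuter_alighting`:

```python
def infer_commuter_alighting(records):
    """Return the inferred alighting station for each record, in order.

    `records` are one card's swipes for the day, sorted by time, and must
    number 2 (direct commute) or 4 (commute with one transfer). Record k
    alights where its mirror record boarded, i.e. the boarding stations
    reversed.
    """
    if len(records) not in (2, 4):
        raise ValueError("expected 2 or 4 commuter swipe records")
    boards = [r["board_stop"] for r in records]
    return list(reversed(boards))
```

For a transfer commuter who boards at stations 12, 30, 31 and 13 over the day, the function returns 13, 31, 30 and 12 as the respective alighting stations.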

Step 2.4: Statistics for commuting trip chains

Collect the swipe records whose alighting station was derived from commuting trip chains and count the passengers alighting at each station on the line.

步骤3:基于普通出行链推导下车站点Step 3: Derive the drop-off site based on the common travel chain

普通出行链区别于通勤出行链在于它具有一定随机性,出行时间不一定在高峰时段,上、下车站点也不固定,因此基于普通出行链的下车站点推导需要在时间和空间上进行相应约束。The ordinary travel chain is different from the commuter travel chain in that it has a certain randomness. The travel time is not necessarily in peak hours, and the pick-up and drop-off sites are not fixed. Therefore, the derivation of the drop-off site based on the common travel chain needs to be corresponding in time and space. constraint.

步骤3.1:识别普通出行链Step 3.1: Identify common travel chains

筛除已经推导出下车站点的刷卡记录后识别普通出行链,首先要求乘客一天之内产生2条以上刷卡记录。故记录一天之内产生2条以上刷卡记录的卡号,筛选这些卡号的刷卡数据,并按照IC卡编号和刷卡时间进行排序,作为可能的普通出行链集合。After screening out the card swiping records that have been deduced from the alighting station, identify the common travel chain, and first require passengers to generate more than 2 card swiping records within a day. Therefore, record the card numbers with more than 2 card swiping records in one day, filter the card swiping data of these card numbers, and sort them according to the IC card number and card swiping time, as a set of possible common travel chains.

Step 3.2: Derive alighting stops

If two adjacent card-swipe records of the same card are regarded as consecutive bus trips, the alighting stop of the earlier trip can be derived from the boarding stop of the later one. Whether two trips are consecutive is judged mainly by whether a stop served by the previously used bus route lies near the boarding stop of the later swipe; if so, the two bus trips are regarded as consecutive.

The derivation based on ordinary trip chains proceeds as follows. For each pair of adjacent records of a card, and for the pair formed by its last and first records, use longitude and latitude to find the stop on the route of the earlier swipe that is closest to the boarding stop of the later swipe. Then check in turn whether this closest stop and the later boarding stop satisfy three conditions on distance, travel direction and time order: the two stops are within 500 m of each other; the closest stop is downstream of the boarding stop of the earlier trip; and the passenger reaches the closest stop and alights there before boarding the later trip. If all three conditions hold, the closest stop is the alighting stop of the earlier swipe. Table 3 gives an example of alighting stops derived from ordinary trip chains.

Table 3. Example of alighting stops derived from ordinary trip chains

Figure BDA0002545781830000091
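
The three checks of Step 3.2 can be sketched as follows. The distance uses the standard haversine formula with the Earth radius stated in the claims (6378137 m); since the patent's own distance formula is only available as an image, treating it as haversine is an assumption. Function names, argument layout and the use of seconds for times are hypothetical.

```python
import math
from typing import List, Optional, Tuple

EARTH_RADIUS_M = 6378137.0  # radius value quoted in the claims


def haversine_m(lat_a: float, lon_a: float, lat_b: float, lon_b: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    d_phi = phi_b - phi_a
    d_lam = math.radians(lon_b - lon_a)
    h = math.sin(d_phi / 2) ** 2 + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def derive_alighting(
    route_stops: List[Tuple[float, float]],  # (lat, lon) of the earlier route, in travel order
    boarding_index: int,                     # index of the earlier trip's boarding stop
    next_boarding: Tuple[float, float],      # (lat, lon) of the later trip's boarding stop
    arrival_times: List[float],              # arrival time of the earlier vehicle at each stop (s)
    next_boarding_time: float,               # swipe time of the later trip (s)
    max_dist_m: float = 500.0,
) -> Optional[int]:
    """Return the index of the inferred alighting stop, or None if any check fails."""
    nearest = min(range(len(route_stops)),
                  key=lambda k: haversine_m(*route_stops[k], *next_boarding))
    within_distance = haversine_m(*route_stops[nearest], *next_boarding) <= max_dist_m
    downstream = nearest > boarding_index
    in_time = arrival_times[nearest] <= next_boarding_time
    return nearest if (within_distance and downstream and in_time) else None
```

The arrival times would come from the bus arrival timetable built in the data-preparation step; a record that fails any check is left for the attraction-weight method of Step 4.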

Step 3.3: Tally the alighting stops derived from ordinary trip chains

Collect the card-swipe records whose alighting stops were derived from ordinary trip chains, and count the number of passengers alighting at each stop on the route.

Step 4: Derive alighting stops from stop attraction weights

The probability of alighting at each stop is computed from the attraction weight of each stop, and all passengers boarding at each stop of the route are distributed over the alighting stops according to these probabilities; this completes the derivation based on stop attraction weights. The core of the approach is the definition and calculation of the stop attraction weight. The method cannot derive the specific alighting stop of an individual passenger; it yields only the distribution of passenger flow between the stops of a bus route. It is therefore an aggregate model, and it can effectively recover the alightings that cannot be derived from trip chains.

Step 4.1: Define the stop attraction weight

The invention defines the stop attraction weight from four aspects: the passenger-flow generation intensity of the stop, its passenger-flow attraction intensity, its connection with nearby metro and public-bicycle facilities, and its bus transfer capacity. The indicators are defined as follows:

(1) Passenger-flow generation intensity of the stop

By the symmetry of passenger flow, the passenger flow generated at a bus stop and the passenger flow attracted to it should be roughly balanced: the more passengers a stop generates, the higher the probability of alighting there. In other words, the share of a route's total boardings that occurs at a stop in the up direction is close to the share of the route's total alightings that occurs at the corresponding stop in the down direction. The generation intensity of a stop is computed as the share of total boardings that occurs at its corresponding up-direction stop.

(2) Passenger-flow attraction intensity of the stop

Attraction intensity is usually determined from the land use around a stop. Because the alighting stops derived with the trip-chain method coincide to some extent with the stops that land use suggests are strong attractors, the invention uses the per-stop alighting counts derived under the trip-chain assumption and computes the attraction intensity of a stop as its share of the total derived alightings.

(3) Connection of the stop with metro and public bicycles

A stop with stronger transfer capacity attracts more passenger flow. With the rise of other public transport modes such as the metro and public bicycles, the transfer capacity of a bus stop also shows in its connection with them: the more metro stations and public-bicycle stations around a bus stop, the stronger its transfer capacity is considered to be.

(4) Bus transfer capacity of the stop

The bus transfer capacity of a stop is reflected in the bus routes available for transfer there: the more bus routes pass through a stop, the stronger its transfer capacity is considered to be.

Taking these four factors together, and using the alighting stops already derived from trip chains, the stop attraction weight is calculated.

The stop attraction weight is calculated as:

$$W_i = k_1 S_i + k_2 E_i + k_3 M_i + k_4 T_i \tag{2}$$

Figure BDA0002545781830000101

Figure BDA0002545781830000102

Figure BDA0002545781830000103

Figure BDA0002545781830000104

where

$W_i$: bus attraction weight of stop $i$;

$S_i$: passenger-flow generation intensity of stop $i$;

$E_i$: passenger-flow attraction intensity of stop $i$;

$M_i$: metro and public-bicycle transfer capacity of stop $i$;

$T_i$: bus transfer capacity of stop $i$;

$k_1, k_2, k_3, k_4$: weights of the four factors, set after consideration to 0.4, 0.3, 0.15 and 0.15 respectively;

$N_i$: number of passengers boarding at the up-direction stop corresponding to stop $i$;

$n$: total number of stops on the route;

$O_i$: known number of passengers alighting at stop $i$;

$m_i$: number of metro stations and public-bicycle rental points within 100 m of stop $i$;

$t_i$: number of bus routes passing through stop $i$.
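
The weighted sum in formula (2) can be sketched as below. The text defines the generation and attraction intensities as shares of the route totals; the formulas for the two transfer-capacity indicators are only available as images, so normalising them as shares as well is an assumption of this sketch, and all function names are hypothetical.

```python
from typing import List, Sequence

WEIGHTS = (0.4, 0.3, 0.15, 0.15)  # k1..k4 as stated in the text


def shares(values: Sequence[float]) -> List[float]:
    """Normalise counts to shares of their total (all zeros if the total is zero)."""
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)
    return [v / total for v in values]


def attraction_weights(N: Sequence[float], O: Sequence[float],
                       m: Sequence[float], t: Sequence[float],
                       k: Sequence[float] = WEIGHTS) -> List[float]:
    """W_i = k1*S_i + k2*E_i + k3*M_i + k4*T_i, with each indicator taken as a share.

    S and E follow the text; M and T as shares are an assumption.
    """
    S, E, M, T = shares(N), shares(O), shares(m), shares(t)
    return [k[0] * s + k[1] * e + k[2] * mm + k[3] * tt
            for s, e, mm, tt in zip(S, E, M, T)]


# Hypothetical counts for a three-stop route.
print(attraction_weights(N=[10, 30, 60], O=[20, 20, 60], m=[0, 1, 1], t=[2, 2, 4]))
```

Under this normalisation the weights of all stops on a route sum to 1, because the four factor weights themselves sum to 1.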

Table 4 gives an example of computed stop attraction weights. In the table, S, E, M and T denote the passenger-flow generation intensity, the passenger-flow attraction intensity, the metro and public-bicycle transfer capacity, and the bus transfer capacity, respectively, and W is the resulting stop attraction weight.

Table 4. Example of computed stop attraction weights

Figure BDA0002545781830000111

Step 4.2: Calculate alighting probabilities

The probability of alighting at each stop is calculated from the stop attraction weights.

The probability $P_{ij}$ of boarding at stop $i$ and alighting at stop $j$ is calculated as:

Figure BDA0002545781830000112

where

$P_{ij}$: probability that a passenger boarding at stop $i$ alights at stop $j$;

$W_j$: bus attraction weight of stop $j$.
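
The probability formula is available only as an image, so the sketch below uses a common form as an assumption: a passenger boarding at stop i can alight only at a downstream stop j > i, with probability proportional to that stop's attraction weight. The function name is hypothetical.

```python
from typing import List, Sequence


def alighting_probabilities(W: Sequence[float]) -> List[List[float]]:
    """Assumed form: P[i][j] = W[j] / sum(W[i+1:]) for j > i, and 0 otherwise."""
    n = len(W)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        downstream_total = sum(W[i + 1:])
        if downstream_total > 0:
            for j in range(i + 1, n):
                P[i][j] = W[j] / downstream_total
    return P


# Hypothetical weights for a three-stop route.
for row in alighting_probabilities([0.2, 0.3, 0.5]):
    print(row)
```

Under this assumption every row except the last sums to 1, and the matrix is strictly upper triangular, which matches the row and column layout described for the probability matrix.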

Table 5 gives an example of the alighting probability matrix. Rows and columns are indexed by the sequence number of the stops on the route; the entry in row $i$, column $j$ is the probability $P_{ij}$ of boarding at stop $i$ and alighting at stop $j$.

Table 5. Example of the alighting probability matrix

Figure BDA0002545781830000113

Step 4.3: Calculate the number of passengers alighting at each stop

The number of passengers alighting at each stop is calculated from the alighting probabilities.

The number of passengers alighting at stop $i$ is calculated as:

Figure BDA0002545781830000121

where

$X_i$: total number of passengers alighting at stop $i$;

$S_j$: number of passengers boarding at stop $j$;

$O_j$: known number of passengers alighting at stop $j$;

$P_{ji}$: probability that a passenger boarding at stop $j$ alights at stop $i$;

$O_i$: known number of passengers alighting at stop $i$.
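
The allocation formula is available only as an image. The sketch below assumes a form inferred from the symbol list: the boarders at each stop whose alighting stop is still unknown are spread over the stops according to the probability matrix, and the alightings already known are added on top. Which count is subtracted from the boardings is an assumption here, so the sketch keeps it as a separate argument; all names are hypothetical.

```python
from typing import List, Sequence


def alighting_counts(boardings: Sequence[float],
                     resolved_boardings: Sequence[float],
                     known_alightings: Sequence[float],
                     P: Sequence[Sequence[float]]) -> List[float]:
    """Assumed form: X_i = sum_j (boardings[j] - resolved_boardings[j]) * P[j][i] + known_alightings[i].

    resolved_boardings[j] is the number of stop-j boarders whose alighting stop
    was already derived from trip chains (an assumption of this sketch).
    """
    n = len(boardings)
    X = []
    for i in range(n):
        allocated = sum((boardings[j] - resolved_boardings[j]) * P[j][i] for j in range(n))
        X.append(allocated + known_alightings[i])
    return X


# Hypothetical three-stop example with a hand-written probability matrix.
P = [[0.0, 0.375, 0.625],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]
print(alighting_counts([10, 5, 0], [4, 1, 0], [0, 2, 3], P))
```

When the resolved boardings and the known alightings describe the same passengers, the total of the result equals the total boardings, which is a useful sanity check on the inputs.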

Table 6 gives an example of the boarding and alighting counts at each stop.

Table 6. Example of boarding and alighting counts at each stop

Figure BDA0002545781830000122

These results were compared with reference results provided by Ningbo City to verify the effectiveness of the proposed derivation method. Because the two mine different proportions of the IC card-swipe data, the verification computes, for each stop, its share of total alightings, and then runs a correlation analysis between the Ningbo reference results and the results of the invention. Table 7 gives the results of the correlation analysis.

Table 7. Correlation analysis of the per-stop alighting shares

Figure BDA0002545781830000123

Note: ** Correlation is significant at the 0.01 level (two-tailed).

As Table 7 shows, p = 0.002 < 0.01, so the per-stop alighting shares in the Ningbo reference results and in the results derived by the invention are significantly correlated. The Pearson correlation coefficient is 0.619, indicating that the two are positively and closely correlated.
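
The comparison can be sketched as follows: convert each set of per-stop alighting counts to shares, then compute the Pearson correlation coefficient between the two share vectors. The counts below are hypothetical and are not the Ningbo data; the significance test (p-value) is omitted, and the function names are assumptions.

```python
import math
from typing import List, Sequence


def to_shares(counts: Sequence[float]) -> List[float]:
    """Convert per-stop counts to shares of the total."""
    total = sum(counts)
    if total == 0:
        raise ValueError("total count must be positive")
    return [c / total for c in counts]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-stop alighting counts from two sources with different sample sizes.
reference = to_shares([120, 80, 200, 50])
derived = to_shares([30, 25, 40, 15])
print(round(pearson_r(reference, derived), 3))
```

Working with shares rather than raw counts is what makes the two sources comparable despite their different mining proportions.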

The above is merely a preferred embodiment of the present invention and does not limit the invention in any other form; any modification or equivalent variation made according to the technical essence of the present invention still falls within the scope of protection claimed by the present invention.

Claims (5)

1. A method for deriving the alighting stops of bus IC card passengers based on multi-source data, characterized by comprising the following steps:

(1) Basic data preparation; step (1) specifically comprises:

(1.1) Filter records by time and speed: keep the GPS data and IC card data that fall within bus operating hours, and keep the GPS records whose speed is below a threshold;

(1.2) Delete records with missing key fields: delete the IC card-swipe records that lack the vehicle ID and swipe-time fields, and the GPS records that lack the longitude/latitude and timestamp fields;

(1.3) Delete duplicate records: group the GPS data by vehicle ID and the IC card-swipe data by IC card number, sort each by time, and, where several duplicate records occur within a very short interval, keep only the first;

(1.4) Match bus stops: compute from longitude and latitude the distance between each bus GPS record and every stop on the corresponding route; if the nearest stop is less than 50 m away, it is the matched stop, otherwise the GPS record is deleted;

(1.5) Compute the bus arrival timetable: for each vehicle and each stop keep only the earliest GPS record, then adjust the stop sequence numbers and again delete duplicate records for the same stop, giving the bus arrival timetable;

(1.6) Tally boarding stops: count the number of passengers boarding at each stop on the route;

(2) Derive alighting stops from commuting trip chains; step (2) specifically comprises:

(2.1) Identify commuting passengers: take one day of bus IC card-swipe data and record the IC card numbers that either produced one swipe record in the morning peak and one in the evening peak on the same bus route, or produced two swipe records in each of the morning and evening peaks with the first and fourth records on the same route and the second and third records on the same route; extract the swipe data of these card numbers and sort them by IC card number and swipe time to form the set of candidate commuting trips;

(2.2) Derive alighting stops for direct commutes: for a card number that produces two swipe records in a day, take the boarding stop of the first record as the alighting stop of the second record, and the boarding stop of the second record as the alighting stop of the first record;

(2.3) Derive alighting stops for transfer commutes: for a card number that produces four swipe records in a day, take the boarding stop of the first record as the alighting stop of the fourth record, the boarding stop of the fourth record as the alighting stop of the first record, the boarding stop of the second record as the alighting stop of the third record, and the boarding stop of the third record as the alighting stop of the second record;

(2.4) Tally the alighting stops derived from commuting trip chains: count the passengers alighting at each stop on the route as derived from commuting trip chains;

(3) Derive alighting stops from ordinary trip chains; step (3) specifically comprises:

(3.1) Identify ordinary trip chains: after filtering out the swipe records whose alighting stops have already been derived, record the card numbers that produce two or more swipe records within one day, extract the swipe data of these card numbers, and sort them by IC card number and swipe time to form the set of candidate ordinary trip chains;

(3.2) Derive alighting stops: for each pair of adjacent records of a card, and for the pair formed by its last and first records, use longitude and latitude to find the stop on the route of the earlier swipe that is closest to the boarding stop of the later swipe; check whether this closest stop and the later boarding stop satisfy the three conditions of distance threshold, travel direction and time order; if all three are satisfied, the closest stop is the alighting stop of the earlier swipe;

(3.3) Tally the alighting stops derived from ordinary trip chains: count the passengers alighting at each stop on the route as derived from ordinary trip chains;

(4) Derive alighting stops from stop attraction weights; step (4) specifically comprises:

(4.1) Define the stop attraction weight: filter out the swipe records whose alighting stops have already been derived; combining the boarding stops with the alighting stops derived from trip chains, compute four indicators for each stop (passenger-flow generation intensity, passenger-flow attraction intensity, metro and public-bicycle transfer capacity, and bus transfer capacity), and then compute the attraction weight of each stop;

(4.2) Calculate alighting probabilities: calculate the probability of alighting at each stop from the stop attraction weights;

(4.3) Calculate the number of passengers alighting at each stop: calculate the number of passengers alighting at each stop from the alighting probabilities.

2. The method for deriving the alighting stops of bus IC card passengers based on multi-source data according to claim 1, characterized in that in step 1.4 the distance between two points is calculated from longitude and latitude as:

Figure FDA0002545781820000021

where

$S$: distance between points A and B, in metres;

$lat_A$: latitude of point A, which must be converted to radians using $lat_A = lat_A \times \pi / 180°$;

$lat_B$: latitude of point B;

$a$: latitude difference between points A and B;

$b$: longitude difference between points A and B;

$R$: radius of the Earth, generally taken as 6378137 m.

3. The method for deriving the alighting stops of bus IC card passengers based on multi-source data according to claim 1, characterized in that in step 4.1 the stop attraction weight is calculated as:

$$W_i = k_1 S_i + k_2 E_i + k_3 M_i + k_4 T_i \tag{2}$$

Figure FDA0002545781820000022

Figure FDA0002545781820000023

Figure FDA0002545781820000031

Figure FDA0002545781820000032

where

$W_i$: bus attraction weight of stop $i$;

$S_i$: passenger-flow generation intensity of stop $i$;

$E_i$: passenger-flow attraction intensity of stop $i$;

$M_i$: metro and public-bicycle transfer capacity of stop $i$;

$T_i$: bus transfer capacity of stop $i$;

$k_1, k_2, k_3, k_4$: weights of the four factors, set after consideration to 0.4, 0.3, 0.15 and 0.15 respectively;

$N_i$: number of passengers boarding at the up-direction stop corresponding to stop $i$;

$n$: total number of stops on the route;

$O_i$: known number of passengers alighting at stop $i$;

$m_i$: number of metro stations and public-bicycle rental points within 100 m of stop $i$;

$t_i$: number of bus routes passing through stop $i$.

4. The method for deriving the alighting stops of bus IC card passengers based on multi-source data according to claim 1, characterized in that in step 4.2 the probability of alighting at each stop is calculated, the probability $P_{ij}$ of boarding at stop $i$ and alighting at stop $j$ being calculated as:

Figure FDA0002545781820000033

where

$P_{ij}$: probability that a passenger boarding at stop $i$ alights at stop $j$;

$W_j$: bus attraction weight of stop $j$.

5. The method for deriving the alighting stops of bus IC card passengers based on multi-source data according to claim 1, characterized in that in step 4.3 the number of passengers alighting at each stop is calculated, the number of passengers alighting at stop $i$ being calculated as:

Figure FDA0002545781820000034

where

$X_i$: total number of passengers alighting at stop $i$;

$S_j$: number of passengers boarding at stop $j$;

$O_j$: known number of passengers alighting at stop $j$;

$P_{ji}$: probability that a passenger boarding at stop $j$ alights at stop $i$;

$O_i$: known number of passengers alighting at stop $i$.
Application number: CN202010559560.6A
Priority date: 2020-06-18
Filing date: 2020-06-18
Title: Multisource data-based bus IC card passenger getting-off station derivation method
Status: Active
Granted as: CN111932867B (en)

Publications (2)

Publication Number, Publication Date
CN111932867A (en), 2020-11-13
CN111932867B (en), 2022-04-29

Family ID: 73316553

Country: CN


Legal Events

Code, Title
PB01, Publication
SE01, Entry into force of request for substantive examination
GR01, Patent grant
