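Abstract
Intelligent transportation has attracted wide attention from both academia and industry in recent years. As road networks continue to expand and traffic volumes grow, traffic flow prediction is becoming increasingly important. Analyzing and predicting traffic conditions and the distribution of traffic hot spots is the basis of traffic control and is of great significance to urban traffic management. With advances in vehicle trajectory big data, artificial intelligence, and machine learning, predicting vehicle density with machine learning and big data has become an important technical trend.
Based on vehicle trajectory big data, this thesis uses machine learning to predict urban traffic hot spots. The main research contents and contributions are as follows:
First, a traffic density extraction model is established. A kernel density estimation algorithm extracts vehicle density features from the spatiotemporal vehicle trajectory data, and the predicted hot spots are visualized. This thesis analyzes traffic from the perspective of vehicle density; compared with the traditional attributes of traffic volume and speed, density gives traffic prediction more global feature information and adds a new dimension and perspective to traffic control.
Second, a sliding window prediction model is proposed to construct the training data set required for prediction, and the data are processed with a standard normalization method. Support vector regression is then used to predict taxi density and hot spots, and the model's performance is evaluated with widely accepted metrics. This provides a baseline reference for the subsequent neural network prediction work.
Third, a classical neural network, the multilayer perceptron, is used to compare the performance of network structures with different numbers of layers and neurons, and the long short-term memory model, a recurrent neural network, is used for prediction. Hot spot prediction for Beijing taxis is completed and achieves the expected results. This thesis offers a new approach to applying machine learning to global and local prediction in transportation and provides baseline metrics for research in this direction.
Finally, the prediction performance of the models at different time scales under short-term prediction is summarized, and the concept of long-term prediction is proposed, offering a new idea for subsequent research and extending short-term traffic prediction to long-term scenarios.
Keywords: machine learning, kernel density estimation, traffic hotspot prediction, support vector regression, multilayer perceptron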
Urban hot spot prediction model based on spatiotemporal data of vehicle trajectory
Abstract
In recent years, intelligent transportation has received extensive attention from academia and industry. As road networks continue to improve and traffic volumes grow, traffic flow prediction is becoming increasingly important. Analyzing and predicting traffic conditions and the distribution of traffic hot spots is the basis of traffic control and is of great significance for urban traffic management. With the development of vehicle trajectory big data, artificial intelligence, and machine learning technologies, vehicle density prediction based on machine learning and big data has become an important technical trend.
In this thesis, machine learning is used to predict urban traffic hot spots based on vehicle trajectory big data. The main research contents and contributions are as follows.
Firstly, a traffic density extraction model is established: the kernel density estimation algorithm is used to extract vehicle density features from the spatiotemporal vehicle trajectory data, and the predicted hot spots are visualized. By analyzing traffic from the perspective of vehicle density rather than the traditional attributes of traffic flow and speed, this thesis gives traffic prediction more global feature information and adds a new dimension and perspective to traffic control.
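As a rough illustration of the density extraction step, the following minimal sketch evaluates a two-dimensional kernel density estimate on a small grid and picks the densest grid node as the hot spot. The isotropic Gaussian kernel, the fixed bandwidth, the function name `gaussian_kde_2d`, and the sample coordinates are all assumptions made for illustration; they are not taken from the thesis.

```python
import math

def gaussian_kde_2d(points, grid_x, grid_y, bandwidth):
    """Estimate density at each (x, y) grid node from 2-D sample points.

    Assumption: isotropic Gaussian kernel with a single fixed bandwidth.
    Returns a list of rows, one row per value in grid_y.
    """
    n = len(points)
    # Normalizing constant of a 2-D Gaussian kernel averaged over n samples.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)
    density = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            total = 0.0
            for px, py in points:
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                total += math.exp(-d2 / (2.0 * bandwidth ** 2))
            row.append(norm * total)
        density.append(row)
    return density

# Hypothetical vehicle positions in projected kilometres:
# a cluster near (1, 1) plus one isolated vehicle at (3, 3).
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (3.0, 3.0)]
grid = [0.0, 1.0, 2.0, 3.0]
density = gaussian_kde_2d(points, grid, grid, bandwidth=0.5)

# The hot spot is the grid node with the highest estimated density.
hot_value, hot_row, hot_col = max(
    (value, r, c)
    for r, row in enumerate(density)
    for c, value in enumerate(row)
)
```

In this toy data the densest node lies at the centre of the cluster, while the isolated vehicle contributes only a small local bump. In practice the bandwidth controls how strongly neighbouring positions are smoothed together, so it has to be chosen with care.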
Secondly, a sliding window prediction model is proposed to construct the training data set required for prediction, and a standard normalization method is applied to the data. The support vector regression algorithm is then used for taxi density prediction and hot spot prediction, and the performance of the model is evaluated with widely accepted evaluation metrics. This provides a baseline reference for the subsequent neural network prediction work.
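The way a sliding window turns a density time series into supervised training samples can be sketched as follows. This is only an illustrative sketch: min-max scaling is assumed as the normalization method, and the window length, function names, and density values are hypothetical.

```python
def min_max_normalize(series):
    """Scale values to [0, 1]; min-max scaling is an assumed choice."""
    lo, hi = min(series), max(series)
    span = hi - lo
    if span == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in series], lo, hi
    return [(v - lo) / span for v in series], lo, hi

def sliding_windows(series, window):
    """Pair each run of `window` consecutive values with the next value."""
    samples = []
    for start in range(len(series) - window):
        features = series[start:start + window]
        target = series[start + window]
        samples.append((features, target))
    return samples

# Hypothetical density readings for one grid cell at consecutive time steps.
density_series = [10.0, 12.0, 15.0, 20.0, 18.0, 14.0]
scaled, lo, hi = min_max_normalize(density_series)
samples = sliding_windows(scaled, window=3)
```

Each sample pairs the three most recent normalized densities with the value that follows them, which is the input and target format a regressor such as support vector regression expects. Keeping `lo` and `hi` makes it possible to map predictions back to the original scale.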
Thirdly, a classical neural network, the multi-layer perceptron, is used to compare the performance of network structures with different numbers of layers and neurons, and the long short-term memory model, a type of recurrent neural network, is used to predict taxi hot spots in Beijing, achieving the expected results. This thesis provides a new idea for applying machine learning to global and local prediction in the field of transportation and provides baseline metrics as a reference for research in this direction.
Finally, the prediction performance of the models described in this thesis is summarized at different time scales under the short-term prediction mode, and the concept of long-term prediction is proposed. This offers a new forecasting idea for follow-up research and extends short-term traffic prediction to long-term prediction scenarios.
Key Words: Machine Learning, Kernel Density Estimation, Traffic Hotspot Prediction, Support Vector Regression, Multi-layer Perceptron
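Contents
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
1 Introduction
1.1 Research Background and Significance
1.2 Research Content and Methods
1.3 Organization of the Thesis
2 Literature Review
2.1 Overview of Traffic Prediction
2.2 Overview of Kernel Density Estimation
2.3 Machine Learning Methods in Traffic Prediction
2.3.1 Traditional Traffic Prediction Methods
2.3.2 Machine Learning-Based Prediction Methods
3 Urban Traffic Density Extraction Model Based on Kernel Density Estimation
3.1 Overview of Spatiotemporal Vehicle Trajectory Data
3.1.1 GPS Data Description
3.1.2 Open Street Map
3.1.3 Overview of the Study Area
3.2 Preprocessing of Spatiotemporal Vehicle Trajectory Data
3.3 Kernel Density Estimation Based on Spatiotemporal Trajectory Data
3.3.1 Kernel Density Estimation Algorithm
3.3.2 Computing the Optimal Bandwidth for Kernel Density Estimation
3.4 Extraction of Taxi Hot Spot Information in Beijing
3.4.1 Experimental Environment
3.4.2 Density Mining of Spatiotemporal Data
3.4.3 Model Parameter Settings
3.4.4 Experimental Results and Analysis
3.5 Chapter Summary
4 Hot Spot Prediction Model Based on Support Vector Regression
4.1 Sliding Window-Based Processing of Hot Spot Data
4.1.1 Sliding Window Model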
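4.1.2 Training Data Set Construction
4.1.3 Data Normalization
4.2 Support Vector Regression Prediction
4.2.1 Support Vector Regression
4.2.2 Support Vector Regression Parameters
4.3 Hot Spot Prediction with Support Vector Regression
4.3.1 Experimental Environment
4.3.2 Model Parameter Settings
4.3.3 Prediction Evaluation Criteria
4.3.4 Experimental Results and Analysis
4.4 Chapter Summary
5 Hot Spot Prediction Model Based on Neural Networks
5.1 Overview of Neural Networks
5.1.1 Neural Network Structure
5.1.2 Loss Functions and Regularization
5.1.3 Forward and Backward Propagation
5.1.4 Activation Functions
5.2 Optimization Methods
5.3 Recurrent Neural Networks
5.4 MLP Hot Spot Prediction
5.4.1 Experimental Environment
5.4.2 Model Parameter Settings and Evaluation Criteria
5.4.3 Experimental Results and Analysis
5.5 LSTM Hot Spot Prediction
5.5.1 Experimental Environment
5.5.2 Parameter Settings
5.5.3 Experimental Results and Analysis
5.6 Chapter Summary
6 Model Performance Analysis at Different Time Scales
6.1 Prediction Models at Different Time Scales
6.1.1 Short-Term Prediction Model
6.1.2 Long-Term Prediction Model
6.2 Performance Comparison of Prediction Models at Different Time Scales
6.2.1 Experimental Parameter Settings
6.2.2 Model Comparison and Analysis
6.3 Chapter Summary
7 Conclusions and Outlook
References
Achievements During the Degree Program
Acknowledgements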