Numerico Road Segmentation

Segmentation is the act or process of splitting something into segments. In image processing, the term refers to splitting images into semantically coherent units; in business, it describes dividing a consumer market into subsets of consumers based on shared characteristics. In the context of traffic management and incident prediction, our objective is to partition the road network into a finite set of consistent road sections meeting certain criteria, so that we can accurately and reliably predict incident risk.

At Numerico, working with large geospatial datasets, we are interested in subdividing large continuous networks, such as a road transportation network, into meaningful groupings of road sections. This discrete representation of a continuous system is a prerequisite for further analysis and modelling, and the requirements of the segmentation depend on the analysis or modelling to be performed. Discretising the system into segments allows us to derive the same features per segment, and so to fuse the underlying data streams into a uniform input for the predictive algorithm.

Typically, road networks are divided into sections at regular intervals by markers that provide reference points, for example at every kilometer of road. These references are extremely useful for directing emergency services and maintenance engineers to the specific points along the road where they are required. A naive segmentation could simply reuse these arbitrary intervals to divide the road network; however, this does not account for the complexity of road sections, for example at junctions, where flows on different roads and on/off ramps are related. Our segmentation overcomes these drawbacks with logic that builds road segments using information on traffic flow direction, the density of induction loop sites, and segment length.

To enable meaningful incident risk prediction, the segmentation must cover the entire road network. As a foundation we used a segmentation provided by NWB as a shapefile, which divides the road based on road properties; on top of this we apply a second layer of logic to combine these segments.

The process of dividing a complex road network (see above) into meaningful groupings is non-trivial and required combining several data sources to achieve the desired outcome.

The required data came in several forms:

* An XML file with live induction loop locations – these data are updated every 4-5 days as measurement sites go on and offline.

* An Esri shapefile of the road network – this file contains individual line segments with road features and is updated yearly.

* An Esri shapefile with directionality of roads encoded – this file gives information on travel time between parts of the network and, importantly, has directionality encoded.

The aim of our segmentation method was to join contiguous road sections with a consistent traffic direction such that each segment contains at least two measurement site locations, whilst minimising the overall segment length. Achieving this required merging information from the above sources and defining rules for joining road sections.

Merging of geospatial data may be performed as a spatial join or as an attribute-level join with a primary key. For the data we were using, we needed to perform spatial joins. Our segmentation approach is described in the following steps:

  1. The XML file with induction loops was parsed and converted to the EPSG:28992 coordinate reference system. Measurement sites were assigned to road sections by creating a 15 m spatial buffer around each site and finding the roads that intersect it; each site was allocated to the closest intersecting road section. The road network was then filtered to RWS-managed roads, and locations with multiple sites were analysed.
  2. A key challenge was joining road sections with the same direction of travel, which required joining two non-aligned shapefiles. We took a spatial intersection and, where the overlap of the two areas was greater than 10%, inherited directionality into our road network. This gave directions for most road sections; for the remainder, we searched neighbouring segments and assigned directions iteratively based on the Cartesian products. This was repeated until no more segment directions could be assigned.
  3. The crucial step in this segmentation pipeline was joining contiguous segments to ensure every segment has at least two measurement sites. We created an adjacency matrix from the directional information, giving us the upstream neighbours of each segment; segments without neighbours were removed at this stage. Merging was then performed by joining each under-populated segment to the adjacent upstream segment with the minimum number of sites, and, where two candidate segments had the same number of sites, to the shorter of the two. This process was repeated iteratively over the whole network until no more merging could occur.
  4. Having at least two sites per segment was a requirement because it allows statistical aggregates of the sites and difference calculations. A final step was to determine the order of the measurement sites with respect to the direction of travel: we measured the distance from the start point of each segment to each site and ordered the sites by distance.
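The iterative merging rule in step 3 can be sketched in a few lines of Python. The data structures and values below are invented for illustration (the real pipeline works on shapefile geometries); the sketch only shows the merge logic: segments with too few sites are absorbed into the upstream neighbour with the fewest sites, ties broken by length, repeated until stable.

```python
def merge_segments(segments, upstream, min_sites=2):
    """Iteratively merge each segment with fewer than `min_sites`
    measurement sites into an upstream neighbour, preferring the
    neighbour with the fewest sites, then the shorter one."""
    segments = {k: dict(v) for k, v in segments.items()}
    upstream = {k: set(v) for k, v in upstream.items()}
    merged = True
    while merged:
        merged = False
        for seg_id, seg in list(segments.items()):
            if seg["sites"] >= min_sites:
                continue
            candidates = [u for u in upstream.get(seg_id, ()) if u in segments]
            if not candidates:
                continue
            # fewest sites first, then shortest length
            target = min(candidates,
                         key=lambda u: (segments[u]["sites"], segments[u]["length"]))
            segments[target]["sites"] += seg["sites"]
            segments[target]["length"] += seg["length"]
            # the target inherits the absorbed segment's upstream links
            inherited = upstream.pop(seg_id, set())
            upstream.setdefault(target, set()).update(inherited - {target})
            # segments that pointed at the absorbed segment now point at the target
            for k, nbrs in upstream.items():
                if seg_id in nbrs:
                    nbrs.discard(seg_id)
                    if k != target:
                        nbrs.add(target)
            del segments[seg_id]
            merged = True
    return segments
```

The fixed-point loop mirrors the pipeline's behaviour: each pass may unlock further merges, and the process stops once every remaining segment meets the site threshold or has no upstream neighbour left.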

The result of this segmentation approach was checked with a statistical analysis of the number of loops, the segment length, and the number of incidents in each segment. We validated that the distribution of incidents over the segments is sufficient to use the segmentation for incident risk prediction.

This segmentation pipeline gives a good foundation for further analysis and the development of many more projects built upon a reasoned and consistent network segmentation.


– limitations of shapefiles

– coordinate reference systems

Predicting Road Accidents

Road accidents carry heavy human and financial costs in every country. Our trials using road traffic data from the Netherlands indicate that a proportion of these accidents can be mitigated by using predictive modelling to determine risk and suggest interventions before accidents occur.

In the Netherlands, each year there are approximately 20,000 incidents on the main highways, and many more on other minor roads. With the advent of autonomous vehicle technology, one may hope that incidents will be drastically reduced or perhaps eliminated entirely, since continuously improving robotic systems may eclipse human driving ability and virtually eliminate the common errors that cause accidents. However, for an unknown period of time, human-operated and self-driving cars will share the same roads.

With either human or robotic drivers at the wheel, the chances of an accident depend largely on the state of the vehicle's surroundings: the traffic and environmental conditions. To prevent further accidents, we want to understand when and how these external conditions can be manipulated by traffic-management interventions. The first step towards answering this is to study the predictability of accidents and derive an understanding of the factors that increase the probability of a collision. Some of these factors are intuitive, for example the density of cars in a road section, or limited visibility. However, these factors do not always lead to an accident, and we would like to know whether, with a detailed understanding of traffic and environmental conditions, we can predict incidents better than a baseline built on averages and intuitive factors. Modern predictive algorithms, in particular deep neural networks, may be able to single out patterns that raise the probability of an incident above the average.

In collaboration with the Dutch Transportation Ministry Rijkswaterstaat, we have the opportunity to test our predictive models on the national highway system, which is continuously monitored by probably the densest system of sensors in the world. Magnetic induction loops measure flow (number of vehicles per minute), average speed, vehicle size and vehicle type per lane every 0.5-1 km on all national highways. The massive data generated by this system are complemented by a detailed database of incidents, with exact location, duration and other features, compiled from multiple sources. We began the incident prediction study by selecting segments of 10 km to 20 km from several highways, in areas with high rates of incidents. Thankfully, collisions and serious incidents are relatively rare in the Netherlands relative to the number of cars and the density of the road network, so the largest challenge is to tune the system to predict rare events.

We use feedforward and convolutional neural networks (with convolutions in space along the highway and also in time), and every dataset-balancing trick in the book to estimate the probability that an incident will occur within the next 30-60 minutes, given current traffic and weather conditions. The network input consists of space-time “images” that are a snapshot of current (e.g. last 15-30 minute) traffic dynamics and weather conditions, labeled by impending incidents, and the networks are
trained to perform binary image classification.
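As an illustration of the input construction described above, here is a minimal sketch of slicing loop measurements into labelled space-time windows. The array shape, window length and horizon values are hypothetical placeholders, not the production configuration.

```python
import numpy as np

def make_training_windows(traffic, incident_times, window=30, horizon=(30, 60)):
    """Slice a (minutes x loop_sites) traffic array into space-time
    'images' of `window` minutes, labelled 1 if an incident occurs
    between `horizon[0]` and `horizon[1]` minutes after the window ends."""
    incident_times = set(incident_times)
    X, y = [], []
    for end in range(window, traffic.shape[0] - horizon[1]):
        X.append(traffic[end - window:end])
        label = any(end + h in incident_times
                    for h in range(horizon[0], horizon[1] + 1))
        y.append(int(label))
    return np.stack(X), np.array(y)
```

Each window is a snapshot of recent traffic dynamics; the binary labels make the task an image-classification problem for a feedforward or convolutional network.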

The results above show that the predictive ability of our model declines as the length of the segment is reduced and the number of incident examples the network can learn from drops. In effect, the network can identify conditions that make an incident likely, but as the segment becomes shorter it is harder to identify individual incidents. We can plot the network output against actual incidents to see this:

To avoid reaching hard limits of our predictive system when reducing segment length, we tested a different setup, in which we turn from binary to multi-class classification and feed long highway segments to the network input. The classes correspond to the shorter sub-segment where the incident is predicted to happen. In this way the balance issue is reduced significantly: the task of predicting an incident somewhere on the long stretch becomes easier, and the network is given the task of localising the incident on one of the short segments based on the differential of risk conditions. We'll report on the results of these tests in part II.
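The multi-class labelling can be sketched as follows; the stretch length and number of sub-segments here are hypothetical, chosen only to show the class encoding (class 0 for no incident, classes 1..n for the sub-segment containing the incident).

```python
def localization_label(incident_km, stretch_km=15.0, n_subsegments=5):
    """Map an incident position along a long stretch to a class index:
    0 = no incident, 1..n_subsegments = incident in that sub-segment."""
    if incident_km is None:
        return 0
    sub_len = stretch_km / n_subsegments
    # clamp to the last sub-segment for positions at the far boundary
    return min(int(incident_km // sub_len), n_subsegments - 1) + 1
```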

Data Imputation Algorithm

Mending the fabric of data with closed-loop neural networks

It is a common mantra in data science that the majority of time is spent on cleaning/munging/wrangling data rather than on modelling. Producing clean and structured datasets is essential to train useful machine learning models. Inevitably, real-world datasets tend to be noisy and incomplete, especially when the data come from physical-digital interfaces subject to noise, such as industrial sensors and IoT networks. Imputation is the part of the data cleaning pipeline that creates "machine-friendly" datasets from raw data.
The objective of imputation is to fill missing data values in features of the dataset where a measurement was expected but was not obtained, or the delivered value is a suspected error or outlier.

When the data comes in the form of a numerical time-series, or is organized in some fixed structure with proximity and order (e.g. spatial sensor nets), methods of interpolation can be useful to fill in the gaps and correct for errors. However, for complex nonlinear systems the results of interpolation can be misleading. An example of such a system is highway traffic, where the interplay of vehicle flow/speed and road features creates complex dynamics with abrupt variations and phase transitions, that cannot be easily interpolated.

To perform imputation on these datasets, deep learning algorithms that learn and generate nonlinear patterns can be very useful. One such method that we have developed at Numerico is the closed-loop predictive network, where a deep neural network normally used for prediction is adapted for imputation. The adaptation consists in creating a closed loop in the network graph by feeding back predicted values into the network input at places where the original values are missing. By training with back propagation the predictive accuracy of the model is improved, and the estimated missing values for imputation are more accurate, as can be shown by an analysis of the modified closed-loop objective function (we call this “second order learning”). The process is shown schematically below.

The algorithm is trained for prediction accuracy, and imputation accuracy is tested by artificially masking some of the existing values. Since the actual missing measurements will never be known, the goal is to fill in the missing cells in the database with values that preserve as much as possible the correlations and patterns of the existing data. In that way the imputed dataset can be used in further applications directly, without introducing too much spurious information that reduces the effectiveness of the data.
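A minimal sketch of the closed-loop idea, together with the mask-and-score evaluation, might look like the following. The `predict` callback stands in for the trained neural network, and all names here are illustrative, not the production implementation.

```python
import numpy as np

def closed_loop_impute(series, predict, n_passes=10):
    """Fill NaNs in a 1-D series by feeding the model's own predictions
    back into the input at missing positions (the closed loop)."""
    x = series.copy()
    missing = np.isnan(x)
    x[missing] = np.nanmean(x)          # crude initial fill
    for _ in range(n_passes):
        for t in np.flatnonzero(missing):
            if t >= 1:
                x[t] = predict(x[:t])   # re-predict from the (partly imputed) past
    return x

def imputation_error(series, impute, mask_idx):
    """Artificially hide known values, impute, and report mean abs. error."""
    masked = series.copy()
    masked[mask_idx] = np.nan
    filled = impute(masked)
    return np.mean(np.abs(filled[mask_idx] - series[mask_idx]))
```

With a trained predictor in place of the toy callback, repeated passes let later imputed values benefit from earlier ones, which is the essence of the feedback loop described above.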

Coming back to the example of vehicle traffic, the highway system of the Netherlands is a prime example of a dense and extensive sensor network. With sensors – magnetic loops or just “loops” – that measure vehicle flow and speed per lane about every 500 meters, it has been producing some of the most comprehensive datasets of highway traffic for many years. To impute missing measurements, we use a closed-loop predictive regression network, that fills in the missing data minute-by-minute. A small slice of the resulting completed dataset is shown below, with imputed values in orange. The average error in estimating missing values is ~5%.

It is possible to apply this approach to any ordered dataset, but depending on the specific needs, a hybrid model is usually adopted for the end result. For the highway data, imputing vehicle speed does not make sense when vehicle flow is zero, and we can impose that at the output of our imputation.
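The hybrid constraint for the highway data can be expressed as a simple post-processing step; the convention of forcing speed to zero here is one possible choice, shown for illustration.

```python
import numpy as np

def apply_flow_constraint(flow, speed):
    """Hybrid post-processing: where flow is zero, an imputed speed is
    meaningless, so force it to zero by convention."""
    speed = speed.copy()
    speed[flow == 0] = 0.0
    return speed
```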

The closed-loop neural networks that form the basis of the method are part of the wider family of generative models that includes Generative Adversarial Networks, where the objective function input contains another network output (here the network is structurally the same, but the parameters are sampled from a previous time step). These models are becoming central in problems of constrained optimization to mimic real-world datasets, of which imputation is an example often needed in practice.

Tuscany Tourism Authority

Making a geographic recommendation engine

Recommendation engines are among the most widely used and recognizable applications of machine learning. The familiar "you might also enjoy" or "customers who viewed this item also viewed" when shopping online is the result of an algorithm trying to maximize the relevance (and click-through rate) of the proposed links. The standard way to do this is to use similarity metrics that relate items and users. An approach known as collaborative filtering has been used extensively to make these suggestions, basing the similarity scores on past purchases or user ratings. It represents the ratings of users for items as vectors, which can then be compared with the vectors of other users or items to produce similarity scores using linear algebra. The method, although very effective (it has most conspicuously underpinned the success of e-commerce sites, but is employed in many scenarios), has its limitations: in particular, what is known as the cold start problem, whereby a platform does not yet have enough items or users to make reliable recommendations. In this case, other criteria must be used to characterize the similarity of items. Also, the prominence of some items in ratings and review counts can, if left unchecked, lead to filter bubbles that limit the range of exploration by users.
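The linear-algebra core of collaborative filtering can be sketched in a few lines: items are compared via the cosine similarity of their rating vectors. The toy ratings matrix below is invented for illustration, with zeros meaning "not rated".

```python
import numpy as np

def item_similarities(ratings):
    """Item-item cosine similarity from a (users x items) ratings matrix."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0             # avoid division by zero for unrated items
    normalized = ratings / norms
    return normalized.T @ normalized    # (items x items) similarity matrix
```

Two items rated identically by the same users get similarity 1, while items with no raters in common get 0; recommendations then surface the highest-scoring items the user has not yet seen.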

Recently, we were asked to design a recommendation engine for the website of a leading tourism authority in Europe. In addition to all the famous attractions, the site includes links for many important but sometimes relatively unexplored small towns and sites in the region. To promote exploration of the website by visitors beyond the best-known towns and sites, and to help them discover the region, we borrowed a similarity metric from the physical world: the geodesic distance between two locations.

In addition to the standard collaborative filtering suggestions, we incorporated a mechanism that searches among nearby towns and sites of interest only, to discover recommendations that share themes and features assigned by expert travelers in the region. For example, when a user is viewing a link corresponding to a town known for its historical sites, the engine will search for other sites and towns rated high for history by human experts within a certain radius. It will include some
of them in the recommendations list, alongside the suggestions obtained with non-geographic methods. In this way, a user following these recommendations can essentially browse along a geographic trail connected by common themes, like traveling in the real world while following a variety of interests. Clicking on the geographic recommendations will take users deeper into the region, providing ideas, and aiming to inform their real-world experiences when they visit the region.
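The geographic search can be sketched as a distance filter plus a theme filter. The place records, theme scores and radius below are hypothetical examples, and the haversine formula stands in for whatever geodesic distance the engine actually uses.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle (geodesic) distance between two points, in km."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

def nearby_by_theme(current, places, theme, radius_km=30.0):
    """Recommend places within `radius_km` of `current` whose expert
    score for `theme` is high, sorted by that score."""
    hits = [p for p in places
            if p["name"] != current["name"]
            and haversine_km(current["lat"], current["lon"],
                             p["lat"], p["lon"]) <= radius_km]
    return sorted((p for p in hits if p["themes"].get(theme, 0) >= 4),
                  key=lambda p: -p["themes"][theme])
```

These geographic hits are then mixed into the list alongside the collaborative-filtering suggestions, producing the themed "geographic trail" described above.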
