Few types of incidents are as devastating and as frequent as fires. Every year The Netherlands sees on average one hundred large fires, totalling more than 4 million euros in damages, not to mention the loss of irreplaceable and sentimental property. There is certainly a market for predictive and risk analysis models for fires. However, because domestic and industrial fires have such a wide variety of causes, most domain specialists hold that quantitative models cannot provide any helpful information. We at Incentro wish to challenge that opinion. We have therefore created the Brandweer risico dashboard; you can read more about these fire risk profiles here. This product interprets not only fire data but also other types of incidents for which fire departments typically provide resources, such as car crashes and dangerous substance spills. In this blog, however, we will discuss the predictive and risk analysis models on which the product's fire modules are built. I will use the city of Amsterdam as an example of how a state fire department could benefit from quantitative models.
The first step in building any model is to familiarize yourself with the data. We were privileged to receive records of all emergency incidents that occurred in Amsterdam over the past six years. The data included:
Incident table:
- Date and time of incidents
- Primary incident category (e.g. fire)
- Secondary incident category (e.g. type of fire)
Object table (e.g. buildings, subway stations):
- Object location on map
- Object address
- Object type
- The area the object covers
These two tables are linked by an n-to-n join table, which allows an incident to involve multiple objects and an object to be involved in multiple incidents. We also obtained the CBS data set for Amsterdam, which contains census data per district, and the KNMI "Daggegevens van het weer in Nederland" data (daily weather data for The Netherlands), which provides weather data for Amsterdam.
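To make the n-to-n structure concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and rows are hypothetical stand-ins, not the actual schema or data we received. It shows how a join table lets you count fire incidents per object while keeping objects that never had a fire.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incidents (
    id INTEGER PRIMARY KEY,
    occurred_at TEXT,
    primary_category TEXT,
    secondary_category TEXT
);
CREATE TABLE objects (
    id INTEGER PRIMARY KEY,
    address TEXT,
    object_type TEXT,
    area_m2 REAL
);
-- The n-to-n join table: one row per (incident, object) pair.
CREATE TABLE incident_objects (
    incident_id INTEGER REFERENCES incidents(id),
    object_id INTEGER REFERENCES objects(id),
    PRIMARY KEY (incident_id, object_id)
);
""")

# Made-up rows, purely for illustration.
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?)", [
    (1, "2015-01-02T23:10:00", "fire", "building fire"),
    (2, "2015-03-14T17:45:00", "fire", "container fire"),
    (3, "2015-04-01T08:20:00", "car crash", None),
])
conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", [
    (1, "Example Street 1", "building", 120.0),
    (2, "Example Street 3", "building", 95.0),
    (3, "Example Square", "subway station", 800.0),
])
conn.executemany("INSERT INTO incident_objects VALUES (?, ?)", [
    (1, 1), (1, 2),  # incident 1 involved two objects
    (2, 1),          # object 1 was involved in two incidents
    (3, 3),
])

# Count fire incidents per object; LEFT JOINs keep objects without fires.
fires_per_object = conn.execute("""
    SELECT o.id, COUNT(i.id)
    FROM objects AS o
    LEFT JOIN incident_objects AS io ON io.object_id = o.id
    LEFT JOIN incidents AS i
           ON i.id = io.incident_id AND i.primary_category = 'fire'
    GROUP BY o.id
    ORDER BY o.id
""").fetchall()

print(fires_per_object)
```

Putting the category filter in the join condition rather than in a WHERE clause is what keeps objects with zero fires in the result, which matters later when such objects serve as negative examples.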
From this data we can derive simple correlations. For example, there appear to be more fire incidents on Fridays and Saturdays (in Amsterdam in general) than during the rest of the week. Figure 1 shows our results.
Figure 1: Frequency of fires over the days of the week
This may have a number of drivers. For example, it may correlate with the number of people going out at night on those days: more people go out on Fridays and Saturdays, so there is more chance of fires being set as a form of entertainment or by accident (e.g. fires started by a cigarette). This is simply a hypothesis one could form based on this data and has in no way been proven.
We can also look at the density of fires over the course of the day.
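A per-weekday count like the one in Figure 1 takes only a few lines of code. The sketch below is a minimal illustration in Python, assuming incident timestamps are available as ISO 8601 strings. The function name and the sample timestamps are made up for the example.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def fires_per_weekday(timestamps):
    """Count incidents per weekday from ISO 8601 timestamp strings."""
    tally = Counter(datetime.fromisoformat(ts).weekday() for ts in timestamps)
    # Return every weekday, including those with zero incidents.
    return {name: tally.get(index, 0) for index, name in enumerate(WEEKDAYS)}

# Made-up timestamps, for illustration only.
sample = [
    "2015-01-02T23:10:00",  # a Friday
    "2015-01-03T01:30:00",  # a Saturday
    "2015-01-03T18:00:00",  # a Saturday
    "2015-01-05T12:00:00",  # a Monday
]
counts = fires_per_weekday(sample)
print(counts)
```

Note that a fire shortly after midnight on Friday night is counted as a Saturday incident here, which is worth keeping in mind when interpreting the weekend peak.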
Figure 2: Density of fire incidents over time of day
It is data like this that makes you appreciate the job firemen do. Here you can see that most fire incidents occur between four and six in the afternoon and that the number of incidents stays fairly high until nine at night.
As interesting as this analysis is, it is not particularly useful for firemen. Simply saying that Friday and Saturday afternoons until nine at night will be the busiest times doesn't give a fire department information that will help them save lives or prevent property damage. One way to help is to advise property owners of the statistical likelihood of their property being affected by a fire. To do this we needed to expand our analysis to take into account the rest of the data about the objects and the environment the objects are in.
Taking all the data, we were able to generate two distinct datasets. The first was a dataset in which every tuple was an instance of a fire at an object, with the CBS data making up the elements of the tuple. In other words, every tuple was the CBS data associated with an object that was involved in a fire incident. If the object was involved in n fires, the tuple occurred n times in the dataset. Objects that had never been involved in a fire incident were also added to the set as a negative control. Each tuple was then given a classification that reflects whether or not it is associated with a fire incident. The dataset was then divided into a training and a test set using an 80/20 ratio. Using the training set, the party package in R was used to derive a decision tree that places the various objects in Amsterdam into over 60 categories, based on the attributes (CBS data) that had the strongest correlation with the frequency of fires. Each category was then given a probability of a fire incident within six years. The tree was then tested against the test set to obtain its accuracy.
Unfortunately, even the category with the highest chance of a fire incident reached only 60%, with most categories falling below 20%. This indicates that the chance of a fire incident has very little to do with the CBS data associated with the object. Still, the model did perform better than the model currently used by the fire department, which only uses the object's function, its size and the size of the district the object is in. That model assigns objects to only 8 categories, with large steps in incident chance between them.
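The construction of this first dataset and the 80/20 split can be sketched as follows. This is a minimal Python illustration, not the R code we used. The feature names, object IDs and values are hypothetical, and the decision tree itself is left out.

```python
import random

def build_dataset(object_features, fire_counts):
    """One labelled row per fire at an object; one negative row per object without fires."""
    rows = []
    for object_id, features in object_features.items():
        n_fires = fire_counts.get(object_id, 0)
        if n_fires == 0:
            rows.append((features, 0))  # negative control
        else:
            rows.extend((features, 1) for _ in range(n_fires))
    return rows

def split_train_test(rows, train_fraction=0.8, seed=42):
    """Shuffle reproducibly, then split into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical district-level features per object:
# (population density per km2, average household size). All values are made up.
object_features = {
    "A": (9500, 1.7), "B": (4200, 2.3), "C": (7800, 1.9), "D": (3100, 2.6),
    "E": (6400, 2.0), "F": (5200, 2.2), "G": (8800, 1.8),
}
fire_counts = {"A": 3, "B": 2, "C": 1}  # objects D to G never had a fire

rows = build_dataset(object_features, fire_counts)
train, test = split_train_test(rows)
print(len(rows), len(train), len(test))
```

One consequence of repeating a tuple n times is that identical rows for the same object can end up in both the training and the test set. Splitting by object rather than by row would be one way to avoid that, if it turns out to matter.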
The second was a dataset in which every tuple was a day in the past six years, with the actual weather on that day and the number of fires that occurred on that day. The dataset was divided in the same manner as in the previous analysis, but this time linear regression was used to derive a formula that predicts the number of fires based on the weather of a particular day. The analysis produced some promising results: we were able to predict the number of fires occurring during a day, given that day's weather, with fairly useful accuracy. These predictions could be used for resource management and other types of preparation, since the model can be used to predict the likelihood of different types of fires based on tomorrow's weather forecast. For example, there is a stronger chance of household fires tomorrow because the temperature is relatively low and there will be a strong wind that could fuel a fire and help it spread to other objects/buildings.
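To illustrate the idea of regressing daily fire counts on weather, here is a minimal ordinary least squares sketch in plain Python. It is not our production model. The two features (temperature and wind speed) and all numbers are made up, and the toy data is exactly linear so that the fit is easy to check.

```python
def fit_ols(feature_rows, targets):
    """Fit y = b0 + b1*x1 + ... + bk*xk by solving the normal equations."""
    X = [[1.0] + list(row) for row in feature_rows]  # prepend intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefficients, features):
    """Apply the fitted formula to one day's weather."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], features))

# Made-up days: (mean temperature in C, mean wind speed in m/s) -> fires that day.
# The targets follow 10 - 0.2*temperature + 0.5*wind exactly, for illustration.
weather = [(0, 2), (5, 4), (10, 1), (15, 6), (20, 3)]
fires = [11.0, 11.0, 8.5, 10.0, 7.5]

coefficients = fit_ols(weather, fires)
forecast = predict(coefficients, (2, 8))  # a cold, windy day
print(coefficients, forecast)
```

Real daily counts are noisy, non-negative integers, so a fit on actual data would not be exact like this one, and a count model such as Poisson regression might be worth comparing against plain linear regression.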
The model is now being used as the predictive analytics engine for the "Brandweer risico analyse dashboard".