Tag: python

Exploring Hounslow’s Air Quality Data

Why does Air Quality matter?

Poor air quality is harmful to everyone’s health, and especially to vulnerable groups: people with medical conditions such as heart issues or asthma, and children or elderly people with breathing difficulties. Air quality is not the same everywhere. Pollution can build up in pockets known as “hot spots”, often because they are close to a busy road or near a commercial or industrial zone. Prevailing weather conditions are another factor that affects air quality readings. It is therefore important to monitor air quality regularly, identify troublesome “hot spots”, and use this information to guide actions and policies aimed at cleaner air for us all.

What do we know about Air Quality in Hounslow?

The London Borough of Hounslow partners with Ricardo Energy & Environment, which maintains six Air Quality monitoring sites across the borough. As well as these sites, there are third-party monitoring stations such as those of Breathe London. Live stations provide hourly data containing key measurements of specific pollutants in the air. The current live monitoring stations are:

  • Brentford
  • Chiswick
  • Feltham
  • Gunnersbury
  • Hatton Cross
  • Heston

Quick understanding of Air Quality measures (Pollutants)

Air is a mixture of different gases: Nitrogen (approx. 78%), Oxygen (21%), and many other gases that together make up the remaining approx. 1% of the earth’s atmosphere (NASA). UK national legislation and standards on air quality identify key pollutants in the air, such as Nitrogen Dioxide (NO2), Particulate Matter up to 10 micrometres in size (PM10), Small Particulate Matter under 2.5 micrometres in size (PM2.5), Nitric Oxide (NO), Sulphur Dioxide (SO2) and Ozone (O3).

How can data science support a ‘data-enabled decision making’ process?

Data science offers a deeper lens for interpreting data, opening up new dimensions and opportunities. With key data science technologies such as Python and R, you can find answers in seconds. At the London Borough of Hounslow, the Data Science & Quality Team has been working on air quality data sets generated during the last 10 years. This work has included identifying how seasonal changes affect the hot spots’ live feeds; comparing hot spots over the last 10 years, including how well each has gathered data; correlating pollutants with each other; correlating our data with third-party monitoring stations; engineering and deploying machine learning models for predictive insights; and using cloud technologies to produce rapid outcomes for data-enabled decision making.
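As an illustration of correlating pollutants with each other, here is a minimal sketch in plain Python. The readings, the field names and the helper `pollutant_correlation` are made up for this example and are not the team's actual data or code; the sketch only shows how a Pearson correlation between two pollutants could be computed while skipping hours with missing values.

```python
from math import sqrt

# Hypothetical hourly readings (micrograms per cubic metre).
# The values are invented purely for illustration.
readings = [
    {"hour": "08:00", "NO2": 50.0, "PM10": 25.0},
    {"hour": "09:00", "NO2": 40.0, "PM10": 22.0},
    {"hour": "10:00", "NO2": None, "PM10": 18.0},  # sensor gap
    {"hour": "11:00", "NO2": 30.0, "PM10": 15.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def pollutant_correlation(rows, first, second):
    """Correlate two pollutants, ignoring hours where either value is missing."""
    pairs = [
        (row[first], row[second])
        for row in rows
        if row.get(first) is not None and row.get(second) is not None
    ]
    if len(pairs) < 2:
        raise ValueError("not enough complete readings to correlate")
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


print(round(pollutant_correlation(readings, "NO2", "PM10"), 3))
```

A value close to 1 suggests the two pollutants rise and fall together, a value close to -1 suggests they move in opposite directions, and a value near 0 suggests no linear relationship. Correlation alone does not show that one pollutant causes the other.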

During our data science work, we have learned many facts and picked up patterns from the air quality data. For example, did you know that in winter pollutants stay concentrated in the air for longer than in summer, because cold air is denser and moves more slowly than warm air? The image below shows the last 10 years of seasonal recordings within Hounslow.

[Figure: Air Quality Pollutants / Visual covering yearly seasons. Data visual of Air Quality and its pattern during seasonal changes.]
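A seasonal view like this can be approximated by averaging hourly readings per season. The sketch below is a simplified illustration, not the team's pipeline: the sample readings are invented, and the month-to-season mapping assumes meteorological seasons (December to February as winter, and so on).

```python
from collections import defaultdict
from datetime import datetime

# Assumption: meteorological seasons, keyed by month number.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def seasonal_means(rows, pollutant):
    """Average one pollutant per season, skipping missing readings."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        value = row.get(pollutant)
        if value is None:
            continue  # sensor gap: do not count this hour
        season = SEASON_BY_MONTH[row["timestamp"].month]
        totals[season] += value
        counts[season] += 1
    return {season: totals[season] / counts[season] for season in totals}


# Hypothetical hourly NO2 readings, invented for illustration.
sample = [
    {"timestamp": datetime(2023, 1, 15, 8), "NO2": 50.0},
    {"timestamp": datetime(2023, 1, 15, 9), "NO2": 40.0},
    {"timestamp": datetime(2023, 7, 15, 8), "NO2": 20.0},
    {"timestamp": datetime(2023, 7, 15, 9), "NO2": None},
]

print(seasonal_means(sample, "NO2"))  # {'winter': 45.0, 'summer': 20.0}
```

With ten years of hourly data, the same grouping could be repeated per year and per station to compare how the winter and summer averages change over time.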

What can we do in the future?

The Data Science & Quality Team regularly meets with Environmental & Public Health colleagues and is working on future initiatives for cleaner air in Hounslow. One of these initiatives is to correlate the past 10 years of air quality data against public health respiratory datasets. This initiative will bring in new dimensions and ideas to build on.

If you have an idea or suggestion to share, or would like to correlate Hounslow’s Air Quality data against your own datasets, then please do get in touch with us.

How to get the most out of your anonymous surveys

By Anna Trichkine, Data Quality Lead

During a period of change, surveys are a go-to tool. If my own experience is anything to go by, you’ve probably filled out dozens of surveys over the last year, and for most of them you are yet to see the results.

The good news is that the survey results often have important stories contained within them, even if they’re anonymous.

So how can we retell the stories whilst still maintaining anonymity?

The data team at London Borough of Hounslow have recently been tasked with analysing the data from anonymous workplace surveys. These surveys aim to capture how participants feel about working in the borough, and how they feel about working from home. As the data collected are anonymous, it is important to find patterns or stories in the responses without revealing any personally identifiable experiences. A user profile must not reveal an individual, but should give an insight into a group of people with similar experiences.

How do we do this?

One way to do this is by using decision trees, a type of mathematical model that identifies patterns in your data set by asking true/false style questions.

The model takes into consideration all of the questions asked by the survey, and suggests which of these questions are most important. Even if a survey had dozens of questions, it may be that only two or three of them show significant differences in the way people respond.

Once these key questions are identified, the data team specify minimum sample sizes for each group to make sure anonymity is maintained.

This combined approach can build a compelling story whilst also making sure that the narrative is anonymous.
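To make the idea concrete, here is a minimal sketch of the step at the heart of a decision tree: choosing the true/false question that best separates the responses, while rejecting any split that would create a group below a minimum size. It is written in plain Python with made-up survey data; the function names, question names and the `min_group_size` parameter are assumptions for this example, and it finds a single split rather than a full tree. In practice a library such as scikit-learn offers `DecisionTreeClassifier` with a `min_samples_leaf` setting that plays a similar role, although the post does not say which tooling the team used.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels (0.0 means every label is the same)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(responses, target, min_group_size):
    """Find the true/false question that best separates the target answers.

    Each candidate has the form "did the respondent answer `value` to
    `question`?". Splits that would leave fewer than `min_group_size`
    people on either side are rejected, so no group is small enough to
    point at an individual. Returns (question, value, gain) or None.
    """
    labels = [r[target] for r in responses]
    parent_impurity = gini(labels)
    best = None
    questions = sorted({q for r in responses for q in r if q != target})
    for question in questions:
        for value in sorted({r[question] for r in responses}):
            yes = [r[target] for r in responses if r[question] == value]
            no = [r[target] for r in responses if r[question] != value]
            if len(yes) < min_group_size or len(no) < min_group_size:
                continue  # group too small to report anonymously
            weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
            gain = parent_impurity - weighted
            if best is None or gain > best[2]:
                best = (question, value, gain)
    return best


# Hypothetical survey: 12 respondents, invented for illustration.
responses = (
    [{"commute": "long", "team": t, "prefers_home": "yes"} for t in "ABABAB"]
    + [{"commute": "short", "team": t, "prefers_home": "no"} for t in "ABABAB"]
)

print(best_split(responses, "prefers_home", min_group_size=5))
```

In this toy data the commute question separates the answers perfectly and the team question not at all, so the commute question is selected. Raising `min_group_size` above 6 makes every split too small to report, and the function returns `None`.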

Results

Once the model had identified the key questions and the groupings, the team were able to build up a user profile from the grouped responses.

We were able to create four main personas, each with unique reflections and experiences of working in the Borough of Hounslow. Behind each persona were the collective responses of a group of individuals who shared similar thoughts.
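One simple way to turn grouped responses into a profile is to report each group's size and its most common answer to the remaining questions, leaving out any group below the minimum size. The sketch below assumes this approach for illustration only; the data, the question names and the `group_profiles` helper are hypothetical and do not describe how the four personas were actually built.

```python
from collections import Counter, defaultdict


def group_profiles(responses, key_questions, min_group_size):
    """Summarise each group by its most common answer to every other question."""
    groups = defaultdict(list)
    for response in responses:
        key = tuple(response[q] for q in key_questions)
        groups[key].append(response)

    profiles = {}
    for key, members in groups.items():
        if len(members) < min_group_size:
            continue  # suppress small groups to protect anonymity
        other_questions = [q for q in members[0] if q not in key_questions]
        profiles[key] = {
            "size": len(members),
            "typical_answers": {
                q: Counter(m[q] for m in members).most_common(1)[0][0]
                for q in other_questions
            },
        }
    return profiles


# Hypothetical responses, invented for illustration.
survey = (
    [{"location": "home", "mood": "positive"}] * 4
    + [{"location": "home", "mood": "neutral"}]
    + [{"location": "office", "mood": "negative"}] * 2
)

print(group_profiles(survey, ["location"], min_group_size=3))
```

Here the two office respondents are left out because their group is smaller than the minimum size, while the five home respondents are summarised by their most common mood.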

Relaying these stories as personas, rather than as graphs and line charts, allows the stories to come to life. We can empathise a lot more with a person than with a line. In this way, the anonymous surveys become more exciting both for our team and, hopefully, for those who would like to see the results of the survey.

We will be sharing the personas internally with staff, and would love to hear which persona resonates with you.