
Datasets clarification

Hi there,
I’m exploring the datasets published for the iScape project, and I’m interested in the Dublin datasets. I need some clarification about them; maybe they are described in a file that I didn’t find. All I found was the yaml file, which wasn’t clear enough for me.
So here are my questions:
1. Could you describe the abbreviations used for the column names in the files 5262.csv and 5262_processed.csv?
2. How many kits were used for the file [2019-03_EXT_UCD_URBAN_BACKGROUND_API_CITY_COUNCIL_REF]? Does the file contain the final result of the readings from the two kits 5262 and 5565?
3. Were Alphasense sensors attached to kits 5262 and 5565 to obtain the pollutant readings of SO2, O3, CO and nitrogen dioxide?
4. How did you convert the results from every minute to every 15 minutes? Was it interpolation, averaging, or some other mathematical operation?
5. Can the location of UCD (latitude and longitude 53.3067° N, 6.2210° W) be taken as the location of the kits?
6. How were the kits placed in UCD? Is there any description or image of the kits at their monitoring site?
Thanks for your time and help,


Thanks for using the datasets. We added some more information in the dataset description. Step by step:

  1. The column names are now included in the dataset description. In principle, the _processed.csv file differs only in the following: NaN cleaning and the electrochemical sensor calculations (CO, NO2 and O3). The methods for calculating these pollutants are described here.
  2. This is a reference (high end sensor), from Dublin’s City Council official data. You can see this in the .yaml descriptor file, in the field type: REFERENCE.
  3. The sensors are CO (1), NO2 (2), OX (3).
  4. Resampling with mean - reference here
  5. Yes, they were co-located for the period of the test.
  6. This is what I have:

I will put an example on how to interpret the yaml files in the dataset description for future downloads. Thanks for using it!
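
For point 4, here is a minimal sketch of what resampling with the mean looks like in pandas. The data is synthetic and the column name is only illustrative; it is not taken from the actual kit files.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute readings; the column name and values are illustrative only.
index = pd.date_range("2019-03-01 00:00", periods=60, freq="min")
raw = pd.DataFrame({"NO2_ppb": np.arange(60, dtype=float)}, index=index)

# Average every 15-minute window. By default each bin is labelled
# with its start time, and NaN values are skipped by mean().
resampled = raw.resample("15min").mean()
print(resampled)
```

Each output row is the average of the 15 one-minute readings in that window, so no values are interpolated.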

Just to confirm, the dataset description has been updated in the zenodo repository:

Please, do not hesitate to post more questions about it here :wink:

Thank you for your fast reply.
I will check each answer in detail.

Best Regards,


I hope you are fine
Well I have more questions :nerd_face:
The attributes of the processed file (5262_processed.csv) are still not clear to me…
1. What do the DELTAS_OVL_0-5-50 and DELTAS_OVL_0-30-50 attributes stand for?
2. Which is the final calculated value that I should take for each pollutant?
3. So the file 5262_processed has the results of one kit and three Alphasense sensors (NO2, O3 and CO), validated against the Dublin City Council datasets?

Thanks again for your cooperation,


  1. That is an identifier for the parameters of the algorithm used to calculate NO2 and OX. The algorithm is the one described in this publication, and the parameters are chosen to best suit the different datasets.
  • DELTAS: usage of the algorithm mentioned above
  • OVL_0: not really relevant
  • 5-50 or 30-50: the range of minutes (from 5 to 50 or from 30 to 50) used for the application of the algorithm above.
  2. Either; they are very similar, but normally 30-50 gives better results.
  3. Exactly, and the same goes for 5565_processed (there were two stations co-located; however, one of them was down for a while and then put back).
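
If it helps when working with the processed file, the naming scheme above can be split into its parts programmatically. This is only a sketch: the helper name `parse_deltas_column` and the regular expression are hypothetical, and it assumes every such column follows the `<POLLUTANT>_DELTAS_OVL_<n>-<start>-<end>` pattern seen in `NO2_DELTAS_OVL_0-30-50`.

```python
import re

# Assumed pattern: <POLLUTANT>_DELTAS_OVL_<n>-<start>-<end>
PATTERN = re.compile(
    r"^(?P<pollutant>[A-Za-z0-9]+)_DELTAS_OVL_(?P<ovl>\d+)-(?P<start>\d+)-(?P<end>\d+)$"
)

def parse_deltas_column(name):
    """Split a DELTAS column name into its parts, or return None if it does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    return {
        "pollutant": match.group("pollutant"),
        "ovl": int(match.group("ovl")),          # described above as not really relevant
        "start_min": int(match.group("start")),  # start of the minute range
        "end_min": int(match.group("end")),      # end of the minute range
    }

print(parse_deltas_column("NO2_DELTAS_OVL_0-30-50"))
```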

Hope it helps


Sorry I still need more clarification…
So if I want to model the NO2 value from 5262_processed against the “NO2_ppb” column in the file “2019-03_EXT_UCD_URBAN_BACKGROUND_API_CITY_COUNCIL_REF”, which NO2 attribute from 5262_processed.csv should I take as the final result?

Thanks for your cooperation,

NO2_DELTAS_OVL_0-30-50 or _0-5-50. They should give similar okish results: fine at higher ppb values, but not so good below 15ppb.

You can also use GB_2W and GB_2A directly and include TEMP and HUM in your model. In the following days, there will be a deliverable from iSCAPE on this site that explains all that in detail: Sensor monitoring experiences and technological innovations (upcoming)
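
As a rough illustration of that second option, the sketch below fits an ordinary least-squares model of the reference NO2_ppb on GB_2W, GB_2A, TEMP and HUM. The choice of a plain linear model is an assumption made for the example, not the iSCAPE method, and all values are synthetic.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic kit readings; real data would come from the kit CSV,
# aligned on timestamp with the reference file.
kit = pd.DataFrame({
    "GB_2W": rng.normal(300, 20, n),
    "GB_2A": rng.normal(280, 20, n),
    "TEMP": rng.normal(12, 4, n),
    "HUM": rng.normal(70, 10, n),
})

# Synthetic "reference" built from a made-up linear relation plus noise.
reference = (
    0.5 * (kit["GB_2W"] - kit["GB_2A"])
    + 0.3 * kit["TEMP"]
    - 0.05 * kit["HUM"]
    + rng.normal(0, 1, n)
).to_numpy()

# Design matrix: the four predictors plus an intercept column.
X = np.column_stack([kit[["GB_2W", "GB_2A", "TEMP", "HUM"]].to_numpy(), np.ones(n)])
coef, *_ = np.linalg.lstsq(X, reference, rcond=None)

predicted = X @ coef
rmse = float(np.sqrt(np.mean((predicted - reference) ** 2)))
print(coef, rmse)
```

With real data, both files would first need to share the same 15-minute time index, and the fit should be checked on a held-out period rather than on the training rows.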



For the reference file [2019-03_EXT_UCD_URBAN_BACKGROUND_API_CITY_COUNCIL_REF] coming from the city council:
1. Where are the reference stations that measured these data?
2. What is the closest EPA station that this data was gathered from?
3. How were the sensor data validated against this reference if the reference stations are not located in UCD?

For the 5262_processed.csv data, was it the output of one kit with three sensors attached, or of three kits each with three sensors? I want to be sure that the output is well validated for in-situ sensors.
Thanks a million,

Hi Hala,

The reference stations are in Dublin’s City Council itself. We have no information on what the sensors are, other than that they are official instruments from Dublin’s City Council.

5262 is the device ID of one Living Lab station:
Measurements available here: