Community Portal

Discuss and decide + write rules for importing of Eurostat datasets

Created on Thursday 11 February 2021, 09:48

  • Status
    In Progress
  • Type
    General Research Work
  • Assigned to
    Carolin Bellstedt
  • Subscribers
    Aristide Athanassiadis
    Carolin Bellstedt
    Paul Hoekman



We have our great Eurostat manager and have invested a lot of time going through most of the 9,949 Eurostat datasets (472 are still pending) to decide whether or not they are relevant from a UM/CL perspective.

We are now in a situation where the CL cities that find those datasets will each add the same ones on their own. They could all add "Gross domestic product (GDP) at current market prices by NUTS 3 regions", especially because there is no name check when adding a dataset (which normally makes sense). Do we want to have 7 instances of the same dataset? Probably not, right? So we should decide sooner rather than later whether we import the Eurostat datasets ourselves or instead wait and see what the cities actually do.
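To make the duplicate concern concrete, here is a minimal sketch of what a name check could look like. It is only an illustration: the function names and the normalisation rule (lowercase, collapsed whitespace) are assumptions, not how the platform currently works.

```python
def normalize_title(title: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(title.lower().split())


def find_duplicates(titles):
    """Return normalized titles that occur more than once, with their counts."""
    counts = {}
    for title in titles:
        key = normalize_title(title)
        counts[key] = counts.get(key, 0) + 1
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical submissions: three cities adding the same Eurostat dataset.
gdp = "Gross domestic product (GDP) at current market prices by NUTS 3 regions"
submitted = [gdp, gdp.upper(), "  " + gdp + " ", "Example dataset by NUTS 2 regions"]
duplicates = find_duplicates(submitted)
```

A check like this would only catch near-identical names; the same dataset added under a reworded title would still slip through.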

Of course, there is no need to import all 9,949 Eurostat records, but we could discuss whether it makes sense to import, if not the 800 accepted records, then perhaps the 155 accepted NUTS records. The 155 are made up of:

  • 40 accepted NUTS 3 (incl. NUTS3)
  • 114 accepted NUTS 2 (plus 1 NUTS2)

We will again run into the issue of having both MoC and CL tags; there is no way around it. For now we could use CL tags and have the cities tag the datasets. Or we add the CL tags in addition, and the cities as spaces, which is possible in the backend; I've just tested it.

In any case, first we should decide: to import or not to import? Aris, Paul, what do you think?

General instructions

Compiling lists, searching for information or compiling new information to further the goals of Metabolism of Cities

Discussion and updates

New task was created

My feeling: yes absolutely let's import them. Lots of advantages, no big drawback. I'd say let's do all 800 records. At least just having them in there is very useful. And we should look into some semi-automated scripts to label certain cities/reference spaces that are relevant. If you can put the rules in writing then I am happy to write a script for that.
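As an illustration of what a semi-automated labelling script driven by written rules could look like, here is a minimal sketch. The rules, tag names and function name are hypothetical; the real rules would come from the written document.

```python
import re

# Hypothetical rules: each pairs a pattern on the dataset title with a tag.
RULES = [
    (re.compile(r"\bNUTS\s?2\b", re.IGNORECASE), "NUTS2"),
    (re.compile(r"\bNUTS\s?3\b", re.IGNORECASE), "NUTS3"),
]


def suggest_tags(title):
    """Return the tags whose pattern matches the title, in rule order."""
    return [tag for pattern, tag in RULES if pattern.search(title)]


title = "Gross domestic product (GDP) at current market prices by NUTS 3 regions"
tags = suggest_tags(title)
```

Keeping the rules in a plain list like this would mean new rules can be added without touching the matching logic, and the suggestions can still be reviewed by hand before anything is saved.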

Task was assigned to Carolin Bellstedt

Ok, cool!! Thanks! I will put the rules in writing and let you know when I have those done.

Status change: Open → In Progress

I've written up a document on Eurostat datasets, manager, grid and import to Data Hub, because we had a lot of info in different places. This document contains a section that lists points that we should address before the import, namely

  1. What should we do with the 472 pending ones? I suppose they could be imported later with the same script?
  2. What do we do with the info in the notes field?
  3. Check with Paul if assigning the correct ref spaces is part of the import? (points 9-12 in the “rules”)

The document also has some rules. Paul, could you please have a look and let me know if this is what you had in mind? Thanks!

Great Carolin, thanks. Quick feedback:

  1. Yes indeed
  2. Add it to the system under UPLOADER COMMENTS, which will then show on the page of the item in the library
  3. Hmmm I hadn't planned on that.............

Rules look good, had a quick look. Is there any deadline for this?

Thanks for your comments Paul.
1. Cool.
2. Nice.
3. Ok. Let's not do it then. Some of the ref spaces aren't in the system yet anyway, and we haven't figured out yet how to name ref spaces where several spatial units have the same name but different codes (this whole City (C), Greater City (K), Functional Urban Area (L) business), see e.g. Dublin.

It would be great to have this by the morning of Friday next week (Feb 26), so Aris can make a video on it. Doable?

Copy that Carolin. But it starts to feel a bit too much like work work with all the upcoming deadlines. Can I instead try to have this done in two weeks' time? Can you work with that? Safest would be to say March 7 so I also have the weekend if need be.

Yeah, I understand that. I've been trying to keep it easy, but it is a push on several fronts right now, which makes it tricky. I don't want it to feel like work work for you.
Yes, we will work with that then. Thanks!

Hey Carolin, just to let you know I have this on my calendar for Sunday. If possible, would be great if you can check in every now and then in case there are any issues/details to discuss with the importation project. In any case more about this soon.

Hi Paul,
Cool, thanks. Ok, noted about the check in.

Hey Carolin, I've started preparing for the import. I have left a few comments in the Google Doc. Can you respond to those? Thanks!

Hey Paul,
great!! I'm just responding right now. Thanks.

Thanks for the feedback Carolin. The import is now complete. You can see the results here. I've attached a screenshot. This is based on the layout we already use for other library lists. As you can see, 923 items were imported. They have all been tagged with the "Eurostat database" tag, which makes it easy to select them from other places too if we need to. I've created the SDG INDICATOR tag and linked the relevant items to it, and did the same for the CIRCULAR ECONOMY ones. The different countries are tagged where relevant, and the NUTS2/NUTS3 tags are also linked. I'm also adding a few pages/details now to improve management of these items. The first thing I added was a "reverse" list of the publications that a certain reference space is tagged under. In other words, if you go to e.g. Zuid Holland you will see all the relevant publications at the bottom. This may make it easier for cities to navigate.
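For anyone curious how such a "reverse" list can be derived, here is a minimal sketch under assumed data: it inverts a mapping of item title to tagged reference spaces. The sample titles and the function name are hypothetical and not taken from the actual system.

```python
from collections import defaultdict


def build_reverse_index(item_spaces):
    """Map each reference space to the sorted titles of items tagged with it."""
    index = defaultdict(list)
    for title, spaces in item_spaces.items():
        for space in spaces:
            index[space].append(title)
    return {space: sorted(titles) for space, titles in index.items()}


# Hypothetical sample: item title -> reference spaces it is tagged with.
item_spaces = {
    "Example dataset A": ["Zuid Holland", "Dublin"],
    "Example dataset B": ["Zuid Holland"],
}
reverse_index = build_reverse_index(item_spaces)
```

Looking up `reverse_index["Zuid Holland"]` then gives the list that would appear at the bottom of that reference space's page.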

Okay, a few additional things added:

  • If you go to a particular reference space within NUTS2/3, underneath the table with associated documents (see screenshot above) there is now a link that takes the user straight to the library. Here there is more info, which doesn't really fit on the previous page, including the tags, making searching through it much easier. Example here (go to the bottom and follow the link)
  • Within the library (be it our MOC LIBRARY or a sub-library shown on a sub-site), there is now a RESTRICT BY TYPE box. By default we normally only search in the academic group of documents, but if you want to see e.g. datasets you need to search more broadly. This makes that possible. See attached screenshot. The system also shows if you have a particular reference space set within the filter.
  • If you view the page of a document that has 500+ associated reference spaces (e.g. NUTS3) then normally the table would only show the first 500 (to prevent browser and server from choking on some crazy docs that have tens of thousands of associated hits). However, in order to see all items (which in the case of NUTS3 you may want to do), there is now a link at the top of that table to see them all.
  • In a library item where you have tons of associated spaces (e.g. ALL NUTS codes), it became pretty crazy to see the list of associated spaces. I have now limited this to 10, with an option to expand and see all.
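The two display limits described above (first 500 rows in the table, first 10 associated spaces, each with an option to see everything) follow the same pattern. A minimal sketch, assuming a hypothetical helper name and made-up sample data:

```python
def visible_rows(rows, limit, show_all=False):
    """Return (rows to display, number of hidden rows) for a capped list."""
    if show_all or len(rows) <= limit:
        return rows, 0
    return rows[:limit], len(rows) - limit


# Hypothetical document with 1200 associated reference spaces.
spaces = [f"space-{i}" for i in range(1200)]

table_rows, table_hidden = visible_rows(spaces, limit=500)
summary_rows, summary_hidden = visible_rows(spaces, limit=10)
all_rows, all_hidden = visible_rows(spaces, limit=500, show_all=True)
```

The hidden count can be used to decide whether to render the "see all" link at all.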

Okay, I think all this should make it doable to manage these new datasets and go back and forth a bit, with decent enough searching options for your users and for the video. If anything is missing give me a shout, but asap please as my workweek starts tomorrow.


Forgot to say, in the metadata I have some of the old details recorded. This also includes the ID of the original record in our other table, so if we must, we can always run another script to update things etc.
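As a rough sketch of why keeping the original record ID is useful: a later script could match imported items back to the source table and sync changes. The field names (`meta`, `original_id`, `title`) and the sample records are assumptions for illustration only.

```python
def apply_updates(imported_items, source_records):
    """Copy changed titles from source records onto imported items.

    Items are matched through the original record ID stored in their metadata.
    Returns the original IDs of the items that were changed.
    """
    by_id = {record["id"]: record for record in source_records}
    changed = []
    for item in imported_items:
        source = by_id.get(item["meta"]["original_id"])
        if source and source["title"] != item["title"]:
            item["title"] = source["title"]
            changed.append(item["meta"]["original_id"])
    return changed


# Hypothetical data: one title was corrected in the source table after import.
imported = [
    {"title": "Exmaple dataset A", "meta": {"original_id": 1}},
    {"title": "Example dataset B", "meta": {"original_id": 2}},
]
source = [
    {"id": 1, "title": "Example dataset A"},
    {"id": 2, "title": "Example dataset B"},
]
changed_ids = apply_updates(imported, source)
```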

Hey Paul,
Just to leave a short note here: It has not gone unnoticed that you did this (!!) and I want to thank you very much for it :) So far, I just haven't had time to review it properly or else I'd have let you know. It is on my list though. In the meantime: thanks!

👍 👍