
1st Greek Data expedition

Anastasios Ventouris - April 30, 2014 in Data Expeditions

Data Journalism and knowledge processing is a very promising field, both in journalism and in Web science. A pioneer in the creation and processing of knowledge is the international Open Knowledge Foundation, which together with the European Journalism Centre published the Data Journalism Handbook. Open Knowledge Foundation Greece (OKF Greece), the Media Informatics Lab of the School of Journalism & Mass Communications, and the MSc programme in Web Science of the School of Mathematics of the Aristotle University of Thessaloniki have published the Greek edition of the Data Journalism Handbook (http://datajournalism.okfn.gr/handbook/).

In this context, Open Knowledge Foundation Greece, the MSc programme in Web Science and the MSc programme of the School of Journalism & Mass Communications (Journalism & New Media track) will collaborate to create hands-on Data Journalism exercises. The students of the two postgraduate programmes will work together to find data, with the goal of analysing it, visualising it and producing data-driven reports.

 

As part of this work, two meetings will take place:

  1. Thursday 3/4/2014: Data analysis and project goals (16:00-17:30), in the Computer Lab of the School of Mathematics

  2. Thursday 8/5/2014: Final presentation of the data report (16:00-18:00), at the coho co-working space.

The Data Expedition workshop activities in detail

The students of the Web Science MSc will search for open data. At the first meeting the four working groups will be formed, a first analysis of the data found so far will be carried out, and the groups will set the goals of their work, aiming at the production of a data report (which questions the investigation will answer, what additional data is needed, what statistical analysis is required, and which visualisations would be useful for the story).

The groups will need to stay in contact in order to produce the data report.

Each group will consist of four students from the Web Science MSc and two postgraduate students from the School of Journalism, Journalism and New Media track.


Tax Avoidance and Evasion Expedition: Looking back

Lisa Evans - June 27, 2013 in Data Expeditions

We recently ran a tax avoidance/evasion data expedition, held online over the course of an afternoon. Our aim was to provide a good grounding in tax topics for someone new to this complex subject, complete with links to data sources. To do this we put together a decision tree of tax avoidance and evasion topics.

The first decision participants had to make was to pick their question: would they look at illegal tax evasion or at morally questionable tax avoidance? If they picked tax avoidance, we fleshed the topic out with examples of known schemes and the measures governments have taken to close those loopholes. The tax evasion route gave very specific cases of individuals wanted by the law for unpaid tax and, in contrast, country-level measures of shadow markets. Participants could also choose a project of their own.

What we found

There were three groups in the afternoon: two chose to explore shadow markets, and one picked its own topic, looking for patterns in company structures in the OpenCorporates database that might indicate tax avoidance. The two shadow market explorations were quite different: one looked at the relationship between estimates of the size of the shadow market and gross domestic product (GDP) for a wide range of countries, the other at state-by-state measures of shadow markets in the USA.

The data visualisation produced by the group exploring the relationship between shadow market size and GDP throughout the world. There is an explanation of this data (in Spanish) here.

Three different data difficulties

Each group was asked to describe the difficulties they had with the data. The group exploring shadow markets and their relationship to GDP picked a measure of the shadow market from the Institute for the Study of Labor and took the GDP data from the OECD. The difficulty this group had was matching the two datasets: getting them both into the same format and then matching by country name.
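
The country-name matching step the group describes usually comes down to normalising the names and then joining. Here is a minimal sketch in pandas, with hypothetical file and column names standing in for the actual IZA and OECD downloads:

    import pandas as pd

    # Hypothetical file and column names standing in for the IZA shadow-market
    # estimates and the OECD GDP download used by the group.
    shadow = pd.read_csv("shadow_market_estimates.csv")   # country, shadow_pct_gdp
    gdp = pd.read_csv("oecd_gdp.csv")                     # Country Name, gdp_usd

    def normalise(name):
        """Reduce a country name to a comparable form before joining."""
        return name.strip().lower().replace(".", "")

    shadow["key"] = shadow["country"].map(normalise)
    gdp["key"] = gdp["Country Name"].map(normalise)

    merged = shadow.merge(gdp, on="key", how="inner")

    # Names that still don't line up (e.g. "South Korea" vs "Korea, Rep.")
    # need a small manual alias table rather than more string cleaning.
    unmatched = shadow.loc[~shadow["key"].isin(merged["key"]), "country"]
    print(merged.head())
    print(unmatched.tolist())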

The group who looked at shadow markets in the US were challenged by finding the most reliable measure of a shadow market that was as granular as possible. The World Bank report on shadow markets was considered first, but understanding its methodology would have taken too long for this quick challenge. The group decided to focus on two measures: currency in circulation, and the number of households without bank accounts from the FDIC. The spreadsheets found for the latter were over-formatted Excel files containing troublesome image headers, but the group tidied up the data in the course of the afternoon. The group also noted that the FDIC data was only available by state, not by census tract (a more geographically detailed measure), and concluded that working at the state level doesn't give a really clear portrait of circumstances. Another potentially useful source was BankOn's analysis, which extrapolates a percentage of unbanked tracts with a detailed methodology. The group also considered bringing in the Gini index, which is available at metropolitan-area level but was grouped up to state level.
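
Over-formatted Excel files of this kind are usually tamed by skipping the decorative rows and naming the columns yourself on the way in. A sketch, assuming a hypothetical FDIC-style workbook whose real data starts a few rows down:

    import pandas as pd

    # Hypothetical file name and layout: the workbook is assumed to carry a few
    # decorative/image rows at the top, so we skip them and name columns ourselves.
    unbanked = pd.read_excel(
        "fdic_unbanked_households.xls",
        skiprows=5,                      # assumed number of banner rows
        usecols="A:C",
        names=["state", "households_total", "households_unbanked"],
    )

    # Drop footnote rows with no state value, then compute a simple rate.
    unbanked = unbanked.dropna(subset=["state"])
    unbanked["unbanked_rate"] = (
        unbanked["households_unbanked"] / unbanked["households_total"]
    )
    print(unbanked.sort_values("unbanked_rate", ascending=False).head())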

The group who looked for patterns in company structures that might indicate tax avoidance or evasion ran into a couple of difficulties using OpenCorporates: the OpenCorporates API was rate limited (a problem for the afternoon, but one easily resolved by talking to the OpenCorporates team, who were very willing to help).
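
Rate limits like this are normally handled by spacing out requests and backing off when the API says no, rather than hammering the endpoint. A rough sketch against the OpenCorporates v0.4 company search endpoint; the retry counts, the status codes treated as "slow down" signals and the response shape are assumptions rather than documented behaviour:

    import time
    import requests

    SEARCH_URL = "https://api.opencorporates.com/v0.4/companies/search"

    def search_companies(query, max_retries=5):
        """Search OpenCorporates, backing off when we appear to be rate limited."""
        delay = 2  # seconds; arbitrary starting point
        for _ in range(max_retries):
            resp = requests.get(SEARCH_URL, params={"q": query})
            if resp.status_code == 200:
                return resp.json()["results"]["companies"]
            if resp.status_code in (403, 429):   # assumed rate-limit responses
                time.sleep(delay)
                delay *= 2
                continue
            resp.raise_for_status()
        raise RuntimeError(f"Rate limited too often for query: {query}")

    # Example: companies whose names match a pattern the team was interested in.
    for result in search_companies("holdings")[:5]:
        print(result["company"]["name"], result["company"]["jurisdiction_code"])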

They also found that the OpenCalais API they were using to identify companies in officers' names was breaking frequently. The group dug deep into the rules for setting up different types of organisation in the Isle of Man (where they considered prototyping this project) and the UK. This is a very dense area of research, and the group made a good start over the course of the afternoon.

Feedback

The groups all found they needed more time to research the specific areas of tax they chose to investigate. We therefore recommend that future tax expeditions are run with at least one tax expert in each group: ideally a forensic accountant for spotting patterns in company structures, an economics expert (perhaps from the World Bank or Bank On) for the US shadow market project, and a software expert or statistician for the worldwide shadow market study.

So we thank all the participants for a great afternoon's work, and we look forward to using this experience to design data expeditions that more closely fit your requirements.

Interested to hear when the next data expedition will launch? Join the School of Data Announce list.


Data MOOC: Results, Findings and Recommendations

Vanessa Gennarelli - June 25, 2013 in Data Expeditions, Uncategorized


From mid-April to mid-May, we collaborated with our friends at the Open Knowledge Foundation to launch the “Data Explorer Mission” using the Mechanical MOOC platform. The Mechanical MOOC was built to form more intimate small learning groups around open educational resources. This was the first time we had used it for team-based projects with synchronous meetings. Here are our findings from the experiment.

Overview

  • The “Data Explorer Mission” was designed as an introduction to working with data.
  • Learning outcomes were: data cleaning, data analysis, facilitation, visualization and storytelling.
  • 151 “Data Agents” signed up.
  • Group formation: teams were put together by time zone; we formed 13 cohorts of roughly 10 learners each (a minimal grouping sketch follows this list).
  • Communication: teams received 2 emails per week–1 with an assignment, and 1 with a script for their synchronous meeting.
  • Tools: teams were prompted to schedule their own weekly Google hangout.
  • 5 Badges were designed for learners to apply for feedback on their projects.
  • Support team (a.k.a. "Mission Control"): one subject-matter expert (Neil Ashton), one data community manager (Lucy Chambers) and one learning designer (Vanessa Gennarelli)
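
The time-zone grouping is simple to sketch: sort sign-ups by UTC offset and slice them into chunks of ten. A minimal illustration with made-up sign-up records:

    # Hypothetical sign-up records: (email, UTC offset in hours).
    signups = [
        ("agent1@example.org", -5), ("agent2@example.org", 1),
        ("agent3@example.org", 10), ("agent4@example.org", -5),
        # ...151 sign-ups in the real pilot
    ]

    GROUP_SIZE = 10

    # Sort by offset so time-zone neighbours end up together, then chunk.
    signups.sort(key=lambda s: s[1])
    groups = [signups[i:i + GROUP_SIZE] for i in range(0, len(signups), GROUP_SIZE)]

    for n, group in enumerate(groups, start=1):
        print(f"Group {n}: {[email for email, _ in group]}")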

Results

Our findings consist of 3 main datasets:

  • Logs of emails from Data Agents to each other (we’ll call this set “Intergroup Emails”)
  • Content of email conversations amongst participants
  • Qualitative post-course survey results

Intergroup Emails:

In our 13 groups, we tracked how many emails Agents sent to each other. The results were quite surprising:

[Chart: Intergroup Emails, by group]

The full dataset for this chart can be found here: http://ow.ly/m1YCp

You'll notice that most groups emailed each other around 30 times. Two groups, Group 1 and Group 10, emailed each other more than 220 times over the trajectory of the course. What made these groups different?
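
For anyone reproducing this kind of count from a mailing-list export, the aggregation is short once each message is tagged with its group. A sketch with made-up file and column names:

    import pandas as pd

    # Hypothetical export: one row per message, tagged with the group it went to.
    log = pd.read_csv("intergroup_email_log.csv")   # columns: group, sender, timestamp

    emails_per_group = log.groupby("group").size().sort_values(ascending=False)
    print(emails_per_group)   # most groups ~30; Groups 1 and 10 over 220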

Content of Email Conversations:

Since this was our first collaborative, project-based Mechanical MOOC project, we approached it as a pilot. As such, the 3 support folks behind Mission Control masqueraded in all of the groups as they evolved. To find out what set Groups 1 and 10 apart, we combed through the content of those conversations. This is what we found:

Team 1

Upon closer inspection, many of these emails discussed trying to find a time to meet. After the first 10 days, the conversation dropped off, so these results are inflated.

Team 10

Looking at the conversations from the most successful Team, we found several fascinating trends that helped Team 10 build social presence and cohere as a group.

  • Core team: 4 of the 10 original members were active, encouraging each other to keep up with the Mission.
  • Spontaneous prompts to check in: Members sent short messages to each other to keep the course alive, i.e. “Are you doing alright? Haven’t heard from you in ages” “Just making some noise.”
  • Familiarity: Agents referred to each other by name (as opposed to “Team 10”) and shared bits of contextual information about their lives, such as when they found time to do the assignments, where they were traveling, etc.
  • Building upon shared interest: Team 10 shared content related to the subject matter of the course that others might find interesting–such as other Data MOOCs, White House open data, etc.
  • Tried new tools together: Agents tried out new tools like Google Fusion Tables together, and shared their frustrations, setbacks and successes. 
  • Summaries of Hangouts: In a brilliant move, Agents sent a summary of the synchronous Hangout to the whole group, which kept the folks who couldn’t make it in the loop.

What’s notable about Team 10’s interactions is that all four of the core group members were about equally active–this is an example of true group facilitation. We’ll recommend using Team 10’s interactions as a model or a roadmap for future Mechanical MOOC projects.

Overall Team Activity

It’s also worth noting that 3 groups continued to email each other after the course officially ended. Even if they had not finished the project, they had built a community around data, and continued to share resources and review each other’s work.

This made us realize that perhaps we should experiment with timing, or let folks progress at their own pace. Another realization was that we should keep the small listservs up so that people can continue to tap their small learning community.

Survey Data:

After the Mission ended, we surveyed Agents about what they felt they learned in the Mission, which tools were most valuable, and about their level of satisfaction with the experience. In the results, we found that many respondents were looking for a more traditional, direct instruction MOOC experience. We need to make the peer learning approach clearer–that Agents were in charge of directing their own learning, that expertise would emerge from working together as a group, and not from an Instructor or a series of Teaching Assistants. This is important, because the Teams that embraced the peer learning approach fared far better in the Data Mission:

  • “Apart from learning the basics of working with Google Spreadsheets (including some cleaning, formatting and visualising) and some other tools, I got my first and very impressive experience of P2P-learning.”
  • “I would recommend the Data Explorer Mission, because it is a good starting platform, to my mind. What is also important, it’s one of the formats that fosters p2p networking for potential future cooperation, which is very important.”
  • “It’s a great learning opportunity, but you take out only as much as you give. The amount of learning depends largely on the work each individual is willing to do.”

As mentioned above, participants who had yet to be “onboarded” to peer learning expressed frustration at the lack of structure and direction in the experience:

  • “After reading more about p2p learning and its various methods, I can only say that the my experience would probably be less frustrating if I knew something about its specific in advance.”
  • “Make it clear to to ‘beginners’ that there is no right or wrong answers involved in this Mission, but any exploration of the data given is acceptable.”
  • “Before the team became interactive, it took quite a bit of effort to organise its cooperation. When people of different cultural backgrounds come together for the first time, they might feel shy and don’t know how to behave. For instance, the team had been keeping silent for more than a week and everybody, as it turned out before, felt frustrated, because there was no visible team or work at all. In fact, it was not because people weren’t doing anything. It was because they were trying to do, failed and didn’t share their negative experience. They thought they only could communicate when they had some positive results. Later we decided that in order to keep our teamwork we’ve got to stick together an share not only our achievements, but also concerns, problems or even just write a few words like ‘hi, I’m in’. That’s not all that obvious.” (Our italics).

Findings & Recommendations

  • Google Hangouts. These worked well as a tool. 12/13 groups held at least one hangout. But we should schedule these beforehand, so the path is clearer.
  • Onboarding to Peer Learning. Some scaffolding is needed here to prime learners about what to expect. The first exercise should be to examine peer learning and define it for yourself. We’ve updated our Create a Course content to reflect these findings.
  • Facilitation. We should use Team 10’s framework to support distributed facilitation. It is our hope that a stronger onboarding process to peer learning will progress in that direction.
  • A Sense of the Wider Learning Community. Lots of learners asked for a forum to go to with questions, how many people were in their group, and more of a meta sense of “what was going on.” We could solve this by visualizing group data to learners and contrasting it with the wider community in a weekly message or blog post. And in the future, we could leverage Open Knowledge Foundation’s Q&A engine for questions that the groups cannot answer themselves.
  • Timing. We broadcasted the content, instead of working with the context of each individual group, and some folks needed more time. Design a more flexible flow where learners *ask* for the next unit or module. That way they don’t feel like the course has left them behind and they have to drop out if they aren’t “keeping pace.”
  • Integrate Badges. We developed a series of Badges for this experience on our platform, but they weren’t used. We need to integrate these better and show learners the value of submitting a project for feedback.

Validity and Limitations

Data collection. We’ll admit candidly: we were learning along with the Data Agents. This was one of Peer 2 Peer University’s first attempts at using Mailgun to track engagement, and there are a few things we could do better. In the future, we will use the “Campaigns” feature to drill down into per group and per user opens / click throughs / replies to the group.  We also struggled to get an export of the engagement data on a more regular basis, which would have helped us support groups that were flagging.
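
A regular export of engagement data can be scripted against Mailgun's Events API. A sketch; the domain, the key and the exact response fields we rely on are placeholders and assumptions:

    import requests

    MAILGUN_DOMAIN = "data.p2pu.org"     # placeholder
    MAILGUN_API_KEY = "key-xxxxxxxx"     # placeholder

    def fetch_events(event_type="opened", limit=300):
        """Pull one page of events of the given type from the Mailgun Events API."""
        resp = requests.get(
            f"https://api.mailgun.net/v3/{MAILGUN_DOMAIN}/events",
            auth=("api", MAILGUN_API_KEY),
            params={"event": event_type, "limit": limit},
        )
        resp.raise_for_status()
        return resp.json().get("items", [])

    # Tally opens per recipient so flagging groups can be spotted week to week.
    opens = {}
    for item in fetch_events("opened"):
        recipient = item.get("recipient", "unknown")
        opens[recipient] = opens.get(recipient, 0) + 1
    print(sorted(opens.items(), key=lambda kv: -kv[1])[:10])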

Sample size. With a pilot of 150 folks, Teams of Data Agents were spread thin across the world. Some groups, like those in Fiji or Australia, got placed with the nearest folks available–sometimes 3-4 hours away. With a larger group, Teams will have more local folks in their Mission.

Avenues for Future Projects

From our pilot experience and lessons learned, we’ll be running another iteration of the Data Explorer Mission in August that will include a clear onboarding process for peer learning, stronger support for facilitation, and integrating the “Ask School of Data” to support Agents who have questions their Team cannot answer. Stay tuned for more details.


Data Explorer Mission from the Inside: an Agent’s Story

Vanessa Gennarelli - June 18, 2013 in Data Expeditions

This post comes to you from Anna Sakoyan, who participated as a "Data Agent" in the Data Explorer Mission, a partnership between Peer 2 Peer University and the Open Knowledge Foundation. The course ran from mid-April to mid-May, and primed Agents to analyze, clean and visualize data, tell a story with it, and facilitate their group. Here is her story. The original post can be found at her blog, Self Made University.

I can hardly believe it, but my assignment at School of Data seems to be completed. The last step was to produce some output, that is to tell the story. Now I think I should somehow summarize my experience.

Now, first off, what is a Data Expedition at School of Data? It can be very flexible in terms of organisation. Here are the links to the general description and also to the Guide for Guides, which is revealing. In this post, I'll be talking about this particular expedition. Also, a great account of it can be found on one of my team mates' blogs. So, this expedition was technically very similar in principle to the Python Mechanical MOOC: all the instructions were sent by a robot via our mailing list, and then we had to collaborate with our team mates to find solutions.


(Image CC-By-SA J Brew on Flickr)

First of all, we were given a dataset on CO2 emissions by country and CO2 emissions per capita. Our task was to look at the data and try to think about what could be done with it. As a background, we were also given the Guardian article based on this very dataset, so that we could have a look at a possible approach. Well, I can't say I was able to do the task right away. Without any experience of working with data or any tools to deal with it, I felt absolutely frustrated by the very look of a spreadsheet. And at that stage peers could hardly provide any considerable technical support, because we were all newbies.


Then we had tasks to clean and format the data in order to analyze certain angles. Here our cooperation began and became really helpful. Although nobody among us was an expert here, we were all looking for the solutions and shared our experience, even when it was little more than ‘I DON’T UNDERSTAND ANYTHING!!11!!1!’.

Our chief weapons were:

  • the members’ supportive and encouraging attitude to each other
  • our mailing list
  • Google Docs to record our progress
  • Google Spreadsheets to work with our data and share the results
  • Google Hangout for our weekly meet-ups (really helpful, to my mind)
  • Google Fusion Tables for visualisation (alongside Google Spreadsheets)

And that is it actually. I’m not mentioning more individual choices, because I’m not sure I even know about them all.

Now some credits.

Irina, you’ve been a source of wonderful links that really broadened my understanding of what’s going on. And above all, you’re extremely encouraging.

Jakes, you’ve contributed a huge amount of effort to get the things going and I think it paid off. You have also always been very supportive, generous and helpful even beyond the immediate team agenda.

Ketty, you were the first among us who was brave enough to face the spreadsheet as it is, and you proved that it is actually possible to work with. I was really inspired by this and tried to follow suit. The same was true in the case of Google Fusion Tables.

Randah, I wish you had had more time at your disposal to participate in the teamwork. And judging by your brief inputs, you would make a great team mate. You were also the person who coined the term dataphobia and in this way located the problem I resolved to overcome. I hope to get in touch with you again when you have more spare time.

Zoltan, you were also an upsettingly rare contributor, due to your heavy and unpredictable workload. But nevertheless, you managed to provide an example of a very cool approach to overcoming big problems just by mechanically splitting them into smaller and less scary pieces.

Vanessa Gennarelli and Lucy Chambers, thanks for organising this wonderful MOOC!

So, as a result, I

  • seem to have overcome my general dataphobia
  • learnt a number of basic techniques
  • got an idea of what p2p learning is (it’s a cool thing, really)
  • got to know great people and hope to keep collaborating with them in the future

Well, this is kind of more than I expected.

Next, I’m going to learn more about data processing, Python, P2P-learning and other awesome things.


Data Expedition story: Why garment retailers need to do more in Bangladesh

Anders Pedersen - June 4, 2013 in Data Expeditions, Data for CSOs

On May 25-26, almost 50 participants from several teams set out on a data expedition to map garment factories. This is a report from the team of Roy Keyes, Naomi Colvin, Sybern, Bhanupriya Rao and Daniela Mattern. The team used a crowdsourced database of garment factories to expose questionable standards and highlight the need for open supplier lists from all retailers. The article concludes that major retailers like Wal-Mart maintain high levels of opacity around their supply chains and audit standards, which is detrimental to improving working standards in the garment industry.

Not the first time!
When Rana Plaza collapsed, killing 1,127 people and injuring over 2,500 of its 5,000-strong workforce, it shocked the world and shone an instant light on working conditions in the garment factories of Bangladesh. While it may have been the worst disaster of our times, it is by no means the first in Bangladesh, where fires due to faulty electrics and short circuits, and building collapses due to structural and maintenance issues, are commonplace. Just 8 days later, another fire broke out in one of the Tung Hai group's factories, killing 8 people. The fire in the Tazreen garment factory in November 2012, which killed 100 people, should have acted as a wake-up call to take health and safety issues seriously. But all it did was lull the government, retailers and the Bangladesh Garment Manufacturers and Exporters Association (BGMEA) into deeper slumber after the fire was dubbed arson.

Holier-than-thou?
The Rana Plaza tragedy seemed like a rude awakening, one that shone a spotlight on the appalling conditions in sweatshops that Human Rights Watch and others have warned about for many years. There was an instant rush by Western retailers, who source a major chunk of their ready-made garments from Bangladesh, to appear to be doing the right thing: to be holier-than-thou. Wal-Mart was quick to release a list of 250 factories that it blacklisted from its supplier list in what appears to be a PR exercise, without any transparency around its audit findings or the exact reasons for the blacklist, except for a vague statement that the 'violations could relate to safety issues, social issues, unauthorized subcontracting or other requirements established by our set of Standards for Suppliers'. Suffice it to say that H&M still sources from eleven of the factories and Varner-Gruppen from two. In the absence of transparent data on audit methods and findings, simply blacklisting companies is not very helpful. Wal-Mart's blacklist includes large textile groups such as Akh Fashions, Hop Lun and Mohammadi Group that own several factories and supply several big Western retailers. MJ Group – whose subsidiary, Columbia Garments, is on the Wal-Mart list – lists Replay, New Yorker, C&A, Esprit, GAP, Old Navy and Macy's alongside H&M as customers on its website.

Sustainability and Ethical codes
The essential point being missed in the rush to appear holier-than-thou is compliance with ethical standards initiatives that rely largely on a multi-stakeholder model. Worldwide Responsible Accredited Production (WRAP) is one such accreditation initiative, and it has released a list of 194 factories in Bangladesh that meet its standards. That these certified factories constitute a mere 3% of all factories in Bangladesh gives an insight into how far the industry has to go as far as certification is concerned. Interestingly, 22 of the Wal-Mart blacklisted factories feature on this list. While Wal-Mart was quick to disclose a blacklist in a bid to appear responsible, it would do well to disclose all its suppliers in the interests of transparency and responsible sourcing.

H&M has been much more transparent here, not just disclosing a list of its worldwide suppliers, but also spelling out its stringent audit policy. Only one H&M factory was both WRAP certified and on the Wal-Mart blacklist. And the story is a bit more encouraging because 15% of H&M's suppliers in Bangladesh are WRAP accredited. Brands like Puma (10%) and Varner-Gruppen (15%) show some good signs of sourcing from accredited suppliers, as opposed to Timberland and Nike, none of whose suppliers are WRAP accredited. While by no means adequate, it does show that some retailers are better at sourcing ethically than others.

Table: Which retailers use WRAP Certified factories?

Retailer          Factories in Bangladesh    WRAP Certified    % WRAP Certified
H&M               164                        24                15
Levi's            13                         1                 8
Nike              6                          0                 0
Puma              10                         1                 10
Timberland        5                          0                 0
Varner-Gruppen    46                         7                 15

Source: Crowdsourced garment factory list

The blacklist from Wal-Mart is pretty rich considering that, along with Gap, it has refused to sign the Accord on Fire and Building Safety in Bangladesh, preferring instead to rely on its own codes and audits. H&M was the first retailer, followed by 31 others, to sign the agreement, which includes provisions for independent safety inspections, mandatory repairs and renovations with a commitment to pay for them, and a role for workers and their unions in making garment factories in Bangladesh safe. The accord is a watershed moment because it is a multilateral initiative driven by retailers and the global unions IndustriALL and UNI, in alliance with the Clean Clothes Campaign and the Worker Rights Consortium.

It certainly could be the last!
In the aftermath of the Wal-Mart blacklist, other retailers like H&M have rushed to rethink their sourcing policy and look at new supply chains in Africa and Latin America. While any rethink is welcome, it needs to be in the area of more responsible auditing and greater transparency in supply chains, not just of primary suppliers but of secondary ones, where there is astounding opacity. What would be a great step forward for Western retailers like H&M is to make public their factory-by-factory audit findings for greater accountability. Simply moving supply chains and tolerating the same conditions will not see the end of tragedies such as Rana Plaza. There needs to be timely and better audit data, supplier data down to the last link in the supply chain, and greater commitment to multi-stakeholder processes such as the fire safety accord. This could be the beginning of a long-term political engagement on worker safety and better wages and working conditions. It would also mean that Rana Plaza could be the last in the list of terrible tragedies.


Data Expedition: Tax Avoidance and Evasion – 6th June

Lisa Evans - May 24, 2013 in Data Expeditions

Tax expedition

Want to dig deep into tax avoidance and evasion? We have gathered a wide range of data on this sensitive topic, and for one afternoon we'll guide you through some of the key decisions to think about when writing a story on the topic. With tax evasion and tax avoidance currently such a hot topic in the media, it's crucial that people can understand the difference between the two terms as well as the mechanisms by which they happen.

When: Thursday June 6th – 12:00 BST to 17:00 BST – link to your timezone

We’ll be looking for projects such as:

  • Exploring the tax avoidance schemes used by Apple, Google, Amazon, or Starbucks

  • Looking at data gathered by tax collection authorities and patterns of avoidance that emerge from that dataset

  • Creating a "most wanted" list of tax evaders for future research

  • Your project here!

Sign up here for the Data Expedition!

Please note that limited space is available. For more information about the Data Expedition format, we encourage you to read this article.

How can I participate?

To get involved either:

  • Lead a team! (Up to 6 hours) Are you able to help coordinate a team on the day? This involves helping your team to understand the options and the research that has been conducted, and starting a discussion about the choice of story and how to construct a plan for making the story happen. The School of Data team will hold a specific hangout for team leads on Monday 3rd June at 12:00 BST to prepare for Thursday's activities. Please email schoolofdata [at] okfn.org if you are interested in getting involved.

  • Offer an expert introduction! (Up to one hour) We're looking for experts who understand the loopholes or tactics used by companies in different countries to offer quick introductions of 5-30 minutes to get the expedition started.

  • Join us as a participant on the day! (3-6 hours) You will need to be prepared to brainstorm ideas with others in your group and ultimately explain your choice of story. There will be two roles you can take on the day – either getting stuck into the data (analyst) or writing (storyteller).

Aims of the expedition

We will aim to give people:

  • A clear understanding of the difference between tax evasion and tax avoidance
  • A key understanding of a few of the schemes via which people engage in them
  • Perhaps also a few story ideas!

How to get involved

Please make sure you are registered here and that you select “Tax Avoidance/Evasion” in the “I’m Interested in…” section. Please note: you will need to be available for at least 3 hours during the expedition period and spaces will be limited, so preference will be given to those who can definitely commit to the expedition. Spaces will be confirmed shortly before the expedition.

Stay up to date with the latest data expeditions

Want to be informed any time there is a new data expedition? Join the School of Data announcement list to get notifications of the expeditions as soon as they are announced!


Data Expedition: Mapping the garment factories

Anders Pedersen - May 18, 2013 in Data Expeditions

Women sewing at long tables next to tall windows in a garment factory.

The horrific factory collapse at Rana Plaza in Dhaka has brought the business practices of global garment brands, as well as those of their thousands of suppliers, into the spotlight.

At School of Data we noted that corrupt and missing data were part of the story. Data on building permits in Bangladesh is largely unavailable due to lack of state inspections. However, after years of pressure on global apparel brands from labor activists, the publishing of garment factory supplier lists is becoming increasingly standardized. We’re asking you to join us in mapping the data on garment factories.

Data Expedition: Mapping the garment factories 

When: Saturday May 25 – 12:00 BST to May 26 18:00 BST – link to your timezone

We’ll be looking for projects such as:

  • Mapping garment factories locally and globally

  • Exploring the global supply chain of garment export and imports

  • Mapping the ownership of local factories and global brands with open company data

  • Finding stories and patterns in the connections between global brands and local garment factories

Sign up here for the Data Expedition!

Please note that limited space is available. For more information about the Data Expedition format, we encourage you to read this article.

Before the Data Expedition – Help us build an open garment factory supply list

Before heading out on this important expedition, we'll need to gather as much data as possible on garment factories. Labor activists and campaigners typically articulate the data in terms of "supplier lists." Some brands, such as Nike, provide a list of all factories in their supplier network via Excel and JSON downloads, while others, such as Levi-Strauss, only offer lists in PDF format. In order to prepare a solid dataset for the Data Expedition, we're asking you to help locate, clean, and merge the supplier lists from across garment brands into one comprehensive Open Garment Factory List.
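
If you'd like a head start on the merging step, the Excel and JSON cases can be flattened into one shared table in a few lines of pandas. A sketch with hypothetical file and column names, since every brand publishes under its own layout:

    import pandas as pd

    # Hypothetical inputs standing in for real brand downloads; each brand uses
    # its own column names, so every loader maps onto one shared schema.
    COLUMNS = ["brand", "factory_name", "country", "address"]

    def load_excel_brand(brand, path, name_col, country_col, address_col):
        df = pd.read_excel(path)
        return pd.DataFrame({
            "brand": brand,
            "factory_name": df[name_col],
            "country": df[country_col],
            "address": df[address_col],
        })[COLUMNS]

    def load_json_brand(brand, path):
        df = pd.read_json(path)
        df["brand"] = brand
        return df.rename(columns=str.lower)[COLUMNS]

    combined = pd.concat(
        [
            load_excel_brand("Nike", "nike_suppliers.xlsx",
                             "Factory Name", "Country", "Address"),
            load_json_brand("ExampleBrand", "example_suppliers.json"),
        ],
        ignore_index=True,
    )
    combined.to_csv("open_garment_factory_list.csv", index=False)

PDF-only lists, like the Levi-Strauss one, still need a table-extraction pass (by hand or with a PDF table-extraction tool) before they can join the combined list.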

Begin today by adding to the Open Garment Factory List and join us for a Google Hangout on Thursday, 23 May at 19:00 CET, where we'll be engaging in joint data collection.


At the Cockpit: How the Data Explorer Mission Works

Vanessa Gennarelli - April 30, 2013 in Data Blog, Data Expeditions

We provide multiple pathways to learning here at P2PU–if visual is your thing, here’s the walkthrough of Data Explorer Missions on our Community Call (start around minute 19:00):

Last year Peer 2 Peer University and the Open Knowledge Foundation launched an initiative to meet the global demand for data-wrangling skills–enter the School of Data. Over the course of the past few months, Lucy Chambers, Neil Ashton and I designed a pilot "Data Explorer Mission" that we just launched on April 15th. We're in the third week of that project now, and here's a window into how it works.

Data Explorer Mission

Fast Facts

  • Four-week long course, running from April 15 to mid-May

  • 130 signups for our initial pilot

  • Our Mechanical MOOC email grouping mechanism formed 13 groups by time zone

  • The course features 5 Badges on our new platform (http://badges.p2pu.org) and it’s our first time implementing Badges for a Mechanical MOOC project

Learning Design

  • The Mechanical MOOC put together 13 groups of 10 learners (or teams of "Data Agents") based on time zone.

  • Each week Data Agents receive 2 emails from “Mission Control”–one email with a project and resources on Tuesday, and one email with directions for their Google Hangout on Friday.

  • The learning project asks Agents to examine a CO2 dataset, ask a question, and then clean, refine, visualize and tell a story about their exploration.

  • We designed Badges that directly correspond to those learning goals.

  • During the weekly hangout, Agents share their work, help each other, and reflect on their projects. Data Agents take notes on an Etherpad.
  • Facilitation duties change from week-to-week, with folks opting-in to facilitate.

Who is “Mission Control”?

  • Mission Control is our persona for the School of Data Mechanical MOOC–think a mix of 007/Bond’s “M” and “Charlie” from “Charlie’s Angels.”
  • We’ve been giving a lot of thought to the affective dimension of learning, or how positive feelings in learning situations increase a sense of curiosity or play. Mission Control comes out of recent research on affective learning and engagement through Universal Design for Learning.
  • Behind the curtain it's me, Vanessa, Lucy Chambers from the Open Knowledge Foundation, and our rockstar data wrangler Neil Ashton.

Preliminary Results

  • We’ve been using Mailgun to track opens, clicks and replies to the emails we send from missioncontrol@data.p2pu.org

[Chart: Email Engagement for Past 7 Days]

  • We’ve sent 4 emails so far, so we’re about halfway into the course. 
  • 131 participants have sent approximately 50 emails to their small groups per day since the start of the course, or 675 emails total.
  • Almost every group has had at least one synchronous Google Hangout.

Lessons Learned (Already!)

  • Find a clearer way to represent that Data Agents are already in a small group by the time they are contacted. Learners seem unclear about how their small group functions. We need to a.) visualize to the teams who is in their group and b.) give them a sense of “people in the room.”
  • We should consider moving Data Agents whose teams don’t take off–maybe these folks form their own team?
  • We haven’t mastered Mailgun analytics yet, so Dirk and Vanessa need to thrash around with it a bit longer before we are truly confident in the reliability of the data.

Next Steps

  • We’re designing a post-course survey for our pilot teams of Data Agents.
  • In another 2 weeks we’ll present summative data, including: number of messages per group, number of click throughs, number of Badges applied for, and number of reviews per application.
  • We’re experimenting with the timeline for the course–our next iteration will be only two weeks long–watch out!


3 Days Left to Sign Up for Data Explorer Missions!

Vanessa Gennarelli - April 12, 2013 in Data Expeditions

As a Data Agent, your first Mission, should you choose to accept it, would begin Monday, April 15. That gives you only 3 more days to sign up for this innovative partnership between the Open Knowledge Foundation and Peer to Peer University. Read on for more details.

(Image CC-By-SA J Brew on Flickr)

At the School of Data, we teach in two ways.

1) By producing materials to help people tackle working with data and
2) By running Data Expeditions – where learners tackle a problem, answer a question or work on a project together, learning from one another as they get hands on with real data.

It's come to our attention that sometimes it's handy to combine the two – handing people materials to tackle the challenges they are likely to encounter along the way. The Data Explorer Mission is like a data expedition with one crucial difference: your guide is a robot…

Read on to learn more…

Your Mission: Tell Stories with Carbon Data

Learn how to tinker with, refine and tell a story with data in this 4-week course. Each week you’ll be commissioned to work with others on a project that will hone your data-wrangling skills. Lessons will be pulled from Open Knowledge Foundation and Tactical Tech with help from Peer 2 Peer University. At the end of the course, you will have finessed, wrangled, cleaned and visualized a data set and shared it with the world.

What to Expect

The course will run April 15 to May 3, and each week your team will receive weekly “Missions” from Mission Control over email. You’ll work together on those projects, including a 30-minute Google Hangout each week. Each “Mission” will lead up to your final project. For each skill you master in the course, you can earn a Badge to show your mastery and to get feedback to further your talents.

The Topic

Carbon Emissions. Don't worry if you don't know anything about them at the moment: you don't need to be a topic expert, and the data skills you will learn will be very transferable to other areas!

The Level

No prior experience is required; we'll cover spreadsheets and working with data. If you're more advanced, you are also welcome to join us to hone your skills, and the only limit on what you can learn is your imagination – so if you're prepared to push yourself on the project front, the data-skills bucket is your oyster!

About Mission Control

Normally, Data Expeditions are guided by a human sherpa; in this course, we're weaving School of Data course material together with a robot sherpa to help guide participants through the phases of the expedition. You'll need to listen out for Mission Control's instructions to guide you through the phases, keep timing and look out for handy tips, but organising your team is up to your group…

Sign up by completing the form below!


A Data Expedition in Cooperation with Save the Children

Lucy Chambers - April 11, 2013 in Data Expeditions

This post is written by Ralf Becker from the University of Manchester, about an impending data expedition – the first to be independently run!

The topic? Child poverty levels & Parental employment in the UK

The government recently consulted on child poverty measurement. Extensive data exists which allows us to identify levels of child poverty nationally, regionally and locally. Through the Households Below Average Income dataset (derived from the Family Resources Survey) we are able to look at levels of child poverty at a national and regional level, and also at the make-up of those in poverty. Data such as the Index of Multiple Deprivation allows us to look at the prevalence of a range of indicators related to poverty at a localised level.

There is a strong correlation between parental employment and child poverty. Data exists on levels of unemployment locally and on the number of children in workless households (held by ONS/DWP). But to what extent is there a full understanding of levels of parental worklessness and low pay, and of trends over time?

Can our data explorers find new, inventive, illuminating or even crazy angles to these issues?

When & Where

A kick-off meeting will take place on Friday 12 April 2013, 5-9pm, at MADLAB Manchester. Graham Whitham, Policy Advisor at Save the Children, will give an introduction to the issues at hand. This meeting is mainly meant to generate some initial ideas. The material presented and a summary of the ideas will be made available (thedatasquad.wordpress.com) so that a subsequent online data expedition can use, build on and extend this material.

The online version

Congratulations to Ralf for making this happen. We’re hoping to launch an online version of this session, based on the learnings from the offline one. If you’d like to stay informed about when the next expedition is coming up, join us on the School of Data announce list, where notifications about the upcoming expedition will be posted.

