52. Experiments with d3.js
Open Data Day 2013 happened on February 23rd. In Toronto. London. Buenos Aires. And oh, about 67 other cities around the world. I first heard about the initiative from Mary Beth (@bethmaru), one of the founders of Open Data Day, when she came to visit Toronto in February. So I was very happy when I was asked to help out that day, thanks to Arndis (@arndis) from Urban Digital (@UrbanDigitalTO). In Toronto it was a day full of workshops, talks, and hacking.
I had a lot of fun! Trying to make sense of my notes, however, was not. So I’ve also provided performance-enhancing links. Parse, process, and analyse at will. Here it all is.
# OPEN DATA DAY TORONTO 2013
# SHERAZ KHAN
Open Data Day 2013 presentation
Online interactive maps
- Crowdsourced democracity
- The purpose: service, analysis, collection, engagement, informing, art
The DemocraCity Project
One Million Tweet Map
Save the Rain
Toronto Sound Map
Rebuild Your Community
# BETH WILSON CITY BUDGET MAPPING ENTHUSIAST
- Social Planning Toronto choropleth map of the TDSB
- Low income has cut services
- Least marginalized schools were affected (using the TDSB index, looking at earning opportunities with ranking schools)
- Services are already disproportionately located in city
- City of Toronto data consortium - purchasing that data
- Can look at CT vs wards vs political ridings boundaries
- KPMG did a study that reproduced the map
- Income mapping - public information sensitivity
### The problem:
- Data is never natural - there is always an angle
- A stumbling block is for the government at a political level to embrace open data without information presented within a certain context
_the government analyses the data for you_
- A fulsome set of data would make it easy to drill down to sources
- Spending council - voting transparency
- Details of motion - clause context matters for constituents
- Approach neutrality by providing as much info as possible
# TABS Toronto
# LIGHTNING TALKS
Jason WHITE - TRANSIT ACCESSIBILITY
Ellie MARSHALL - MYCITYHALL.CA OPEN NORTH
Chris PENROSE - YOUTH ASSET MAPPING
Devon MEUNIER - AJAH TOOLS FOR FUNDRAISERS
Morgan PEERS - BIG PICTURE
Diederik VAN LIERE - a love story of big and small data
Julie BOGDANOWICZ - facemap toronto
Helen KULA - mars data catalyst
Sheyda SANEINEJAD - TRACKING TORONTO TICKER
Anil PATEL- THE SHARING IMPERATIVE
Raed SHARIF - GOVT DATA IN SOUTH AFRICA
Kevin BRANIGAN - MY TTC++
Sean CRAWFORD and Tammer KAMELZ - QUANDL
Dawn BUIE - Neighbourhood Planning
Wendy SMITH - PARKLOT
# JASON WHITE - TRANSIT WALKABILITY ANALYSIS
Transit Reliability Access to Transit A Cross City Comparison
# Ellie Marshall
# Chris PENROSE - YOUTH ASSET MAPPING
# Devon MEUNIER - AJAH TOOLS FOR FUNDRAISERS
# Morgan PEERS - BIG PICTURE
# Diederik VAN LIERE - a love story of big and small data
# JULIE BOGDANOWICZ - FACEMAP TORONTO
- Presenting data in clearest way possible - moving from complexity to overview
- Look at Otto Neurath
_the symbols create an interesting visual language_
- Maps from 1920s, abstraction of reality
- Beyond statistics in mapping in general it is an arbitrary exercise
### David Hulchanski: neighbourhood change
- U of T social work professor; "The Three Cities Within Toronto"
- map of blue red white - shows the middle class is disappearing
represents immediate data
- with stats and mapping there’s a disconnect from the way laypeople look at and navigate information
- to some there is an intimidation factor in looking and reading maps and statistics
- 3 cities map using photos of people not reading well
- white lane (middle) dominant
- try to represent 3 cities with photos of people
- photo on subway white faces middle city 1
- not white faces city 3 at edges
- Hulchanski’s map predicts that 2025 will appear more uniform, as opposed to the diversity of the 1970s mixed map
- A low tech design approach to mapping data
- Representing 3 cities stats more immediately with photography
- Going to subway stations and taking photographs of people
- If the project matures it will go into more accurate detail
- Map represented by percentages of population. It straddles an ethical line: there are issues of racial profiling
_This is a cool idea, hope they can use real faces_
### Transportation Projects:
- Trying to consider the interoperation of data
- The innovation lab: when you look at things on the periphery at a glance. Instead it brings things to your attention, not only at your choice. hear streetcars shaking use speakers and see
- Open paths
# HELEN KULA - MARS
- The unlock data initiative. Health data is challenging to work with and to access because there are a number of privacy issues. The goal is to generate insight and inform decision making around innovation.
- Tracking the money flow and innovation, especially around startups, accelerators, and incubators. Currently we have little knowledge of the startup landscape in the province; the data landscape around this is patchy at best.
- They are doing data development work, negotiating partnerships with 17 regional innovation centres
- Trying to pool data linkages and integrate with other sources of data (CrunchBase, AngelList, commercial sources) to generate a more robust view of startup activity in the province
- Special interest in energy and health
- Open data outcomes with energy process consumption
- Innovation data - the million dollar question is what does this look like? How do you capture it?
- It can mean different things. Here they focus on startups and the ecosystem that sits around them
- organizations support like mars, funders focus oct programs venture capitalists, angel investors, policymakers
# SHEYDA SANEINEJAD - TRACKING TORONTO TICKER
The Innovation Lab
- The ticker for stocks, sports
- Here it would use open data to track more important things that are happening in Toronto: trends in how Toronto is doing as a city economically, socially, environmentally
- What measure or policy is in place to determine what is good or bad? Who decides?
- Numbers are not in the periphery, the city will think about why some things increase or decrease
- A physical ticker display gateway to toronto - union station airport
- People standing in line will find out how we are doing as a city (building stats, sewage, pollution, etc)
- An online ticker can be customizable by audience/website: indicators can be turned off and on
_Should you let people turn things off or customize what they want to see? Information silos? Or maybe select a portion of the ticker to customize_
- Currently the data feed is not automatic: open data is updated by manually entering it into a background sheet. A prototype idea is to automate the process
- The innovation lab is a volunteer group no affiliation to the City of Toronto
# ANIL PATEL - timeraiser
Planning Presentations Repository IT Linkedin Box
Planning & IT Timeraiser
- Principles: agile open web movement, takes non-profit work to a whole new level of transparency and efficiency
- Good governance leads to better decisions
- Responds quickly to change
- Shows how non profits interact work like open web
- Tracks all costs from batteries to beer
- The burden of paperwork and what does it mean
- Cost revenue ratio
- Web assets, capacity ability testing tools transactions
- Document workflow
- Tools to use
- See a lot of good governance tools speed dating prep
- Timeraiser’s mission is to creatively connect people with causes they care about. In reality volunteers and people involved in donations expend many hours and resources and are burnt out
- A new non-capitalist society
_Prospects through events_
# RAED SHARIF - Govt data in S. Africa
- Government data can save lives
- But there is no accountability or transparency for planning projects
- It is very difficult to find data in S. Africa, even harder to access it. Likely you would find a brief summary of the db or a db wouldn’t even exist
- of the requests for data, 60% were not answered, and of those requests that were answered, many (?%) were rejected
- another issue is data in pdf files
- how to manic put in map see enter manual
- After going through all that, trust/reliability of data
- Context of global south issues tribalism
- NGO CiviC Centre Global South (Egypt, Lebanon, Kenya)
- Crowdsourcing applications like Ushahidi can’t work because it is not easy to get to a scalable level of infrastructure. Issues like road banning are really becoming more of a trend
# Kevin Branigan & Kieran - My TTC ++
- Was using open trip planner, decided to make their own
- Have a ridiculous amount of data: 10 000 bus stops, 3 million times from bus schedules
- Had to interpolate stop times between stops
- Service summaries vehicles allocated each window time frame headway minutes etc
- Debug data see vehicles where move quick interpolate airport or subway
- Simulation compression artefacts
- Data: went to every subway station and counted all the stairs - 10 000 stairs, entrances, elevators
- Accolades from Brad Ross (on Twitter he said ‘cool’)
- Organizational agility could be dangerous. Possible repercussions: deception by trip planner
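The interpolation step noted above (estimating times at stops that sit between scheduled timepoints) can be sketched roughly as follows. This is only an illustration, not the My TTC++ team's method: it assumes linear interpolation by distance along the route, and the function name `interpolate_stop_times` and the sample numbers are hypothetical.

```python
def interpolate_stop_times(distances, known_times):
    """Fill in missing stop times by linear interpolation on distance.

    distances   -- cumulative distance of each stop along the route (metres)
    known_times -- time at each stop in seconds, or None where unknown;
                   the first and last entries must be known
    """
    if known_times[0] is None or known_times[-1] is None:
        raise ValueError("first and last stop times must be known")

    times = list(known_times)
    prev = 0  # index of the most recent stop with a known time
    for i in range(1, len(times)):
        if times[i] is None:
            continue
        # Fill every unknown stop between the two known timepoints.
        span = distances[i] - distances[prev]
        for j in range(prev + 1, i):
            frac = (distances[j] - distances[prev]) / span if span else 0.0
            times[j] = times[prev] + frac * (times[i] - times[prev])
        prev = i
    return times


# Hypothetical route: four stops, times known only at the first and last.
distances = [0, 500, 1500, 2000]
schedule = [0, None, None, 240]
print(interpolate_stop_times(distances, schedule))
```

A real schedule would probably need something smarter than straight-line interpolation (dwell times, traffic, express sections), which may be where the "compression artefacts" in the simulation come from.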
# Sean Crawford - QUANDL
Open Data Day 2013 Quandl
- Questions that require data
- A search engine for numerical data
- 2.5 million datasets, doubling total dataset every 3 months, numerical data on set easy to find and use
- When you find the raw data that you are looking for, there is a lot of work left to do. 30 min per dataset in finding, processing, cleaning
- How does it work?
- You can download data in a multitude of formats, API visualizations, & more features
- Quandl goes to original data source, fetches raw data, translates it into usable format instantly - an instant productivity gain
- System for copyright, license, all data they have is open and accessible, no need to check yourself
- Will premium data be licensable later?
- Fields or domains Quandl focus - catch all, strongest datasets in finance, demographics, global datasets, economics
- Take dataset pull into R to analyze
- Health data - all and more data hopefully coming
_I hope they chronicle this experience_
- Do you take sets manually or pull them in? Quandl uses Q bots: data exists in PDFs, broken web pages, etc., and the Q bot crawlers parse it and pull it in automatically
- Dataset will track updates, revisions knows when to constantly grab new data to pull in
- Quandl archives previous data so even if dataset ceases to exist it will exist in quandl
- The data library or finds immediately interprets data - potentially a data library, wikipedia for data
- Search goes to data library or original source
- Updates will reach their library at the exact same time the source publishes updated data
# Wendy Smith - Parklot Project
- The Parklot Project looks at a number of historical features of Toronto: disappeared/disappearing creeks, land grants
- Usage of open data? Much of it is primary research
- The map tells a land transaction story
- Tools: Google Fusion Tables; created a db in Excel and built it online on Fusion Tables
- Then give the table coordinates or addresses to plot markers on the map
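The last step, turning table rows with coordinates into markers on a map, can be sketched without Fusion Tables. A minimal sketch, assuming the spreadsheet has been exported as CSV with hypothetical `name`, `lat`, and `lon` columns; it converts the rows into GeoJSON point features, which most web mapping tools can plot. The sample rows and coordinates are made up for illustration and are not from the Parklot Project.

```python
import csv
import io
import json


def rows_to_geojson(csv_text):
    """Convert CSV rows with name/lat/lon columns to a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["lon"]), float(row["lat"])],
            },
            "properties": {"name": row["name"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Illustrative data only.
sample = """name,lat,lon
Lot A,43.65,-79.38
Lot B,43.66,-79.40
"""

collection = rows_to_geojson(sample)
print(json.dumps(collection, indent=2))
```

Rows that only have a street address would need a geocoding step first, which is left out here.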
# WORKSHOPS, HACKATHON REVIEW
# Karen Smith
# Andrew Lovett-Barron - HACKATHON IDEAS
1. Neighbourhood cheatsheet
- a guide that shows local neighbourhood interests to newcomers
- aggregate data from sources
- filters data down to use
- communication problem: how to make data relevant to a lot of people
- location quick and easy cheat sheet
- Information that is contextual to you in seconds
2. Local Layer Library
- A community bake oven, find out when events happen and access to those events. There is no one place to go and find out
- Local organization participation
- Problem: there are sources of data, but they exist in different formats and cannot be centralized
- Start physical/expand to cultural events
- For instance, this app can extend to public art like finding murals in the city
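The centralization problem noted in this idea (the data sources exist, but in different formats) usually comes down to mapping each source onto one shared schema. A minimal sketch, assuming two hypothetical feeds, one CSV and one JSON, with made-up field names and events; none of these sources or fields come from the hackathon project.

```python
import csv
import io
import json


def from_csv(text):
    """Read events from a CSV feed with 'title', 'date', 'place' columns."""
    return [
        {"name": r["title"], "date": r["date"], "venue": r["place"]}
        for r in csv.DictReader(io.StringIO(text))
    ]


def from_json(text):
    """Read events from a JSON feed with 'event', 'when', 'where' keys."""
    return [
        {"name": e["event"], "date": e["when"], "venue": e["where"]}
        for e in json.loads(text)
    ]


# Illustrative feeds only.
csv_feed = "title,date,place\nBake oven day,2013-03-02,Park A\n"
json_feed = '[{"event": "Mural walk", "when": "2013-03-01", "where": "Market B"}]'

# Merge both feeds into one list, sorted by ISO date string.
events = sorted(from_csv(csv_feed) + from_json(json_feed), key=lambda e: e["date"])
for e in events:
    print(e["date"], e["name"], "@", e["venue"])
```

Each new source then only needs its own small adapter function, while everything downstream works against the single `name`/`date`/`venue` shape.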
Other great ideas:
Great Lakes Map
Ever wonder how to tell a story backwards?
Then look into the trend of digital data-driven journalism:
Data driven journalism is a workflow that consists of the following elements: digging deep into data by scraping, cleansing and structuring it, filtering by mining for specific information, visualizing it and making a story.
(Mirko Lorenz, information architect and multimedia journalist)
How is this related to maps and GIS? Mapping involves the same process of visualizing other forms of data, but mapping traditionally involved using geospatial data.
The process of creating a map is not limited to a few people with the tools. Mapping lends itself to the best practices of journalism by communicating information in a way that is efficient and easy to read. In many cases it’s more engaging if it is being constantly updated in real-time, like the election maps used in the last two US presidential elections.
Stay tuned for news about a great event on data visualization, coming to an iSchool near you.
Piers Dillon-Scott: 2012 – The web’s most interactive, real time, Presidential Election ever
Sheraz Khan: US Election - Mapping Frenzy
Jessica Melanson: Digital mapping: Is it journalism?
Knight Digital Media Center: A resource for data driven journalism, digital mapping and mashups
CBC’s Spark website features an extended interview with Andrew Turner about the rise in digital map making and digital map making culture. They ask, what is the social endgame for companies and projects (like Apple, Google, OpenStreetMap) to provide mapping?
Is it all about advertising? Is it getting you to buy something? Turner looks beyond that: creating maps tells us a lot more about the things around us, and it enriches user experiences. Maps can become more personal, as users are able to edit and create their own. Bottom up map-making, local input. You also get to see spin-offs of maps for other uses.
According to Maclean’s chart of the week, more people will use mobile devices like tablets and smartphones than desktop and laptop computers (the mobile global user base is to surpass 1.5 billion). Location-based information will have a greater importance in people’s lives.
The interview touches upon everything, also going into the political issues of editing maps without ownership. For instance, in Cyprus there are disputes over putting Greek and Turkish names on roads. It also covers editing errors by authorities, and government platforms that control map approval while also having open data.
CBC’s Spark: Future of Digital Mapmaking
Andrew Turner: Mapping Social Infrastructure
Krista McCracken: Public Archivist, Digital Map Making Round-Up
An American songbook, including standards, folk, blues, pop, singer-songwriters, gospel, and country, exists as a kind of continuous presence underlying the fluctuations of popularity and trend. Its contents are perennially rejected and then reclaimed, added to and subtracted from. The possibility of that presence threading into a group of songs seemed like a way to take notice of the musical structures that have formed modern songwriting, and to reconnect with the ones that have been abandoned.
In the first six months of this year, governments around the world made 20,938 requests to Google to provide information on 34,614 accounts, the company said. In the same period in 2011, governments made 15,744 requests on 25,342 accounts, according to company data.
During the first half of 2012, the government of the United States made the majority of the demands, followed by India, Brazil and France.
In the United States, in the first half of 2012, for instance, the government made 7,969 requests to Google to hand over information on 16,281 accounts. Google said it fully or partially complied with 90 percent of those requests. In the same time period in the United States in 2011, Google received 5,950 data requests regarding 11,057 accounts.
- The process is iterative. Form and feedback co-evolve and re-evolve.
- Do linear models ever work?
- Creating metadata for digital stuff meaning
- Group books together -> meaning
- Digital collections & information overload -> metadata on the web
- Google sucks at this
- The strategic error of the academic library is organizing itself as an institutional repository. Focus on the discipline
- One arc, no longer based on locale -> forward facing front disciplinary
- Open data access and normalization across universities -> empowerment to the user
- Service model, away from industrial model
- Seminal works make social constructs
- The framework: from structure to interaction, modality
keywords prototyping, agile personas
keywords big data, metadata, curated content communities
keywords contest of faculties, second order discipline, organizational design
Why Does Open Matter?
theory, practice, praxis, signification, domination, legitimation
I wondered today what the revenue model was for MOOCs, like Coursera. So far Coursera has $22 million from venture funding, and the article on Gigaom says that they hope to gain most of their revenue from (1) certification and (2) student-employer matches.
How much thought went into providing certification for every successful student? Is there some sort of policy in place that is recognized by every participating university? How difficult would it be to match employers with students online? How do you go about assessing a student’s performance this way? What are the issues for a student’s privacy, what if they didn’t want their performance to be monitored and revealed to potential employers? Does this model work to actually find the best candidates? There won’t be enough time to answer all of these questions before another idea comes along and reforms the way we learn.
Look at social analytics, resume parsing, and digital grading. The most recent post I could find was called “The algorithm didn’t like my essay”. The human markers for now still outnumber the machine markers, but it’s something we are moving towards, not away from, even with its current fundamental problems. A machine with the same algorithm will always give the same response. Is this a good thing?
The assumption is that because users are familiar with the system, they will continue to use it even if they need to pay, so the providers can start to monetize their courses later on. But will these lessons actually be worth it? How much will it actually matter to employers? If you are doing it for yourself, there are so many other learning options out there that are free.
UC Online Strives to Compete in an Era of Online Courses
Does the freemium model really work? The definitive answer is yes and no.
Infographic: Is the Higher Education Bubble About to Burst?