The week before last, I attended the 24th International World Wide Web Conference (WWW2015) in Florence, Italy. Google was a gold sponsor, and Google's Distinguished Scientist Andrei Broder delivered one of the main keynotes. The core proceedings and the companion proceedings are available online. This is my trip report with personal highlights and key takeaways.
Workshops, Day 1
I started the conference on Monday with the Workshop on Web APIs and RESTful Design (WS-REST), which I co-organized with Ruben Verborgh (Ghent University) and Carlos Pedrinaci (The Open University). The workshop had three main themes: testing, hypermedia and semantics, and REST in practice. The day started with a keynote delivered by Erik Wilde (ex-Siemens). One of his main points, which also emerged as a general workshop theme, was that the REST world, despite all its self-descriptiveness, still needs service descriptions and better testability. Erik shared his keynote slides on his personal website. The WS-REST proceedings can be found online. Personally, I liked Ronnie Mitra's (CA Technologies) slides and paper on his upcoming API design tool Rápido a lot.
One of the workshop attendees, Michael Petychakis, also wrote a workshop report.
On the same day, I also had an accepted paper in the Workshop on Ad Targeting at Scale (TargetAd), co-organized by Googler D. Sculley. The title of my paper is AdAlyze Redux: Post-Click and Post-Conversion Text Feature Attribution for Sponsored Search Ads. In the paper, I describe a tool used in my organization at Google that shows large-scale advertisers which textual features work in their ads. The workshop attracted broad industry interest, with presenters and speakers coming from Twitter, Yahoo!, Etsy, Adobe, eBay, Facebook, and Google (D. Sculley). The TargetAd proceedings are available online.
Workshops, Day 2
I spent the first half of Tuesday morning in the Workshop on Linked Data on the Web (LDOW), and the second half in the Workshop on Web and Data Science for News Publishing (NewsWWW). From LDOW, I want to highlight DBpedia Atlas, an alternative visualization of DBpedia (demo). NewsWWW had an interesting paper on gender bias in news images. In the afternoon, I attended the tutorial on Constructing and Mining Web-scale Knowledge Graphs by Facebook's Antoine Bordes and Google's Evgeniy Gabrilovich (slides from KDD 2014, but similar enough to the ones at WWW).
Main Conference, Day 1
The main conference began with a keynote by Jeanette Hofmann (Berlin University of the Arts), who raised a number of critical points that she named "dilemmas of digitalization". She especially mentioned the Right to be Forgotten and how (personal) data has become the currency we pay our free apps with.
A number of papers from Wednesday morning that I want to highlight are The Dynamics of Micro-Task Crowdsourcing: The Case of Amazon MTurk on crowdsourcing with Amazon's Mechanical Turk, a Facebook study on The Lifecycles of Apps in a Social Ecosystem where they study, among other things, app sustainability, and finally a Google paper on account recovery secret questions titled Secrets, Lies, and Account Recovery: Lessons From the Use of Personal Knowledge Questions at Google.
In the afternoon, I listened to Philipp Singer's presentation of their paper HypTrails: A Bayesian Approach for Comparing Hypotheses about Human Trails on the Web (best paper award), wherein they present "a general approach called HypTrails for comparing a set of hypotheses about human trails on the Web, where hypotheses represent beliefs about transitions between states". Also of interest to me was a Google paper titled Getting More for Less: Optimized Crowdsourcing with Dynamic Tasks and Goals, where the authors "optimize the crowdsourcing process by jointly maximizing the user longevity in the system and the true value that the system derives from user participation". The Yahoo! paper Evolution of Conversations in the Age of Email Overload looked at 16 billion emails between 2 million users and studied the reply times and reply lengths as indicators of how people deal with email overload. The task of benchmarking entity annotation systems reproducibly was addressed in the paper GERBIL - General Entity Annotator Benchmarking Framework.
I follow the privacy implications of Web tracking critically (probably due to my day job), so the paper Cookies That Give You Away: The Surveillance Implications of Web Tracking was of great interest to me. I generally liked the track Security and Privacy 3 – Browsers a lot. Related to my PhD research on breaking news events and their perception in online social networks, I enjoyed the paper Crowdsourcing the Annotation of Rumourous Conversations in Social Media very much.
Main Conference, Day 2
I started Thursday after the keynote with an interesting Yahoo! paper on explorative entity search titled From "Selena Gomez" to "Marlon Brando": Understanding Explorative Entity Search that identified query patterns that lead to explorative searching. A somewhat emotional paper that certainly raises privacy warning flags was Diagnoses, Decisions, and Outcomes: Web Search as Decision Support for Cancer, which examined the search behavior of patients diagnosed with cancer. The paper Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia looked at identifying missing hyperlinks in Wikipedia.
During the lunch break, I called an informal meeting of the W3C Media Fragments WG and interested friends to discuss extending Media Fragments URI beyond rectangular spatial fragments, both to other shapes and to spatial fragments that move over time. The notes are on the mailing list.
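For context on what such extensions would build on: Media Fragments URI 1.0 addresses a spatial region with the `xywh` fragment dimension, which can only express an axis-aligned rectangle in pixels or percent. The following is a minimal sketch of parsing that existing syntax. The function name `parse_xywh`, the `SpatialFragment` type, and the example URLs are my own illustrative assumptions, not part of the specification or of what was discussed in the meeting, and the validation is deliberately simplified.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qsl, urlparse


class SpatialFragment(NamedTuple):
    """Hypothetical container for a rectangular spatial fragment."""
    unit: str  # "pixel" (the default) or "percent"
    x: int
    y: int
    w: int
    h: int


def parse_xywh(url: str) -> Optional[SpatialFragment]:
    """Extract the xywh dimension from a media fragment URI, if any.

    Simplified sketch: accepts '#xywh=x,y,w,h', '#xywh=pixel:x,y,w,h'
    and '#xywh=percent:x,y,w,h' with non-negative integer values.
    Invalid occurrences are ignored; the last valid one wins.
    """
    result = None
    fragment = urlparse(url).fragment
    # Fragment dimensions are name=value pairs separated by '&'.
    for name, value in parse_qsl(fragment):
        if name != "xywh":
            continue
        unit = "pixel"
        if ":" in value:
            unit, _, value = value.partition(":")
        if unit not in ("pixel", "percent"):
            continue
        parts = value.split(",")
        if len(parts) != 4:
            continue
        if not all(p.isascii() and p.isdigit() for p in parts):
            continue
        result = SpatialFragment(unit, *(int(p) for p in parts))
    return result


if __name__ == "__main__":
    print(parse_xywh("http://example.com/video.mp4#xywh=160,120,320,240"))
    print(parse_xywh("http://example.com/video.mp4#t=10,20&xywh=percent:25,25,50,50"))
```

The four numbers leave no room for a circle, a polygon, or a region that follows an object across frames, which is the gap the discussed extensions aim at.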
In the afternoon, I attended the Industry Knowledge Graphs PechaKucha 20×20 and Panel where Googler Chris Welty presented the Google Knowledge Graph, Yuqing Gao gave an overview of Microsoft's (Bing's) Satori, Paul Groth talked about Elsevier's scholarly publications graph, and Lora Aroyo presented Tagasauris' mediaGraph. This also touched on my 20% project together with Googlers Denny Vrandečić and Sebastian Schaffert around migrating Freebase to Wikidata via a crowdsourcing approach titled primary sources tool.
From the posters and demos session in the evening, I want to highlight whoVIS: Visualizing Editor Interactions and Dynamics in Collaborative Writing Over Time, which deals with visualizing editor interactions in Wikipedia (demo).
Main Conference, Day 3
Friday began with Andrei Broder's excellent keynote How good was the crystal ball? A personal perspective and retrospective on favorite Web research topics where he first looked back at search engines and what worked and what did not work (subscribing to pages for obtaining change notifications). I especially liked the outlook he gave for semantic smart agents and how Google Now is just the beginning.
Again driven by my PhD topic, I followed the presentation of the paper Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts, whose authors showed how scepticism-driven follow-up queries on questionable news-spreading posts may reveal rumors early on. A fun paper was User Review Sites as a Resource for Large-Scale Sociolinguistic Studies that, among other things, found that users older than 34 mostly use smileys with a nose (":-)"), while those younger than 34 mostly use them without (":)").
- People are starting to get tired of PDF proceedings, given that past WWW conferences explicitly required HTML submissions, as highlighted by Andrei Broder in his keynote. RASH: Research Articles in Simplified HTML is an attempt to bring back the Web to WWW.
- Larry Page and Sergey Brin were awarded the Test of Time award for their paper The Anatomy of a Large-Scale Hypertextual Web Search Engine.
- Splitting the poster and demos session into two sub-sessions is a great idea that certainly reduced my personal perceptual overload.
- Commonly frowned upon and joked about, the lack of any formal speech or program topic at all during the (not so gala) dinner felt somewhat inadequate.
- Paul Groth wrote a WWW trip report, too, as did Amy Guy with her WWW observations, eXascale with their blog post, and Daniel Garijo with his "first time at WWW" post.