A Look Inside the Think Tank...

Mediterranea.JS—Trip Report

Created on Wednesday, June 24, 2015 at 15:11:25 and categorized as Technical by Thomas Steiner

This week, I attended and spoke at Mediterranea.JS, a JavaScript developer conference in sunny Barcelona, Spain. The conference-related social network discussions and photos were captured on Eventifier. Below are some interesting links for your reading pleasure:

World Wide Web Conference (WWW2015)—Trip Report

Created on Monday, June 01, 2015 at 14:42:22 and categorized as Work by Thomas Steiner

The week before last, I attended the 24th International World Wide Web Conference (WWW2015) in Florence, Italy. Google was a gold sponsor, and Google's Distinguished Scientist Andrei Broder delivered one of the main keynotes. The core proceedings and the companion proceedings are available online. This is my trip report with personal highlights and key take-aways.

Workshops, Day 1

I started the conference on Monday with the Workshop on Web APIs and RESTful Design (WS-REST), which I co-organized with Ruben Verborgh (Ghent University) and Carlos Pedrinaci (The Open University). We had three main themes in the workshop: testing, hypermedia and semantics, and REST in practice. The day started with a keynote delivered by Erik Wilde (ex-Siemens); one of his main points, which also emerged as a general workshop theme, was that the REST world, despite all self-descriptiveness, still needs service descriptions and better testability. Erik shared his keynote slides on his personal website. The WS-REST proceedings can be found online. Personally, I liked Ronnie Mitra's (CA Technologies) slides and paper on his upcoming API design tool Rápido a lot.

One of the workshop attendees, Michael Petychakis, also wrote a workshop report.

On the same day, I also had an accepted paper in the Workshop Ad Targeting at Scale (TargetAd), co-organized by Googler D. Sculley. The title of my paper is AdAlyze Redux: Post-Click and Post-Conversion Text Feature Attribution for Sponsored Search Ads. In the paper, I describe a tool in use in my organization at Google to show large-scale advertisers what textual features work in their ads. The workshop triggered broad industry interest with presenters and speakers coming from Twitter, Yahoo!, Etsy, Adobe, eBay, Facebook, and Google (D. Sculley). The TargetAd proceedings are available online.

Workshops, Day 2

I spent the first half of Tuesday morning in the Workshop Linked Data on the Web (LDOW), and the second half in the Workshop on Web and Data Science for News Publishing (NewsWWW). From LDOW, I want to highlight DBpedia Atlas, an alternative visualization of DBpedia (demo). NewsWWW had an interesting paper on gender bias in news images. In the afternoon, I attended Facebook's Antoine Bordes' and Google's Evgeniy Gabrilovich's tutorial on Constructing and Mining Web-scale Knowledge Graphs (slides from KDD 2014, but similar enough to the ones at WWW).

Main Conference, Day 1

The main conference began with a keynote by Jeanette Hofmann (Berlin University of the Arts), who raised a number of critical points that she named "dilemmas of digitalization". She especially mentioned the Right to be Forgotten and how (personal) data has become the currency we pay our free apps with.

A number of papers from Wednesday morning that I want to highlight are The Dynamics of Micro-Task Crowdsourcing: The Case of Amazon MTurk on crowdsourcing with Amazon's Mechanical Turk, a Facebook study on The Lifecycles of Apps in a Social Ecosystem where they study, among other things, app sustainability, and finally a Google paper on account recovery secret questions titled Secrets, Lies, and Account Recovery: Lessons From the Use of Personal Knowledge Questions at Google.

In the afternoon, I listened to Philipp Singer's presentation of their paper HypTrails: A Bayesian Approach for Comparing Hypotheses about Human Trails on the Web (best paper award), wherein they present "a general approach called HypTrails for comparing a set of hypotheses about human trails on the Web, where hypotheses represent beliefs about transitions between states". Further of interest to me was a Google paper titled Getting More for Less: Optimized Crowdsourcing with Dynamic Tasks and Goals where the authors "optimize the crowdsourcing process by jointly maximizing the user longevity in the system and the true value that the system derives from user participation". The Yahoo! paper Evolution of Conversations in the Age of Email Overload looked at 16 billion emails between 2 million users and studied the reply times and reply lengths as indicators of how people deal with email overload. The task of benchmarking entity annotation systems reproducibly was addressed in the paper GERBIL - General Entity Annotator Benchmarking Framework.

I follow privacy implications of Web tracking critically (probably due to my day job), so the paper Cookies That Give You Away: The Surveillance Implications of Web Tracking was of great interest to me. I generally liked the track Security and Privacy 3 – Browsers a lot. Related to my PhD research on breaking news events and their perception in online social networks, I enjoyed the paper Crowdsourcing the Annotation of Rumourous Conversations in Social Media very much.

Main Conference, Day 2

I started Thursday after the keynote with an interesting Yahoo! paper on explorative entity search titled From "Selena Gomez" to "Marlon Brando": Understanding Explorative Entity Search that identified query patterns that lead to explorative searching. A somewhat emotional paper that certainly raises privacy warning flags was Diagnoses, Decisions, and Outcomes: Web Search as Decision Support for Cancer, which examined the search behavior of patients diagnosed with cancer. The paper Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia looked at identifying missing hyperlinks in Wikipedia.

During the lunch break, I called an informal meeting of the W3C Media Fragments WG and interested friends to discuss extensions to Media Fragments URI that would allow for more than rectangular spatial fragment shapes and for dynamic, moving spatial fragments. The notes are on the mailing list.
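
For context, the spatial dimension that Media Fragments URI 1.0 already defines is rectangular: an `xywh=` fragment with an optional `pixel:` or `percent:` prefix, followed by x, y, width, and height. The following Python sketch parses that existing syntax only. The function and class names are hypothetical, the example URLs are made up, and none of the extensions discussed in the meeting are modeled.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import parse_qsl, urlsplit


class SpatialFragment(NamedTuple):
    """A rectangular region (hypothetical helper type for this sketch)."""
    unit: str  # "pixel" or "percent"
    x: int
    y: int
    w: int
    h: int


# Optional unit prefix, then four comma-separated non-negative integers.
_XYWH = re.compile(r"(?:(pixel|percent):)?(\d+),(\d+),(\d+),(\d+)", re.ASCII)


def parse_xywh(url: str) -> Optional[SpatialFragment]:
    """Return the rectangular spatial fragment of `url`, or None if absent.

    Simplifications (assumptions of this sketch): range checks such as
    x + w <= 100 for percentages are skipped, and if `xywh` occurs more
    than once, the last well-formed occurrence is kept.
    """
    result = None
    for name, value in parse_qsl(urlsplit(url).fragment):
        if name != "xywh":
            continue
        match = _XYWH.fullmatch(value)
        if match is None:
            continue  # ignore malformed values such as unknown shapes
        unit = match.group(1) or "pixel"  # pixel is the default unit
        x, y, w, h = (int(n) for n in match.groups()[1:])
        result = SpatialFragment(unit, x, y, w, h)
    return result


if __name__ == "__main__":
    print(parse_xywh("http://example.org/video.mp4#xywh=160,120,320,240"))
    print(parse_xywh("http://example.org/video.mp4#t=10,20&xywh=percent:25,25,50,50"))
```

Because the pattern accepts only the two rectangular forms, a non-rectangular shape or a region that moves over time has no representation here, which is the kind of gap the meeting set out to discuss.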

In the afternoon, I attended the Industry Knowledge Graphs PechaKucha 20×20 and Panel where Googler Chris Welty presented the Google Knowledge Graph, Yuqing Gao gave an overview of Microsoft's (Bing's) Satori, Paul Groth talked about Elsevier's scholarly publications graph, and Lora Aroyo presented Tagasauris' mediaGraph. This also touched on my 20% project together with Googlers Denny Vrandečić and Sebastian Schaffert around migrating Freebase to Wikidata via a crowdsourcing approach called the primary sources tool.

From the posters and demos session in the evening, I want to highlight whoVIS: Visualizing Editor Interactions and Dynamics in Collaborative Writing Over Time, which deals with visualizing editor interactions in Wikipedia (demo).

Main Conference, Day 3

Friday began with Andrei Broder's excellent keynote How good was the crystal ball? A personal perspective and retrospective on favorite Web research topics, where he first looked back at search engines and at what worked and what did not work (such as subscribing to pages to obtain change notifications). I especially liked the outlook he gave for semantic smart agents and how Google Now is just the beginning.

Again driven by my PhD topic, I followed the paper presentation of Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts, whose authors showed how scepticism-driven follow-up queries on questionable news-spreading posts may reveal rumors early on. A fun paper was User Review Sites as a Resource for Large-Scale Sociolinguistic Studies that, among other things, detected that users older than 34 mostly use smileys with a nose ":-)", while those younger than 34 use them without a nose ":)".


General observations


International Semantic Web Conference 2014: Trip Report

Created on Monday, October 27, 2014 at 13:01:35 and categorized as Work by Thomas Steiner

Last week, I attended the 13th International Semantic Web Conference in Riva del Garda, Italy. Google was a gold sponsor, and Vice President Prabhakar Raghavan delivered one of the keynotes. This is my trip report with personal highlights and key take-aways.

I started the conference on Sunday with the Developers Workshop, where I had two papers. The workshop was the first of its kind and was put together by my good friend Ruben Verborgh. It drew more than 70 people into the room, and the workshop was prominently featured during the main conference's opening ceremony.

Of personal interest for me were the following works. Knowledge Graphs were a core theme during the conference, and one example based on OpenRefine was shown by Parmesan et al. in the form of Dandelion. Liepiņš et al. showed an ontology visualizer called OWLGrEd. Ebner et al. showed a system called LDcache that deals with caching flaky Linked Data sources. With XSPARQL, Dell’Aglio et al. presented a language and implementation combining XML, SPARQL, and SQL to query heterogeneous data sources. Matteis et al. showed how App Engine or Google Code among others can be used as "free" and queryable triple pattern data stores. Ceccarelli showed an entity linking framework called Dexter. My first contribution is titled Comprehensive Wikipedia Monitoring for Global and Realtime Natural Disaster Detection and focuses on natural disaster detection and monitoring with Wikipedia and online social networks. My second contribution is a paper called Self-Contained Semantic Hypervideos Using Web Components and introduces Web Components for the creation of hypervideos.

On Monday morning, I attended the Consuming Linked Data workshop. The most interesting paper for me was by Rula et al., which dealt with the recency of facts in DBpedia. In the afternoon, I switched to the NLP and DBpedia workshop, where the highlight was an amazing 300-slides-in-30-minutes keynote by Roberto Navigli on BabelNet, Babelfy, Games with a Purpose, and the Wikipedia Bitaxonomy. Further of interest was a paper by Weisenburger et al. on mining historical data for DBpedia via Wikipedia infoboxes.

Tuesday started with Prabhakar's well-received keynote, in which he provided an overview of search engine development in recent years. His book has a nice summary. I then went to the NLP & IEs track, where the best-paper-award-winning paper on the AGDISTIS framework on entity disambiguation by Usbeck et al. was presented. In the afternoon, I attended the Data Integration and Link Discovery track. I liked a paper by Erxleben et al. that described the integration of Wikidata in the Linked Data Web. From the demos in the evening, I especially want to highlight the best-demo-award-winning paper by my friends Verborgh et al. on Linked Data Fragments on a Raspberry Pi. In general, Linked Data Fragments were one of the themes at this conference, with several works citing them and also the release of the official DBpedia Linked Data Fragments interface.

On Wednesday, I attended the User Interaction and Personalization track, where I want to highlight a paper by Uchida et al. who presented a Chrome extension on browser personalization. I further liked a paper by Khamkham et al. on the CrowdTruth framework for harnessing disagreement in gathering annotated data. In the afternoon, my personal highlight was Verborgh et al.'s full paper on Linked Data Fragments.

I skipped Thursday morning and was back in the afternoon for the Linked Data track. Notable papers include Beek et al.'s LOD Laundromat, which provides a solution for streamlining access to Linked Data sources through cleansing and format conversion, and Patel-Schneider's analysis of Schema.org with some recommendations (the author's view) on how to improve it. Meusel et al. gave an overview of the current state of the WebDataCommons project, which examines Microdata, RDFa, and Microformats distribution in the CommonCrawl corpus.

The conference was archived by Eventifier and Seen. All papers are available as open access preprints on GitHub.

PhD thesis successfully defended

Created on Wednesday, May 21, 2014 at 10:51:11 and categorized as Private by Thomas Steiner

I have finally defended my PhD thesis. A raw, unedited recording of the defense is available on YouTube.

The slide deck that I used is available at http://tomayac.com/phd, and the PDF of the thesis itself is available at http://tomayac.com/phd/thesis.pdf. The source code of the thesis is available in the GitHub repository https://github.com/tomayac/phd. I guess this makes me officially Dr. Thomas Steiner from now on.

WS-REST 2014 and Weaving the Web(VTT) of Data

Created on Tuesday, April 08, 2014 at 19:27:24 and categorized as Technical by Thomas Steiner

Weaving the Web(VTT) of Data

This week, I'm attending the World Wide Web conference (WWW2014) in Seoul, Korea. Yesterday, I co-ran the 5th International Workshop on Web APIs and RESTful Design (WS-REST2014). It was a great workshop, for the papers, the keynote by my Google colleague Sam Goto, and above all, for the people:



Today, I presented work of ours in the workshop Linked Data on the Web (LDOW2014). The title of our paper is Weaving the Web(VTT) of Data; you can see the slides that I used for my talk below.
