idr-511
Title: ScraperWiki Tutorial
Authors: Levine, Thomas
Year: 2012
Language: eng
Abstract: The objective of the workshop, or rather hackathon, was to get data into a structured format and join it with data from other sources, alongside an overview and worked examples of what is possible with scraping. Thomas covered identifying targets for web scraping and navigating the complexity of different types of web pages, presenting this material in several half-hour and hour-long modules that catered to different audiences.
Keyword(s): BigClean; sbírání dat; strukturované data; teorie dat; čištění dat
English keyword(s): BigClean; data cleaning; data theory; scraping; structured data
Conference/Event: Big Clean 2012, Prague (CZ), 2012-11-03
Rights: This work is protected under the Copyright Act No. 121/2000 Coll. License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Czech Republic



 

 Record created 2012-11-14, last modified 2020-07-11

