Webrecorder: Web Archiving for All!
Webrecorder:
Web archiving for all!
ARLIS/NA
February 26 & 27, 2018
Anna Perricci
Webrecorder / Rhizome
Web archiving fundamentals
• Web archiving: the process of selecting, capturing, saving and making
accessible select content available online (e.g. websites)
• Web archiving is a new and growing field and we need people with new ideas
and evolving skill sets
• Web archiving has a distinct lack of ‘silver bullets’ or comprehensive
one-size-fits-all solutions
About Webrecorder
Create high-fidelity, interactive captures of any web pages you browse
http://webrecorder.io Webrecorder Player App
A project by
with generous support from
Webrecorder Project
● Robust tools
● Free to use
● Fully open source
● Using open standards
● Growing user community
● Quickly evolving
Webrecorder Team
Dragan Espenschied
Rhizome's Digital Conservator
Ilya Kreymer
Lead developer & Creator
Mark Beasley
Senior Front-End Developer
Pat Shiu
Design Lead
Anna Perricci
Associate Director of
Strategic Partnerships
High fidelity web collecting (archiving)
• Capture any web page loaded in the browser
• Archive interactive content (only available after user input)
• Same system for recording and playback (web browser)
Collecting at human scale
• Webrecorder: web archiving for all!
• Collecting is done by a person via a web browser one page at a time
• Can import and augment collections created by crawlers
The payoff for careful capture is an
accurate representation of the original
Record=capture / replay=browse
• Webrecorder.io is used to make interactive captures of web pages as users
see them while archiving, but it is not screen recording software that plays
recordings back like a video
• Replay means you can access the content captured in the web archive and
browse it interactively like the live web (or a bit like a slideshow with arrow
buttons)
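The record/replay distinction can be sketched in a few lines of Python. This is a minimal illustration only, not how Webrecorder is implemented: the `Capture` and `Collection` classes and their methods are hypothetical names made up for this example. Recording stores each response under its URL; replay serves only what was stored.

```python
from dataclasses import dataclass, field


@dataclass
class Capture:
    """One archived response, keyed by the URL that produced it."""
    url: str
    status: int
    body: bytes


@dataclass
class Collection:
    """Hypothetical in-memory stand-in for a web archive collection."""
    captures: dict = field(default_factory=dict)

    def record(self, url, status, body):
        # Recording stores the response as the browser received it.
        self.captures[url] = Capture(url, status, body)

    def replay(self, url):
        # Replay serves the stored response; nothing is fetched from the live web.
        return self.captures.get(url)


collection = Collection()
collection.record("http://example.com/", 200, b"<html>hello</html>")

page = collection.replay("http://example.com/")
missing = collection.replay("http://example.com/not-captured")
```

Here `page` holds the stored capture, while `missing` is `None` because that URL was never recorded. A video-style screen recording would have no equivalent of looking up an individual resource.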
Symmetrical archiving
Browsing a bound archive
• Each collection is a separate unit so at this time you can only navigate
content within one collection at a time
• This gives tight curatorial control, though the boundaries of the collection
can sometimes be hit quickly
Patching with Open Web Archives & Live Web
• What is Patching? – Filling in missing resources in an archive using other
sources
• Other sources = other web archives and/or the live web
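As a rough sketch of the idea, patching can be thought of as a lookup with fallbacks. The function and variable names below are hypothetical, and the choice to save each patched resource back into the collection is an assumption made for this illustration rather than a description of Webrecorder's internals.

```python
def patch(url, collection, sources):
    """Return (body, origin) for url, filling gaps from other sources.

    collection: dict mapping URL -> bytes (the archive being patched)
    sources: ordered list of (name, lookup) pairs, where lookup(url)
             returns bytes or None. All names here are hypothetical.
    """
    if url in collection:
        return collection[url], "collection"
    for name, lookup in sources:
        body = lookup(url)
        if body is not None:
            # Assumption: the patched resource is kept in the collection.
            collection[url] = body
            return body, name
    return None, None


# Stand-ins for another web archive and the live web.
other_archive = {"http://example.com/logo.png": b"PNG-bytes"}
live_web = {"http://example.com/style.css": b"body{}"}
sources = [("other archive", other_archive.get), ("live web", live_web.get)]

collection = {"http://example.com/": b"<html></html>"}

first = patch("http://example.com/logo.png", collection, sources)
second = patch("http://example.com/logo.png", collection, sources)
```

The first call finds the logo in the other archive; the second finds it in the collection itself, because the gap has been filled.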
Importing Content from Open Web Archive
• Extraction is the importation of content from other open web archives
• Archives included in the public-web-archives repository can be extracted from
Preconfigured browsers
• Using a preconfigured browser to capture and replay web content that may
not be supported in current or future web browsers
• e.g. Java applets or Flash
• Access with a preconfigured browser ensures greater faithfulness to the
original look and feel of web pages
• Browsers use HTTP proxy mode = even better fidelity
Preconfigured browsers
Recording and replaying Flash content
What about social media?
• Webrecorder can capture content from social media sites, and works
especially well with Instagram and Twitter
• Some websites deliver content individualized for each user
• Webrecorder can record the content you see when you are logged in to a
social media profile
Account login is optional
• One does not need to log in to use Webrecorder to capture web content
(though we do recommend it!)
• Users can download the captures right away (as a WARC file) & save
them locally
• For continued access to archived content online & to be able to add to a
collection, one must create and log in to a free account
Access & sharing options
• User-created collections can be kept private or made public through
Webrecorder.io
• Public collections can be viewed by anyone
• Finer access controls are being considered
Webrecorder Sample
Collections
https://webrecorder.io/wrsc
Webrecorder Player
• Desktop application for OSX, Windows and Linux
• User-friendly application to browse any web archive (saved in standard WARC
format)
• Can browse web archives offline, no internet connection required!
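To give a sense of what a WARC file contains, here is a minimal Python sketch that reads records from an uncompressed WARC stream using only the standard library. The function name `read_warc_records` is made up for this example, and the sample record is trimmed to a few headers for brevity. Real WARC files are usually gzip-compressed and carry more headers, so a dedicated library such as warcio would be the practical choice.

```python
import io


def read_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"unexpected line: {version!r}")
        headers = {}
        while True:
            line = stream.readline().strip()
            if not line:
                break  # blank line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        # Content-Length gives the size of the record's content block in bytes.
        length = int(headers["Content-Length"])
        yield headers, stream.read(length)


# A single hand-written record, trimmed for illustration.
body = b"hello archive"
sample = b"".join([
    b"WARC/1.0\r\n",
    b"WARC-Type: resource\r\n",
    b"WARC-Target-URI: http://example.com/\r\n",
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n",
    b"\r\n",
    body,
    b"\r\n\r\n",
])

records = list(read_warc_records(io.BytesIO(sample)))
```

Each record pairs a small block of named headers (what was captured, from which URL) with the captured bytes themselves, which is what lets a player browse the archive offline.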
Using Webrecorder
Hosted Service
Sign-up at https://webrecorder.io/ for a free account
Run your own Webrecorder instance
Install from https://github.com/webrecorder/webrecorder-deploy
Use Webrecorder Player on your Desktop
Download from https://github.com/webrecorder/webrecorderplayer-electron
Toosheh project
https://www.netfreedompioneers.org/toosheh1
The (Obama) White House
Social Media Archive
http://archive.rhizome.org/narrative-archives/thxobama.html
Net Art Anthology: Marisa Olson
https://anthology.rhizome.org/marisa-s-american-idol-audition-training-blog
Rhizome net art Microgrants
http://rhizome.org/editorial/2017/jul/18/open-call-rhizome-microgrants-2017/
Ethics & Archiving the Web
Hope to see you tomorrow!
Thank you


Editor's Notes

  • #20 Demo this in the browser. Use as an example: http://howtoappearofflineforever.online/
  • #22 Show patching in action.
  • #24 Show extracting in action.
  • #27 Flash-based interactive documentary – At Home – from NFB (National Film Board of Canada). Demo example.
  • #28 Demo recording of Twitter.
  • #32 Demo with the player – later on in the Workshop.