Frequently Asked Questions: Web Archiving

  • What is the NYARC web resources program?

    The New York Art Resources Consortium (NYARC) consists of the research libraries of three leading art museums in New York City: The Brooklyn Museum, The Frick Collection, and The Museum of Modern Art. The NYARC web resources program archives, preserves, and provides online public access to curated collections of websites in areas that correspond to the scope and strengths of the print collections at each research library, as well as to NYARC project websites and the institutional websites of the three museums. The initial phase of the NYARC web resources program is made possible through funding provided by the Andrew W. Mellon Foundation.
  • Why is NYARC archiving art-rich websites?

    A 2012 study funded by the Andrew W. Mellon Foundation made clear that there was a real danger that some web content important for the study of art history might be lost to future researchers. This program seeks to address this problem by archiving, preserving, and providing online public access to those websites.
  • How does NYARC archive websites?

    It is the goal of the NYARC web archiving program to create archival copies of websites, documenting as much as possible of the appearance and functionality of each site at a particular point in time. NYARC makes copies of websites using web crawlers and preservation tools offered through the Internet Archive's Archive-It service, Hanzo Archives, and DuraCloud. NYARC uses other in-house tools for managing the selection and permissions process.
  • How are websites selected for inclusion in the NYARC web archive?

    The initial phase of the web archiving program concentrates on archiving the following online materials:

    1. websites already cataloged in NYARC’s shared catalog, Arcade, or those that are candidates for cataloging and meet our collection development guidelines
    2. web-based auction catalogs
    3. contemporary artists’ websites
    4. digital catalogues raisonnés
    5. NYARC members’ own museum websites and other web-based digital assets

    We are presently seeking permission from the owners of websites that fall into one or more of these categories.

  • Can I nominate a website for inclusion in the archive?

    We are archiving websites that fit within the scope of our 10 existing Archive-It collections. If you would like to nominate a website for inclusion in one of these collections, please fill out our online nominations form. Please direct questions about nominations to: webarchive@frick.org
  • Who owns the content included in NYARC’s web archive collections?

    The New York Art Resources Consortium (NYARC) respects the intellectual property rights and the proprietary rights of others. Copyright ownership remains with the owner(s) identified on a website and governed by local, national, and international regulations. NYARC does not assume responsibility for the accuracy or lawfulness of the archived website or the contents within. These materials are collected to ensure long-term access for research purposes and private study. When using content from the web archive collections, we encourage users to first review the archived website’s terms of use.
  • How frequently is NYARC archiving each website?

    This will depend on the extent and significance of changes to each website over time. Sites that change frequently may be archived more often. The frequency of archiving may also vary for different types of websites and during the life-cycle of an individual website.
  • How long does it take before archival versions of my site appear?

    Some sites may appear within several weeks. Technically complex sites will generally take longer as there may be difficulties in capture, and NYARC must check each archived site for quality assurance prior to making it available.
  • How are the NYARC web archive collections accessible?

    Where feasible, the content of NYARC’s archived websites will be made publicly available, without restrictions. The public may access the collections online via the Internet Archive’s Wayback Machine and by searching for website records within Arcade, the collective online library catalog of the NYARC institutions.
  • How does one cite an archived website?

    Researchers should follow standard citation guidelines for websites. As many of the materials in the web archive are under copyright, citations must credit the authors or publishers of those works. On its website, the Internet Archive provides the following example for citing archived URLs in MLA format (the Internet Archive's FAQ pages offer additional guidance):

    We asked MLA to help us with how to cite an archived URL in correct format. They did say that there is no established format for resources like the Wayback Machine, but it's best to err on the side of more information. You should cite the webpage as you would normally, and then give the Wayback Machine information. They provided the following example:

    McDonald, R. C. "Basic Canary Care." _Robirda Online_. 12 Sept. 2004. 18 Dec. 2006 [http://www.robirda.com/cancare.html]. _Internet Archive_. [ http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare... ].

    They added that if the date that the information was updated is missing, one can use the closest date in the Wayback Machine. Then comes the date when the page is retrieved and the original URL. Neither URL should be underlined in the bibliography itself.
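
    As a rough illustration of the guidance above, the following Python sketch splits a Wayback Machine address into its capture time and original URL, then assembles a citation in the shape of the quoted example. It assumes the common address pattern of a 14-digit timestamp followed by the original URL. The function names, the example.org address, and the author and title values are hypothetical and used only for illustration.

```python
import re
from datetime import datetime

# Assumed pattern: http://web.archive.org/web/<14-digit timestamp>/<original URL>
WAYBACK_PATTERN = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_wayback_url(url):
    """Split a Wayback Machine address into (capture time, original URL)."""
    match = WAYBACK_PATTERN.match(url)
    if match is None:
        raise ValueError(f"not a recognized Wayback Machine address: {url}")
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured, match.group(2)


def format_citation(author, title, site, updated, retrieved, wayback_url):
    """Cite the page as usual, then append the Wayback Machine information."""
    _, original_url = parse_wayback_url(wayback_url)
    return (f'{author} "{title}." _{site}_. {updated}. {retrieved} '
            f'[{original_url}]. _Internet Archive_. [{wayback_url}].')


# Hypothetical values, not a real archived page.
example_url = "http://web.archive.org/web/20150301120000/http://example.org/page.html"
print(format_citation("Doe, J.", "Sample Page", "Example Site",
                      "1 Mar. 2015", "2 Apr. 2016", example_url))
```

    If the page itself carries no update date, the capture time returned by parse_wayback_url can supply the closest date, as the guidance suggests.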

  • Why are some archived websites incomplete or displayed strangely?

    We use harvesting software to capture website content, and we try to preserve the look and functionality of each site as it appeared at a particular point in time. Technical complications may limit the ability of our web crawlers to capture rich media, database content, or other interactive components, and as a result certain elements of an archived website may not be present. Another reason that some archived websites are not completely captured is that often only one page of the site was intended for inclusion in the archival collection. In addition, websites frequently link to other websites, so at times a user may follow a link to a page that was not archived by the NYARC web resources program, resulting in a message stating that the resource is not in the archive.
  • What happens if a website changes its name or address?

    NYARC’s web archive will attempt to link the new website name or address with the original and continue to archive the website. We will not seek additional permission to archive a site if an organization that has already granted permission for inclusion in the archive changes the URL of its website.
  • Can I link to the NYARC web archive?

    Yes. Other organizations may link to NYARC’s archival collections using: http://nyarc.org/webarchive
  • What can website producers/developers do to ensure their websites are more easily archived?

    Creators of websites can take certain steps to ensure that their content will be more easily preserved for the future. Including a sitemap, providing standard links in HTML/XHTML formats, and avoiding proprietary formats are all important considerations in producing a preservable website. For additional recommendations on creating preservable websites, please refer to the following useful guidelines compiled by our web archiving colleagues:

    Columbia University Libraries, Web Resources Collection Program: Guidelines for Preservable Websites

    Stanford University Libraries: Archivability

    UK Web Archive: Technical Information

    Smithsonian Institution Archives: Five Tips for Designing Preservable Websites
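
    As one small illustration of the sitemap recommendation above, the following Python sketch writes a minimal XML sitemap in the sitemaps.org 0.9 format using only the standard library. The function name and the example.org page addresses are hypothetical. This is a sketch of one possible approach, not a tool used by NYARC.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return an XML sitemap string listing each page URL in a <loc> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = page_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages of a small artist website.
pages = ["http://example.org/", "http://example.org/exhibitions.html"]
print(build_sitemap(pages))
```

    Saving the result as sitemap.xml at the root of a site gives crawlers an explicit list of pages, which helps when some pages are reachable only through scripts or forms.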
  • How do I get content removed?

    We will block public access to and remove content from the archive only in exceptional circumstances, such as on:

    1. official notification that the individual who granted permission did not have the authority to do so, either for the website or for specific content on the website that belongs to another rights holder
    2. notification that specific content has been removed from the website because of a legal challenge, e.g. libel

    If content stored with an external agent (such as the Internet Archive or DuraCloud) must be removed, we will use reasonable efforts to make sure this happens. Please email the NYARC web archive at webarchive@frick.org regarding requests for the removal of content.

  • Are other organizations doing similar work, and where can I learn more about their collections?

    Yes. There are a number of other organizations that are archiving websites, such as The Internet Archive, The Library of Congress, and Columbia University Libraries.
  • Do you have any workflow documentation that you can share?

    Yes. Documenting our policies and procedures is an ongoing process. We are very interested in sharing our practices with the web archiving community, librarians and archivists, and the general public. Please visit our wiki to learn more about the internal workflow of the NYARC web archiving program.