The Internet Archive’s “Archive Team” announced recently that it has begun saving public content from Google’s failed social network.
Last week we heard about MySpace’s little mistake that ended in a huge tranche of MySpace content vanishing into thin air. MySpace “lost” 12 years of Internet history, never to be seen again. That’s the thing about data loss: it tends to be final. In any case, another big set of data will be deleted very soon, as Google recently announced that it is shutting down Google+.
The Internet Archive, the team behind the Wayback Machine, really wants to save as much as it can before the content is removed forever, so it’s running a project to archive all public content from the platform. The Internet Archive, a digital library with the mission of “universal access to all knowledge,” also preserves texts, audio, video, software, and other formats from the Internet.
The organisation’s Archive Team does its best “to save the history before it’s lost forever,” so when it learned that Google+ was shutting down, it began sharing information and planning its approach, after collecting copious amounts of data on the site’s size, activity, profiles, communities, and other characteristics.
The code used to scrape information from Google+ is available on GitHub.
There are a few limitations: For example, only public content that is presently available on Google+ is being included. Private posts and any previously deleted content will not be saved. However, previously saved content that’s since been deleted will be available. Also, full post comments may not be archived. Google+ allows up to 500 comments per post, but only presents a subset of these as static HTML. It’s also not clear that long discussion threads will be preserved. Historically they have not been.
Image and video content might not be preserved at full resolution, but this limitation applies mostly to high-definition image and video content. Photographers might want to be aware of this.
Finally, as the team explains, “content archival is subject to the rate at which the project can proceed and any limitations imposed outside its control,” commenting that “general success is likely.”