I really admire what the Internet Archive does, and I'd be very willing to donate to them, but for one caveat: IA's robots.txt policy retroactively applies the current robots.txt on a particular domain to the entire archive of previous captures for that domain. This means that if a website goes offline, and an unrelated third party later acquires the domain, and uses a new, restrictive robots.txt, then the older site is no longer accessible in the archive.
I know this may seem somewhat trivial, but I've run into this problem more than once, and I find that it undermines the value of the archive: if you're trying to preserve history, making that preservation contingent on the state of things in the present defeats your purpose.
I'd much rather see them adopt a more sensible policy of obeying whatever robots.txt is contemporaneous to each particular site capture.
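To make the difference concrete, here is a rough sketch of the two policies side by side. It is only an illustration, not how the IA actually implements anything: the function names, the `history` list of dated robots.txt snapshots, the example URL, and the `ia_archiver` user-agent string are all assumptions of mine. It uses Python's standard `urllib.robotparser` to evaluate the rules.

```python
from bisect import bisect_right
from datetime import date
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="ia_archiver"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def robots_as_of(history, when):
    """Pick the robots.txt snapshot in effect on `when`.

    `history` is a hypothetical list of (date, robots_txt) pairs sorted by
    date. If no snapshot exists yet, treat it as an empty file (allow all).
    """
    dates = [d for d, _ in history]
    i = bisect_right(dates, when)
    return history[i - 1][1] if i else ""


def visible_retroactive(history, capture_date, url):
    # The policy described above: the newest robots.txt governs every capture,
    # so the capture date is ignored.
    return allowed(history[-1][1], url)


def visible_contemporaneous(history, capture_date, url):
    # Suggested policy: each capture is judged by the robots.txt of its own time.
    return allowed(robots_as_of(history, capture_date), url)


# Made-up example: the original owner allowed crawling; a later owner blocks it.
history = [
    (date(2005, 1, 1), "User-agent: *\nDisallow:\n"),
    (date(2012, 6, 1), "User-agent: *\nDisallow: /\n"),
]
old_capture = date(2006, 3, 1)
url = "http://example.com/page"

print(visible_retroactive(history, old_capture, url))
print(visible_contemporaneous(history, old_capture, url))
```

With the made-up history above, the 2006 capture is hidden under the retroactive policy and stays visible under the contemporaneous one, which is exactly the case of a lapsed domain picked up by a new owner.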
If you donate to them through JustGive, be aware: after donating to the IA last year, I received several spammy e-mails over the last few months asking me to give to other, unrelated charities.
I always marvel at a service that could easily monetize its product or traffic, or even charge for tiered access to the archives (something I would pay for), but chooses instead to essentially beg for funds.
Wholesale 'archiving' (copying) of websites seems like a pretty gray area to play in. Of course, it's ridiculously important that someone does it, too. However, I suspect that a big reason they get away with it is that they're a not-for-profit organization.
Perhaps I'm not bold enough, but I'd be hesitant to try to 'monetize' the simple re-publication of other people's content.
Archive.org often archives internet harassment. What a pain it is for the victims. The people at Archive.org know this happens but don't say anything about it. So, let's add harassment to the copyright theft.
It's an interesting service and it should be done by a responsible and legal commercial entity.
Feel free to downvote me, but every dollar you spend on Archive.org could have gone to a worthy cause and helped someone. What a shame.