The Internet Archive is trying to raise funds to buy 3 petabytes of storage (thenextweb.com)
111 points by napolux on Dec 17, 2012 | 14 comments


I really admire what the Internet Archive does, and I'd be very willing to donate to them, but for one caveat: IA's robots.txt policy retroactively applies the current robots.txt on a particular domain to the entire archive of previous captures for that domain. This means that if a website goes offline, and an unrelated third party later acquires the domain, and uses a new, restrictive robots.txt, then the older site is no longer accessible in the archive.

I know this may seem somewhat trivial, but I've run into this problem more than once, and I find that it undermines the value of the archive: if you're trying to preserve history, making that preservation contingent on the state of things in the present defeats your purpose.

I'd much rather see them adopt a more sensible policy of obeying whatever robots.txt is contemporaneous to each particular site capture.
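
To make the difference between the two policies concrete, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. Everything in it is an assumption for illustration: the capture records, the `robots_at_capture` field, the example domain, and the `ia_archiver` user agent string are hypothetical, and this is not how the Internet Archive actually stores or filters captures.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: the original site allowed everything,
# the later domain owner blocks everything.
OLD_ROBOTS = "User-agent: *\nDisallow:\n"
NEW_ROBOTS = "User-agent: *\nDisallow: /\n"

# Hypothetical capture records; "robots_at_capture" is an assumed field
# holding the robots.txt that was live when the capture was made.
captures = [
    {"url": "http://example.com/index.html", "year": 2003, "robots_at_capture": OLD_ROBOTS},
    {"url": "http://example.com/index.html", "year": 2005, "robots_at_capture": OLD_ROBOTS},
    {"url": "http://example.com/index.html", "year": 2012, "robots_at_capture": NEW_ROBOTS},
]


def allowed(robots_txt, url, agent="ia_archiver"):
    """Return True if the given robots.txt text permits `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def visible_retroactive(captures, current_robots):
    """Policy described above: today's robots.txt gates every capture."""
    return [c for c in captures if allowed(current_robots, c["url"])]


def visible_contemporaneous(captures):
    """Proposed policy: each capture is gated by the robots.txt of its own time."""
    return [c for c in captures if allowed(c["robots_at_capture"], c["url"])]


if __name__ == "__main__":
    print([c["year"] for c in visible_retroactive(captures, NEW_ROBOTS)])
    print([c["year"] for c in visible_contemporaneous(captures)])
```

Under the retroactive policy the new owner's restrictive file hides all three captures, including the two made while the old site allowed crawling. Under the contemporaneous policy only the capture made after the restrictive file appeared is hidden.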


Direct link to the donation page: https://archive.org/donate/


And in case you're interested in the hardware, here are some details: http://archive.org/web/petabox.php


They also take donations in bitcoin. (something that python.org and the EFF can learn from)


If you donate to them through JustGive, be aware that after donating to the IA last year, I have received several spammy e-mails over the last few months asking me to give to other, unrelated charities.


It might be worth mentioning that this is part of an offer to match donations 3-for-1 until December 31st.


Relevant discussion on the same topic from last week: http://news.ycombinator.com/item?id=4901148

A worthy and commendable initiative. Chipped in what I could.

I had meant to follow this up last week, but it slipped my mind. So, thanks for posting this again. :-]


I always marvel in a service that could easily monetize their product or traffic but chooses instead to essentially beg for funds. They could even charge for tiered access to the archives (something I would pay for).


Wholesale 'archiving' (copying) of websites seems like a pretty gray area to play in. Of course, it's ridiculously important that someone does it, too. However, I suspect that a big reason they get away with it is that they're a not-for-profit organization.

Perhaps I'm not bold enough, but I'd be hesitant to try to 'monetize' the simple re-publication of other people's content.


Really? You "marvel" that Wikipedia doesn't charge for access? I hope I never become as "business cynical" as some people here.


"You "marvel" that Wikipedia doesn't charge for access?"

I'm not talking about Wikipedia, nor did I mention them. My comment is directed toward the Internet Archive, which is a different situation.


>I always marvel in a service that could easily monetize their product or traffic but chooses instead to essentially beg for funds.

Describes Wikipedia word-for-word.


Things to donate to:

Starving children. Orphans. Awareness about preventable disease. The environment. Human rights. Animal rights.

Things NOT to donate to:

Obscure copyright theft.


Archive.org often archives internet harassment. What a pain it is for the victims. The people at Archive.org know this happens but don't say anything about it. So, let's add harassment to the copyright theft.

It's an interesting service and it should be done by a responsible and legal commercial entity.

Feel free to downvote me, but every dollar you spend on Archive.org could've gone to a worthy cause and helped someone. What a shame.



