Wanted: volunteers with bandwidth/storage to help save climate data

This is a short-term (about one month) project being thrown together
in a hurry...and it could use some help. I know that some of
you have lots of resources to throw at this, so if you have an
interest in preserving a lot of scientific research data, I've set
up a mailing list to coordinate IT efforts to help out. Sign up via
climatedata-request@firemountain.net or, if you prefer Mailman's web
interface, http://www.firemountain.net/mailman/listinfo/climatedata
should work.

Thanks,
---rsk

If you’re interested, there’s also a Slack team: climatemirror.slack.com

You can find more info about that here:

- https://climate.daknob.net/
- http://climatemirror.org/
- http://www.ppehlab.org/datarefuge

Thank you for your help!

See also:

https://twitter.com/textfiles/status/808715999042117632

https://twitter.com/textfiles/status/808922272551550976
    Jason Scott‏@textfiles
    When your boss gives you the goahead to mirror 200tb of NOAA data,
you run with it

Don't let the fact that The Internet Archive is all over this deter
you, though. Coordinate here:

https://docs.google.com/spreadsheets/d/12-__RqTqQxuxHNOln3H5ciVztsDMJcZ2SVs1BrfqYCc/edit#gid=0

Royce

Surfing through the links - any hints on how big these datasets are? Everyone's got
a few TB to throw at things, but fewer of us have spare PB to throw around.

There's some random #s on the goog doc sheet for sizes (100's of TB for the
landsat archive seems credible), and there's one number that destroys
credibility of the sheet (100000000000 GB (100 ZB)) for the EPA archive.
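
For reference, a quick sanity check of that figure. This is only a sketch: it assumes decimal (SI) units, where 1 GB = 10^9 bytes, and the helper name humanize is made up for illustration.

```python
# Decimal (SI) byte units are assumed here; binary units (GiB etc.) would differ slightly.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]

def humanize(num_bytes: int) -> str:
    """Express a byte count in the largest SI unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000

sheet_value_gb = 100_000_000_000          # the GB figure quoted from the sheet
print(humanize(sheet_value_gb * 10**9))   # 1e11 GB = 1e20 bytes
```

Under that assumption the GB figure works out to 100 EB, so the sheet's "100 ZB" parenthetical disagrees with its own GB number by a factor of 1000.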

The other page has many 'TBA' entries for size.

Not sure what level of player one needs to be to be able to serve a useful
segment of these archives. I realize some of the datasets are tiny (<GB)
but which ones are most important vs size (i.e. the win-per-byte ratio) isn't indicated.
(I know it's early times.)

Also I hope they've SHA512'd the datasets for authenticity before all these
myriad copies being flung about are 'accused' of being manipulated 'to promote
the climate change agenda' yadda.

Canada: time to step up! (Can't imagine the Natl Research Council would do so
on their mirror site, too much of a gloves-off slap in the face to Trump.)

/kc

We are currently working on a scheme to successfully authenticate and verify the integrity of the data. Datasets in https://climate.daknob.net/ are compressed to a .tar.bz2 and then hashed using SHA-256. The final file with all checksums is then signed using a set of PGP keys.
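
As a rough illustration of the hashing step described above (not the project's actual tooling), the checksum file could be produced along these lines. The function names, the "*.tar.bz2" glob, and the "hexdigest  filename" line layout are assumptions made for this sketch.

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large archives never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(archive_dir: Path, manifest: Path) -> int:
    """Write one 'hexdigest  filename' line per .tar.bz2 archive; return the count."""
    archives = sorted(archive_dir.glob("*.tar.bz2"))
    lines = [f"{sha256_file(p)}  {p.name}" for p in archives]
    manifest.write_text("\n".join(lines) + "\n")
    return len(archives)

# Hypothetical usage: write_manifest(Path("datasets"), Path("SHA256SUMS"))
```

The resulting file is what would then be signed, for example with a detached PGP signature, so that mirrors only need to trust the signing keys rather than each individual copy.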

We are still working on a viable way to verify the authenticity of files before there are tons of copies lying around. There’s a working group in the Slack team I mentioned previously, where your input is much needed!

Thanks,
Antonios

University of Toronto's Robarts Library is hosting an all-day party tomorrow for
people to surf and help identify datasets, survey them for size and details,
authenticate copies, etc.

fb event: https://www.facebook.com/events/1828129627464671/

/kc

It would seem like the more copies the better. Chunking this data
up and using .torrent files may be a way to both (a) ensure the integrity
of the data, and (b) provide an additional method to ensure that there are
enough copies being replicated (initial seeders would hopefully retain the
data for as long as possible)...
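
To make the integrity point concrete: the original (v1) BitTorrent format records a SHA-1 hash for each fixed-size piece, so a downloader can reject a bad piece without trusting the peer that sent it. Below is a minimal sketch of that idea only, not a .torrent generator; the function names and the 256 KiB piece length are arbitrary choices for illustration.

```python
import hashlib
from typing import BinaryIO, List

def piece_hashes(stream: BinaryIO, piece_length: int = 256 * 1024) -> List[bytes]:
    """Split a stream into fixed-size pieces and return the SHA-1 digest of each."""
    hashes = []
    while True:
        piece = stream.read(piece_length)
        if not piece:          # end of stream
            break
        hashes.append(hashlib.sha1(piece).digest())
    return hashes

def verify_piece(piece: bytes, index: int, hashes: List[bytes]) -> bool:
    """Check a received piece against the digest recorded for its position."""
    return hashlib.sha1(piece).digest() == hashes[index]
```

A real deployment would use an existing torrent client or library to build the metainfo file; the sketch only shows why a corrupted or altered chunk is detectable.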

I seriously doubt that there's going to be a witchhunt even close to as well
funded as anti-torrent DMCA-wielding piracy hunters, and it's not even nearly
the same as keeping a copy of wikileaks.aes, or satellite photos of
Streisand's campus, or photos of Elian Gonzalez, a copy of deCSS, the cyberSitter
killfile, etc ("we've been here before.").

The issue will be 1000s of half copies that are from differing dates sometimes
with no timestamps or other metadata, no SHA256 sums, etc etc. It's going to be
a records management nightmare. Remember, all these agencies won't be shut down
on Jan 20th, making that the universal time-stamp date. Some of them may even
be encouraged to continue producing data, possibly even cherry picked or otherwise
tainted. Others will carry on quietly, without the administration noticing.

I'm glad some serious orgs are getting into it - U of T, archive.org, wikipedia, etc.
We'll have at least a few repos that cross-agree on provenance, date, sha256, etc.
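
One way such cross-agreement could be checked, sketched under the assumption that each repo publishes a plain "hexdigest  filename" checksum list (the mirror names and function names here are hypothetical):

```python
from collections import defaultdict
from typing import Dict, List

def parse_manifest(text: str) -> Dict[str, str]:
    """Parse 'hexdigest  filename' lines into {filename: hexdigest}."""
    entries = {}
    for line in text.splitlines():
        if line.strip():
            digest, name = line.split(maxsplit=1)
            entries[name.strip()] = digest.lower()
    return entries

def cross_check(manifests: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, List[str]]]:
    """For each filename, group mirror names by the digest they report."""
    report = defaultdict(lambda: defaultdict(list))
    for mirror, entries in manifests.items():
        for name, digest in entries.items():
            report[name][digest].append(mirror)
    return {name: dict(groups) for name, groups in report.items()}

def disputed(report: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Filenames for which the mirrors report more than one distinct digest."""
    return sorted(name for name, groups in report.items() if len(groups) > 1)
```

Anything listed by disputed() is where the dates and other metadata would need a closer look; files on which several independent repos agree are the ones least open to accusations of tampering.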

Only once jackboots are knocking on doors "where's the icecore sample data, Lebowski!"
will we really have to consider the quality levels of the other repos. Not that
they shouldn't be kept either, of course.

Remember, this is only one piece of the puzzle. The scientists can do as much data-
collecting as they want -- if the political side of the process wants to make 'mentioning
climate change illegal' in state bills or other policies or department missions,
it's far more effective than rm'ing a buncha datasets.

http://abcnews.go.com/US/north-carolina-bans-latest-science-rising-sea-level/story?id=16913782

Nonetheless - mirror everything everywhere always...

/kc

North Carolina is not banning science. It is banning absolutely preposterous and manipulated junk science.

A 39-inch rise in the ocean levels over the next century is based on fear-mongering and junk science designed to scare politicians into increasing grant $$ from the federal government. It is not based on science.

In fact, the sea levels continue to rise at the SAME TINY 2-4mm per year that they've been rising at for decades, with ZERO sign of an increase.

If global warming was real and cumulative - this shouldn't even be possible, based on all that we've been told over the past 20 years.

Every article that states that oceans are rising at alarmingly faster rates - due to global warming - either lies about or manipulates the data... or grabs one relatively small short-term spike and extrapolates from that.

Meanwhile, dozens of sea-level rise predictions from so-called credible scientists have not only failed, but failed by orders of magnitude, and again, relied upon junk science. True science makes "risky predictions" and is willing to throw out the theory when that theory's "risky predictions" don't come true.

But I truly do hope that this collection process is successful because I hope that ALL of this (mostly) manipulated data gets recorded for posterity so that (honest) scientists a century from now can do extensive studies on how/why science became so political and manipulated as they look back on the first few decades of the 21st century's slide into a strong long-term cooling trend, due to long term cyclical sun cycles.

This is not a victimless crime. This manipulation of the data by global warmongers harms people because it miscalculates resources and damages the economy. Does that mean we should spew toxic waste into rivers or streams or spew smog into the air? Of course not. But global warming and CO2 being a cause of it... and "oceans rising" has MUCH junk science behind it.

Still, I hope this data is preserved. The truth will win out in the long term. (as is already starting to happen)

This started as a technical appeal, but:

https://www.nanog.org/list

1. Discussion will focus on Internet operational and technical issues as described in the charter of NANOG.
...
6. Postings of political, philosophical, and legal nature are prohibited.
...

EXACTLY - but I had to finally respond because it was getting obnoxious... all the "we all think this way and we KNOW that the other side is wrong"--implications/statements embedded in various previous posts.

39 inches?
I'm going to start laying fiber up and down I-5. I'll have the
cheapest trans-ocean cable between Canada and Mexico...

-A

http://climate.nasa.gov/evidence/

What sort of effects do you reckon a 35% increase in atmospheric CO2 over historical levels over the space of 65 years might lead to?

- Mark

How much data are we talking about here? A few floppy disks? A couple
of megabytes? Gigabytes? Terabytes? Petabytes?

Have you considered giving "courtesy copies" to other environmental
organisations such as Environment Canada, Australian Bureau of
Meteorology, etc.?

I guess at long last it is time for Larry to stop thinking there was a common interest here.

NANOG has gone completely into the weeds (my email client treats it as political spam).

Sad--once upon a time it was a home for science in an insane academic world.

Hard to see how the OP has anything to do with either of the above.

Actually, it's not that hard ... *if* we can control ourselves from
making them partisan, and focus instead on the operational aspects.
(Admittedly, that's pretty hard!)

The OP's query was a logical combination of two concepts:

- First, from the charter (emphasis mine): "NANOG provides a forum
where people from the network research community, the network operator
community and the network vendor community can come together *to
identify and solve the problems that arise in operating and growing
the Internet*."

- Second, from John Gilmore: "The Net interprets censorship as damage
and routes around it."

The OP appears to be managing risk associated with a (perhaps low)
chance of future censorship. Was the OP asking a straight question
about BGP or SFPs or CDNs? Of course not. But should doctors only talk
about surgical technique -- and not about, say, the need for a living
will? Of course not.

IMO, *operational, politics-free* discussion of items like these would
also be on topic for NANOG:

- Some *operational* workarounds for country-wide blocking of
Facebook, Whatsapp, and Twitter [1], or Signal [2]

- The *operational* challenges of replicating the Internet Archive to Canada [3]

Each operator has to make such risk calculations for themselves. Some
may see the "NA" in NANOG as insurance that such censorship could
never happen here. Others -- especially those who came from other
countries -- may feel differently.

Put another way:

Everyone has a line at which "I don't care what's in the pipes, I just
work here" changes into something more actionable. Being
*operationally* ready for that day seems like a good idea to me.

Royce

1. http://www.telegraph.co.uk/technology/2016/12/20/turkey-blocks-access-facebook-twitter-whatsapp-following-ambassadors/
2. http://www.nytimes.com/aponline/2016/12/20/world/middleeast/ap-ml-egypt-app-blocked.html
3. https://blog.archive.org/2016/11/29/help-us-keep-the-archive-free-accessible-and-private/

Everyone has a line at which "I don't care what's in the pipes, I just
work here" changes into something more actionable.

Stretched far beyond any credibility. Your argument boils down to, "If it's a political thing that *I* like, it's on topic."

[..]
  >>Everyone has a line at which "I don't care what's in the pipes, I just
  >>work here" changes into something more actionable.
  >
  >Stretched far beyond any credibility. Your argument boils down to, "If it's
  >a political thing that *I* like, it's on topic."

"If it's a politically-generated thing I'll have to deal with at an
operational level, it's on topic."

That work?

/kc

[..]
  >>Everyone has a line at which "I don't care what's in the pipes, I just
  >>work here" changes into something more actionable.
  >
  >Stretched far beyond any credibility. Your argument boils down to, "If it's
  >a political thing that *I* like, it's on topic."

I can see why you've concluded that. My final phrasing was indeed
ambiguous. I would have hoped that the rest of my carefully
non-partisan post would have offset that ambiguity.

"If it's a politically-generated thing I'll have to deal with at an
operational level, it's on topic."

That work?

That is indeed what I was trying to say - thanks, Ken.

Royce