SHA1 collisions proven possible

Coworker passed this on to me.

Looks like SHA1 hash collisions are now achievable in a reasonable time
period
https://shattered.io/

-Grant

Good thing we "secure" our routing protocols with MD5

:slight_smile:

MD5 on BGP considered Harmful.

:slight_smile:

:slight_smile:

More seriously: The attack (or at least as much as we can glean from the blog post) cannot find a collision (file with same hash) from an arbitrary file. The attack creates two files which have the same hash, which is scary, but not as bad as it could be.

For instance, someone cannot take Verisign’s root cert and create a cert which collides on SHA-1. Or at least we do not think they can. We’ll know in 90 days when Google releases the code.

"It is now practically possible to craft two colliding PDF files and obtain a
SHA-1 digital signature on the first PDF file which can also be abused as a
valid signature on the second PDF file."

So they're able to craft two objects that collide to the same unpredictable
hash, but *not* produce an object that collides to a pre-specified hash.
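
To make that distinction concrete, here is a toy sketch. It assumes a deliberately weakened hash (the first 16 bits of SHA-1, a size chosen purely so the search finishes quickly) and uses plain brute force, not the technique from the blog post. The names `toy_hash`, `find_collision` and `find_second_preimage` are made up for illustration. Finding any two inputs that collide takes on the order of 2^(n/2) hashes, while matching the hash of a fixed, pre-existing input takes on the order of 2^n.

```python
import hashlib
from itertools import count

BITS = 16  # toy digest size; real SHA-1 has 160 bits

def toy_hash(data: bytes) -> int:
    """First BITS bits of SHA-1 as an integer (deliberately weak)."""
    full = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return full >> (160 - BITS)

def find_collision():
    """Birthday search: any two distinct inputs with the same toy hash."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

def find_second_preimage(target: bytes):
    """Search for a different input that matches the hash of a fixed target."""
    want = toy_hash(target)
    for i in count():
        msg = b"try-%d" % i
        if msg != target and toy_hash(msg) == want:
            return msg, i + 1

a, b, collision_tries = find_collision()
forged, preimage_tries = find_second_preimage(b"existing signed document")
print("collision found after", collision_tries, "hashes")
print("second preimage found after", preimage_tries, "hashes")
```

On a 16-bit toy both searches finish in well under a second. The gap between the two counts is the point: it grows exponentially with the digest size.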

Exactly. This is just more sky-is-falling nonsense. Of course collisions exist. They occur in every hash function. It's only marginally noteworthy when someone finds a collision. It's neat that Google has found a way to generate a pair of files with the same hash -- at colossal computational cost! However, this in no way invalidates SHA-1 or documents signed by SHA-1. You still cannot take an existing document, modify it in a meaningful way, and keep the same hash.

[Nor can you generate a blob to match an arbitrary hash (which would be death of all bittorrent)]

It's actually pretty serious for Git and the banking markets, where SHA-1 is heavily used. Considering the wide adoption of Git, this is a serious issue that will only become tenfold worse over the years. Visible abuse will not be nearly as widely noticed as the initial shattering, but it will escalate over a much longer period.

Take it seriously? Why wouldn't you?!

We negotiate a contract with terms favorable to you. You sign it (or more
correctly, sign the SHA-1 hash of the document).

I then take your signed copy, take out the contract, splice in a different
version with terms favorable to me. Since the hash didn't change, your
signature on the second document remains valid.

I present it in court, and the judge says "you signed it, you're stuck with
the terms you signed".

I think that would count as "invalidates documents signed by SHA-1", don't you?

We just need to keep the likely timeline in mind.

As I saw someone say on Twitter today ... "don't panic, just deprecate".

Valerie Aurora's hash-lifecycle table is very informative (emphasis mine):

http://valerieaurora.org/hash.html

Reactions to stages in the life cycle of cryptographic hash functions

Stage: Initial proposal
  Expert reaction: Skepticism, don't recommend use in practice
  Programmer reaction: Wait to hear from the experts before adding to your crypto library
  Non-expert ("slashdotter") reaction: SHA-what?

Stage: Peer review
  Expert reaction: Moderate effort to find holes and garner an easy publication
  Programmer reaction: Used by a particularly adventurous developers for specific purposes
  Non-expert ("slashdotter") reaction: Name-drop the hash at cocktail parties to impress other geeks

Stage: General acceptance
  Expert reaction: Top-level researchers begin serious work on finding a weakness (and international fame)
  Programmer reaction: Even Microsoft is using the hash function now
  Non-expert ("slashdotter") reaction: Flame anyone who suggests the function may be broken in our lifetime

Stage: Minor weakness discovered
  Expert reaction: Massive downloads of turgid pre-prints from arXiv, calls for new hash functions
  Programmer reaction: Start reviewing other hash functions for replacement
  Non-expert ("slashdotter") reaction: Long semi-mathematical posts comparing the complexity of the attack to the number of protons in the universe

Stage: Serious weakness discovered
  Expert reaction: Tension-filled CRYPTO rump sessions! A full break is considered inevitable
  Programmer reaction: Migrate to new hash functions immediately, where necessary
  Non-expert ("slashdotter") reaction: Point out that no actual collisions have been found

Stage: *First collision found*
  Expert reaction: *Uncork the champagne! Interest in the details of the construction, but no surprise*
  Programmer reaction: *Gather around a co-worker's computer, comparing the colliding inputs and running the hash function on them*
  Non-expert ("slashdotter") reaction: *Explain why a simple collision attack is still useless, it's really the second pre-image attack that counts*

Stage: Meaningful collisions generated on home computer
  Expert reaction: How adorable! I'm busy trying to break this new hash function, though
  Programmer reaction: Send each other colliding X.509 certificates as pranks
  Non-expert ("slashdotter") reaction: Claim that you always knew it would be broken

Stage: Collisions generated by hand
  Expert reaction: Memorize as fun party trick for next faculty mixer
  Programmer reaction: Boggle
  Non-expert ("slashdotter") reaction: Try to remember how to do long division by hand

Stage: Assumed to be weak but no one bothers to break
  Expert reaction: No one is getting a publication out of breaking this
  Programmer reaction: What's this crypto library function for?
  Non-expert ("slashdotter") reaction: Update Pokemon Wikipedia pages

Royce

Depends on the format of the document. As was just pointed out, and as I almost posted earlier today, the fact that there are collisions in SHA-1, or in any hash that takes an arbitrary-length input and outputs a fixed-length string, should be no surprise to anyone. Infinite inputs yield a fixed number of possible outputs. There have to be collisions. Lots of them. The question then becomes: how hard is it to find or craft two inputs that give the same hash, or one input that gives the same hash as another? Doing this with PDFs that look similar, which can contain arbitrary bitmaps or other data, is kind of a cheat / parlor trick.

Doing it with an ASCII document, source code, or even something like a Word document (containing only text and formatting), and having it not be obvious upon inspection of the documents that the "imposter" document contains some "specific hash influencing 'gibberish'" would be far more disturbing.

Keep in mind that there's *lots* of stuff that people might want to sign
that aren't flat ASCII. For instance, the video that just came out of
that police officer's bodycam. If the "gibberish" is scattered across the
pixels, you'll never know.

And let's face it - if you need to do an inspection because you don't
trust the hash to have done its job - *the hash has failed to do its job*.

Doesn’t work that way.

According to the blog post, you can create two documents which have the same hash, but you do not know what that hash is until the algorithm finishes. You cannot create a document which matches a pre-existing hash, i.e. the one in the signed doc. Hence my comment that you can’t take Verisign’s root key and create a new key which matches the hash.

You missed the point. I generate *TWO* documents, with different terms but the
same hash. I don't care if it matches anything else's hash, as long as these two
documents have the same hash. I get you to sign the hash on the *ONE* document I present to you
that is favorable to you. I then take your signature and transfer it to the
*OTHER* document.

No, I can't create a collision to a document you produced, or do anything to a
document you already signed. But if I'm allowed to take it and make "minor
formatting changes", or if I can just make sure I have the last turn in the
back-and-forth negotiating... because the problem is if I can get you to sign a
plaintext of my choosing....
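
A toy sketch of that scenario, under loud assumptions: the digest is truncated to 16 bits so a brute-force search stands in for the real (far more expensive, identical-prefix) SHA-1 attack, an HMAC stands in for a real public-key signature, and every name, key and contract text below is hypothetical.

```python
import hashlib
import hmac
from itertools import count

BITS = 16  # toy digest size so the search finishes instantly

def toy_digest(doc: bytes) -> bytes:
    """First BITS bits of SHA-1; stands in for a hash with a practical collision attack."""
    return hashlib.sha1(doc).digest()[: BITS // 8]

def craft_pair(good: bytes, bad: bytes):
    """Attacker: vary a hidden nonce in both drafts until their digests collide."""
    good_seen, bad_seen = {}, {}
    for i in count():
        g = good + b"\n<!-- %d -->" % i
        b = bad + b"\n<!-- %d -->" % i
        good_seen[toy_digest(g)] = g
        bad_seen[toy_digest(b)] = b
        if toy_digest(g) in bad_seen:
            return g, bad_seen[toy_digest(g)]
        if toy_digest(b) in good_seen:
            return good_seen[toy_digest(b)], b

def sign(key: bytes, doc: bytes) -> bytes:
    """Victim signs the digest, not the document (HMAC stands in for a signature)."""
    return hmac.new(key, toy_digest(doc), hashlib.sha256).digest()

def verify(key: bytes, doc: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(key, doc), sig)

victim_key = b"hypothetical signing key"
fair, unfair = craft_pair(b"Buyer pays $100.", b"Buyer pays $100,000.")
signature = sign(victim_key, fair)            # the victim only ever sees `fair`
print(verify(victim_key, unfair, signature))  # prints True
```

The signature never touches the document body, only its digest, so it verifies for whichever of the two crafted documents is presented later.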

When you can do that in the timespan of weeks or days, get back to me. Today, it takes years to calculate a collision, and you have to start with a document specifically engineered to be modified. (such documents are easily spotted upon inspection: why does this word doc contain two documents?) You can't take any random document, modify it to say what you want, and keep the same hash. People still haven't been able to do that with MD5, and that's been "broken" for a long time.

This isn't a checksum or CRC. Changing bits in the input has an unpredictable effect on the output -- you have to do the entire hash calculation (or most of it); there is no instantaneous shortcut. They had to do 9 billion billion hashes to stumble on a solution, after all.

For example, one cannot recover an SSL certificate given only the hash (MD5 or SHA-1.) One cannot change the expiration date of an existing certificate while still maintaining the same hash.

The fact that modern technology can perform 9 billion billion hashes in a realistic time frame is worth noting. (That capability is usually wasted on bitcoin mining.)
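
For scale, a quick back-of-the-envelope sketch. The 2^63 figure is my reading of the "9 billion billion" number on the linked page. The 2^80 and 2^160 figures are the generic brute-force costs for a 160-bit digest. The hash rate is a purely hypothetical placeholder, not a measurement of any real hardware.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

SHATTERED = 2**63   # SHA-1 computations for the attack (the "9 billion billion")
BIRTHDAY = 2**80    # generic birthday collision search on a 160-bit digest
PREIMAGE = 2**160   # generic (second) preimage search on a 160-bit digest

def years_needed(hashes: int, hashes_per_second: float) -> float:
    """Wall-clock years for one device running at the given hash rate."""
    return hashes / hashes_per_second / SECONDS_PER_YEAR

print(f"{SHATTERED:.3e}")      # 9.223e+18
print(BIRTHDAY // SHATTERED)   # 131072, i.e. 2**17 times cheaper than brute force

rate = 1e9  # hypothetical: one billion SHA-1 computations per second per device
print(years_needed(SHATTERED, rate))
print(years_needed(PREIMAGE, rate))
```

The last two lines show why the collision is within reach of a large cluster while matching a pre-existing hash is not: the second number is larger by a factor of 2^97.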

I did miss the point. Thanks for setting me straight.

A couple things will make this slightly less useful for the attacker:
  1) How many people are not going to keep a copy? Once both docs are
     found to have the same hash, well, game over.

  2) The headers will be very strange indeed. The way this works is
     Google twiddled with the headers to make them look the same. That
     is probably pretty obvious if you look for it.

Oh, and third: Everyone should stop using SHA-1 anyway. :slight_smile:

When you can do that in the timespan of weeks or days, get back to me.
Today, it takes years to calculate a collision, and you have to start with
a document specifically engineered to be modified. (such documents are
easily spotted upon inspection: why does this word doc contain two
documents?)

That question never arises, because this word doc contains only one document.

The *OTHER* word doc also contains only one document.

You can't take any random document, modify it to say what you
want, and keep the same hash. People still haven't been able to do that
with MD5, and that's been "broken" for a long time.

That doesn't change the fact that if I can get you to sign a document I
present to you, I can still have lots of fun at your expense.

Stop thinking in the context of bits of fake news on your phone. Start thinking in the context of trans-national agreements that will soon be signed by such keys.

--lyndon

Especially if that "document" is a component of a ciphersuite exchange.

--Dave

One place that uses SHA-1 seems to be some banking gateways. They sign the
parameters of a request to authenticate it as valid, doing something like
"sha1( MerchantID . secureCode . TerminalID . amount . exponent . moneyCode )".
I have no idea how evil people would exploit collisions here, but I guess
banking will move to the next hash algorithm (SHA-256?) and deprecate this
one. This may affect "Mom and Pop online shops" more than bigger services.
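
A rough sketch of what such a request token might look like, next to one possible replacement. This is an assumption based only on the formula quoted above, not any particular gateway's API. The function names and all field values are hypothetical. HMAC-SHA256 is my suggestion for "the next hash algorithm", not something a gateway is known to have adopted.

```python
import hashlib
import hmac

def legacy_token(merchant_id, secure_code, terminal_id, amount, exponent, money_code):
    """Scheme as described above: SHA-1 over the bare concatenation of the fields."""
    joined = merchant_id + secure_code + terminal_id + amount + exponent + money_code
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

def hmac_token(merchant_id, secure_code, terminal_id, amount, exponent, money_code):
    """Possible replacement: HMAC-SHA256 keyed with the secret, fields delimited."""
    # Assumes "|" never appears inside a field value.
    message = "|".join([merchant_id, terminal_id, amount, exponent, money_code])
    return hmac.new(secure_code.encode("utf-8"), message.encode("utf-8"),
                    hashlib.sha256).hexdigest()

# Hypothetical values; none of these come from a real gateway.
a = ("M001", "s3cret", "T01", "1000", "2", "978")
b = ("M001", "s3cret", "T01", "100", "02", "978")

print(legacy_token(*a) == legacy_token(*b))  # True: field boundaries are ambiguous
print(hmac_token(*a) == hmac_token(*b))      # False
```

The ambiguity in the first comparison has nothing to do with SHA-1 collisions. It is a side effect of concatenating fields without delimiters, and it is worth fixing in the same migration.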

* valdis kletnieks:

We negotiate a contract with terms favorable to you. You sign it (or more
correctly, sign the SHA-1 hash of the document).

I then take your signed copy, take out the contract, splice in a different
version with terms favorable to me. Since the hash didn't change, your
signature on the second document remains valid.

I present it in court, and the judge says "you signed it, you're stuck with
the terms you signed".

I think that would count as "invalidates documents signed by SHA-1",
don't you?

The more immediate problem isn't that you get framed, but that someone
is insinuating that you might be framing *them*, i.e. invalidation of
existing signatures etc.

Regarding your original scenario: You have both copies, and it is
possible to analyze them and notice that they were carefully crafted
to exhibit the SHA-1 collision. So it should be clear that the party
who generated the document is up to no good, and the question now
is which party is responsible for the doctored document. This
scenario isn't much different from abusing complex file formats to
render the document differently in different contexts. There is more
reliable evidence here than there is with your average disputed
pen-and-paper signature.

Automated processing of SHA-1-hashed data might be a problem, though.
For example, a web page generator might skip proper HTML encoding if
the hash of a document fragment has a known SHA-1 (assuming that this
exact fragment has been checked earlier).

Certification signatures (such as those found in X.509 and DNSSEC) are
particularly at risk. For X.509, CAs can randomize the serial number
and avoid the shared prefix, which stops these attacks AFAIK. For
DNSSEC, you probably should verify that the DS records are meaningful
before signing the DS RRset. If I recall correctly, there is no good
way to inject randomness early into the signed data, maybe except
using invalid DS records which get sorted first.

❦ 23 February 2017 19:28 -0500, Jon Lewis <jlewis@lewis.org>:

Depends on the format of the document. As was just pointed out, and I
almost posted earlier today, that there are collisions in SHA-1, or
any hash that takes an arbitrary length input and outputs a fixed
length string, should be no surprise to anyone. Infinite inputs
yielding a fixed number of possible outputs. There have to be
collisions. Lots of them. The question then becomes how hard is it
to find or craft two inputs that give the same hash or one input that
gives the same hash as another? Doing this with PDFs that look
similar, which can contain arbitrary bitmaps or other data is kind of
a cheat / parlor trick.

Doing it with an ASCII document, source code, or even something like a
Word document (containing only text and formatting), and having it not
be obvious upon inspection of the documents that the "imposter"
document contains some "specific hash influencing 'gibberish'" would
be far more disturbing.

The collision is contained in about 128 bytes. It is easy to hide this
collision in almost any document. You need a common prefix between the
two documents, the collision, then anything you want (you still need a
lot of processing power to get the collision matching your document). It
is a weakness specific to SHA-1. Another same-length hash (like
RIPEMD-160) is not affected.
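
The "common prefix, collision, then anything you want" structure follows from how iterated hashes like SHA-1 chain their internal state from block to block. Below is a toy sketch of that property only, assuming a made-up iterated hash with 8-byte blocks and a 2-byte state (SHA-1 uses 64-byte blocks and a 20-byte state) and a brute-force search in place of the real attack. All function names are hypothetical.

```python
import hashlib
from itertools import count

BLOCK = 8  # toy block size in bytes
STATE = 2  # toy chaining-state size in bytes

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-1 of state plus block."""
    return hashlib.sha1(state + block).digest()[:STATE]

def absorb(state: bytes, data: bytes) -> bytes:
    """Feed block-aligned data through the compression function."""
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def toy_md(data: bytes) -> bytes:
    """Toy iterated hash: pad to a block boundary, then chain the blocks."""
    padded = data + b"\x80" + b"\x00" * (-(len(data) + 1) % BLOCK)
    return absorb(b"\x00" * STATE, padded)

def find_colliding_blocks(prefix: bytes):
    """Find two different blocks that give the same state after the prefix."""
    assert len(prefix) % BLOCK == 0
    mid = absorb(b"\x00" * STATE, prefix)
    seen = {}
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        out = compress(mid, block)
        if out in seen:
            return seen[out], block
        seen[out] = block

prefix = b"HEADER-0" * 2  # 16 bytes, block-aligned
c1, c2 = find_colliding_blocks(prefix)
suffix = b"any shared trailing content"
doc1 = prefix + c1 + suffix
doc2 = prefix + c2 + suffix
print(doc1 != doc2, toy_md(doc1) == toy_md(doc2))  # True True
```

Once the two states match after the colliding blocks, every later block is processed identically, so the suffix can be changed freely without redoing the search.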