A note on the use of the 160 bit SHA and RIPEMD hashes.
Recently the 160 bit SHA hashes have
come under greater examination for possible weaknesses. As a result, it
was found that two messages/files/documents with the same hash could be
found in fewer operations than previously thought.
Immediately all of the prominent security experts called for the shift
from the 160 bit hashes to the 224 bit, 256 bit, 384 bit or 512 bit
hashes.
I personally think the call is premature for the following reasons.
Any message/file/document has an infinite number of other
messages/files/documents with the same hash value - agreed. But it is
not quite that simple. The simple step of including the size of the
original message/file/document explicitly in the text and hash of the
message/file/document precludes a forgery that adds null characters or
other "non-readable" characters on the end of the document or any other
"trick" that changes the size of the original to find another with the
same hash value. The size in bits of the original message/file/document
is included in the hash function, but this is done "implicitly" by the
software and is not controlled by the writer of the original
message/file/document.
By including the size in bytes in the message text, say at the
beginning and end of the message/file/document, any "fiddling" that
changes the message/file/document size can immediately be checked by
the user. This precludes many attempts to "forge" a
message/file/document.
Once the size in bytes of the message/file/document is limited to a
known size, then any attempt to change the size can be obvious and
easily detected. This simple step now limits any possible "forgeries"
from an infinite number to an easily calculated and known number, i.e.,
all possible combinations of 8 bit bytes equal in number to the number
of bytes in the original message/file/document. This number may be huge
depending on the size of the message/file/document, but it is still
finite.
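The explicit size step described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
marker format "[size: N bytes]", the function names, and the choice to
count only the body (not the markers themselves) are all hypothetical
and are not taken from any particular program.

```python
def wrap_with_size(body: str) -> str:
    """Return the body with its size in bytes declared at both ends.

    Assumption: the declared size counts only the body, not the markers.
    """
    size = len(body.encode("utf-8"))
    marker = f"[size: {size} bytes]"
    return f"{marker}\n{body}\n{marker}"


def check_declared_size(wrapped: str) -> bool:
    """True only if both markers agree with the actual size of the body."""
    first, _, rest = wrapped.partition("\n")   # leading marker line
    body, _, last = rest.rpartition("\n")      # trailing marker line
    expected = f"[size: {len(body.encode('utf-8'))} bytes]"
    return first == expected and last == expected


def same_size_candidates(size_in_bytes: int) -> int:
    """Number of distinct byte strings of exactly this size: 256 ** size."""
    return 256 ** size_in_bytes
```

A reader (or a script) can then reject any copy whose declared and
actual sizes disagree, for example one padded with null characters. The
count returned by same_size_candidates grows very quickly with the size,
but it is always a finite number.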
The number of possibilities can be limited even more by some simple
considerations.
- Any attempt to produce a random combination of bytes equal in
number to those of the original message/file/document may produce a
message/file/document with the same size and hash value, but the
probability of a random combination of bytes producing anything
intelligible is very small, especially when the random combination must
have the same hash value.
- If the message/file/document is produced in a simple text editor
for human reading and intelligibility, then this severely limits the
possible characters available for use. The full 256 possible byte
values are limited to fewer than 128 for the simple reason that any
greater value will be un-intelligible (this assumes English as the
language of the original document; for other languages, this may not
and probably will not be true). Further, the combination of bytes has
to be intelligible to the reader. The "forged" message not only has to
be intelligible, but it must also pertain to the topic of the original
message, i.e., the context of the "forgery" must be considered.
As an example, the sentence "The number of possibilities can be limited
even more by some simple considerations."
may be changed to a random sequence of bytes:
"qfthyuinjkliopdfbgtunjgdgynhdfgy ddse dghrefgh ebffbebveeveeveeb
dcevevevedvdddfvebvfd"
Suppose the two messages have the same number of bytes and
(hypothetically) the same hash value. The second message would still
very quickly be recognized as total nonsense.
- The same holds true for images. Try changing random bytes in a
"JPEG" file and viewing the result. Changing even one byte can make the
image un-renderable past the point of the first changed byte. Thus, an
alteration of images, especially compressed images, can be quickly
detected.
- The same is true of compressed text files, although depending on
the method of compression, single byte changes can be harder to
detect. But the probability of a single byte change producing an equal
hash value is probably vanishingly small, if not zero. Also, if a
single byte change is hard to detect, that means that the text of the
un-compressed message/file/document is unchanged when read. Although
with compressed files all 256 byte values are permissible, thus
expanding the number of possible forgeries to a larger but still finite
number, the limit on intelligibility and context again limits the
possibilities.
- If the message/file/document is produced in a sophisticated word
processor, then the possibility of making meaningful changes without
rendering the message/file/document totally un-renderable by the word
processing software becomes even more remote. Randomly changing text
bytes will soon produce a "forgery" that, when viewed in the context of
the original message/file/document, is total nonsense. Attempting to
randomly change the formatting commands in the document will produce
even weirder results that will be immediately obvious when viewed in
the context of the original message/file/document.
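The character-set restriction in the considerations above can be made
concrete with a small Python sketch. It is a rough illustration only:
the function names are hypothetical, and "printable 7-bit ASCII" (the
100 characters in Python's string.printable) is used here as an assumed
stand-in for "characters a plain English text file would contain".

```python
import string

# The 100 byte values Python treats as printable ASCII
# (letters, digits, punctuation and whitespace).
PRINTABLE = set(string.printable.encode("ascii"))


def looks_like_plain_english(data: bytes) -> bool:
    """Crude filter: every byte must be printable 7-bit ASCII."""
    return all(b in PRINTABLE for b in data)


def candidate_counts(size_in_bytes: int) -> tuple:
    """Return (all byte strings, printable-only strings) of this size."""
    return 256 ** size_in_bytes, len(PRINTABLE) ** size_in_bytes
```

Note that this filter only checks which bytes are used. The nonsense
string in the example above would pass it, since it consists of ordinary
letters and spaces; judging intelligibility and context is still left to
the human reader.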
Thus, the simple expedient of explicitly including the
message/file/document size in bytes in the text of the
message/file/document reduces the infinite number of possible
forgeries to an easily calculated, finite number of possibilities. When
any possible forgery is viewed in the context of the original document,
the possibility of a forgery passing inspection is almost vanishingly
small, if not zero.
The "forgeries" that I have seen to date rely on using document
formatting languages, e.g., PostScript, to include alternative contents
in a document. The creator of the document may selectively
change which portions of the document are displayed for viewing and
arrange that select portions are never displayed. By incorporating
select characters in the portions which are never displayed, the
creator of the document is able to create a document which displays
different contents to different viewers. The above method of
incorporating the size in the hash would not work against this type of
"forgery". If a "broken" hash is used for the document, the only real
defense is the one available to any user of secure
software: "know whom you are dealing with".
QTCrypt has taken this simple step from its inception. Any document
encrypted with QTCrypt, whether signed or not, includes the size of the
original document in the encryption as an explicit step. This size is
compared to the size of the decrypted document, and if the two sizes
differ, QTCrypt flags the decrypted document as useless.
QTCrypt has taken an additional step in version 6.0 of the software:
multiple hashes are used for all documents. By default QTCrypt uses six
Secure Hashes when encrypting and/or signing any document/file.
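The idea of using several hashes at once can be sketched with Python's
standard hashlib module. This is not QTCrypt's code, and the six
algorithms listed are an assumption chosen only because hashlib always
provides them; the note does not say which six hashes QTCrypt uses. The
function names are likewise hypothetical.

```python
import hashlib

# Assumption: six algorithms that hashlib guarantees on every platform.
# This is NOT a statement of which hashes QTCrypt actually uses.
ASSUMED_ALGORITHMS = (
    "sha224", "sha256", "sha384", "sha512", "sha3_256", "blake2b",
)


def multi_digest(data: bytes, algorithms=ASSUMED_ALGORITHMS) -> dict:
    """Compute one hex digest per algorithm over the same data."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


def matches_all(data: bytes, expected: dict) -> bool:
    """True only if every recorded digest matches the data."""
    return multi_digest(data, tuple(expected)) == expected
```

With a check like matches_all, a forged document would have to match
every recorded digest at the same time, not just one of them.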
QTCrypt cannot make any judgments about the context of the
decrypted document; that is left to the discretion of the user.
That leaves the same avenue for producing "forgeries" that has been
used for millennia: simple human deception.
Again, QTCrypt cannot protect against this. The user must be
responsible for protecting against human deception.
QTCrypt is open source so that the user can inspect the source for any
concealed "tricks". If the user cannot perform this task, then the user
can have a trusted agent perform this task to insure against possible
"tricks".
Security comes down to having "trusted" associates and business
partners. This has always been true and will be true no matter what
software devices the security experts invent.
Any software Secure Hash invented by the security experts will
eventually be compromised. This is true whether the hash length is 160
bits, 224 bits, 256 bits, 384 bits, 512 bits or any longer length, or
some security device other than a hash value, invented in the future.
If the security device can be computed in a reasonable time, then
technological advancement dictates that, on a purely theoretical
basis, the security device can be compromised when the computing power
advances sufficiently.
The time span between invention and compromise is shortening every year.
Note that I said on "a purely theoretical basis" above. The security
device may be compromised, but that doesn't mean that the human
intelligence needed to detect a forgery based on context has been
compromised.
Only the user can do that.
In spite of the above arguments, I have taken the simple and expedient
step that the 160 bit hashes will not be chosen when the user leaves
that choice to QTCrypt. The user can force their use and QTCrypt can
still compute their values, but they will not be used automatically by
QTCrypt.
In addition, if the user attempts to force the use of the 160 bit
hashes in the "qtkeys" program, the user will be warned that their use
is deprecated.
Also, the user can no longer force their use in a configuration file.
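The selection policy just described (160 bit hashes never chosen
automatically, a warning when the user forces them) might look roughly
like the following Python sketch. Everything in it is hypothetical: the
function name, the default list and the warning text are illustrative
assumptions, not QTCrypt's or qtkeys' actual behavior, and the
configuration file restriction is not modelled.

```python
import warnings

# The 160 bit hashes discussed in this note.
DEPRECATED_160_BIT = {"sha1", "ripemd160"}

# Assumption: a stand-in default list containing no 160 bit hashes.
ASSUMED_DEFAULTS = ("sha224", "sha256", "sha384", "sha512")


def choose_hashes(requested=None) -> tuple:
    """Pick hash names: defaults when unspecified, else the user's choice.

    A forced 160 bit hash is still honored, but a warning is issued.
    """
    if requested is None:
        return tuple(ASSUMED_DEFAULTS)
    for name in requested:
        if name in DEPRECATED_160_BIT:
            warnings.warn(
                f"{name} is a 160 bit hash; its use is deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
    return tuple(requested)
```

Calling choose_hashes() with no argument never returns a 160 bit hash,
while choose_hashes(["sha1"]) returns the forced choice and emits a
deprecation warning.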
In addition, as mentioned above, QTCrypt now uses multiple hashes when
encrypting and/or signing any file/document. By default, QTCrypt uses
six secure hashes for encrypting and/or signing. It is strongly
suggested that the user not use fewer than the default. As more secure
hashes are invented, QTCrypt will incorporate those hashes as feasible
and appropriate. Since QTCrypt is Open Source software, only open
source secure hashes will be incorporated.
© Terry D. Boldt 2005
All Rights Reserved
Last Updated: Nov. 08, 2005