More 60-Second Science
Imagine that you want to send a secret message to your colleague at the CIA. You can encrypt it to prevent counterspies from reading it. But they'll still know you were sending some sort of message. There's a better option—steganography—which means "hidden writing." You tuck your secret away in an unexpected place, like an MP3 or a photo file, to conceal the fact you even shared any information at all.
But you can't hide much in those file types without it being suspiciously large. Executable files, aka applications, on the other hand, come in all sizes—making them an ideal place to embed lots of data. So two computer scientists created an algorithm that packages your encrypted data into an executable file of your choice. Then it spits out a new program that works just like the original—except that your secret cargo is inside. The research is in the International Journal of Internet Technology and Secured Transactions. [Rajesh Kumar Tiwari and G. Sahoo, "A Novel Steganographic Methodology for High-Capacity Data Hiding in Executable Files"]
Ideally, this handy trick would keep your data safe from prying eyes. Until somebody notices that you're sharing an inordinate number of executable files.
—Christopher Intagliata
[The above text is an exact transcript of this podcast.]





12 Comments
I was looking for the comments here, but I guess they are hidden somewhere ....
Technically, unless these "executable" files contain program instructions (I don't think they do), they are actually simply data files encoded to the specifications required by and associated by the operating system with a specific processing program (the windows media manager, for example). In this way processing the data file with the default decoding program can be accomplished simply by selecting the data file. These data files are not actually executable programs.
Successfully hiding information within these data files only requires that there be some method of specifying to the decoding program that some data not be processed. Processing programs often include a capability for including documentation or comments, for example.
This method of hiding data could be used for MP3, JPEG, MS Word or many other file formats whose formatting specifications are publicly available. Only the typical size of MP3 files offers any advantage for this "Trojan Horse" delivery method.
Jtdwyer, you didn't even read the article, did you?
I read it thoroughly before commenting.
Do you have some specific issue with or complaint against my comment that you can actually identify, or do you just randomly accuse commentators of not having read the article without cause or reason?
From the post:
"an algorithm that packages your encrypted data into an executable file of your choice. Then it spits out a new program that works just like the original".
Therefore, it does contain program instructions.
I see where the confusion is. That is what the article said but, based on my 35+ years programming computers, that's not correct.
The article mentions MP3 files or a photo file. I don't have any MP3 files handy but, (using Windows - I don't specifically know about MAC or Unix), if I right click on a .JPG file the menu offers to OPEN or OPEN WITH... OPEN WITH lists a bunch of APPLICATION PROGRAM FILES - containing EXECUTABLE PROGRAM INSTRUCTIONS that can be used to process the .JPG image data file. The .JPG image file does not contain any program instructions.
These standard interchange data files are intended to allow the exchange of data between computer systems regardless of their operating system or processor platform.
Executable programs are typically compiled to run on a specific OS and processor - that's why when you purchase an application software product, like MS Word for example, it will typically ask you to specify whether you want to buy the Windows or MAC or whatever version. Interpreted software is somewhat different, since uncompiled program source code is decoded and dynamically converted into executable program instructions.
MP3, MPG, JPG, etc. format files are encoded data files conforming to a public standard defined so that they can be processed by many application programs. I hope this makes sense in today's 'technical' language.
I have no doubt that 'secret' data, encrypted or not, could be hidden in MP3 or JPG or MPG files by specifying to the decoding programs in some way that they not be processed as decoding commands. Many if not all of these encoded data file formats support binary data. In an MPG file, for example, each data record begins with some decoding specifications, most often followed by binary data that is essentially a bit map image of the current frame. Conversely, that bit map could be the binary representation of the text in this comment or some other binary encoding or encrypted representation of this comment.
My original objection was to the improper and misleading references to MP3 & other encoded data files as executable application programs.
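As a rough illustration of the point above that image bytes can carry arbitrary binary data, here is a minimal Python sketch. It uses least-significant-bit embedding, a common textbook technique that is not the method described in the comment or in the paper. The function names `embed` and `extract`, and the flat `bytes` object standing in for decoded pixel values, are assumptions made for illustration.

```python
def embed(pixels: bytes, payload: bytes) -> bytes:
    """Hide payload in the lowest bit of each pixel byte (illustrative only)."""
    # Flatten the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover data is too small for this payload")
    out = bytearray(pixels)
    for index, bit in enumerate(bits):
        # Clear the lowest bit of the pixel byte, then set it to the payload bit.
        out[index] = (out[index] & 0xFE) | bit
    return bytes(out)


def extract(pixels: bytes, n_bytes: int) -> bytes:
    """Recover n_bytes of payload from the lowest bit of each pixel byte."""
    out = bytearray()
    for byte_index in range(n_bytes):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (pixels[byte_index * 8 + bit_index] & 1)
        out.append(value)
    return bytes(out)


# Hypothetical cover data: 256 "pixel" bytes, enough for a 32-byte payload.
cover = bytes(range(256))
stego = embed(cover, b"hi")
recovered = extract(stego, 2)
```

Each cover byte changes by at most 1, which is why this kind of change is hard to see in an image. It also shows the capacity limit the article mentions: one payload byte costs eight cover bytes.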
Wouldn't it be possible to just write a program that can detect all unread code in a file? Basically highlighting the secret info in the file.
If the decoding program was instructed to ignore or 'jump' around included data, as I initially suggested, that ignored data could be identified by a program.
If, however, the 'secret' data was encrypted or otherwise encoded in any way as binary data and then encoded as normal audio or image data, I think its presence might only be detectable when decoded normally, since the final product would essentially be a noise signal.
Of course, it might be possible for a binary encoding program to translate textual information into actual intelligible music and back again. That would be quite a program, though!
IMO, programmatic detection of this encoded binary data might be possible, but it would be very difficult and not likely to be very reliable. Actual decoding and/or decrypting of the secret data might be practically impossible.
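The first case in the comment above, data that a decoder simply skips, can be sketched for one simple situation: bytes appended after a JPEG file's end-of-image marker. This is a deliberately naive check under stated assumptions. The function name `trailing_bytes` is hypothetical, and a payload that itself contains the marker bytes would fool it.

```python
JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def trailing_bytes(data: bytes) -> bytes:
    """Return whatever follows the last end-of-image marker, if anything."""
    end = data.rfind(JPEG_EOI)
    if end == -1:
        raise ValueError("no JPEG end-of-image marker found")
    return data[end + len(JPEG_EOI):]


# Synthetic stand-in for a file: start marker, fake image data, end marker,
# then extra bytes that an image viewer would normally ignore.
fake_file = b"\xff\xd8" + b"image-data" + JPEG_EOI + b"hidden payload"
print(trailing_bytes(fake_file))  # b'hidden payload'
```

A check like this says nothing about the second case in the comment, where the payload is encoded inside the image or audio samples themselves.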
jtdwyer: based on my 5 minutes of actually reading the article, you really are missing the point!
Information hiding in data files is already well established: "You tuck your secret away in an unexpected place, like an MP3 or a photo file,"
But this is specifically about hiding information in *executable* files, not data files: "you can't hide much in those file types without it being suspiciously large. Executable files, aka applications, on the other hand..."
The article makes no confusion between the two. Read it again - it's quite short.
Thanks for explaining - you are correct. I simply couldn't accept that this article was making so much about something so trivial and suspect.
Now what I don't understand (it's the source of my confusion) is that if someone were concerned about the size of their MP3 or MPG files "being suspiciously large," wouldn't the exchange of large program files be even more suspicious? I understand that many people (millions?) often exchange JPEG, MPG and especially MP3 files, but how many people exchange large application program files?
IMO, exchanging dozens of large MP3 files would be less suspicious than exchanging one large program file. In fact, if someone sent a copy of MS Word that was significantly larger than the version distributed by MS, for example, it should be immediately suspect. I think that many commercially available computer security programs would flag oversize standard programs as a security threat.
Moreover, including data in programs is a standard feature of program assemblers, compilers and linkers. Producing source code containing any data to be compiled and included in a program is a very trivial programming task, not requiring that:
"computer scientists created an algorithm that packages your encrypted data into an executable file of your choice. Then it spits out a new program that works just like the original—except that your secret cargo is inside."
There is simply much less here than I could believe!
I have to agree with jtdwyer. It is a brilliant idea. However, it was already thought of more than 30 years ago by the first virus-writers. All you have to do is compare the modified executable file with one available from Microsoft. Or wherever. You might not be able to find out what the encrypted data says, but any decent anti-virus program should be able to tell you if you are looking at a suspect computer with data encrypted in this fashion. In fact, it is almost as obvious as putting a file called MySpySecrets.doc.gz on the desktop.
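The comparison suggested above can be sketched in a few lines of Python, assuming a trusted digest of the vendor's original file is available for exactly the same version and build. The function names are hypothetical; `hashlib.sha256` is part of the Python standard library.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_reference(candidate: bytes, reference_digest: str) -> bool:
    """True if the candidate file's digest equals the trusted reference digest."""
    return sha256_of(candidate) == reference_digest.lower()


# Synthetic stand-ins for a vendor's executable and a modified copy.
original = b"MZ...original program bytes..."
modified = original + b"embedded secret"
reference = sha256_of(original)
```

Any change to the file, including embedded data that leaves the program's behavior intact, changes the digest. The check only shows that the file differs from the reference, not what was added or why.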
The PE format was designed to hold non-executable data and allow arbitrary "padding" to support alignment, both of which have been used to store data in programs that have not claimed to be practising steganography.
PE executables generally hold significant amounts of non-executable data by design, such as icons, bitmaps, string tables, XML, etc.
Data bytes used for padding have also been examined in the commercial sector, esp. due to issues with older linkers that tended to store whatever happened to be stored in memory in these areas (leading to security concerns).
In fact the authors' reference to specific unused areas seems to demonstrate a misunderstanding of the flexibility of the underlying format, and consideration only of a rather common implementation (though the same "unused" sections described in the paper have also been used in industry by some PE packers I believe).
A well documented challenge to the authors' view of the PE format can be found here: http://www.phreedom.org/solar/code/tinype/
The executable described seems to break the very rules that the paper in question tries to work within (the article also provides reference to works that have not merged the headers).
On these grounds I do dispute that the research paper describes a novel technique, beyond suggesting that the data stored in the manner described be distributed and be encrypted (the latter I presume is obvious to someone trying to hide data).
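To make the alignment padding mentioned in the comment above concrete, here is a minimal Python sketch that reads a PE section table and reports, per section, how many bytes on disk exceed the section's declared virtual size. The field offsets follow the published PE/COFF layout. The function name `section_slack` is hypothetical, and this is only an illustration of where padding can exist, not the technique from the paper.

```python
import struct


def section_slack(pe_bytes: bytes) -> dict:
    """Map each section name to SizeOfRawData minus VirtualSize (never below 0)."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", pe_bytes, pe_offset + 6)
    (optional_size,) = struct.unpack_from("<H", pe_bytes, pe_offset + 20)
    table = pe_offset + 24 + optional_size  # start of the section table
    slack = {}
    for i in range(num_sections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = pe_bytes[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        (virtual_size,) = struct.unpack_from("<I", pe_bytes, entry + 8)
        (raw_size,) = struct.unpack_from("<I", pe_bytes, entry + 16)
        # Raw data is rounded up to the file alignment; the excess is padding.
        slack[name] = max(0, raw_size - virtual_size)
    return slack
```

Called on the bytes of a typical executable, this returns one entry per section, for example a `.text` entry whose value is the number of padding bytes at the end of the code section on disk.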