60-Second Science

Program Hides Data in Executable Files

Hiding data in MP3s, for example, makes the file size noticeably larger. Now computer scientists have developed a system for hiding data in already large applications. Christopher Intagliata reports.















Imagine that you want to send a secret message to your colleague at the CIA. You can encrypt it to prevent counterspies from reading it. But they'll still know you were sending some sort of message. There's a better option—steganography—which means "hidden writing." You tuck your secret away in an unexpected place, like an MP3 or a photo file, to conceal the fact you even shared any information at all.

But you can't hide much in those file types without it being suspiciously large. Executable files, aka applications, on the other hand, come in all sizes—making them an ideal place to embed lots of data. So two computer scientists created an algorithm that packages your encrypted data into an executable file of your choice. Then it spits out a new program that works just like the original—except that your secret cargo is inside. The research is in the International Journal of Internet Technology and Secured Transactions. [Rajesh Kumar Tiwari and G. Sahoo, "A Novel Steganographic Methodology for High-Capacity Data Hiding in Executable Files"]

Ideally, this handy trick would keep your data safe from prying eyes. Until somebody notices that you're sharing an inordinate number of executable files.

—Christopher Intagliata

[The above text is an exact transcript of this podcast.]


12 Comments

  1. Butchfoote 01:41 PM 5/12/11

    I was looking for the comments here, but I guess they are hidden somewhere ....

  2. jtdwyer 02:03 PM 5/12/11

    Technically, unless these "executable" files contain program instructions (I don't think they do), they are actually simply data files encoded to the specifications required by and associated by the operating system with a specific processing program (the windows media manager, for example). In this way processing the data file with the default decoding program can be accomplished simply by selecting the data file. These data files are not actually executable programs.

    Successfully hiding information within these data files only requires that there be some method of specifying to the decoding program that some data not be processed. Processing programs often include a capability for including documentation or comments, for example.

    This method of hiding data could be used for MP3, JPEG, MS Word or many other file formats whose formatting specifications are publicly available. Only the typical size of MP3 files offers any advantage for this "Trojan Horse" delivery method.

  3. Trafalgar 05:38 PM 5/12/11

    Jtdwyer, you didn't even read the article, did you?

  4. jtdwyer in reply to Trafalgar 09:24 PM 5/12/11

    I read it thoroughly before commenting.

    Do you have some specific issue with or complaint against my comment that you can actually identify, or do you just randomly accuse commentators of not having read the article without cause or reason?

  5. madth3 in reply to jtdwyer 12:43 AM 5/13/11

    From the post:
    "an algorithm that packages your encrypted data into an executable file of your choice. Then it spits out a new program that works just like the original".

    Therefore, it does contain program instructions.

  6. jtdwyer in reply to madth3 02:42 AM 5/13/11

    I see where the confusion is. That is what the article said but, based on my 35+ years programming computers, that's not correct.

    The article mentions MP3 files or a photo file. I don't have any MP3 files handy but, (using Windows - I don't specifically know about MAC or Unix), if I right click on a .JPG file the menu offers to OPEN or OPEN WITH... OPEN WITH lists a bunch of APPLICATION PROGRAM FILES - containing EXECUTABLE PROGRAM INSTRUCTIONS that can be used to process the .JPG image data file. The .JPG image file does not contain any program instructions.

    These standard interchange data files are intended to allow the exchange of data between computer systems regardless of their operating system or processor platform.

    Executable programs are typically compiled to run on a specific OS and processor - that's why when you purchase an application software product, like MS Word for example, it will typically ask you to specify whether you want to buy the Windows or MAC or whatever version. Interpreted software is somewhat different, since uncompiled program source code is decoded and dynamically converted into executable program instructions.

    MP3, MPG, JPG, etc. format files are encoded data files conforming to a public standard defined so that they can be processed by many application programs. I hope this makes sense in today's 'technical' language.

    I have no doubt that 'secret' data, encrypted or not, could be hidden in MP3 or JPG or MPG files by specifying to the decoding programs in some way that they not be processed as decoding commands. Many if not all of these encoded data file formats support binary data. In an MPG file, for example, each data record begins with some decoding specifications, most often followed by binary data that is essentially a bit map image of the current frame. Conversely, that bit map could be the binary representation of the text in this comment or some other binary encoding or encrypted representation of this comment.
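
The idea of carrying a message inside ordinary-looking sample or pixel bytes can be sketched as follows. This is a minimal illustration of least-significant-bit (LSB) embedding, a common steganographic technique that is swapped in here for demonstration; it is not necessarily what the commenter has in mind and it is not the method from the paper. The function names `embed_lsb` and `extract_lsb` are hypothetical, and the cover data is treated as a raw sequence of 8-bit samples rather than a real MP3, JPG or MPG stream.

```python
def embed_lsb(samples: bytes, payload: bytes) -> bytes:
    """Hide payload in the least significant bit of successive sample bytes."""
    # Flatten the payload into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("cover data too small for payload")
    out = bytearray(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the sample, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_lsb(samples: bytes, payload_len: int) -> bytes:
    """Recover payload_len bytes from the lowest bit of each sample byte."""
    out = bytearray()
    for i in range(payload_len):
        value = 0
        for sample in samples[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)
```

Each cover byte changes by at most one, which is why the capacity is small: eight cover bytes are needed per hidden byte, and the receiver has to know the payload length in advance.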

    My original objection was to the improper and misleading references to MP3 & other encoded data files as executable application programs.

  7. RuFuS7 05:26 AM 5/13/11

    Wouldn't it be possible to just write a program that can detect all unread code in a file? Basically highlighting the secret info in the file.

  8. jtdwyer in reply to RuFuS7 07:47 AM 5/13/11

    If the decoding program was instructed to ignore or 'jump' around included data, as I initially suggested, that ignored data could be identified by a program.
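
As a rough sketch of how such ignored data might be spotted, take JPEG as an assumed example: a JPEG stream ends with the end-of-image marker `FF D9`, and anything after it is typically skipped by decoders. The helper name `trailing_data` is hypothetical, and searching for the last marker is only a heuristic, since a hidden payload could itself contain those two bytes.

```python
JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def trailing_data(jpeg_bytes: bytes) -> bytes:
    """Return whatever follows the last end-of-image marker (heuristic)."""
    end = jpeg_bytes.rfind(JPEG_EOI)
    if end == -1:
        raise ValueError("no JPEG end-of-image marker found")
    # Bytes past the marker are not part of the image and may be hidden data.
    return jpeg_bytes[end + len(JPEG_EOI):]
```

A non-empty result does not prove that a secret is present, only that the file carries bytes a decoder would not normally read.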

    If, however, the 'secret' data was encrypted or otherwise encoded in any way as binary data and then encoded as normal audio or image data, I think its presence might only be detectable when decoded normally, since the final product would essentially be a noise signal.

    Of course, it might be possible for a binary encoding program to translate textual information into actual intelligible music and back again. That would be quite a program, though!

    IMO, programmatic detection of this encoded binary data might be possible, but very difficult and not likely to be very reliable. Actual decoding and/or decrypting of the secret data might be practically impossible.

  9. mp1086 11:29 AM 5/13/11

    jtdwyer: based on my 5 minutes of actually reading the article, you really are missing the point!

    Information hiding in data files is already well established: "You tuck your secret away in an unexpected place, like an MP3 or a photo file,"

    But this is specifically about hiding information in *executable* files, not data files: "you can't hide much in those file types without it being suspiciously large. Executable files, aka applications, on the other hand..."

    The article makes no confusion between the two. Read it again - it's quite short.

  10. jtdwyer in reply to mp1086 01:14 PM 5/13/11

    Thanks for explaining - you are correct. I simply couldn't accept that this article was making so much about something so trivial and suspect.

    Now what I don't understand (it's the source of my confusion) is that if someone were concerned about the size of their MP3 or MPG files "being suspiciously large," wouldn't the exchange of large program files be even more suspicious? I understand that many people (millions?) often exchange JPEG, MPG and especially MP3 files, but how many people exchange large application program files?

    IMO, exchanging dozens of large MP3 files would be less suspicious than exchanging one large program file. In fact, if someone sent a copy of MS Word that was significantly larger than the version distributed by MS, for example, it should be immediately suspect. I think that many commercially available computer security programs would flag oversize standard programs as a security threat.

    Moreover, including data in programs is a standard feature of program assemblers, compilers and linkers. Producing source code containing any data to be compiled and included in a program is a very trivial programming task, not requiring that:
    "computer scientists created an algorithm that packages your encrypted data into an executable file of your choice. Then it spits out a new program that works just like the original—except that your secret cargo is inside."

    There is simply much less here than I could believe!

  11. byronraum 04:17 PM 5/14/11

    I have to agree with jtdwyer. It is a brilliant idea. However, it was already thought of more than 30 years ago by the first virus-writers. All you have to do is compare the modified executable file with one available from Microsoft. Or wherever. You might not be able to find out what the encrypted data says, but any decent anti-virus program should be able to tell you if you are looking at a suspect computer with data encrypted in this fashion. In fact, it is almost as obvious as putting a file called MySpySecrets.doc.gz on the desktop.
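
The comparison described above can be sketched with a cryptographic hash, assuming a trusted reference copy of the executable is available. The function names `sha256_of` and `differs_from_reference` are hypothetical; only Python's standard `hashlib` module is used.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differs_from_reference(suspect_path: str, reference_path: str) -> bool:
    """True if the suspect file is not byte-identical to the reference copy."""
    return sha256_of(suspect_path) != sha256_of(reference_path)
```

A mismatch only shows that the file was altered, not why; a legitimate patch or a different build would trigger it as well.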

  12. nat42 08:09 AM 6/22/11

    The PE format was designed to hold non-executable data and allow arbitrary "padding" to support alignment, both of which have been used to store data in programs that have not claimed to be practising steganography.

    PE executables generally hold significant amounts of non-executable data by design, such as icons, bitmaps, string tables, XML, etc.

    Data bytes used for padding have also been examined in the commercial sector, esp. due to issues with older linkers that tended to store whatever happened to be stored in memory in these areas (leading to security concerns).
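
To make the padding point concrete, here is a minimal sketch that lists long runs of zero bytes in a binary blob as candidate padding regions. It assumes padding is zero-filled, which, as noted above, was not always true of older linkers, and it does not parse PE headers at all; a real tool would read the section table instead. The name `find_zero_runs` and the default threshold of 16 bytes are hypothetical choices.

```python
def find_zero_runs(data: bytes, min_len: int = 16) -> list[tuple[int, int]]:
    """Return (offset, length) for each run of zero bytes at least min_len long."""
    runs = []
    start = None  # offset where the current zero run began, if any
    for offset, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = offset
        elif start is not None:
            if offset - start >= min_len:
                runs.append((start, offset - start))
            start = None
    # Handle a run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs
```

Regions found this way are exactly the places where the absence of zeros in a suspect file would stand out when compared with a clean build.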

    In fact the authors' reference to specific unused areas seems to demonstrate a misunderstanding of the flexibility of the underlying format, and consideration only of a rather common implementation (though the same "unused" sections described in the paper have also been used in industry by some PE packers I believe).

    A well documented challenge to the authors' view of the PE format can be found here: http://www.phreedom.org/solar/code/tinype/
    The executable described seems to break the very rules that the paper in question tries to work within (the article also provides reference to works that have not merged the headers).

    On these grounds I do dispute that the research paper describes a novel technique, beyond suggesting that the data stored in the manner described be distributed and be encrypted (the latter I presume is obvious to someone trying to hide data).
