Friday, July 29, 2005

Hide data in the stream

Streams are a facility built into NTFS that attaches metadata to files; this facility is similar to the extended attributes in an OS/2 file system. Unfortunately, using the term streams could be confusing because it's so overloaded (especially when it comes to files).

A new definition for streams

A file on an NTFS volume is composed of a primary stream and zero or more secondary streams (Microsoft calls these alternate data streams). The primary stream is the data that you normally access, while the secondary streams reside in parallel with the primary stream. Unlike the primary stream (which is unnamed), each secondary stream has a name that is unique within the file. A secondary stream can hold any amount of any kind of data. However, streams are only available on the NTFS file system, so if a file with secondary streams is moved to another file system, you'll lose the secondary streams.

Secondary streams are invisible

Secondary streams are invisible to both Windows Explorer and the console. In fact, Explorer and the console report only the size of the primary stream, so the space that secondary streams consume doesn't show up in any file's size. Streams are only partially implemented in Windows even though they've been around since Windows NT 3.1. For Explorer to be stream aware, you need to install an Explorer add-on.

Accessing secondary streams

You can access secondary streams with the standard CreateFile, ReadFile, and WriteFile Win32 APIs, or with any other API (such as MFC's CFile) or executable that uses these functions for low-level file access. To access a secondary stream, append a colon followed by the name of the secondary stream to the file name. Stream names are case insensitive, just like file names.
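As a rough illustration of the naming rule, here is a minimal Python sketch. The helpers stream_path and same_stream are hypothetical names of my own, not part of any Windows or Python API, and the lower-case comparison only approximates how NTFS ignores case.

```python
def stream_path(file_name, stream_name):
    """Return 'file:stream'. Hypothetical helper, not a Windows API."""
    if not stream_name or ":" in stream_name:
        raise ValueError("stream name must be non-empty and contain no colon")
    # A one-letter file name such as "a" would make "a:title" look like a
    # path on drive A:, so anchor it to the current directory first.
    if len(file_name) == 1 and file_name.isalpha():
        file_name = ".\\" + file_name
    return file_name + ":" + stream_name


def same_stream(path_a, path_b):
    """Approximate the default NTFS comparison by ignoring case."""
    return path_a.lower() == path_b.lower()


print(stream_path("bar.txt", "title"))
print(same_stream("bar.txt:title", "BAR.TXT:Title"))
```

The resulting string can be passed to CreateFile or to Python's open() just like an ordinary file name.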

Here's the command you use when writing to the stream from the console:

echo foo > bar.txt:title

(Note about the above command: You can't read the stream back from the console with the type command, but redirecting the stream into more does work: more < bar.txt:title)

Here's the code you use when writing to the secondary stream, title, in C:

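#include <windows.h>

/* Inside a function: open (or create) the "title" stream on bar.txt and write "foo". */
DWORD BytesWritten = 0;
const DWORD BufferLength = 3;
HANDLE hFile = CreateFile(TEXT("bar.txt:title"), GENERIC_WRITE,
    0, NULL, OPEN_ALWAYS, 0, NULL);
if (hFile != INVALID_HANDLE_VALUE) {
    WriteFile(hFile, "foo", BufferLength, &BytesWritten, NULL);
    CloseHandle(hFile);
}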

Here's the code you use when writing to the secondary stream in Python:

# Open the "title" stream on bar.txt for writing; the with block closes it.
with open("bar.txt:title", "w") as f:
    f.write("foo")

Below are some examples of reading the bar.txt:title stream. Specifically, here's the code for reading the stream in C:

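#include <windows.h>

/* Inside a function: open the existing "title" stream and read up to 25 bytes. */
enum { BufferSize = 25 };  /* a constant expression, so the array size is valid C */
DWORD BytesRead = 0;
BYTE buffer[BufferSize];
HANDLE hFile = CreateFile(TEXT("bar.txt:title"), GENERIC_READ, 0, NULL,
    OPEN_EXISTING, 0, NULL);
if (hFile != INVALID_HANDLE_VALUE) {
    ReadFile(hFile, buffer, BufferSize, &BytesRead, NULL);
    CloseHandle(hFile);
}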

Here's the code you use when reading the stream in Python:

# Open the "title" stream on bar.txt for reading and print its contents.
with open("bar.txt:title", "r") as f:
    print(f.read())

The potential for misuse

If you start writing large amounts of data to secondary streams, you're going to create problems for users of the software. Their disks will fill up, yet the file sizes that Explorer and the console report won't account for the space, so users have no obvious way to see where it went. When you use streams in moderation, they can be invaluable tools for keeping key configuration data out of sight of the user.
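The size mismatch is easy to see for yourself. Here is a minimal Python sketch, assuming the temporary directory sits on an NTFS volume (on another file system the colon name may fail or simply create a separate file). The function name sizes_after_stream_write and the 4096-byte payload are my own choices for illustration.

```python
import os
import tempfile


def sizes_after_stream_write(directory, payload_size=4096):
    """Write a 5-byte primary stream and a larger secondary stream, then
    return the size the OS reports for the file and the bytes hidden."""
    path = os.path.join(directory, "bar.txt")
    with open(path, "w") as primary:
        primary.write("hello")
    # Assumption: on NTFS this opens the secondary stream "title" of bar.txt.
    with open(path + ":title", "wb") as secondary:
        secondary.write(b"x" * payload_size)
    return os.path.getsize(path), payload_size


with tempfile.TemporaryDirectory() as tmp:
    reported, hidden = sizes_after_stream_write(tmp)
    print(reported, hidden)
```

The reported size covers only the five bytes in the primary stream, no matter how much data goes into the title stream.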

But people can misuse streams. For instance, there is a potential for malicious use in worms and viruses. I don't know of any virus scanners that check secondary streams for malicious code. As I said earlier, secondary streams can store any type of data, and that includes executable code. Hopefully the anti-virus companies catch this problem before hackers start using streams in their malicious code.

2 comments:

Mike said...

If I run a CRC or hash against a file that you try to hide data in by changing its contents, the difference can easily be detected. Data hidden in a stream is not visible in the same way: the CRC or hash will come back the same.
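A quick way to check this point is to hash the file before and after writing to a stream. This is a minimal Python sketch, assuming an NTFS volume; file_digest is a hypothetical helper name, and SHA-256 stands in for whatever CRC or hash you prefer.

```python
import hashlib
import os
import tempfile


def file_digest(path):
    """SHA-256 of what a plain open() reads: the primary stream only."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bar.txt")
    with open(path, "wb") as f:
        f.write(b"visible contents")
    before = file_digest(path)
    # Assumption: on NTFS this writes to the secondary stream "title".
    with open(path + ":title", "wb") as f:
        f.write(b"hidden contents")
    after = file_digest(path)
    print(before == after)
```

The two digests match because an ordinary read of bar.txt never touches the title stream.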
