Admin
Yes, but such software does the zeroing once, which is sufficient to prevent attacks based on software. Doing the zeroing more often hurts performance and provides no extra security. If power to the device is killed before the zeroing code is executed then the zeroing doesn't get done. Someone who tosses your freshly stolen powered-off laptop RAM into a cooler full of dry ice while they whisk it off to their forensics lab to apply their nefarious technologies to it is not going to be inconvenienced. ;-)
If the key is stored in a constant address in RAM then there is the possibility of it burning into the chips. Better start allocating key memory using a wear-levelling algorithm to select RAM addresses.
Mind you, if the attacker is smart, they steal your laptop from you while it's powered on with the keys in memory, and just halt the CPU without cutting power to the machine (how that's done is left as an exercise for the WTF readers ;-). The motherboard logic will preserve the contents of RAM nicely while they connect the machine to a power source in their getaway vehicle, then later when they connect a logic analyzer to the DRAM bus. After that they just dump the RAM contents and fish out the key at their leisure.
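For what it's worth, the usual way to make that single zeroing pass actually stick (i.e. keep the compiler from throwing it away as a dead store) looks something like the following. This is a minimal sketch in C; the helper name is made up, and SecureZeroMemory() on Windows or explicit_bzero() on some libcs exist for the same purpose:

    #include <stddef.h>
    #include <string.h>

    /* Calling memset through a volatile function pointer keeps the compiler
     * from proving the call has no visible effect and deleting it.
     * One pass of zeros is all that is needed. */
    static void *(*const volatile force_memset)(void *, int, size_t) = memset;

    void wipe_key(void *key, size_t len)   /* hypothetical helper name */
    {
        force_memset(key, 0, len);
    }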
Admin
Admin
It's not as easy, but the results can be in fact stunning.
Here's what one can do, in a nutshell (assuming that the drive as submitted is operational).
First, the drive's channels (all of them) are fully characterized. Like so:
Usually someone in the business will know how a drive chews the data before writing it - drives don't write the raw bits as given; they write the data after it passes through a scrambling algorithm.
So, knowing what bits get written for which data, one characterizes how bits really look on the medium due to temperature, variations in the electronics, and in the medium itself. All of this gets converted into statistical measures, so that for any bit you will subsequently try to recover you'll have a good idea as to how it would have looked when written over a clean medium. You also characterize how the medium behaves when overwritten multiple times - how the old values decay, etc. Again, it all parametrizes a statistical model of the whole drive system. A lot of theory (and highly paid experts) goes into this phase. Each company will have (one or more) in-house developed analysis systems based on the theory and experience so far.
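To make the "statistical measures" bit concrete, here is a toy sketch (in C, with completely made-up numbers) of the kind of single-bit decision such a system ends up making: given the analogue level read back for a known, freshly written bit, and Gaussian models of what that level looks like depending on what used to be underneath, compute a log-likelihood ratio for the old bit.

    #include <math.h>

    /* Toy model: the read-back level of a freshly written bit is assumed to
     * shift very slightly depending on the bit that was underneath.  The
     * offsets and sigma are invented purely for illustration. */
    static double gaussian_loglik(double x, double mean, double sigma)
    {
        double d = (x - mean) / sigma;
        return -0.5 * d * d - log(sigma);
    }

    /* Log-likelihood ratio for "old bit was 1" vs "old bit was 0",
     * given the measured level and the nominal clean-medium level. */
    double old_bit_llr(double measured, double clean_level, double sigma)
    {
        double mean_if_old_0 = clean_level - 0.001;
        double mean_if_old_1 = clean_level + 0.001;
        return gaussian_loglik(measured, mean_if_old_1, sigma)
             - gaussian_loglik(measured, mean_if_old_0, sigma);
    }

Positive values favour a previous 1; aggregating such ratios over many reads and many models is, in principle, what all those statistical measures boil down to.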
What happens then is that you read the drive in multiple fashions. First of all, you do a read using the drive's own read system (duh): heads, read amplifiers and descramblers, servo loops. Everything is digitized in real time, and this usually generates 20-100:1 data expansion (after lossless compression!). A one gigabyte drive will blow up to 0.1 terabyte as a rule of thumb. This is done just so that the "obvious" is kept on record: how the drive's electronics performed on the data.
At this point you're free to run the drive via its regular IDE/SCSI controller; your digitized raw signals must demodulate to the same data that the controller feeds you. At this point you had better know everything there is to know about how that drive implements scrambling, error correction, its own management data structures (thermal, servo, reallocation, flux gains), etc. I.e. everything that you have read so far must be fully explainable.
Then some test writes are performed in "clean" areas where the previous data is known not to be valuable. The first cylinder of the disk is a place that gets written to quite rarely: boot record, partition table, etc. - pretty dormant places. Again, everything that leaves or enters the drive's electronics is digitized. This is done so that the drive's write channels can be characterized, including the servo trends of course. Depending on a given lab's modus operandi, and on whether the platters are to be preserved, you either have to give up any subsequent writing at this point, or you are free to write again once the hopefully intact platters come back from the scanners (below).
This finishes the characterization of the drive's own subsystems. At this point you rip the platters out and put them on a scanning machine. You typically do a non-contact scan using something like spin-polarized SEM. Then you do a "contact" scan on a (typically in-house-made) machine which uses custom MR heads to get another scan of the drive's platter. Quite often the scans will be repeated after some processing of the data shows that this or that can be tweaked.
At this point you may have a few terabytes of compressed data for every gigabyte on the drive, and have spent thousands of dollars' worth of amortization alone on your equipment. This is where the data goes to processing, which is necessarily mostly automated. Hand-holding is done to align the track models with the actual scans, so that the analyzers can simulate various "heads" that scan the "tracks". This initial alignment is really about accounting for each drive's uniqueness and is pretty non-critical: once the analysis system locks in on the magnetic patterns, it's supposed to know what to do next. Remember: the amount of data is vast and it's wholly infeasible to do much by hand at this point.
Depending on how good the data is after this analysis, you may need to improve on the resolution of "old" data.
This is where you use another machine to very, very carefully remove a very tiny layer of magnetic material from the platter. I don't know if electrochemical methods exist for that. Sometimes you will do local annealing to try to kill off the masking effect of strong domains; this is done, again, using a custom one-off tool which will very rapidly and locally heat the domain. This is of course an automated process: the platter's most strongly magnetized domains are annealed under feedback from a scanner (say spin-SEM, or whatever your favourite is). Sometimes the whole platter will be annealed under a "selective" process where annealing strength and depth depend on the data (or whichever trend is the most favourable at any given part of the platter). The analysis tools will quite often resort to (dis)proving statistical hypotheses in order to "peel" off each layer of data, and as such they may try to "push" the remnants even in the wrong way.
In this whole process, one must factor in the tons of common "disk scrubbing" tools, some of which write very repeatable patterns. This is where some knowledge about what the "customer" could have done to the drive helps. But in general, the whole process doesn't care diddly squat about what's actually on the drive. For every "location", a history of data is provided, "annotated" with lots of statistical data so as to further guide recovery based on data interpretation.
Data interpretation is all about what the data means: if a given "layer" of data can somehow be known (i.e. the oftentimes noisy result of analysis can be cleaned up), it's easier to get clean(er) underlying data. This is where you'll actually use software that knows what filesystems are, etc. But by this point the drive is either back with the customer, or has already been pulverized. It goes without saying (I hope) that your systems include, among others, a pretty much complete database of every published binary (or otherwise) file of every common (and less common) software package out there [this really applies to commercial stuff, less to the blow-yer-bandwidth-to-pieces plethora of free software].
I.e. if someone had some good data on the drive, and then, say, a defragmentation tool puts some Adobe Photoshop help file or whatnot over the spot, you had better know what the noise-free version of that file is. Implementing such a database is non-trivial in itself (an understatement). The task is this: given a bunch of potentially noisy 512 (or whatever) byte sectors, find out what OSes/applications were installed on that drive, and which file each sector corresponds to. This allows a lot of noisy data to be pinned down as known content, which helps in looking "under" it, or, for long-time-resident data which is not moved much by defragmenters, in classifying a part of the medium as COTS. For any particular customer, if they can provide you with some files/emails the user has been playing with, it helps, as your COTS span grows beyond the bare minimum.
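To make that matching task concrete, here is a toy sketch in C. The names, the 512-byte sector size and the brute-force linear scan are purely illustrative; a real system would index its database far more cleverly than this:

    #include <stddef.h>
    #include <stdint.h>
    #include <limits.h>

    #define SECTOR_SIZE 512   /* bytes per sector, as above */

    /* Count differing bits between a recovered (noisy) sector and a
     * reference sector from the known-software database. */
    static unsigned hamming_distance(const uint8_t *a, const uint8_t *b)
    {
        unsigned dist = 0;
        for (size_t i = 0; i < SECTOR_SIZE; i++) {
            uint8_t x = a[i] ^ b[i];
            while (x) {                 /* Kernighan bit count */
                x &= (uint8_t)(x - 1);
                dist++;
            }
        }
        return dist;
    }

    /* Return the index of the closest known sector, or -1 if nothing is
     * within 'threshold' differing bits.  'known' stands in for the huge
     * COTS sector database described above. */
    long closest_known_sector(const uint8_t *noisy,
                              const uint8_t known[][SECTOR_SIZE],
                              size_t n_known, unsigned threshold)
    {
        long best = -1;
        unsigned best_dist = UINT_MAX;
        for (size_t i = 0; i < n_known; i++) {
            unsigned d = hamming_distance(noisy, known[i]);
            if (d < best_dist) {
                best_dist = d;
                best = (long)i;
            }
        }
        return best_dist <= threshold ? best : -1;
    }

Anything that comes back within a small enough distance can be treated as known COTS content and subtracted out of the way.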
I have really spared you the details, which are actually mind-boggling. The infrastructure needed to do such analysis is, um, not for the faint of heart, and whether gov't or private, you need lots of funding to have a lab like that. And lots of talent, on all levels, from the techs to the ivory-tower theoreticians.
From what I know, the whole process to this point costs tens of kilobucks, and that's when the lab runs at full steam - meaning that you churn through a drive a day, and that's really pushing it. Understandably enough, a very non-WTFy IT infrastructure is crucial for processing all this data. Bandwidth-wise, it's comparable to running a particle detector at an accelerator. Do understand that you'll want to run all the equipment non-stop, preferably on customer units.
When the equipment is not otherwise down for maintenance, you'll probably want to scan known good hard drives that have hit the market, to populate your characterization database (both for electronics and for the medium itself).
Craaazy indeed.
Admin
Burn, burn the volatile memory!
Dude, if a squad of gamers storms your datacenter armed with assault weapons and proceeds to memory-dump your servers, you are going to be worrying about other things than how well anything is or isn't stored in memory.
Wow. Is this a game or a nuclear warhead arsenal? Military standards require seven overwrite passes, as far as I know...
Admin
Games are notorious for being badly designed and poorly coded - I thought WTF didn't normally cover games?
Admin
I'm interested - which data recovery companies claim to be able to do this?
ALL the ones I've looked at say 'once the data has been overwritten we can't get it back'. Most of them specialise in recovering data from failed drives (motors failed, heads crashed, fried electronics etc) where they will extract the platters in a clean room and re-assemble them with new mechanisms and electronics. They may use amplifiers to recover data from damaged platters, but none say they can recover overwritten data.
Given that all the stuff I've read about data recovery (except Gutmann's paper, which was written 10 years ago, and other documents quoting him) says it can't be done with any degree of success, I'm very sceptical. E.g. people say you MIGHT be able to recover, say, 40-60% of the BITS of data - which is useless if you think about it. If you knew what the data was beforehand, this might be useful as confirmation, but otherwise, how do you know what
0??1?11? ?10?0??0 ??000??0
says?
MFM drives were different because they were so imprecise that the write heads weren't accurately positioned and bits of data wouldn't always be overwritten on later writes. Nowadays, to get the high data densities, they're much more precise.
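To put numbers on that 40-60% figure: take the middle of the range and work out what it buys you. A quick back-of-the-envelope in C (the 56% per-bit rate is of course just an assumed figure):

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double p_bit    = 0.56;                /* assumed per-bit recovery rate */
        double p_byte   = pow(p_bit, 8);       /* all 8 bits of one byte right  */
        double p_sector = pow(p_bit, 512 * 8); /* a whole 512-byte sector right */

        printf("P(correct byte)   = %.4f\n", p_byte);   /* roughly 0.0097  */
        printf("P(correct sector) = %g\n",  p_sector);  /* underflows to 0 */
        return 0;
    }

So even at 56% per bit you get only about one byte in a hundred right, and essentially zero chance of an intact sector - which is exactly why a pile of half-recovered bits tells you nothing unless you already knew the data.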
Admin
<It's not as easy, but the results can be in fact stunning.
Here's what one can do......>
As someone said earlier: degauss, shred, then burn.
What you are describing would only be available to very large corporations or governments. I don't think that the 1337 crowd has access to those kinds of resources.
A normal data shredder program should be enough for 99.99% of users. I have had to use a data recovery company in the past to recover some files for a client, and they could not recover overwritten files.
CAPTCHA - atari
Admin
Looks like they're trying to brainwash the RAM. What about the L2 cache, did they forget about it? :D
Admin
Hint: The one(s) that do, don't have a website, and don't need to advertise, and they aren't used by the corporate world either (save for a few friends) :)
Admin
I wouldn't know, but the people who do this stuff know enough to figure it out, and well enough to be paid very well for it. I presume that a typical "data recovery" company is worth (throwing in all of its revenue since inception) less than some of the machines used in the labs I described :)
Admin
I would say RAM and hard disks work essentially the same way, through magnetisation, so some magnetisation could remain on the RAM modules too. Or do they use a different material for that? Anyway, it's ridiculous how far they push their "security" stuff.
consequat? maybe consequatsch
Admin
Actually, I have a three-legged table (by design). And, I would like a car that I didn't have to refuel.
Admin
Greetings, friends. Do you wish to look as happy as me? Well, you've got the power inside you right now. So use it and send one dollar to
Happy Dude 742 Evergreen Terrace Springfield.
Don't delay. Eternal happiness is just a dollar away.
Admin
I would too, and I'd like three-legged tables to wobble like four-legged tables.
Admin
Blitzmerker
Admin
PS: thinking about something like "level := High(i) - real_level + 1;" :-)
Admin
Admin
Gesundheit!
Admin
Pish posh... don't bother me with trifling details such as floating-point error!
I'm telling you man, floats FTW!!1!eleven!
And just think, if you start chucking around some SSE, you could increment 4 loop counters at once! zomg! Thats, like, O(1/4)!!
Admin
More people have whipped out their ePenises in this comment thread than in any other I have ever seen before... and that is saying a lot for WTF.
Admin
Hmm... where do I sign up to become one of those "professional posters"? It would be lovely to make a living just by posting comments.
Though given the ratio of the time I've spent working to the time I've spent just reading these forums, I'm possibly already being paid to post :D
Admin
Admin
OTOH, if the hardware is compromised, you may as well throw all hope of security out the window from the get-go. Not much will be secure then. And even if it's a case where someone managed to get access via the network, a cracker with access to the machine can find out way more from other things than from whatever may have been in RAM a second ago, such as by grabbing the whole damn server software and analyzing it back home. ;)
Admin
Haha, indeed very funny!
How many layers do you recover that way? What's the cost for, let's say, 1 MB? How long does it take to find all the layers of a 1 MB section?
Even in Gutmann's time, restoring whole drives didn't happen; just a few bytes were recovered.
However, a clever idiot would write random patterns to the disk multiple times (7 or 32 times, depending on his idiocy) before using it. Later on he only needs to overwrite each file once, because the disk's secret memory is already filled with garbage. :)
Admin
You're dead wrong. I've personally done exactly that.
I work in IT security, and a co-worker attended a forensics class along with some FBI members. They got talking about data recovery and such, and said that even they do not possess the technology to retrieve data once it has been overwritten, even just once.
I'd heard that pro shops could recover it using magnetic force scanning tunneling microscopy (STM), and we bet a lunch on whether or not they'd be able to. I thought they would.
So, after about 30 calls to leading data recovery experts, NOT ONE could recover data from a drive overwritten with zeros even just once, regardless of how much I wanted to pay them.
So, in summary, you're wrong, like I was.
Admin
Actually, most optimizing compilers would strip that crap out. Using SecureZeroMemory() or:

    #pragma optimize("", off)
    memset(data, 0, sizeof(data));
    #pragma optimize("", on)

would be a better choice. Not sure why clearing the memory multiple times would make any difference; it looks pointless to me.
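For the curious, the reason a plain memset (let alone 32 of them) can vanish is dead-store elimination. A minimal sketch in C of the pattern the compiler is allowed to throw away (the function and buffer are hypothetical, just to illustrate):

    #include <string.h>

    void do_crypto_thing(void)
    {
        unsigned char key[32];

        /* ... derive and use the key here ... */

        /* Because 'key' is a local that is never read again, a compiler
         * doing dead-store elimination may legally drop every one of
         * these passes, leaving the key in memory. */
        for (int pass = 0; pass < 32; pass++)
            memset(key, 0, sizeof key);
    }

SecureZeroMemory(), the pragma above, or calling memset through a volatile function pointer all defeat that, and one surviving pass is plenty.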
Admin
If it were possible to recover data 32 writes deep on a hard drive platter, wouldn't you think some storage companies would use those extra "layers" to store more user data?
As for encrypting IP addresses: with the envelope principle, if you encrypt everything, then it's not obvious which pieces are important.
Admin
[quote user="Tibit] It's not as easy, but the results can be in fact stunning.
Here's what one can do, in a nutshell (assuming that the drive as submitted is operational).
First, drive's channels (all of them) are fully characterized. Like so:
Usually one in the business will know how a drive chews the data before writing it - drives don't write the raw, they write the data after it passes through a scrambling algorithm.
So, knowing what bits get written for which data, one characterizes how bits really look on the medium due to temperature, variations in the electronics, and in the medium itself. All of this gets converted into statistical measures, so that for any bit you will subsequently try to recover you'll have a good idea as to how it would have looked when written over a clean medium. You also characterize how the medium behaves when overwritten multiple times - how the old values decay, etc. Again, it all parametrizes a statistical model of the whole drive system. A lot of theory (and highly paid experts) goes into this phase. Each company will have (one or more) in-house developed analysis systems based on the theory and experience so far.
What happens then is that you read the drive in multiple fashions. First of all, you do a read using drive's read system (duh): heads, read amplifiers and descramblers, servo loops. Everything is digitized in realtime, and generates usually 20-100:1 data expansion (after losslessly compressing!). A one gigabyte drive will blow up to 0.1 terabyte as a rule of thumb. This is done just so that the "obvious" are kept on record: how drive electronics performed on the data.
At this point you're free to run the drive via it's regular IDE/SCSI controller, your digitized raw signals must demodulate to the same data that the controller feeds you. At this point you better knew everything there's to know about how that drive implements scrambling, error correction, its own data structures for management (thermal, servo, reallocation, flux gains) etc. I.e. everything that you have read so far must be fully explainable.
Then some test writes are performed in the "clean" areas where previous data is known not to be valuable. First cylinder of the disk is a place which gets written to quite rarely: boot record, partition table, etc - pretty dormant places. Again, everything that leaves or enters drive's electronics is digitized. This is done so that drive's write channels can be characterized, including the servo trends of course. Depending on given lab's modus operandi, and whether the platters are to be preserved, you either have to give up any subsequent writing at this point, or feel free to write after you return hopefully intact platters from the scanners (below).
This finishes characterization of the drive's subsystem. At this point you rip the platters out and put them on a scanning machine. You typically do a non-contact scan using something like spin-polarized SEM. Then you do a "contact" scan on a (typically in-house-made) machine which uses custom MR heads to get another scan of the drive's platter. Quite often the scans will be repeated after some processing of data shows that this or that can be tweaked.
At this point you may have a few terabytes of compressed data per every gigabyte on the drive, and have spent thousands of dollars worth of amortization alone on your equipment. This is where the data goes to processing, which is necessarily mostly automated. Hand-holding is done to align the track models with actual scans, such that the analyzers can simulate various "heads" that scan the "tracks". This initial alignment is really about accounting for each drive's uniqeness and is pretty non-critical: one the analysis system locks in on the magnetic patterns, it's supposed to know what to do next. Remember: the amound of data is vast and it's wholly infeasible to do much by hand at this point.
Depending on how good the data is after this analysis, you may need to improve on the resolution of "old" data.
This is where you use another machine to very very carefully remove a very tiny layer of magnetic material from the platter. I don't know if some electrochemical methods exist for that. Sometimes you will do local annealing to try and kill off the masking effect of strong domains, this is done again using a custom, one-off tool which will very rapidly and locally heat the domain. This is of course an automated process: the platter's most magnetized domains are annealed under feedback from a scanner (say spin-sem or what's your favourite). Sometimes whole platter will be annealed under a "selective" process where annealing strength and depth depends on the data (or whichever trend is the most favourable at any given part of the platter). The analysis tools will quite often resort to (dis)proving statistical hypotheses in order to "peel" each layer of data, and as such they may try to "push" the remnants even in the wrong way.
In this whole process, one must factor in tons of common "disk scrubbing" tools, some of which write very repeatable patterns. This is where some knowledge about what the "customer" could have potentially done to the drive helps. But in general, the whole process doesn't care diddly squat about what's actually on the drive. For every "location", a history of data is provided, "annotated" by lots of statistical data such as to further guide recovery based on data interpretation.
Data interpretation is all about what the data means: if a given "layer" of data can be somehow known (i.e. the oftentimes noisy result of analysis can be cleaned up), it's easier to get clean(er) underlying data. This is where you'll actually use software that knows what filesystems are, etc. But at this point the drive is either back to the customer, or has been already pulverized. It goes without saying (I hope) that your systems include, among others, a pretty much complete database of every published binary (or otherwise) file of every common (and less common) software package out there [this really applies to commercial stuff, less to the blow-yer-bandwidth-to-pieces plethora of free software].
I.e. if someone had some good data on the drive, and then say a defragmentation tool puts some Adobe Photoshop build whatnot help file over the spot, you better knew what the noise-free version of the file is. Implementation of such database is non trivial in itself (an understatement). The task is this: given a bunch potentially noisy 512 (or whatnot) byte sectors, find what OSes/applications were installed on that drive, and which file each sector corresponds to. This allows "staticalizing" a lot of noisy data, which helps in looking "under it", or for long-time-resident data which is not moved much be defragmenters, even classifying a part of the medium as COTS. For any particular customer, if they can provide you with some files/emails the user has been playing wiht, it helps as your COTS span grows beyond the mere minimum.
I have really spared the details, which are actually mind-boggling. The infrastructure needed to do such analysis is, um, not for the faint of heart, and whether gov't or private, you need lots of funding to have a lab like that. And lots of talent, on all levels, from techs to the ivory tower theoreticians (?).
From what I know, the whole process to this point takes tens of kilobucks and that's when the lab runs at full steam. Meaning that you churn a drive a day, and that's really pushing it. Understandably enough, a very non-WTFy IT infrastructure is crucial in processing all this data. Bandwidth-wise, it's all comparable to running a particle detector in an accelerator. Do understand that you'll want to run all the equipment non stop, preferably on customer units.
When the equipment is not otherwise down for maintenance, you'll probably want to scan known good hard drives that have hit the market, to populate your characterization database (both for electronics and for the medium itself).
Craaazy indeed.[/quote]
OK, consider this:
A modern hard drive has a track pitch of less than 100 nm, a bit length of less than 20 nm, and a magnetic coating as thin as 20 nm. It doesn't use your dad's MFM or RLL encoding, nor a binder-based coating. The read pulses from separate flux transitions are superimposed. To decode the data, the analogue signal is matched against the expected response of various bit patterns, using approaches similar to those in analog modems; this is called PRML: Partial Response, Maximum Likelihood. The raw bit error rate is pretty high, and to manage that the drive uses quite sophisticated Reed-Solomon codes. That said, it's quite a miracle that such tiny bits can be read at all at such high speed. If you simply overwrite the original data with just about anything, 1) the old data will go well below the random noise, and 2) you won't be able to subtract the new data, because you cannot be sure what to subtract.
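For flavour, here is what the "matched against expected response" part boils down to, as a toy C sketch using the classic PR4 target. Real read channels use fancier targets and a Viterbi detector, so treat this as an illustration only:

    #include <stddef.h>

    /* PR4 toy model: bits b[k] in {0,1} map to levels a[k] in {-1,+1}, and
     * the ideal noiseless sample is y[k] = a[k] - a[k-2] (the "1 - D^2"
     * target). */
    static int level(int bit) { return bit ? 1 : -1; }

    /* Squared-error metric of a candidate bit sequence against the sampled
     * analogue waveform; samples before the window are assumed idle. */
    double pr4_error(const int *bits, const double *samples, size_t n)
    {
        double err = 0.0;
        for (size_t k = 0; k < n; k++) {
            int    prev2 = (k >= 2) ? level(bits[k - 2]) : 0;
            double ideal = level(bits[k]) - prev2;
            double diff  = samples[k] - ideal;
            err += diff * diff;
        }
        return err;
    }

Maximum likelihood just means picking the bit sequence with the smallest such error, which drives do efficiently with a Viterbi detector rather than by brute force.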
Regarding data retention in DRAM: once it loses power, it will keep its charge longer than it would while powered up but without refresh. I once observed a device which kept much of its DRAM contents while unplugged from USB power for more than 20 seconds. That caused problems for the power-cycle detection in its firmware and host software, which couldn't tell whether the firmware needed to be reloaded. RAM "burn-in", of course, is just steer manure.
Admin
Who's out in space intercepting and jamming the radio signals that the CPU gave off as it processed the encrypted data?
Admin
Admin
Err.....
A couple of things here. First of all, something I haven't seen mentioned yet: no compiler is going to look at code like that (even IF it did work) and leave it equivalent to how the coder presented it. If the compiler sees that the net result will always be the same, to hell with your code: it'll forgive you for being human and build the binary using the best shortcut it can find.
Overwriting memory is pointless; memory doesn't work that way. Now, if the page of memory that the data resides in happens to make it to swap at some point, that's different, but we're talking about two mosquitoes colliding in the Grand Canyon at that point.
A true WTF.
Admin
I disagree that these routines are blazingly fast.
They still consume precious instructions to do the initial compares, even though the condition will always evaluate to false and the loop will not execute :)
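Granted, when the bound is a compile-time constant and the optimizer is on, even the compare tends to be folded away. A tiny sketch in C (hypothetical function; the exact behaviour depends on the compiler and flags):

    /* With optimizations enabled (e.g. -O2), a typical compiler folds the
     * condition below to false at compile time and emits neither the
     * compare nor the loop; with optimizations off, the compare runs once. */
    void pointless_wipe(char *buf)
    {
        for (int i = 0; i < 0; i++)    /* always false on entry */
            buf[i] = 0;
    }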
Admin
Haha! I think that's what happened in the third movie and I defy anyone to repudiate this claim because I don't think anyone was really that sure at the time of production.