Is it possible to recover original content from an edited image file?

  • OK possibly a random question here, but one that is relevant to the site and regarding a hot topic nowadays: privacy.

    If I load an image file (in my case a PNG photocopy of a document which contains some personal details) into Gimp, and I 'fill' with black (i.e. as if to redact) the areas of the image that contain sensitive information, and I then overwrite the original file, is there any way that the original untouched image can be recovered or gleaned? Such as low level bit analysis or...?

    If so what tools and techniques are behind that, and how can I mitigate this? Is it a simple case of not overwriting the original but simply creating a new file with the changes?

  • As written, this is borderline off-topic. But the answer could apply to photography as well, for example, if taking pictures in a war zone where recognizable faces could be dangerous for the subject.

    To answer it, we have to look at two things.

    First, the way the information is stored in an image file. Generally, each pixel is simply represented by a triplet of values, for red, green, and blue. Mixed together, these make the color. If you change the value of pixels in a certain area to 0, you black them out, and the original information is gone. So, that's pretty safe -- if that's all your image contains. Many image formats also include invisible metadata -- information about the camera the photo was taken with and possibly things like location tags. Those need to be dealt with separately. Additionally, some image formats retain multiple layers and may even include undo history -- this includes Photoshop's PSD files and Gimp's XCF files. (And possibly some TIF files.) If you export to PNG or JPEG, you are probably okay.
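
    To make the "original information is gone" point concrete, here is a minimal sketch in plain Python. The image is just a made-up grid of (red, green, blue) triplets, and `redact` is a hypothetical helper, not anything Gimp exposes. Two images that differ only inside the redacted rectangle become identical once it is filled with zeros, so nothing in the pixel data can tell them apart afterwards.

```python
def redact(pixels, left, top, right, bottom):
    """Return a copy with the rectangle [left, right) x [top, bottom) set to black."""
    out = [list(row) for row in pixels]
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = (0, 0, 0)
    return out

# Two tiny images that differ only in the pixel at column 1, row 1.
image_a = [
    [(200, 180, 160), (210, 190, 170), (220, 200, 180)],
    [(190, 170, 150), (12, 34, 56), (230, 210, 190)],
]
image_b = [
    [(200, 180, 160), (210, 190, 170), (220, 200, 180)],
    [(190, 170, 150), (99, 88, 77), (230, 210, 190)],
]

redacted_a = redact(image_a, 1, 1, 2, 2)
redacted_b = redact(image_b, 1, 1, 2, 2)
print(redacted_a == redacted_b)
```

    This only covers the pixel values themselves; metadata and extra layers are a separate matter.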

    Second, consider the way in which you redact your photo. If you use a drawing tool and replace the pixels as above, this is fairly safe, because you are adding new information and destroying the old. If you use some sort of mosaic or blur filter which takes the existing pixels and transforms them, it's actually possible that a clever reversal of the algorithm could get back more information than seems possible. So, don't do that.
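
    One way this can go wrong, sketched below, is not a true inversion of the filter but a guessing attack: if the hidden content comes from a small set of possibilities (digits, say), an attacker can run each guess through the same mosaic and see which one matches. Everything here is illustrative: the 1-D grayscale strips, the `mosaic` and `best_match` helpers, and the made-up "renderings" of the candidates are all assumptions.

```python
def mosaic(values, block):
    """Replace each run of `block` grayscale values with its integer average."""
    out = []
    for start in range(0, len(values), block):
        chunk = values[start:start + block]
        out.extend([sum(chunk) // len(chunk)] * len(chunk))
    return out

def best_match(observed, candidates, block):
    """Return the candidate whose mosaic is closest to the observed strip."""
    def distance(name):
        guess = mosaic(candidates[name], block)
        return sum((a - b) ** 2 for a, b in zip(observed, guess))
    return min(candidates, key=distance)

# Hypothetical grayscale renderings of three possible hidden characters.
candidates = {
    "1": [255, 0, 255, 255, 0, 255, 255, 0],
    "7": [0, 0, 0, 255, 255, 0, 255, 255],
    "8": [0, 255, 0, 0, 0, 0, 255, 0],
}

# The "redacted" strip that gets published: a mosaic of the real content.
observed = mosaic(candidates["7"], 4)
print(best_match(observed, candidates, 4))
```

    The mosaic no longer looks like the original strip, yet it still identifies which candidate produced it. A solid fill gives the attacker nothing to compare against.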

    If you're confident that you've masked the pixels using a safe approach, but a little unsure if the file itself is safely "cleaned", take a look at "What tools exist to remove metadata from photos?", which gives several good approaches. (Personally, I use jhead -purejpg.)
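
    Since the file in the question is a PNG rather than a JPEG, here is a rough sketch of what "cleaning" can look like at the byte level, using only the Python standard library. A PNG is a signature followed by chunks (length, type, body, CRC), and text or Exif metadata lives in its own chunk types. The `METADATA_CHUNKS` set below is my own shortlist and is not exhaustive, and `iter_chunks`, `strip_metadata`, and `make_chunk` are hypothetical helper names; a dedicated tool is the safer choice for real files.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Chunk types that commonly carry text, timestamps, or Exif data (not exhaustive).
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"tIME", b"eXIf"}

def iter_chunks(data):
    """Yield (type, body, raw_bytes) for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        end = pos + 12 + length  # 4 length + 4 type + body + 4 CRC
        yield data[pos + 4:pos + 8], data[pos + 8:pos + 8 + length], data[pos:end]
        pos = end

def strip_metadata(data):
    """Return the PNG bytes with the listed metadata chunks left out."""
    kept = [raw for ctype, _body, raw in iter_chunks(data)
            if ctype not in METADATA_CHUNKS]
    return PNG_SIGNATURE + b"".join(kept)

def make_chunk(ctype, body):
    """Build one chunk; used here only to create a small demo file."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)

# A 1x1 grayscale PNG with an author name tucked into a tEXt chunk.
sample = PNG_SIGNATURE + b"".join([
    make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)),
    make_chunk(b"tEXt", b"Author\x00Jane Doe"),
    make_chunk(b"IDAT", zlib.compress(b"\x00\x00")),
    make_chunk(b"IEND", b""),
])
cleaned = strip_metadata(sample)
print([ctype for ctype, _body, _raw in iter_chunks(cleaned)])
```

    The image data chunks are copied through untouched, so the picture itself is unchanged; only the listed side-channel chunks are dropped.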

    Good answer. Don't forget to mention removing the original thumbnail as part of the metadata cleaning. Can't remember where, but I know this was a privacy issue for at least one person who made the news for failing to do so.

    But be careful with images with transparency - these add an extra alpha (opacity) channel. Depending on how the image was made, there may be data in the RGB of the transparent areas that you can't see because of the transparency - but may show if displayed by something that doesn't support transparency (or you edit the image to increase the opacity in those areas).
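
    A small sketch of that transparency trap, with made-up RGBA values and hypothetical helper names. The middle pixel is fully transparent, so it shows as plain background when composited properly, but its color values are still sitting in the data for anything that ignores alpha.

```python
# RGBA pixels: the last value is alpha (0 = fully transparent, 255 = opaque).
pixels = [(255, 255, 255, 255), (30, 60, 90, 0), (255, 255, 255, 255)]

def composite_on_white(rgba):
    """What a viewer that honors transparency shows on a white background."""
    r, g, b, a = rgba

    def blend(channel):
        return (channel * a + 255 * (255 - a)) // 255

    return (blend(r), blend(g), blend(b))

def drop_alpha(rgba):
    """What a viewer that ignores transparency shows."""
    return rgba[:3]

def scrub(rgba):
    """Zero the color of fully transparent pixels so nothing hides there."""
    return (0, 0, 0, 0) if rgba[3] == 0 else rgba

seen = [composite_on_white(p) for p in pixels]
leaked = [drop_alpha(p) for p in pixels]
safe = [drop_alpha(scrub(p)) for p in pixels]
print(seen[1], leaked[1], safe[1])
```

    Flattening the image onto an opaque background before exporting has the same effect as the scrub step here.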

    Some information could also be recovered from other parts of the image. For example, the objects in the redacted part of the image may also be reflected in other objects. Similarly, light reflected by objects in the redacted parts might color unredacted parts, giving at least some idea of what's in the blacked-out areas.

  • You won't be able to recover the covered parts of the image but maybe you could find a previous version.

    If you are using Windows, you can right-click the containing folder and click Previous Versions. You may find a version in there, depending on when your shadow copy runs and how long you left it before editing the image.

    Thanks for your reply. In this case my only concern is this: once I've redacted specific parts of the photocopy image, overwritten the original untouched file, and emailed the edited PNG to somebody, will they be unable to use any tools to "guesstimate" what was blanked out? You're saying this is the case, right (since nobody but me has access to the folder/shadow copy)?

  • In addition to the answer of @mattdm, I would like to add another point of view. If the question is only about recovery of data from an image you sent to someone or uploaded somewhere, the given answers are correct and sufficient.

    But also consider recovery of the original data from the physical storage device, say harddisk, USB stick, SD card, etc.

    1) Overwriting a file doesn't mean it is actually physically overwritten on the device.

    2) Even if it is physically overwritten, one time might not be enough.

    Agreed. There's an image processing answer to this question, and then there's a forensics answer to the question. The forensics answer to the question gets very complicated.

    If the file is written to a solid-state drive, it's likely that the data from the original image is left alone and won't be rewritten for weeks. (Solid state drives write their data to different parts of the solid state memory each time, since the memory wears out with use. An expert could likely find the original image.)

    Magnetic media can also write new versions of a file to a different physical spot on the drive and leave the original intact, although they may also write the data back to the exact same spot, or a mixture of new and old parts of the drive. However, experts can analyze the magnetic data on the drive and may be able to extract the original data even after it's been overwritten with new data.
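
    For completeness, this is roughly what a best-effort "overwrite the bytes in place" looks like from a program, as a sketch only. `overwrite_in_place` is a hypothetical helper and the file name is made up. It asks the operating system to write random data over the same file and flush it, but for exactly the reasons given above, it cannot promise that the drive puts the new bytes on the same physical cells or sectors as the old ones.

```python
import os
import tempfile

def overwrite_in_place(path, passes=1):
    """Overwrite a file's bytes with random data and flush each pass to the device.

    Best effort only: the filesystem or drive may still store the new
    bytes somewhere else and leave the old ones physically intact.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            handle.write(os.urandom(size))
            handle.flush()
            os.fsync(handle.fileno())

# Demo on a throwaway file standing in for the unredacted scan.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "scan.png")
    original = b"sensitive scan bytes" * 4
    with open(path, "wb") as handle:
        handle.write(original)
    overwrite_in_place(path)
    with open(path, "rb") as handle:
        after = handle.read()
```

    If the original matters this much, keeping it only on storage you control, or on an encrypted volume, sidesteps the question of what the drive really did.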

