The fake storage size scam just went mainstream, and many people are going to lose data
Released on: 2022-06-25

Putting aside various controversies, like employee rights, I’ve always considered Amazon Prime to be one of the safest places to buy stuff online, especially when compared to somewhere like wish.com. While I still take care to read the reviews and do my research before buying something, it has been incredibly rare that I’ve needed to return something, for any reason. And the few times that I have needed to, it has been quick and painless.

So imagine my curiosity when I came across an external USB storage drive with a combination of specs and price that, while being on the edge of what’s plausible, seemed unlikely.

Above: Screenshot of one of the many scam listings.

For those who would like a quick overview, I’ve created a video (6:50) about that here:

If you’d like to dive deeper, read on.

Was it a scam?

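Yes. I was really hoping to be wrong about this. But it is, and the chances of it being an accident are very, very low. It’s the old scam where the drive pretends to be a different size from what it actually is. And then when it runs out of space it does one of two things: it either silently throws away the new data (the black hole method), or it overwrites data that was stored earlier (the overwrite method).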

I’ve actually put together a playlist of the same scam being done in different devices. I’ve stopped actively looking for these, but if I passively come across more, I’ll add them to this list.

How do you know?

Before buying

Suspicions were triggered by all of the inconsistencies in the listing:

The price vs the capacity

The price was unusually good for the capacity at current pricing. It’s not inconceivable. Larger capacities exist, and industry pricing is all over the place depending on what you’re looking for. But this combination is certainly pushing it, especially since the form-factor implies M.2, which is considerably more expensive than this for a given capacity.

Above: A screenshot of one of the scam listings. Showing the price along with a few details of the product.

Also notice that it is listed as a “Hard Drive External HDD”. There are hard drives that might fit in an enclosure of this size (leaving half the thing empty). But they would struggle to get anywhere close to 16TB.

The specs in the photos didn’t match the text.

Above: A screenshot of one of the photos. Showing USB 3.1, and a speed of 100MB/s.

Above: A screenshot of the detailed specifications of the product.

The technologies and specs didn’t match.

USB 2 has a maximum speed of

480 Mbit/s (maximum theoretical data throughput 53 MByte/s)

53MB/s is nowhere close to 100MB/s, let alone 400MB/s.

Above: A screenshot of the detailed specifications of the product.

The combination of technologies was unlikely, but not impossible.

Speed vs Capacity

If you are working with a large amount of storage, like 16TB, you are likely wanting to put a large amount of data on to it. You probably don’t want to wait for huge amounts of time to do it. Let’s put that in perspective:

Very approximate time to fill 16TB using

And that’s before we take into account whether the storage is able to sustain that speed for long periods of time.
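To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes) and a perfectly sustained speed, which real storage is unlikely to manage. The function names are mine, purely for illustration, and the speeds are the ones mentioned above.

```python
# Rough time to fill a drive at a perfectly sustained write speed.
# Assumes decimal units: 1 TB = 10**12 bytes, 1 MB = 10**6 bytes.


def fill_time_seconds(capacity_tb: float, speed_mb_per_s: float) -> float:
    """Seconds needed to write capacity_tb terabytes at speed_mb_per_s."""
    return (capacity_tb * 10**12) / (speed_mb_per_s * 10**6)


def describe(seconds: float) -> str:
    """Format a duration as days or hours, whichever reads better."""
    hours = seconds / 3600
    if hours >= 48:
        return f"{hours / 24:.1f} days"
    return f"{hours:.1f} hours"


if __name__ == "__main__":
    speeds_mb_per_s = {
        "USB 2 theoretical maximum": 53,
        "100MB/s": 100,
        "400MB/s": 400,
    }
    for label, speed in speeds_mb_per_s.items():
        print(f"{label}: {describe(fill_time_seconds(16, speed))}")
```

At the USB 2 theoretical maximum this works out to roughly three and a half days of continuous writing.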

For most use-cases, USB 2 is unlikely to be what you want. It could still be useful as a background backup, so it could have been a cost cutting measure for that target audience, although it certainly isn’t marketed that way.

Above: A screenshot of the spreadsheet that I used to calculate these.

You can download the spreadsheet in OpenOffice, and Excel (untested) formats.

Above: A screenshot of the detailed specifications of the product.

USB 2 vs USB C

USB C has only been around for a few years. So having the USB C connector tells you a bit about the age of the device, and therefore what assumptions come with it.

Devices that have a USB C connector but use USB 2 do exist (cheap, modern phones without fast charging, for example). But they sacrifice both speed and the amount of power that can be used. See “Speed vs Capacity” above for more information on why that is unlikely in a device like this.

The deep dive

Some of the above issues could have been mistakes in the listing, or weird cost cutting choices. So there was still some plausibility to it.

When I received it, the USB cable was faulty. Not really relevant, but it wasn’t a good start.

I did a lot of testing, and actually wrote a tool to help me with this so that I would really understand what was going on, and what failures looked like. I ran the tests over exfat, and vfat (fat32). I wanted to also try ext4, but the drive simply didn’t work well enough to even mount it, let alone test it.

USB C

Correct: Here’s a photo of it.

Above: Photo of the device showing the USB C connector.

USB 2

Correct.

Above: Screenshot showing the device within USB tree via lsusb.

Above: Screenshot showing the root hub for the USB bus that the device is attached to.

For comparison, this is what a USB 3 flash drive looks like on the same physical socket:

Above: Screenshot showing a USB 3 flash drive plugged into the same physical socket.

Speed

Not even close. Here’s a quick recap for what we’re expecting:

Reality:

Above: Screenshot of typical performance of the drive. 1139KB/s, which is about 1.1MB/s.

Above: Screenshot of performance when the drive is writing into a black hole. 40.4MB/s.

All of this testing is taking into account the OS-provided disk cache, which masks the performance the applications see until the cache runs out. Note that the nmon screenshots are not affected by this because they are showing what the OS is actually writing to the storage, not what is being written to the cache. Meanwhile rsync is showing what it is able to write to the cache. So it is affected.

You can see the cache running out and the rsync averages starting to go down here:

Above: Screenshot of the OS-provided disk cache skewing early writing speeds.
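If you want a number that is less flattered by the cache, one option is to time a write that only finishes once the OS has flushed it. The following is a rough sketch, not what I used for the screenshots above. The function name and default chunk size are my own choices. Note that os.fsync only asks the OS to hand the data to the device. A dishonest controller can still claim success without storing anything.

```python
import os
import time


def measure_write_speed(path: str, total_bytes: int,
                        chunk_bytes: int = 1024 * 1024) -> float:
    """Write total_bytes to path, flush to the device, return MB/s."""
    chunk = os.urandom(chunk_bytes)  # Random data, so it does not compress well.
    written = 0
    start = time.monotonic()
    with open(path, "wb") as handle:
        while written < total_bytes:
            size = min(chunk_bytes, total_bytes - written)
            handle.write(chunk[:size])
            written += size
        handle.flush()
        # Block until the OS has passed the cached data on to the device.
        os.fsync(handle.fileno())
    elapsed = time.monotonic() - start
    return (written / 10**6) / max(elapsed, 1e-9)
```

The amount written needs to be comfortably larger than the available RAM for the result to mean much, otherwise most of the time measured is still the cache being drained at the end.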

Capacity

No. But it really looks like it unless you take time to dig a little deeper.

The normal ways of checking it

By all the methods that a user might check, it looks like it has the advertised 16TB. When the drive is plugged in, it reports itself as 16TB to the OS:

Above: A screenshot of the device reporting its capacity to the OS.

The partition layout is set up for 16TB (15.3T):

Above: A screenshot of the partition layout.

The filesystem also reports 16TB:

Above: A screenshot of the filesystem reporting 16TB.
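The same is true if you ask programmatically. Here is a small sketch using Python’s shutil.disk_usage. The function name and the mount point in the usage comment are hypothetical. The figure it returns comes from the filesystem, which was laid out according to what the controller claimed, so on a scam drive it will happily agree with the advertised size.

```python
import shutil


def reported_capacity_tb(mount_point: str) -> float:
    """Return the total size the filesystem reports, in decimal terabytes."""
    usage = shutil.disk_usage(mount_point)
    return usage.total / 10**12


if __name__ == "__main__":
    # "/mnt/suspect" is a placeholder; use wherever the drive is mounted.
    print(f"{reported_capacity_tb('/mnt/suspect'):.1f}TB reported")
```

In other words, this confirms what the drive says about itself, not what it can actually store.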

What was I looking for?

Exactly what will happen depends on the filesystem being used, and the exact implementation of the scam.

The most important thing is file corruption, but also speed changes when the capacity runs out. If it speeds up, it’s probably the black hole method. If it slows down, it’s probably the overwrite method.

Here is what a failed file looked like on my drive:

Above: A screenshot showing two files that should be identical, but are not.

My drive used the black hole method, which leads to recent files being corrupted/absent.

For comparison, here is a video of someone testing for the same scam on a different device. On that device, it uses the over-write method, which corrupts previously stored files.

If it’s not happening already, I expect that scammers will use inline compression to make the device last a little longer before it gets detected. I haven’t seen anyone talking about this yet, and I haven’t seen any evidence that it is being used. But it would totally make sense for them to do at some point, if they aren’t already. The main reasons why they might not are:

Also note how various places noted the corrupt/missing Alternate GPT at the end of the drive:

Above: Screenshot of dmesg showing errors about the Alternate/backup GPT.

Above: Screenshot of fdisk showing errors about the Alternate/backup GPT.

Test structure

My tool would copy from a known location on another drive to a unique directory (using a timestamp) on the suspect drive. It would repeat this into unique directories until it was told to stop, or a specified iteration limit was reached.

Since the expected state was easy to measure, it periodically did a complete check of the contents of the directory to make sure that it wasn’t corrupted. However, while this is exact, it’s very slow.
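The copy-and-check loop can be sketched roughly like this. This is a simplified illustration and not the code from my tool: the function names are made up for this example, it checks every copy straight after writing it rather than periodically, and it only checks the newest copy. A read straight after a write may also be served from the OS disk cache, so a real test needs to get past the cache (for example by remounting the drive) before trusting a pass.

```python
import hashlib
import os
import shutil
import time


def hash_tree(root: str) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            sha = hashlib.sha256()
            with open(full, "rb") as handle:
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    sha.update(chunk)
            digests[os.path.relpath(full, root)] = sha.hexdigest()
    return digests


def fill_and_verify(source: str, target_root: str, max_iterations: int) -> int:
    """Copy source repeatedly; return the failing iteration, or -1 if none."""
    expected = hash_tree(source)
    for iteration in range(max_iterations):
        # Timestamped name so that every copy lands in a unique directory.
        name = f"copy-{time.time_ns()}-{iteration}"
        destination = os.path.join(target_root, name)
        try:
            shutil.copytree(source, destination)
        except OSError:
            return iteration
        # The slow, exact check: every byte of every file is read back.
        if hash_tree(destination) != expected:
            return iteration
    return -1
```

Because it only looks at the newest copy, this sketch is better suited to spotting the black hole method than the overwrite method, where it is the older copies that get damaged.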

So I added a fast check method, which compares a small section at the beginning and end of each file. This gives a definite fail and a probable pass, i.e. if you see a fail, you know it has failed. If you see a pass, it is probably a pass, but might actually be a fail. This was plenty good enough considering how regularly fails happen once the physical capacity has been exceeded.
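Here is a minimal sketch of that idea. Again, this is an illustration rather than the code from my tool. The function name and the 4096 byte sample size are assumptions for the example.

```python
import os


def fast_check(original: str, copy: str, sample_bytes: int = 4096) -> bool:
    """Return False on a definite mismatch, True on a probable match."""
    try:
        size = os.path.getsize(original)
        if os.path.getsize(copy) != size:
            return False
        with open(original, "rb") as a, open(copy, "rb") as b:
            # Compare the first sample_bytes of each file.
            if a.read(sample_bytes) != b.read(sample_bytes):
                return False
            # Compare the last sample_bytes of each file.
            offset = max(size - sample_bytes, 0)
            a.seek(offset)
            b.seek(offset)
            return a.read(sample_bytes) == b.read(sample_bytes)
    except OSError:
        # A missing or unreadable file counts as a failure.
        return False
```

Corruption that sits only in the middle of a file slips straight past this, which is exactly the probable-pass caveat. That is why the slow, complete check is still needed from time to time.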

exFAT specific characteristics
vFAT specific characteristics

How did the advertised specs match up to reality?

FEATURES:

What’s Included:

How does it work?

When you connect the device to your computer, your computer talks to a controller on the device. That controller tells your computer about the device, including things like the capacity. It also talks to the storage on your behalf.

The scam works by reprogramming the controller to misrepresent the storage size, and then doing something dodgy when it’s asked to store something in an area of the storage that it says exists, but doesn’t.

Above: Diagram showing how the computer connects to the storage.
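To make that concrete, here is a toy model of such a controller. It is only a sketch for building intuition. I don’t know how the real firmware is implemented, the class and method names are invented, and the wrap-around used for the overwrite mode is an assumption about how that variant might behave.

```python
class FakeSizeController:
    """Toy model of a controller that advertises more blocks than it has."""

    def __init__(self, advertised_blocks: int, real_blocks: int, mode: str):
        if mode not in ("black_hole", "overwrite"):
            raise ValueError("mode must be 'black_hole' or 'overwrite'")
        self.advertised_blocks = advertised_blocks
        self.real_blocks = real_blocks
        self.mode = mode
        self._storage = {}  # Real block index -> stored data.

    def capacity(self) -> int:
        # The lie: report the advertised size, not the real one.
        return self.advertised_blocks

    def write(self, block: int, data: bytes) -> None:
        if not 0 <= block < self.advertised_blocks:
            raise IndexError("block outside advertised range")
        if block < self.real_blocks:
            self._storage[block] = data
        elif self.mode == "overwrite":
            # Assumed behaviour: wrap around and clobber older data.
            self._storage[block % self.real_blocks] = data
        # In black_hole mode the data is dropped, but no error is raised.

    def read(self, block: int) -> bytes:
        if not 0 <= block < self.advertised_blocks:
            raise IndexError("block outside advertised range")
        if block >= self.real_blocks and self.mode == "black_hole":
            return b"\x00"  # Nothing was ever stored here.
        return self._storage.get(block % self.real_blocks, b"\x00")
```

In both modes every write appears to succeed. The difference only shows up when reading back: the black hole mode loses the newest data, while the overwrite mode damages the oldest.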

Why is this a big deal?

Hopefully we already agree that getting scammed is bad. But this is worse than that. The drive is actively designed to hide that it is failing. And it will definitely fail. It’s not a far off statistical probability. It will fail. And the nature of the drive is such that people are most likely to use it for backups, or for storing stuff that they don’t need to access often.

While it’s expected that people should test their backups from time to time, that’s a heavy process and gets heavier depending on how thorough you are with the testing. Furthermore, if you test your backup by reading a sample of random files, there’s a high chance that you won’t notice the problem.

Add to that: if the drive is formatted as exfat, which is likely to be the case for most drives now and was the default for this drive, files will silently go missing when the drive is disconnected from the computer. So any files you do actually see and test are likely to be fine when you test them, but not actually fine when you need them.

Writing a new backup to the drive won’t make the problem obvious either, although it will likely need to do some time-consuming work to replace the missing data. So if you’re expecting it to be quick (because nothing should need to be done), this might let you know. But then what assumptions do you make next? Do you accept that it’s now up-to-date? Do you test it one more time? If you test it one more time, the OS disk cache may mask the fact that the data needs to be written again.

Basically, there are multiple ways that someone could miss that their data has gone missing, even if they have explicitly checked. There’s a high chance of people losing data that they thought was backed up and therefore safe. Imagine whatever you specifically don’t want to lose, and what the consequences of losing it would be. How bad would that be now? How bad would that be in a few years when you don’t remember what was on the drive?

What should you look out for?

Before buying

This is hard. The best numbers I can give you today will be blurry in a few weeks, at best. These scams work because they are just plausible enough to be believable, but just fantastic enough to be very attractive. And they last just long enough that you’re unlikely to notice before it’s too late. So here are some things to look out for. Any of them could be a mistake (certainly not uncommon on Amazon), but as you see more of them, it will give you more confidence that it might be a scam:

I’ve always been an advocate for trying new brands, particularly when they are doing something better than the big brands. But this is one area where this strategy is more likely to bring pain. Your storage is incredibly important. It’s worth having as much trust in the transaction as possible. Here are some things that might help build that trust:

What can you do to be more confident in your storage?

Test it.

Understand that for any rule that I give you here, scammers will find a way to hide the scam from it. Therefore this information will age. But it will hopefully give you a feel for how to build your confidence.

Tools

This will be your most time-efficient way to test the drive, and will help avoid human error/bias when testing.

I’ve written one called testForFakeStorageSize. This is in very early stages, and is aimed at a highly technical user. Therefore I’d suggest trying one of:

I haven’t tried either of these, and haven’t been able to find the official source for them. So I suggest taking care. But if you get one of these set up, it is likely to be your best bet.

No matter what tools you try, make sure that they are up-to-date to give you the best chance of detecting the latest techniques.

What to look for when manually testing the drive?

Note that different file systems behave differently. The different scam methods behave differently. And new scam methods will likely appear over time. So some items in the list below may match your drive, while others may not. You’ll need to use your judgement as to whether the drive is healthy, and further whether it’s a scam, or just broken:

Things to do:

A note about the specifications that I have highlighted

This scam can be done with any capacity. I have chosen the ones that I have because they are on the edge of what’s likely right now, so they are easier to spot.

Is this on other Amazon stores?

Here is a small sample of the available localisations as at: 2022-06-24.

Who in the chain is doing this?

I don’t know. It could be anywhere from a disgruntled/dodgy employee in the factory who had the knowledge to reprogram the controller, through to a vendor on Amazon specifically seeking the drives to make a quick buck, or anywhere in between. It’s completely plausible that anyone closer to the buyer than where the scam took place, had no idea that it was a scam. But I will give you a couple of data points:

What would I like to change?

This is not a case of “you get what you pay for”. This is a scam, and not poor quality.

Summary

It sucks that this sort of scam is being so heavily promoted on a platform that used to feel pretty safe. Fortunately, for the purposes of this conversation, Amazon’s returns process is excellent, which takes away a lot of the stress of discovering that you have a scam unit. It’s just a massive shame that many people aren’t going to notice that they have a scam unit until they’ve exceeded the return window, but more importantly, lost potentially important data.

Amazon needs to get on top of this, but this is likely to be a game of whack-a-mole, and is a really hard problem to solve. Any solution will likely create some friction somewhere else. But in the meantime, it should be really easy to report a scam product.

For consumers… We’re probably going to have to accept that these scams will be commonplace for the foreseeable future.
