
Why is the time estimated on computer installation/download/repairs etc. ALWAYS wrong?

Welcome back to another fantastic edition of general-knowledge questions!

542 internet users had this question: Explain it to me: Why is the time estimated on computer installation/download/repairs etc. ALWAYS wrong?

No matter what it is, be it an installation, a download, a diagnostic process.

Whenever a computer does a process that it gives a time estimate for it is always wrong. While I understand that for things like a download, speed can change and so does the time, I don't understand why it cannot reliably predict how long a process on local hardware will take?

And here are the answers:

Your processor has many applications that want its time. To give you the illusion they’re ALL running at once, it quickly swaps between them.
So, when the application in question is briefly running and asks itself “when might I be done?”, it has to guess; it’s at the mercy of the operating system’s scheduler.

So just like a download is at the mercy of the Wild West that is the internet, so too are your local applications, as there are often hundreds always running in the background, each of which could spring into action at any time.
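To illustrate the point, here is a minimal Python sketch (the workload is invented): an application can only measure how long its chunks of work take in wall-clock time, which includes whatever time the scheduler handed to other processes, so any estimate built on those measurements inherits that uncertainty.

```python
import time

def do_chunk(iterations=500_000):
    """Hypothetical unit of work standing in for 'install one piece'."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total

wall_start, cpu_start = time.monotonic(), time.process_time()
for chunk in range(10):
    do_chunk()
wall = time.monotonic() - wall_start
cpu = time.process_time() - cpu_start

# Wall-clock time includes whatever the scheduler spent on other processes;
# CPU time is what this process actually got. An ETA based on wall-clock
# measurements inherits that unpredictability.
print(f"wall: {wall:.2f}s, cpu: {cpu:.2f}s, lost to sharing: {wall - cpu:.2f}s")
```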

When an install or download starts, the estimated time is based on data transfer. As soon as a connection is made the estimation starts, but at first it is only transferring small files, so each one may take a fraction of a second and then the installer has to find space for the next file to begin. Each time it has to find a new slot there is no data transferring, which is why the early estimate is so long. As it reaches the main files, which are normally larger, the data transfer stays high for longer stretches, and the estimate changes and becomes more accurate. As it closes out the install it again handles some small files, each of which needs space found before it can be transferred, so the estimate shifts once more. Even if it says 5kb is left, that 5kb might be 100 text files that all need to be placed, so instead of taking a fraction of a second it could take a second or two.
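Here is a toy simulation of that behaviour in Python (all the numbers and file sizes are made up): an installer that divides remaining bytes by the transfer rate it has observed so far produces an ETA that balloons during the small-file phases and shrinks again once the big files start.

```python
PER_FILE_SETUP = 0.05          # seconds spent finding space before each file (assumed)
WRITE_SPEED = 100_000_000      # bytes per second once data is flowing (assumed)

files = [5_000] * 200 + [500_000_000] * 4 + [5_000] * 200   # small, big, small

total_bytes = sum(files)
done_bytes = 0.0
elapsed = 0.0

for i, size in enumerate(files):
    elapsed += PER_FILE_SETUP + size / WRITE_SPEED
    done_bytes += size
    observed_rate = done_bytes / elapsed            # rate seen so far
    eta = (total_bytes - done_bytes) / observed_rate
    if i % 100 == 0 or size > 1_000_000:
        # The ETA swings wildly even though the simulated work is deterministic.
        print(f"file {i:3d}: {100 * done_bytes / total_bytes:5.1f}% done, ETA {eta:6.1f}s")

print(f"actual elapsed: {elapsed:.1f}s")
```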

Installs and downloads rely on the hard disc write speed (fragmentation also affects this) as well as data transfer speeds. Downloads can also fluctuate because of changing sources, but their estimates are generally closer because there are usually fewer small files to transfer.

When writing to a disc it needs to (a) find where to write something, (b) write it there, (c) mark where it ends, and (d) note where to find it again. This is why those little files take longer than it seems they should, and since installs are full of such files all the way through, that is why the estimate is wrong. E.g. a game may have 1000 tiny files for programming and such, and 10 large files for video, audio, game data, etc.

Downloads are actually pretty easy to predict (as long as the connection is somewhat stable) because what the program is doing is fairly uniform: download some chunk of data, write it to the disk. Once you’ve done that a few times and you know how big the actual file is (i.e. how many more times you have to repeat that process until you’re done), you can predict pretty well how much longer it’s going to take. You just take the average of the last x cycles and multiply it by the number of cycles left.
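As a sketch of that approach in Python (the function names and chunk count are my own, and the download is simulated with a short sleep):

```python
from collections import deque
import time

def download_chunk(i):
    """Placeholder for the real 'fetch one chunk and write it' work."""
    time.sleep(0.01)

def eta_seconds(recent_cycle_times, cycles_left):
    # Average of the last few cycles, multiplied by the cycles still to do.
    return sum(recent_cycle_times) / len(recent_cycle_times) * cycles_left

total_chunks = 100
recent = deque(maxlen=10)          # keep only the last 10 cycle times

for i in range(total_chunks):
    start = time.monotonic()
    download_chunk(i)
    recent.append(time.monotonic() - start)
    if i % 20 == 0:
        print(f"chunk {i}: ETA {eta_seconds(recent, total_chunks - (i + 1)):.2f}s")
```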

The problem starts when you have to do very different things in what looks to the user like one process. Let’s say I’m installing a game (to the user, one process). But at the working level that involves multiple steps with completely different properties. One subprocess is downloading the packed archives from some server (quite easy to measure, as I just talked about). But the next step is uncompressing those to the disk. How does that compare to downloading? Hard to say upfront. Downloading depends practically only on download speed and disk write speed. Unzipping depends on how fast the CPU is, how much CPU time is consumed by other processes, and how fast the disk can read and write. Were we capped by the download speed before? Can the disk actually write much faster? How does the disk read speed compare to the write speed? Will some antivirus guard delay our write calls per file? We don’t know yet.

Notice that the uncompressing part by itself can again be predicted okay-ish as well, but only once we’ve started it. It’s worse than downloading an archive, because depending on the file sizes in that archive the OS might handle writing very unpredictably, but still okay-ish: you can uncompress, say, 50 MB three times, check how long that took on average and again multiply that by the number of 50 MB chunks left. After that you have to create some registry entries and/or shortcuts in some folders. Now that’s some completely different calls to the OS again. How long do those take? It might not even be consistent over several patches of the same OS, let alone different versions of it. Or, god forbid, we have multi-platform support, so Windows 7 vs. Windows 10 vs. Mac OS vs. whatever powers a PlayStation. (Actually, now I’m a little curious.)

So to sum up: the individual steps needed – when they are quite uniform – can be estimated pretty well once you’ve done some part of them. That’s also why you often see different progress bars for individual steps, or at least status messages about which step we’re currently in. That’s the best info we have at that point in time. What’s hard is bringing them together upfront, especially with data that is unknown at the start of the whole process. The best you can do is rough estimations with a lot of assumptions about the computer the process will be running on.
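A rough sketch of that “per step” idea in Python (the phase names and weights are guesses, exactly the kind of upfront assumption the answer is talking about): each phase gets an assumed share of the total, and only the phase currently running contributes a measured fraction.

```python
# Assumed shares of total time; they are decided before anything has run,
# so they are only as good as the guesses behind them.
phases = [
    ("download",   0.50),
    ("uncompress", 0.40),
    ("shortcuts",  0.10),
]

def overall_progress(current_phase, fraction_of_current_phase_done):
    """Weighted sum: finished phases count fully, the current one partially."""
    done = 0.0
    for name, weight in phases:
        if name == current_phase:
            return done + weight * fraction_of_current_phase_done
        done += weight
    return done

print(overall_progress("uncompress", 0.25))   # 0.50 + 0.40 * 0.25 = 0.60
```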

I mean, you could totally measure most of it with a “test download”, “test unzip”, “test xy” before the actual process takes place, and that would improve the upfront estimation significantly. But at some point the effort and software complexity is just ridiculous compared to a slightly more accurate progress bar. Also, I’m not sure how happy users would be if they found out we actually added two minutes to their installation process doing pointless downloading and unzipping just to measure their system in order to improve estimation accuracy. I believe in the end people prefer if it actually gets done as fast as possible.

But I admit it is kinda annoying when the first 60% take 5 minutes, the next 30% take 8 minutes, and then the last 10% go through in a minute.

Because modern hardware is not very predictable, and systems multitask.

Take a hard disk. The read/write head (at the tip of the triangular arm that swings over the platter) must be in the right position to write. If it’s right where it’s needed, it can write instantly. If not, it needs to move, which takes time, and it also needs to wait until the platter under it rotates into the right position, which also takes time.

Plus, different things need to be written in different places, so a large number of movements will be needed. The computer isn’t aware of any of this: it just says “Write this data into block #52343”, and it’s up to the drive to do all the positioning. So there’s no way for the computer to know how long it will take. It can make an estimate by measuring averages, but nothing says the first half of the installation can’t require less head movement than the second half.

Then there’s the fact that systems multitask. Right while you’re installing, the antivirus might be scanning something, or not. And the installer has no idea what the antivirus is doing, how long that takes, or which parts of the install process are going to be affected by it.

There’s also the fact that many estimates are very rough to start with. Creating a file takes time, so copying a single 1GB file is going to be faster than copying 1000 1MB files, and many installers don’t take that into account and just go by a measure of total data transferred and an estimate of how many MB/s the computer can do.
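Some back-of-the-envelope Python for that point (the write speed and per-file cost are invented): a naive “total bytes divided by MB/s” estimate gives the same answer for one 1GB file and for 1000 1MB files, while a model that charges a fixed cost per file does not.

```python
WRITE_SPEED = 100 * 1024**2      # 100 MB/s sustained write speed (assumed)
PER_FILE_COST = 0.02             # 20 ms to create/open/close each file (assumed)

def naive_estimate(total_bytes):
    return total_bytes / WRITE_SPEED

def overhead_aware_estimate(file_sizes):
    return len(file_sizes) * PER_FILE_COST + sum(file_sizes) / WRITE_SPEED

one_big    = [1024**3]           # a single 1 GB file
many_small = [1024**2] * 1000    # 1000 files of 1 MB each

print(naive_estimate(sum(one_big)), overhead_aware_estimate(one_big))
print(naive_estimate(sum(many_small)), overhead_aware_estimate(many_small))
# Both cases move the same 1 GB, so the naive estimate is identical (~10.24 s),
# but in this model the many-small-files case takes about 20 s longer.
```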

There are also complex processes, where for instance some steps may be optional, or consume different amounts of time depending on what you’re installing on, and that can be difficult to take into account.

The reason it’s not accurate is that the estimates and percentages are a way to tell you that it’s still installing and not stuck. They were never meant to be precise.

And downloads are simply based on the remaining size to download divided by the throughput of the transfer.
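In formula terms that’s roughly ETA = remaining bytes / throughput. A small Python sketch (the smoothing is my addition, a common refinement rather than anything stated in the answers) so the estimate doesn’t jump on every measurement:

```python
def update_eta(remaining_bytes, measured_bps, smoothed_bps, alpha=0.3):
    """Blend the latest throughput sample into a running average, then divide."""
    smoothed_bps = alpha * measured_bps + (1 - alpha) * smoothed_bps
    return remaining_bytes / smoothed_bps, smoothed_bps

eta, rate = update_eta(remaining_bytes=500_000_000,
                       measured_bps=12_000_000,
                       smoothed_bps=10_000_000)
print(f"ETA ~ {eta:.0f} s at ~ {rate / 1e6:.1f} MB/s")
```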