Quantifying a performance budget using HTTP Archive data
As network connection speeds are highly variable, performance targets set in terms of load time can be hard to work towards. To ensure page performance targets are met, it is a good idea to set a budget defined in quantifiable and repeatable metrics, such as the total number of HTTP requests and the amount of data transferred. Ideally this budget should be set based on testing and RUM (real user monitoring) data, but for new sites, or other sites that have not generated much data, the HTTP Archive can be used as a data set. Based on this, a reasonable budget for a site is: 350KB total data in 25 requests, a Page Speed Score of 80% and a Speed Index of 1500.
Metrics
The following metrics are used for setting and measuring the budget:
- onLoad time
- Time taken in milliseconds from the start of the request to the browser's onLoad event firing. (Lower is better.)
- Total data
- Total amount of data in KB transferred for the page and all assets. (Lower is better.)
- HTTP requests
- Total number of HTTP requests required to load the page and all assets. (Lower is better.)
- Page Speed Score
- Percentage score as measured by Google's PageSpeed Insights tool. (Higher is better.)
- Speed Index
- Measure of visual completeness over time as used by webpagetest.org. (Lower is better.)
Method
The HTTP Archive data set for pages on a desktop browser from June 2014 was used. Averages for data ranges were calculated using MySQL's AVG function and the data set was segmented using LIMIT and OFFSET. As my SQL-fu is limited I used Python to generate the segment bands. The data analysis script is available as a Gist.
The data was segmented to provide average values for the top 5%, the top 5%-10%, the top 10%-25%, the top 25%-50% and the bottom 50% for each metric. Whilst the onLoad time was used as the basis of the budget, the analysis was run for the other metrics to ensure the values were reasonable. The data and analysis for the additional metrics are included as an appendix.
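The script itself lives in the Gist and is not reproduced here. As a rough sketch of the approach described above, the bands and per-band queries could be generated along these lines. The table name `pages` and the column names are placeholders assumed for illustration and may not match the real HTTP Archive schema; the function names are hypothetical too.

```python
# Band edges as fractions of the sorted data set, matching the segments above.
BAND_EDGES = [0.0, 0.05, 0.10, 0.25, 0.50, 1.0]

# Placeholder column names: assumptions for illustration, not the real schema.
METRIC_COLUMNS = ["onLoad", "bytesTotal", "reqTotal", "PageSpeed", "SpeedIndex"]


def segment_bands(total_rows, edges=BAND_EDGES):
    """Return a (label, offset, limit) tuple for each band."""
    bands = []
    for lower, upper in zip(edges, edges[1:]):
        offset = int(total_rows * lower)
        limit = int(total_rows * upper) - offset
        label = "{:.0f}%-{:.0f}%".format(lower * 100, upper * 100)
        bands.append((label, offset, limit))
    return bands


def band_query(sort_column, offset, limit, descending=False, table="pages"):
    """Build a query averaging every metric over one band of sorted rows."""
    averages = ", ".join("AVG({0}) AS avg_{0}".format(c) for c in METRIC_COLUMNS)
    columns = ", ".join(METRIC_COLUMNS)
    direction = "DESC" if descending else "ASC"
    # LIMIT and OFFSET sit in a subquery so AVG only sees the rows in the band.
    return (
        "SELECT {averages} FROM ("
        "SELECT {columns} FROM {table} "
        "ORDER BY {sort_column} {direction} "
        "LIMIT {limit} OFFSET {offset}"
        ") AS band"
    ).format(
        averages=averages,
        columns=columns,
        table=table,
        sort_column=sort_column,
        direction=direction,
        limit=limit,
        offset=offset,
    )
```

Calling `segment_bands(289784)` and passing each offset and limit to `band_query("onLoad", offset, limit)` would give one query per band. For a metric where higher is better, such as Page Speed Score, `descending=True` puts the best pages in the first band.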
Results
The data for onLoad is shown below.
onLoad time
Segment | onLoad time (ms) | Total data (KB) | Number of HTTP requests | Page Speed Score (%) | Speed Index |
---|---|---|---|---|---|
0%-5% | 1065 | 190 | 13 | 81 | 1137 |
5%-10% | 2130 | 395 | 28 | 78 | 1770 |
10%-25% | 3445 | 726 | 46 | 78 | 2607 |
25%-50% | 5814 | 1172 | 72 | 78 | 3817 |
50%-100% | 14394 | 2841 | 137 | 77 | 7148 |
There were 289,784 pages in the data set.
Conclusion
There are some things about the data from the HTTP archive that should be taken into consideration when using this to set a budget.
- Sites and URLs from the archive were not checked; the fastest pages are likely to be those that contain little content, so they may not be a representative sample.
- Using averages with little data validation can give misleading results. In particular, the standard deviation was not calculated, so the spread of values within each band is unknown.
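To illustrate the second caveat, here is a minimal sketch of how a band could be summarised with its spread as well as its average. The function name is hypothetical and this check was not part of the analysis; it only shows what such a validation step might look like.

```python
import statistics


def summarise_band(values):
    """Return the mean, median, standard deviation and relative spread."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    return {
        "mean": mean,
        "median": statistics.median(values),
        "stdev": stdev,
        # Coefficient of variation: standard deviation relative to the mean.
        "cv": stdev / mean if mean else float("nan"),
    }
```

A band whose mean sits well above its median, or whose coefficient of variation is large, is being pulled around by a few outliers, and its average should be treated with more caution.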
When choosing the values I have opted for round numbers and have aimed for the budget to sit comfortably in the 5%-10% band. The budget is intended to be achievable rather than a serious challenge.
Based on this I have set my recommended budget as: 350KB total data in 25 requests, a Page Speed Score of 80% and a Speed Index of 1500.
This budget is set without considering any external factors so, whilst it can provide a reasonable starting point, any budget you choose should be validated with testing and RUM data.
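Because the budget is defined in repeatable metrics, it can be checked automatically. The following is a minimal sketch, assuming the measurements for a page have already been collected into a dictionary; the key names and the `check_budget` function are hypothetical.

```python
# The recommended budget: "max" values must not be exceeded,
# "min" values must be reached.
BUDGET = {
    "total_kb": ("max", 350),
    "requests": ("max", 25),
    "page_speed_score": ("min", 80),
    "speed_index": ("max", 1500),
}


def check_budget(measured, budget=BUDGET):
    """Return a list of messages, one per metric that misses the budget."""
    failures = []
    for metric, (kind, limit) in budget.items():
        value = measured[metric]
        missed = value > limit if kind == "max" else value < limit
        if missed:
            failures.append(
                "{}: {} (budget: {} {})".format(metric, value, kind, limit)
            )
    return failures
```

An empty list means the page is within budget. Fed the averages from the results table, the 0%-5% band passes on every metric, while the 5%-10% band misses all four.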
Appendix
This budget has been set based on the onLoad time as the time taken to load a page was used as the key metric. However, to ensure focusing on this one metric hasn't produced values that are unreasonable, similar segmenting was carried out on the other metrics.
Total data
Segment | onLoad time (ms) | Total data (KB) | Number of HTTP requests | Page Speed Score (%) | Speed Index |
---|---|---|---|---|---|
0%-5% | 1874 | 38 | 10 | 80 | 1467 |
5%-10% | 3197 | 158 | 25 | 78 | 2261 |
10%-25% | 4817 | 394 | 46 | 78 | 3100 |
25%-50% | 7097 | 866 | 76 | 79 | 4217 |
50%-100% | 13153 | 3133 | 136 | 77 | 6719 |
Total amount of data transferred has a good correlation with load time [1]. The value chosen for our budget is outside the top 5%-10%, suggesting we might look to lower this.
HTTP requests
Segment | onLoad time (ms) | Total data (KB) | Number of HTTP requests | Page Speed Score (%) | Speed Index |
---|---|---|---|---|---|
0%-5% | 1893 | 119 | 6 | 82 | 1533 |
5%-10% | 3301 | 383 | 17 | 79 | 2471 |
10%-25% | 5061 | 792 | 34 | 76 | 3378 |
25%-50% | 7270 | 1333 | 61 | 77 | 4477 |
50%-100% | 12981 | 2749 | 148 | 78 | 6478 |
Total number of HTTP requests has a good correlation with load time [1]. The value chosen for our budget is outside the top 5%-10%, suggesting we might look to lower this.
Page Speed Score
Segment | onLoad time (ms) | Total data (KB) | Number of HTTP requests | Page Speed Score (%) | Speed Index |
---|---|---|---|---|---|
0%-5% | 7238 | 1829 | 73 | 96 | 3818 |
5%-10% | 8063 | 1645 | 80 | 93 | 4341 |
10%-25% | 8629 | 1600 | 90 | 91 | 4610 |
25%-50% | 9308 | 1712 | 101 | 85 | 4924 |
50%-100% | 9881 | 2021 | 98 | 67 | 5481 |
Page Speed Score seems to have little correlation with load time, or other metrics. Whilst the value chosen for our budget is in the bottom 50% the lack of correlation suggests that it's not too much of a concern.
This result is particularly interesting - if there is little correlation between Page Speed Score and load time then why even bother setting a target for it in the budget? There are two key considerations: firstly it is an objective and repeatable measurement and secondly it is a check-list of techniques so, by setting a budget for Page Speed Score, it helps ensure that these techniques are adopted.
Speed Index
Segment | onLoad time (ms) | Total data (KB) | Number of HTTP requests | Page Speed Score (%) | Speed Index |
---|---|---|---|---|---|
0%-5% | 1685 | 199 | 18 | 81 | 817 |
5%-10% | 3156 | 525 | 40 | 80 | 1442 |
10%-25% | 4778 | 862 | 60 | 80 | 2158 |
25%-50% | 7001 | 1339 | 83 | 79 | 3321 |
50%-100% | 13236 | 2703 | 126 | 76 | 7592 |
Speed Index has a good correlation with load time [1]. The value chosen for our budget is still within our top 5%-10% target.
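One rough way to read the correlation claims in this appendix is to compare how far apart the best and worst bands are in onLoad time when the pages are segmented by each metric. The sketch below uses the onLoad averages from the four tables above; the spread ratio is a crude illustration chosen here, not a formal correlation measure, and the names are hypothetical.

```python
# Average onLoad time (ms) per band, copied from the appendix tables,
# keyed by the metric the pages were segmented on.
ONLOAD_BY_SEGMENTING_METRIC = {
    "Total data": [1874, 3197, 4817, 7097, 13153],
    "HTTP requests": [1893, 3301, 5061, 7270, 12981],
    "Page Speed Score": [7238, 8063, 8629, 9308, 9881],
    "Speed Index": [1685, 3156, 4778, 7001, 13236],
}


def onload_spread(band_averages):
    """Ratio of the worst band's average onLoad time to the best band's."""
    return band_averages[-1] / band_averages[0]


for metric, averages in ONLOAD_BY_SEGMENTING_METRIC.items():
    print("{}: {:.1f}x".format(metric, onload_spread(averages)))
```

Segmenting by total data, HTTP requests or Speed Index separates the best and worst bands by roughly a factor of seven or more in load time, whereas segmenting by Page Speed Score separates them by a factor of less than 1.4.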
Footnotes
1. This ties in with the HTTP Archive's own analysis.