Cheat detection

Whenever a challenge is solved, a cheatScore between 0 and 1 is internally assigned to the solution. It indicates how likely it is that the solution was achieved by cheating.

[Screenshot: cheat detection log entries]

Cheat score calculation

The calculation currently relies only on the time difference between the current and the previous solve, in relation to the difficulty of the current challenge. The cheat score also factors in whether hints and/or tutorials are enabled or disabled on the Score Board.

Challenge difficulty | Minimum solve time | Without hints | With tutorial
⭐ | 2 minutes | +1 min | ÷2
⭐⭐ | 4 minutes | +2 min | ÷2
⭐⭐⭐ | 6 minutes | +3 min | ÷2
⭐⭐⭐⭐ | 8 minutes | +4 min | n/a
⭐⭐⭐⭐⭐ | 10 minutes | +5 min | n/a
⭐⭐⭐⭐⭐⭐ | 12 minutes | +6 min | n/a

The underlying formula assumes that a non-cheating user requires a certain absolute minimum amount of time to solve hacking challenges. Note that this does not imply that you are expected to need only this minimum number of minutes for a challenge of a given difficulty.
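The table above can be read as a small function of difficulty and Score Board settings. The following is a minimal sketch under stated assumptions, not the actual Juice Shop implementation: the function names are hypothetical, the order in which the "without hints" surcharge and the "with tutorial" halving combine is assumed, and the linear mapping from elapsed time to a score between 0 and 1 is purely illustrative, since the text does not spell out the formula.

```python
def minimum_solve_minutes(difficulty: int,
                          hints_enabled: bool = True,
                          tutorial_enabled: bool = False) -> float:
    """Minimum expected solve time in minutes, following the table above."""
    if not 1 <= difficulty <= 6:
        raise ValueError("difficulty must be between 1 and 6 stars")
    minutes = 2.0 * difficulty        # base: 2 minutes per star
    if not hints_enabled:
        minutes += difficulty         # without hints: +1 minute per star
    # ASSUMPTION: the halving is applied after the hint surcharge.
    if tutorial_enabled and difficulty <= 3:
        minutes /= 2                  # with tutorial: halved (n/a above 3 stars)
    return minutes


def cheat_score(minutes_since_previous_solve: float,
                minimum_minutes: float) -> float:
    """ASSUMPTION: illustrative linear ramp, clamped to the range [0, 1].

    A solve at or beyond the minimum time scores 0; an instant solve scores 1.
    """
    ratio = minutes_since_previous_solve / minimum_minutes
    return max(0.0, min(1.0, 1.0 - ratio))
```

Under these assumptions, a ⭐⭐ challenge solved 2 minutes after the previous one with hints enabled would score 0.5, because the solve took half of the 4-minute minimum.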

Coupled challenges

The cheat scoring takes into account that some challenges will be solved in the same HTTP request, for example:

  • logging in the admin user with his weak password solves logging in the admin by any means (e.g. SQL Injection), too
  • both XXE challenges automatically solve the challenge of using a deprecated B2B interface
  • the generic null byte challenge is typically solved along with the first actual exploit to access some sensitive file from /ftp

To avoid false positive cheat scoring, the second of two coupled challenge solves will never count as cheating when they happen in sequence.
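The coupling rule can be sketched as a lookup of unordered challenge pairs. This is an illustration only: the challenge keys below are hypothetical placeholders for the examples listed above, and treating each pair as symmetric is an assumption.

```python
from typing import Optional

# Hypothetical challenge keys; the real identifiers may differ.
COUPLED_PAIRS = {
    frozenset({"weak_admin_password", "login_admin"}),
    frozenset({"xxe_file_disclosure", "deprecated_b2b_interface"}),
    frozenset({"xxe_dos", "deprecated_b2b_interface"}),
    frozenset({"null_byte", "ftp_file_access"}),
}


def are_coupled(previous_key: str, current_key: str) -> bool:
    """True if the two challenges are typically solved by the same request."""
    return frozenset({previous_key, current_key}) in COUPLED_PAIRS


def adjusted_cheat_score(previous_key: Optional[str],
                         current_key: str,
                         raw_score: float) -> float:
    """Zero the score when the solve directly follows its coupled partner."""
    if previous_key is not None and are_coupled(previous_key, current_key):
        return 0.0
    return raw_score
```

Using unordered pairs means the rule applies no matter which of the two coupled challenges happens to be registered first.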

Trivial Challenges

Some challenges are so frequently solved by accident or coincidence that it would be unfair to take them into account for cheat scoring at all. This includes:

  • triggering any kind of error that is improperly handled by the application, which is often solved as a by-product of solving other challenges
  • reading the privacy policy of the shop, which is merely a challenge making fun of the fact that almost nobody reads those in real life

Total cheat score

The server also keeps track of the average cheatScore across all solved challenges in the totalCheatScore, which is exposed via the juiceshop_cheat_score metric and also sent with each challenge solution webhook call. The totalCheatScore value is not persisted across server restarts, and its calculation is not affected by automatic or manual restoring of hacking progress.

Given the individual challenge cheat scores shown in the screenshot at the beginning of this chapter, the totalCheatScore metric would look as follows:

# HELP juiceshop_cheat_score Overall probability that any challenges were solved by cheating.
# TYPE juiceshop_cheat_score gauge
juiceshop_cheat_score{app="juiceshop"} 0.49485416666666665
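Since totalCheatScore is described as the average cheatScore across all solved challenges, its bookkeeping can be sketched as follows. The class and method names are hypothetical, and returning 0 before any challenge has been solved is an assumption. Keeping the scores only in memory mirrors the statement that the value is not persisted across restarts.

```python
class CheatScoreTracker:
    """Keeps per-solve cheat scores in memory and reports their average."""

    def __init__(self) -> None:
        self._scores: list[float] = []

    def record(self, cheat_score: float) -> None:
        """Store the cheat score of one challenge solve."""
        if not 0.0 <= cheat_score <= 1.0:
            raise ValueError("cheat score must be between 0 and 1")
        self._scores.append(cheat_score)

    @property
    def total_cheat_score(self) -> float:
        """Average of all recorded scores (ASSUMPTION: 0.0 when empty)."""
        if not self._scores:
            return 0.0
        return sum(self._scores) / len(self._scores)


tracker = CheatScoreTracker()
for score in (1.0, 0.0, 0.5):
    tracker.record(score)
print(tracker.total_cheat_score)
```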

The following values for totalCheatScore were measured during activities that are definitely considered cheating:

  • >93% on final webhook call when executing all Integration tests in <1 minute on the author's Windows 10 laptop
  • >92% on final webhook call when executing all End-to-end tests in <20 minutes on the author's Windows 10 laptop

Limitations

The cheat scoring assumes that a single user is hacking the Juice Shop instance. If the application is used by a team, the values need to be considered less reliable, as extra solve speed might come from parallelization of challenges across team members. Similarly, experienced Juice Shop users will also solve challenges faster than a new user, so their speed is likely to trigger cheat detection as well.

If the Juice Shop instance is under the control of the user, any cheat score it reports via Prometheus or Webhook cannot be trusted at all.

All in all, the cheat score should never be used blindly as a tool to caution or sanction anybody. Conversely, a low score should never be used blindly to determine monetary rewards or similar.
