
Crosshare difficulty ratings

Crosshare recently became the first crossword software to implement automatically generated difficulty ratings for crossword puzzles. This article explains a bit about how those ratings work.

Assessing crossword difficulty

The most well-known difficulty rating system for crosswords is that used by the New York Times. The Times editors aim to place puzzles with similar difficulty levels and features (e.g. themes, grid size) on the same day of the week. So when you describe a puzzle as being "Tuesday-like", other solvers usually know what you're talking about.
Instead of using days of the week, Crosshare's difficulty scale uses symbols familiar to anybody who has been on a ski slope:
  • a green circle means a puzzle will be easy
  • a blue square means it's medium difficulty
  • a black diamond is a difficult puzzle
  • a double black diamond is very difficult

How are Crosshare ratings computed?

In chess, rating systems have been around since the mid-20th century for assessing the relative skill of different players. Everybody starts with a base rating. When you win a match, your rating goes up and your opponent's goes down. When you lose, the opposite occurs. If you lose to an opponent with a very high rating, yours won't drop by much. If you lose to an opponent with a low rating, it drops dramatically.
Crosshare's difficulty ratings work on the same principle. Every solver and every puzzle starts with a base rating of 1500. When a solver completes a puzzle without using check or reveal, their rating goes up and the puzzle's goes down. When the solver needs to reveal to finish a grid, their rating goes down and the puzzle's goes up. Crosshare also maintains a separate statistic for each puzzle and solver that indicates our confidence in their rating. The more times a puzzle is solved (or the more puzzles a solver solves), the more confident Crosshare becomes in its rating. This confidence score is used when updating ratings: solving a puzzle that we're very confident is hard counts for more than solving one we're still not sure about.
Crosshare ratings are updated once per day based on all of the solves that occurred during that day.
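To make the idea concrete, here is a minimal sketch of a rating update in Python. It is not Crosshare's actual code: it borrows the classic Elo expected-score formula as a stand-in, and the step size K, the 0-to-1 confidence scale, and all names are assumptions made up for illustration. It also updates after a single solve, whereas Crosshare batches updates once per day.

```python
from dataclasses import dataclass

BASE_RATING = 1500.0  # starting rating stated in the article
K = 32.0              # hypothetical step size, not a Crosshare value


@dataclass
class Rated:
    """A solver or a puzzle: a rating plus how much we trust it."""
    rating: float = BASE_RATING
    confidence: float = 0.5  # hypothetical 0..1 scale; 1 means fully trusted


def expected_success(solver_rating: float, puzzle_rating: float) -> float:
    """Elo-style estimate of the chance of a solve without check/reveal."""
    return 1.0 / (1.0 + 10 ** ((puzzle_rating - solver_rating) / 400.0))


def apply_solve(solver: Rated, puzzle: Rated, clean: bool) -> None:
    """Update both ratings in place after one solve."""
    outcome = 1.0 if clean else 0.0
    surprise = outcome - expected_success(solver.rating, puzzle.rating)
    # Each side moves in proportion to how much we trust the other side's
    # rating (assumed rule, loosely following the article's description).
    solver_delta = K * puzzle.confidence * surprise
    puzzle_delta = -K * solver.confidence * surprise
    solver.rating += solver_delta
    puzzle.rating += puzzle_delta


solver = Rated()                 # new solver, low confidence
puzzle = Rated(confidence=1.0)   # well-established puzzle
apply_solve(solver, puzzle, clean=True)
print(solver.rating, puzzle.rating)  # prints 1516.0 1492.0
```

In this sketch the new solver gains more than the established puzzle loses, because the puzzle's rating is trusted and the solver's is not yet.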

What do the numbers mean?

When you click on one of the difficulty indicators next to a puzzle, Crosshare will show you the current numerical rating for that puzzle. This number doesn't mean much on its own, but it can be compared with the ratings of other puzzles to see which is trickier: the higher the number, the tougher the puzzle.

How are the symbols decided?

Given a solver's rating and a puzzle's rating, Crosshare can estimate the likelihood that the solver will finish the puzzle without check/reveal. If the solver has above an 80% chance of success, the puzzle gets an "easy" badge. If it's above 50%, the badge is "medium". If it's above 25%, the badge is "difficult", and otherwise it's "very difficult". Notably, these symbols are customized to the solver in question. The more puzzles you solve on Crosshare, the more accurate the difficulty indicators will get for you.
When a puzzle is marked as "unsure", it means that Crosshare does not yet have enough confidence in its rating (see above for an explanation of confidence). Anecdotally, puzzles seem to need about 3 to 5 total solves before Crosshare gains enough confidence to display their rating.
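The thresholds above can be sketched as a small Python function. The cutoffs come from this article; the function name and the boolean confidence flag are hypothetical, and how Crosshare actually decides that a rating is trustworthy is not shown here.

```python
def badge(p_success: float, confident: bool = True) -> str:
    """Map an estimated chance of a clean solve to a difficulty badge.

    p_success is the estimated probability (0..1) that this solver
    finishes the puzzle without check/reveal. confident is a hypothetical
    flag for whether the puzzle's rating is trusted enough to display.
    """
    if not confident:
        return "unsure"
    if p_success > 0.80:
        return "easy"            # green circle
    if p_success > 0.50:
        return "medium"          # blue square
    if p_success > 0.25:
        return "difficult"       # black diamond
    return "very difficult"      # double black diamond


print(badge(0.90))                   # prints easy
print(badge(0.60))                   # prints medium
print(badge(0.10))                   # prints very difficult
print(badge(0.90, confident=False))  # prints unsure
```

Because the probability depends on the solver's own rating, the same puzzle can show a green circle for one solver and a black diamond for another.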

Pros / cons of this approach

This approach seems to give a great "rough idea" of how hard a puzzle will be to solve. There are some notable weaknesses in the approach, though:
  • Puzzles frequently have domain-specific themes / knowledge. If you're solving a Dungeons and Dragons puzzle that's rated "very difficult" but you're a DnD master, that rating probably won't be accurate for you.
  • A puzzle might be very easy overall but feature one impossibly tricky clue (a "Natick"). Crosshare will rate the puzzle the same as another grid that is tough all-around.

How can I find out my solver rating?

Soon we will be adding a mechanism for you to track your own solver rating. For the reasons given above, we caution you against putting too much stock in how your rating compares to other solvers'. Another issue is that some solvers might choose to look up an entry they don't know on Google, while others won't. That said, tracking your own rating might be an interesting way to see how your solving has progressed over time.

This article is part of a series of posts designed to teach visitors about crosswords in general as well as some Crosshare-specific features. If you have any questions or suggestions for this or other articles, please contact us via email or Discord.

We're seeking volunteers to help expand and edit this knowledge base so it becomes more useful for constructors and solvers. If you're interested, please reach out!
