Hanabi Competitions
Welcome to Hanabi Competitions! This website is used to organize events where players self-assemble into teams to compete on a set of Hanabi deals. We currently only support play on hanab.live.
How it works
Check the homepage for details of active competitions. We also encourage you to join the Hanabi Central Discord server, where new competitions and competition results are announced (refer to the #hc-* channels there).
In order to compete, follow these easy steps:
- Create an account on hanab.live (it's as simple as entering a username and password).
- Organize a team of the number of players specified by the competition rules.
- For each competition deal, have a player create a table on hanab.live by following the links found on the homepage for that competition.
Scoring
Competition scoring system
Competition scoring is based on the matchpoints system used in bridge. A team gets 2 points for each team they beat, and 1 point for each team they tie. Null results, i.e. when a team plays some, but not all, of the competition deals, are not "beaten". However, ranking is done on an individual basis, in order to account for teams that for whatever reason have to change members in the middle of a competition; regardless, we recommend playing with the same team (and on the same hanab.live accounts) for each game.
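As a rough illustration of the matchpoint rule above, here is a minimal sketch for a single deal. The data layout and the function name `deal_matchpoints` are made up for this example, and it assumes that a team with a null result on a deal neither gives nor receives points there; it is not the site's actual implementation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: a result is a tuple where a larger value means
# a better game; None marks a deal the team did not play (a null result).
Result = Optional[Tuple[int, ...]]


def deal_matchpoints(results: Dict[str, Result]) -> Dict[str, int]:
    """Matchpoints per team for one deal: 2 per team beaten, 1 per tie."""
    points = {}
    for team, result in results.items():
        total = 0
        if result is not None:
            for other, other_result in results.items():
                if other == team or other_result is None:
                    continue  # assumption: a null result is never beaten or tied
                if result > other_result:
                    total += 2
                elif result == other_result:
                    total += 1
        points[team] = total
    return points


example = {"red": (25,), "blue": (25,), "green": (23,), "late": None}
print(deal_matchpoints(example))
# {'red': 3, 'blue': 3, 'green': 0, 'late': 0}
```

Since ranking is done on an individual basis, the points a team earns on a deal would then be credited to each of its members.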
Game result ranking
There are two types of game result ranking; each competition will use one of them:
- Standard: results are ranked first by game score (higher is better), and second by the turns score, i.e. the number of turns taken in the game (fewer is better).
- Speedrun: results are ranked first by game score (higher is better), and second by game duration (shorter is better).
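The two ranking types can be sketched as sort keys. This is only an illustration: the `GameResult` class and its field names are hypothetical, and it assumes that fewer turns and shorter durations rank higher.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class GameResult:
    team: str
    score: int
    turns: int
    duration_seconds: float


def standard_key(r: GameResult) -> Tuple[int, int]:
    # Higher game score first, then fewer turns.
    return (-r.score, r.turns)


def speedrun_key(r: GameResult) -> Tuple[int, float]:
    # Higher game score first, then shorter game duration.
    return (-r.score, r.duration_seconds)


def rank(results: List[GameResult], key: Callable) -> List[str]:
    """Return team names from best to worst under the given key."""
    return [r.team for r in sorted(results, key=key)]


results = [
    GameResult("red", 25, 58, 1500.0),
    GameResult("blue", 25, 55, 1800.0),
    GameResult("green", 24, 50, 900.0),
]
print(rank(results, standard_key))  # ['blue', 'red', 'green']
print(rank(results, speedrun_key))  # ['red', 'blue', 'green']
```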
Competition series
A series is a collection of competitions. Each participant in a series is scored according to the sum of fractional matchpoints — the fraction of available matchpoints won in a competition — achieved by that player across eligible competitions. A series may have first-x and/or top-y scoring, meaning only a player's first x and top y competitions count toward this sum. This design aims to give players an opportunity to miss one or more competitions in a series, relieving some scheduling pressure.
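To make the series scoring concrete, here is a small sketch. It reads "first x and top y" as: take the player's first x competitions in chronological order, then sum the best y of those. That reading, and the function names, are assumptions made for this example.

```python
from typing import List, Optional


def fractional_matchpoints(won: int, available: int) -> float:
    """Fraction of the available matchpoints won in one competition."""
    return won / available


def series_score(fractions: List[float],
                 first_x: Optional[int] = None,
                 top_y: Optional[int] = None) -> float:
    """Sum a player's fractional matchpoints under first-x / top-y scoring.

    `fractions` is in chronological order, one entry per competition played.
    """
    eligible = fractions if first_x is None else fractions[:first_x]
    counted = sorted(eligible, reverse=True)
    if top_y is not None:
        counted = counted[:top_y]
    return sum(counted)


played = [0.5, 0.75, 0.25, 1.0]
print(series_score(played))                      # 2.5
print(series_score(played, first_x=3, top_y=2))  # 1.25
```

In the second call the fourth competition is ignored because it falls outside the first three, and the 0.25 result is dropped because only the top two count.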
There are also special all-time leaderboards, which consider every competition in history matching some characteristic, but use the median score across all of them, scaled up by a function sublinear in the number of played competitions (to prevent camping on a small number of high outlier performances).
Universal competition rules
- Teammates may not communicate by any means except allowed game actions.
- We explicitly forbid reading into the length of time it takes for a teammate to decide on an action.
- Each player in a game must be a distinct human (no bots or solitaire shenanigans).
Player aliases
It is common for a player to make several different accounts on hanab.live. Although we insist that you use the same account for all your competition play, we recognize that mistakes happen. If you accidentally use an alternate account, contact an administrator, providing the name of your main account, as well as the names of all alternate accounts you may have used (indicating clearly which is the main account). We will update your results accordingly.
Table creation parameters
There are two additional parameters that you may wish to set when you create a table using the generated links on the homepage. These are table password and card cycling; if you don't know what the latter is, don't worry about it. They can respectively be set by appending the following snippets to the end of the URL:
- &password=myurlencodedpassword
- &cardCycle=true
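If you would rather not URL-encode a password by hand, something like the following sketch could build the final link. The base link shown is a placeholder rather than a real generated link, and the helper name `add_table_options` is made up for this example.

```python
from urllib.parse import quote


def add_table_options(link: str, password: str = None,
                      card_cycle: bool = False) -> str:
    """Append the optional table parameters to a generated link."""
    if password is not None:
        # safe="" makes quote() percent-encode every reserved character,
        # including "&" and "/", so the password cannot break the URL.
        link += "&password=" + quote(password, safe="")
    if card_cycle:
        link += "&cardCycle=true"
    return link


# Placeholder standing in for a link copied from the homepage.
base = "https://example.invalid/generated-link?x=1"
print(add_table_options(base, password="my pass&word", card_cycle=True))
# https://example.invalid/generated-link?x=1&password=my%20pass%26word&cardCycle=true
```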
Searching across competitions
You can search across the entire set of competition games using arbitrary SQL constituting a WHERE clause. Here are the columns you can constrain:
- competition_name
- final_rank
- fractional_MP
- sum_MP
- player_name
- base_seed_name
- seed_matchpoints
- replay_URL
- site_game_id
- score
- turns
- datetime_game_started
- datetime_game_ended
- character_name
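To give a feel for what such a WHERE clause looks like, here is a self-contained sketch that tries one against an in-memory SQLite table. The table name `games`, the column types, and the sample rows are all invented for this example; only the column names come from the list above, and the real search may use a different SQL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding a few of the searchable columns.
conn.execute(
    "CREATE TABLE games ("
    "competition_name TEXT, player_name TEXT, "
    "score INTEGER, turns INTEGER, final_rank INTEGER)"
)
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?)",
    [
        ("Example Cup A", "alice", 25, 58, 1),
        ("Example Cup A", "bob", 25, 58, 1),
        ("Example Cup B", "alice", 23, 61, 4),
        ("Example Cup C", "alice", 24, 55, 2),
    ],
)

# This is the part you would type into the search box.
where_clause = "player_name = 'alice' AND score >= 24"

# Interpolating the clause directly is fine for a local demo only.
rows = conn.execute(
    "SELECT competition_name, score, turns FROM games "
    f"WHERE {where_clause} ORDER BY competition_name"
).fetchall()
print(rows)
# [('Example Cup A', 25, 58), ('Example Cup C', 24, 55)]
```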
Competition design philosophy
Competition cadence
We started with a weekly cadence for the competitions, but found it became exhausting to find time for not only playing the deals, but also practising under the ruleset and developing specialized strategies. We now run at a biweekly cadence. All of the admins are happy with this arrangement; however, based on a poll, a not insignificant number of competitors prefer a weekly cadence. This choice is not set in stone, particularly with respect to different competition series.
Scoring system
Ranking strikeouts and terminations
The official Hanabi rules, which have been published in several different forms (not only for different rulebook versions, but also for different localizations), offer surprisingly little guidance on how to score strikeouts. To summarize across all these different versions, a team that reaches three strikes "loses". In online implementations of Hanabi, such as those on Board Game Arena, keldon.net, and more recently hanab.live, this has traditionally been represented as a score of 0. The same is true of terminated games, whose result can be reasonably described in a manner similar to that for strikeouts: namely, "loss".
In my humble opinion, assigning these two game finish states a score of 0 is a rather arbitrary choice for doing what is effectively a coercion into a "finished normally" state. As such, I do not value this choice for use in competitions beyond the implementation simplicity it offers, since we're pulling game data from hanab.live.
Ranking games that are heterogeneous in finish state obviously requires some creativity. Our current policy awards the game score that was achieved just prior to the strikeout (or termination), and a turns score that includes the turn on which the third strike was earned (or on which the game was terminated). Here are several questions relating to the principle of competition that we considered when deciding on this policy; consider a competition for a variant with max score 25:
- Does a strikeout/termination at 24 game score indicate a better performance than a strikeout at 1 point?
- Does a strikeout/termination at 24 game score indicate a better performance than a regular game finish at 1 point?
- Does a strikeout/termination at 24 game score and 50 turn score indicate a better performance than a regular game finish at 24 game score and 60 turn score?
- Does a strikeout/termination at 24 game score and 50 turn score indicate an equivalent performance to a regular game finish at 24 game score and 50 turn score?
- Will it ever be advantageous for a team to intentionally strike out? To intentionally terminate the game?
Our answers, in the same order:
- Yeah, duh.
- Yes. Getting a score of 1 is so bad as to be a contrived situation.
- Yes. The team who finished normally played much less efficiently, and still managed to lose a point.
- At the very least, it's close enough that we don't have a strong opinion on which gets ranked higher. Default to the simplest choice, which is treating them on even footing.
- There is only one type of situation where it is advantageous for an individual to intentionally strike out or terminate. That situation is when it is predicted with high likelihood that a teammate will a) strike out or b) fail to score a remaining possible point. The latter is virtually impossible, since even at 0 clues, at least one remaining play can be communicated by a positional discard, regardless of the amount of context or true information on any hand. The former is not impossible, but it is highly improbable to affect the competition rankings in a meaningful way; an intentional termination could at best improve the turns score by 2, and the game score for strikeouts is typically low enough that no other games with that game score are recorded, meaning turns score doesn't even factor in.
With all that said, our chosen solution seems to satisfy the best balance of performance measurement fidelity and simplicity. There are several slightly different approaches that are also reasonable, but unless you think you have a very compelling reason why one is better than the status quo, we'd recommend that you not start a discussion on this topic.
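The policy for strikeouts and terminations can be sketched as follows. The action names and the function `competition_result` are hypothetical, and the game model is heavily simplified (every successful play is worth one point); it only illustrates which score and turn count get recorded.

```python
from typing import List, Tuple


def competition_result(actions: List[str]) -> Tuple[int, int]:
    """Return (game_score, turns_score) for a simplified action log.

    Actions are "play", "misplay", "clue", "discard" or "terminate".
    """
    score = strikes = turns = 0
    for action in actions:
        turns += 1  # the striking or terminating turn is counted too
        if action == "play":
            score += 1
        elif action == "misplay":
            strikes += 1
            if strikes == 3:
                break  # strikeout: keep the score reached just before it
        elif action == "terminate":
            break  # termination: keep the score reached just before it
    return score, turns


log = ["play", "misplay", "clue", "misplay", "play", "misplay", "play"]
print(competition_result(log))                                   # (2, 6)
print(competition_result(["play", "play", "terminate", "play"]))  # (2, 3)
```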
Accounting for null results
There's a different type of game result than those discussed in the previous section: the null result, i.e. when a competing team plays in some, but not all, of the deals (a team has to play in at least one deal to be considered as having competed in a competition). We considered several tweaks to the traditional matchpoint system to account for null results, but ultimately decided against each.
Although it is a feel-bad moment to rank last in a deal and receive no matchpoints, the same as if you had not competed at all (not to mention that every other team stole your lunch money, or rather, won matchpoints off you), awarding participatory matchpoints has the following negative consequences:
- It incentivizes bad-faith participation, i.e. creating a table with no intention of actually trying to play the game well. Teams who wouldn't have time to play through a deal could create then immediately terminate a game in order to receive some free matchpoints. If this happened to multiple teams, there would be a bizarre race-to-one-above-the-bottom, where they would try to barely out-compete each other.
- It inflates each team's fractional matchpoints, rendering it no longer a direct measurement of win rate.
Number of turns taken as a score
Our initial ideas for scoring Hanabi Competitions mostly involved some combination of cards left in deck, clues left, and final round turns taken. Treating clues and deck size differently leads to some weird tensions in lategame decision-making, as we learned from competitions held by a certain other Hanabi group that shall remain nameless. Then, we realized that cards in deck and final rounds are essentially the same concept, which can be captured by the concept of virtual cards in deck, e.g. if two final round turns have been taken, there are -2 virtual cards left in deck. Finally, we realized that summing clues and virtual cards in deck was nearly equivalent to the number of turns taken, with two exceptions; in the turn score approach:
- the first two strikes don't necessarily cost the team anything;
- clues returned to the team by playing terminal cards (i.e. 5s, usually) don't add to the score.