I’ve tended to obsess over numbers and statistics since early childhood. That kid who kept track of how many goals each friend scored in PE soccer games and then made stat cards for all the kids? Yeah, that was me. So it was fitting, even though it was only good ol’ dumb luck, that at eight years old I stepped into baseball, only the most stats-happy and data-driven sport in the world.
Even nowadays, baseball, much like football, is viewed as an obscure sport in the Czech Republic: mostly a boring spectacle that’s maybe good for Americans but doesn’t make sense to most Europeans. But back in the 90s? Forget about it: it was by and large considered a sport for the non-athletic kids, which is amusing because serious baseball players are damn studs. Most spectators didn’t know what they were watching. For that matter, some of my teammates were in that category as well. Just having bodies to fill the roster was enough to get some playing time back then. But I was ready even before I stepped on the field to feel the grass for the first time; “Jakub learned baseball by playing a video game” was a running joke on our team. Oh yeah, I played the hell out of Hardball II, and I knew many minor rules and details of the game, like what an infield fly is or which players are responsible for which cutoffs after an outfield hit.
That served me well, so it was only natural that I used the same tactic much later, around 2004, when I was in serious need of practice as a rookie scorer for my local team, a job that eventually led me to score several international tournaments. I practiced by spending my evenings scoring the games I played in Triple Play 2001. What a nerd! Jumping from being a player to being a scorer was a logical step, as I had noticed that I looked forward less to the actual games and more to the nights after, when I went through the scorebooks and updated the Excel spreadsheets filled with every player’s numbers. In retrospect, being in the actual competition was simply a means to an end, like a junkie who needs his fix.
I practiced scoring laboriously by working through MLB games, if I was lucky enough to catch them on TV, or, more often, the games I played in Triple Play 2001. The picture above shows the user-controlled Atlanta Braves beating the snot out of the AI-controlled Toronto Blue Jays, 23-9.
I put this little enthusiasm of mine to use for a while when my friends and I ran Doom (yeah, that Doom) tournaments. There I fully realized that it doesn’t matter what the competition is: a sport, a game, or a mixture of both (I’m looking at you, poker), players love their stats. But broadly speaking, I pretty much went on hiatus after I ditched baseball in 2008, both in terms of growing as a stat-head and as a person overall. I guess that’s what adolescence does to some people.
A few years later, my oldest brother told me about a book, The Signal and the Noise by Nate Silver, which became my guide. The book is focused on data: what kind of data we have at hand, how we should use it, and how we actually use it. It was immediately apparent to me that this guy knows his stuff. Silver made a name for himself by predicting baseball players’ performance and election results. The first chapter I read was about the undersized, balding second baseman Dustin Pedroia. Most traditional scouts evaluated him rather poorly. He was drafted in the second round by the Red Sox and became an absolute All-Star, winning MVP in his second year as a starter. I was hooked instantly. At the risk of making this sound too dramatic, I’m gonna go ahead and claim this book is number two on the list of reasons why I turned my life around; number one is meeting my wife. Silver’s book not only rebooted my knack for quantifying things, but it also made me appreciate reading. In the three years between 2016 and 2018, I read 205 books.
We built e-sport elements into competitions for the archaic game of Doom. This was well before e-sports were even a thing. Stats mattered so much that I received two complaints via email within ten minutes of taking parts of the site offline so that I could make small updates to our point system formula. True story.
The year is 2014, and I’m deeply in love with the NFL and slightly frustrated with the state of ignorance regarding data in it. In a way, it was understandable. Football is one of the fiercest ways two humans can compete without killing each other. Even if the traditional football guys didn’t see what these Harvard and Yale graduates did to baseball, the message to a stereotypical nerd hunched over his laptop was loud and clear: “You don’t belong here, buddy!” The irony is that FootballGuys™ were rejecting ideas that ultimately made offenses play much more aggressively. Either way, the contrast between the information available to football teams and to baseball teams was enormous: MLB had high-level cameras in every stadium, and they let analysts know the pitch speed, along with horizontal and vertical movement; when the bat made contact, they could see the angle and speed with which the ball came off the bat, all of this in real time. And yet, NFL teams were still judged by meaningless volume statistics. Enter Football Outsiders and Warren Sharp and his blog. These guys had the smarts and published dozens of articles each year. All one needed to get smarter about the sport was right there for the taking.
As I was preparing for the 2015 NFL season, which was my first (spoiler alert: it was my last, too) as a paid betting advisor on the Czech betting site KolemDvou, I put together a model and called it Anthony, after the football tout played by Matthew McConaughey in the movie Two For The Money. Initially, the idea was that Anthony would help me pick games to bet on, but I quickly realized that this wouldn’t fly, as I had neither enough time nor enough data to do sufficient backtests. Still, it served me beautifully as a way to comfortably filter stat sheets for each weekly match-up, saving me countless hours of research. On top of that, I probably reached the inevitable conclusion that I honest-to-god suck at betting a year or two earlier than I would have without the model. Even though my short-lived career as a betting advisor crashed and burned spectacularly, I decided to stick with the model, as well as with its silly name, and after I licked my wounds and counted my losses, it became my standard for rating teams.
The idea of building a machine that would pick games to bet on never left me. Instead of trying my luck with football again, in 2016 I turned my attention to ice hockey. And it was once again Nate Silver who inspired me. His site FiveThirtyEight was using Elo ratings to compare teams in multiple sports. It seemed fascinating that merely providing game results, without any additional stats or details, could help estimate how the teams stacked up. I spent the whole summer building a model with an Elo rating of my own. Putting together results from close to twenty hockey leagues since 2000 was reasonably quick, but the manual collection of a few years’ worth of historical betting odds took a while. I looked at everything from the glamorous NHL to the lowest of the low: Ekstraliga w hokeju na lodzie, which is indeed a Polish league. As expected, I quickly learned there was no way I could ever win in the NHL, the KHL, or the Swedish league with my simplistic model. Oddsmakers were just too accurate at this level to get outsmarted by my simple Excel sheet. In less popular European leagues, however, it was a different story. These markets weren’t even close to being efficient. Oddsmakers try to compensate for their lack of knowledge by forcing you to pay more juice if you want to bet on these unpopular leagues, but my backtest suggested there was still room to make money. And so for the whole 2016 season, I bet real money on my model’s picks for a modest profit: a +3.1% yield. A human bettor can make maybe 300 bets a season. But a model doesn’t get tired. It never wakes up with a crippling hangover, or feels like skipping work and just watching Futurama instead. And so in 2016, when the model advised me to place precisely 867 bets, I did exactly that. With so many plays, the +3.1% yield transformed into a +28% return on investment, which I considered a success.
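For readers who haven’t met Elo before, here is a minimal sketch of the textbook version of the idea in Python. It is not my actual spreadsheet model: the K-factor of 20, the 1500 starting rating, and the function names are assumptions picked for illustration, and the 400-point scale is simply the conventional one. The last two helpers show how a small yield (profit divided by total money staked) can become a much bigger return on investment (profit divided by starting bankroll) once the number of bets grows. The flat 1-unit stake and the 96-unit bankroll in the example are hypothetical values chosen so the arithmetic lines up with the numbers above; they are not my real stakes.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of team A against team B (standard Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float,
                   k: float = 20.0) -> tuple[float, float]:
    """Return both teams' new ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k (assumed value) controls how fast ratings react to new results.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


def yield_pct(profit: float, total_staked: float) -> float:
    """Yield: profit as a percentage of all money staked."""
    return 100.0 * profit / total_staked


def roi_pct(profit: float, bankroll: float) -> float:
    """Return on investment: profit as a percentage of the starting bankroll."""
    return 100.0 * profit / bankroll


if __name__ == "__main__":
    # Two evenly rated teams; the home side wins.
    home, away = update_ratings(1500.0, 1500.0, score_a=1.0)
    print(home, away)

    # Hypothetical flat staking: 867 bets of 1 unit at a 3.1% yield.
    bets, stake = 867, 1.0
    profit = 0.031 * bets * stake
    print(yield_pct(profit, bets * stake))
    print(roi_pct(profit, bankroll=96.0))  # 96 units is an assumed bankroll
```

The point of the second half is that the same money gets staked over and over: a thin edge per bet compounds into a meaningful return only because the model never gets tired of placing bets.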
In 2017, my model and I weren’t going nearly as strong. I pulled the plug on it after 194 plays with a weak +0.6% yield. I didn’t mind the grind. Strangely, I took an almost perverse pleasure in betting on these inferior leagues. However, my involvement with the Armchair Analysis project grew into a serious second job, and my priorities shifted pretty fast.
When I found Armchair in 2016, my future boss, Dennis Erny, was just putting together a team of charters. The idea was that we would collectively go through every play of every week to mark various details: how many yards the receiver ran after he caught the ball; whether the QB was under pressure; whether the receiver dropped the ball. This would provide the layer of information I felt was sorely missing from Anthony. Without thinking twice, I put together a motivational letter of sorts, added a bunch of my old articles, and sent it over via the web form I mistook for an email form. Oh, the horror when I realized my de facto CV had ended up in public comments instead. Awkward! We had a good laugh over that (Dennis surely more than me), and I worked the 2016 season for four hours a week as a volunteer, charting the games of the Carolina Panthers and the Baltimore Ravens.
I loved the work and remained on the team. Before the 2017 season, my excitement went through the roof. I’m sure at times I was seriously flirting with the fine line between being enthusiastic and being downright annoying. But at the end of the day, that was for the best. After all, while Armchair had a long history of selling data, the charting project was very new, and it takes a while to build a dedicated team. To make things more interesting, several people quit the project in the first few weeks. We made it work. That resulted in my promotion to senior analyst. As much as I loved Polish and British hockey games, I had to set priorities, so ditching the hockey betting was in order. It was the right decision, too: for the 2018 season, I received a contract for a senior position that required me to watch every pass play of every team. A year later, I walked into the 2019 season cocky as hell with an official position as “Head of game charting,” which added reviewing rushing plays on top of what I had already been doing the year before. The offseason in 2020 was huge because, as the world was turning into chaos amidst the pandemic, we sensed a chance to hit it big. For the first time ever, we didn’t take a break, and we kept the ball rolling. It paid off: on the back of the NFL’s COVID season, we put together the biggest downloadable data collection on the web, for about a third of the price our competition asked. We even attracted investors and eventual buyers of the project.