The game was released about a month ago (February 27th, 2023), and it’s sold over 5000 copies (I estimate ~30-35% of sales will get into my bank account after discounts, VAT, Steam cut, and income tax). This would be a flop for an indie studio with multiple people, but I’m doing this part-time, so it is a success for me. It looks like it’ll pay for the salary I’ve foregone by only working part-time, which is not bad for my first commercial game.
In this post, I’ll document what I did since the last update:
Finished most of the game by the beginning of January (a bit later than planned)
Tried to get players for the beta but didn’t really manage to get too many
Emailed streamers by the end of January, and then the magic happened. Some big streamers like Retromation, Aliensrock, Olexa, and Sifd picked up the game and wishlists shot up. All the feedback I didn’t get through the official beta, I got via an increase in players due to streamer exposure, but the deadline was very tight at that point since I wanted to release by February 27th. I’m still not sure whether I should’ve pushed the release a couple of weeks further
At the beginning of February, I was contacted by illustrator Isaac Murgadella, who offered to do artwork for the game. I was hesitant because I feared the game might lose its identity so close to the release date. However, I also feared that the art I had at the moment resembled Magnus Carlsen too much and that he may not like that (I’d emailed his team about it and never got a response). I ended up deciding to go with it, and I’m super happy I did as the game looks a lot better
Participated in Steam’s Next Fest and wishlists kept increasing
I wrote to j4nw (developer of Pawnbarian), who inspired me to get into this journey, and we decided to bundle together for my launch. He’s been super nice and I’m very happy to work with him
Ported the game to Linux (very easy). I also tried to port it to Mac, but it wasn’t so easy and I decided to wait until after the release
With one week left until release, I felt that there were too few relics, so I sprinted and added ~20 new ones, doubling the amount
Finally released the game on February 27th, and found out that I’d left some bugs in it, so I spent a week patching all bugs that appeared
My Discord server grew to ~250 people and I kept getting feedback and requests for more content
Plans for the future
Given the game’s moderate success, I’ve decided to expand a bit on it with some of the top requests I’ve gotten from my discord, as well as port it to more platforms and localize it to other languages. Here’s an estimated timeline of how this could happen:
Infinity mode: you can play endlessly after beating the game. I released the final version of it on April 7th
Practice mode: set up a board and play against friends or the AI. Expected by mid-April
Content update, with new units and maybe new relics and items. Expected by end of April, maybe later if lots of playtesting is required
Mac port. I’m not 100% on the complexity of this, but hopefully, I can have it done by mid-May
Localization for German, Spanish, French, Chinese, Russian, and Catalan. Here I need to work on a system that makes sure that all the text in the game is read from a file, so I can easily change languages. After that, I’ll have to send the text to translators. I expect this to be done by the end of May, but I’m aware it could take longer
Mobile port as a free demo + in-app purchase for the full game. It will require a UI redesign and making sure the controls work on mobile. I’ve never done this before, but I hope to have it done by the end of June.
Keep in mind that I’m working on this solo and any real-life issues may delay this plan. But if there are no emergencies I think it’s doable.
Since these posts aren’t getting much traction, I haven’t made any in what feels like an eternity. Since I last posted, I have made lots of improvements to the game:
Improved the map both visually and its inner workings, now offering more paths
Added some juice, by improving some of the game’s visual effects, adding a slight screen shake, and blood stains on the board
Added a lot more pieces, relics, and items
Prepared the infrastructure for the full game (3 chapters + final boss)
You can check out all these improvements on the Steam demo and on itch.
Additionally, I’ve participated in a couple of festivals, getting to ~250 wishlists. Assuming I double the number of wishlists by launch, that the wishlist conversion rate is 20%, and that the price is 10€, the game will make ~1,000€ and I’ll probably get half of that amount after the Steam cut and taxes. This figure is way below my initial expectations, but it’s only my first commercial game and I’ll still be happy. However, I’ll still do what I can to get it to more people.
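For anyone who wants to plug in their own numbers, here is the same back-of-the-envelope estimate as a small Python sketch. The function and parameter names are made up for illustration, and the 50% net share is just my rough guess from the paragraph above.

```python
def estimate_launch_revenue(wishlists, growth_factor, conversion_rate, price_eur, net_share):
    """Rough launch estimate: wishlists at launch -> copies sold -> gross and net revenue."""
    launch_wishlists = wishlists * growth_factor
    copies_sold = launch_wishlists * conversion_rate
    gross = copies_sold * price_eur
    net = gross * net_share  # what is left after the store cut and taxes (assumed share)
    return gross, net


# Numbers from this post: 250 wishlists, doubled by launch, 20% conversion, 10 EUR price
gross, net = estimate_launch_revenue(250, 2, 0.20, 10.0, 0.5)
print(f"gross: {gross:.0f} EUR, net: {net:.0f} EUR")
```

Changing the conversion rate or the growth factor shows quickly how sensitive the result is to those two guesses.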
My plan is to release it in February. Before launch, the plan goes as follows:
Finish the game by the end of December. This includes a new type of map location involving sacrifices in exchange for more powerful units and relics, completing the game lore (I have a draft in my head but it needs some ironing), and populating the final stages (90% of the content is made, I just need to tell the game where to show it)
Run a beta to gather some feedback. I’ll get players for the beta from alphabetagamer and some specific subreddits. My plan is to credit and give Steam keys to everyone who contributes to improving the game
Contact streamers (hopefully by mid-January) to start generating some interest in the game
Participate in February’s Steam Next Fest
Release a week after Next Fest
So that’s that. Hopefully, the plan works out and I can be here in a couple of months talking about a successful launch 😉
Hi everyone, during the last few weeks I’ve been on vacation, which has allowed me to spend a lot of time adding more content to the game. Since the previous update, I’ve added:
Many new pieces: portal mage, immortal, cardinal, pawn, and fool
The item system, including gold rewards for winning battles and a shop
A difficulty system to make sure everyone can enjoy the game
Quality of life improvements to the initial army and map system, making sure you’re not shown too many new units at the same time and you always have relevant options on the map
Some polish to the sounds and new tracks by my brother Licus
With all of this, I am very happy with the demo in terms of gameplay. But there are still many visual improvements that I’d like to add, mainly animations to improve the game’s juice.
Next steps
Since Steam’s Next Fest is at the beginning of October and I already have a working demo, I’ll focus on marketing during the following weeks. I intend to test many different things and see what sticks to try to build some momentum before the festival. Make sure to wishlist the game on Steam if you haven’t yet. I’ll surely do a minor update when the cover art is ready, and if I have some time to spare I’ll add some extra animations.
After the festival, I’ll go back to developing the game, working on a dynamic monologue system and on content for the other 2 stages.
Thanks for reading and as always, subscribe for more updates.
I’m happy to announce that during the past few weeks I’ve added the following changes to the game:
Introduction of relics, some of which affect combat and others affect army upgrades. You will always start with an Alarm Bell, that tells you when your king is in danger (which should make the game more accessible). Other relics will be available by visiting treasure nodes on the map
Added rocks that block unit movement during combat
Added some animations and sound effects to improve the game’s juice
Nerfed the Berzerker, as it was too strong. Now it moves either 3 squares vertically or horizontally, or 2 squares diagonally. It can no longer kill the enemy king on its own in most situations
I’ve created a Steam page for the game, with placeholder art. If you’re reading this, go wishlist it now, thanks! I don’t really expect it to get noticed too much right now, but I needed it up to enroll in festivals. On that note, I got rejected from Tacticon and I’ve enrolled in October’s Steam Next Fest. Starting to enroll in festivals so early may be a bit reckless, but I expect development speed to pick up soon as I’ll be on vacation during August and in September my daughter will start kindergarten.
I’ve also started talks with an artist to commission art for the game.
My following steps will be adding the item system along with a rewind mechanic that should make the game a lot more accessible.
Thanks for reading and as always, subscribe for more updates.
During the last few months, I’ve been working on a chess roguelike game: The Ouroboros King. I’m trying to do with chess what Slay the Spire did with card games.
I’ve finally released the first version of The Ouroboros King. You can play it on itch. All constructive feedback is welcome.
This first version contains the following elements:
A procedurally generated map a la Slay the Spire
An army management system, so you can change your piece formation
A combat system, that is basically chess with some variations (doesn’t tell you when you’re in check, kings can be captured, new pieces are available)
An event system, where you can upgrade your army and recruit new pieces after winning a combat
However, it’s still lacking many elements that I want to incorporate into the game:
An item system, with consumable combat bonuses
A relic system, with permanent bonuses
A dialog system and lore descriptions to tell the game’s story (similarly to how the Souls series or Hollow Knight tell their stories)
Many extra alternative chess pieces, to add more variety
Battlefield modifications, such as rocks that block movement
2 extra stages with boss fights at the end (this release includes only the 1st stage)
Endgame difficulty options and unlockables to extend the game’s life
Background music
Many of them will make it to the free demo, and the rest will be available in the final version that I plan to release on Steam.
In May 2022 the Naga tribe was introduced to HS Battlegrounds. From the start, the tribe was completely OP, with decent early-game units and crazy late-game scaling. Since then they’ve been nerfed twice, lowering both the initial stats and scaling potential of some minions. In this post I’ll help you build a Naga board optimized for scaling, using the tools of numerical analysis.
The growth engine
This scaling is thanks to growth engines that interact with spells and the new spellcraft mechanic. There are many Naga that scale when you play spells, but not all of them are equally effective. Here are the scaling Nagas ordered by decreasing order of effectiveness:
Tidemistress Athissa is not as OP as it used to be, but still very strong. If you get 5 procs (a quite conservative amount, 4 Spellcrafts on board and cycling 2 extra spells), that is +18/+18 on your board, more than a golden Lightfang with 4 tribes or a Charly and a Pumba. Note that Athissa procs on all spells, including coins, blood gems and discovers from triples. We’ll compare the other minions to Athissa.
Critter Wrangler, half the scaling of Athissa on Spellcrafts, none on other spells. All in all, this will be ~40% as effective as Athissa, depending on whether Quilboar are on the lobby and the number of triples you get.
Eventide Brute (after you cast a spell, gain +1/+1). ~33% of Athissa’s scaling and it gets all the buffs, making it more vulnerable to poison/Leeroy.
Lava Lurker (the 1st Spellcraft spell cast on this each turn is permanent). The best spell you can use on it is Shoal Commander’s one, which gives it +7/+7 assuming you have 7 Nagas. If you optimize your setup for the Lurker and get 1 golden Lurker and 2 golden Commanders, you could get +28/+28 scaling per turn, which is still below the conservative estimate for Athissa. All in all, Lava Lurker can help you in the mid-game, but it falls short as a scaling engine.
Corrupted Myrmidon (Start of combat: double this minion’s stats). It doesn’t grow on its own but utilizes buffs better than other minions. Assuming you get all Athissa procs on it, you’ll get an extra ~25% plus you can double the stats from gems. If you have Critter Wrangler instead, you’ll double its efficiency on spells from hand. Another bonus is that it gives you a lot of tempo if you already have some Spellcrafts to buff it. As with Eventide Brute, concentrating buffs on this will make you susceptible to poison and Leeroy.
The clear winner by a wide margin is Athissa. In its absence, you can try to survive with a combination of Wranglers, Brutes, Corrupted Myrmidons and Lava Lurker.
Spellcraft minions
There are 7 Spellcraft minions, 6 of which are Naga and the other one gives you Nagas. Let’s analyze them:
Orgozoa, the Tender is not a Naga, but procs Athissa and also gives you more Nagas to round out your composition or proc Athissa again. Once you have 4 Naga on the board, this gives you the best scaling since it can discover more spells for extra procs.
Glowscale is great for combat, giving you the ability to DS your biggest minion.
Other Spellcraft minions. They offer a moderate amount of stats and taunt/windfury. They can be useful in helping you survive while you get your growth engine, but won’t help you scale as much as Orgozoa and their buffs aren’t as significant as DS in the late game. The best of them in terms of stats is Shoal Commander. However, even if you get a golden Commander, it will give +14/+14 in combat stats, which can be easily outclassed by one or two turns of scaling with Athissa. The only case when it’s relevant and even necessary is when you include Lava Lurker in your composition.
The ideal composition
Once we know the pieces of the puzzle, it’s time to think about the best way to assemble it. How many Spellcraft minions should we get? Is Lava Lurker worth it?
To analyze the composition, I’ve simulated the number of +1/+1 buffs we get for many different board combinations. These simulations make the following assumptions:
We have 6 “stable” minions that you are growing and 1 flex slot that you use to rotate spells
3 played spells per turn from the shop (Spellcraft, coins, gems, discovers)
80% of the spells are Spellcraft, and 20% are other types
We have a maximum of 1 Corrupted Myrmidon (or a golden one), which gets an equivalent of an extra 80% of the Critter Wrangler procs (you may put DS on other minions or use the discover from Orgozoa) and 20% of the Athissa procs
We have a maximum of 1 Lava Lurker (or a golden one) and it gets +7/+7 each turn (+14/+14 if golden), equivalent to having 1 Shoal Commander (2 if golden) and 7 Naga on board
With this in mind, we can calculate the number of procs as follows:
The best composition gets an equivalent of 104 +1/+1 procs per turn and consists of 2 golden Athissa, 2 golden Critter Wranglers, 1 golden Corrupted Myrmidon and 1 Spellcraft minion.
The best composition without golden Athissa gets an equivalent of 79 +1/+1 procs and consists of 3 golden Wranglers, 1 golden Corrupted Myrmidon and 2 Spellcraft minions.
The best composition without any golden minions gets an equivalent of 46 +1/+1 procs and consists of 2 Athissa, 1 Critter Wrangler, 1 Corrupted Myrmidon and 2 Spellcraft minions.
I’ve measured the importance of each minion by calculating its average number of appearances in the top 10 compositions for each scenario. All copies are golden unless forbidden by the scenario:
| Minion | All compositions | No golden Athissa | No golden minions |
| --- | --- | --- | --- |
| Athissa | 2.1 | 0.8 | 2 |
| Corrupted Myrmidon | 1 | 1 | 0.5 |
| Critter Wrangler | 1.4 | 2.8 | 0.8 |
| Lava Lurker | 0.1 | 0.3 | 0.5 |
| Eventide Brute | 0 | 0.1 | 0.1 |
| Spellcraft Minions | 1.4 | 1 | 2.1 |
| Avg. procs per turn | 97 | 74 | 44 |
I’ve made this spreadsheet calculator to calculate the number of procs you’d get based on your composition. It’s read-only so it remains the same, but you can copy it to another spreadsheet and use it if you want.
The flex slot
As suggested above, the flex slot is used to rotate minions that give you spells (Spellcraft, Seashell Collector, Quilboar). However, at the end of the turn, you should be playing a minion on that slot.
If you feel like the combat will be easy, you can try to get an extra spell for the next round by playing a Spellcraft minion or a Quilboar that gets gems on combat. If you play a Spellcraft minion, you should do so after playing all your spells so it doesn’t “steal” any procs.
If you’re pressured, try to get a Leeroy, Mantid Queen, Ghastcoiler or Selfless Hero to strengthen your board.
Getting there
This article just covers the ideal composition in a vacuum, but in a BG game, you need to survive while you build your comp. In some cases, it will be impossible to build full scaling and you’ll keep your early Lurker or Brute on the board. That’s completely fine.
Conclusion
I’ve done the math on scaling for Naga comps. Here are the main takeaways:
Get as many copies of Athissa as you can
Critter Wrangler is a great minion to complement Athissa
A Corrupted Myrmidon (especially golden), is a great receptor of Athissa and Wrangler buffs
Lava Lurker (if you have Shoal Commander) and Eventide Brute are also viable
Get between 1 and 3 Spellcraft minions on the board, Orgozoa and Glowscale are the best
Round out your comp with another Spellcraft for a bit more scaling or another useful unit if under pressure
I’ve always wanted to learn how to make video games, but I just had never gotten to it.
This August I decided to change that. I had a two-week vacation in a remote and quiet place and used my spare time to build my first game. The final result is a short and unpolished game, but I have the satisfaction of having finalised the project and made almost all assets from scratch.
I had no previous experience in video game design, but have coded for some time and I used to draw a lot as a kid. In preparation for the project, I did a 2d platformer Unity tutorial in the afternoons during the week prior to my vacation. I also thought of a whole game concept of a roguelike where you’re an evil weapon (inspired by Nightblood from the Stormlight Archive) that is trying to escape its confinement by tricking a human into wielding it… but that ended up being waaaay too much and I had to cut the scope multiple times.
Once I had an idea, I started planning out the main parts I needed for the project:
Level outline and player control
Enemies, attack and death animations
Aesthetic level design
Enemy sprites
Player sprites
Sound
Menu and victory screen
Level outline and player control
For the level outline, I wanted something short since it was my first project. Also the cultist theme made me think of ancient rituals and a Stonehenge aesthetic, including stone monuments. I ended up building a level that consisted of three main parts:
A couple of platforms to get the player started on the jump mechanics
A plateau with space for a couple of enemies that could be engaged individually
A final area where you had to fight many enemies at the same time
As for the movement, I just used the same control scheme that I’d learned from the platformer tutorial, and improved the jump a bit by learning from other tutorials. I also added gamepad compatibility by following this tutorial. Anecdotally, I missed a small step in the middle of that tutorial and ended up wasting more than an hour trying to figure out what was wrong…
This is the result after the first iteration:
Enemies, attack and death animations
Since I wanted to do attack animations I needed some sprites that could do that, not just a bean. So, let me introduce you to …
Bean with a sword
I made it and animated it using Photoshop’s basic tools. And since I didn’t want to waste much time I used it for the player and tweaked its size and color for the enemies.
Once I had my beans in place, I started coding the player’s attack controls and the health system. For the player’s attack, I followed this Blackthornprod tutorial. For the health system, I used what I had learned from the 2d platformer tutorial.
After that, I started coding the enemy AI. I started with a very easy approach that ended up doing the job, with no need for extra complexity. This is the AI’s behavior:
If you’ve recently been hit wait for a bit, else
If the player is in range wait a bit and then attack, else
If the player is in sight follow him, else
If you’re in front of an obstacle turn around, else
Walk in the direction you’re facing
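The priority chain above can be sketched as a single function that checks each condition in order. The game itself was made in Unity, so this Python version is only an illustration; the `EnemyState` fields and the action names are hypothetical and not taken from the actual project.

```python
from dataclasses import dataclass


@dataclass
class EnemyState:
    # Hypothetical flags that the game loop would update every frame
    recently_hit: bool = False
    player_in_range: bool = False
    player_in_sight: bool = False
    obstacle_ahead: bool = False


def choose_action(state: EnemyState) -> str:
    """Return the first action whose condition holds, in priority order."""
    if state.recently_hit:
        return "wait"
    if state.player_in_range:
        return "wait_then_attack"
    if state.player_in_sight:
        return "follow_player"
    if state.obstacle_ahead:
        return "turn_around"
    return "walk_forward"


print(choose_action(EnemyState(player_in_sight=True, obstacle_ahead=True)))
```

Because the checks are ordered, an enemy that sees the player keeps chasing even when there is an obstacle in front of it, which matches the list above.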
I had also initially planned for “mage” type enemies that shoot fire balls at you, but realised that I didn’t have time to implement that (coding + drawing animations). So I just cut that out of the project.
Something that I wanted to focus on during this project was learning to design beautiful levels such as the ones in Hollow Knight. I searched a bit and found these two awesome tutorials from a small YouTube channel.
After tinkering a bit with some elements, the final level setup was:
Some fog (from the above tutorials)
Black squares to cover “blank” regions
2 or 3 layers of grass parallax in the front (Photoshop brushes)
The player layer
Rocks and walls (copied from Google Images)
3 layers of half-assed (time restrictions…) mountain parallax in the back
A blue sky with stars and a moon that’s too high up to be seen at any moment
I know it lacks polish, but I wanted to finish the project during my vacation so I had to move on.
Here’s the third iteration:
Enemy sprites
My first enemy: the soldier (sword added with PS)
I don’t really like pixel art and I don’t have a drawing tablet, so I decided to draw the characters, take pictures and use Photoshop to digitize them. The traditional way to do this is with the pen tool, but I quickly realized that this would take too much time so I ended up using the magic wand and some filters to make the lines more even. I think the result looks nice enough while being quite fast. If there’s interest I may write a guide detailing my method.
The first step of animating is having a clear character model and the second one is defining which animations you need. The animations that I needed for the enemy character were:
Walk/run
Attack
Die
Die and attack animations were kind of easy, but running was harder. For the run animation I took inspiration from a Shovel Knight gif. I used an online tool to break it down into frames and basically copied the leg positions from the frames.
Finally, I added a particle system to simulate blood splashes when the player or an enemy is hit (here’s a couple of tutorials).
Player sprites
I initially planned on having a hooded guy wielding an evil scythe as the player character
The fact that I used the same placeholder sprites for the player character and the enemies and that I was low on time, led me to do the same for the final sprites. I just added jumping and idle animations and called it a day.
Sound
The sounds I needed were:
Jump
Slash
Character hurt
Character dies
Background music (I shamelessly copied the song of the prayer from FF X)
Click
Victory song
I just recorded all of those with my phone in less than half an hour (all mouth noises). Afterwards, I did some light editing with Audacity and followed a couple of tutorials to get them into the game.
Menu and victory screen
For the menu, I just followed this tutorial and added the player sprites. I also built victory and defeat screens using the same principles.
Conclusion
Here’s the final result:
And that’s it. I learned a lot from this project and had fun doing it. The result is nowhere near what professional video games look like, but it does look better than I anticipated.
This project helped me get a better understanding of what making a full game entails, even if at a small scope. It’s an exercise I’d recommend to all aspiring game developers before getting into bigger projects.
Next, I’ll try to make a project with more focus on playability and less focus on assets. Let’s see how it goes.
Simulation is a very potent tool that is often lacking in many data scientists’ toolkits. In this article, I will teach you how to use simulation in combination with other analytical tools.
I will be sharing some educational and professional examples of simulation with Python code. If you are a data scientist (or on the road to becoming one), you’ll love the possibilities that simulation opens for you.
What is simulation?
Simulating is digitally running a series of events and recording their outcomes. Simulations help us when we have a good understanding of how individual events work, but not of how the aggregate works.
In physics, simulations are often used when we have a hard-to-solve differential equation. We know the starting state, and we know the rules for infinitesimal (very small) changes, but we don’t have a closed formula for longer timespans. Simulation allows us to project that initial state into the future, step by step.
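As a minimal sketch of that step-by-step projection, here is Euler integration applied to exponential decay, dy/dt = -rate · y. This equation does have a closed formula, and I picked it on purpose so the simulated value can be compared with the exact one. The function name and parameter values are my own choices for illustration.

```python
import math


def euler_decay(y0, rate, t_end, dt):
    """Project y forward in time with small steps, using dy/dt = -rate * y."""
    steps = round(t_end / dt)
    y = y0
    for _ in range(steps):
        y += -rate * y * dt  # apply the rule for a very small change
    return y


approx = euler_decay(y0=1.0, rate=0.5, t_end=2.0, dt=0.001)
exact = math.exp(-0.5 * 2.0)
print(approx, exact)
```

Smaller values of `dt` bring the simulated value closer to the exact one, at the cost of more steps.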
In data science, we usually work with probabilistic events. Sometimes we can easily aggregate them analytically. Other times there is no analytical solution, or it’s very hard to reach it. We can estimate the probabilities and expected results of complex chains of events, by running multiple simulations and aggregating the results. This can be very useful to understand the risks we are exposed to.
Simulation is also used in hard artificial intelligence. When interacting with others, simulation can allow us to anticipate their behavior and plan accordingly. For example, DeepMind’s AlphaGo uses simulations to calculate some moves into the future and make a better assessment of the best moves in its current position.
To run a simulation we will need a model of the underlying events. This model will tell us what can happen at any given point, the probabilities of each outcome and how we should evaluate the results.
The better our model, the better the accuracy of the simulation. However, simulations with imperfect models can still be helpful and give us a ballpark estimate.
Simulation is a subject where examples work better than theory, so let’s jump into some use cases.
Example 1. Estimate the value of pi by using simulation
This task can be done in many ways. One of the easiest is as follows:
Draw a square of side 2 and with its center at the origin of coordinates of a 2d plane
Draw the inscribed circle of that square (radius 1 and its center at the origin of coordinates)
Sample random points from the square (two uniform distributions from -1 to 1)
Whenever you draw a point, check whether it is inside the circle or not
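The proportion of points inside the circle approximates the ratio between the area of the circle (π) and the area of the square (4), so:
$$\pi \approx 4 \cdot \frac{N_{\text{inside}}}{N_{\text{total}}}$$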
Similar methods can be used to estimate the value of integrals via simulation.
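The steps of Example 1 can be sketched in a few lines of NumPy. This is a minimal version written for illustration (the function name, seed and sample size are my own choices): it samples points in the square, counts how many fall inside the circle, and multiplies the proportion by 4.

```python
import numpy as np


def estimate_pi(num_points, seed=1234):
    """Estimate pi from the share of random points that land inside the unit circle."""
    rng = np.random.default_rng(seed)
    # Sample points uniformly from the square [-1, 1] x [-1, 1]
    x = rng.uniform(-1, 1, num_points)
    y = rng.uniform(-1, 1, num_points)
    # A point is inside the circle when its distance to the origin is at most 1
    inside = x**2 + y**2 <= 1
    # circle area / square area = pi / 4
    return 4 * inside.mean()


print(estimate_pi(1_000_000))
```

The error shrinks roughly with the square root of the number of points, so each extra digit of precision costs about 100 times more samples.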
Example 2. Solve a difficult probability problem
Solve this problem by P. Winkler:
One hundred people line up to board an airplane. Each has a boarding pass with an assigned seat. However, the first person to board has lost his boarding pass and takes a random seat. After that, each person takes the assigned seat if it is unoccupied, and one of the unoccupied seats at random otherwise. What is the probability that the last person to board gets to sit in his assigned seat?
The problem can be solved using logic and probabilities, but it can also be solved by simply programming the described behavior and running some simulations:
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1234)

def simulate_boarding(num_passengers):
    """Return 1 if the last passenger gets their assigned seat, else 0."""
    free_seats = set(range(num_passengers))
    for i in range(num_passengers):
        if i == num_passengers - 1:
            # Only one seat is left: check whether it is the last passenger's
            return 1 if i in free_seats else 0
        if i == 0 or i not in free_seats:
            # First passenger, or assigned seat already taken: pick a free seat at random
            seat = list(free_seats)[np.random.randint(0, len(free_seats))]
        else:
            seat = i
        free_seats.remove(seat)

num_sims = 10000
num_passengers = 100
is_same_seat = np.array([simulate_boarding(num_passengers) for _ in range(num_sims)])
print(is_same_seat.mean())

# Running estimate of the probability as more simulations are added
plt.figure(figsize=[8, 5])
one_to_n = np.arange(1, num_sims + 1)
plt.plot(one_to_n, is_same_seat.cumsum() / one_to_n)
plt.show()
Probability simulation convergence
You can find more probability problems to practice here.
Example 3. Simulating game outcomes
How many games would it take Magnus Carlsen (Elo of 2847 as of 18-07-2021) to get back to his current rating if he was dropped at 1000?
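First, given two players’ Elo ratings, the probability of player1 beating player2 is:
$$P(\text{player1 wins}) = \frac{1}{1 + 10^{(\text{Elo}_2 - \text{Elo}_1)/400}}$$
Second, after the game, player1’s Elo rating is updated as follows:
$$\text{Elo}_1' = \text{Elo}_1 + K \cdot (\text{result} - P(\text{player1 wins}))$$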
Where:
result is 1 for a win, 0.5 for a tie and 0 for a loss
K (also known as K-factor) is the maximum possible adjustment per game and varies depending on the player’s age, games played and Elo
Now that we have a model, we just have to initialize Magnus’s current Elo to 1000 and code a while loop that:
Has Magnus play a game against a player of his current Elo
Calculates the probability of winning using the real Elo and simulates the outcome of the game
Updates Magnus’s current Elo according to the result
Stops the loop if Magnus has reached his real Elo
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1234)

def get_prob(elo1, elo2):
    """Probability that a player rated elo1 beats a player rated elo2."""
    return 1 / (1 + 10 ** ((elo2 - elo1) / 400))

def update_elo(elo, prob, result, k):
    """New rating after a game, given the expected score and the actual result."""
    return elo + k * (result - prob)

def play_until_top(real_elo, initial_elo):
    current_elo = initial_elo
    num_games = 0
    k = 40
    elo_list = [initial_elo]
    while current_elo < real_elo:
        # The K-factor drops with the number of games played and with a high rating
        if num_games > 30:
            k = 20
        if current_elo > 2400:
            k = 10
        # The outcome is decided by the real strength, not by the current rating
        prob_win = get_prob(real_elo, current_elo)
        result = 1 if np.random.rand() < prob_win else 0
        # The opponent has the same current Elo, so the expected score is 0.5
        current_elo = update_elo(current_elo, 0.5, result, k)
        elo_list.append(current_elo)
        num_games += 1
    return elo_list

num_sims = 1000
# elo_list includes the initial rating, so games played = len - 1
num_games = np.array([len(play_until_top(2847, 1000)) - 1 for _ in range(num_sims)])
print(num_games.mean())

plt.figure(figsize=[8, 5])
plt.hist(num_games, bins=50)

elo_history = np.array(play_until_top(2847, 1000))
plt.figure(figsize=[8, 5])
plt.plot(np.arange(len(elo_history)), elo_history)
plt.show()
Example Elo trajectory
Games to real Elo distribution
Another cool example would be to simulate the NBA playoffs. For a first approach, you can assume that each team has a probability of winning proportional to the games they won during the regular season (GW) so that in any game the probability of team 1 winning is GW1 / (GW1 + GW2). You can also analyze how probabilities change if you change the series from Best of 7 to Best of 5 or Best of 9.
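Here is a minimal sketch of that first approach for a single series. The win totals (60 and 45) are made-up numbers rather than real NBA records, and the function names are my own. A full playoff simulation would chain several of these series in a bracket.

```python
import numpy as np


def simulate_series(p_team1, best_of, rng):
    """Play one series; return True if team 1 reaches the required wins first."""
    wins_needed = best_of // 2 + 1
    wins1 = wins2 = 0
    while wins1 < wins_needed and wins2 < wins_needed:
        if rng.random() < p_team1:
            wins1 += 1
        else:
            wins2 += 1
    return wins1 == wins_needed


def series_win_probability(gw1, gw2, best_of, num_sims=20000, seed=1234):
    """Estimate team 1's chance of winning the series from regular-season wins."""
    rng = np.random.default_rng(seed)
    p_team1 = gw1 / (gw1 + gw2)  # single-game probability, as described above
    wins = sum(simulate_series(p_team1, best_of, rng) for _ in range(num_sims))
    return wins / num_sims


# Hypothetical regular-season records: 60 wins vs 45 wins
for best_of in (5, 7, 9):
    print(best_of, series_win_probability(60, 45, best_of))
```

Longer series favor the stronger team, since a single lucky game matters less when more wins are needed.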
Example 4. Business application, estimating value at risk
Collectors LTD is a debt collection company focused on enterprise debt. It buys portfolios of business loans that have defaulted at some point and tries to collect the payments for those loans. Some of the companies will be bankrupt and won’t be able to pay, and others are likely to go bankrupt in the future. The key to Collectors LTD’s business is in estimating the value it can get back from a portfolio. For this reason, Collectors LTD has developed a model that predicts the probability of a company repaying part of that debt. Among those companies that repay some of the debt, the amount paid is distributed uniformly from 0% to 100%. Collectors LTD can use its model in combination with simulation to evaluate the expected return of the portfolio, and how volatile that return is.
Since I can’t share the real data with you, I’ve created a synthetic dataset that mimics the relevant properties:
Keep in mind that this solution assumes the probabilities of collection are independent of one another. This isn’t true for systemic risks such as a global economic downturn.
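As an illustration of the approach, here is a minimal sketch on a made-up portfolio. Everything about the data is an assumption for this example: the number of loans, the lognormal debt sizes, and the beta-distributed repayment probabilities standing in for the model’s predictions. It is not the dataset described above. Each simulation decides independently which companies repay and, for those that do, draws the repaid fraction uniformly from 0% to 100%.

```python
import numpy as np

rng = np.random.default_rng(1234)

# Hypothetical synthetic portfolio (all values are assumptions)
num_loans = 200
debt = rng.lognormal(mean=10, sigma=1, size=num_loans)  # outstanding debt per loan
p_repay = rng.beta(2, 5, size=num_loans)                # stand-in for model predictions


def simulate_collections(debt, p_repay, num_sims, rng):
    """Return the total amount collected in each simulated scenario."""
    shape = (num_sims, len(debt))
    repays = rng.random(shape) < p_repay   # which companies repay anything
    fraction = rng.uniform(0, 1, shape)    # share of the debt repaid if they do
    return (repays * fraction * debt).sum(axis=1)


totals = simulate_collections(debt, p_repay, 5000, rng)
expected_return = totals.mean()
pessimistic = np.percentile(totals, 5)  # only 5% of scenarios collect less than this

print(f"expected collection: {expected_return:,.0f}")
print(f"5th percentile:      {pessimistic:,.0f}")
```

The gap between the expected collection and the 5th percentile gives a rough idea of how much could be at risk in a bad scenario, under the independence assumption mentioned above.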
Conclusion
I hope you’ve liked these examples and that you can find applications of simulation in your day-to-day data science job. If you’ve enjoyed the article, please subscribe and share it with your friends.
When you start learning, it’s very hard to have a clear direction. You often waste time on uninteresting, useless, or outdated topics. You wander and run in circles.
However, once you’ve mastered the topic, it’s easy to look back and see the fastest path from noob to pro. If only you could go back in time and give yourself the roadmap… Even if I cannot do that for myself, I can do it for others. This is the objective of this article: to give you the tips I wish I’d known when I started learning data science and machine learning.
1. Get solid mathematics, probabilities, and statistics foundations
Mathematics and statistics are at the core of machine learning. So it will be very difficult to understand machine learning algorithms if you don’t know the building blocks.
However, this doesn’t mean you need to be a math wizard. You should understand math and stats concepts such as vectors, matrices, derivatives, probability distributions, independent variables, and standard deviation. More advanced mathematics (like learning to prove theorems) won’t help you much when studying machine learning, even though it can be a lot of fun.
2. Learn either Python or R and learn them well
When doing data science and machine learning, you will spend most of your time coding in R/Python. So it’s important to learn the ins and outs of your language of choice.
Data scientists spend a lot of time cleaning and manipulating data, so you should give special attention to data manipulation libraries. The most popular ones are Pandas for Python and data.table and dplyr for R.
3. Learn good programming practices
Writing clean and efficient code will make it easier to share your work with others. And even if you work alone, it will make it easier for you to debug and maintain your own code. Entire books have been written about this, so I’ll just give you a short list:
Use consistent and descriptive names for variables, columns, and functions
Don’t repeat code, use functions or classes if you need to do the same process multiple times
Understandable code is better than compact code: 10 lines everybody understands vs 2 lines nobody understands
Don’t overoptimize your code at the start, but know where the bottlenecks (parts that won’t work well if you increase the volume of data) are in case you need it to scale
Use consistent indentation and try to limit line length
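To make a couple of these points concrete, here is a small sketch with made-up data and a hypothetical normalize function. Instead of copying the same scaling logic for each column, the logic lives in one descriptively named function that is reused.

```python
def normalize(values):
    """Scale a list of numbers to the 0-1 range."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are equal, so there is no range to scale over
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]

# Made-up example data with descriptive names
house_prices = [120_000, 250_000, 180_000]
house_sizes_m2 = [60, 140, 95]

# The same process applied twice without repeating the code
normalized_prices = normalize(house_prices)
normalized_sizes = normalize(house_sizes_m2)
print(normalized_prices)
print(normalized_sizes)
```

If the scaling rule ever changes, there is only one place to edit and one place to debug.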
4. You don’t need to learn all the different supervised learning models
This is one I struggled with. When I started learning I thought that every situation would need a different type of model and that I needed to learn them all to be well equipped. But this is far from true. Linear/logistic regression is surprisingly effective for tabular data problems. And XGBoost or random forest will help you if you have a lot of non-linearities. Artificial neural nets are great for image and NLP problems but are otherwise overkill and more difficult to set up.
Additionally, you don’t have to keep up with all the published papers. Most staple techniques in the industry are decades old. If you ever face a truly unique problem, that may be a good moment to dive into the literature.
5. Once you know the basics and understand them well, it’s mostly about doing projects
After completing one or two ML courses, don’t spend your time on more theory, dive straight into doing some projects. If you’re lacking some knowledge, you can pick it up on the way.
Working on projects puts your knowledge into practice and helps you figure out whether you really understood everything. Additionally, by doing projects you build valuable experience that will help you get hired later on.
6. Doing tutorials and reviewing other people’s projects is very helpful at the start
When you’re learning a new tool or model and don’t feel confident about using it on your own, looking at an example is a great way to get some inspiration.
Additionally, some useful online resources are paid. I have personally tried to distill my years of experience as a data scientist into Data Projects, a product to learn data science by doing real-world projects. I hope it can help others as much as it would’ve helped me.
8. Explaining your work to others is a great way to consolidate your knowledge
It’s also a great way to work on your communication. You can do this by telling your friends, blogging, or making YouTube videos. This will be a crucial skill when working with others.
9. Don’t despair if you don’t get it right
Nobody gets it right the first time. Trial and error is the way to go, especially in fields like this, where there is no single exact solution.
10. Lean on online communities
The internet is full of helpful and generous people. If you’re struggling with something, search for it, and if you don’t find the answer, ask in the forums (Reddit or Stack Overflow).
11. Learn more about your problem domain
Don’t focus only on the purely technical, try to understand what is really behind the problems you’re modeling. It will help you decide which is the best error metric for the problem, select the most insightful variables, and communicate to non-technical stakeholders using their own language.
12. Work with messy data
Don’t just stick to problems with pre-cleaned data. The world is messy, and having some experience cleaning and structuring data will prepare you for future challenges.
13. Work on what makes you curious, that will keep you motivated
Following your curiosity and your passions will make sure you don’t abandon your path to becoming a data scientist halfway through. Additionally, it makes the whole learning experience a lot more fun!
When I learned data science I didn’t know where to start, so I wasted many hours learning only tangentially useful stuff. Now, after more than five years as a data science consultant, I know what I would’ve done differently. In this article, I will offer you a roadmap on how to self-learn data science with links to useful resources.
Data science pre-requisites
Even though I believe everyone can learn data science, those with a technical background will have a head start. Before getting into DS-specific subjects, it is useful to have some notions of mathematics, statistics, and probability.
It is not necessary to be an expert in any of those, but you need a solid foundation. If you’ve never studied any of those, don’t worry, I’m here to help. In the following paragraphs, I’ll briefly describe each prerequisite and link to educational resources.
Mathematics for data science
To get started with data science you need to get familiar with some of mathematics’ most common objects. These Khan Academy lessons about vectors, matrices and functions are a good place to start. Also, here’s the summary (in more formal mathematical language) of a Stanford course. These concepts are the building blocks of most machine learning algorithms and provide you with a framework for structuring data. Getting to this level of mathematics will allow you to understand and use the algorithms that others have invented and implemented and get results.
If you really like mathematics, you can dive deeper by taking full calculus and linear algebra courses. This will require a lot more work but will unlock a more complete understanding of the inner workings of machine learning algorithms and how to implement and adjust them.
Probability and statistics
Probability lies at the core of the data scientist’s view of the world. When dealing with big numbers and random events, probability and statistics provide the tools to make sense of them. It isn’t only about the exact methods or formulas, but also about developing a probabilistic intuition. These courses from Khan Academy on probability and statistics are both beginner-friendly and contain all the information you’ll need. Here is a mathematically formal summary of a probability course from Stanford.
In addition to formal education in probability and statistics, reading non-fiction books can also help to develop an intuition. I recommend the following books, in no particular order: Thinking, Fast and Slow; Factfulness; Thinking in Bets; and Fooled by Randomness (or any of Nassim Taleb’s books).
Finally, reading about statistical paradoxes will help you make sense of data when you face unintuitive conclusions.
Data-oriented programming language
A big part of a data scientist’s job is reading, manipulating and running analysis on data. This is usually done by coding in a data-oriented language. These languages allow us to write instructions for a computer to execute. Even though there are many different programming languages, most of them use very similar structures. The two most popular data-oriented programming languages are Python and R, and you can start with either one. If at some later point you work with people using the other one, you can use that as an opportunity to learn it.
If you’ve never coded before, don’t worry. Both of them can be a good first point of contact with programming. A lot has been written about which one is better, but the truth is they have different strengths.
R’s strong points are:
It is designed for data and statistical work, so manipulating data is easier
There is a vast universe of statistics libraries
The Shiny library makes it very easy to make a web app with no previous web design experience
RStudio is a wonderful IDE (I haven’t found one that I like as much for Python)
Python’s strong points are:
It’s a general-purpose programming language as well as one of the most popular languages overall
It usually runs faster than R
It has better packages for deep learning
I personally prefer R because of its more compact syntax in the data.table package and also because I have more experience with it.
Learning R
If you are new to programming, I recommend you start with one of these resources:
If you have been coding for a while, you can get the basics with learn R in Y minutes.
Once you know the basics, it’s time to learn one of the two main data manipulation libraries: data.table (my personal favorite) or dplyr. Another useful library is ggplot2 for making beautiful graphics.
Learning Python
If Python is your first programming language, you can start with any of these:
If you’re already familiar with coding you can just read this documentation.
And once you’ve mastered Python’s basics, you can move on to the specialized tools for manipulating data: Pandas and NumPy. Here’s a tutorial and here’s a video to help you learn those packages.
Learn machine learning
Now we get to the exciting part.
There are many different techniques and tools in machine learning. One of them has been my most used analytical tool during my years as a data science consultant. And that technique is supervised learning, in both of its forms: classification and regression.
Supervised learning, also known as predictive modeling, is about learning from examples in which we know in advance the correct answer. In regression the answer is a numerical value, and in classification it is categorical.
Predictive models can be used to make demand forecasts, identify risky creditors and estimate the market price of a house among many other uses.
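To illustrate the difference between the two forms, here is a toy sketch on made-up data. It uses two deliberately simple techniques that I picked for brevity (a least-squares line for regression and a nearest-neighbor rule for classification), not the models taught in any particular course. The function names and numbers are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

def nearest_neighbor_label(x, xs, labels):
    """Classify x with the label of the closest known example."""
    closest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return labels[closest]

# Regression: the known answers are numbers (house size in m2 -> price)
sizes = [50, 80, 100, 120]
prices = [100_000, 160_000, 200_000, 240_000]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept

# Classification: the known answers are categories (debt ratio -> risk label)
debt_ratios = [0.1, 0.2, 0.7, 0.9]
labels = ["safe", "safe", "risky", "risky"]
predicted_label = nearest_neighbor_label(0.6, debt_ratios, labels)

print(predicted_price, predicted_label)
```

In both cases the model learns from examples with known answers and is then asked about a new case. The only difference is the type of the answer.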
Here are some courses that will teach you the main framework to approach predictive modeling problems, as well as some supervised learning models:
In my experience, 3 families of models can help you solve most supervised learning problems you’ll ever encounter:
Linear and logistic models (explained in the above courses) are easy to understand, easy to interpret, fast to train and reasonably accurate
XGBoost (an implementation of gradient-boosted trees) is a top-of-the-class model in terms of accuracy, speed and ease of use. However, it’s not as easy to interpret as linear models. Here’s an introduction to decision trees (pre-requisite) and a couple of articles about how XGBoost works
Neural networks are great for natural language processing and image models. However, I’d leave them to more advanced data scientists since they’re more difficult to set up
SQL is the most used database language, and most companies use one of its variants for their databases. Even Amazon’s Athena and Google’s BigQuery can be queried using SQL syntax.
So if you’re planning on getting a job in data science I recommend you learn SQL since it will be a requirement for most employers. If you’re doing personal projects it’s up to you. For small-scale projects, you can just save your data in text files. For bigger projects, SQL skills may come in handy.
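If you want a quick taste of SQL without installing a database, here is a small sketch using the sqlite3 module that ships with Python. The sales table and its values are made up for illustration.

```python
import sqlite3

# Throwaway in-memory database, nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 50.0)],
)

# Total sales per region, largest first
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
print(rows)
conn.close()
```

The same SELECT ... GROUP BY pattern carries over to most SQL variants, although each one has its own extensions.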
What’s next?
Once you’ve learned the basics about R/Python and supervised learning, it’s time to practice. Do a project with open data or participate in a Kaggle competition. Or get a job as a data scientist and learn while getting paid. Practice is what will help you hone your skills and generate proof of your knowledge.