How to learn data analysis
At least what worked for me.
One of the biggest blind spots of my professional life has been that I did not really have any data analysis experience beyond what Excel can do. I was aware of tools like Jupyter notebooks that seemed like the logical upgrade from Excel. I just never had the time or a good reason to learn anything new.
Recently I stumbled across marimo. I have also recorded my collection of "Magic: The Gathering" cards with ManaBox (about 33'000 cards). As ManaBox lets me export my collection as a CSV file, I took another stab at learning data analysis.
So I created a small collection of notebooks centered around my MtG collection. They mainly use the ManaBox export as well as data from the Scryfall API.

A quick overview of the notebooks:
- mtg.py
- This notebook is designed as a module that the other notebooks import. It loads the collection from the CSV file into a dataframe and predefines some filters (e.g. noTokens).
- ingest.py
- This notebook helps me sort new cards into the collection. It searches for a suitable preexisting binder/box. For example, the common cards are currently spread over six boxes. If I add a new common to the collection, it shows me which of the boxes already hold other commons from the same set. The same goes for foil cards and the other rarities.
- Cards that do not have an existing place are sorted by the release date of their set. This helps me find a suitable place, as all the boxes/binders are sorted by set release date.
- precon.py
- This notebook tells me if any preconstructed decks are "hiding" in the collection. It uses the deck lists from MTGJSON.
- I mainly created it to see if all the cards of my very first deck are still around. Turns out, 4 are missing.
- priceFilter.py
- I created this to figure out which cards to double sleeve within the collection. It loads the pricing data from a bulk card download from Scryfall and then filters by price.
- reorganise.py
- This is still in the making. As some boxes are getting very full, this gives me information on how to reorganise the collection (e.g. sets that have enough cards to justify their own box).
- deckStats.py
- This notebook is probably the most "mathy". It prints various statistics about a selected deck. It simulates starting hands and calculates draw probabilities for basic lands. It is very much a work in progress, and I am still adding more features. This is also the notebook in which I tried out different visualisation frameworks.
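
To give an idea of the kind of math behind starting hands and land draws, here is a minimal sketch in plain Python. It is not the actual deckStats.py code. The function names and the example numbers (a 60-card deck with 24 lands and a 7-card starting hand) are assumptions made up for illustration. The exact probability uses the hypergeometric distribution (drawing without replacement), and the simulation simply shuffles and deals many hands to check the formula.

```python
import random
from math import comb


def prob_lands_in_hand(deck_size: int, lands: int, hand_size: int, k: int) -> float:
    """Exact chance of exactly k lands in a hand drawn without replacement
    (hypergeometric distribution)."""
    ways_lands = comb(lands, k)                              # choose k of the lands
    ways_other = comb(deck_size - lands, hand_size - k)      # fill the rest with non-lands
    return ways_lands * ways_other / comb(deck_size, hand_size)


def simulate_hands(deck_size: int, lands: int, hand_size: int,
                   trials: int, seed: int = 0) -> list[float]:
    """Estimate the same distribution by dealing random hands.
    Returns a list where index k holds the share of hands with k lands."""
    rng = random.Random(seed)                                # fixed seed for repeatable runs
    deck = [True] * lands + [False] * (deck_size - lands)    # True marks a land
    counts = [0] * (hand_size + 1)
    for _ in range(trials):
        hand = rng.sample(deck, hand_size)                   # draw without replacement
        counts[sum(hand)] += 1
    return [c / trials for c in counts]


# Hypothetical example deck: 60 cards, 24 lands, 7-card starting hand.
DECK_SIZE, LANDS, HAND_SIZE = 60, 24, 7
exact = [prob_lands_in_hand(DECK_SIZE, LANDS, HAND_SIZE, k) for k in range(HAND_SIZE + 1)]
simulated = simulate_hands(DECK_SIZE, LANDS, HAND_SIZE, trials=20_000)

for k, (e, s) in enumerate(zip(exact, simulated)):
    print(f"{k} lands: exact {e:.3f}, simulated {s:.3f}")
```

The two columns should land close to each other. The exact formula is cheaper, while the simulation is easier to extend to questions that have no neat formula, such as mulligan rules.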
I like that marimo runs directly in the browser and has some nice features for data exploration. I will continue to use it for problems around my MtG collection, and also the next time I would otherwise have reached for Excel.