WebP Rules

This is a PNG.

This is a WebP.

One of these images is a PNG, and one is a WebP. The PNG is 12KB, and the WebP is 5KB. Do you see a difference between the two?

Me, neither.

I’ve been a steadfast PNG supporter since time immemorial. Cringing at any sight of JPEG compression. Way back in the day when forums were still big, and users spent inordinate amounts of time creating cool-looking signature images, I was big into PNGs. I prided myself on every aspect of my little 300×100 pixel PNG box. But mostly the crisp lines and vibrant colors.

When creating images for a site, I’d typically use (old-school) Photoshop’s ‘optimize for web’ feature for PNGs. This did a pretty decent job at compressing my PNGs to reasonable sizes.

But, nothing ever gave me such a boost as a WebP. Even for large images full of color, I can still drop a picture from 1.2MB to 680KB, while maintaining the same visible quality! That’s insane!
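If you want to try it yourself, Google’s libwebp tools make the conversion a one-liner. Here’s the basic invocation; the quality value is just an assumption you’d tune per image:

cwebp -q 80 input.png -o output.webp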

One of these days when I have more time and energy I’m going to read some more specs on how this works, but until then, I’m just going to go on living like it’s pure magic and my days will be a little brighter for it.

Mang JS

A long time ago I built a tool I called mang: Markov Name Generator. It went through a number of iterations, from a .Net class library, to a WPF app, to a Vue app, even to part of a gamedev toolkit for a long-abandoned roguelike. Most recently, it’s been a desktop app built with Avalonia so I could use it on Linux. Over the years it’s become my “default” project when learning a new language or framework.

Well, now I have one of those new-fangled MacBook Pros. Anyone familiar with macOS development knows how much of a hassle it is to build anything for it, even if you just want to run locally, and I do not want to pull over all my code and an IDE just to run the app in debug mode.

What I did, instead, was port the whole thing over to a single static webpage.

I put off doing this for a long time because I’m not a huge fan of JavaScript. Also just as important to note is my tendency to over-complicate, and I kept finding myself wanting to build a full API and web app, which is just completely unnecessary for a tool like this. But I do use mang quite a bit, and there’s nothing so complicated that it can’t be done in vanilla JS. So I bit the bullet and ported the name generation to start.

It’s more of a transliteration than a “true” port or rewrite. The code is almost exactly the same, line by line and function by function, as it is in C#. But the end result is pretty compact: the CSS, which is just a very-slightly-modified fork of simple.css, is nearly as large as the entire index.html file. While there is plenty to whine about when it comes to JavaScript, it is nice to have everything in such a plain and accessible format.

The entire tool is unminified and all of the assets are free to browse.

And all in all, this whole thing went much smoother than I expected for less than an hour of work.

What Changed

As part of this process I removed some of the name types that can be generated. Most of the types available in mang are historical or fictional, and it felt odd to have some name types with contemporary sources. As such, all of the East Asian, Pacific Island, and Middle Eastern sources have been removed.

What’s Coming

I have not ported the admittedly barebones character generation stuff yet. I have some better plans for that and will be spending some time fleshing that feature out.

The character generation so far has been “trait” flags, randomly selecting between things like “carefree” and “worried”, or picking a random MBTI type. It’s generally enough for a rough sketch of an on-the-fly NPC or something, but could use some more work to be truly useful as a starting point for anyone requiring more detail.


Helion, a Theme

A Brief Rant About Color

I have a lot of opinions on colors and color schemes. For example, if you are implementing a dark mode on your app / site / utility and decide to go with white text on a black background (or close-to-white on close-to-black), you should be charged with a criminal offense.

High contrast is a good thing! Don’t get me wrong! But that type of dark mode is offensive to my eyes and if I find myself using something with that color scheme I will switch to light mode, if possible, or simply leave and never come back. It’s physically painful for me, leaving haloes in my vision and causing pretty harsh discomfort if I try to read for more than a few seconds. And though this may be contentious, I find it a mark of laziness: you’re telling me you couldn’t be bothered to put in even a little effort in finding a better way to do dark mode?

So it may come as no surprise that I am a long-time Solarized user. From my IDEs, to my Scrivener theme, to my Firefox theme, to anything I can theme — if it’s got a Solarized option, I’m using it. For a long time, even this blog used some Solarized colors. (Dracula is a close second in popularity for me.)

Helion: A VS Code Theme

I’ve long experimented with my own color schemes. It’s a perfect channel for procrastination, and a procrastinator’s work is never done. Today, I think I’ve settled on a good custom dark theme, one I want to release and iterate on as I continue to use and tweak it.

Helion, inspired by Solarized and colors that faded away, is my contribution to the dark theme world. It’s not perfect — few things are — but my hope is that it becomes a worthy entry to this world.

Right now it is only a Visual Studio Code theme. As I continue to use and tweak it, I plan to standardize the template into something I can export and use for other tools.

Here is a screenshot:

A screenshot of the Helion theme in use in Visual Studio Code, viewing two JSON files side by side.

Just comparing some json files

Now, I am not a usability expert. The colors here are in no way based on any scientific study and I do not assert that they are empirically perfect or better than any other dark mode theme. This is simply a theme which I’ve customized for my own tastes, according to colors and contrasts that are appealing to my own eyes.

That said, any feedback is greatly appreciated. If anyone ever does choose to use the theme, I would be delighted to hear from you, whether it’s good or bad (or anywhere in between).

Enjoy!

Shuffling

A while back I went through my iTunes library and took the songs/albums/artists not already living in their own playlist and put them into a mega-playlist I called all of it.1 This way I could just hit “shuffle my library” when in my car and emulate what I used to do with my iPod. I don’t want to fuss with any UI, or play some algorithmically-generated playlist full of suggested music I might like. I just want to listen to my library, on shuffle.

But I’ve been feeling lately like shuffling just… isn’t good. Maybe the same artist plays twice in a row, or within just a couple minutes of my last listen. Or each shuffle front-loads the same small selection of songs, so I don’t really get to explore the depths of my library. It’s not a large library (10,000 songs), but it’s not small either, so why am I hearing the same old stuff? This is the 21st century; is it really so hard to shuffle music?

The short answer is: no, it’s not. We can pretty much shuffle any list of things in an almost truly random fashion. But there are plenty of reasons shuffling doesn’t seem random. Humans are inherently pattern-seeking animals2, and in a truly random sequence of songs, a given artist is no more or less likely to play next based on the previous artist. So you could have two artists play in a row—or if it’s truly random, the same song!

Another problem is that once software gets Good Enough™ it doesn’t usually get touched again until there are actual problems with it—or there is a strong monetary incentive to do so. So a developer tasked with writing a shuffle feature might do what’s Good Enough™ to close the ticket according to the requirements and test plan3, then move on to the next ticket.

So what does it really take to do a good job shuffling a playlist? I wanted to do a little experimenting so I thought I would start from the ground up and walk through a few different methods. First, I need…

The Music

I took my playlist and ran it through TuneMyMusic to get a CSV. Then I wrote a little bit of code to parse that CSV into a list of Song objects, which would be my test subject for all the shuffling to come.
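For illustration, the parsing was roughly this shape. This is only a sketch: the Song record and the column order (Title, Artist, Album) are assumptions, and a real parser would also need to handle quoted fields containing commas.

public record Song(string Title, string ArtistName, string AlbumName);

// requires System.IO and System.Linq
private static List<Song> LoadSongs(string csvPath)
{
    var songs = new List<Song>();

    // skip the header row, then split each line into its columns
    foreach (var line in File.ReadLines(csvPath).Skip(1))
    {
        var columns = line.Split(',');
        if (columns.Length < 3)
        {
            continue; // ignore malformed rows
        }

        songs.Add(new Song(columns[0].Trim(), columns[1].Trim(), columns[2].Trim()));
    }

    return songs;
}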

True Random

First I wanted to see how poorly, or how well, a “true” random shuffle worked.4 This is easy enough.

We’ll just do a simple loop. For the length of the song list, grab a random song from the list and return it:

for (var i = 0; i < songList.Count; i++)
{
    var index = RandomNumber.Next(0, songList.Count - 1);
    yield return songList[index];
}

And right away I can see that the results are not good enough. Lots of song clustering, even repeats: one song right after the other!

[72]: Goldfinger - Superman 
[73]: Metallica - Of Wolf And Man (Live with the SFSO) 
[74]: Orville Peck - Bronco 
[75]: Def Leppard - Hysteria 
[76]: Kacey Musgraves - Follow Your Arrow 
[77]: Kacey Musgraves - Follow Your Arrow 
[78]: The Thermals - My Heart Went Cold 
[79]: Miriam Vaga - Letting Go

A Better Random

After looking at the results from the first random shuffle, two requirements have become clear:

  1. The input list must have its items rearranged in a random sequence.
  2. Each item from the input list can only appear once in the output list.

Kind of like shuffling a deck of cards. I’d be surprised to see a Deuce of Spades twice after shuffling a deck. I’d also be surprised to see the same Kacey Musgraves song in my queue twice after hitting the shuffle button.

Luckily this is a problem that has been handily solved for a long time. In fact, it’s probably the “default” shuffling algorithm for most of the big streaming players. It’s called the Fisher-Yates shuffle and can be accomplished in just a couple lines of code.

for (var i = shuffledItemList.Length - 1; i >= 0; i--)
{
    var j = RandomNumber.Next(0, i);
    (shuffledItemList[i], shuffledItemList[j]) = (shuffledItemList[j], shuffledItemList[i]);
}

Starting from the end of the list, you swap an element with a random other element in that list. Here I’m using a tuple to do that “in place” without the use of a temporary variable.

The results are much better, and at first glance almost perfect! But scanning down the list of the first 100 items, I do see one problem:

[87]: CeeLo Green - It's OK 
[88]: Metallica - Hero Of The Day (Live with the SFSO) 
[89]: Metallica - Enter Sandman (Remastered) 
[90]: Guns N' Roses - Yesterdays

It’s not the same song, but it is the same artist, and in a large playlist with lots of variety, I don’t really like this.

Radio Rules

Now I know I don’t want the same song to repeat, or the same artist, either. While we’re at it, let’s say no repeating albums, too. So that’s two new requirements:

  1. No more than x different songs from the same album in a row.5
  2. No more than y different songs from the same artist in a row.

This is pretty similar to the DMCA restrictions set forth for livestreaming / internet radio. It will also help guarantee a better spread of music in the shuffled output.

Here’s some code to do just that:

var shuffledItemList = ItemList.ToArray();
var lastPickedItems = new Queue<T>();
for (var i = shuffledItemList.Length - 1; i >= 0; i--)
{
    var j = RandomNumber.Next(0, i);

    var retryCount = 0;
    while (!IsValidPick(shuffledItemList[j], lastPickedItems) &&
           retryCount < MaxRetryCount)
    {
        retryCount++;
        j = RandomNumber.Next(0, i);
    }

    if (retryCount >= MaxRetryCount)
    {
        // short-circuiting; we maxed out our attempts
        // so increment the counter and move on with life
        invalidCount++;

        if (invalidCount >= MaxInvalidCount)
        {
            checkValidPick = false;
        }
    }
    
    // a winner has been chosen!
    // trim the queue so it doesn't get too long
    while (lastPickedItems.Count >= Math.Max(ConsecutiveAlbumMatchCount, ConsecutiveArtistMatchCount))
    {
        _ = lastPickedItems.TryDequeue(out _);
    }
    
    // then enqueue our choice
    lastPickedItems.Enqueue(shuffledItemList[j]);
    
    (shuffledItemList[i], shuffledItemList[j]) = (shuffledItemList[j], shuffledItemList[i]);
}
return shuffledItemList;

This, at its core, is the same shuffling algorithm, with a few extra steps.

First, we introduce a Queue, which is a First-In-First-Out collection, to hold the x most recently chosen songs.

Then when it’s time to choose a song, we look in our queue to determine if any of the recent songs match our criteria. If they do, then we skip this song and choose another random song. We only attempt this so many times: while the chance is low, we could still get stuck in a loop, so there’s a short-circuit built in that will tell the loop it’s done enough work and it’s time to move on.

In addition to that, there’s a flag with a wider scope: if we’ve short-circuited too frequently, then the function that checks for duplicates will stop checking.

This is an extra “just in case”, because if I hand over a playlist that’s just a single album or artist, I don’t want to do this check every single time I pick a new song. At some point it becomes clear that it isn’t that kind of playlist.

Once a song has been chosen, the lastPickedItems Queue gets its oldest item dequeued and thrown to the wind6, and the newest item is enqueued.
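For completeness, here’s one way IsValidPick might look. This is a sketch, not necessarily the exact implementation: it’s a stricter variant that rejects any repeat of artist or album within the recent window, and it assumes the items expose ArtistName and AlbumName and that checkValidPick is the class-level flag described above.

private bool IsValidPick(T candidate, IEnumerable<T> lastPickedItems)
{
    // once we've short-circuited too often, give up on checking entirely
    if (!checkValidPick)
    {
        return true;
    }

    // reject the candidate if any recent pick shares its artist or album
    return !lastPickedItems.Any(item =>
        item.ArtistName == candidate.ArtistName ||
        item.AlbumName == candidate.AlbumName);
}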

How does this do? Pretty well, I think.

[89]: Metallica - One (Remastered) 
[90]: Stone Temple Pilots - Plush 
[91]: Megadeth - Shadow of Deth 
[92]: System of a Down - Sad Statue 
[93]: Jewel - Don't 
[94]: Def Leppard - Love Bites 
[95]: Elton John - 16th Century Man (From "The Road To El Dorado" Soundtrack) 
[96]: Daft Punk - Aerodynamic 
[97]: Kacey Musgraves - High Horse 
[98]: Above & Beyond - You Got To Go 
[99]: Gnarls Barkley - Just a Thought

But not all playlists are a wide distribution of artists and genres. Sometimes you have a playlist that is, for example, a collection of 80s rock that’s just a bunch of Best Of compilations thrown together. How does this algorithm fare against a collection like that?

Answer: not well.

[0]: CAKE - Meanwhile, Rick James... 
[1]: CAKE - Tougher Than It Is (Album Version) 
[2]: Breaking Benjamin - I Will Not Bow 
[3]: Enigma - Gravity Of Love 
[4]: CAKE - Thrills
[5]: Clutch - Our Lady of Electric Light

It immediately short-circuits, and we see lots of clustering. Maybe not a deal-breaker for a smaller, more focused playlist, but I can’t help but feel there’s a better way to handle this.

Merge Shuffle

Going back to the deck of cards analogy: there are only 4 “albums” in a deck, but shuffling still produces good enough results for countless gamblers and gamers. So why not try the same approach here?

We want to split our list into n smaller lists, then merge them back together with a bit of randomness. Like cutting and riffling a deck of cards.

First, we’ll do the easy part: split up the list into a list of lists – like a bunch of hands of cards.

private List<List<T>> SplitList(IEnumerable<T> itemList, int splitCount)
{
    var items = itemList.ToArray();
    return items.Chunk(items.Length / splitCount).Select(songs => songs.ToList()).ToList();
}

Then, we pass this list to a function that will do the real work of shuffling and merging.

private IEnumerable<T> MergeLists(List<List<T>> lists)
{
    var enumerable = lists.ToList();

    // Chunk leaves the short remainder chunk last, so First() is the largest and Last() the smallest
    var totalCount = enumerable.First().Count;
    var minCount = enumerable.Last().Count;

    // pad the short chunk with dummy songs so every chunk is the same length
    var difference = totalCount - minCount;
    var lastList = enumerable.Last();

    lastList.AddRange(Enumerable.Repeat((T)dummySong, difference));
    
    // set result
    var resultList = new List<T>();
    var slice = new Song[enumerable.Count];

    // walk the chunks one slice (row) at a time
    for (var i = 0; i < totalCount; i++)
    {
        for (var l = 0; l <= enumerable.Count - 1; l++)
        {
            slice[l] = enumerable[l][i];
        }

        for (var j = slice.Length - 1; j >= 0; j--)
        {
            var x = RandomNumber.Next(0, j);

            (slice[x], slice[j]) = (slice[j], slice[x]);
        }

        for (var j = 1; j <= slice.Length - 1; j++)
        {
            if (slice[j - 1] == dummySong)
            {
                continue;
            }
            
            if (slice[j].ArtistName == slice[j - 1].ArtistName)
            {
                (slice[j - 1], slice[slice.Length - 1]) = (slice[slice.Length - 1], slice[j - 1]);
            }
        }

        if (i > 0)
        {
            var retryCount = 0;
            while (!IsValidPick(slice[0], resultList.TakeLast(1)) &&
                   retryCount < MaxRetryCount)
            {
                (slice[0], slice[enumerable.Count - 1]) = (slice[enumerable.Count - 1], slice[0]);
                retryCount++;
            }
        }
        
        resultList.AddRange((IEnumerable<T>)slice.Where(s => s != dummySong).ToList());
    }
    
    return resultList;
}

This is kind of a big boy. Let’s go through it step by step.

First, we copy our input list into a local variable. I am allergic to side effects, so I want any changes (destructive or otherwise) confined to a local scope inside this function to keep it as pure as possible.

We’ll take the local list, and then find the length of the biggest chunk, and the length of the smallest chunk. There will only be one chunk smaller than the rest. We’ll fill it up with dummy songs so it’s the same length as the other chunks, and then disregard the dummy songs later.7

Once our lists are in order, we slice through them one section at a time. The slice gets shuffled8, then checked for our earlier-defined rules, but a little more relaxed: no artist or album twice in a row. If a song breaks a rule, we just move it to the end of the array and try again, always with a short-circuit so we don’t get caught in an endless loop.

And of course, we will always allow / ignore the dummy songs, so they don’t interfere with any real choice.

But there’s a problem! Like a real deck of cards, shuffling once just isn’t enough, so let’s go through this process at least seven times.

for (var i = 0; i <= ShuffleCount - 1; i++)
{
    var splitLists = SplitList(list.ToList(), SplitCount);
    list = MergeLists(splitLists);
}

And… the output looks really good, in my opinion!

[0]: Daft Punk - Digital Love 
[1]: America - Sister Golden Hair 
[2]: CAKE - Walk On By 
[3]: Guns N' Roses - Yesterdays 
[4]: Hey Ocean! - Be My Baby (Bonus Track) 
[5]: CAKE - Meanwhile, Rick James...
[6]: Fitz and The Tantrums - Breakin' the Chains of Love 
[7]: Digitalism - Battlecry 
[8]: Harvey Danger - Flagpole Sitta 
[9]: Guttermouth - I'm Destroying The World

However, this only really works well in the areas where the plain-old Fisher-Yates shuffle doesn’t. When used on smaller or more homogeneous sets, the results still leave something to be desired. These two shuffle methods complement each other, but cannot replace each other.

Shuffle Factory

So what happens now?

I thought about checking the entire playlist beforehand to see which algorithm should be used. But there’s no one-size-fits-all solution for this. Because, like my iTunes library, there could be a playlist with a huge number of albums, and also a huge number of completely unrelated singles.

So let’s get crazy and use both.

First we need to determine the boundary between the Fisher-Yates shuffle and the “Merge” Shuffle (for lack of a better term). I’m going to just use my instincts here instead of any hard analysis and say: if it’s a really small playlist, or if more than x percent of the playlist is one artist, then we’ll use the Merge Shuffle.

private ISortableList<T> GetSortType(IEnumerable<T> itemList)
{
    var items = itemList.ToList();
    if (HasLargeGroupings(items))
    {
        return new MergedShuffleList<T>(items);
    }
    return new FisherYatesList<T>(items);
}

public bool HasLargeGroupings(IEnumerable<T> itemList)
{
    var items = itemList.ToList();
    if (items.Count <= 10)
    {
        // item is essentially a single group (or album)
        // no point in calculating.
        return true;
    }

    var groups = items.GroupBy(s => s.ArtistName)
        .Select(s => s.ToList())
        .ToList();

    var biggestGroupItemCount = groups.Max(s => s.Count);

    var percentage = (double)biggestGroupItemCount / items.Count * 100;

    return percentage >= 15;
}

Pretty straightforward! GetSortType performs the check and returns a new shuffler accordingly, while HasLargeGroupings does the actual grouping math.

Now let’s shuffle.

public void ShuffleLongList(IEnumerable<T> itemList,
    int itemChunkSize = 100)
{
    var items = itemList.ToList();
    if (items.Count <= itemChunkSize)
    {
        chunkedShuffledLists.Add(GetSortType(items).Sort().ToList());
        return;
    }

    items = new FisherYatesList<T>(items).Sort().ToList();
    
    // split into chunks
    var chunks = items.Chunk(itemChunkSize).ToArray();

    // shuffle the chunks
    var shuffledChunks = new FisherYatesList<T[]>(chunks).Sort();

    foreach (var chunk in shuffledChunks)
    {
        chunkedShuffledLists.Add(GetSortType(chunk).Sort().ToList());
    }
}

Again, pretty simple.

Split our input into chunks of a given size (here, defaulting to 100). Again we’ll do a little short-circuiting and say that if the input is smaller than the chunk size, we’ll just figure out the shuffle type right away and exit immediately.

Otherwise, we do a simple shuffle of the input list and then split it into chunks of the desired size. I chose to do this preliminary shuffle as an extra degree of randomness. I hate hitting shuffle on a playlist, playing it, then coming back and shuffling again and getting the same songs at the start.9 So this will be an extra measure to guarantee the start sequence is different every time.

Next we shuffle the chunk ordering. Again, using Fisher-Yates, and again, for improved starting randomness.

After that we just iterate through the chunks and shuffle them according to whichever algorithm performs better for that particular chunk.

The output here is, again, really nice in my testing. I ran through and checked multiple chunks and felt overall very pleased with myself, if I’m being honest.

[0]: Rina Sawayama - Chosen Family 
[1]: Clutch - Our Lady of Electric Light 
[2]: Matchbox Twenty - Cold 
[3]: Rocco DeLuca and The Burden - Bus Ride 
[4]: Rob Thomas - Ever the Same 
[5]: MIKA - Love Today 
[6]: CeeLo Green - Satisfied 
[7]: Metallica - For Whom The Bell Tolls (Remastered) 
[8]: Elton John - I'm Still Standing 
[9]: Rush - Closer To The Heart 
... 
[0]: Wax Fang - Avant Guardian Angel Dust 
[1]: Journey - I'll Be Alright Without You
[2]: Linkin Park - High Voltage 
[3]: TOOL - Schism 
[4]: Daft Punk - Giorgio by Moroder 
[5]: Fitz and The Tantrums - L.O.V. 
[6]: Stone Temple Pilots - Vasoline (2019 Remaster) 
[7]: Jewel - You Were Meant For Me 
[8]: Butthole Surfers - Pepper 
[9]: Collective Soul - No More No Less

Outro

I don’t think there’s much further I can take this. I know that if I looked closer at the end results, I could find something else to change. There’s a whole world of shuffling algorithms out there, and plenty to learn from. If I felt so inclined I could write something to shuffle my playlists for me, but this exercise was really to learn first why shuffling never seemed good enough, and second whether I could do better.

(The answers, as usual, were “it’s complicated” and “maybe”.)

Further Reading

  • The source code for my work is over at my github.
  • Spotify, once upon a time, did some work on their shuffling and wrote about it here
  • Live365 has a brief blog post on shuffling and DMCA requirements here

How to Teach Yourself to Code

Several times over the past few years I’ve been asked how I learned to code. I didn’t go to school for it, and it wasn’t originally something I aspired to do. So I never really had a good, straightforward answer, and just ended up rambling on for a while. The more I thought about it, the more I realized that my path to becoming a software developer was almost as meandering as my answers, and just as packed with fitful starts and dead ends. So, I figured I would sit down and write it out, though unlike my Teach Yourself SQL post, this one has a much wider scope: how to teach yourself to code in general.

Or, more specifically, how I taught myself. Everyone learns differently, so just because something worked for me doesn’t mean it will work for someone else. I’ll also try to include “further reading” and other types of resources that I didn’t use myself, but know to be useful.

So… how do you teach yourself to code?

0. Get Very Comfortable with Google

Or your search engine of choice.

The point is that part of being a software developer, or working in any capacity with technology, is being able to find and parse information online. You need to be able to form your question in a way someone else can understand, then find the appropriate resource, then understand what you’re reading.

This might be official documentation from a company about its products. It might be a StackOverflow post, or a thread on a forum. You might need to be the one to ask the question, because it hasn’t been asked before. Which leads into the more important point, way more important than just knowing how to google something:

A fundamental part of being any good at any of this is the ability to clearly define your problem.

This will not only make it easier to find answers to your questions, but it will make it easier for you to solve your own problems. The vast majority of the work is taking a nebulously-defined problem, breaking it down into solvable chunks, and then solving those chunks. Remember word problems in your math classes in high school? It’s like that, forever.

1. Pick a Language

I know this seems like a weird first step. How do you pick a programming language if you don’t know anything about programming? In reality it doesn’t matter very much which language you pick; what matters is that you pick a language and stick with it.

I offer this advice from personal experience. The list of languages I picked up and forgot is longer than the list of languages I use today. First was Ruby, when I was 11 or 12 and building games with RPG Maker XP. Then it was C, then C++, then JavaScript… the problem was, every time I started with a new language it was like starting over, because I didn’t stick around long enough to master the fundamentals. Once you get the fundamentals down, moving languages is easy. But you’ve gotta stick around long enough to do that. That’s the hard part.

That said, if I were to start again today, I’d pick one of these and stick with them.

  • C#: Developed by Microsoft. The tools are free, the documentation is extensive and detailed, and you can pretty much choose exactly how deep you want to dive here. The language and the tools have everything built-in to let you get started easily, and learn its inner workings at your own leisure. This is my favorite language, and my default when starting a new project.
  • JavaScript: If you can use a web browser, you can write and run JavaScript. Nothing extra is needed to run this code, making it the easiest to just get up and running. As a bonus, you’ll never be out of a job if you can master this language.
  • Python: Similar to JavaScript in a lot of ways, this still requires some setup before you can use it. Once set up, though, Python is another great language to start with, and perfect for building everything from small tasks to games to big web server applications.

Obviously, you should do your own research, but don’t get too deep in the weeds. Just pick a language, any language, and decide that it’s the language you’ll use to learn how to program. Some do come with a learning curve just because extra tooling might be needed to run them (C#, Python), so decide if you want to fiddle with installers or just get going (JavaScript).

2. Learn Your Fundamentals

Now it’s time to hunker down and start the real learning. I linked to some documentation for each language above, which is a great place to start.

From a bird’s-eye view, your learning path would cover:

  • Data Types: These are the building blocks of programming, and are how your code is represented to you, the coder, and how it’s interpreted by the machine. It’s the difference between an integer (1), a decimal (1.0), and regular text ("1"), and how these can and can’t interact with each other.
  • Operators: These are how you set and update your data. Like mathematical operations, they let you add things together, check for equality, and more.
  • Conditions and Loops: “If-this-then-that” logic, or “for every item in this list” logic.
  • Functions / Methods: Like formulas, or small mini-programs. If data types, operators, and conditions are water, rocks, and paste, then functions and methods are the bricks that make the foundation for a program. You put everything together into a method, then call that method from other code to do a predefined set of work. Like “2+2=4”.
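To make those four terms concrete, here’s a tiny, hypothetical C# snippet that uses all of them: a data type, some operators, a condition inside a loop, and a function you can call from other code.

// a function (method) that adds up only the even numbers in a list
static int SumOfEvens(List<int> numbers)   // List<int> is a data type
{
    var total = 0;                         // 0 is an integer
    foreach (var number in numbers)        // a loop over every item in the list
    {
        if (number % 2 == 0)               // a condition using the % and == operators
        {
            total = total + number;        // the + and = operators update our data
        }
    }
    return total;
}

// calling the function from other code
var result = SumOfEvens(new List<int> { 1, 2, 3, 4 }); // result is 6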

Remember that Google is your friend, and there are countless resources on all of the above out there. The official documentation for the languages I posted above includes info on these terms, or the resources I will link to below will guide you through them in a more structured manner.

3. Dive a Little Deeper

This is where the Computer Science comes in: algorithms and data structures.

This is a big topic. Entire textbooks are written on just this alone. The important thing to remember is that every piece of code you write comes with its own complexity. Some code will perform better than other code, and many problems can be generalized, or abstracted, and solved with algorithms that some other smart people have already figured out. Like, what’s the best way to sort a list of items of a certain type?
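To make “some code will perform better than other code” concrete, here’s a tiny, hypothetical example. Both functions answer the question “is this name in the list?”, but the second only works on a sorted array and takes drastically fewer steps on large inputs. This is exactly the kind of difference Big O notation (mentioned below) describes: roughly O(n) versus O(log n).

// linear search: check every item, one by one
static bool ContainsLinear(string[] names, string target)
{
    foreach (var name in names)
    {
        if (name == target)
        {
            return true;
        }
    }
    return false;
}

// binary search: repeatedly cut a sorted array in half
static bool ContainsBinary(string[] sortedNames, string target)
{
    var low = 0;
    var high = sortedNames.Length - 1;
    while (low <= high)
    {
        var mid = (low + high) / 2;
        var comparison = string.Compare(sortedNames[mid], target, StringComparison.Ordinal);
        if (comparison == 0) return true;
        if (comparison < 0) low = mid + 1;
        else high = mid - 1;
    }
    return false;
}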

Anyway, here are two specific places to start:

  • Look up “Big O Notation” and how it relates to the speed or complexity of an algorithm. You don’t need to know specific algorithms to understand this. It’s a good primer on efficiency, and an important part of describing the complexity of a particular piece of software.
  • You like video games? Look up “binary space partitions” and how they’re used to generate dungeons in roguelikes. Here’s a link to check out, without any code in it. This is a great example of taking a more generalized data structure / algorithm, and applying it to the specific problem of “how do I automatically generate dungeons for my video game?”

4. Make Something

As I stressed in my SQL post, if you don’t use it, you lose it. All this reading is worthless if you don’t do anything with it. It will just fall out of your head in six months and you’ll have to start over again.

Pick a project, any project. Find a small task you do that’s repetitive and might be automated and try to automate it. Go through this list of project ideas and work down it. Build a Pong clone, or a Breakout clone. Find a textbook and do the questions at the end of each chapter.

Start small, and work your way up to bigger things. Don’t overwhelm yourself. But keep going, keep building. It’s okay to just follow tutorials and copy code as long as you’re also thinking critically about what you’re copying and why it works and, better, how you might do things differently on your own. Go past the end of the tutorial. Keep going. It’s okay to get stuck, it’s okay to break things. That’s how you learn.

But keep going. Keep building and learning and growing.

5. Read

Maybe this should have been first. Or maybe it should replace everything above. Textbooks remain the single best method of distributing and acquiring knowledge. The good ones structure themselves in a logical and approachable way so that you can start out as ignorant as the day you were born, and finish as an adept.

The only downside is that technology moves at a fast pace, and textbooks for things like specific languages might be a little outdated by the time you find them. That’s okay, really. Most languages don’t change enough for it to matter, or you could use the book’s structure to set your pace and help you find more modern materials that are available elsewhere.

Or you could focus on books that aren’t about a particular language. Some of the best books on programming are more about how to approach the work, ways to think about code, and the “soft skills” many of us lack in the industry; that is, how to navigate people, culture, and politics. Here’s a good list of these books.

6. Further Resources

Okay, you’ve made it this far, but maybe everything above is still too vague to really get started. Here’s a list of free resources designed to get you going. Most of these are structured learning, like online classes, for you to do at your own pace.

  • Harvard University’s CS50 Course: My advice above kind of contradicts this course in the sense that the course covers multiple programming languages. When in doubt, trust the professionals, because they know better than me. This course covers everything from the very basic on through to some fairly complex projects. Really, you can’t beat this. There are other universities out there like MIT that also post courses online for free.
  • /r/learnprogramming: A friendly little community with good resources and helpful community members. I just linked to the wiki, but the sub itself is great for if you have questions and for finding even more great programming links. Click around in here and you’ll definitely be able to find where to get started on just about anything.
  • The Odin Project: Everything you need to know to become a successful and well-versed web developer. A full curriculum, from start to finish.
  • Project Euler: Not necessarily a tutorial resource, but a great way to flex your math and problem solving skills. This website is a series of math problems that start fairly simple and very quickly escalate. These are a great way to get your bearing on a new language.
  • The RogueBasin Roguelike Tutorial: This is a tutorial for making a roguelike game using Python and a Python library called libtcod. It’s a great tutorial, has pointers on using Python in general, and by the end you have a fully functioning game base which you could extend, or at the very least you’ll learn a great deal about handling user input, graphics, dungeon generation algorithms, and more. It’s fun and the roguelike genre is a great playground for any developer.

That’s all, folks!

This was by no means an exhaustive how-to. Nor did it cover the actual path I took. But if I started over, this is how I might do it today. Learning new things is a skill all on its own. Sometimes the best thing is to just get a few pointers and go off to explore on your own. That’s how my brain works.

Custom Authorization Schemes in .Net Core 3

Recently while working on a .Net Core API project I had to add some authorization features to further protect endpoints based on user-level security. This security scheme was conceptually pretty simple, but a little complicated to implement. In the end I had to implement some custom authorization middleware myself, so I would have just the right level of granularity and control.

The Problem

The project I was working on has some fairly granular and customizable security controls originally implemented in the legacy codebase. This API needed to reimplement the same security controls for parity with the legacy codebase, so that from a user’s standpoint nothing changes at all.

To do this, I wanted to put a top-level auth check on a resource, or endpoint, so the API could reject unauthorized requests right off the bat without even thinking of going any further. A simple want, but the details were a little hairier.

In this system, each user is assigned a security group. These security groups determine the accessibility of roughly 800 controls, actions, data points, and more. This translates to roughly 800 VMAD-style permission combinations. Further, security groups were customizable per-installation. So an ADMIN at one site might be a SYSADMIN at another; or, even if the names are the same, the permissions for the ADMIN groups at each site might be just slightly different.

This immediately ruled out any baked-in authorization feature. I couldn’t use role-based or policy-based authorization, because these rely on roles or policies being named in a standard fashion, and I had zero guarantee of that.

Claims-based authorization was likely out. Stuffing the required data into claims data itself didn’t appeal to me. Neither did having to write out the requirements for every possible VMAD permission needed.

Fortunately, there was one constant across all this: because of the way the permission data was stored, the index of the permission would never change. So the ability to “view(44)” meant the same no matter what configurations you made.

So I decided my end goal would be simple: slap a custom authorize attribute on the endpoints that need one, and then move on. It would look like this1:

[CanView(44)]

The Solution

In the end I needed to implement my own IAuthorizationPolicyProvider, along with custom attributes and an in-memory cache storing configured security information.

I’ll show some examples for a theoretical “CanView” requirement, assuming we’re implementing a classic VMAD permission scheme with the structure I outlined above.2

Defining an Authorize Attribute

We’ll need to start with the authorization attribute first. This includes the IAuthorizationRequirement, a new attribute implementation, and the handler that evaluates the requirement.

public class CanViewRequirement : IAuthorizationRequirement
{
  public int Index { get; }

  public CanViewRequirement(int index)
  {
    Index = index;
  }
}

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method, AllowMultiple = true)]
internal class CanViewAttribute : AuthorizeAttribute
{
  const string POLICY_PREFIX = "CanView";
  
  public int Index
  {
    get
    {
      if (int.TryParse(Policy.Substring(POLICY_PREFIX.Length), out var index))
      {
        return index;
      }
      return default;
    }
    set
    {
      Policy = $"{POLICY_PREFIX}{value}";
    }
  }

  public CanViewAttribute(int index)
  {
    Index = index;
  }
}

internal class CanViewAuthorizationHandler : AuthorizationHandler<CanViewRequirement>
{
  private readonly IServiceProvider Services;

  public CanViewAuthorizationHandler(IServiceProvider services)
  {
    Services = services;
  }

  protected override async Task HandleRequirementAsync(AuthorizationHandlerContext context, CanViewRequirement requirement)
  {
    if (!context.User.Identity.IsAuthenticated)
    {
      return;
    }

    SecurityGroup security;
    using (var scope = Services.CreateScope())
    {
      var securityService = scope.ServiceProvider.GetRequiredService<MySecurityGroupService>();
      security = await securityService.GetSecurityGroupInfoAsync(context.User).ConfigureAwait(false);
    }

    if (security.CanView(requirement.Index))
    {
      context.Succeed(requirement);
    }
    else
    {
      // reject and log the request accordingly
    }

    return;
  }
}

Okay, there’s a bit going on here. First, we create a class that implements the IAuthorizationRequirement interface. The CanViewRequirement simply holds the index of the permission in our data, and implements that interface so we can use it for our AuthorizationHandler down below in HandleRequirementAsync.

In HandleRequirementAsync, we check an in-memory cache holding the security groups to see if the calling user’s security group does indeed have the requested permission. If so, context.Succeed(requirement) allows the request through; otherwise, we log the failed request and it’s rejected by default.
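For context, the SecurityGroup object coming out of that cache can be as simple as a thin wrapper over the stored permission flags. This is just a sketch of the idea, not the real class; the storage format is an assumption.

public class SecurityGroup
{
  // one entry per permission index; true means "can view"
  private readonly bool[] viewFlags;

  public SecurityGroup(bool[] viewFlags)
  {
    this.viewFlags = viewFlags;
  }

  public bool CanView(int index)
  {
    // out-of-range indexes are treated as "not permitted"
    return index >= 0 && index < viewFlags.Length && viewFlags[index];
  }
}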

The CanViewAttribute class simply lets me set all this up as defined above, by using an attribute on a method or endpoint like this: [CanView({index})]

A note about attributes

In C# attributes can be used for metadata and code extensions. They go above your method declaration and provide helpful documentation, often extending behavior in a standardized way. It’s helpful to think of attributes as wrappers around methods, especially in this case. If you look at a method like this:

[CanView(44)]
public async Task<IActionResult> GetFooAsync()
{
  // do things
  return Ok();
}

And then unbox it, it might look something like this (pseudocode):

public async Task<IActionResult> CanGetFooAsync(int index, GetFooAsync getMethod)
{
  if (authService.IsAllowed(CanView, index))
  {
    return await getMethod;
  }
  return Unauthorized();
}

public async Task<IActionResult> GetFooAsync() 
{ 
  // do things 
  return Ok(); 
}

The Authorization Provider

All the above is well and good, but none of it does anything on its own. We have to set up a policy provider for the right code to get called when the attribute is reached. For this, we need to implement IAuthorizationPolicyProvider:

internal class VmadPolicyProvider : IAuthorizationPolicyProvider
{
  const string POLICY_PREFIX_VIEW = "CanView";
  public DefaultAuthorizationPolicyProvider FallbackPolicyProvider { get; }

  public VmadPolicyProvider(IOptions<AuthorizationOptions> options)
  {
    FallbackPolicyProvider = new DefaultAuthorizationPolicyProvider(options);
  }

  public Task<AuthorizationPolicy> GetDefaultPolicyAsync() =>
    FallbackPolicyProvider.GetDefaultPolicyAsync();

  public Task<AuthorizationPolicy> GetFallbackPolicyAsync() => 
    FallbackPolicyProvider.GetFallbackPolicyAsync();

  public Task<AuthorizationPolicy> GetPolicyAsync(string policyName)
  {
    if (policyName.StartsWith(POLICY_PREFIX_VIEW, StringComparison.OrdinalIgnoreCase) &&
        int.TryParse(policyName.Substring(POLICY_PREFIX_VIEW.Length), out var index))
    {
      var policy = new AuthorizationPolicyBuilder(JwtBearerDefaults.AuthenticationScheme);
      policy.AddRequirements(new CanViewRequirement(index));
      return Task.FromResult(policy.Build());
    }

    return FallbackPolicyProvider.GetPolicyAsync(policyName);
  }
}

All this is a semi-fancy way to generate the policies we need on an as-needed basis. Instead of hardcoding every CanViewRequirement possibility from 1 to 800, these are built for us by the attributes and requirements as we go. The interface also requires default and fallback policies, for which I simply delegate to the built-in DefaultAuthorizationPolicyProvider created in the class’s constructor.3

This is the ultimate goal of the code: do the work once, and use it everywhere. If we wanted to extend or refine details behind our CanView security, that is only done in one place.

Plugging it All In

Since this is .Net Core 3, we’re relying on dependency injection to keep things afloat. So we need to add services for everything we’re using here:

services.AddSingleton<IAuthorizationHandler, CanViewAuthorizationHandler>();
services.AddSingleton<IAuthorizationPolicyProvider, VmadPolicyProvider>();
services.AddAuthorization();

This tells the software that we’re using some custom authorization handlers and policy providers and points it to their definitions.

After this, using custom authorization attributes for our endpoints is a breeze. With just one line, we can narrow access to a single flag in the database out of thousands. Extending this would likewise be a breeze. Definitely easier than defining every single one manually, or crossing our fingers and hoping a security group name doesn’t change!

Wrapping Up

This may or may not be the best way to solve this particular problem. But we analyzed the trade-offs and made a calculated decision. If you were to go through and comment the code above with the intentions behind it all, I personally believe it would be easier for a new developer to hop on and get going than the alternatives. While complexity has to go somewhere, it’s better for these types of abstractions to have a gentle curve. Let someone care about the details only if they need to.

Mang 1.0 is Released

As part of my ongoing work with SCOREDIT, I’ve been working on porting some old code to a new utility for use in my newer project as well as a few other things I’ve got going on. To that end, I released Mang (Markov Name Generator), a tool for generating names and words.

Sigh… A procrastinator’s work is never done…

Mang uses Markov chains to generate output that is similar to, but not the same as, a given set of inputs. This means if you feed it a dictionary of English words, it will output words that look like English but are just a little bit off. This is great for building up conlangs, creating NPC names for your tabletop games, etc.
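If you’re not familiar with the technique, the core idea is small enough to sketch in a few lines. This isn’t Mang’s actual implementation, just an illustration of a character-level Markov chain keyed on pairs of letters.

// build a table: for each pair of letters, which letters tend to follow it?
static Dictionary<string, List<char>> BuildChain(IEnumerable<string> names)
{
    var chain = new Dictionary<string, List<char>>();
    foreach (var name in names)
    {
        var padded = "^^" + name.ToLowerInvariant() + "$"; // markers for start and end
        for (var i = 0; i < padded.Length - 2; i++)
        {
            var key = padded.Substring(i, 2);
            if (!chain.TryGetValue(key, out var followers))
            {
                followers = new List<char>();
                chain[key] = followers;
            }
            followers.Add(padded[i + 2]);
        }
    }
    return chain;
}

// walk the table to generate a new name
static string GenerateName(Dictionary<string, List<char>> chain, Random random)
{
    var result = "^^";
    // cap the length so a degenerate chain can't run forever
    while (result.Length < 24)
    {
        var followers = chain[result.Substring(result.Length - 2)];
        var next = followers[random.Next(followers.Count)];
        if (next == '$')
        {
            break;
        }
        result += next;
    }
    return result.Substring(2);
}

Feed it a few hundred real names and the output starts to look plausible; feed it an English dictionary and you get English-ish words.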

The library comes “batteries included,” meaning it already has a small collection of names you can use for name generation. The readme has more details on this. You can also browse the source to see what all is available.

I’ve had the old Mang lying around for a long time gathering dust. Built with WPF, it was serviceable and the output was okay. But I wanted to extract the Markov and name generation code and make it into something more portable. I was also frequently unhappy with the output, hitting the refresh button forever until I found something inspiring.

The new Mang has a few improvements to word generation to ensure no gibberish is put out. The library itself is tiny – just a few files. And its interface is small and foolproof.

To prove all these things to myself I plugged it into SCOREDIT today and got to generating:

Mang used with SCOREDIT to generate words

As you can see, it’s not all perfect, but there are a few good ones in there to serve as starting points. In the future I plan to extend the generation capabilities to come up with more “realistic” output, but that’s another blog post.

Head on over to the repo to check it out!

SCRL: Week 06

After motivation petered out, it was easy to rationalize not doing anything on this project. The excuses rolled in: what’s the point, it will never make money; no one will play it; making games fun is hard. Etc.

Times like these, I remind myself that when motivation wanes, it’s usually for a good reason. I’d hit a wall, and it had nothing to do with the excuses I was making. After some thought and a few false starts, I finally figured out what was happening: working on SCRL was no longer fun.

Above all else this project is supposed to be fun for me. It’s a side project for a game that likely no one will ever play but me. So why not make it fun just for me? And why not work on the things that are fun and interesting to me? With that in mind, I thought about what I dreaded doing the most for the project, then thought of a way to make it fun for myself to do it.

Enter… SCOREDIT! (I am not good at naming things.)

SCOREDIT is a GUI application to add and edit game data, from NPCs to game regions (dungeons and towns) to treasure drops. A few semi-difficult decisions were made in the process of setting this up:

  • The GUI will be written in WPF.
  • The CSV files are getting replaced with a SQLite database.

I went with WPF because I know it and can move relatively quickly with it. SQLite (along with Dapper) will make it so I worry less about how I’m serializing/deserializing data and more about the data itself. Plus, most of the data does require some kind of relational integrity, such as monsters referencing aspects, treasure classes, and other monsters. Using a proper mini-database will make it harder for me to shoot myself in the foot.
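As a rough illustration of why that appeals to me, reading monster rows out of SQLite with Dapper is about this much code. The table and Monster shape here are made up for the example, not the real schema.

public class Monster
{
    public long Id { get; set; }
    public string Name { get; set; }
    public int Level { get; set; }
    public string TreasureClass { get; set; }
}

// using Dapper and Microsoft.Data.Sqlite
public static IEnumerable<Monster> LoadLowLevelMonsters(int maxLevel)
{
    using var connection = new SqliteConnection("Data Source=scoredit.db");

    // Dapper maps the returned columns onto Monster properties by name
    return connection.Query<Monster>(
        "SELECT Id, Name, Level, TreasureClass FROM Monsters WHERE Level <= @maxLevel",
        new { maxLevel });
}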

a screenshot of SCOREDIT

Nothing pretty, but that’s not the goal here. So far it’s purely UI. I’m using this as a kind of mockup phase, where as I think of attributes or pieces of data to add, I add the boxes for them. Then, once I’m done, I’ll write up the database to back it. After that, I’ll rewire SCORLIB to talk to the database instead of the CSV files.

So far, for the screen above, what I like most is the auto-generation of stats. Of course, this could be done at runtime, but I like to tweak things beforehand and I don’t want to rely too much on in-game random generation for things like this. The plan is to choose a monster type — is it a tank? magic user? a speed demon? — and adjust its stat distribution accordingly. When I was still staring at the CSV files, manually choosing and writing out stats was one of the things I dreaded most.
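A rough sketch of what I mean by adjusting the distribution. The archetypes, stat names, and weights below are hypothetical, not what SCOREDIT actually does.

enum MonsterArchetype { Tank, Caster, Speedster }

// split a total stat budget according to the archetype's weighting
static (int Strength, int Intellect, int Speed) RollStats(MonsterArchetype archetype, int budget, Random random)
{
    var (str, intl, spd) = archetype switch
    {
        MonsterArchetype.Tank => (0.6, 0.2, 0.2),
        MonsterArchetype.Caster => (0.2, 0.6, 0.2),
        _ => (0.2, 0.2, 0.6),
    };

    // wiggle the budget a little so no two rolls are identical
    var total = budget + random.Next(-5, 6);

    return ((int)(total * str), (int)(total * intl), (int)(total * spd));
}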

This UI is bound to change quite a bit before it’s finished.

SCRL: Week 05

2020-03-28

Wow. Already been a month since I started working on this project. Kind of crazy how quickly that time just… disappeared.

Anyway, today I started working on some random generation utilities. Not a lot of this was new work — I’ve already done 90% of that job in SCORLIB. My factories are serving me well, so far. But, in order to actually test and make sure the random generation is producing sensible results — for some definition of sensible, I guess — I needed a way to hammer generation without loading up game levels.

Enter SCORLIB CMD:

cmd line output

(Naming things is hard.)

Essentially, I wanted a command line utility I could use to generate the things I tell it to generate. I want to be able to inspect the output of this generation in a game-agnostic way just to see how things work and make sense, because this is how I think about things. This is not a way to test game balance (though in the future it might get some features like that), nor to test anything specifically game related at all.

Say I want to see what treasure a gordix1 drops:

command line output

Or I want to see a full list of possible drops:

command line output

This is a very simple use case; the screenshots are just illustrative. Because this game is going to be fairly loot-driven, I need to iron out how “good” the loot drops themselves feel. I think this is the first step to making sure it’s all right. Once I have a decent amount of Things for monsters to drop, I can throw them into the game and then run around and test there. …after I implement the inventory system…

2020-03-29

Cleaned up SCORLIB a little and had it load some item properties that weren’t being read from data files. Then I started work on a proper Inventory screen. Not much to show yet – still rough work, and I think I need to step away from the code and use a pen and paper to sketch possible layouts before I do any more work.

2020-03-30

I spent 2 hours modding Fallout 4 today. So, in other words, I was not productive at all.

2020-04-01

Brain is feeling fried this evening. Tinkered with the Inventory screen some more. Decided to probably start off from something like the ADOM inventory screen, more or less, and go from there. Figured they got something good going. I got just a little bit done as far as drawing everything up properly – not enough to show or feel productive about.

SCRL: Weeks 03 & 04

2020-03-15

Spent some time writing up another blog post on how I want Aspects and similarities to work. That post will probably go up later this week. After writing, I went and sketched up some code to calculate the “similarity index” between two monsters, and made a little command line application that I could use to test SCORLIB components and run through the code without having to run the game and test there. This made it very easy to compare a few different approaches to calculating this index, which I go over in that blog post.

2020-03-19

It’s been a busy week and I haven’t had any time to work on this project, except for some thinking here and there. The last time I started working on this I started to implement loading game data from SCORLIB. This involved parsing flat files into game objects and setting up factories that I would use to retrieve new objects / mobs / items / etc. as needed. This all works really well – I even have loot generation going, very similar to Diablo’s mostly because I straight up stole the Diablo II file format and style of loot / mob generation.

However, I came across a snag: I want SCORLIB to be as “pure” as possible. Meaning, I don’t want to sully the library with implementation details for whatever game “front-end” is using it. So, even though this game is a roguelike and uses ASCII graphics, I still might want to use SCORLIB with another project later. The problem here is I still need somewhere to store graphics-related metadata so my factories can function without huge gobs of code gluing them together and slapping graphics on things. I think what I’ll end up doing is adding a second set of metadata files to SCORLIB in their own folder / namespace and joining the graphics data (colors, the character glyphs representing entities, etc.) with game data. This way these graphics settings are still separate, but also still loaded from disk instead of hard coded.

2020-03-24

Decided to combine two weeks of dev journaling because… well, I didn’t do much last week.

Today I got a good portion of the work done for my monster factory to a) load in all the game data from SCORLIB on disk and b) load in all MonoGame-specific data, also from disk. So at “design time” in my tab-delimited text file I can set all required display settings for my in-game objects, divorced from the logic behind it all (stats and other rules). If my parser somehow fails to convert the string to a proper color / glyph representation (like if I fat-finger while typing) then it gracefully falls back to a hot-pink foreground, blue background, and ‘?’ as a glyph.
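The fallback itself is nothing fancy. Something along these lines, sketched with plain RGB tuples and an assumed “R,G,B” text format rather than the actual MonoGame types:

// parse "255,0,255" style color text; fall back to hot pink if anything is off
static (byte R, byte G, byte B) ParseColorOrFallback(string text)
{
    var parts = text.Split(',');
    if (parts.Length == 3 &&
        byte.TryParse(parts[0], out var r) &&
        byte.TryParse(parts[1], out var g) &&
        byte.TryParse(parts[2], out var b))
    {
        return (r, g, b);
    }

    return (255, 105, 180); // hot pink: obviously wrong on screen, so typos are easy to spot
}

// the glyph gets the same treatment, defaulting to '?'
static char ParseGlyphOrFallback(string text)
{
    return string.IsNullOrEmpty(text) ? '?' : text[0];
}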

Because all these objects are relatively small in the grand scheme of things, I just hoover everything up into memory when the game starts, so when I need a new instance of something I request it from the factory. Like so many decisions before this, if it does become a problem, I will revisit it — but I want to move forward adding things to the game without spending too much time right now on what may be premature optimization.

The factory stuff isn’t done, but the loading from disk is 90% done. Probably tomorrow I’ll finish the monster factory and start generating monsters randomly for real!

2020-03-25

It is done!

Monsters, both their stats and display info, are drawn from disk into little factories where they can get pumped out onto maps at will. Because of the number of properties and objects that live inside the Monster class, instead of trying to get clever with reflection or MemberwiseClone or any number of things, I just sucked it up and wrote out all the assignments needed for cloning operations. It took two excruciatingly long minutes to do, but in the end it was far easier to write, test, and look at than any other code would have been — and it runs faster, too. Imagine that.

Next it’s time to extend the MapFactory to randomly assign the next map type, then assign that map a Deity type. I think I want the map factory methods to populate the maps with their creatures, as well. I’m not sure what’s next, but there’s a few options:

  • Spawning boss monsters
  • Spawning monsters in groups based on their monstats metadata
  • Implementing treasure classes for loot

Week(s) In Review

Can’t always have productive weeks. When your day job is also programming, sometimes it’s hard to motivate yourself to do more of the same during your free time. In any case, I got a few key things done these past couple weeks, and now things are starting to get “hard”. Designing and implementing systems is easy, but making them playable and fun? That’s the challenge.