in reply to Storable for user preferences

The "Relationally Correct" approach is to create one column for each preference. This gives you efficient fetching of all the prefs with a single select query, it lets you write convenient queries to see which users are using which options, it lets you run aggregate queries across the preferences, and so on. It also lets you declare an accurate data type for each preference. You don't need to put all the columns in the users table; you could make a 1:1 relation to a table of user-preferences if that keeps your code cleaner.

The downside from a programming perspective is that it means you need a schema change every time you need a new preference. I like to argue that the correct solution here is to make schema changes painless so that you can add columns as often as you like. It generally means you spend a lot of effort on tooling, but I think the results are worth it in the end.

EAV is the second option. The upside is you don't need to change your schema, ever. The downsides are that you have to write *much* more complicated queries to do all the normal relational things with your data. You also have to decide whether to correctly declare the data types (meaning multiple EAV tables or multiple 'value' columns) or just make everything a string, resulting in a lower quality data model. You may run into limits of your DB engine like how many joins are allowed in a single query, or you may have to write code that selects multiple rows to get one user's worth of attributes. The more EAV you use, the more effort you have to spend designing the code that runs the queries, and you can easily end up with performance problems. (though probably not on something as small as user preferences)

The third option is serialization. I used to recommend against this because it takes power away from SQL (e.g. selecting users where a preference is set to some specific value, comparing the number of users who use preference A vs. preference B, running a query to see if *any* user has selected some invalid option for preference X, performing a schema update where preference Y has been removed, and so on), but now most database engines let you reach into JSON data with special keywords! So, I would still recommend against Storable, but if you are using Postgres, use a "json" or "jsonb" column. If you decide you want to run fancy SQL queries later, you can learn the functions that let you access fields of the json and run the query in the database instead of needing to fetch *every* user to perform those kinds of queries.

In short, I recommend either continuing with one column per setting, and improve your schema-update scripts and workflows, or the jsonb option if you are using Postgres.

The specific job you're doing now is sort of low-importance for which method you choose, but if you choose one of the better options it gives you experience for the next time this problem comes up on some more critical/large-data scenario.

Replies are listed 'Best First'.
Re^2: Storable for user preferences
by Bod (Parson) on Oct 01, 2023 at 11:10 UTC
    In short, I recommend either continuing with one column per setting, and improve your schema-update scripts and workflows, or the jsonb option if you are using Postgres.

    Thanks for your detailed answer which confirms a lot.

    I'm using MariaDB but that has pretty decent JSON handling.