Sudoku has a lot of very straightforward nice features. Instead of being a single hard puzzle, it's a puzzle schema, creating a never-ending set of puzzles to solve. Mentally, it's quite straightforward to attack, not even requiring addition. The difficulty of puzzles varies, allowing a solver to pick one of their desired level. There's a variety of interesting techniques for making deductions. And of course, the rules are simple enough to explain in just a few minutes.

There are several fairly straightforward techinques most people use for making deductions in Sudoku. First, you can split each cell into nine yes/no states, for whether that cell is each possible digit. Deductions then available include: If there's only one space for a given digit in a single row/column/subsquare then that cell must be that digit. If all possibilities but one for a single cell are excluded then that cell must contain the remaining digit. If all available places for a given digit in a single subsquare occur in a single row/column, then that digit can't appear anywhere else in that row/column. Likewise for occurences in a row/column which all are in the same subsquare. Finally, if two cells within a single row/column/subsquare each only have two possibilities left, and those possibilites are both the same, then neither of those digits can appear anywhere else in that row/column/subsquare.

These techniques are all special cases of two very powerful approaches, both of which look only at a subset of the known puzzle restrictions. The first one is that we ignore all possible data but whether each cell is a specific digit, and make deductions within that subset of the total puzzle restrictions. The other is that we look at the expanded information of whether each cell can be each specific digit, and limit our reasoning to a single row/column/subsquare. I leave it to the reader to figure out which of the simpler techniques are a special case of each of the more general techniques.

Problems with much simpler structure, such as 3sat, don't have convenient subsets of restrictions which can be used for inference, and experience has strongly indicated that the best way to attack them exhaustively is simply to use backtracking. Backtracking is the technique where you try one possibility, and work at it until you either find a solution or determine that there is none in that line, then move on to other possibilities. If you don't care about guaranteed results there are some randomized stochastic techniques, basically consisting of taking a partial solution and messing with it until it becomes a full solution, which work better. I don't think there are any better subsets for use on Sudoku than the ones I gave above, so for solving generalized Sudoku with side length greater than 3 your best bet would be to use both deduction techniques until they both stop yielding results, then resort to backtracking.

The way I solve Sudoku puzzles is to first figure out what I can deduce based on only the 1s, then only the 2s, then 3s, etc. and when I get to 9 wrap back around to 1. If I make an entire cycle without being able to deduce anything, I start writing what the possible digits are in each cell and make deductions based on that. This approach uses the first of the two powerful techniques, I prefer it mostly because it's an amount of information in a form I can readily hold in my head, unlike the other approach, which is too much information to hold in my little brain at once.

An interesting quirk of human cognition is that we're very bad at doing backtracking. While we can easily store relatively large amounts of information in a stack, and relatively large amounts of information arranged as an image, a quite small amount of backtracking information will throw us completely. I've long puzzled as to why this is, and suspect that it holds a number of clues to the workings of human cognition, but I digress. If you wish to resort to backtracking on a Sudoku puzzle, the most intuitive way to do it is to make multiple complete copies of the entire board, and put each possibility for the thing being branched on in exactly one of them. You're most likely to make rapid progress by branching on something with exactly two possibilities, for example a single cell which must be one of two possible digits, or a row which has two possible positions for a particular digit to go in. You can then deduce on each one separately without regards to the other one, and if you wish to branch on one again you simply add more. If you find that a particular board is unsolvable, cross it off and start work on another one.

Generating Sudoku puzzles is an interesting problem. The obvious approach of starting with an empty board and adding a digit at a time until there's only one solution tends to produce relatively easy puzzles, since the amount of constraint added with each digit is a lot and the resulting puzzle tends to be overconstrained (and some other reasons, but they're harder to explain). A better approach is to start with a finished Sudoku position and repeatedly remove digits until there's no way to remove another digit and still have a unique solution. Oddly, it's hard to produce a 'random' solved Sudoku position, due to the interesting structure of the constraints, and it may be that certain solved Sudoku positions have somewhat anomalous, so perhaps picking one at random isn't the best approach and you'd like to specifically weight for or against ones with unusual properties. Since it's fairly easy to come up with a half-assed way of making an arbitrary solved Sudoku position with biases which aren't obviously all that horrible, I'm just going to gloss over the details for now.

An interesting flavor of puzzle can be generated by starting with removing all instances of a particular digit. It's easy to show that a Sudoku puzzle with a unique solution can't have all instances of two different digits be removed, and it's rare to come across puzzles with all instances of one digit missing, so they present an interesting challenge.

To make easier puzzle instances, one can either remove fewer digits, or only remove digits which allow for the unique solution to be found using simplified deduction techniques. The latter approach will do a much better job of finding puzzles which are of a consistent difficulty level for human solvers.

In terms of structure, a Sudoku puzzle is a 3x3x3x3 hypercube with the restriction that each cell occur exactly once in three of the six planes. What the other three planes are is left as an exercise to the reader.