I think the difficulty with multithreading is multithreading itself. Once a program is designed to run correctly on multiple threads, getting it to run on 2, 3, 8, or 20 processors is mostly a matter of tuning.
From what I understand, the bulk of the work is making the application multithreaded in the first place.
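
As a rough sketch of what I mean (Python here; `count_primes`, `run`, and the chunk sizes are names and values made up for illustration): once the work is split into independent tasks that share no state, the number of threads is just a parameter. Note that in CPython the global interpreter lock keeps a CPU-bound example like this from actually speeding up with more threads, so read it as showing the structure only.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds):
    """Count primes in [lo, hi) by trial division. Touches no shared state."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        # n is prime if no d in [2, sqrt(n)] divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run(num_workers, limit=20_000, chunk=1_000):
    """Split [0, limit) into independent chunks and sum the per-chunk counts."""
    chunks = [(lo, min(lo + chunk, limit)) for lo in range(0, limit, chunk)]
    # The worker count is the only thing that changes between 2 and 20 threads.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return sum(pool.map(count_primes, chunks))
```

Calling `run(2)` or `run(20)` gives the same count; only the pool size differs. The hard part was arranging the work so that `count_primes` needs no locks, which is the "getting it multithreaded to begin with" step.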