Programming the 65816 in native mode gives you all kinds of advantages over the legacy 6502 architecture.
True, it does have a handful of upgrades over the 65C02, but not many. It still doesn't have multiply/divide instructions, you still have to clear the carry flag before adding (and set it before subtracting), and even though they are 16-bit, there are still only 3 registers.
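For anyone who hasn't run into the carry thing: ADC always adds the carry flag in, so a stale carry silently makes the result one too high. Here's a rough Python model of it, not real 65xx code. The names `adc8` and `add16` are made up for illustration, and it only covers binary mode (no decimal mode, no overflow flag).

```python
def adc8(a, operand, carry):
    """Rough model of an 8-bit, binary-mode ADC: A = A + operand + C."""
    total = a + operand + carry
    return total & 0xFF, total >> 8  # (new accumulator, new carry flag)


def add16(x, y):
    """16-bit add done the 8-bit way: CLC, ADC low bytes, ADC high bytes."""
    lo, carry = adc8(x & 0xFF, y & 0xFF, 0)   # carry cleared first (CLC)
    hi, carry = adc8(x >> 8, y >> 8, carry)   # carry from the low byte rolls in
    return (hi << 8) | lo, carry


print(adc8(0x10, 0x20, 0))            # carry clear: plain sum
print(adc8(0x10, 0x20, 1))            # stale carry: result is one too high
print(hex(add16(0x12FF, 0x0001)[0]))  # carry propagates into the high byte
```

The same carry chaining is why the explicit CLC is there in the first place: it lets you string ADCs together for wider numbers, at the cost of an extra instruction on every standalone add.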
Almost every other feature they added to the 65816 over the 65C02 already existed in the Hu6280: the two new block transfer instructions (the PCE has 4), store zero (STZ), push/pull of all 3 registers to the stack, INC/DEC on the accumulator, as well as the improved JSR and branch instructions. Other than the wider address bus (24-bit vs 21-bit) and the 16-bit regs à la Intel style (AH/AL), the Hu6280 is closer to the 65816 IMO.
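To put the address bus difference in numbers, here's a quick sketch. The helper name `address_space_bytes` is just for illustration; the bus widths are the ones quoted above.

```python
MIB = 1 << 20  # bytes per mebibyte


def address_space_bytes(bus_bits):
    """Number of distinct byte addresses on a bus of the given width."""
    return 1 << bus_bits


for name, bits in (("65816", 24), ("Hu6280", 21)):
    size = address_space_bytes(bits)
    print(f"{name}: {bits}-bit bus, {size // MIB} MiB addressable")
```

So the gap is 16 MiB vs 2 MiB of physical address space.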
3.58MHz mode is typically only used for certain peripheral accesses, such as backup RAM.
The Hu6280 only has two speeds, 7.16 MHz and 1.79 MHz (yes, the magazines are wrong); games only used 1.79 MHz for backup RAM access. The PSG is supposed to be based on 7.16 MHz as well, but according to Charles' doc it's actually 3.58 MHz.
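These figures all fall out of dividing one master clock. A quick sketch, assuming the usual NTSC master clock of 21.47727 MHz (that number and the divisors are my assumption here, not something stated above):

```python
MASTER_HZ = 21_477_270  # assumed NTSC master clock (6 x the 3.579545 MHz colorburst)


def divided_clock_mhz(divisor):
    """Clock in MHz after dividing the assumed master clock."""
    return MASTER_HZ / divisor / 1e6


for divisor in (3, 6, 12):
    print(f"master / {divisor:2d} = {divided_clock_mhz(divisor):.2f} MHz")
```

Dividing by 3 and 12 gives the two CPU speeds, and dividing by 6 gives the 3.58 MHz figure quoted for the PSG.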
The SNES/SF version of the 65816 ran at 3.58 MHz or 2.68 MHz (dropping to 1.79 MHz for some I/O accesses). If I remember correctly, the choice between the first two depended on whether the game was running from slow or fast ROM. Why didn't they use wait states?
All in all the 65816 is a waste of a CPU, but I have a feeling Nintendo wanted to use it so developers wouldn't face a new learning curve.