Maybe. From what I've read, the combination of GPU compute instructions and a shared memory space means there are real benefits to assembly-level optimization. There may be APIs, and you may be able to write your compute shader code in C, but it's still basically bare metal: a different shader architecture is going to behave differently. Also, while it might no longer make sense to hand-optimize if you're writing a single game, the vendors of Havok, UE4, and similar middleware do have an incentive to make their products better by writing low-level bypasses for the API where it's too slow.