When you have a MonadPlus instance satisfying the left distribution law [*]:

    (m <|> n) >>= k = (m >>= k) <|> (n >>= k)

the two possible orderings expand as follows:

    (Just <$> m <|> pure Nothing) >>= k
      = (Just <$> m >>= k) <|> (pure Nothing >>= k)
      = (m >>= k . Just) <|> k Nothing

    (pure Nothing <|> Just <$> m) >>= k
      = k Nothing <|> (m >>= k . Just)

These differ only in the order of the two alternatives, so neither is obviously the correct choice. You could imagine using one to indicate that you prefer to include the optional part but are okay without it, and the other to indicate that you want to include the optional part only if necessary.
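
To make the difference concrete, here is a small sketch in the list monad, which satisfies left distribution. The names `optionalFirst` and `nothingFirst` are made up for illustration; `optionalFirst` uses the same ordering as the standard `optional`.

```haskell
import Control.Applicative (Alternative, (<|>))

-- Same ordering as Control.Applicative.optional: offer the optional part first.
optionalFirst :: Alternative f => f a -> f (Maybe a)
optionalFirst m = Just <$> m <|> pure Nothing

-- Hypothetical flipped variant: offer "nothing" first.
nothingFirst :: Alternative f => f a -> f (Maybe a)
nothingFirst m = pure Nothing <|> Just <$> m

-- For lists, both versions keep every alternative; only the order differs.
withFirst, withoutFirst :: [Maybe Int]
withFirst    = optionalFirst [1, 2]  -- [Just 1,Just 2,Nothing]
withoutFirst = nothingFirst  [1, 2]  -- [Nothing,Just 1,Just 2]
```

Any continuation `k` bound after either version sees the same set of results, just in a different order, which is why neither ordering is obviously wrong for instances like this.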
Suppose, instead, that you have an Alternative instance that satisfies the left catch law:

    pure x <|> m = pure x

This is very common for parsers that don't backtrack by default, along with types like Maybe and IO. Now

    pure Nothing <|> Just <$> m = pure Nothing

Oops! That's not useful at all: `m` is never even tried.
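
A quick check with Maybe, which satisfies left catch, shows the problem. This is only a sketch; `flipped` is a hypothetical name for the Nothing-first version, not something from base.

```haskell
import Control.Applicative (optional, (<|>))

-- Hypothetical Nothing-first variant, specialised to Maybe.
flipped :: Maybe a -> Maybe (Maybe a)
flipped m = pure Nothing <|> Just <$> m

-- The standard optional reports whether the optional part succeeded.
standardHit, standardMiss :: Maybe (Maybe Char)
standardHit  = optional (Just 'x')  -- Just (Just 'x')
standardMiss = optional Nothing     -- Just Nothing

-- The flipped version ignores its argument entirely.
flippedHit :: Maybe (Maybe Char)
flippedHit = flipped (Just 'x')     -- Just Nothing
```

Because `pure Nothing` already succeeds, the right-hand alternative is discarded, so `flipped` returns `Just Nothing` no matter what it is given.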
Now, let's look at another important practical matter. Lots of real code in the wild uses `optional`, and almost all of it would break if the semantics were changed, so there is basically no chance that will ever happen. You could make an argument for including the other version too, but I don't yet see a compelling use case.