Time offset, relative to the beginning of the audio, that corresponds to
the end of the spoken word. This field is set only if
enable_word_time_offsets=true, and only in the top hypothesis. This is an
experimental feature, and the accuracy of the time offset can vary.
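
Because the offset is populated only under the conditions described here, client code may want to treat it as optional. The following is a minimal sketch, not the actual client library: the `Word` class, its `end_time` attribute modeled as a `timedelta`, and the `end_offsets_seconds` helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass
class Word:
    """Hypothetical stand-in for a recognized word; not the real API type."""
    text: str
    # None models "field not set" (assumption: unset offsets surface as None).
    end_time: Optional[timedelta] = None


def end_offsets_seconds(words: List[Word]) -> List[Optional[float]]:
    """Return each word's end offset in seconds, or None where it is unset."""
    return [
        w.end_time.total_seconds() if w.end_time is not None else None
        for w in words
    ]


words = [
    Word("hello", timedelta(milliseconds=600)),
    Word("world"),  # offset not populated
]
offsets = end_offsets_seconds(words)
```

Given the stated accuracy caveat, a caller would likely treat these values as approximate rather than as exact word boundaries.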