So I wrapped TIMIT into a class. You can use it as you see fit.
I haven’t added any preprocessing (centering, normalization, wavelets, Fourier transform, LPC…). (EDIT: I do, however, use the segment_axis function that João uses here to cut the sequence into frames; copy this file into your Python path.)
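To give an idea of what cutting a sequence into frames means, here is a minimal pure-Python sketch. It is not the segment_axis implementation, just an illustration of the same idea; the name `frame_sequence`, its parameters, and the choice to drop trailing samples that do not fill a whole frame are my assumptions.

```python
def frame_sequence(samples, frame_length, overlap=0):
    """Cut a 1-D sequence into fixed-length frames (hypothetical helper).

    Consecutive frames share `overlap` samples. Trailing samples that
    do not fill a complete frame are dropped (an assumption made here).
    """
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    if not 0 <= overlap < frame_length:
        raise ValueError("overlap must satisfy 0 <= overlap < frame_length")

    step = frame_length - overlap  # hop between the starts of two frames
    if len(samples) < frame_length:
        return []  # not enough samples for even one frame

    n_frames = (len(samples) - frame_length) // step + 1
    return [list(samples[i * step:i * step + frame_length])
            for i in range(n_frames)]


# Ten samples, frames of 4 samples, 2 samples shared between neighbours.
frames = frame_sequence(list(range(10)), frame_length=4, overlap=2)
```

With real audio you would pass the waveform samples and pick `frame_length` and `overlap` in samples, for instance from a window size in milliseconds and the sampling rate.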
This class uses a reduced set of phonemes, since the same phoneme can be (and is) written in multiple ways (mentioned here).
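As a rough illustration of what reducing the phoneme set looks like, the sketch below folds a few variant labels onto a single symbol. The table `FOLD` and the function `fold_phonemes` are hypothetical and only show the principle; the entries are a handful of commonly folded TIMIT labels, and the mapping actually used by the class may differ.

```python
# Illustrative, partial folding table (an assumption, not the class's table):
# each key is a variant label, each value the label it is folded onto.
FOLD = {
    "ux": "uw",
    "axr": "er",
    "em": "m",
    "en": "n",
    "nx": "n",
    "eng": "ng",
    "hv": "hh",
    "el": "l",
}


def fold_phonemes(labels, table=FOLD):
    """Replace each label by its folded form; unknown labels pass through."""
    return [table.get(label, label) for label in labels]


folded = fold_phonemes(["hv", "ux", "m", "en", "iy"])
```

Labels that are not in the table are left unchanged, so the function is safe to apply to sequences that already use the reduced set.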