This script makes listening to journal articles and technical books easier. It is basically a collection of all the things I found myself doing manually to get text to speech programs to pronounce technical text (lots of Greek symbols) and mathematical notation (things raised to powers, order of operations, names of operators) correctly. I even added a simple routine that slices out the references sections from papers and from between the chapters of books.
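To give an idea of what such a pronunciation fix looks like, here is a minimal sketch. It is not copied from TkTTS.pl: the `%spoken` table, the `speakable` sub, and the exact wording of the replacements are all made up for illustration.

```perl
use strict;
use warnings;
use utf8;   # this source contains literal Greek and math characters

# Hypothetical lookup table: symbol => spoken form. Not taken from TkTTS.pl.
my %spoken = (
    'α' => 'alpha',
    'Δ' => 'delta',
    'λ' => 'lambda',
    '±' => 'plus or minus',
    '≤' => 'less than or equal to',
);

sub speakable {
    my ($text) = @_;
    # Powers first: "x^2" becomes "x to the power of 2".
    $text =~ s/\^\s*(-?\d+)/ to the power of $1/g;
    # Then symbols, padded with spaces so the words don't run together.
    my $symbols = join '|', map { quotemeta } keys %spoken;
    $text =~ s/($symbols)/ $spoken{$1} /g;
    # Clean up the extra whitespace introduced above.
    $text =~ s/\s+/ /g;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print speakable('Δx ≤ λ^2'), "\n";
# prints: delta x less than or equal to lambda to the power of 2
```

A plain lookup table like this is easy to extend one symbol at a time, which suits fixes that get added whenever a TTS voice stumbles over something new.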
TkTTS is a Gnome2/Nautilus-script GUI frontend for text to speech apps with a few extra text processing utilities. It's not very user friendly, and has lots of hard coded bits and dependencies.
Put it in ~/.gnome2/nautilus-scripts and make it executable. Alternatively, you can forget the Tk GUI entirely and use it on the command line by putting it in your PATH and then calling TkTTS.pl with the path to a file. Unless you comment them out, though, it still depends on the Perl/Tk modules.
./TkTTS.pl /home/superkuh/library/somedocument.pdf
It wraps and calls third-party system utilities to do the heavy lifting for the following formats: pdf, djvu, ps, epub, html, txt, and the Linux clipboard.
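The wrapping could be sketched as a dispatch table keyed on file extension. This is an illustration under assumptions, not the script's actual structure: the `%converter` table and `converter_command` sub are hypothetical names, and the argument order for each program is the usual input-then-output command-line form rather than something confirmed from TkTTS.pl.

```perl
use strict;
use warnings;

# Hypothetical dispatch table: extension => code ref that builds the command
# as a list (program, arguments...). Plain txt needs no conversion, so it is absent.
my %converter = (
    pdf  => sub { ('pdftotext',     $_[0], $_[1]) },
    djvu => sub { ('djvutxt',       $_[0], $_[1]) },
    ps   => sub { ('ps2ascii',      $_[0], $_[1]) },
    epub => sub { ('ebook-convert', $_[0], $_[1]) },
    html => sub { ('html2text', '-o', $_[1], $_[0]) },
);

# Return the converter command for $path, or an empty list if none is needed/known.
sub converter_command {
    my ($path, $out) = @_;
    my ($ext) = $path =~ /\.([^.\/]+)$/;
    return unless defined $ext;
    my $build = $converter{ lc $ext } or return;
    return $build->($path, $out);
}

my @cmd = converter_command('/home/superkuh/library/somedocument.pdf', '/tmp/tts_temp.txt');
print "@cmd\n";
# A real script would then run it, e.g. with the list form of system(),
# which sidesteps shell quoting problems with odd file names:
#   system(@cmd) == 0 or die "conversion failed: $?";
```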
TkTTS.pl
Festival of the appropriate version, Calibre, and DjVuLibre might not be in the system repositories. Besides those, for Ubuntu,
sudo apt-get install perl-tk xpdf-utils pstotext html2text
I'll try to use PAR::Packer in the future to bundle working perl modules. But until then the non repository perl dependencies are listed in the section of the script shown below.
I recently bought a new computer and put the latest Ubuntu LTS version on it. But the latest Ubuntu ships with a horrible desktop environment and Gnome has jumped the shark, so I had to install the MATE desktop instead. This meant giving up Nautilus as a file manager, and Nautilus bindings are what TkTTS uses to know which files to TTS. Porting to the Caja file manager was fairly painless, though, and you can download a Caja-compatible copy of TkTTS below.
TkTTS_caja.pl
Ubuntu MATE 14.04 and later switched to gsettings. The new scripts dir is,
~/.config/caja/scripts
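For anyone porting a script between the two file managers: both hand the selection to a script through an environment variable (one path per line), and as command-line arguments. The sketch below is an assumption about how a script could read it, not a copy of what TkTTS.pl does. The `selected_files` sub is a hypothetical name.

```perl
use strict;
use warnings;

# Return the files to read: the file manager's selection if one is set,
# otherwise whatever was passed on the command line.
sub selected_files {
    my ($env, @args) = @_;
    for my $var (qw(CAJA_SCRIPT_SELECTED_FILE_PATHS NAUTILUS_SCRIPT_SELECTED_FILE_PATHS)) {
        next unless defined $env->{$var} && length $env->{$var};
        # The variable holds one path per line; drop any empty entries.
        return grep { length } split /\n/, $env->{$var};
    }
    return @args;
}

my @files = selected_files(\%ENV, @ARGV);
print "$_\n" for @files;
```

Taking the environment as a hash reference instead of reading `%ENV` directly keeps the sub easy to try out from the command line with made-up values.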
Unfortunately the version of Festival and the associated libs that ships with Ubuntu Precise are not compatible with the existing high quality voices. It is best just to use Festival 1.96 and the old voices. To do this you just have to remove the Festival version from the Ubuntu repos and manually download and install the old packages. stactrac's post on the Ubuntu Forum's Festival thread got me started. Here's a list of what I needed to install,
http://packages.ubuntu.com/lucid/amd64/libestools1.2/download
http://packages.ubuntu.com/precise/all/festlex-poslex/download
http://mirror.pnl.gov/ubuntu//pool/universe/f/festlex-cmu/festlex-cmu_1.4.0-6_all.deb
http://launchpadlibrarian.net/37213705/libaudiofile0_0.2.6-8ubuntu1_amd64.deb
http://launchpadlibrarian.net/35363331/festival_1.96%7Ebeta-10ubuntu1_amd64.deb
Once you've installed the old packages you can install the high quality Festival voices from the Ubuntu Forums thread just like before.
The code snippet below gives an idea of the Perl modules and third party programs that might be needed.
use strict;
use warnings;
use Encode;
use charnames ':short';
use Clipboard;
use Tk; # from perl-tk in repositories
use Tk::Menu;
use Tk::Pane; # in Tk::
use List::Util qw( reduce );
use Parallel::ForkManager;
my $manager = new Parallel::ForkManager( 4 );

#CONFIG################################################################
my $norefs = 0; # a very simple attempt to remove references from the end, default 0 off.
my $wordreplace = 0; # scientific notation pronunciation fixes, default 0 off.
my $defaultdocviewer = 'evince'; # xpdf, okular?
my $epubdocviewer = "fbreader"; # calibre
my $htmldocviewer = "true"; # opera? firefox? iceweasel? chromium? nothing(true)? xdg-open?
my $editor = "gedit"; # mousepad, vim, emacs, etc
my $tts = 'festival --tts'; # festival
#my $tts = "swift -f"; # cepstral swift
#my $tts = "flite -f"; # festival lite
#my $tts = "espeak -ven+f4 -p 70 -f"; # espeak is installed in lots of distros
my $pdftotext = 'pdftotext';
my $djvutotext = 'djvutxt';
my $pstotext = 'ps2ascii';
my $epub2txt = 'ebook-convert'; # calibre is best for epub
my $html2text = 'html2text';
#my $html2text = 'html2text -width 140';
my $tempfilepath = '/tmp/tts_temp.txt'; # probably shouldn't change the name, dir change is fine.
my $epubtempdir = '/tmp/epub2text'; # if you change this, be careful, I do a `rm -rf $epubtempdir`.
my $homedir = '~/.tktts/';
my $webpage = '/some/dir/here/filename.html';
my $makewebpage = 0; # leave this at 0 unless you're me/superkuh.
my $titlerepeatremoval = 0; # for any line >10 chars long, remove all instances after the 10th repetition
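The comment on `$titlerepeatremoval` describes a small filter for running headers and repeated titles. One way it could work is sketched below. This is a guess at the behaviour from that one-line comment, and the `drop_repeated_lines` sub is a hypothetical name, not a function from the script.

```perl
use strict;
use warnings;

# Keep at most $max copies of any line longer than $minlen characters.
# Shorter lines are never counted, so blank lines and short words pass through.
sub drop_repeated_lines {
    my ($text, $max, $minlen) = @_;
    $max    //= 10;
    $minlen //= 10;
    my %seen;
    my @kept;
    for my $line (split /\n/, $text) {
        # Count only long lines; skip a line once it has been seen too often.
        next if length($line) > $minlen && ++$seen{$line} > $max;
        push @kept, $line;
    }
    return join("\n", @kept) . "\n";
}

# Twelve "pages", each starting with the same running header.
my $page  = "Journal of Examples, Vol. 1\ntext\n";
my $out   = drop_repeated_lines($page x 12);
my $count = () = $out =~ /^Journal of Examples/mg;
print "$count\n";   # prints: 10
```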
And here's a rushed, buggy function that removes reference sections from technical papers and books. It works for almost all documents, but there are rare false positives that'll remove more than intended.
sub filterreferences {
    my $texttoedit = shift;
    # whenever "references" followed by a newline is encountered discard all following lines until
    # encountering words like chapter, introduction, section, or abstract that indicate the start
    # of new content. This fails in ~10% of cases but it's really helpful for the other 90%.
    my $fixedtext = '';
    my $inreferencesstate = 0;
    my @lines = split(/\n/, $texttoedit);
    foreach my $line (@lines) {
        if ($line =~ /(chapter|introduction|section|abstract|appendix)/i) {
            $inreferencesstate = 0;
        } elsif ($line =~ /references\s?$/i) { # only if there's nothing after references, like it's
            $inreferencesstate = 1;            # a heading of a section.
        }
        $fixedtext .= "$line\n" unless $inreferencesstate;
    }
    $texttoedit = $fixedtext;
    # Perhaps remove all (.+\d{4}), to remove inline references. But how to be sure?
    return $texttoedit;
}
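Both patterns in that function match anywhere in a line, so a body sentence that happens to end in "references" starts the discarding, and a citation mentioning a "cross section" ends it early. One possible tightening is sketched here. It is a hypothetical variant, not the code in TkTTS.pl: `filterreferences_strict` is a made-up name, and it assumes headings sit at the start of a line after optional numbering.

```perl
use strict;
use warnings;

# Hypothetical stricter variant: both triggers must look like headings,
# i.e. start the line (after optional digits and dots such as "3.1").
sub filterreferences_strict {
    my ($text) = @_;
    my $skipping = 0;
    my @kept;
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*[\d.]*\s*(?:chapter|introduction|section|abstract|appendix)\b/i) {
            $skipping = 0;    # new content starts here
        } elsif ($line =~ /^\s*[\d.]*\s*references\s*$/i) {
            $skipping = 1;    # a bare "References" heading
        }
        push @kept, $line unless $skipping;
    }
    return join '', map { "$_\n" } @kept;
}

my $doc = join "\n",
    'Chapter 1',
    'Results are compared with the references',   # body text, must survive
    'and they agree.',
    'References',
    '[1] A. Author, Some cross section measurements, 1999.',
    'Chapter 2',
    'More text.';
print filterreferences_strict($doc);
```

On this sample the two body lines under Chapter 1 are kept and only the "References" heading and the citation under it are dropped. The trade-off is that stricter patterns miss headings formatted in ways they don't anticipate (Roman numerals, for instance), so this swaps one kind of error for another rather than removing it.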
I enjoy recursion, dissipating local energy gradients, lipid bilayers, particle acceleration, heliophysics instrumentation and generally anything with a high rate of change in electrical current. This site is a combination of my efforts to archive what I find interesting and my shoddy attempts to implement it as cheap as possible.
I get all email sent to @superkuh.com
Make-up any address *@superkuh.com
If I don't respond check your "spam" folder. Megacorps like google used to mark me as spam.