As of August 2023, Speechify is probably your best option.

I’ve made a 3-minute demo to show how it works (and sounds).

Listening to PDFs

For PDFs, the key strengths of Speechify are:

  1. Multi-modal UI: listening to academic papers is nice, but often you really need to use your eyes—to read a graph, grok a formula, or skip to the interesting parts. Speechify's interface makes it easy to switch between listening and reading, whether on desktop, mobile phone or iPad. To skip to a particular sentence, just click on it.
  2. Solid mobile apps (iOS and Android): files are synchronised quickly, and it remembers your playback position across devices. You can read your PDFs, tap to skip to a given sentence, etc.
  3. Great voices: the best AI voices I've heard. The pronunciation of specialist terms is imperfect, but much better than any other service.
  4. Filtering: the app can filter out citations, URLs and parenthetical remarks.
  5. Speed: everything is fast; most of the UI is well-designed.
  6. OCR: you can listen to PDFs even when they are just scans of a physical book.
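To make the filtering in point 4 concrete, here is a rough sketch of what skipping citations, URLs and parenthetical remarks could look like before text is sent to a text-to-speech engine. This is not Speechify's implementation, which I haven't seen. The function name `filter_for_narration`, its flags and the regular expressions are all my own assumptions, and the patterns are deliberately naive.

```python
import re

# Hypothetical patterns for illustration only; not Speechify's actual rules.
PARENTHETICAL = re.compile(r"\s*\([^()]*\)")                  # e.g. " (Jones, 2016)"
BRACKET_CITATION = re.compile(r"\[\d+(?:\s*[,–-]\s*\d+)*\]")  # e.g. "[3, 4]" or "[1-3]"
URL = re.compile(r"https?://\S+|www\.\S+")                    # naive URL matcher


def filter_for_narration(text, skip_parentheticals=True,
                         skip_citations=True, skip_urls=True):
    """Return text with the selected kinds of content removed."""
    if skip_parentheticals:
        text = PARENTHETICAL.sub("", text)
    if skip_citations:
        text = BRACKET_CITATION.sub("", text)
    if skip_urls:
        text = URL.sub("", text)
    # Tidy up: drop spaces left before punctuation, then collapse runs of whitespace.
    text = re.sub(r"\s+([.,;:])", r"\1", text)
    text = re.sub(r"\s{2,}", " ", text)
    return text.strip()


sample = ("Growth slowed (Jones, 2016) after 1970 [3, 4]. "
          "See https://example.com/paper for details.")
print(filter_for_narration(sample))
```

Each flag can be switched off independently, which mirrors the idea of per-category skip options. A real filter would need to be much more careful, for example with nested parentheses or URLs followed by punctuation.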

The main shortcomings are:

a. It sometimes narrates text it should obviously skip. There should be more skip options, e.g. skip all tables.

b. It can't really handle formulae.

c. You can't annotate PDFs in their app.

At TYPE III AUDIO, we’ve been thinking about building a “listen to PDF” feature that delivers a narration to your podcast app. We’re still considering this, but my current view is that a multi-modal UI is the ideal setup for proper engagement with academic PDFs, and that Speechify are well on course to nailing it.

Listening to a PDF with Speechify. See my 3-minute demo.

Listening to Google Docs

Option 1. Use the Speechify app (more features, less secure)

If you grant Speechify access to your Google Drive, then you can import Google Docs into their web app. They convert your doc into a PDF, and you get the same experience I described above.

The main problem is that you have to give Speechify access to all documents on your Google Drive. For some people, that will be an unacceptable security risk.  

Option 2. Use the browser extension (more secure, fewer features)

You can use the Speechify browser extension to listen to Google Docs directly on docs.google.com.

The listening experience is similar to PDFs, but:

  1. You can't filter out citations, URLs, etc.
  2. If you want to listen on mobile, you have to click the "Bookmark" icon. When you do this, all the formatting of the Google Doc is lost.

During installation, the extension requests access to all your tabs, but you can change this to “When you click the extension”. If you do, your browser will only give Speechify access to a particular document (or other URL) when you click the extension icon.

Note: Speechify may not be suitable for listening to extremely sensitive documents

Speechify sends the text of your document to their text-to-speech API. If you are dealing with extremely sensitive material, the security/convenience tradeoff may not be worth it. Ask your IT security team if you’re unsure.

Bonus: Listen to any web page

You can also use Speechify to listen to any web page.

Their browser extension is the best way to do this. If you want to listen on your mobile, click “Bookmark” to add the article to your library.

The Speechify app also has an “add via URL” function, but it’s weirdly bad at detecting the start of the article text. You’ll often have to skip through navigation elements and other useless stuff before you get to the start of the article. I imagine they'll fix this soon.

Have you tried Speechify?

If you’ve tried Speechify, I’d love to hear from you, especially if you’re listening to academic PDFs. Do you use it regularly? If not—why not? How could it be improved?

Comment below or write to peter@type3.audio.

Comments (8)



@peterhartree Feel free to ignore: Are there any updates to your workflow for listening to Google Docs or PDFs? The post is now roughly a year old.

- I have been using Speechify for a year now. I think it's decent, but the UI and the frequent crashes are a bit annoying for me. So I just wanted to see if there are better products out there :)

I also like Speechify. One tip:

If you do the free 3-day trial, and then don't subscribe immediately afterward, they may send you a 50% discount code to encourage you to subscribe, reducing the cost from $139 to $70/year. (At least, this worked for me a few months ago.)

The most frustrating part of Speechify for me is that: a) big documents sometimes cause the app on my phone to crash (which is maybe more of a problem with my phone) and b) like Peter says, it will read random superfluous text, which is more of a problem with academic papers/books. (For instance, I'm listening to a 400-page book right now, and at the end of each page it says "Copyright 2023 University of Chicago Press, All Rights Reserved," which got old around page 3.) 

On Microsoft Edge (the browser) there's a "read aloud" option that offers a range of natural voices for websites and PDFs. It's only slightly worse than Speechify and free, and it can give you a sense of whether $139/year might be worth it for you.

So cool!!!! Thank you :)

Quick note/warning about Speechify:

  • I've been using it for about 6 months. 
  • I've had several bugs using the app and customer support couldn't help. I spent a total of about 2 hours trying to fix them, but nothing worked.
  • The main problem was that I could not use it outside my home. Every 2 minutes or so it would detect "poor internet connection" and shut down. This is extremely annoying when you're trying to listen to a newspaper while walking. 
  • Hopefully they will fix this. I still use it, but only occasionally.

Experience regarding discount

  • I tried the 50% discount, but they increased the original price to $200, so I had to pay $100. I messaged support, and after several weeks they finally refunded $30. So the discount works, but you may have to contact support :)

In case anyone else was wondering about pricing: most of the features described require the premium plan, which is $139 / year.

(There is a free version, but it only includes some low-quality voices and doesn't allow changing speed, so it's not very useful.)

Thanks for this, this is a topic I am very interested in. To me, the killer feature missing in Speechify is the ability to highlight and sync those highlights. Or more broadly, annotating in a multimodal way is difficult.

I instead use Goodreader, where you can have e.g. a Dropbox folder of all your PDFs synced across desktop and mobile; and you can annotate those PDFs while listening, then sync to Dropbox.

The downside of Goodreader is that the voice is pretty bad, and also that you can't reflow the text to make it easier to read on mobile while in audio mode.

PS: The Readwise Reader app seems to be working towards an excellent all-in-one reader with TTS capability, but I've found the existing version a little too slow, and it has some other issues. It's still in development, though, and seems very promising.
