You might want to overlap the chunks in the first pass, since something could get lost at the chunk boundaries. I'm not any sort of expert on this, but it seems like an obvious pitfall of limited context length.
I really like this idea. It's basically applying the same principles used in image-based nets - i.e. sliding-window convolutional kernels - to text.
Yes, it's a great idea, and I have a version that is basically a convolution over the transcript. It works much better than the current version - it can automatically create cohesive chapters and summaries of those chapters - however, it consumes an order of magnitude more ChatGPT API calls, making it uneconomical (for now!)
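For anyone curious what that looks like in practice, here's a minimal sketch of the sliding-window / overlapping-chunk idea. It assumes the transcript is a list of (start_seconds, text) segments; the window and stride sizes are made up for illustration, and call_llm is just a stand-in for whatever ChatGPT API call you're using:

```python
from typing import List, Tuple

Segment = Tuple[float, str]  # (start time in seconds, caption text)

def sliding_windows(transcript: List[Segment],
                    window_s: float = 300.0,
                    stride_s: float = 150.0) -> List[List[Segment]]:
    """Yield overlapping windows so a topic cut off at one boundary
    still appears in full in at least one window."""
    if not transcript:
        return []
    end = transcript[-1][0]
    windows = []
    start = 0.0
    while start <= end:
        window = [seg for seg in transcript if start <= seg[0] < start + window_s]
        if window:
            windows.append(window)
        start += stride_s  # stride < window => consecutive windows overlap
    return windows

def summarize_windows(transcript: List[Segment], call_llm) -> List[str]:
    """Summarize each overlapping window with one LLM call (call_llm is a placeholder)."""
    summaries = []
    for window in sliding_windows(transcript):
        text = " ".join(t for _, t in window)
        summaries.append(call_llm(f"Summarize this part of the transcript:\n{text}"))
    return summaries
```

With a stride of half the window you cover every boundary, but you also roughly double the number of API calls compared to non-overlapping chunks, which is where the extra cost comes from.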
Thanks for the kind words. I built it on a few cross-country plane rides and now I mostly just leave it alone. The infrastructure and tooling we have these days is so incredible.
Sure. The old one just splits the transcript into 5-minute chunks and summarizes each of those. The reason this sucks is that a 5-minute chunk can contain multiple topics, or the same topic can be spread across multiple chunks.
This dumb technique is actually pretty useful for a lot of people though, and has the advantages of being super easy to parallelize and requiring only 1 pass through the data.
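Roughly something like this - cut the transcript into non-overlapping 5-minute buckets and summarize each one on its own. Since the chunks don't depend on each other, the calls are trivially parallelizable; call_llm is again a placeholder for the ChatGPT API call, and the transcript is the same list of (start_seconds, text) segments as above:

```python
from concurrent.futures import ThreadPoolExecutor

def five_minute_chunks(transcript, chunk_s: float = 300.0):
    """Group transcript segments into non-overlapping 5-minute buckets of text."""
    chunks = {}
    for start, text in transcript:  # transcript: list of (start_seconds, text)
        chunks.setdefault(int(start // chunk_s), []).append(text)
    return [" ".join(texts) for _, texts in sorted(chunks.items())]

def summarize_naive(transcript, call_llm, max_workers: int = 8):
    """One pass through the data: one independent LLM call per chunk, run in parallel."""
    prompts = [f"Summarize this 5-minute transcript chunk:\n{chunk}"
               for chunk in five_minute_chunks(transcript)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_llm, prompts))
```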
The more advanced technique does a first pass over large chunks of the transcript, creating a list of chapters for each chunk. It then combines those into a single canonical chapter list with timestamps (it usually takes the model a few tries to get that right). Finally it does a second pass through the transcript, summarizing the content of each chapter.
The end result is a lot more useful, but is way slower and more expensive.
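To make the structure concrete, here's a hedged sketch of that two-pass shape - not the actual code. The prompts, retry count, and chapter format (title + start timestamp in seconds) are all assumptions, and call_llm is once more a placeholder for the ChatGPT API call:

```python
import json

def propose_chapters(chunks, call_llm):
    """Pass 1: ask for candidate chapters (title + start time) in each large chunk."""
    candidates = []
    for chunk in chunks:
        candidates.append(call_llm(
            "List the chapters in this transcript chunk as JSON "
            '[{"title": ..., "start": seconds}, ...]:\n' + chunk))
    return candidates

def merge_chapters(candidates, call_llm, retries: int = 3):
    """Combine per-chunk lists into one canonical chapter list,
    retrying when the model returns something that isn't valid JSON."""
    prompt = ("Merge these chapter lists into a single canonical list "
              "with timestamps, as JSON:\n" + "\n".join(candidates))
    for _ in range(retries):
        try:
            return json.loads(call_llm(prompt))
        except json.JSONDecodeError:
            continue
    raise ValueError("model never produced a valid chapter list")

def summarize_chapters(transcript, chapters, call_llm):
    """Pass 2: summarize the transcript text that falls inside each chapter.
    Assumes chapters are sorted by their 'start' timestamp."""
    bounds = [c["start"] for c in chapters] + [float("inf")]
    out = []
    for i, chapter in enumerate(chapters):
        text = " ".join(t for start, t in transcript
                        if bounds[i] <= start < bounds[i + 1])
        out.append({"title": chapter["title"],
                    "summary": call_llm("Summarize this chapter:\n" + text)})
    return out
```

The per-chunk chapter proposals could come from either chunking scheme above; the second pass is what adds most of the extra cost, since every chapter gets its own call.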