

Capstone
Automatic Text Summarization
(A Generic Text Summarizer)

Problem Statement
Are you facing the problem of having too much CONTENT to consume but too LITTLE TIME?
Why not just get a SUMMARY of the important points from the content?



Modeling
Three models, each based on a different technique, were created:
1. Graph-based summarization (TextRank with sentence embeddings)
2. Centroid-based summarization (TF-IDF)
3. Pre-trained BERT summarization
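
To make the graph-based idea (Model 1) concrete, here is a minimal sketch of TextRank-style extractive summarization. It is not the project's code: the function names are hypothetical, the sentence splitter is deliberately naive, and plain bag-of-words vectors stand in for the sentence embeddings the project used. Sentences are nodes, cosine similarity gives the edge weights, and a PageRank-style iteration scores each sentence.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bow_vector(sentence):
    # Stand-in for a sentence embedding (assumption): lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(w * b[t] for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def textrank_summary(text, num_sentences=2, damping=0.85, iterations=50):
    sentences = split_sentences(text)
    n = len(sentences)
    if n == 0:
        return ""
    vectors = [bow_vector(s) for s in sentences]

    # Weighted similarity graph between sentences, without self-loops.
    sim = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]

    # PageRank-style power iteration over the weighted graph.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores

    # Keep the top-scoring sentences, restored to their original order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return " ".join(sentences[i] for i in sorted(top))
```

Calling `textrank_summary(article_text, num_sentences=3)` returns the three highest-ranked sentences in document order. Swapping `bow_vector` for a real embedding model would bring the sketch closer to the approach named above.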



Evaluation
The generated summaries were scored using:
1. The BBC News dataset
2. User scoring through polls in a Slack channel
Model 2 (centroid-based summarization with TF-IDF) performed the best in terms of summary quality and speed.
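
Since Model 2 came out ahead, a small sketch of the centroid idea may be useful. This is an illustration under stated assumptions rather than the project's implementation: the function names are hypothetical, the sentence splitter is naive, and TF-IDF is computed by hand with a smoothed IDF instead of through scikit-learn. Each sentence becomes a TF-IDF vector, the centroid is the mean of those vectors, and the sentences closest to the centroid form the summary.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_vectors(sentences):
    # Treat each sentence as a "document" and weight terms by smoothed TF-IDF.
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid_summary(text, num_sentences=2):
    sentences = split_sentences(text)
    vectors = tfidf_vectors(sentences)

    # The centroid is the mean TF-IDF vector over all sentences.
    centroid = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            centroid[term] += weight / len(vectors)

    # Score each sentence by its similarity to the centroid.
    scores = [cosine(vec, centroid) for vec in vectors]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    top = top[:num_sentences]

    # Emit the selected sentences in their original order.
    return " ".join(sentences[i] for i in sorted(top))
```

The scoring is a single pass with no iteration, which is consistent with this model being the faster of the two extractive approaches.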



Deployment
The prototype was built using Python, scikit-learn, natural language processing techniques, and Flask, and deployed on Amazon EC2 and a Google Kubernetes cluster.
Note: Model 3 (pre-trained BERT summarization) was excluded due to resource limitations on the EC2 free tier.



Sample


Something fun: I have also created a Telegram summarizer bot! See this video and try it out yourself!

There is also a Slack slash command, /summarizer!
