newspaper

12,488 6 493 MIT
0.2.8 (28 Sep 2018) Dec 28 2012 127.3 thousand (month)

newspaper is a Python package that allows developers to easily extract text, images, and videos from articles on the web.

It is designed to be fast, easy to use, and compatible with a wide variety of websites. It extracts the article body along with metadata such as the title, authors, and publication date, and it supports several languages.

newspaper includes an HTTP client, or it can ingest pre-scraped HTML documents.

Example Use


from newspaper import Article

# Create a new article object
article = Article('https://www.example.com/article')

# Download the article
article.download()

# Parse the article
article.parse()

# Print the article text
print(article.text)

# Print the article title
print(article.title)

# Print the article authors
print(article.authors)

# Print the article publication date
print(article.publish_date)
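
The example above lets newspaper fetch the page itself. For the pre-scraped case mentioned earlier, a minimal sketch might look like the following. It assumes the `input_html` argument of `Article.download()` as found in the 0.2.x releases. The helper name `parse_prescraped` and the placeholder URL are made up for illustration and are not part of the library.

```python
def parse_prescraped(html, url="https://www.example.com/article"):
    """Parse HTML that was fetched elsewhere, without a network request.

    `parse_prescraped` is a hypothetical helper, not a newspaper API.
    """
    # Imported lazily so the helper can be defined even where
    # the newspaper package is not installed.
    from newspaper import Article

    # The URL is only used as context (e.g. for resolving relative links).
    article = Article(url)

    # Assumption: when input_html is supplied, download() stores that markup
    # instead of making an HTTP request.
    article.download(input_html=html)

    # Parsing then works the same way as for a downloaded page.
    article.parse()
    return article
```

Calling `parse_prescraped(html)` with markup you already have returns an `Article` whose `text`, `title`, `authors`, and `publish_date` attributes can be read as in the example above. This is useful when pages come from a crawler, a cache, or an archive rather than a live request.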

Alternatives / Similar


1,357 2020.1.16 (3 years ago) Dec 14 2008
783 1.4.1 (a month ago) Jul 17 2019
2,222 0.8.1 (2 years ago) Jun 30 2011
733 0.14.0 (4 months ago) Oct 27 2015
3,063 0.11.0 (4 months ago) Oct 20 2013
9,417 1.1.9 (4 years ago) Aug 24 2018
74 2.0.7 (3 months ago) Dec 11 2020

Other Languages

2,090 v1.2.0 (26 days ago) Apr 20 2016