r - Scraping Google News with Rvest for Keywords

1 Answer

深蓝 · Answer 1 · 2021-02-06T00:24:39+0000

Since you're working with Google News, an easier method than scraping the page with rvest is to read the RSS feed for your keyword and pull it into a data frame. Luckily, the {tidyRSS} package does just this.

For example, the feed for the keyword "apple" is available at this URL:

https://news.google.com/rss/search?q=apple&hl=en-IN&gl=IN&ceid=IN:en

You can customize this URL through its query parameters; for example, you can search by geolocation if you wish.
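As a minimal sketch of that customization, the feed URL can be assembled from a keyword instead of typed by hand. The helper name build_news_feed_url is hypothetical, and the reading of hl, gl and ceid as language, country and edition is an assumption based on the example URL above rather than on documented behaviour.

```r
# Hypothetical helper: build a Google News RSS search URL for a keyword.
# The hl/gl/ceid defaults simply mirror the example URL; their exact
# meaning (language, country, edition) is an assumption.
build_news_feed_url <- function(keyword, hl = "en-IN", gl = "IN", ceid = "IN:en") {
  # Percent-encode the keyword so spaces and symbols are safe in a query string
  query <- utils::URLencode(keyword, reserved = TRUE)
  paste0(
    "https://news.google.com/rss/search",
    "?q=", query,
    "&hl=", hl,
    "&gl=", gl,
    "&ceid=", ceid
  )
}

# Reproduces the example URL for the keyword "apple"
build_news_feed_url("apple")

# A multi-word keyword with different (assumed) locale settings
build_news_feed_url("electric cars", hl = "en-US", gl = "US", ceid = "US:en")
```

The resulting string can be passed to tidyfeed() in place of a hard-coded URL.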

After you install tidyRSS, you can read the feed like so:

library(tidyRSS)

# Google News RSS feed URL for the keyword "apple"
keyword <- "https://news.google.com/rss/search?q=apple&hl=en-IN&gl=IN&ceid=IN:en"

# The call below follows the package vignette

google_news <- tidyfeed(
  keyword,
  clean_tags = TRUE,
  parse_dates = TRUE
)

This gives you a data frame with one row per article and many columns that describe the feed and each article. You can choose which ones to keep.
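One way to narrow the result down, sketched in base R: keep a few columns and only the rows whose title mentions a term. The helper filter_articles is hypothetical, and the item_title, item_link and item_pub_date column names are assumptions about what tidyfeed() returns for this feed, so check names(google_news) first. The demo data frame is a stand-in so the sketch runs without network access.

```r
# Hypothetical helper: keep selected columns and only rows whose title
# matches a term. The item_* column names are assumptions; verify them
# with names(google_news) before relying on this.
filter_articles <- function(feed, term,
                            keep = c("item_title", "item_link", "item_pub_date")) {
  keep <- intersect(keep, names(feed))  # silently skip columns the feed lacks
  hits <- grepl(term, feed[["item_title"]], ignore.case = TRUE)
  feed[hits, keep, drop = FALSE]
}

# Stand-in data using the assumed column names (not real feed output)
demo <- data.frame(
  item_title = c("Apple unveils new iPhone", "Local election results"),
  item_link = c("https://example.com/a", "https://example.com/b"),
  item_description = c("...", "..."),
  stringsAsFactors = FALSE
)

filter_articles(demo, "apple")
```

With the real feed, the equivalent call would be filter_articles(google_news, "iphone").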
